Thoughts on the user-friendly design of Conversational User Interfaces
“Hey Computer, We Need to Talk”
One of the reasons why we have started talking to computers is as simple as it is obvious: because we can. The technical capabilities for recognizing speech and understanding natural language have improved dramatically over the past few years. Equally important is the rise of the Internet of Things (IoT), which is drawing in more and more objects from our immediate environment.
“The Internet of Things makes objects intelligent so they can exchange information over the internet. The virtual world is merging with the real world.” 
IoT-enabled devices have their own identity and are becoming small computers in their own right. But that doesn’t mean they’re similar to a desktop with a keyboard and monitor. Many of those devices don’t even have a display, and they are often used in contexts where a keyboard or display makes no sense or is impossible to use because the user can’t touch or look at it. Take the car, a scenario that demands the driver focus on the road instead of typing something or manipulating a control. The same is true in the kitchen. You might have both hands stuck in dough and not want to smudge a keyboard, let alone read up on a recipe while flambéing something. Sometimes it comes down to sheer convenience. It’s simply easier to change the channel on your TV with a voice command because the dog has yet again carried off the remote.
Full (and completely new) control
All these novel devices with their new capabilities call for a completely new interaction paradigm to control them. Gestures would be one option, yet it’s much easier and more effective to use your voice, one of the strongest interaction tools we have. From an early age, we learn and practice how to speak in order to get something or to get something done. Voice also affords us a broader range of expression than gestures, and we don’t need to establish a visual connection but can instead utter commands while we’re doing something else. There are different types of Conversational User Interfaces, and each has its advantages and downsides depending on the use case.
Voice interfaces like the ones built into hardware such as the Amazon Echo are one example, and they can be used for service hotlines, too. The problem is that some people hesitate to use them in public, since it feels odd to dictate to your smartphone within earshot of strangers: “Siri, add toilet paper, beer and cream cheese to my shopping list.” We simply don’t like to share too many details of our private lives. What’s more, many people have privacy concerns, for instance when they’re forced to authenticate themselves to their bank by voice.
The amount of information that can be presented by voice is also limited. It simply takes too long to read out longer strings of text. In our heavily visual world, it also feels strange to listen to something without seeing complementary images or graphics.
The best of two worlds
That’s where multimodal user interfaces like Siri come into play. They combine the best of two worlds. In most cases, the device can be used hands-free, but it also serves up visual results which are very helpful when dealing with longer lists or more complex searches. The multimodal helpers have a downside, though, since they require an app or similar hardware-software combo to work. Pure chatbots are clearly more flexible.
Chatbots use natural language and work intuitively, but users have to type in their request. That takes time and effort, as well as their full attention. On the other hand, chatbots can easily be integrated into any existing channel. Channels such as Facebook or Telegram usually suffice to reach a customer, yet there’s a catch. To generate tangible value for both the target audience and the enterprise, one has to take a close look at the dialog. A traditional user interface may be the better choice if the goal is simply to transfer regular business processes into a chat format.
Agile methods are a good approach to designing Conversational User Interfaces since they provide the opportunity to iterate by trial and error and make improvements. Just having a chat going back and forth shouldn’t be the be-all and end-all. The focus has to be on the user because a conversation is never a straight line. To find the right, natural flow, we have to understand the user and improve usability based on those insights.
Here, then, are the seven golden rules for successful design:
- Make a detailed plan for how you want to engage the user.
- Set up the project in a way that allows iterative work.
- Use methods such as the Google Ventures design sprint, agile development, Scrum and design thinking.
- Use the right tools to rapidly build and test prototypes.
- Prototype, prototype, and don’t forget to prototype!
- Test early and invite users. They can be part of the creative process.
- Constantly improve your product and don’t be afraid to release something “unfinished” (this one may be particularly hard for large German companies).
Dreaming up solutions: design thinking
Design thinking is a tried and tested technique to help you pick the right use case. It’s a method that enables you to explore an idea and a product as thoroughly as possible. The timeframe is flexible, depending on whether you’re after an analytical deep-dive over several weeks or a process that can be done in a week or a single day. At the outset, you need to define what problem needs to be solved before you move on to prototyping and testing. And then the cycle starts again, since testing is not the end but the beginning of continuous improvements to the concept.
Your route is being calculated: Customer Journey Mapping
There’s one outrageous question you should ask before you start: Does it even make sense to build a chatbot? To build one just to have one is a waste. In many cases, a traditional app or website can do the job. To find out if and where exactly a Conversational User Interface makes sense, it’s a good idea to create a customer journey map. Who is the target audience? At what point along their customer journey should they come into contact with a digital assistant? What are their pain points and problems?
It’s easier to master the real challenges and problems once you’ve clearly defined your target audiences and made them tangible with personas, but never lose sight of the business value of the chatbots you want to create. It’s pointless to design a digital assistant that doesn’t solve a real-world business problem for your enterprise.
Always on site: Place-onas
Key points to think about are the location and situation in which a user will engage with an interface. Building on the concept of personas, Bill Buxton has coined the expression “place-onas” to highlight the specific user needs that come with a place. The place-ona for cooking, for instance, takes into account that dirty hands prevent you from using a touchscreen, but do let you use your eyes, ears and voice. In a public library setting, on the other hand, users can use their hands and eyes without restrictions while it’s taboo to use their ears and voice.
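One way to make place-onas concrete during design is to write them down as data and derive the usable channels from them. The following is only a minimal sketch: the `PlaceOna` class, the `usable_modalities` function and the modality labels are hypothetical names chosen for illustration, and the two entries simply encode the cooking and library examples described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceOna:
    """Which of the user's channels are free in a given place (illustrative model)."""
    name: str
    hands: bool
    eyes: bool
    ears: bool
    voice: bool


def usable_modalities(place: PlaceOna) -> dict[str, list[str]]:
    """Map the free channels of a place-ona to input and output modalities."""
    inputs = []
    if place.voice:
        inputs.append("speech")
    if place.hands:
        inputs.append("touch/typing")

    outputs = []
    if place.eyes:
        outputs.append("screen")
    if place.ears:
        outputs.append("audio")

    return {"input": inputs, "output": outputs}


# Assumed encodings of the two examples from the text.
KITCHEN = PlaceOna("cooking", hands=False, eyes=True, ears=True, voice=True)
LIBRARY = PlaceOna("public library", hands=True, eyes=True, ears=False, voice=False)

for place in (KITCHEN, LIBRARY):
    print(place.name, usable_modalities(place))
```

Read this way, the cooking place-ona points toward a voice-driven assistant that can still show results on a screen, while the library place-ona points toward a silent, text-based chatbot.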
What a character!
Giving a new digital assistant a personality is another important consideration. People expect their virtual partner to have character. That character has to be consistent and must be able to guide users, for example by asking the right questions without irritating them. The assistant’s personality turns it into a brand ambassador, so it has to match the company’s brand values.
Enough with the conceptualizing, already. Now it’s time to test prototypes. Since we’re dealing with Conversational Interfaces, models scribbled on a notepad have little value. Instead, a team member can play the part of the talking AI to elicit first user reactions. Then you have to jump on the iteration carousel to improve on the idea. Know and accept that after testing comes more testing. Only living and breathing iteration leads to success.
The final stretch, perhaps
Even after the digital assistant is finally released, it’s wise to continually improve upon it from a user-centric viewpoint. The world will belong to the bot that best satisfies user needs, not the one with the most bells and whistles.
Fraunhofer IML: FAQ zum Internet der Dinge [FAQ on the Internet of Things] (accessed 2017-08-02)