Boost your voice applications: Multimodal solutions

Voice solutions are not just about voice. This article compares voice only, voice first and voice added applications.

Single mode

A solution with a single mode of operation or interfacing offers only one way of interaction. For instance, a Google Home or an Amazon Echo Dot only allows voice as input and output. You could argue that the visual cues these devices show when they are listening, muted or processing a request already make them multimodal. Let’s now dive into multiple modes of operation or interfacing.

Figure 1. An Amazon Echo Dot is an example of single mode (voice) with visual cues

Introducing multimodal

The different ways to handle input and output are called modes. Devices use voice, displays, buttons and visual cues to interact with the user. When you combine several of these modes in a single application, we call it a multimodal solution.
Each mode can serve a different purpose. For instance, voice is great for short back-and-forth exchanges such as “Is it going to rain today?” “There is a chance of a few showers.”, and for streaming music, news, and so on. A visual display is great for collating large amounts of data into a single view. For instance, on a device with a screen, the Alexa voice app 4mates shows the different types of empanadas available in a scrollable list. This would be difficult to achieve using only voice.
Solutions need to be ready to work in one or more modes, because Google Assistant and Alexa run on devices that may be voice only, voice first or voice added. I coined the term “Responsive Voice Design” for this approach.

Figure 2. The Google Nest Hub Max provides a voice first multimodal experience combining voice, a touchscreen, visual cues and buttons

Voice only devices

These devices, such as the Google Home and Amazon Echo Dot, accept only voice as input and produce only voice as output. However, the user experience is enriched with visual cues, such as a spinning light while the device is listening or processing, and a mute button with its own indicator light.

Voice first devices

The Google Home Hub and the Amazon Echo Show are examples of devices that are activated via voice and enhance the user experience with screens, lights, buttons, etc. Users can make selections by touching the screen or by speaking, which opens up many opportunities for engaging solutions.

Voice added

Mobile phones and laptops are good examples of voice added devices. In this case, the main way of interfacing is either a keyboard or a touch screen; voice is added to enhance the experience and improve accessibility.

Figure 3. Mobile phones are a good example of voice added functionalities

Responsive Voice Design

Responsive Web Design is the approach in which the design and development respond to the user’s behaviour and environment based on screen size, platform and orientation.
In the same way, Responsive Voice Design is the approach in which the design and development respond to the user’s behaviour and device based on the mode: voice only, voice first or voice added.
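As a rough illustration of the idea, the sketch below picks a different response shape for each of the three modes. It is not tied to the Alexa or Google Assistant SDKs: the Device flags, the Response structure and the function names are hypothetical, and a real skill or action would read the device capabilities from the platform's own request object.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Mode(Enum):
    VOICE_ONLY = "voice only"
    VOICE_FIRST = "voice first"
    VOICE_ADDED = "voice added"


@dataclass
class Device:
    """Hypothetical capability flags, assumed for this sketch."""
    has_screen: bool
    voice_is_primary: bool


@dataclass
class Response:
    speech: str               # what the assistant says aloud
    screen_items: List[str]   # what is shown on the display, if any


def detect_mode(device: Device) -> Mode:
    """Map device capabilities to one of the three modes."""
    if not device.has_screen:
        return Mode.VOICE_ONLY
    return Mode.VOICE_FIRST if device.voice_is_primary else Mode.VOICE_ADDED


def build_menu_response(device: Device, items: List[str]) -> Response:
    """Shape the same menu differently depending on the mode."""
    mode = detect_mode(device)
    if mode is Mode.VOICE_ONLY:
        # No screen: keep the spoken list short, since long lists are hard to follow by ear.
        spoken = ", ".join(items[:3])
        return Response(speech=f"Some of today's options are {spoken}.", screen_items=[])
    if mode is Mode.VOICE_FIRST:
        # Screen available: speak a short prompt and show the full list.
        return Response(speech="Here are today's options. Which one would you like?",
                        screen_items=items)
    # Voice added: the screen leads, so nothing is spoken unless the user asks.
    return Response(speech="", screen_items=items)
```

For example, calling build_menu_response(Device(has_screen=False, voice_is_primary=True), menu) returns a spoken summary with no screen items, while the same call for a device with a screen returns the full list for display. The three-item cut-off for speech is an arbitrary choice for the sketch.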