How are Alexa voice applications designed?

Alexa voice applications are designed using serverless architectures. They differ from other apps in that they are voice-first: voice interaction is the primary concern, and visuals play a supporting role that improves usability.

Reference architectures for serverless voice applications follow the AWS Well-Architected Framework [1], which provides guidelines on how to design your solutions. Some of the principles we follow are:

  • Decoupled: the Alexa skill is kept separate from the Lambda function. The skill interprets what the user says, and the Lambda function works out the response.
  • API-based, to ease interoperability and development across Availability Zones, Regions and languages.
  • Built on elastic resources, to absorb unpredictable user traffic.
  • Secure, and compliant with privacy and regulatory requirements.
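
To make the decoupling concrete, here is a minimal Python sketch of what the Lambda side of that split could look like. It is an illustration under assumptions, not the actual Super Hero Battle code: the intent name `StartBattleIntent`, the slot names `heroOne` and `heroTwo`, and the spoken phrases are all hypothetical. The Alexa service has already turned speech into an intent by the time the function runs, so the function only routes on the request type and returns a response envelope.

```python
from typing import Any, Dict


def build_response(speech: str, end_session: bool = False) -> Dict[str, Any]:
    """Wrap plain text in the JSON envelope a custom skill returns to Alexa."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Route an incoming Alexa request by type and intent name."""
    request = event.get("request", {})
    request_type = request.get("type")

    if request_type == "LaunchRequest":
        return build_response("Welcome to Super Hero Battle. Which two heroes should fight?")

    if request_type == "IntentRequest":
        intent = request.get("intent", {})
        # "StartBattleIntent", "heroOne" and "heroTwo" are hypothetical names.
        if intent.get("name") == "StartBattleIntent":
            slots = intent.get("slots", {})
            hero_one = slots.get("heroOne", {}).get("value")
            hero_two = slots.get("heroTwo", {}).get("value")
            if hero_one and hero_two:
                return build_response(
                    f"{hero_one} versus {hero_two}. Who do you think will win?"
                )
            return build_response("Please name two heroes.")
        return build_response("Sorry, I did not understand that.")

    if request_type == "SessionEndedRequest":
        # No speech is returned when the session has already ended.
        return {"version": "1.0", "response": {}}

    return build_response("Sorry, I did not understand that.")
```

Because the handler takes and returns plain dictionaries, it can be exercised locally with a hand-written request before it is ever wired to a skill. A production skill would more likely use the Alexa Skills Kit SDK rather than parse the envelope by hand.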

PentaTech voice applications

PentaTech has built and published several voice applications, available for free in the Amazon Alexa Skills marketplace. An interesting example of how voice applications are architected is Super Hero Battle, which was among the top finalists of the Alexa Skills Challenge: Alexa Conversations.

The Super Hero Battle Alexa skill lets you stage battles between your favorite superheroes. Choose two superheroes and a power item, guess who is going to win, and get ready. You earn points every time your guess is correct.

How does the Alexa voice application work?

Super Hero Battle Alexa Skills Serverless Architecture

The journey begins and ends with the customer, the Alexa user, and goes as follows:

  1. Alexa users speak (or, on some devices such as mobile phones, type) to ask for what they want, for instance, "open Super Hero Battle".
  2. Alexa-enabled devices, such as smart TVs, assistants such as the Echo and Echo Dot, or mobile phones with the Alexa application installed, listen for a wake word and activate as soon as one is recognized.
  3. The Amazon Alexa Service performs common Spoken Language Understanding (SLU) processing on behalf of the Alexa skill, including Automatic Speech Recognition (ASR), Natural Language Understanding (NLU), and Text-to-Speech (TTS) conversion.
  4. The Super Hero Battle custom skill, based on the Alexa Skills Kit, controls the user experience, including a custom interaction model, intents and Alexa Conversations.
  5. The Super Hero Battle Lambda function is the brains of the architecture. It processes the different types of requests sent from the Alexa Service and builds speech responses. It uses the Super Hero API to look up each superhero's abilities. Images and special audio effects are stored in S3.
  6. The Super Hero Database is a DynamoDB table (a NoSQL data store) used to persist user state and sessions.
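
The last two steps, scoring a guess and persisting user state, can be sketched roughly as follows. This is a simplified illustration, not the skill's real implementation: the `ScoreStore` interface, the `InMemoryScoreStore` class, the `record_guess` function and the one-point-per-win rule are all assumptions made for the example. The in-memory dictionary stands in for the DynamoDB table so the sketch runs without AWS credentials.

```python
from typing import Dict, Protocol


class ScoreStore(Protocol):
    """Hypothetical storage interface; a DynamoDB-backed class could satisfy it."""

    def get_points(self, user_id: str) -> int: ...

    def put_points(self, user_id: str, points: int) -> None: ...


class InMemoryScoreStore:
    """Stand-in for the database table: keeps points per user in a dict."""

    def __init__(self) -> None:
        self._points: Dict[str, int] = {}

    def get_points(self, user_id: str) -> int:
        # Users with no stored record start at zero points.
        return self._points.get(user_id, 0)

    def put_points(self, user_id: str, points: int) -> None:
        self._points[user_id] = points


def record_guess(
    store: ScoreStore,
    user_id: str,
    guess: str,
    winner: str,
    points_per_win: int = 1,  # assumed scoring rule
) -> int:
    """Award points when the guess matches the winner; return the user's total."""
    points = store.get_points(user_id)
    # Compare case-insensitively, since spoken slot values vary in casing.
    if guess.strip().lower() == winner.strip().lower():
        points += points_per_win
        store.put_points(user_id, points)
    return points
```

Keeping the store behind a small interface means the Lambda function's game logic does not depend on the database client, so the same `record_guess` call could work against a real table in production and against the in-memory version in tests.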

For more details on how serverless voice applications work, please contact us. We can help to architect, design, implement, test and release your voice application using the latest technologies and methodologies.

References

  • [1] Serverless Applications Lens – AWS Well-Architected Framework
  • [2] Super Hero Battle DevPost Hackathon
  • [3] About Alexa Conversations