In May 2020, OpenAI released ChatGPT and received a lot of press due to the size of the innovation and the people involved in the creation such as Elon Musk and Sam Altman. Not so long ago, Amazon released Alexa in 2014. ChatGPT has a pre-trained model offered on chats and APIs, meanwhile, Alexa is a virtual assistant. What are the differences between them? How do they compare? In this article, I will address this question in detail.
What is ChatGPT? And how does it work?
ChatGPT is a member of the generative pre-trained transformer (GPT) family of language models. Its language model has been supervised, trained, and refined over time with many dialogues by using reinforcement learning from human feedback (RLHF).
The main feature of ChatGPT is to have conversations via a chat interface, however, it is a lot more versatile than that. For instance, in the SEO field, it can create outlines for articles or research keywords. Conversely, text-based interfaces can make it difficult to interpret the intention behind a complex question or conversation accurately.
Furthermore, ChatGPT offers APIs. And this is one of the most exciting offerings, in my opinion. The range of offerings includes:
- Text completion: prompt a limitless number of text completion tasks and ChatGPT will figure out how to resolve it
- Code completion: turn comments into code, complete next line, rewrite code for efficiency, comment code, etc.
- Chat completion: draft an email, complete code, tutor in a range of subjects, create conversational agents, etc.
- Image generation: it can create based on a text, edit based on a prompt or make variations to an existing image
- Embedding for:
- Classification (where text strings are classified by their most similar label)
- Topic clustering (where text strings are grouped by similarity)
- Search (where results are ranked by relevance to a query string)
- Recommendations (where items with related text strings are recommended)
- Anomaly detection (where outliers with little relatedness are identified)
- Speech to text: it can transcribe audio into whatever language the audio is in and translate and transcribe the audio into English
- Moderation: it can classify into different categories (hate, threatening, self-harm, violence, etc.)
What is Amazon Alexa? And how does it work?
Amazon Alexa is a voice assistant. Voice assistants are software that recognizes voice, process language using specialized algorithms, and are able to synthesize voice to listen to specific voice commands and return relevant information or perform specific functions as requested by the user.
Under the hood, voice assistants are composed of several components that put together do the magic of talking back to you and holding a conversation. For instance, one of PentaTech’s voice applications called Super Hero Battle has the following design:
Breaking down responsibilities into different sub-systems allows each component to specialise in what they do best. For instance, the Amazon Alexa service takes care of speech recognition, natural language understanding and text-to-speech; the Alexa Skills Kit allows to build custom-made applications like the ones in PentaTech portfolio and finally, a back-end to resolve the application logic.
What is more, Amazon Alexa architecture allows us to make calls to ChatGPT APIs to produce a conversation and enrich it based on needs. In this way, it could achieve the best of both worlds; ChatGPT full flexibility and conversation capabilities and domain and goal driven from Amazon Alexa.
Table of comparison between ChatGPT and Amazon Alexa
|Core built-in functionalities
|Chatbot that mimics a human conversationalist. Versatile capabilities in a wide range of scenarios: write and debug computer programs, compose music, teleplays, fairy tales, and student essays; answer test questions; write poetry and song lyrics; simulate an entire chat room; play games like tic-tac-toe; and simulate an ATM.
Also offers APIs to access a variety of functionalities.
|Voice interaction, music playback, making to-do lists, setting alarms, streaming podcasts, playing audiobooks, and providing weather, traffic, sports, and other real-time information, such as news
|Control of other devices
|Not possible to control other devices at the moment
|Control several smart devices using itself as a home automation system
|Custom functionality – extensibility
|Extensibility is not possible at the moment. However, you can compose using APIs
|Support custom skills that are developed by third parties and available via marketplaces
It can call ChatGPT APIs
|+ Can handle a wide range of conversational topics and can ‘adapt’ to new ones
+ Coherent responses in a sequence of interactions (conversation)
+ Powerful and flexible to build Natural Language Processing (NLP) systems
|+ Follows a predetermined plan so the output is more predictable
+ Efficient use of computing resources
+ Good for specific use cases/goals, more control over the outcome
|– Specific uses cases can be hard to fine tune
– Irrelevant responses
– Large number of resources to run
|– Limited to the predetermined plan
– Not able to ‘adapt’ to new conversational topics
– Harder to get a sequence of interactions (conversational like)