The launch of Amazon Echo and its voice service, Alexa, brought virtual assistants out of our smartphones and into our homes and offices. While the Echo is a solid product, Alexa as a voice platform is where the real value is.
The Echo started off with 100 things it could do; the number of available Alexa Skills now tops 7,000. CES 2017 showed how eager tech companies are to integrate Alexa: the Amazon virtual assistant was everywhere at the show, despite the fact that neither the Echo nor Alexa had booth space on the show floor.
As a result, interest in developing tools for the platform has skyrocketed, with many developers eager to jump into the ecosystem. To help developers and companies better understand how to get started working with Alexa and its related services, we’ve pulled together the most important details and resources.
Executive summary (TL;DR)
What is the Alexa developer platform: Alexa is the voice service that powers the connected speaker called Amazon Echo. Developers can create skills for the Echo using the Alexa Skills Kit, or integrate Alexa into an existing product through its API via the Alexa Voice Service.
Why it matters: Alexa is one of the most popular virtual assistants available today, and it helped catalyze the current market for standalone, voice-powered assistants. It is also increasingly being integrated into a host of popular products.
Who does this affect: This affects developers who want to get started working with a voice interface and writing Alexa skills, as well as businesses that wish to utilize the power of Alexa through its available API.
When is this happening: Alexa launched in tandem with the Amazon Echo in late 2014, but updates and fixes are delivered regularly, and Amazon is making a concerted effort to reach out to developers.
How to take advantage of Alexa as a developer: Developers can write Skills for Alexa using the Alexa Skills Kit (ASK), add intelligent voice control to additional connected products with the Alexa Voice Service (AVS), or use the Amazon Lex service to build conversational bots.
What is the Alexa developer platform?
As noted, Alexa is a service that lets a user interface perform tasks in response to spoken commands. The Amazon Echo, which is typically set up through a companion app, is one of the primary ways these tasks are accessed, but the Alexa Voice Service (AVS) can be integrated into a host of other products, as long as they have a microphone and speaker. Amazon also offers the Amazon Lex service, which allows developers to build conversational bots using the same technology that Alexa is based on.
Terren Peterson, an Alexa Champion and the vice president of platform engineering for retail and direct bank at Capital One, said that many people see Alexa as simply a talking speaker. However, Peterson said that the real value of working with Alexa isn’t just the ability to talk back to the speaker, but “the ability to be able to change things with your voice.”
Users interact with Alexa through “skills,” which are created by developers to enable a specific experience through the Alexa Skills Kit (ASK). Currently, developers can create three types of skills: Smart Home Skills for home automation, Flash Briefing Skills for information and news, and Custom Skills for any other kind of request.
Sarah Sobolewski, who works on the PR team for Alexa, said that it is free to use both the ASK to develop skills and the AVS to integrate Alexa. Sobolewski also said that Alexa will continue to play into Amazon’s overall business strategy and its work with developers.
“Much like mobile was a decade ago, we see natural user interfaces like speech as a major shift in computing,” Sobolewski said. “We’re excited by the customer response so far, but it’s still very early and think there’s a lot of potential in this space.”
Why it matters
Alexa has helped to drive interest in the use of voice technology. While other services such as Apple Siri, Google Now, and Microsoft Cortana launched before Alexa, it’s the Amazon service that has become synonymous with voice assistants.
Peterson said that, for software engineering professionals, the popularity of Alexa also raises questions about how they should be thinking about voice as an interface. When users’ hands are busy with a keyboard or a phone, Alexa gives them hands-free access to information and services, so they do not have to stop what they are doing.
“Voice provides an entirely new way of interacting with technology that we believe will fundamentally change and improve people’s lives. While the space is relatively new, we’re seeing tons of momentum, with tens of thousands of developers getting in early,” Sobolewski said.
As more and more companies begin to implement Alexa and similar services, the door opens for new products, and potentially for more new jobs for software developers. When considering working with the platform, Peterson said that businesses should ask: “What are the things that I can be doing with a voice platform that actually drives value?”
Who does this affect?
Any developers or businesses that want to build out and utilize intelligent, voice-powered services will be affected by advances and changes that are being driven by Amazon Alexa.
Alexa is built using artificial intelligence (AI) technologies, but Sobolewski said that would-be developers don’t need a background in natural language understanding or speech recognition to get started. Beginner tutorials are also available, so even very junior software engineers can start working with the platform.
Brian Donohue, another Alexa Champion and a product engineer at Pinterest, noted that businesses building skills should keep in mind that the platform is new and that skill discoverability can sometimes be a challenge. “Also, there are currently no options for directly monetizing your skills (e.g., via purchase),” Donohue said. Skills are not sold the way mobile apps are. They are free, and Amazon has said it has no intention of directly monetizing the platform any time soon. So, at least for now, developers need to think of using Alexa services to drive value in other parts of their business.
Startups that want to build new products and services with AVS or ASK also have the opportunity to apply to the Alexa Fund, a $100 million fund for investing in new voice technologies.
When is this happening?
Amazon debuted Alexa alongside the original Echo in late 2014. While Echo was impressive in its own right, the ecosystem around Alexa has grown tremendously over the past few years.
Amazon and its partners have rapidly multiplied the number of Skills for Alexa, from a handful when the Echo launched to the thousands available today, with more being added almost daily. Amazon has also continued to advance the product with new services like its Alexa-powered Music, and new form factors like the low-cost Echo Dot, which have helped drive additional interest in the product.
Outside of Amazon’s proprietary hardware, Alexa is also showing up in new and interesting integrations for major brands around the world. At CES 2017, for example, Ford automobiles, robots, kitchen appliances, and even other brands of smart speakers showed off Alexa integrations. Additionally, some hobbyists are even creating their own versions of the Echo using a Raspberry Pi.
How to take advantage of Alexa as a developer
To get started with Alexa as a developer, you must first understand the different ways you can approach the platform. In the ASK, the three aforementioned skill types are each developed slightly differently. Both the Flash Briefing and Smart Home skills can be built using a provided API.
“These APIs give less control over the user’s experience, but simplify development since Amazon has already done the legwork to create the voice user interface,” Sobolewski said.
Describing these two skill types in more detail, Donohue explained that the Flash Briefing skills “use either an RSS or JSON feed containing the daily items that would be part of the flash briefing.” However, using the Smart Home skills API “requires an Amazon Lambda function that acts as an adapter for the integration, and an account-linking integration that allows the end user to link their Amazon Alexa account with the smart home appliance account for authenticated control of smart home devices,” Donohue said. One example of a Smart Home skill would be using Alexa to control a Philips Hue lightbulb.
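As a rough illustration of the JSON feed Donohue mentions, the sketch below builds a single text item for a Flash Briefing feed in Python. The field names (uid, updateDate, titleText, mainText, redirectionUrl) follow Amazon's published feed format as understood at the time of writing and should be checked against the current Flash Briefing documentation; the helper name make_feed_item and the example values are hypothetical.

```python
import json
from datetime import datetime, timezone


def make_feed_item(uid, title, text, url, when):
    """Build one text item for a Flash Briefing JSON feed.

    `when` is assumed to be a timezone-aware datetime; it is converted
    to UTC and formatted the way the feed examples show dates.
    """
    return {
        "uid": uid,  # unique identifier for this item
        "updateDate": when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.0Z"),
        "titleText": title,
        "mainText": text,  # the text Alexa reads aloud
        "redirectionUrl": url,  # link shown in the companion app
    }


if __name__ == "__main__":
    item = make_feed_item(
        uid="example-item-001",  # hypothetical values throughout
        title="Morning service update",
        text="All lines are running on schedule this morning.",
        url="https://example.com/updates/001",
        when=datetime(2017, 1, 10, 8, 30, tzinfo=timezone.utc),
    )
    # The feed endpoint would serve this JSON to Alexa.
    print(json.dumps([item], indent=2))
```

Serving a document like this from an HTTPS endpoint is the developer's side of the integration; the voice interface itself is supplied by Amazon.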
Outside of those two, there is the Custom Skills option. “This is the most flexible kind of skill, but also the most complex, since the developer will need to provide the interaction model,” Sobolewski said. “The interaction model is essentially the ‘conversation’ between Alexa and the user. It maps the various ways users make their request, how Alexa collects more information from the user, how the user can respond, and how Alexa completes the user’s request.”
Custom Skills can use either Amazon Lambda or a custom HTTPS-enabled web server for the integration, Donohue said. However, a “complicated certificate verification that’s enforced by Amazon makes developing with Amazon Lambda generally easier,” Donohue added. Custom Skills also support custom slot type syntax, allowing developers to go beyond Amazon’s built-in types. One custom skill was developed to tell users the status of the BART transit system in the Bay Area, offering information like when a train is leaving Balboa Park or North Berkeley, for example.
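To make the interaction model and custom slot types more concrete, here is a minimal sketch of what a transit skill might declare, written as Python data. The intent name GetDepartureIntent, the slot name Station, and the slot type LIST_OF_STATIONS are hypothetical, and the layout mirrors the intent schema and sample utterance format only as understood here, so the developer console may expect something different. The small helper checks that every slot used in an utterance is declared in the schema.

```python
import re

# Hypothetical intent schema: one custom intent with one custom slot,
# plus two of Amazon's built-in intents.
INTENT_SCHEMA = {
    "intents": [
        {
            "intent": "GetDepartureIntent",
            "slots": [{"name": "Station", "type": "LIST_OF_STATIONS"}],
        },
        {"intent": "AMAZON.HelpIntent"},
        {"intent": "AMAZON.StopIntent"},
    ]
}

# Values for the custom slot type, taken from the stations named above.
CUSTOM_SLOT_TYPES = {"LIST_OF_STATIONS": ["Balboa Park", "North Berkeley"]}

# Each sample utterance starts with the intent it maps to.
SAMPLE_UTTERANCES = [
    "GetDepartureIntent when does the train leave {Station}",
    "GetDepartureIntent when is the next train from {Station}",
]


def undeclared_slots(schema, utterances):
    """Return (intent, slot) pairs used in utterances but missing from the schema."""
    declared = {
        intent["intent"]: {slot["name"] for slot in intent.get("slots", [])}
        for intent in schema["intents"]
    }
    missing = []
    for line in utterances:
        intent_name, _, phrase = line.partition(" ")
        for slot in re.findall(r"\{(\w+)\}", phrase):
            if slot not in declared.get(intent_name, set()):
                missing.append((intent_name, slot))
    return missing


if __name__ == "__main__":
    print(undeclared_slots(INTENT_SCHEMA, SAMPLE_UTTERANCES))
```

Keeping the schema and utterances in one place like this makes it easier to catch mismatches before they reach the developer console.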
Developers who opt to use Lambda can author the functions in Node.js, Java, or Python, Sobolewski said, while a web service can be built in any appropriate language.
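For developers who choose Python, a Lambda function for a Custom Skill might look roughly like the sketch below. It assumes the standard Alexa request envelope (a "request" object with a "type" and, for intent requests, an "intent" with optional "slots") and returns a plain-text response. The intent name GetDepartureIntent and the slot name Station are hypothetical, and the actual departure lookup is left out.

```python
def build_response(speech, end_session=True):
    """Wrap plain text in the JSON structure Alexa expects back."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Entry point that Lambda calls with the Alexa request as `event`."""
    request = event["request"]

    if request["type"] == "LaunchRequest":
        # The user opened the skill without asking for anything specific.
        return build_response(
            "Welcome. Ask me when a train leaves a station.", end_session=False
        )

    if request["type"] == "SessionEndedRequest":
        # Alexa does not speak a reply to this request type.
        return {"version": "1.0", "response": {}}

    if request["type"] == "IntentRequest":
        intent = request["intent"]
        if intent["name"] == "GetDepartureIntent":  # hypothetical intent
            station = intent.get("slots", {}).get("Station", {}).get("value")
            if station:
                # A real skill would look up departure times here.
                return build_response(f"Looking up departures from {station}.")
            return build_response("Which station do you mean?", end_session=False)

    return build_response("Sorry, I did not understand that.")
```

The same request and response shapes apply when the skill is hosted on a web service instead of Lambda; only the hosting and verification steps differ.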
Even if you don’t plan on using Lambda, it is worth learning as you begin to experiment in the ecosystem. Additionally, Peterson recommends that would-be Alexa developers learn Speech Synthesis Markup Language (SSML). Amazon provides documentation for it that is worth diving into, Peterson said.
“If I were to do it all over again, I would have probably not skipped by the [SSML] chapter, if you will,” Peterson said. “Amazon provides that documentation, I think that I just glossed over it.”
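As a small taste of what SSML offers, the sketch below builds an SSML output speech object in Python, inserting a short pause between sentences with the break tag. The helper name ssml_speech and the default pause length are illustrative choices; which SSML tags Alexa supports should be confirmed in Amazon's SSML reference.

```python
from xml.sax.saxutils import escape


def ssml_speech(sentences, pause_ms=300):
    """Join sentences into one SSML string with a pause between them."""
    pause = f'<break time="{pause_ms}ms"/>'
    # Escape &, < and > so user-supplied text cannot break the markup.
    body = pause.join(escape(sentence) for sentence in sentences)
    return {"type": "SSML", "ssml": f"<speak>{body}</speak>"}


if __name__ == "__main__":
    speech = ssml_speech(
        ["Trains are on time.", "Expect delays at Balboa Park & beyond."],
        pause_ms=500,
    )
    print(speech["ssml"])
```

The returned dictionary would take the place of a plain-text outputSpeech object in a skill's response.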
Another thing to keep in mind with custom skills is that there are specific formats and size limits for the response that your service returns. For example, the output speech in a JSON response is limited to 8,000 characters, and the response as a whole is limited to 24 kB.
As with any application, it is important to understand how it will be used and which features will be critical before you start building. Unfortunately, Peterson said, analytics for the platform are weak, so potential builders won’t be able to glean that many insights. It is also very important to learn some of the tenets of voice interface design.
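A skill could guard against these limits before returning a response. The sketch below is one way to do that in Python. It assumes that the character limit applies to the text or SSML string in outputSpeech and that 24 kB means 24 × 1,024 bytes of UTF-8 encoded JSON; both interpretations are assumptions, and the function name check_response_limits is hypothetical.

```python
import json

MAX_SPEECH_CHARS = 8000
MAX_RESPONSE_BYTES = 24 * 1024  # assumption: 1 kB = 1,024 bytes


def check_response_limits(response):
    """Return a list of human-readable problems; empty if within limits."""
    problems = []

    output = response.get("response", {}).get("outputSpeech", {})
    speech = output.get("text") or output.get("ssml") or ""
    if len(speech) > MAX_SPEECH_CHARS:
        problems.append(
            f"output speech is {len(speech)} characters (limit {MAX_SPEECH_CHARS})"
        )

    size = len(json.dumps(response).encode("utf-8"))
    if size > MAX_RESPONSE_BYTES:
        problems.append(f"response is {size} bytes (limit {MAX_RESPONSE_BYTES})")

    return problems


if __name__ == "__main__":
    too_long = {
        "version": "1.0",
        "response": {"outputSpeech": {"type": "PlainText", "text": "a" * 9000}},
    }
    for problem in check_response_limits(too_long):
        print(problem)
```

A check like this is most useful in tests or logging, where an oversized response can be caught before a user hears an error.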
“The ideal scenario is to avoid what some at Alexa have called ‘unhappy paths.’ Remember that you don’t have an ‘X’ in the upper right corner to click, so if someone goes down a path of no return, they’ll get frustrated and never use your skill again,” said Joel Evans, an Alexa Champion and the co-founder of Mobiquity.
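One way to reduce the risk of such dead ends is to make sure every unexpected request still leads somewhere. The sketch below, in Python, ends the session on Amazon's built-in stop and cancel intents and answers anything else with guidance plus a reprompt. The help wording and the function names are hypothetical, and this is only one possible approach to the problem Evans describes.

```python
EXIT_INTENTS = {"AMAZON.StopIntent", "AMAZON.CancelIntent"}
HELP_TEXT = "You can ask when a train leaves a station, or say stop to exit."


def speak(text, end_session, reprompt=None):
    """Build a plain-text response, optionally with a reprompt."""
    response = {
        "outputSpeech": {"type": "PlainText", "text": text},
        "shouldEndSession": end_session,
    }
    if reprompt:
        # Spoken if the user stays silent after the first prompt.
        response["reprompt"] = {
            "outputSpeech": {"type": "PlainText", "text": reprompt}
        }
    return {"version": "1.0", "response": response}


def handle_unplanned(intent_name):
    """Give the user a way out or a way forward, never a dead end."""
    if intent_name in EXIT_INTENTS:
        return speak("Goodbye.", end_session=True)
    # Help requests and anything unrecognized get the same guidance.
    return speak(HELP_TEXT, end_session=False, reprompt=HELP_TEXT)
```

The voice equivalent of the missing ‘X’ is a skill that always responds to “stop” and always explains what the user can say next.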
To get started at the basic level, Amazon provides a tutorial for building a trivia skill in less than one hour. Amazon also provides a host of Skills templates and training tutorials in the Alexa Developer Portal. Additionally, developers can tune into live webinar office hours to get answers to technical questions and learn best practices, and pursue more advanced training and certification through Big Nerd Ranch.