This past weekend on Saturday, November 12th, I attended the Amazon Alexa + Echo Hackathon hosted at General Assembly’s New York City campus. Our challenge was to design and build a custom skill for Alexa, the voice service that powers the Amazon Echo and lets users interact with the device. This hackathon was unique in that the interface I worked on was voice rather than a physical screen. Here is a recap of the event, the process my team and I followed, and our final takeaways.
ABOUT & PRE-HACK:
The Amazon Alexa + Echo Hackathon brought together UX designers, product managers, developers and data scientists to collaborate and build a new capability, or skill, for the Alexa Voice Service (AVS), which powers the Amazon Echo. The Echo is a hands-free speaker system that is controlled with your voice. It connects to the AVS, which lives in the cloud, and has various capabilities such as playing music, providing information, sharing news and much more. In order to interact with the device, users just need to say the wake word “Alexa” and she will respond immediately. Anybody can build an Alexa skill so the potential of what she can deliver to users is limitless.
The AVS is activated by downloading the mobile app and connecting the Echo through wifi. She lives in the cloud so all of the user’s commands are processed there rather than on the device itself. Amazon hasn’t updated the actual hardware since the Echo was released two years ago, in 2014, because it is simply a speaker system and a 360-degree microphone with wifi. What’s amazing is that the device can identify where the user’s voice comes from and is able to filter out external, irrelevant sounds through the microphone. The AVS can also be accessed through the Amazon Tap and Echo Dot devices. Here is a detailed breakdown of the devices.
The morning started out with an intensive Alexa skill building bootcamp led by Robert McCauley, a Bostonian Alexa Developer from Amazon. In this workshop, we learned how to build a custom skill with Amazon Developer and Amazon Web Services (Lambda), using existing templates from GitHub. It was incredibly challenging to keep up with the workshop. But it was definitely helpful to understand the limitations of the device and the code. This was my first exposure to working with code so this experience helped me understand what developers deal with when it comes to translating designs into working products.
First, we discussed the various ways that human-machine interfaces have developed over the years. We started out with very analog punch cards and evolved into using the keyboard and the mouse. Now, we mostly interact with the touch screen but the speech interface is picking up as the next big thing. The voice user interface (VUI) is unique. The experience needs to be designed in a way that is ubiquitous, frictionless and, ultimately, as natural as a normal human conversation. It was up to us to define what that means.
During the workshop, we learned that the initial voice command for Alexa is broken down into three parts: the wake word, the launch and the skill. For example, if we say “Alexa, open piano facts,” the wake word is Alexa, the launch word is open and the skill is piano facts. Alternatively, Alexa also responds to “Amazon” and “Echo” as her wake words. Other examples of launch words include begin, start and launch. The skill phrase can be anything but the more unique and identifiable it is, the better. The interactive dialogue with Alexa also has different components, which will be covered later when I discuss the process of building her skill.
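To make the three-part structure concrete, here is a minimal sketch in plain Python that splits a spoken launch phrase into its wake word, launch word and skill name. This is only an illustration of the idea from the workshop, not how Alexa actually processes speech; the function name `parse_launch_phrase` and the word lists are my own assumptions based on the examples above.

```python
# Illustrative only: Alexa does this in the cloud with speech recognition.
# The sets below come from the examples mentioned in the workshop.
WAKE_WORDS = {"alexa", "amazon", "echo"}
LAUNCH_WORDS = {"open", "begin", "start", "launch"}


def parse_launch_phrase(phrase):
    """Split a phrase like 'Alexa, open piano facts' into its three parts.

    Returns a dict with the wake word, launch word and skill name,
    or None if the phrase does not follow the expected pattern.
    """
    # Normalize: lowercase, drop commas and trailing punctuation.
    words = phrase.lower().replace(",", " ").rstrip(".!?").split()
    if len(words) < 3:
        return None

    wake, launch = words[0], words[1]
    skill = " ".join(words[2:])  # everything after the launch word
    if wake not in WAKE_WORDS or launch not in LAUNCH_WORDS:
        return None

    return {"wake_word": wake, "launch_word": launch, "skill": skill}


print(parse_launch_phrase("Alexa, open piano facts"))
```

Anything that does not start with a known wake word followed by a launch word returns `None`, which mirrors the fact that Alexa simply ignores speech that does not begin with her wake word.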
To wrap up, Rob encouraged us to keep our skill simple and build something that can pass as a minimum viable product. Rather than go all out and build a car, we should start by creating a product that is metaphorically like the skateboard or scooter that serves the basic function and keeps users happy. If we have time, then we could continue to build on that and expand the skill. Once the skill building bootcamp portion of the event was finished, I found a group and filled myself up with pizza and coffee. It was time to start hacking!
My group consisted of all user experience designers and product managers. I worked with Christina Crichlow, Nick Tejada, Iren Gurarye, Tyler Pruyn, Andrea Ruz and Stanley Tatum. Despite the lack of developers and data scientists on our team, we were determined to use our strengths and work together to compensate for skills we didn’t have. The advantage of an all-UX-designer team was that everyone in the group was willing to learn, compromise and collaborate to create a skill that would add value to users’ lives.
We started out by brainstorming various ideas. Since we knew the challenge before attending the event, we each brought our own ideas and shared them with one another. As UX designers, we were all goal-oriented so we settled on creating a workout skill. We wanted to take advantage of interacting with the Alexa system to help users build their morning exercise routine so we called our skill, “Pump it Up.” People are likelier to get a workout in when someone holds them accountable and we wanted Alexa to provide that for our users.
First, we focused on creating a conceptual user flow that represented the interactive dialogue. We wanted to keep the interaction as simple as possible so that it is intuitive and easy to adopt as part of any user’s routine. In order to get a feel for how the user would interact with Alexa through this skill, we created a video as our initial low-fidelity test to see what works and what doesn’t. This helped us get an idea of what a natural flow of the dialogue would sound like.
Following the video, we solidified a basic user flow of the dialogue and figured out the best way to make the interaction feel natural and engaging. When it comes to designing the dialogue between Alexa and the user, there are three things we need to consider: intents, sample utterances and custom slots. An intent represents the action that fulfills the user’s specific request. Sample utterances are the phrases a user might say to trigger an intent. Once we’ve activated Alexa with the wake word, we use utterances to take us to the skill and move through it. Slots are the variable parts of an utterance that tell Alexa which action to take next. We can also code a set of custom slot values. In our case, it was a set of small workouts that the user can do.
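Here is a rough sketch, in plain Python, of how intents, sample utterances and custom slot values relate to one another. It is not the actual Alexa interaction model format and not the code our team wrote; the intent names (`NextExerciseIntent`, `EndWorkoutIntent`, `ChooseExerciseIntent`), the `Exercise` slot and the `resolve` function are all hypothetical names I made up to illustrate the concept.

```python
import re

# Hypothetical custom slot: the set of small workouts the user can do.
CUSTOM_SLOT_VALUES = {"Exercise": ["push ups", "crunches"]}

# Hypothetical intents, each with the sample utterances that trigger it.
# "{Exercise}" marks where a slot value is expected in the phrase.
SAMPLE_UTTERANCES = {
    "NextExerciseIntent": ["done next exercise", "next exercise"],
    "EndWorkoutIntent": ["end workout"],
    "ChooseExerciseIntent": ["i want to do {Exercise}"],
}


def resolve(spoken):
    """Map what the user said to an (intent, slots) pair.

    Returns (None, {}) when nothing matches.
    """
    # Normalize: lowercase and drop punctuation.
    text = re.sub(r"[^a-z ]", "", spoken.lower()).strip()

    for intent, utterances in SAMPLE_UTTERANCES.items():
        for utterance in utterances:
            slot = re.search(r"\{(\w+)\}", utterance)
            if slot is None:
                # No slot: the phrase has to match exactly.
                if text == utterance:
                    return intent, {}
                continue

            # Slot present: try each allowed value in its place.
            name = slot.group(1)
            prefix = utterance[:slot.start()]
            suffix = utterance[slot.end():]
            for value in CUSTOM_SLOT_VALUES[name]:
                if text == prefix + value + suffix:
                    return intent, {name: value}

    return None, {}


print(resolve("Done, next exercise."))
print(resolve("I want to do crunches"))
```

The real service matches utterances far more flexibly than this exact-string comparison, but the shape is the same: an utterance leads to an intent, and a slot carries the specific value the user chose.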
We had several constraints for the AVS that we needed to keep in mind. First, she needs to be activated or awakened. She only responds to Alexa, Amazon or Echo (depending on the settings). After awakening her, she only listens for up to eight seconds and once that time passes, we’re back to the beginning and she must be awakened again (I make it sound like we’re awakening ancestors from Mulan). As a human, I stop listening after two seconds so it’s quite amazing how patient Alexa is. She is attentive to short and straightforward sentences. Don’t tell her your life story because she’ll probably stop listening to that too.
We had many ideas on how to expand this skill but ultimately, we decided to keep the function as simple as possible for the purpose of this project. We came up with a list of assumptions about the user. Some of those assumptions are that the user has approximately 20 minutes in the morning to get his or her routine in, has moderate athletic ability and knows how to perform each of the activities. If the user didn’t know, we could program a feature where Alexa sends a card to the mobile app and the user can visually see how the exercise is performed but that might need to happen post-hackathon.
Since we were a large group, we decided to split up the work based on each person’s strengths and collaborate through a Google Drive folder. Tyler and Iren took up the toughest part: learning and coding the skill. Stanley, Andrea and Christina focused on the utterances and building the exact dialogue between Alexa and the user. Nick and I worked on communicating our ideas visually through the presentation. After a few hours, we managed to design a dialogue between the user and Alexa, code the skill and prepare a visual presentation of our process.
This is the interaction we designed*:
User: Alexa, open pump it up.
Alexa: Opening pump it up. Ready for your workout, you lazy bum?
Alexa: Let’s do 10 push ups.
User: Done, next exercise.
Alexa: Nice, now do 15 crunches.
User: End workout.
Alexa: Good job. Nice abs!
* It is highly recommended that Push It by Salt-N-Pepa is playing in the background as the interaction is happening.
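For anyone curious how that conversation could be represented in code, here is a minimal Python sketch that replays the dialogue above as a tiny state machine. It is not the skill Tyler and Iren built at the hackathon; the class name `PumpItUpSession`, the `WORKOUT` list and the fallback reply for unrecognized phrases are assumptions of mine, and a real skill would run as an AWS Lambda function responding to Alexa requests.

```python
# The workout from our designed interaction: (repetitions, exercise).
WORKOUT = [(10, "push ups"), (15, "crunches")]


class PumpItUpSession:
    """Hypothetical stand-in for one conversation with the skill."""

    def __init__(self):
        self.index = 0  # position of the current exercise in WORKOUT

    def launch(self):
        """Alexa's two opening lines after 'Alexa, open pump it up.'"""
        count, name = WORKOUT[0]
        return [
            "Opening pump it up. Ready for your workout, you lazy bum?",
            f"Let's do {count} {name}.",
        ]

    def handle(self, utterance):
        """Return Alexa's reply to what the user just said."""
        text = utterance.lower().strip(" .!?")

        if text == "end workout":
            return "Good job. Nice abs!"

        if "next exercise" in text:
            self.index += 1
            if self.index >= len(WORKOUT):
                # Assumption: running out of exercises ends the workout.
                return "Good job. Nice abs!"
            count, name = WORKOUT[self.index]
            return f"Nice, now do {count} {name}."

        # Assumption: a generic reprompt for anything unrecognized.
        return "Sorry, I didn't catch that."


session = PumpItUpSession()
for line in session.launch():
    print("Alexa:", line)
for said in ["Done, next exercise.", "End workout."]:
    print("User:", said)
    print("Alexa:", session.handle(said))
```

Running it prints the same exchange we designed, which is a handy way to sanity-check a dialogue before anyone has to say it out loud to the device.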
RESULTS & REFLECTIONS:
After the hacking portion was completed at 4:00pm, all the groups presented a two- to four-minute pitch of what they created. Many groups came up with creative and innovative new skills for Alexa. I was impressed at what a lot of people managed to accomplish in such a short time span. Some skills were practical while others were entertaining. I love that most groups added spunk and sass to Alexa, making any interaction with her that much more real and joyful. Team Pump it Up was the last group to go and we talked about how our skill helps users build a stable morning exercise routine.
I’m incredibly pleased with the way my group came together to build this capability for Alexa. Leading up to this event, I was doubtful of my abilities as a UX designer and thought of every excuse I could to not attend. What I gained, however, is the courage to challenge the things that I’m not good at and create something that I believe is of value. Being surrounded by designers who were all dedicated to learning a new skill on the spot was very encouraging for me. We all complemented each other really well. Next time, I’d like to dive in much deeper and go for the coding that I wouldn’t have touched otherwise.
The group that won the hackathon and walked away with the Amazon Echo was Team “Name That Tune.” With their skill, Alexa plays a series of song clips and the user must correctly identify the artist. It was executed flawlessly and the interaction felt natural and compelling. It was an added plus that the creators of this skill have good taste in music. Congratulations! And finally, the event closed out with a chill happy hour to reward ourselves for working hard all day.
Working with a voice user interface was a brand new experience for me. I did learn, however, that the design process can easily be applied to this medium to create a captivating user experience. Learning to code was definitely the most challenging aspect since my focus has been UX research for the past couple of months, but it was also eye-opening in terms of what I could accomplish with it. I would love to take more time to master the coding behind building Alexa skills because there are so many things I want to build. This experience makes me excited about the future of the voice user interface, and I am looking forward to seeing where it goes.