Low-stakes experiments are so important to the start of a project. And by low, I mean tests you can reject, delete, or literally throw away without tears or hurt feelings.
For apps, websites, or any visual medium, one option has always ruled—paper. Draw it out and have people pretend to swipe, stab at buttons, or scroll. You get a sense for what works and what doesn’t early on, and these tests set you up for more advanced work.
But how do you test an interactive—and invisible—audio experience for Alexa, Google Home, and the like, without committing huge tech resources?
Some early tools are out there, like Sayspring, but since you’re beholden to the architecture it supports, and since it deploys to Amazon Echo, a whole host of things could go wrong and get in the way of your ability to test what you want to test.
When I’m designing voice conversations and flows, I want to be able to test the concept before getting mired in technical challenges. I want to be able to test the vision, not the current technical constraints.
So, how do you fake a voice experience? With audio buttons.
The solution we’re working with comes from my previous work—it’s a dead-simple idea that emerged from a sprawling conversation with Joe Germuska, chief nerd at the Knight Lab. During that project, I wanted to gather more information about how people behave when talking to bots. I needed to make people think they were talking to Alexa, when actually they weren’t. I needed a way to pretend.
What we made turned out to be a great tool for user testing as well. It’s an HTML page of buttons that play pre-recorded Alexa responses. The human at the computer assumes the role of the Alexa software, listening to what people say and playing the proper responses. You’re able to focus on the interaction, before approaching the difficulties of natural language processing.
We’ve been using this “play board” to test some of our initial voice experiments in the Studio, and it’s effective. It even passes a reverse Turing test—when giving feedback, people say things like, “I think it didn’t hear me,” not realizing it was a human error.
For all the time I spent fretting about how to simulate the experience—what if people realize it’s me, so their interactions are tuned to a human instead of Alexa?—turns out, no one bats an eye if you just say you’ll “run Alexa” from a computer.
Fake it ’til you make it—Alexa version
First up: Prep
Generally speaking, you want to have your script ready, plus a couple of required responses for error-handling, like what should happen if a user doesn’t respond after a certain amount of time, or says something totally unexpected. (Which they almost always do. People like being smarter than bots.)
You also want a solid understanding of how the thing “works,” or how the user flows through the experience. (For example, if they answer “no,” play this clip. If “yes,” play this one. If “pizza,” play the error message.)
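If it helps to see that flow written down, here is a minimal sketch of one question as a lookup table. Everything in it is made up for illustration: the clip file names, the `flow` object, and the `pickClip` helper are assumptions, not part of our template. The human operator is still the one listening; this only pins down which clip goes with which answer.

```javascript
// Hypothetical flow for a single yes/no question.
// All clip names are placeholders for your own recordings.
const flow = {
  responses: new Map([
    ["yes", "headlines.mp3"],
    ["no", "goodbye.mp3"],
  ]),
  fallback: "error-unexpected.mp3", // user said something off-script, like "pizza"
  timeout: "error-no-response.mp3", // user said nothing at all
};

// Return the clip to play for what the tester just said.
// Pass null or an empty string when the tester stayed silent.
function pickClip(flow, heard) {
  if (heard === null || heard.trim() === "") {
    return flow.timeout;
  }
  const key = heard.trim().toLowerCase();
  return flow.responses.has(key) ? flow.responses.get(key) : flow.fallback;
}

console.log(pickClip(flow, "pizza"));
```

Writing the table out before a test session is a quick way to spot answers you have no recording for yet.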
Accounts and tools you’ll need
- An Amazon developer account
- A way to record audio from your browser, like Audio Hijack
- Audio editing software, like Audacity
- HTML with buttons to play audio. You can use our template on GitHub or remix our example on Glitch.
- A low-stakes way to test and hone a concept before building anything
How to Make it Yourself: Step by Step
Get a robot to talk for you.
For the real Alexa, you’ll need to set up a test skill in the Amazon developer portal. None of this initial setup matters for now—what you’re after is the Alexa voice simulator in the test section.
For a quicker but less accurate simulation, use text-to-speech on your machine, usually under the accessibility options.
Get your recording setup ready. (If you’re using Audio Hijack, for example, prep the software to capture audio from your browser.)
Paste your script—or pieces of your script—into the simulator to hear Alexa read it.
Realize immediately that some of the words will need to change because Alexa (or your Alexa stand-in) is fussy. If you’re using the Alexa simulator, you can also try your hand at SSML tags to improve the delivery.
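As a rough illustration of the SSML step, the sketch below joins script lines into one `<speak>` document with a `<break>` pause between them, which you could paste into the simulator. The `toSsml` and `escapeXml` helpers, the 400 ms default, and the sample lines are all assumptions for the example; tune the pauses by ear.

```javascript
// Escape characters that would otherwise break the XML in an SSML document.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Join script lines into one <speak> document with a pause between lines.
function toSsml(lines, pauseMs = 400) {
  const pause = `<break time="${pauseMs}ms"/>`;
  return `<speak>${lines.map(escapeXml).join(pause)}</speak>`;
}

// Hypothetical script lines, for illustration only.
console.log(toSsml(["Welcome back.", "Want to hear the headlines?"]));
```

Other tags, such as `<emphasis>` or `<say-as>`, can be added by hand once the pacing sounds right.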
Once you’re happy, record the speech, edit, and save as small audio clips.
Make HTML buttons to play your audio files.
If you download our GitHub template, put your audio files into the “sample audio” folder. Then open the “index.html” file in a text editor and swap out the name of the source audio files in the HTML.
If you remix the Glitch project, drag your audio into the “assets” box. Each file will get a long asset URL, which you can see by clicking on the file. Then click on “index.html” and replace each data-url with the full URL of the audio you want to play.
In either system, you can add more rows by copying and pasting the large “row” sections in “index.html.” (Don’t worry about breaking anything. You can always start from a fresh copy of the project and try again.)
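If you would rather build a board from scratch, the sketch below shows the general idea: one button per clip, with the audio location stored in a data-url attribute and played on click. It is not the markup from our template. The `clips` list, the file paths, the `clip` class, and the `renderBoard` helper are all hypothetical, and the browser part assumes the script runs at the end of the page body.

```javascript
// Hypothetical clips; labels and URLs are placeholders for your own files.
const clips = [
  { label: "Welcome", url: "audio/welcome.mp3" },
  { label: "Error: didn't catch that", url: "audio/error-unexpected.mp3" },
];

// Escape text before placing it inside HTML attributes or element content.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build one <button> per clip, storing the audio location in data-url.
function renderBoard(clips) {
  return clips
    .map(
      (clip) =>
        `<button class="clip" data-url="${escapeHtml(clip.url)}">${escapeHtml(clip.label)}</button>`
    )
    .join("\n");
}

// In a browser, add the buttons to the page and play a clip on click.
// The guard lets the same file load outside a browser without errors.
if (typeof document !== "undefined") {
  document.body.insertAdjacentHTML("beforeend", renderBoard(clips));
  document.addEventListener("click", (event) => {
    const button = event.target.closest("button[data-url]");
    if (button) {
      new Audio(button.dataset.url).play();
    }
  });
}
```

Keeping the clip list in one array means adding a response is a one-line change, which matters when you are rewriting the script between test sessions.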
Happy experimenting! We’d love to hear your methods for audio-bot testing. Reach out to us with any questions, comments, ideas, or feedback at email@example.com.
Updated 2/19/2019 to include the Glitch option.