Speech Lead, UX Engineer II, Microsoft, 2010-2011
I led the speech design effort to make the entire Xbox 360 Dashboard and its partner applications a natural user interface (NUI), meaning the entire system would work end-to-end with voice and gesture.
The Problem
The initial release of Kinect supported NUI, but only inside the Kinect Hub. NUI was a second-class citizen in the overall UI, cordoned off from the regular Dashboard interaction.
Following the success of Kinect, the team decided to redesign the entire Xbox 360 system to support voice, gesture, and controller.
The original Xbox 360 Dashboard was created for the controller. It did not work well for NUI.
The new Dashboard used the existing hardware, carried similar constraints (we could only recognize 20 commands at a time), and was allotted one year so it would ship in time for Holiday 2011. I was the dedicated speech designer on an agile development team, meeting daily for scrum, collaborating with engineering and PM, and sharing updates with Xbox management and the separate Speech team.
What I Did
Voice Interaction Flow
I created a framework to explain the components of a good speech interaction to the various Xbox 360 teams. I shared it with the core Xbox design team and with as many PM and engineering counterparts as I could find.
Speech Style Guide
I wrote a speech design style guide for partner teams to ensure consistency in the interaction between the Dashboard (OS) and the applications. I also stayed in close contact with the partner tools team, which was developing the core assets, to make sure their designs stayed aligned with the frequently changing design of the Dashboard.
What Shipped
At launch, the Xbox 360 Dashboard was navigable by voice from end to end: you could find content, launch it, and continue using your voice inside the game or application. It was a huge improvement over the previous system, where NUI was relegated to a small area with limited functionality. The release was well received, especially the new search feature.
Insights
We found that voice usage for search was not as high as some had hoped. In retrospect, we had underestimated the power of autocomplete with the controller.
It was a good reminder of a lesson in multimodal design: always consider the strengths and weaknesses of each competing modality. While search is one of the best use cases for voice, it has to be weighed against gesture and the controller. The controller keyboard was cumbersome overall, but if you can get to a newly released game by typing just two characters, that can be faster and easier than saying the title's name. Users understood this tradeoff and voted with their thumbs.