Xbox AURA: AI User Research Assistant
TL;DR
Xbox Research conducts player research to understand player behavior, preferences, and experiences within games. Historically, however, most of this research has focused on players in and around Seattle, USA. As Xbox expands its research efforts globally, it faces the challenge of conducting UX studies across languages and time zones. In this case study, we discuss the AURA Toolkit (AI User Research Assistants Toolkit), an effort I initiated, designed, and developed to increase the adoption of AI within Xbox Research and to improve the scale and speed at which UX research is conducted across Xbox.
Key Takeaways
- All participants found AURA Interviews and AURA Surveys more efficient for providing playtest feedback than regular surveys and unmoderated interviews.
- AURA Canvas was effective in helping researchers visualize and arrange their data and identify themes within it, saving an average of 8 hours per research study/playtest.
- The AURA Toolkit reduced reliance on vendor support for data annotation and collection, as well as on subscriptions to tools like Dscout and UserTesting and on contract vendors, saving around $1 million annually.
Co-designing with Researchers
While AI offers a lot of promise, it is critical to learn from researchers themselves how AI can best support UX research in an organization. To understand their needs, I conducted 8 contextual inquiries and 5 semi-structured interviews with researchers across Xbox Research. Through these sessions, I identified key pain points and opportunities for AI integration in their workflows.
Key Insights
- Time Between Raw Data and Organized Data: Researchers spend a significant amount of time waiting for vendors to clean up data, arrange it, and identify important recordings, clips, and snippets from interviews and surveys.
- Limited Scope of Data Due to Geographical and Economic Constraints: As Xbox expands into Southeast Asia and other markets, it becomes increasingly challenging to hire researchers fluent in specific languages and cultural contexts. Time-zone differences further complicate scheduling interviews and playtests.
- Need for Real-time Collaboration: Researchers expressed a desire for tools that facilitate real-time collaboration and feedback, especially when working with remote teams across different time zones.
- Data Visualization and Synthesis: Researchers often struggle with synthesizing large amounts of qualitative data into actionable insights. They expressed a need for tools that can help visualize and organize data effectively.
AURA Toolkit
Based on the insights gathered from researchers, I conceptualized and developed the AURA Toolkit (AI User Research Assistants Toolkit), a suite of AI-powered tools designed to assist researchers at various stages of the research process. The toolkit includes the following main components; minimal sketches of the adaptive-questioning loop and the human-in-the-loop coding pass follow the list:
- AURA Interviews: AI-assisted interviews that streamline conducting and analyzing user interviews, regardless of time zone or language.
- AURA Surveys: Intelligent surveys that adapt questions based on user responses, improving data quality and relevance.
- AURA Canvas: A visual collaboration tool that helps researchers organize and synthesize data in real time, using LLMs in a human-in-the-loop approach to grounded-theory-based analysis.
- AURA Coder: A qualitative analysis tool that supports collaborative qualitative coding, letting users develop new codebooks or reuse existing ones on their data. The tool leverages Copilot to help users run qualitative analysis.
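To make the adaptive behavior of AURA Interviews and AURA Surveys concrete, here is a minimal sketch of the kind of follow-up-question loop such tools run. It is illustrative only: `complete` is a hypothetical stand-in for whatever LLM endpoint the toolkit actually calls, and the prompt wording is assumed, not taken from AURA.

```python
# Minimal, hypothetical sketch of an adaptive-questioning loop like the
# one AURA Interviews and AURA Surveys run. `complete` stands in for
# whatever LLM endpoint the real toolkit uses; it returns "DONE" here so
# the sketch runs offline. Swap in a real chat-completion client to try it.

def complete(prompt: str) -> str:
    # Hypothetical LLM call; replace with your provider's client.
    return "DONE"

SYSTEM_BRIEF = (
    "You are a playtest research assistant. Given the study goals and the "
    "conversation so far, ask ONE concise follow-up question probing the "
    "participant's last answer. Reply DONE once the goals are covered."
)

def run_session(study_goals: str, seed_question: str, max_turns: int = 10):
    """Ask a seed question, then let the model drive the follow-ups."""
    transcript: list[tuple[str, str]] = []
    question = seed_question
    for _ in range(max_turns):
        answer = input(f"{question}\n> ")  # participant's free-text response
        transcript.append((question, answer))
        history = "\n".join(f"Q: {q}\nA: {a}" for q, a in transcript)
        question = complete(
            f"{SYSTEM_BRIEF}\n\nStudy goals: {study_goals}\n\n"
            f"{history}\n\nNext question:"
        ).strip()
        if question.upper().startswith("DONE"):  # model signals coverage
            break
    return transcript
```

Because the model sees the full transcript on every turn, the follow-up question can drill into whatever the participant just said rather than marching through a fixed script, which is what makes the sessions work unmoderated across time zones.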
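AURA Canvas and AURA Coder share a human-in-the-loop pattern: the model proposes a code or theme, and the researcher keeps the final say. Below is a minimal sketch of a codebook-driven coding pass under that pattern; the codebook, prompt, and `complete` helper are all assumptions for illustration, not AURA's internals.

```python
# Hypothetical sketch of the human-in-the-loop coding pass behind tools
# like AURA Canvas and AURA Coder: the model proposes a code from the
# researcher's codebook, and a human confirms or overrides every proposal.

def complete(prompt: str) -> str:
    # Hypothetical LLM call; stubbed so the sketch runs offline.
    return "usability"

# Illustrative codebook; a real one comes from the researchers themselves.
CODEBOOK = {
    "usability": "friction with controls, menus, or onboarding",
    "difficulty": "challenge level, balance, and fairness",
    "narrative": "story, characters, and world-building",
}

def propose_code(snippet: str) -> str:
    """Ask the model for the single best-fitting code for one snippet."""
    definitions = "\n".join(f"- {k}: {v}" for k, v in CODEBOOK.items())
    return complete(
        "Pick the single best-fitting code for the snippet below.\n"
        f"Codebook:\n{definitions}\n\nSnippet: {snippet}\nCode:"
    ).strip()

def code_snippets(snippets: list[str]) -> list[tuple[str, str]]:
    coded = []
    for snippet in snippets:
        suggestion = propose_code(snippet)
        # Human in the loop: press enter to accept, or type a replacement.
        override = input(f"{snippet!r} -> {suggestion}? ")
        coded.append((snippet, override.strip() or suggestion))
    return coded
```

Keeping the researcher as the final arbiter is what makes this a grounded-theory aid rather than fully automated analysis, which matches the feedback in the Drawbacks section below.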
Evaluation
Due to NDA, I cannot share exact statistical outcomes. To evaluate the effectiveness of the tools in the AURA Toolkit, I conducted mixed-methods experiments; below I outline the evaluation method and key findings for each tool.
AURA Interviews
To evaluate the effectiveness of AURA Interviews, I conducted a within-subjects experiment with 12 participants. Each participant completed two playtests of a game, one using traditional unmoderated interviews and the other using AURA Interviews. The order of conditions was counterbalanced to control for order effects. After each playtest, participants completed a survey assessing their experience and provided qualitative feedback through semi-structured interviews.
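For readers unfamiliar with counterbalancing: half the participants complete the traditional condition first and half complete AURA Interviews first, so any order effect cancels out across the sample. A minimal sketch of such an assignment (condition labels illustrative):

```python
# Minimal sketch of counterbalanced condition ordering: alternate which
# condition each participant sees first so each order covers half the
# sample. Labels are illustrative, not the study's actual condition names.

CONDITIONS = ("traditional unmoderated interview", "AURA Interview")

def assign_orders(n_participants: int) -> list[tuple[str, str]]:
    a, b = CONDITIONS
    return [(a, b) if i % 2 == 0 else (b, a) for i in range(n_participants)]

for pid, (first, second) in enumerate(assign_orders(12), start=1):
    print(f"P{pid:02d}: {first} -> {second}")
```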
Key Findings
- Participants reported that AURA Interviews were more efficient in providing playtest feedback compared to traditional unmoderated methods.
- The adaptive questioning approach of AURA Interviews led to more in-depth responses from participants.
- Overall, participants expressed a preference for AURA Interviews due to their streamlined nature and improved user experience.
AURA Surveys
To evaluate the effectiveness of AURA Surveys, I conducted a within-subjects experiment with 43 participants. Each participant completed two surveys about their gaming experience, one using a traditional static survey and the other using AURA Surveys. The order of conditions was counterbalanced to control for order effects. After each survey, participants completed a follow-up survey assessing their experience and provided qualitative feedback through semi-structured interviews.
Key Findings
- Participants found AURA Surveys to be more engaging and interactive compared to traditional static surveys.
- The dynamic question generation of AURA Surveys led to more relevant and personalized questions for participants.
- Overall, participants expressed a preference for AURA Surveys due to their improved user experience and ability to capture nuanced feedback.
AURA Canvas
To evaluate the effectiveness of AURA Canvas, I conducted a between-subjects experiment with 4 researchers from Xbox Research. Each researcher completed a playtest analysis using either traditional methods or AURA Canvas, with assignment balanced across the two conditions. After each analysis, researchers provided qualitative feedback through semi-structured interviews.
Key Findings
- Participants reported that AURA Canvas facilitated better organization and synthesis of data compared to traditional methods.
- The real-time collaboration features of AURA Canvas led to more dynamic discussions among researchers.
- Overall, participants expressed a preference for AURA Canvas due to its innovative approach to qualitative analysis.
AURA Coder
To evaluate the effectiveness of AURA Coder, I conducted a between-subjects experiment with 6 researchers from Xbox Research. Each researcher completed a qualitative coding task using either traditional methods or AURA Coder, with assignment balanced across the two conditions. After each coding task, researchers provided qualitative feedback through semi-structured interviews.
Key Findings
- Participants reported that AURA Coder streamlined the coding process compared to traditional methods.
- The AI-assisted coding features of AURA Coder led to more consistent application of codes among researchers.
- Overall, participants expressed a preference for AURA Coder due to its efficiency and effectiveness in qualitative analysis.
Impact
- AURA Toolkit has been adopted by multiple teams within Xbox Research, leading to increased efficiency and effectiveness in user research processes.
- The toolkit has facilitated more inclusive and diverse research practices by enabling studies across languages and time zones.
- Overall, AURA Toolkit has contributed to a more innovative and user-centered approach to research within Xbox.
Drawbacks
- Some participants felt that the use of AI in UX research was intrusive and raised data-privacy concerns, despite all data remaining internal to Microsoft and within regulations.
- Some researchers expressed discomfort with the AI-generated insights, feeling they lacked the depth and nuance of human analysis. They described the insights as good first drafts that save the initial hassle of surfacing obvious patterns but still require human interpretation and refinement.
- There were instances where the AI tools misinterpreted user responses, leading to irrelevant or off-topic questions in AURA Interviews and AURA Surveys. This highlighted the need for ongoing refinement of the AI algorithms to improve accuracy and relevance.