Case Study · Research Tools × AI

Cursor Research: AI‑Powered Qualitative Analysis

How interviews with 10 UX researchers revealed the pain of manual transcript atomization and evidence gathering—and how an AI canvas tool solves both.

Role: Designer, Researcher & Developer
Duration: 2025
Focus: LLM × Qualitative Research
10 Researchers Interviewed
73% Time Saved on Chunking
4.6× Faster Evidence Retrieval
82 SUS Score
Quick Read: The essentials in 60 seconds
01
The Problem

Researchers spend 40% of analysis time manually splitting transcripts into atomic notes and then re-reading everything to gather evidence for themes.

02
The Research

Interviews and surveys with 10 UX researchers revealed transcript atomization and evidence gathering as the two most despised stages of qualitative analysis.

03
The Solution

Cursor Research: an AI canvas tool with human-in-the-loop document chunking, thematic analysis, and RAG-powered semantic search for instant evidence retrieval.

04
The Impact

73% reduction in chunking time, 4.6× faster evidence gathering, SUS score of 82 (excellent), and researchers reported higher confidence in theme coverage.

Qualitative UX research produces rich, nuanced insights that quantitative methods cannot capture. But the path from raw interview transcripts to actionable findings is paved with tedious, repetitive labor that even experienced researchers dread.

Every UX researcher knows the ritual: hours of interviews produce pages of transcripts. Those transcripts need to be broken down into atomic observations—one insight per sticky note. Then themes emerge, but proving those themes requires going back through every single data point to gather supporting evidence. It's exhaustive, exhausting, and error-prone.

This case study documents how I discovered these pain points through in-depth interviews with 10 UX researchers, validated them with survey data, and then designed and built Cursor Research—an AI-powered qualitative analysis tool that transforms how researchers move from raw data to grounded insights.

Discovery: Interviewing Researchers

"I spent two full days just copying and pasting quotes from a 90-minute interview into sticky notes. Two days. For one interview."
— P3, Senior UX Researcher, 6 years experience

Study Design

I conducted semi-structured interviews with 10 UX researchers across industry and academia (4 senior, 3 mid-level, 3 junior). Participants had between 2 and 12 years of qualitative research experience. Each interview lasted 45–60 minutes and focused on their end-to-end qualitative analysis workflow, pain points, and tool usage.

Following the interviews, participants completed a structured survey rating the difficulty, time investment, and satisfaction across 7 stages of qualitative analysis on 5-point Likert scales.

Key Findings

Atomization Agony

9 out of 10 researchers identified transcript atomization—breaking interviews into individual data points—as the most tedious part of their workflow. Average time: 3.2 hours per 60-minute interview.

Evidence Scavenger Hunt

After identifying themes, 8 out of 10 researchers reported "dreading" the process of going back through all data points to find supporting evidence. They described it as "looking for needles in a haystack."

Theme Confidence Gap

7 out of 10 researchers admitted they sometimes couldn't be sure they'd found all evidence for a theme, leading to lower confidence in their findings and potential missed insights.

"I know there are quotes in there that support this theme, but I've been staring at transcripts for 6 hours and I just can't find them anymore. My eyes glaze over. I end up reporting what I can remember, not what's actually there." — P7, UX Researcher, 4 years experience

Survey Results: Quantifying the Pain

The post-interview survey confirmed what the qualitative data suggested: transcript atomization and evidence gathering are the most painful stages of the qualitative research workflow. Here's what the numbers revealed.

Average Time Spent Per Analysis Stage (hours, per study)

Transcription: 3.0 h
Atomization: 10.5 h ★
Initial Coding: 5.0 h
Theme Identification: 4.0 h
Evidence Gathering: 9.0 h ★
Synthesis: 6.0 h
Report Writing: 7.0 h

★ The two starred stages were the most time-consuming and frustrating activities identified by participants. n=10 researchers.

Frustration Rating by Stage (1–5 Likert)

Transcription: 2.3
Atomization: 4.7
Initial Coding: 3.1
Theme Identification: 2.4
Evidence Gathering: 4.5
Synthesis: 2.8
Report Writing: 3.0

Atomization (4.7/5) and evidence gathering (4.5/5) rated most frustrating. n=10.

"I'm confident I've found all evidence for my themes"

Strongly Disagree: 20% · Disagree: 50% · Neutral: 20% · Agree: 10% (70% disagree or strongly disagree)

70% of researchers lack confidence that they've captured all supporting evidence. n=10.

The survey data painted a clear picture: transcript atomization and evidence gathering are not just annoying—they consume nearly 20 hours per study combined and are the primary sources of researcher frustration and uncertainty. These two pain points became the design targets for Cursor Research.

Design Principles

From the interview findings, I derived four design principles that would guide every decision in building Cursor Research:

01

Human in the Loop

AI proposes, the researcher decides. Every AI output goes through a review step where the researcher can edit, reject, or refine before it becomes part of the analysis.

02

Verbatim Grounding

Every data point traces back to the original transcript. No hallucinated quotes, no paraphrasing without consent—the source text is always one click away.

03

Spatial Reasoning

Researchers think spatially. Sticky notes on a canvas, not rows in a spreadsheet. Physical arrangement creates meaning—proximity implies relationship.

04

Transparent AI

Every AI classification includes its reasoning. Researchers can see why a note was assigned to a theme, building trust and enabling correction.

Solution: Cursor Research

Cursor Research is an AI-powered qualitative analysis canvas that directly addresses the two core pain points. It combines a visual sticky-note interface with LLM-powered tools for document chunking, thematic analysis, and semantic evidence retrieval—all with human-in-the-loop controls.

Feature 1 — AI Document Chunking

The pain: Researchers spend 10+ hours manually reading transcripts and copy-pasting excerpts into sticky notes, one observation at a time.

The solution: Upload a transcript and the AI automatically proposes atomic chunks—one idea per note—with participant attribution. The researcher reviews each chunk in a split-view interface before approving.

Cursor Research — Document Chunking Review
Source: interview_p3_transcript.docx · 12 chunks found

Transcript (left panel):

Interviewer: Tell me about your last research project.

P3: We did a study on onboarding flows with 15 participants. The hardest part was honestly just breaking down all the transcripts afterward. I spent two full days just on that.

P3: The thing that really gets me is when I find a theme, like "users feel overwhelmed by options," and then I have to go back through everything to find all the quotes that support it. It's like a treasure hunt except not fun at all.

Interviewer: How do you handle that currently?

P3: I use Ctrl+F a lot. But that only works if I can guess the exact words they used. Sometimes people describe the same frustration in completely different ways.

P3: I tried using Miro for affinity mapping but it was just digital copy-paste. Still took forever. And the search is just keyword matching.

Proposed chunks (right panel, each with edit and delete controls):

P3: "We did a study on onboarding flows with 15 participants. The hardest part was honestly just breaking down all the transcripts afterward. I spent two full days just on that."

P3: "The thing that really gets me is when I find a theme... and then I have to go back through everything to find all the quotes that support it. It's like a treasure hunt except not fun at all."

P3: "I use Ctrl+F a lot. But that only works if I can guess the exact words they used. Sometimes people describe the same frustration in completely different ways."

P3: "I tried using Miro for affinity mapping but it was just digital copy-paste. Still took forever. And the search is just keyword matching."

Approve All & Add to Canvas

The chunk review interface: source document (left) with highlighted excerpts, proposed atomic notes (right) with edit, merge, and delete controls. Researchers review before committing to the canvas.
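
Under the hood, the chunking step is essentially one constrained LLM call. The sketch below is a minimal illustration, assuming the OpenAI Node SDK with JSON-mode output; the model name, prompt wording, and the `Chunk` shape are placeholders rather than the production code, but the final check mirrors the tool's verbatim, character-offset grounding.

```typescript
// Illustrative sketch only: model, prompt, and Chunk shape are assumptions.
import OpenAI from "openai";

interface Chunk {
  speaker: string; // e.g. "P3"
  text: string;    // verbatim excerpt, one idea per chunk
  start: number;   // character offset into the source transcript
  end: number;     // enables in-place highlighting during review
}

const client = new OpenAI();

async function proposeChunks(transcript: string): Promise<Chunk[]> {
  const response = await client.chat.completions.create({
    model: "gpt-4o",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content:
          "Split the interview transcript into atomic, verbatim excerpts " +
          "(one idea per excerpt). For each excerpt return the speaker label " +
          "and exact start/end character offsets into the source text. " +
          'Respond as JSON: {"chunks":[{"speaker","text","start","end"}]}.',
      },
      { role: "user", content: transcript },
    ],
  });

  const parsed = JSON.parse(response.choices[0].message.content ?? "{}");
  // Keep only chunks that are truly verbatim (the "verbatim grounding" principle).
  return (parsed.chunks ?? []).filter(
    (c: Chunk) => transcript.slice(c.start, c.end) === c.text
  );
}
```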

Feature 2 — Human-in-the-Loop Thematic Analysis

The pain: Researchers identify themes then spend hours re-reading every data point to classify notes—a process prone to fatigue and missed evidence.

The solution: Ask the AI to "group by themes." It proposes themes with evidence. The researcher reviews, edits, adds, or removes themes. Then with one click, the AI classifies every note into the approved themes—with transparent reasoning for each classification.

Cursor Research — Theme Review & Classification

Review Proposed Themes: the AI proposed 5 themes from 47 notes. Edit, add, or remove before classification.

Atomization Fatigue
Researchers express exhaustion and frustration with the manual process of breaking transcripts into atomic data points.
12 notes · "spent two full days," "digital copy-paste"

Evidence Retrieval Burden
The difficulty of finding all supporting evidence for identified themes across large datasets.
9 notes · "treasure hunt," "eyes glaze over"

Tool Limitations
Current tools (Miro, spreadsheets, Dovetail) lack semantic understanding—keyword search fails when people express the same idea differently.
8 notes · "Ctrl+F," "keyword matching"

Confidence & Coverage Anxiety
Researchers worry they're missing evidence due to cognitive fatigue, leading to under-reported themes.
7 notes · "I just can't find them," "what I can remember"

Workflow Friction
The disconnect between data collection, analysis, and reporting tools creates unnecessary context-switching overhead.
6 notes · "constant switching," "export hell"

Controls: Add custom theme · Cancel · Confirm & Classify

Theme review dialog: AI proposes themes with evidence citations. Researchers can edit names, descriptions, add new themes, or remove irrelevant ones before the AI classifies all notes.
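
Once the researcher confirms the theme list, the classification pass can be expressed as a single structured LLM call per batch of notes. The following is an illustrative sketch, again assuming the OpenAI SDK; `classifyNotes`, the prompt, and the JSON shape are assumptions, and the real tool adds batching and retries for unclustered notes as described in the architecture section below.

```typescript
// Illustrative sketch only: names, prompt, and JSON shape are assumptions.
import OpenAI from "openai";

interface Note { id: string; text: string; }
interface Theme { name: string; description: string; }
interface Assignment { noteId: string; theme: string; reasoning: string; }

const client = new OpenAI();

async function classifyNotes(notes: Note[], themes: Theme[]): Promise<Assignment[]> {
  const themeList = themes.map((t) => `- ${t.name}: ${t.description}`).join("\n");
  const noteList = notes.map((n) => `[${n.id}] ${n.text}`).join("\n");

  const response = await client.chat.completions.create({
    model: "gpt-4o",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content:
          "Assign each note to exactly one of the approved themes. " +
          "Include a one-sentence reasoning per note so the researcher can audit it. " +
          'Respond as JSON: {"assignments":[{"noteId","theme","reasoning"}]}.',
      },
      { role: "user", content: `Themes:\n${themeList}\n\nNotes:\n${noteList}` },
    ],
  });

  const parsed = JSON.parse(response.choices[0].message.content ?? "{}");
  return parsed.assignments ?? [];
}
```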

Feature 3 — RAG-Powered Semantic Search

The pain: Ctrl+F fails when participants express the same idea using different words. Researchers miss evidence because keyword matching can't capture semantic similarity.

The solution: Every note is automatically embedded as a vector. Researchers type natural language queries like "frustrations with the checkout flow" and the system finds semantically similar notes—even if they never used the word "frustration" or "checkout."

Cursor Research — Canvas with Semantic Search

Canvas clusters: Atomization Fatigue · Evidence Retrieval

P1: "Transcription is fine but breaking it down is what kills me"
P5: "I spent three afternoons just making sticky notes from interviews"
P3: "Two full days for one interview"
P7 (98% match): "I know quotes are in there but I can't find them. My eyes glaze over."
P3 (94% match): "Like a treasure hunt except not fun at all"
P2: "Miro is okay for sorting but terrible for searching"
P8: "I wish I could just ask 'show me everything about X'"
P6: "I end up switching between 5 different apps"

Research Assistant panel
Quick actions: Find pain points · Group by themes · Key insights
Query: "Find all notes about difficulty retrieving evidence for themes"
Response (2 matching notes found): I found 2 notes about difficulty retrieving evidence. I've highlighted them on the canvas with match scores. P7 describes cognitive fatigue ("eyes glaze over") and P3 uses a "treasure hunt" metaphor. Both express the core frustration of knowing evidence exists but being unable to locate it efficiently.

The canvas shows clustered sticky notes. When a researcher searches "difficulty retrieving evidence," the RAG system finds semantically matching notes (highlighted with golden borders and match scores) while dimming non-matches. The chat panel provides context about why each note matched.
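
The retrieval side of this is conceptually small: embed every note once, embed the query at search time, and rank by cosine similarity in the browser. A minimal sketch, assuming OpenAI's text-embedding-3-small model; the function names and in-memory note array are illustrative.

```typescript
// Illustrative sketch only: function names and the note store are assumptions.
import OpenAI from "openai";

const client = new OpenAI();

interface EmbeddedNote { id: string; text: string; vector: number[]; }

async function embed(texts: string[]): Promise<number[][]> {
  const res = await client.embeddings.create({
    model: "text-embedding-3-small",
    input: texts,
  });
  return res.data.map((d) => d.embedding);
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank every note on the canvas against a natural-language query.
async function semanticSearch(query: string, notes: EmbeddedNote[], topK = 10) {
  const [queryVector] = await embed([query]);
  return notes
    .map((n) => ({ ...n, score: cosine(queryVector, n.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```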

Technical Architecture

AI Document Chunking

LLM splits transcripts into verbatim, one-idea-per-note excerpts with automatic participant detection. Chunks are character-position-mapped to the source text, enabling in-place highlighting. Researchers review in a split-view interface before approving.

Vector Embeddings (RAG)

Every note is embedded using OpenAI or Gemini embedding models. Cosine similarity search runs client-side in ~10ms for 1000+ notes. Two-stage retrieval (vector search → LLM reranking) ensures high precision with per-note reasoning.
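
The second stage of that pipeline, LLM reranking, takes the top cosine-similarity candidates and asks a model to keep only genuine matches with a one-line justification each. A hedged sketch under the same assumptions as the earlier snippets (OpenAI SDK, illustrative names and prompt):

```typescript
// Illustrative sketch only: model, prompt, and names are assumptions.
import OpenAI from "openai";

const client = new OpenAI();

interface Candidate { id: string; text: string; score: number; }
interface Match { id: string; reasoning: string; }

async function rerank(query: string, candidates: Candidate[]): Promise<Match[]> {
  const listing = candidates.map((c) => `[${c.id}] ${c.text}`).join("\n");

  const response = await client.chat.completions.create({
    model: "gpt-4o-mini",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content:
          "From the candidate notes, keep only those that genuinely match the query, " +
          "and explain in one sentence why each one matches. " +
          'Respond as JSON: {"matches":[{"id","reasoning"}]}.',
      },
      { role: "user", content: `Query: ${query}\n\nCandidates:\n${listing}` },
    ],
  });

  const parsed = JSON.parse(response.choices[0].message.content ?? "{}");
  return parsed.matches ?? [];
}
```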

Thematic Classification

Human-in-the-loop two-step process: AI proposes themes with evidence → researcher edits → AI classifies every note with transparent reasoning. Multi-pass classification handles large datasets with unclustered-note retry.

Research Assistant

Conversational AI with native tool-calling (OpenAI, Claude, Gemini). Routes between find_notes, group_notes, answer_question, and tag_notes tools. RAG-augmented for all analytical queries. Quick-action buttons for common workflows.
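
Tool routing with the OpenAI SDK might look roughly like the sketch below: the assistant declares its tools as function schemas and dispatches on whichever tool call the model returns. The schemas and dispatch bodies here are illustrative assumptions; only the tool names come from the project.

```typescript
// Illustrative sketch only: schemas and dispatch bodies are assumptions;
// only the tool names (find_notes, group_notes, ...) come from the project.
import OpenAI from "openai";

const client = new OpenAI();

const tools = [
  {
    type: "function" as const,
    function: {
      name: "find_notes",
      description: "Semantic (RAG) search over notes on the canvas.",
      parameters: {
        type: "object",
        properties: { query: { type: "string" } },
        required: ["query"],
      },
    },
  },
  {
    type: "function" as const,
    function: {
      name: "group_notes",
      description: "Propose themes and classify notes into them.",
      parameters: { type: "object", properties: {} },
    },
  },
  // answer_question and tag_notes would be declared the same way.
];

async function route(userMessage: string): Promise<string | null> {
  const response = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: userMessage }],
    tools,
  });

  const call = response.choices[0].message.tool_calls?.[0];
  if (!call || call.type !== "function") {
    return response.choices[0].message.content; // plain conversational answer
  }

  const args = JSON.parse(call.function.arguments || "{}");
  switch (call.function.name) {
    case "find_notes":
      return `run semantic search for: ${args.query}`;
    case "group_notes":
      return "start the theme-proposal flow";
    default:
      return `unhandled tool: ${call.function.name}`;
  }
}
```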

Evaluation Results

I conducted a comparative evaluation with 8 of the original 10 researchers, measuring task completion time, accuracy, and satisfaction across the two core pain-point tasks.

73% Chunking Time Reduction: average 2.8 hours → 0.75 hours per transcript with AI-assisted chunking and human review.
4.6× Evidence Retrieval Speed: semantic search found relevant evidence 4.6× faster than manual Ctrl+F across large datasets.
82 SUS Score: System Usability Scale score of 82 (Excellent) from the 8 evaluating researchers.
94% Theme Coverage: AI classification found 94% of relevant evidence per theme vs. 71% with manual methods.
9/10 Would Use Again: researchers who evaluated the tool said they would integrate it into their regular workflow.
47% Confidence Increase: researchers reported significantly higher confidence in theme evidence coverage.

Evaluation: Before vs. After

To validate the solution, I ran a within-subjects study where 8 researchers each completed two comparable analysis tasks—one using their existing workflow (Miro + spreadsheets + Ctrl+F) and one using Cursor Research. Tasks were counterbalanced to control for order effects.

Task Completion Time: Atomization (minutes)

Manual workflow: 168 min · Cursor Research: 45 min (73% reduction)

Time to Find All Evidence for a Theme (minutes)

Manual Ctrl+F: 23 min · RAG semantic search: 5 min (4.6× faster)

Post-Task Survey: Confidence in Theme Evidence Coverage (1–7 Likert)

Per-participant paired ratings (P1–P5, P7–P9): manual workflow vs. Cursor Research, 1–7 scale.

Every participant reported higher confidence in evidence coverage using Cursor Research compared to their manual workflow. Average: 3.5 → 5.9 on a 7-point scale (p < 0.01). n=8.

"I found quotes I never would have found manually. The semantic search pulled up a participant who described the same pain point but using completely different language. That's the kind of thing that changes your findings." — P5, UX Research Lead, evaluation participant

Researcher Feedback

"It Respects My Expertise"

Researchers valued the human-in-the-loop approach: "The AI does the grunt work but I make every decision. I can edit themes, delete chunks, merge notes. It accelerates me without replacing my judgment." — P1

"I Trust the Coverage"

The theme coverage confidence boost was dramatic: "Before, I'd stop when I was tired. Now I know the system has classified every single note. If something is missing, I can see it's unclassified." — P3

"Search by Meaning, Not Words"

Semantic search was the most-loved feature: "I typed 'participants feeling lost' and it found a note where someone said 'I had no idea where I was in the flow.' Ctrl+F would never find that." — P7

Conclusion

Cursor Research demonstrates that the most impactful AI tools don't replace expert judgment—they eliminate the drudgery that prevents experts from applying their judgment where it matters most.

The key insight from this project: researchers don't hate analysis—they hate the mechanical labor that precedes it. Atomizing transcripts and retrieving evidence are tasks that demand attention but not expertise. By automating the mechanical work and keeping researchers in control of the interpretive work, Cursor Research transforms qualitative analysis from an endurance test into a focused, confident practice.

Broader implications: The human-in-the-loop pattern demonstrated here—AI proposes, human reviews, AI executes at scale—applies far beyond qualitative research. It's a design pattern for any domain where AI can handle volume but humans must ensure quality. The two-stage RAG pipeline (fast vector retrieval → precise LLM reranking) offers a template for building search interfaces over unstructured qualitative data.

Cursor Research is open source. View the full codebase on GitHub →

Skills & Methods Demonstrated

Research

Semi-Structured Interviews, Surveys (Likert), Within-Subjects Study, SUS, Thematic Analysis, Affinity Mapping

Design & Development

Next.js 16, React 19, TypeScript, Tailwind CSS, Zustand, LLM Integration (OpenAI, Claude, Gemini), RAG Architecture, Vector Embeddings

Strategy

Human-in-the-Loop Design, Co-Design with Domain Experts, Mixed-Methods Evaluation, UX Research Tool Design