apples, oranges, and synthetic users.
Reflecting on the first experiment of my PhD.
In 2019, I ran a study that didn’t make sense to most people at the time.
Back then, personas were still static artifacts — empathy on paper.
I wondered what would happen if we could talk to them instead.
The idea was simple: an early LLM, seeded with persona data, that designers could interview, test ideas with, or even argue against.
A small experiment — but it asked a question the field wasn’t ready for.
We submitted the paper to CHI 2020. Rejected:
“Comparing traditional personas with interviewing persona representations is like comparing apples and oranges.”
They weren’t wrong. There was not yet a taxonomy for evaluating synthetic users.
Reviewers called it high originality — and low rigor.
A chatbot wasn’t a persona; personas weren’t supposed to talk back.
So the paper sat.
Four years later, the same “unsound comparison” quietly became the field.
AI in design, synthetic users, automatically generated personas — all normalized.
And the same paper, unchanged, was accepted by Cambridge’s AIEDAM with no critique.
It’s strange watching the world circle back to something you once had to justify existing.
Ironically, the question that everyone has debated for the last five years —
Can we replace user personas with Synthetic User chatbots? —
was already answered in that early study.
No.
The paper actually supported designers who preferred traditional methods.
We showed 5 years ago that chatbots didn’t make the design process “better”.
Designers didn’t become more creative or empathetic; they just talked longer.
To an illusion of a person.
And the illusion talked back.
The setup was simple.
Two groups of designers (10 each).
One worked with a traditional persona — a static one-pager describing a fictional user named Natalie: age, job, habits, motivations.
The other group interacted with a chatbot version of Natalie, powered by an LLM — same data, but it could talk.
Both groups were asked to design a product or service to make Natalie’s life better.
We measured how they worked, what they said, what they produced — even analyzed their language for empathy, insight, and creativity.
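For readers curious what "seeding a chatbot with persona data" looks like in practice, here is a minimal, hypothetical sketch. The persona fields and the prompt wording are illustrative assumptions, not the study's actual materials; the resulting string would be used as the system message of whatever chat LLM API you have on hand.

```python
# Hypothetical sketch: turning a static one-pager persona into a system
# prompt that lets designers "interview" the persona via a chat model.
# The persona record below is made up for illustration.

PERSONA = {
    "name": "Natalie",
    "age": 34,
    "job": "freelance graphic designer",
    "habits": "works late, commutes by bike, heavy podcast listener",
    "motivations": "wants more predictable income and more time for family",
}

def persona_system_prompt(persona: dict) -> str:
    """Flatten persona fields into a role-playing system prompt."""
    facts = "; ".join(f"{key}: {value}" for key, value in persona.items())
    return (
        "You are role-playing a user persona in a design interview. "
        f"Stay in character and ground your answers in these facts: {facts}. "
        "If asked about something the facts do not cover, improvise "
        "cautiously and stay consistent with your earlier answers."
    )

prompt = persona_system_prompt(PERSONA)
# `prompt` would then be sent as the system message of a chat LLM call.
```

The interesting design choice is the last instruction: a persona chatbot has to improvise beyond its one-pager, which is exactly where the "illusion of a person" comes from.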
The results? Almost identical.
Both groups generated the same number of ideas — 173 each.
No measurable difference in idea quantity, originality, or topic diversity.
The Synthetic User group didn’t outperform the traditional persona group on any creativity or empathy metric.
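A claim like "no measurable difference" between two groups of 10 typically rests on a nonparametric test. The sketch below shows one plausible way to run that comparison; the per-participant counts are invented for illustration (only the group totals of 173 ideas come from the post), and the choice of Mann–Whitney U is my assumption, not necessarily the paper's analysis.

```python
# Hypothetical analysis sketch: comparing idea counts between the two
# groups (n = 10 each) with a two-sided Mann-Whitney U test.
# Per-participant numbers are fabricated; each list sums to 173.
from scipy.stats import mannwhitneyu

persona_ideas = [15, 18, 20, 16, 17, 19, 18, 17, 16, 17]  # static persona group
chatbot_ideas = [14, 19, 21, 15, 18, 18, 17, 18, 16, 17]  # chatbot group

stat, p = mannwhitneyu(persona_ideas, chatbot_ideas, alternative="two-sided")
# A large p-value here would mean no detectable difference in idea quantity.
```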
But behaviorally, they were different.
Designers working with the chatbot spent longer on the task — on average 42% more time.
They asked more follow-up questions, revisited earlier assumptions, and spoke in longer turns.
Some treated the chatbot as a collaborator, others as a skeptical peer.
A few even apologized to it.
The chatbot didn’t make designers better.
But it did make them more reflective.
Designers anthropomorphized, argued, projected.
If I could offer an interpretation: they weren’t learning more about users — they were externalizing themselves.
When we first submitted the work in 2020, reviewers didn’t know what to do with it.
They wanted the concept to fit more neatly into existing frameworks, to have a precedent — something familiar.
But there wasn’t one.
That’s the real issue: academia can no longer keep up with the innovation rate of the real world.
Publication doesn’t reward discovery; it rewards alignment.
Most publishing isn’t about communicating knowledge — it’s about passing a gatekeeping ritual built around legibility.
Here is the paper, if you want to read it!