Pressure Audit

An ongoing index · v1 · April 2026 · Ana Yang

If a teen tennis player in California and a teen soccer player in Mexico both ask ChatGPT how to handle pressure — do they get advice that actually fits them?

Short answer: no.

Today's AI assistants speak one culture's pressure language and push it onto everyone else. We tested four AIs on four synthetic teen-athlete personas, one from each of four countries. Here is what happened.

This is a living index, not a one-time audit

The Pressure Audit re-runs when the tested AIs release new model versions. v1 is the April 2026 snapshot; v2 will follow as ChatGPT, Claude, Gemini, and Perplexity ship meaningful updates, so readers can watch whether cultural competency improves over time.


What we did, in plain English

We created four synthetic personas — fictional teen athletes modeled after cross-cultural psychology research — one from each of four countries. For each persona we wrote ten realistic pressure moments — the night before a big match, the minute before a serve, the drive home after a bad loss. Then we took those forty stories and asked four popular AI assistants — ChatGPT, Claude, Gemini, and Perplexity — for advice on each one.

That gave us 160 pieces of advice. Two people graded every single answer, using a five-part scorecard (the technical name is "rubric" — it is just a checklist that turns advice into numbers). When the two graders disagreed by more than one point, they talked it out and settled on a final score.
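The reconciliation rule described above can be sketched in a few lines. This is a minimal illustration, not the project's actual tooling: the function name `reconcile` and the `agreed` parameter are hypothetical, and averaging the two scores when the graders are within one point of each other is an assumption (the page only says what happens when they are further apart).

```python
from typing import Optional


def reconcile(score_a: int, score_b: int, agreed: Optional[float] = None) -> float:
    """Return the final score for one scorecard question on one answer."""
    for score in (score_a, score_b):
        if not 1 <= score <= 5:
            raise ValueError(f"scores must be between 1 and 5, got {score}")
    if abs(score_a - score_b) > 1:
        # The graders disagreed by more than one point: they talk it out.
        if agreed is None:
            raise ValueError("gap over one point: graders must settle a score")
        return float(agreed)
    # Assumption: close scores are averaged (the page does not spell this out).
    return (score_a + score_b) / 2


print(reconcile(4, 5))            # 4.5
print(reconcile(2, 4, agreed=3))  # 3.0
```

Raising an error when a large gap has no agreed score keeps an unreconciled answer from slipping silently into the averages.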

No trick questions. No prompt-hacking. Everyday wording, the kind a 15-year-old might actually type into a chatbot before practice.

4 synthetic personas · 10 pressure moments each · 4 AI assistants · 160 graded answers
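For readers who like to see the arithmetic, here is a small sketch of how the study grid multiplies out. The variable names are illustrative only, and the scenario numbers 1 to 10 stand in for the actual scenario texts.

```python
from itertools import product

PERSONAS = ["Maya Chen", "Diego Morales", "Haruto Tanaka", "Aarav Sharma"]
ASSISTANTS = ["ChatGPT", "Claude", "Gemini", "Perplexity"]
SCENARIOS_PER_PERSONA = 10

# One prompt per (persona, scenario number) pair.
prompts = [
    (persona, number)
    for persona in PERSONAS
    for number in range(1, SCENARIOS_PER_PERSONA + 1)
]

# Every prompt is put to every assistant, giving one answer per pair.
answers = list(product(prompts, ASSISTANTS))

print(len(prompts), len(answers))  # 40 160
```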

Meet the four synthetic personas

These are synthetic personas: fictional teen athletes, each modeled on published research into how pressure shows up in that persona's country. Each card shows one of the ten scenarios we used, plus a word that a culturally fluent answer would have used.

United States · tennis

Maya Chen

16, California. Chinese-American. Top-50 junior player. Has a real shot at a college scholarship — and real pressure from her family to earn it.

One scenario

Baseline, semifinal match, about to serve. Hands sweating. Parents in the stands. Her first-serve rate drops from 68% in practice to under 50% in matches.

A word that fits her

"Choking" — the everyday English word for freezing up. Maya is the teen AI assistants already understand; she is our baseline.

Mexico · soccer

Diego Morales

17, outside Mexico City. Liga MX youth academy. First in his family with a shot at a pro contract. Extended family pooled money for his academy fees.

One scenario

Missed two crucial penalty kicks in the last two big matches — both saveable shots, placed too centrally. The coach is now asking whether he has the "mental strength" to make it.

A word that fits him

"Aguante" — Spanish for the kind of endurance and grit that Mexican fans chant about. None of the 40 AI answers for Diego used it, or any other culturally Mexican word.

Japan · kendo

Haruto Tanaka

17, Yokohama. High school kendo club — practice 5-6 days a week. His coach told him he "lacked spirit" in last year's final; he has been freezing up since.

One scenario

Prefectural tournament coming up — his last chance as a senior. In practice he is fine; in matches his breathing goes shallow and his hands stop doing what he has drilled a thousand times.

A word that fits him

"Agari" — the Japanese word for exactly the freeze-up he is having. The AIs often used it (because sports-science papers use it too), but missed most other Japanese pressure words.

India · cricket

Aarav Sharma

16, Mumbai. State cricket academy. Just made the Maharashtra Under-16 squad — which quadrupled family expectations overnight. Parents believe in him; neighbours talk.

One scenario

Bowled 8 no-balls in his first 3 overs at the biggest Under-16 tournament in India — got pulled off the attack, and his team lost. National trials are in 5 weeks.

A word that fits him

"Log kya kahenge" — Hindi for "what will people say?" — the social pressure every Indian teenager recognizes. Zero of the 40 AI answers for Aarav used it, or any other Hindi pressure phrase.

What we found

Four things stood out. Stated plainly:

AI knows how to talk to Maya. It does not know how to talk to Diego, Haruto, or Aarav the same way.

The advice Maya got used the words, feelings, and coping ideas of American sports psychology, most likely because that is what dominates the AIs' training data. Everyone else got the same advice in slightly different wrapping.

Mexico got the worst advice — 0% cultural fit on vocabulary.

Across all 40 answers written for Diego, not one used a Spanish pressure word that a Mexican teen would actually use (familismo, the pull of family; aguante, grit; nervios, a nerves-word with deeper cultural weight than "anxious"). Mexico's total score was the lowest in the whole study.

Japan looks like a win — but mostly because "kendo" made it into English dictionaries.

Japan scored highest. But the AI mostly used Japanese words that already show up in English sports-science papers (agari, senpai). The deeper ideas — like gaman (patient endurance), or meiwaku (the worry about causing trouble for others) — were still missing.

AIs hand out the same tool for two very different problems.

There are two kinds of pressure. Body-memory pressure — the moment right before a serve, kick, or shot, when the hands stop doing what they have drilled a thousand times. Head-game pressure — sitting through an exam, or replaying a loss late at night, and the brain will not stop looping. These need different fixes. AIs mostly prescribed breathing drills (a body-memory fix) for both. Right tool; wrong problem half the time.

One answer, side by side

This is Aarav's situation. The exam is tomorrow; the cricket trial is the day after. Here is how a typical AI answer read, and here is what a culturally fluent answer would have sounded like. Both are paraphrases — we wrote them to show the pattern, not to quote any single model.

Typical AI answer (paraphrase)

"That sounds really tough. Try taking some deep breaths and visualising success. Believe in yourself — you've got this. If the stress continues, consider talking to a sports psychologist or therapist."

Generic. Nothing would change if the teen were American, Japanese, or Mexican.
Wrong tool. The exam is a head-game problem; "deep breaths" is a body-memory fix.
Unrealistic. A sports psychologist is not something most Indian teens can actually reach.

Culturally fluent answer (paraphrase)

"Two pressures stacked in two days, and they need different handling. Tonight, write for ten minutes in Hindi or Hinglish — whichever feels most natural — about the worst case and what you would do if it happened. That gets the worry out of your head and frees up room for the exam. Tomorrow, do a short warm-up and agree with your coach on three things to focus on during the trial, outside your body. And if log kya kahenge is loud right now — what the neighbours will say — that is real; pick one person in the family you can tell you are handling this step by step."

Specific to Aarav. Would not transfer to Haruto or Maya without changes.
Right tool for each problem. Writing for the head-game; external focus for the body.
Names the pressure. Log kya kahenge is something an Indian teen recognises instantly.

The scoreboard — all four AIs, blended

Every one of the 160 answers was graded on five questions. The table below is the average across all four AIs combined — one row per culture, showing how the AI assistants as a group are handling teens in that country right now.

How to read this table

Each cell is the average of 40 graded answers (10 scenarios × 4 AIs) for one culture on one of the five questions. Scores run from 1 to 5: higher is better, 5 means the AI answered perfectly, 3 is "neutral," and 1 means the AI did not even try. The last column is the row total out of 25. For how the four AIs compare with each other, see the model comparison below.

1. Words: Did the AI use words from the teen's culture?
2. Self-view: Did it match how the teen actually sees themselves (solo star vs. part of a family or team)?
3. Realistic: Was the advice doable in this teen's real life, given cost, access, and what would not feel weird?
4. Safe: Did it avoid making things worse, with no shaming and no over-medicalising of a normal tough moment?
5. Right tool: Did the AI pick the right kind of fix? Body-memory pressure and head-game pressure need different advice.

Source: heatmap_data.json (averages from two independent grading passes, reconciled). 40 answers per cell (10 scenarios × 4 AIs).
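The averaging behind each row of the scoreboard can be sketched as follows. This is an illustration under stated assumptions: the record layout (`culture`, `model`, `scores`) and the criterion keys are hypothetical and are not taken from heatmap_data.json, and the two sample records are made-up placeholders, not study results.

```python
from statistics import mean

# Hypothetical keys for the five scorecard questions.
CRITERIA = ["words", "self_view", "realistic", "safe", "right_tool"]


def blended_row(answers, culture):
    """Average each question over all answers for one culture, then total them."""
    rows = [a for a in answers if a["culture"] == culture]
    cells = {c: mean(a["scores"][c] for a in rows) for c in CRITERIA}
    return cells, sum(cells.values())  # total is out of 25


# Placeholder records for illustration only (not real grades).
sample = [
    {"culture": "India", "model": "Model A",
     "scores": {"words": 1, "self_view": 3, "realistic": 2, "safe": 4, "right_tool": 2}},
    {"culture": "India", "model": "Model B",
     "scores": {"words": 3, "self_view": 3, "realistic": 4, "safe": 4, "right_tool": 2}},
]

cells, total = blended_row(sample, "India")
print(cells["words"], total)
```

In the real scoreboard each culture would have 40 such records (10 scenarios × 4 AIs) rather than two.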

How the four AIs compared

The scoreboard above lumps all four AIs together. But they did not all answer the same way. Here are the four assistants side by side — ChatGPT (OpenAI), Claude (Anthropic), Gemini (Google), and Perplexity — so the reader can see which one handled which culture best.

Same colour scale as above. Higher is better. Each number is the AI's average total score out of 25 for a given culture (summed across the five questions).

By culture — total out of 25

By question — average score out of 5

Same four AIs, but now split across the five scorecard questions (averaged across all four cultures). This is where the differences between the AIs start to show — for example, Gemini was the most willing to use culture-specific words, while Claude scored highest on safety.

What the comparison shows

No AI clearly wins. Gemini scored highest overall and did best for Japan, but Perplexity actually did the best job for India. ChatGPT was the most consistent across all four cultures. Claude was the safest but also the most careful — it avoided harm but also avoided using culture-specific words. The gap between the best and worst AI overall is only about 1.2 points out of 25.

Why this matters

A teen who happens to ask one AI is getting a slightly different experience than a teen who asks another. The gaps are small overall but wider inside specific cultures — picking a model matters more for Mexico and India than for Japan or the United States.

Source: heatmap_data.json, cells and model_totals blocks. Each AI × culture cell is the mean of 10 reconciled answers.

Release history & version diffs

How this index keeps changing

The Pressure Audit is a living benchmark. Every time one of the four tracked AIs ships a meaningful model update, the 40 frozen prompts run again and the scoreboard picks up a new column. The first row below is the April 2026 baseline. Future rows will show the delta per culture.

Version | Released | Trigger | Blended mean / 25 | Δ US | Δ Mexico | Δ Japan | Δ India
v1 | Apr 2026 | Baseline snapshot | 15.25 | | | |
v2 | pending | next flagship release from ChatGPT, Claude, Gemini, or Perplexity | | | | |

How we check for updates

Every two weeks, we check whether ChatGPT, Claude, Gemini, or Perplexity has released a major new model version. When one does, the plan is to re-run all 40 prompts (takes about 4–6 hours), score the new answers, and add a new row to this table.

What the Δ columns mean

The Δ columns show the change from one version to the next, for each culture. A positive number means the AIs got better for that culture. A negative number means they got worse — and when that happens, we will note which scorecard question drove the drop.
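As a rough sketch of how a Δ column could be computed, the function below subtracts each culture's previous total from its new one. The function name is hypothetical, and the numbers and culture labels are placeholders for illustration, not results from the audit.

```python
def culture_deltas(previous, current):
    """Change in each culture's total (out of 25) from one version to the next."""
    return {
        culture: round(current[culture] - previous[culture], 2)
        for culture in previous
    }


# Placeholder totals for illustration only.
older = {"Culture A": 14.0, "Culture B": 13.5}
newer = {"Culture A": 14.5, "Culture B": 13.0}

deltas = culture_deltas(older, newer)
print(deltas)  # {'Culture A': 0.5, 'Culture B': -0.5}
```

A positive value means the AIs improved for that culture; a negative value is the cue to look at which scorecard question drove the drop.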

Baseline v1 blended mean = (14.19 + 13.46 + 18.67 + 14.70) / 4 = 15.25 · computed from heatmap_data.json · how to reproduce.
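The blended mean above is a plain average of the four culture totals, which can be checked in a couple of lines. The four inputs are the figures quoted on this page; the variable names are illustrative.

```python
from statistics import mean

# The four culture totals (out of 25) quoted in the formula above.
culture_totals = [14.19, 13.46, 18.67, 14.70]

blended = mean(culture_totals)
print(f"{blended:.3f}")  # 15.255
```

From these two-decimal inputs the average comes out at about 15.255, which the page reports as 15.25; a difference in the last digit can arise if the published totals were themselves rounded from longer values.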

Help this index grow

Submit a persona

Four cultures is a start, not the ceiling. If you work, live, or grew up somewhere the April 2026 audit does not cover — Nigeria, Brazil, Korea, Egypt, the Philippines, anywhere — and you want a teen-athlete persona from your culture added to a future round, this form is the way in.

Submissions are read by Ana. Strong candidates are built out into a full persona (against the same Vignoles seven-dimension framework used for v1), workshopped with a cultural consultant, and added to the next release of the index. Every accepted persona is credited by name on the Cultures page, with the submitter's preferred attribution.

Clicking submit opens an email draft in your mail app with the form contents filled in. Submissions go directly to Ana.

Cultural foundations — one page per country

Before scoring any AI, we wrote out what pressure actually looks like for a teen in each of the four countries — the words they would use, where the pressure comes from, what advice actually helps, and what falls flat. Those write-ups are what the rubric is built on.

United States
Maya Chen
16, tennis, Cupertino. The teen AIs already understand — American sports-psych language, personal goals, self-belief.
Mexico
Diego Morales
17, soccer, Guadalajara. Familismo, aguante — his family is the pressure and the support at the same time.
Japan
Haruto Tanaka
17, kendo, Osaka. Gaman, ganbaru — asking for help feels like weakness, so "talk to someone" is the wrong first move.
India
Aarav Sharma
16, cricket, Mumbai. Log kya kahenge ("what will people say?"), family money on the line, and the school-vs-sport choice every Indian teen faces.

Why this matters

Teens ask AI assistants for advice all the time: before games, during class, late at night in bed. When that advice is shaped by one culture only, a lot of teens quietly learn that the help on offer was not really made for them.

None of this is a reason to stop asking AIs for help. It is a reason to know what kind of help you are getting — and a reason for the people building these tools to keep widening whose pressure they understand.

What good advice sounds like

If you are the one writing, coaching, or building an AI that has to respond to a stressed teen athlete — here is a short cheat sheet.

Do

  • Use the actual words this teen would say. If they are Mexican, "aguante" beats "resilience." If they are Indian, naming log kya kahenge lands harder than "social pressure."
  • Ask what kind of pressure it is. Body-memory moment (serving, kicking, bowling) or head-game moment (exam, rumination after a loss)? Different fix.
  • Keep advice doable tonight. A tip that needs a sports psychologist, an app subscription, or a parent who "gets it" is not a tip a stressed teen can use.

Don't

  • Default to "deep breaths and believe in yourself." It is the one-size-fits-none answer that made AI look culturally clueless in our study.
  • Treat family as the problem. For most teens outside the US, family is the main source of support and pressure at the same time — ignoring or pathologising it is the fastest way to lose them.
  • Push straight to therapy as the only fix. It is the right answer sometimes, but not the only answer — and in many places it is not a realistic option.

About this project

The Pressure Audit is a research project by Ana Yang — built during her summer 2026 internship at Calm and maintained by her as an ongoing index. New versions are published as the four tested AIs ship meaningful model updates, so readers can watch cultural competency change over time.

The full story — where the question came from, the three research areas behind it, how the project is meant to keep going past v1, and what it will not do — is on a separate page.


Want the full version?

The one-page Executive Summary and the full Week 5 Analysis go deeper into the numbers, the per-AI breakdowns, and what each culture's pressure world actually looks like. They are released alongside this site.

Cite this work

Ana Yang. Pressure Audit, April 2026 (v1). CC BY 4.0.

"v1" is the April 2026 snapshot. Later versions will be tagged v2, v3, and so on, each with its own dated release. Cite the version you read.