Why we built custom AI scenarios | StreetTongue

Two years ago I had a dentist appointment in Roma Norte.

I’d been living in Mexico City for eight months at that point. I could order at a taquería, ask for directions, chat with the guy at the corner store. My Spanish was functional. I was making progress.

Then I got a toothache.

My upper left molar had been sensitive to cold for about two weeks. Nothing unbearable. But enough that I made an appointment.

I opened every language app I had. Duolingo. Babbel. An older Rosetta Stone subscription I’d never fully used. I spent an hour the night before trying to find something that would help me explain what was happening. I knew how to say “I have a toothache.” I didn’t know how to say that the sensitivity was specifically to cold, that it was the upper left molar, that it had started two weeks ago and was getting slightly worse.

None of the apps had what I needed. Not even close.

I showed up to that appointment with a screenshot of a Google Translate output and a lot of pointing. The dentist was patient. It worked out. But I left feeling like I’d failed at something I should have been able to handle by now.

The Problem With Fixed Libraries

Every language app is built around a content library. Someone at the company writes a script for a doctor visit. Someone else writes one for a restaurant, an airport, a hotel check-in. They record audio, build exercises, ship it.

This works fine for getting started. You learn the basics. You build vocabulary.

But it breaks the moment real life shows up.

Real life doesn’t give you the script the app wrote. Real life gives you a dentist who asks a follow-up question the app didn’t prepare you for. A landlord who negotiates differently than the textbook version of a landlord. A first date who uses slang you’ve never seen in any lesson.

The content teams at language apps are smart and work hard, but they can’t write every scenario. No one can. The moment you need something specific, you’re on your own.

I kept waiting for someone to solve this. I downloaded new apps every few months hoping something had changed. It hadn’t.

The Insight

At some point it became obvious that this was an AI problem.

The reason content libraries run out is that humans have to write every piece of content. That’s a hard constraint that AI removes.

The harder question was: can you get AI to generate conversations that actually sound like how people talk in a specific city? Not generic textbook Spanish. Not “correct” Spanish. Real Mexico City Spanish, the kind you’d hear from a receptionist in a Roma Norte clinic, from a landlord in Tepito, from someone on a first date in Condesa.

The answer is yes, with work.

The key was grounding the AI in the same dialect data that powers the rest of StreetTongue. Chilango slang, neighborhood context, local registers, how formality works in CDMX, what phrases mark you as an outsider. Feed that in correctly, with the right prompting structure, and the generated output starts to feel real.
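To make the grounding idea a bit more concrete, here is a minimal sketch of how a generation prompt might be assembled from dialect data. Everything in it is an assumption made for illustration: `DialectProfile`, its fields, and `build_prompt` are hypothetical names, not StreetTongue's actual code, and the sample phrases are placeholders rather than entries from our phrase library.

```python
from dataclasses import dataclass, field


@dataclass
class DialectProfile:
    """Hypothetical container for the dialect data fed into generation."""
    city: str
    neighborhood: str
    register: str  # e.g. "formal (usted)" or "casual (tú)"
    local_phrases: list[str] = field(default_factory=list)
    outsider_markers: list[str] = field(default_factory=list)


def build_prompt(user_request: str, profile: DialectProfile) -> str:
    """Assemble a prompt that grounds the model in local dialect data."""
    lines = [
        f"Write a practice conversation set in {profile.neighborhood}, {profile.city}.",
        f"Learner's situation: {user_request}",
        f"Register: {profile.register}",
    ]
    # Only mention phrase guidance when the profile actually carries some.
    if profile.local_phrases:
        lines.append("Prefer local phrasing such as: " + "; ".join(profile.local_phrases))
    if profile.outsider_markers:
        lines.append(
            "Avoid phrasing that marks the speaker as an outsider: "
            + "; ".join(profile.outsider_markers)
        )
    return "\n".join(lines)


# Placeholder data, chosen only to show the shape of the input.
profile = DialectProfile(
    city="Mexico City",
    neighborhood="Roma Norte",
    register="formal (usted)",
    local_phrases=["ahorita", "mande"],
    outsider_markers=["vosotros forms"],
)
prompt = build_prompt("Explain that my upper left molar is sensitive to cold", profile)
print(prompt)
```

The point of the sketch is that the learner's request is only one input. The neighborhood, the register, and the list of phrases to prefer or avoid travel with it, which is what keeps the output from drifting back to textbook Spanish.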

We spent months tuning this. We tested it on situations we’d actually faced. We had people who live in Mexico City read the outputs and tell us when something felt off. We fixed it. We ran it again.

What We Built

Here’s how it works.

You open the app, go to custom scenarios, and type what you want to practice. It can be anything. A medical situation, a housing negotiation, a job interview, explaining a dietary restriction at a restaurant. Whatever you’re actually facing.

You tap generate. StreetTongue builds a multi-turn practice conversation with a local character who responds the way a real person in that neighborhood would respond. The conversation has structure: clear goals, a realistic time frame, success criteria. It’s not just a chat.
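One way to picture "structure, not just a chat" is as a small data model. The sketch below is illustrative only: `Scenario`, `Turn`, and their fields are hypothetical names, the time frame is assumed to be measured in minutes, and the sample lines are written by hand for this post rather than taken from real generated output.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    speaker: str  # "character" or "learner"
    text: str


@dataclass
class Scenario:
    title: str
    goals: list[str]          # what the learner is trying to get across
    time_frame_minutes: int   # assumed unit: minutes
    turns: list[Turn] = field(default_factory=list)

    def remaining_goals(self, completed: set[str]) -> list[str]:
        """Goals the learner has not yet accomplished, in original order."""
        return [goal for goal in self.goals if goal not in completed]

    def is_success(self, completed: set[str]) -> bool:
        """Success criterion in this sketch: every goal has been met."""
        return not self.remaining_goals(completed)


scenario = Scenario(
    title="Dentist reception, Roma Norte",
    goals=["say where the pain is", "say how long it has lasted"],
    time_frame_minutes=5,
    turns=[
        Turn("character", "¿En qué le puedo ayudar?"),
        Turn("learner", "Me duele una muela de arriba, del lado izquierdo."),
        Turn("character", "¿Desde cuándo le molesta?"),
        Turn("learner", "Desde hace dos semanas."),
    ],
)

print(scenario.remaining_goals({"say where the pain is"}))
```

Treating goals as data rather than as part of the dialogue text is what lets a scenario have a clear end point: the practice session is done when the list of remaining goals is empty, not when the chat happens to trail off.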

Then you practice it. You record each of your lines. You get pronunciation feedback the same way you do with any other phrase in the app. Word-by-word. You can hear the reference audio. You can replay it as many times as you want.

If you don’t like the scenario, discard it. If it’s exactly what you needed, save it. Build a library of the specific situations that matter to you.

Three examples that came up during testing:

The dentist situation I described above was actually the first one I tested. The generated conversation started with a receptionist asking how they could help, then me explaining where the pain was, then the receptionist asking how long it had been bothering me. Four turns in, I’d said everything I needed to say. I ran it three times before the actual appointment. It went fine.

The landlord negotiation one was trickier. I needed to push back on a rent increase while staying on good terms with someone I was going to keep living next to. The generated scenario got the register right. Respectful but direct. Specific to the social dynamic in CDMX, where tenant-landlord relationships have a particular texture that isn’t captured in any app I’ve used.

The food allergy one happened because a friend with a serious allergy was coming to visit. We practiced how to explain it at restaurants in a way that would be taken seriously. That’s a situation with real stakes. The app won’t save your life, but there is value in practicing the words ahead of time so they come out clearly when it matters.

Why This Isn’t Just ChatGPT

I know what some of you are thinking. You’re thinking: can’t I just use ChatGPT for this?

You can, sort of. You can prompt ChatGPT to generate a practice conversation. You might even get something decent.

But there are three things StreetTongue does that ChatGPT doesn’t.

First, the structure. A practice scenario isn’t just dialogue. It has goals you’re trying to accomplish, turns that progress toward a realistic outcome, and a clear end point. ChatGPT gives you chat. We give you something you can actually practice against.

Second, the dialect tuning. Telling ChatGPT “respond in Mexico City Spanish” produces something that sounds like Mexico City Spanish to someone who doesn’t live in Mexico City. Grounding the generation in our actual phrase library and cultural data produces something that sounds like Mexico City Spanish to someone who does.

Third, the integration. After you generate a scenario, you practice it in the same app where you’ve been building your pronunciation. Every line gets the same word-by-word scoring. Everything saves to the same library. You don’t leave the flow to go use another tool and come back.

Who This Is For

Honestly, this feature is for anyone who’s ever opened a language app the night before something important and not found what they needed.

It’s for the person with the job interview in a language they’re still learning. The person who needs to explain a medical situation to a specialist. The person who wants to have a real conversation on a first date instead of retreating to tourist-level phrases.

The feature is available with Complete ($149), Premium ($249), and All Cities Lifetime ($449). Complete gives you 5 custom scenarios per month. Premium gives you 30. All Cities Lifetime is unlimited.

If you’re not sure which tier is right for you, most people start with Complete. It covers everything you need for the situations that actually come up. If you find yourself generating scenarios constantly, upgrading to Premium is easy.

You can learn more at streettongueapp.com/custom-scenarios or go straight to pricing if you want to compare tiers.

The thing I wanted from a language app for years is now in the app. I’m glad it’s finally there.

Dax

