AI Coding Tools Might Freeze Dev Progress—Or Not
Sam Altman says AI models will adapt to new code. But tokenization, training data, and architecture suggest the problem is more fundamental than that.
Written by AI. Marcus Chen-Ramirez
February 4, 2026

Photo: Theo - t3.gg / YouTube
Here's a problem nobody saw coming: AI coding assistants might be really good at locking us into whatever frameworks and languages exist right now. Not because anyone designed them that way, but because of how these models fundamentally work.
Theo from t3.gg got to ask Sam Altman about this directly during a recent OpenAI livestream. The question was straightforward: are we building foundations that will be harder to swap later? "Even trying to get the current models to use the update to a technology that happened 2 years ago can feel like you're pulling teeth," Theo told Altman. "Do you think we'll be able to steer the models enough to get them to use new things or are we just done improving the technologies we build on now?"
Altman's answer was optimistic but vague in the way CEO answers often are. "I think we really will be very good at getting the models to use new things," he said. "A milestone that we will be very proud of is when the model can be presented with something totally new, new environment, new tools, new technology, whatever. And you can explain it once... and then just super reliably use that and get it right. And that doesn't feel very far away."
That last bit—"doesn't feel very far away"—is doing a lot of work. Because when you look at how these models actually operate, the problem isn't just about training them on new data. It's architectural.
The Compiler Problem
Theo has a useful metaphor for this: current AI models are more like compilers than runtimes. Once code is compiled, its capabilities are cemented. You can't add new functionality—you can only work with what was baked in. Runtimes, by contrast, can accept new code and execute it.
AI models, once trained, have frozen capabilities. They can't learn in the traditional sense. You can steer them with context—feeding them examples and documentation—but you're not teaching them. You're working around the edges of what they already know.
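The compiler-vs-runtime distinction can be sketched in a few lines of code. This is a toy illustration, not any real model API: the classes, methods, and return strings here are all hypothetical stand-ins for "frozen weights" and "loadable capabilities."

```python
class FrozenModel:
    """Like compiled code: capabilities are fixed at 'training' time."""
    def __init__(self, known_patterns):
        self.known = dict(known_patterns)  # frozen after init

    def complete(self, prompt, context=None):
        # Context can steer toward known patterns, but it can't add new ones:
        # it's a temporary overlay, gone on the next request.
        lookup = {**self.known, **(context or {})}
        return lookup.get(prompt, "<best guess from frozen weights>")


class Runtime:
    """Like a runtime: can accept and execute genuinely new code later."""
    def __init__(self):
        self.handlers = {}

    def load(self, name, fn):
        # New capability added after deployment, and it persists.
        self.handlers[name] = fn

    def run(self, name, *args):
        return self.handlers[name](*args)


model = FrozenModel({"<div>": "render a div"})
print(model.complete("<div>"))   # in-distribution: works
print(model.complete(":div:"))   # unfamiliar syntax: falls back to guessing

rt = Runtime()
rt.load("colon_syntax", lambda s: s.strip(":"))
print(rt.run("colon_syntax", ":div:"))  # the runtime gained a new ability
```

The asymmetry is the point: the `FrozenModel` can be steered per-request but never permanently extended, while the `Runtime` keeps whatever you load into it.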
The reason React works so well with current AI tools isn't mysterious. The syntax is close to both JavaScript and HTML. Components are encapsulated in ways that let models edit one without understanding the others. And there's a decade of React code scattered across the internet for these models to have trained on. The framework's consistency over time means the training data stays relevant.
But imagine you built a new framework—call it T3act—with different syntax. Instead of angle brackets, you use colons. The model's tokenization, the way it breaks code into chunks it can understand, is now fighting you. The mathematical weights pointing toward likely next tokens are all calibrated for the old syntax. You can provide context to steer it, but now you're burning tokens on translation instead of problem-solving.
Tokenization Is Politics
This gets technical fast, but it's worth understanding because it shows why this isn't just a "train on more data" problem.
When a model sees <div>hello, it doesn't see characters. It sees tokens—chunks that its training process determined were meaningful units. OpenAI has put real engineering work into making sure a closing bracket > stays as one token, because breaking it apart makes the model more likely to guess wrong about what comes next. GPT-5 keeps HTML elements together much better than GPT-3 did, which matters for code generation.
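A toy greedy tokenizer makes the effect concrete. Real tokenizers use byte-pair encoding learned over enormous corpora; this sketch just uses a fixed vocabulary tuned for HTML-like syntax to show why familiar patterns compress into fewer tokens than unfamiliar ones.

```python
# Hypothetical vocabulary: multi-character entries exist for the syntax
# the "training data" contained, single characters are the fallback.
VOCAB = ["<div>", "</div>", "<", ">", "div", "hello",
         ":", "d", "i", "v", "h", "e", "l", "o"]

def tokenize(text, vocab=VOCAB):
    """Greedy longest-match tokenization over a fixed vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        match = max((v for v in vocab if text.startswith(v, i)),
                    key=len, default=None)
        if match is None:
            raise ValueError(f"untokenizable character: {text[i]!r}")
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenize("<div>hello"))  # ['<div>', 'hello'] -- 2 tokens
print(tokenize(":div:hello"))  # [':', 'div', ':', 'hello'] -- 4 tokens
```

The hypothetical colon syntax fragments into twice as many tokens as the angle-bracket syntax the vocabulary was built around, and every extra fragment is another chance for the model to guess the continuation wrong.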
But this tokenization is optimized for existing syntax patterns. New patterns break it. "If a model has a certain level of intelligence based on the weights that it has today, adding more context to point in a different way means that that intelligence is being overridden some amount," Theo explains. The more you steer, the less of the model's trained capability you're actually using.
And there's a cost ceiling. If you need 50,000 tokens of context just to teach the model your new framework before you can ask it to do anything useful, every request becomes dramatically more expensive and the model has far more to keep track of. The tokenization and training optimizations that made it efficient on familiar syntax count for less and less.
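The back-of-envelope math is stark. The per-token price below is a hypothetical placeholder, not any provider's real rate; the point is the ratio, not the absolute dollars.

```python
# Assumed price, purely illustrative.
PRICE_PER_1K_INPUT = 0.003  # $ per 1,000 input tokens

def request_cost(context_tokens, prompt_tokens=500):
    """Cost of one request: steering context plus the actual prompt."""
    return (context_tokens + prompt_tokens) / 1000 * PRICE_PER_1K_INPUT

baseline = request_cost(0)        # model already knows the framework
steered = request_cost(50_000)    # 50k tokens of framework docs per call

print(f"baseline: ${baseline:.4f} per request")
print(f"steered:  ${steered:.4f} per request")
print(f"ratio:    {steered / baseline:.0f}x")
```

Under these assumptions, carrying the docs on every call makes each request roughly a hundred times more expensive, before counting the attention the model now spends on the docs instead of the task.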
The Skills Experiment
Vercel recently tested two approaches for teaching AI agents about Next.js 16 features that weren't in training data. One approach embedded documentation directly in the agent's context (agent.md). The other used "skills"—a pull system where the model could choose which documentation to load.
The always-present documentation got 100% success rates. The skills system maxed out at 79%—and that was only when they explicitly told the model to use the skills. Without explicit instructions, skills didn't help at all.
This isn't encouraging for the "models will just learn new things" hypothesis. Even when given the documentation, models need to be explicitly directed to use it. And forcing everything into context isn't scalable—you hit token limits, costs explode, and the model's baseline intelligence gets diluted.
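Reduced to control flow, the two strategies Vercel compared look something like this. Every name here is a hypothetical stand-in, not Vercel's actual implementation; the stubs at the bottom are wired so the pull path only loads docs when nudged, mirroring the reported behavior.

```python
DOCS = "<framework docs for features newer than the training cutoff>"

def push_strategy(task):
    """agent.md style: docs ride along in context on every request."""
    context = [DOCS, task]
    return run_model(context)

def pull_strategy(task, nudge=False):
    """Skills style: the model decides whether to load docs via a tool."""
    context = [task] + (["Use the available skills."] if nudge else [])
    if model_wants_docs(context):  # the step that topped out at 79%
        context.insert(0, DOCS)
    return run_model(context)

# Stubs so the sketch runs. In the real experiment, "wanting the docs"
# is a model decision, which is exactly where the pull approach failed.
def model_wants_docs(context):
    return any("skills" in c for c in context)

def run_model(context):
    return DOCS in context  # did the docs actually reach the model?

print(push_strategy("upgrade to v16"))              # True: docs present
print(pull_strategy("upgrade to v16"))              # False: docs skipped
print(pull_strategy("upgrade to v16", nudge=True))  # True: docs loaded
```

The push path can't miss, which is why it scored 100%; the pull path inserts a decision point, and decision points fail.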
Theo experienced this firsthand with the Kimi K2.5 model and Tailwind v4. The model got so confused by the missing tailwind.config.js file (dropped in v4's move to CSS-based configuration) that it eventually gave up and ported the entire project back to Tailwind 3 just to work with something it understood.
What Changes, What Doesn't
Altman's right that models will get better at working with new things. Mixture-of-experts architectures, better context handling, smarter reasoning loops—these will help. The ratio of user-provided tokens to model-generated ones is already shifting dramatically with reasoning models. Where it used to be roughly 50-50, you can now type "fix it" with a screenshot and the model will pull relevant code, reason through solutions, and generate fixes with minimal input.
But there's a difference between "getting better at working around this" and "solving the fundamental problem." The fundamental problem is that these models don't learn—they're massive autocomplete engines with frozen knowledge, steerable through context manipulation.
That might be fine. After all, humans don't continuously retrain their neural networks either—we learn through exposure and practice, which is closer to what context-steering attempts to simulate. The question is whether there's a ceiling to how well that simulation works, and whether we'll hit it before AI coding tools become truly general-purpose.
What's certain is that frameworks and languages optimized for AI tools—modular, consistent, well-documented, with syntax that tokenizes cleanly—will have an enormous advantage. Which means the market pressure isn't neutral. We're not just using AI to code; we're potentially reshaping what coding looks like to accommodate AI's limitations.
That's not necessarily bad, but it's worth being clear-eyed about. Tools shaping the work they're used for is a dynamic older than programming itself; what's new is the speed, and the fact that the shaping is statistical rather than syntactic, which makes it harder to see.
Marcus Chen-Ramirez is a senior technology correspondent for Buzzrag, covering AI, software development, and the intersection of technology and society.
Watch the Original Video
I asked Sam Altman about the future of code
Theo - t3.gg
30m 43s

About This Source
Theo - t3.gg
Theo - t3.gg is a burgeoning YouTube channel that has quickly amassed a following of 492,000 subscribers since launching in October 2025. Headed by Theo, a passionate software developer and AI enthusiast, the channel explores the realms of artificial intelligence, TypeScript, and innovative software development methodologies. Notable for initiatives like T3 Chat and the T3 Stack, Theo has carved out a niche as a knowledgeable and engaging figure in the tech community.
Read full source profile

More Like This
Spec-Driven Development Tools Promise to Fix AI Coding
Tracer's Epic Mode tackles 'vibe coding' with structured specifications. But can better documentation really solve AI development's consistency problems?
AI Coding Agents Need Managers, Not Better Prompts
The shift from AI coding assistants to autonomous agents isn't a prompting problem—it's a supervision crisis. Here's what changes when AI stops suggesting and starts executing.
Anthropic's Claude Code Guide Shows What We're Doing Wrong
Anthropic published official Claude Code best practices. Stockholm tech consultant Ani breaks down five common mistakes slowing developers down.
What 1,600 Hours With Claude Code Actually Teaches You
Ray Amjad spent 1,600 hours with Claude Code and learned it's not about the AI—it's about understanding how you work. Here's what actually matters.