Edited by humans. Written by AI.

Meituan's LongCat 2.0: Open Source AI With 1M Token Context

Meituan's LongCat 2.0 is a 1.6 trillion parameter open-source AI with a 1M token context window. Here's what developers need to know about it.

Dev Kapoor

July 2, 2026 · 7 min read
Neon-glowing LongCat-2.0 logo with cat icon surrounded by red and purple lightning effects and Chinese flag elements on…

Photo: AI. Wren Sugimoto

For two months, a model called Owlpha quietly climbed the rankings on OpenRouter. Developers were benchmarking it, building with it, recommending it to each other in Discord threads and GitHub comments. Nobody knew who made it. It just kept winning.

Then Meituan (yes, the Chinese food delivery company) pulled back the curtain. Owlpha was LongCat 2.0. And the developer community had spent those two months stress-testing it without realizing they were doing Meituan's real-world evaluation for it.

That's a genuinely interesting story, separate from whatever the model can or can't do. An AI lab running an anonymous model through the gauntlet of organic developer use before a public release is either savvy product validation or a little ethically murky depending on your priors. The open source community has opinions about both.

What LongCat 2.0 Actually Is

The headline numbers are real and worth taking seriously. LongCat 2.0 has 1.6 trillion parameters — nearly three times the size of Meituan's previous model, released less than a year ago — and a one million token context window. That second number is the one that changes how developers can actually use the thing.

Context windows are the working memory of a language model. The larger the window, the more a model can hold in mind simultaneously without losing the thread. GPT-4's base model launched with an 8,192-token window, per Matt White's analysis of the context window evolution in AI. Going from that ceiling, standard a couple of years ago, to one million tokens is roughly a 120-fold increase. That is not incremental. It's a different category of tool.

In Julian Goldie's video covering the release, he frames it this way: you can "feed it your entire knowledge base, every SOP, every email thread, every page of your website, and it can hold all of that in its head at the same time while it works." That's the business-facing pitch, and it's not wrong — but the more interesting implication for developers is architectural. Long-context models change what's worth building. Retrieval-augmented generation pipelines that exist largely because models couldn't hold much context become less necessary. The tradeoffs shift.
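To make the "hold all of that at once" claim concrete, here is a minimal sketch for checking whether a document set would plausibly fit in a given context window. It assumes a rough rule of thumb of about four characters per token for English text; that ratio, the function names, and the sample corpus are illustrative assumptions, not figures from Meituan or from LongCat's tokenizer.

```python
# Rough heuristic only: about 4 characters per token for English text.
# This is an assumption for illustration, not a LongCat tokenizer figure.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate a token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(documents: list[str], window: int, reserve: int = 0) -> bool:
    """Return True if all documents, plus a reserve for the prompt
    and the model's reply, fit inside the context window."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserve <= window


# Hypothetical corpus: 50 documents of roughly 10,000 tokens each.
corpus = ["x" * 40_000] * 50

print(fits_in_context(corpus, window=8_192))                      # False
print(fits_in_context(corpus, window=1_000_000, reserve=20_000))  # True
```

A real pipeline would count tokens with the model's own tokenizer. The point of the sketch is only that a corpus which once forced chunking and retrieval can, at this window size, sometimes be passed in whole.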

The Mixture of Experts Architecture

LongCat 2.0 runs on a mixture of experts (MoE) architecture, the same approach that's become standard for large models trying to be both capable and economically viable to run. The intuition: a 1.6 trillion parameter model that activates every parameter for every token would be prohibitively expensive. MoE routes each input to a relevant subset of specialized sub-networks instead.

Goldie's video explains it in his characteristically accessible register: "Picture a huge office building full of specialists. You don't call every single person into a meeting for one question. You just pull in the two or three experts who actually know the answer." That's a clean framing. It's also how Mistral AI's Mixtral models work, and how several other frontier labs have approached scale. MoE isn't novel — but one million tokens of context at this parameter count, running efficiently enough to be practically deployable, pushes the architecture into territory that's worth paying attention to.
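The office-building analogy maps onto what is usually called top-k routing. The sketch below is a toy illustration of that general technique, not LongCat's actual router: the gate scores are hard-coded here, whereas a real model computes them per token with a learned layer, and the expert count, the value of k, and every function name are assumptions made for the example.

```python
import math


def softmax(scores):
    """Convert raw gate scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k=2):
    """Pick the k highest-scoring experts and renormalize their
    weights so they sum to 1. Returns [(expert_index, weight), ...]."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and combine their outputs by weight.
    The unselected experts are never called, which is where the savings come from."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))


# Four toy "experts", each just scaling its input by a different factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]

# Hypothetical gate scores for one token: experts 1 and 3 score highest.
gate_scores = [0.1, 2.0, 0.3, 2.0]

print(route_top_k(gate_scores, k=2))            # experts 1 and 3, weight 0.5 each
print(moe_forward(10.0, experts, gate_scores))  # 0.5 * 20.0 + 0.5 * 40.0 = 30.0
```

With k=2 out of four experts, half of the expert parameters sit idle for this token. Production models apply the same idea across many more experts, so the fraction of parameters active per token is far smaller than the total count suggests.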

The MIT License and What It Actually Means

Meituan released LongCat 2.0 under the MIT license. For anyone who covers OSS governance, this is where things get interesting in ways that business-focused coverage tends to gloss over.

MIT is about as permissive as open source licenses get. You can use it, modify it, redistribute it, build commercial products on top of it, and you don't have to open-source your changes. The obligations are minimal: keep the copyright notice, keep the license text. That's it.

This is distinct from copyleft licenses like GPL, which require derivative works to be released under the same terms — the "share-alike" logic that keeps code communally owned. MIT is the license you choose when you want maximum adoption and minimum friction. It's also the license you choose when you're comfortable with large enterprises building closed products on your open-source foundation without giving anything back.

Whether that's a reasonable tradeoff depends on what Meituan is optimizing for. If the goal is ecosystem building — getting LongCat integrated everywhere, normalizing Meituan as an AI player, building goodwill with the global developer community — MIT makes sense. If you care about ensuring that improvements flow back to the commons, it's a less satisfying choice. The OSS community has had this argument many times, most recently in the recurring debates around foundation models and what "open" actually means when weights are available but training data and infrastructure aren't.

The Domestic Hardware Angle

The video notes that Meituan trained LongCat 2.0 on domestic Chinese chips rather than the Nvidia hardware that dominates most large model training runs. The geopolitical context here is real and relevant regardless of where you land on it: U.S. export restrictions have been squeezing Chinese AI labs' access to leading-edge Nvidia GPUs since 2022, which has pushed significant investment into domestic chip development.

The fact that a 1.6 trillion parameter model got trained on Chinese-made hardware is a signal worth tracking. It doesn't tell us how those chips perform relative to Nvidia's best — training efficiency, cost per FLOP, and time-to-completion all matter — but it does suggest the domestic supply chain has matured enough to support frontier-scale training runs. That's a data point in a story that will matter more over time.

What Developers Are Actually Doing With It Right Now

Here's what's more telling than any launch announcement: the developer community didn't wait for Meituan to tell them LongCat 2.0 was worth using. They'd already voted with their workflows, under the Owlpha codename, before the attribution was public.

That kind of organic uptake — builders integrating a model into real pipelines and ranking it favorably against established alternatives without knowing its provenance — is harder to manufacture than a benchmark score. The fact that it happened is more meaningful to me than the model card claims.

The practical applications being explored are mostly what you'd expect from a long-context model: ingesting large document corpora for knowledge base queries, synthesizing sprawling internal documentation, keeping complex multi-step workflows coherent across long inference chains. Goldie's video walks through several business-oriented examples using his own community as the demonstration case — the prompt-and-result format is aimed at a non-technical audience, but the underlying use cases are real ones that developers have been trying to solve since context limits became the obvious bottleneck.
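For the document-ingestion use case, the core step is packing a corpus into a single prompt without overrunning the window. The following is a hedged sketch of one way to do that: the `<document>` delimiters, the four-characters-per-token estimate, and the function names are all illustrative assumptions, and nothing here reflects a prompt format that Meituan recommends.

```python
CHARS_PER_TOKEN = 4  # rough heuristic, an assumption rather than a measured value


def estimate_tokens(text):
    """Estimate a token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def pack_documents(docs, budget_tokens):
    """Greedily pack named documents into one prompt, in the given order.

    docs is a dict mapping a document name to its text. Any document that
    would push the estimated total over budget_tokens is skipped and reported.
    Returns (prompt, skipped_names).
    """
    parts, skipped, used = [], [], 0
    for name, text in docs.items():
        # Hypothetical delimiter format so the model can tell documents apart.
        block = f'<document name="{name}">\n{text}\n</document>\n\n'
        cost = estimate_tokens(block)
        if used + cost > budget_tokens:
            skipped.append(name)
            continue
        parts.append(block)
        used += cost
    return "".join(parts), skipped


# Hypothetical inputs: two short documents and one that blows the budget.
docs = {
    "onboarding_sop": "x" * 40,
    "full_email_archive": "y" * 4000,
    "pricing_page": "z" * 40,
}
prompt, skipped = pack_documents(docs, budget_tokens=50)
print(skipped)  # ['full_email_archive']
```

Reporting what was skipped matters more than the packing itself: with a budget in the hundreds of thousands of tokens, silently dropping a document is an easy way to get confident answers from an incomplete corpus.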

MIT and the Deeper Question

The piece of this story I keep returning to isn't the parameters or the context window. It's the Owlpha phase.

Running anonymously on a public platform for two months before revealing yourself is a legitimate way to get unbiased signal on model quality — but it also means the developer community was doing evaluation labor that the model's creators benefited from, without disclosure. Nobody consented to participate in a stealth beta. Most developers probably don't care. Some of the more governance-minded ones might.

Open source under MIT licensing with a stealth pre-release evaluation period is a specific kind of openness: generous in some dimensions, opaque in others. LongCat 2.0 is genuinely available to use and build on, which is more than can be said for a lot of models that get called "open." But the community deserves to know what "open" actually covers — and what it quietly doesn't.


By Dev Kapoor, Open Source & Developer Communities Correspondent, BuzzRAG
