
For two months, a model called “Owl Alpha” quietly dominated OpenRouter’s usage charts — consistently ranking ahead of well-known proprietary models by actual call volume. Nobody disclosed who built it. On June 29, Meituan stepped forward: Owl Alpha was their LongCat-2.0-Preview, a 1.6 trillion parameter agentic coding model trained entirely on Chinese-made chips. The next day, they open-sourced the full release under MIT. Ten trillion tokens processed during stealth. Now it’s free to use and integrate.
What LongCat-2.0 Actually Is
LongCat-2.0 is a sparse Mixture-of-Experts (MoE) model built specifically for agentic coding tasks — code generation, repository-level edits, multi-step task execution, and long-horizon workflows. The headline number is 1.6 trillion total parameters, but only 33B–56B activate per token, so inference cost is far more manageable than the size implies. It supports a native 1 million token context window via LongCat Sparse Attention, which reduces attention computation from quadratic to linear complexity.
The model ships under MIT license: commercial use, fine-tuning, and redistribution are all permitted. Weights are on GitHub and Hugging Face. No special agreements, no regional restrictions.
Why the Stealth Period Matters More Than Any Benchmark
Meituan’s approach was deliberate. Rather than releasing LongCat-2.0 with a press release and self-reported scores, they ran the model anonymously on OpenRouter for approximately two months. During that period, Owl Alpha accumulated 10.1 trillion monthly tokens — averaging 559 billion tokens per day — with 242% month-over-month growth. It ranked first on the Hermes Agent workspace, second on Claude Code, and third on OpenClaw, all by actual call volume.
That is a different kind of validation than a leaderboard screenshot. Developers chose this model over labeled alternatives, repeatedly, without knowing who made it. When Meituan stepped forward, the adoption data was already there — and that is considerably harder to fabricate than a benchmark score.
What the Benchmarks Actually Show
On SWE-bench Pro — the most demanding public measure of real-world coding agent performance — LongCat-2.0 scores 59.5 against GPT-5.5’s 58.6. That 0.9-point gap is within evaluation noise. “Competitive” is the right read, not “clearly better.” It also posts 70.8 on Terminal-Bench 2.1 and 77.3 on SWE-bench Multilingual. Claude Opus 4.7 and 4.8 still lead on SWE-bench Pro. These are Meituan’s own numbers and have not been independently replicated.
The benchmark story is this: LongCat-2.0 runs with the top coding models. It does not blow them away.
How to Start Using It
For developers already using Claude Code, integration is a three-variable config change. The LongCat API is fully compatible with Anthropic’s API format:
export ANTHROPIC_AUTH_TOKEN="your-longcat-api-key"
export ANTHROPIC_BASE_URL="https://api.longcat.chat/anthropic"
export ANTHROPIC_MODEL="longcat-2-0"
Pricing is $0.75 per million input tokens and $2.95 per million output tokens, with cached reads free. Compare that to GPT-5.6 Terra at $2.50/$15.00 or Sol at $5.00/$30.00. For high-volume agentic coding workloads, that cost difference is not marginal. LongCat-2.0 is also available directly on OpenRouter under its real name, and through AIMLAPI.
| Model | Input ($/M tokens) | Output ($/M tokens) |
|---|---|---|
| LongCat-2.0 | $0.75 | $2.95 |
| GPT-5.6 Terra | $2.50 | $15.00 |
| GPT-5.6 Sol | $5.00 | $30.00 |
The ASIC Training Story Is the Real News
What most coverage has buried is the training hardware. LongCat-2.0 was trained from scratch on a 50,000-card cluster of domestic Chinese AI ASICs — not NVIDIA GPUs. DeepSeek V4-Pro used domestic chips only for inference. Full trillion-parameter pre-training on alternative hardware at this scale is considerably harder, and LongCat-2.0 is the first model to claim it.
For developers, the immediate implication is limited — you still access it through the same API. The larger implication is that US export controls have not produced the AI capability gap they were designed to create, at least not at the pace originally expected. A credible, open-weight, MIT-licensed coding model is now competing at frontier level without touching a single NVIDIA card.
Worth Your Attention, Not Your Hype
LongCat-2.0 is a real, capable, open-source coding model at an aggressive price point. The stealth period produced genuine validation that no benchmark could replicate. For teams running high-volume agentic workflows, the pricing alone justifies testing it. For those already on Claude Code, the integration is a single config change.
Do not expect it to replace Claude Opus for demanding coding tasks — the benchmark gap is real, even if narrower than before. But at $0.75 per million input tokens under MIT terms, it belongs in your model evaluation queue. Start with the official model page for the full spec breakdown.













