Overview

LongCat-2.0 is Meituan LongCat's next-generation trillion-parameter open-source model, purpose-built for agentic coding — code understanding, generation, and execution in real-world agent workflows. With 1.6T total parameters and dynamic activation of 33B–56B (averaging ~48B) per token, it is the industry's first trillion-parameter model to complete full training and inference on a 50,000-card domestic compute cluster.

Pretrained from scratch on 30T+ tokens spanning Chinese, English, multilingual, and code data, LongCat-2.0 natively supports 1 million token context. A preview version has been available on OpenRouter and longcat.ai, ranking among the top three models globally by call volume on OpenRouter.

longcat.ai — online experience
OpenRouter — global API access
LongCat API platform — longcat.chat/platform/usage

Model weights and self-hosted deployment guides will be linked here when officially published on Hugging Face / GitHub.

Architecture

LongCat Sparse Attention (LSA) — 1M Context

Traditional models begin to "forget" content beyond ~100K tokens. LSA intelligently selects key information instead of attending to every token, reducing computation from quadratic to linear complexity. This enables precise information retrieval and understanding across 1 million tokens — letting agents see an entire codebase at once.

Zero-Computation Experts + ScMoE

Code tasks vary enormously in complexity — naming a variable vs. deriving a recursive algorithm require vastly different compute. Zero-computation experts enable token-level dynamic activation: simple tokens consume no compute, while complex tokens automatically receive more resources (33B–56B range).

MOPD Multi-Expert Fusion

LongCat-2.0 fuses three expert groups through MOPD (Multi-Teacher On-Policy Distill). Training starts from a LongCat SFT checkpoint, branches into specialized experts, and distills their capabilities into one unified model on domestic accelerator clusters:

Agent Experts: Tool use, API parsing, and self-correction
Reasoning Experts: Multi-hop reasoning, STEM reasoning, and adaptive computation
Interaction Experts: Instruction following, human alignment, and hallucination suppression

LongCat-2.0 MOPD architecture diagram: Multi-Teacher On-Policy Distillation from Agent, Reasoning, and Interaction experts on domestic compute clusters to a unified LongCat-2.0 model — MOPD training pipeline — from LongCat SFT through Agent, Reasoning, and Interaction experts to unified LongCat-2.0. Real-world scenarios integrate agentic execution, reasoning, and interaction; domain experts are fused on domestic accelerator clusters.

At inference time, a gating network dynamically routes each task to the most capable experts, rather than simply merging parameters.

Benchmark Performance

Benchmark	LongCat-2.0	Focus
SWE-bench Pro	59.5	Deep software engineering (leads Gemini 3.1 Pro 54.2, GPT-5.5 58.6, Claude Opus 4.6 57.3)
SWE-bench Multilingual	77.3	Multilingual coding (on par with Claude Opus 4.6 77.8)
Terminal-Bench 2.1	70.8	Real terminal command interaction
RWSearch	78.8	Search agent tasks
FORTE	73.2	Productivity scenarios
BrowseComp	79.9	Complex browsing and retrieval

Real-World Use Cases

Agent building: End-to-end AI SQL Agent — natural language queries, step planning, and business insights
Codebase migration: Analyze legacy code, map to new SDK APIs, refactor with full feature parity and compile-first-pass
Full app development: From one-sentence idea to runnable product with architecture, logic, and UI
3D interactive demos: Single-file Three.js experiences from natural language descriptions
AI content pipelines: Multi-agent novel factory with million-token setting consistency

Training on Domestic Compute

LongCat's exploration of domestic compute began in 2023. Over three years, the team scaled from thousands to 50,000 cards, solving operator adaptation, communication optimization, and distributed stability challenges.

Stability: HCCL exception handling, elastic scaling, auto fault recovery — 70%+ reduction in monthly daily fault rate
Correctness: Deterministic operators, bitwise consistency verification, parameter detection
Efficiency: Pipeline scheduling, memory optimization, operator-level core control — 1.5× MFU improvement
Throughput: Steady-state daily throughput exceeding 1T tokens/day

LongCat-2.0