LongCat-Flash-Chat

Foundation dialogue model (Released: September 1, 2025)

Overview

LongCat-Flash-Chat is a foundation dialogue model with 560B total parameters in a Mixture-of-Experts (MoE) architecture. Through Zero-Computation Experts it activates approximately 18.6B–31.3B parameters per token (~27B on average), achieving competitive quality with high throughput and low latency.
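
As a rough illustration of the Zero-Computation Experts idea, the sketch below mixes ordinary FFN experts with parameter-free identity experts in a single MoE layer: tokens routed to an identity expert skip the FFN entirely, so the number of activated parameters varies per token. This is an illustrative approximation under assumed settings, not the released architecture; the layer sizes, expert counts, and top-k routing scheme are all placeholders.

```python
# Minimal sketch (not the official implementation) of an MoE layer whose expert
# pool mixes ordinary FFN experts with "zero-computation" identity experts.
# Tokens routed to an identity expert skip the FFN, so the number of activated
# FFN parameters varies per token. All sizes and names here are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ZeroComputationMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_ffn_experts=8, n_zero_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.n_ffn_experts = n_ffn_experts
        self.n_total = n_ffn_experts + n_zero_experts
        # The router scores every token against both real and zero-computation experts.
        self.router = nn.Linear(d_model, self.n_total, bias=False)
        # Ordinary FFN experts; zero-computation experts need no parameters at all.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_ffn_experts)
        ])

    def forward(self, x):  # x: (n_tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)
        weights, idx = gate.topk(self.top_k, dim=-1)           # (n_tokens, top_k)
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize selected weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(self.n_total):
                mask = idx[:, k] == e
                if not mask.any():
                    continue
                tok = x[mask]
                # Zero-computation experts act as identity: no FFN compute is spent.
                y = self.experts[e](tok) if e < self.n_ffn_experts else tok
                out[mask] += weights[mask, k].unsqueeze(-1) * y
        return out
```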

Key Features

  • 128K context length: Supports complex, multi-document tasks
  • 100+ tokens/s: High-speed generation on H800 GPUs
  • Zero-Computation Experts: Cost-efficient parameter activation
  • Strong capabilities: Instruction following, reasoning, and coding
  • Agentic tool-use: Strong performance in tool-calling scenarios (see the usage sketch below)
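
Quick Start

Once deployed behind an OpenAI-compatible endpoint (for example via a serving framework such as vLLM or SGLang), the model can be queried like any chat model. That deployment path is an assumption, and the base URL, model name, and API key below are placeholders to adapt to your setup.

```python
# Hedged usage sketch: querying LongCat-Flash-Chat through an OpenAI-compatible
# chat endpoint. The base URL, model name, and API key are placeholders; adjust
# them to match however the model is actually deployed in your environment.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # placeholder: your serving endpoint
    api_key="EMPTY",                      # placeholder: key expected by your server
)

response = client.chat.completions.create(
    model="LongCat-Flash-Chat",           # placeholder: model name exposed by the server
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the trade-offs of MoE inference."},
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)
```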