LongCat-Flash-Chat

Foundation dialogue model (Released: September 1, 2025)

Overview

LongCat-Flash-Chat is a foundation dialogue model with 560B total parameters in a Mixture-of-Experts (MoE) architecture. Through Zero-Computation Experts it activates approximately 18.6B–31.3B parameters per token (~27B on average), achieving competitive quality with high throughput and low latency.
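
As a rough illustration of the Zero-Computation Experts idea, the sketch below mixes ordinary FFN experts with parameter-free identity experts in a single MoE layer: tokens routed to an identity expert skip the FFN entirely, so the number of activated parameters varies per token. This is an illustrative approximation under assumed settings, not the released architecture; the layer sizes, expert counts, and top-k routing scheme are all placeholders.

```python
# Minimal sketch (not the official implementation) of an MoE layer whose expert
# pool mixes ordinary FFN experts with "zero-computation" identity experts.
# Tokens routed to an identity expert skip the FFN, so the number of activated
# FFN parameters varies per token. All sizes and names here are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ZeroComputationMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_ffn_experts=8, n_zero_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.n_ffn_experts = n_ffn_experts
        self.n_total = n_ffn_experts + n_zero_experts
        # The router scores every token against both real and zero-computation experts.
        self.router = nn.Linear(d_model, self.n_total, bias=False)
        # Ordinary FFN experts; zero-computation experts need no parameters at all.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_ffn_experts)
        ])

    def forward(self, x):  # x: (n_tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)
        weights, idx = gate.topk(self.top_k, dim=-1)           # (n_tokens, top_k)
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize selected weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(self.n_total):
                mask = idx[:, k] == e
                if not mask.any():
                    continue
                tok = x[mask]
                # Zero-computation experts act as identity: no FFN compute is spent.
                y = self.experts[e](tok) if e < self.n_ffn_experts else tok
                out[mask] += weights[mask, k].unsqueeze(-1) * y
        return out
```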

Key Features

  • 128K context length: Supports complex, multi-document tasks
  • 100+ tokens/s: High-speed generation on H800 GPUs
  • Zero-Computation Experts: Cost-efficient parameter activation
  • Strong capabilities: Instruction following, reasoning, and coding
  • Agentic tool-use: Strong performance in tool-calling scenarios (see the usage sketch below)
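
Quick Start

Once deployed behind an OpenAI-compatible endpoint (for example via a serving framework such as vLLM or SGLang), the model can be queried like any chat model. That deployment path is an assumption, and the base URL, model name, and API key below are placeholders to adapt to your setup.

```python
# Hedged usage sketch: querying LongCat-Flash-Chat through an OpenAI-compatible
# chat endpoint. The base URL, model name, and API key are placeholders; adjust
# them to match however the model is actually deployed in your environment.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # placeholder: your serving endpoint
    api_key="EMPTY",                      # placeholder: key expected by your server
)

response = client.chat.completions.create(
    model="LongCat-Flash-Chat",           # placeholder: model name exposed by the server
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the trade-offs of MoE inference."},
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)
```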