News & Updates

Latest releases, announcements, and developments

Latest Releases

LongCat-Image Released: Open-Source SOTA Image Generation & Editing Model

Meituan LongCat has officially released and open-sourced LongCat-Image, a 6B-parameter image generation and editing model. Through careful architecture design, systematic training strategies, and data engineering, it achieves performance comparable to much larger models, offering developers and industry a high-performance, low-barrier, fully open solution.

Open-Source SOTA Performance

  • Image editing: ImgEdit-Bench 4.50, GEdit-Bench Chinese/English 7.60/7.64 (open-source SOTA, approaching top closed-source models)
  • Chinese text rendering: ChineseWord 90.7 (significantly leading all evaluated models), covering all 8,105 standard Chinese characters
  • Text-to-image: GenEval 0.87, DPG-Bench 86.8 (competitive with top open-source and closed-source models)
  • Subjective evaluation: Excellent realism in text-to-image mean-opinion-score (MOS) tests; significantly outperforms other open-source models in side-by-side (SBS) image-editing comparisons

Key Technical Highlights

  • Unified architecture: MM-DiT + Single-DiT hybrid backbone with progressive learning strategies, so that text-to-image and image-editing training reinforce each other
  • Highly controllable image editing: Multi-task joint learning that combines instruction-based editing with text-to-image training, yielding precise instruction following, strong generalization, and visual consistency
  • Comprehensive Chinese text coverage: A curriculum-learning strategy covers all 8,105 standard Chinese characters; character-level encoding reduces memory overhead; complex stroke structures and rare characters are supported
  • Texture and realism enhancement: Systematic data filtering plus an adversarial training framework that uses an AIGC-content detector as a reward model for realistic physical textures, lighting, and overall quality (see the sketch after this list)
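
To make the last point concrete, here is a minimal, hypothetical Python sketch of using a real-vs-AIGC detector as a reward signal. The detector module, its two-class output, and the reward definition are illustrative assumptions, not LongCat-Image's actual training code.

    import torch

    def realism_reward(images: torch.Tensor, detector: torch.nn.Module) -> torch.Tensor:
        """Hypothetical reward: the probability that an AIGC-content detector
        labels each generated image as 'real'. Assumes the detector returns
        two logits per image: [real, ai-generated]."""
        with torch.no_grad():
            logits = detector(images)        # (batch, 2)
        return logits.softmax(dim=-1)[:, 0]  # P(real); higher = more realistic

    # The generator is then optimized to raise this reward, e.g. by adding
    # -realism_reward(images, detector).mean() as an extra loss term (sketch only).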

Applications & Resources

  • LongCat APP: Text-to-image, image-to-image, and 24 ready-to-use templates (poster design, portrait refinement, scene transformation)
  • LongCat Web: Available at longcat.ai
  • Fully open-source: Multi-stage models (Mid-training, Post-training) and image editing models available on Hugging Face and GitHub

Learn More → | View Benchmarks →

UNO-Bench Released: One-Stop All-Modality Benchmark

The LongCat team announces UNO-Bench, a unified benchmark with strong Chinese support that evaluates single-modality and omni-modality intelligence in one framework. It reveals the Combination Law: omni-modal performance follows a power law over single-modality performance, with weaker models showing bottlenecks and stronger models achieving synergistic gains.

  • 1,250 omni samples + 2,480 single-modality samples; 98% require cross-modal fusion
  • Multi-step open-ended (MO) questions with weighted scoring; the automatic scoring model reaches 95% accuracy
  • Audio-visual decoupling and ablations to prevent shortcuts and enforce real fusion
  • Cluster-guided sampling reduces evaluation compute by >90% while preserving rank consistency (a generic sketch of the idea follows below)
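
Below is a minimal, generic Python sketch of cluster-guided subset selection: cluster the full sample pool, then evaluate only on the representative nearest each centroid. The embeddings, cluster count, and use of scikit-learn's KMeans are illustrative assumptions, not the UNO-Bench pipeline.

    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_guided_subset(embeddings: np.ndarray, n_clusters: int = 100) -> np.ndarray:
        """Pick one representative sample per cluster (the point closest to its
        centroid); evaluating only these indices approximates full-set ranking
        at a fraction of the compute. Illustrative sketch only."""
        km = KMeans(n_clusters=n_clusters, n_init="auto", random_state=0).fit(embeddings)
        reps = []
        for c in range(n_clusters):
            members = np.where(km.labels_ == c)[0]
            if members.size == 0:
                continue
            dists = np.linalg.norm(embeddings[members] - km.cluster_centers_[c], axis=1)
            reps.append(members[np.argmin(dists)])
        return np.asarray(reps)

    # Usage: evaluate each model only on cluster_guided_subset(sample_embeddings),
    # then check that model rankings match those from the full sample set.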

Combination Law (power law): P_Omni ≈ 1.0332 · (P_A × P_V)^2.1918 + 0.2422, where P_Omni, P_A, and P_V denote omni-modal, audio, and visual performance, respectively
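
As a quick worked example, the fitted law can be evaluated directly in Python; the input scores below are hypothetical placeholders, not reported benchmark numbers.

    def predicted_omni(p_audio: float, p_visual: float) -> float:
        """Evaluate the Combination Law fit:
        P_Omni ~ 1.0332 * (P_A * P_V) ** 2.1918 + 0.2422"""
        return 1.0332 * (p_audio * p_visual) ** 2.1918 + 0.2422

    # Hypothetical unimodal scores, for illustration only:
    print(round(predicted_omni(0.70, 0.60), 2))  # ~0.40
    print(round(predicted_omni(0.90, 0.85), 2))  # ~0.82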

Read more on Benchmarks →

November 2025 - LongCat-Flash-Omni Launch

The first open-source real-time all-modality interaction model. LongCat-Flash-Omni unifies text, image, audio, and video with a single end-to-end ScMoE backbone, achieving open-source SOTA on Omni-Bench and WorldSense.

Learn More →

October 27, 2025 - LongCat-Video Launch

LongCat-Video is a video generation model built on the Diffusion Transformer (DiT) architecture. It can generate coherent 5-minute videos at 720p/30fps and supports text-to-video, image-to-video, and video-continuation tasks.

Learn More →

September 22, 2025 - LongCat-Flash-Thinking Release

An enhanced reasoning model focused on agentic and formal reasoning. It features a dual-path reasoning framework and the DORA asynchronous training system, and achieves 64.5% token savings in tool-call scenarios.

Learn More →

September 1, 2025 - LongCat-Flash-Chat Open Source

Meituan officially released and open-sourced the foundation dialogue model (MoE, 560B parameters) with strong throughput and competitive accuracy. Its Zero-Computation Experts keep activation at ~27B parameters per token, as sketched below.
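
Below is a minimal, illustrative PyTorch sketch of the zero-computation-expert idea: identity experts sit alongside FFN experts in the router's pool, so tokens routed to them consume no extra compute. Layer sizes, expert counts, and the routing loop are assumptions for illustration, not LongCat-Flash's actual implementation.

    import torch
    import torch.nn as nn

    class ToyZeroComputeMoE(nn.Module):
        """Illustrative only: an MoE layer whose expert pool mixes FFN experts
        with 'zero-computation' identity experts, so per-token compute varies."""

        def __init__(self, dim: int, n_ffn: int = 4, n_zero: int = 2, top_k: int = 2):
            super().__init__()
            ffn = lambda: nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            self.experts = nn.ModuleList([ffn() for _ in range(n_ffn)] +
                                         [nn.Identity() for _ in range(n_zero)])
            self.router = nn.Linear(dim, len(self.experts))
            self.top_k = top_k

        def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
            weights, idx = self.router(x).softmax(dim=-1).topk(self.top_k, dim=-1)
            out = torch.zeros_like(x)
            for slot in range(self.top_k):
                for e, expert in enumerate(self.experts):
                    mask = idx[:, slot] == e
                    if mask.any():
                        out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
            return out

    # Usage: ToyZeroComputeMoE(dim=16)(torch.randn(8, 16)).shape  ->  torch.Size([8, 16])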

Learn More →

Timeline

Date | Release | Keyword
Latest | Image | SOTA
Nov 2025 | Flash-Omni | All
Oct 27, 2025 | Video | Long
Sep 22, 2025 | Flash-Thinking | Stable
Sep 1, 2025 | Flash-Chat | Fast

Technical Achievements

  • Training Scale: Successfully trained on >20T tokens in ~30 days
  • Training Stability: Achieved through variance alignment, hyperparameter transfer, and router balancing
  • Domestic Accelerators: Successfully demonstrated a complete training path on domestically produced AI accelerator cards
  • Benchmark Performance: Excellent results on MMLU, C-Eval, terminal-command benchmarks, and agentic tool-call evaluations