Documentation & Quick Start

Get started with LongCat AI models

Quick Start

LongCat-Flash uses the chat template defined in tokenizer_config.json. The examples below show the rendered prompts for common cases:

First Turn

[Round 0] USER:{query} ASSISTANT:

With System Prompt

SYSTEM:{system_prompt} [Round 0] USER:{query} ASSISTANT:

Multi-Turn

SYSTEM:{system_prompt} [Round 0] USER:{q} ASSISTANT:{r} ... [Round N-1] USER:{q} ASSISTANT:{r} [Round N] USER:{q} ASSISTANT:
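
Here {q} and {r} are the user query and model response of each round. In practice the template is applied via tokenizer.apply_chat_template; the hand-rolled renderer below is only a sketch of the layout shown above (exact whitespace and any special tokens are governed by tokenizer_config.json):

```python
# Minimal sketch: render the documented multi-turn layout by hand.
# Prefer tokenizer.apply_chat_template in real code, which reads the
# authoritative template from tokenizer_config.json.

def build_prompt(turns, system_prompt=None):
    """turns: list of (query, response) pairs; the final response is None
    when the model's reply is still to be generated."""
    prompt = f"SYSTEM:{system_prompt} " if system_prompt else ""
    for i, (query, response) in enumerate(turns):
        prompt += f"[Round {i}] USER:{query} ASSISTANT:"
        if response is not None:
            prompt += f"{response} "
    return prompt

print(build_prompt(
    [("What is MoE?", "A mixture-of-experts architecture."),
     ("Give an example.", None)],
    system_prompt="You are a helpful assistant.",
))
```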

Tool Call Envelope

When tools are available, their descriptions are prepended to the conversation:

## Tools
{tool_description}

## Messages
SYSTEM:{system_prompt} [Round 0] USER:{query} ASSISTANT:

The model emits tool calls wrapped in <longcat_tool_call> tags:

<longcat_tool_call>{"name": <function-name>, "arguments": <args-dict>}</longcat_tool_call>
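
A response containing tool calls can be parsed by matching the tag pair. A minimal sketch, assuming each payload between the tags is well-formed JSON:

```python
import json
import re

# Extract tool calls from a model response. The tag name comes from the
# envelope above; the payload is assumed to be valid JSON.
TOOL_CALL_RE = re.compile(r"<longcat_tool_call>(.*?)</longcat_tool_call>", re.DOTALL)

def parse_tool_calls(response_text):
    calls = []
    for payload in TOOL_CALL_RE.findall(response_text):
        call = json.loads(payload)  # {"name": ..., "arguments": {...}}
        calls.append((call["name"], call["arguments"]))
    return calls

print(parse_tool_calls(
    '<longcat_tool_call>{"name": "get_weather", '
    '"arguments": {"city": "Beijing"}}</longcat_tool_call>'
))
```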

Deployment

Flash-Chat & Flash-Thinking

SGLang and vLLM adaptations enable high-throughput inference for LongCat-Flash models. The deployment guides cover environment setup, tensor parallelism, and inference configuration, and the stack supports both single-user and multi-user scenarios, with cost-efficient inference at roughly $0.70 per 1M output tokens on H800 GPUs.
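
Once a server is running, both engines expose an OpenAI-compatible API. A minimal client sketch; the port, served model name, and api_key value are assumptions about a local deployment, so adjust them to match yours:

```python
# Query a locally deployed SGLang/vLLM server through its
# OpenAI-compatible endpoint. Assumptions: server on port 8000,
# served model name "LongCat-Flash-Chat".
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="LongCat-Flash-Chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the LongCat-Flash chat template."},
    ],
    max_tokens=256,
)
print(resp.choices[0].message.content)
```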

Video Generation

LongCat-Video provides unified interfaces for text-to-video, image-to-video, and video-continuation tasks. It is optimized for generating long-form videos (up to 5 minutes) with high temporal consistency and physically plausible motion.
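
As an illustration only, a hypothetical dispatch over the three task modes is sketched below; the VideoRequest fields and run helper are invented placeholders, not LongCat-Video's actual API, so consult the project's documentation for the real entry points:

```python
# Hypothetical sketch: names and defaults below are illustrative
# placeholders, not LongCat-Video's actual API.
from dataclasses import dataclass

@dataclass
class VideoRequest:
    task: str                       # "t2v", "i2v", or "continuation"
    prompt: str
    image_path: str | None = None   # required for "i2v"
    video_path: str | None = None   # required for "continuation"
    num_frames: int = 129           # placeholder default

def run(request: VideoRequest) -> str:
    """Validate a request and dispatch it to the right generation mode."""
    if request.task == "i2v" and request.image_path is None:
        raise ValueError("image-to-video requires image_path")
    if request.task == "continuation" and request.video_path is None:
        raise ValueError("video continuation requires video_path")
    # ... model inference would happen here ...
    return f"generated {request.num_frames} frames for task {request.task}"

print(run(VideoRequest(task="t2v", prompt="a cat surfing at sunset")))
```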

License & Usage

All LongCat models are released under the MIT License, allowing model distillation, fine-tuning, and secondary development. Evaluate and validate the models before use in sensitive or high-risk scenarios, and ensure compliance with applicable laws and regulations for your use case.