LongCat-Video
Video generation model (Released: October 27, 2025)
Overview
LongCat-Video is a video generation model built on the Diffusion Transformer (DiT) architecture. It provides unified support for text-to-video, image-to-video, and video continuation tasks. It generates coherent 5-minute videos at 720p resolution and 30 fps, with an emphasis on long temporal sequences, cross-frame consistency, and physically plausible motion.
Key Features
- 5-minute videos: Long-form coherent video generation
- 720p at 30 fps: High-definition output resolution and frame rate
- Unified interface: Text-to-video, image-to-video, and video continuation
- Temporal consistency: Cross-frame coherence
- Physical plausibility: Realistic motion
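
To give a sense of the scale implied by the stated output target (5 minutes, 720p, 30 fps), the sketch below works through the arithmetic. It assumes "720p" means 1280x720 pixels and 3 bytes per RGB pixel; neither figure is stated above, and the function name `clip_scale` is purely illustrative. It says nothing about how the model represents video internally.

```python
# Back-of-the-envelope scale of one clip at the stated output target.
# Assumptions (not from the model description): "720p" is 1280x720 (16:9)
# and each pixel is stored as 3 bytes of uncompressed RGB.
DURATION_S = 5 * 60        # 5 minutes
FPS = 30                   # frames per second
WIDTH, HEIGHT = 1280, 720  # assumed 720p frame size
BYTES_PER_PIXEL = 3        # assumed 8-bit RGB


def clip_scale(duration_s=DURATION_S, fps=FPS, width=WIDTH, height=HEIGHT,
               bytes_per_pixel=BYTES_PER_PIXEL):
    """Return (frame count, pixels per frame, uncompressed size in bytes)."""
    frames = duration_s * fps
    pixels_per_frame = width * height
    raw_bytes = frames * pixels_per_frame * bytes_per_pixel
    return frames, pixels_per_frame, raw_bytes


frames, pixels_per_frame, raw_bytes = clip_scale()
print(f"frames: {frames}")                         # frames: 9000
print(f"pixels per frame: {pixels_per_frame}")     # pixels per frame: 921600
print(f"raw RGB size: {raw_bytes / 1e9:.1f} GB")   # raw RGB size: 24.9 GB
```

A single clip at this target spans 9,000 frames, which is why long temporal sequences and cross-frame consistency are called out as the main points of emphasis.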