[2025 Week 47] MetaX Weekly AI Paper Review

Integrating High-Resolution Video and 3D Generation with Structural Understanding via Omnimodal MoE and Parallel Diffusion Models
Maximizing Scientific Reasoning and Computational Efficiency via Reinforcement Learning, Model Souping, and Interactive Scaling

Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation

https://arxiv.org/abs/2511.14993

Kandinsky 5.0 is a family of state-of-the-art foundation models for high-resolution image synthesis and 10-second video synthesis. It consists of three core models: Image Lite (6B parameters), Video Lite (a fast, lightweight 2B-parameter model), and Video Pro (19B parameters, with excellent video generation quality). The paper reviews the full data curation process, from collection through filtering and clustering, and introduces a multi-stage training pipeline that applies quality-enhancement techniques such as supervised fine-tuning (SFT) and reinforcement learning (RL)-based post-training. It reports high generation speed and performance achieved through a new architecture and inference optimizations. The authors open-source the code and training checkpoints to support the research community across a wide range of generative applications.
