Small Model Reasoning Revolution, Memory Hallucination Evaluation, Creative Limits of Safety Alignment
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds
https://arxiv.org/abs/2511.08892
Lumine is presented as the first open recipe for building generalist agents that complete hours-long, complex missions in real time in 3D open-world environments. Built on a vision-language model (VLM), the agent unifies perception, reasoning, and action end to end: it takes raw pixel input at 5 Hz, outputs precise keyboard and mouse actions at 30 Hz, and invokes explicit reasoning only when needed. Trained in "Genshin Impact," Lumine completes five hours of main story content at human-level efficiency and follows natural-language instructions across diverse tasks. Notably, it also shows strong zero-shot generalization to other games such as "Wuthering Waves" and "Honkai: Star Rail," with no additional training.
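The rate mismatch described above (5 Hz observations, 30 Hz actions) implies that each observed frame must yield six consecutive action slots. The following is a minimal sketch of such a control loop, not the paper's implementation: the names `run_agent`, `toy_reason`, and `toy_act` are hypothetical, frames are stand-in strings rather than pixels, and the VLM is replaced by trivial rule-based functions purely to show the timing and the optional reasoning step.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

PERCEPTION_HZ = 5    # observation rate stated in the summary
ACTION_HZ = 30       # action rate stated in the summary
CHUNK = ACTION_HZ // PERCEPTION_HZ  # 6 action slots per observed frame


@dataclass(frozen=True)
class Action:
    """One 1/30 s slot of keyboard/mouse state (illustrative format only)."""
    keys: Tuple[str, ...]
    mouse_dx: int = 0
    mouse_dy: int = 0


def run_agent(
    frames: Sequence[str],
    reason: Callable[[str], Optional[str]],
    act: Callable[[str, Optional[str]], List[Action]],
) -> Tuple[List[Tuple[int, Action]], List[Tuple[int, str]]]:
    """Turn a 5 Hz frame stream into a 30 Hz action schedule.

    Returns (schedule, thoughts): schedule pairs each action with its
    30 Hz tick index; thoughts records which frames triggered reasoning.
    """
    schedule: List[Tuple[int, Action]] = []
    thoughts: List[Tuple[int, str]] = []
    for i, frame in enumerate(frames):
        thought = reason(frame)          # None means "act without reasoning"
        if thought is not None:
            thoughts.append((i, thought))
        chunk = act(frame, thought)
        if len(chunk) != CHUNK:
            raise ValueError(f"expected {CHUNK} actions, got {len(chunk)}")
        for j, action in enumerate(chunk):
            schedule.append((i * CHUNK + j, action))
    return schedule, thoughts


# Hypothetical stand-ins for the model: reason only when an enemy appears.
def toy_reason(frame: str) -> Optional[str]:
    return "enemy ahead, switch to attack" if frame == "enemy" else None


def toy_act(frame: str, thought: Optional[str]) -> List[Action]:
    if thought is not None:
        return [Action(keys=("mouse_left",)) for _ in range(CHUNK)]
    return [Action(keys=("w",)) for _ in range(CHUNK)]


frames = ["path", "path", "enemy", "path", "path"]  # one second at 5 Hz
schedule, thoughts = run_agent(frames, toy_reason, toy_act)
print(len(schedule))  # 30 actions for one second of play
print(thoughts)
```

In this sketch, reasoning is triggered by a fixed rule; in the system the summary describes, the decision of when to reason is made adaptively by the model itself.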
![[2025 Week 46] MetaX Weekly AI Paper Review](https://metax-images-bucket.s3.ap-southeast-2.amazonaws.com/defaults/aitech2.webp)

