Dramatically Improved Performance and Efficiency of Next-Generation AI Models Through Innovative Architecture and Training Techniques
Strengthened Practical Utility and Reliability Through Distributed Inference, Tool Integration, Knowledge Protection, and Verification Platforms

This week's META-X AI paper review covers next-generation model development, deployment optimization, and evaluation advances.

Next-Generation AI Model Development: InternVL3 introduces a "native multimodal pre-training" paradigm, in which text and multimodal data are trained together from scratch rather than adapting a text-only LLM afterward. It uses V2PE (Variable Visual Position Encoding) to support longer contexts, applies SFT (supervised fine-tuning) and MPO (Mixed Preference Optimization) in post-training, and adds test-time scaling. It achieves top open-source performance on the MMMU benchmark, and the authors plan to release the data and models. Seaweed-7B demonstrates a cost-efficient training strategy for video generation that is competitive with much larger models. GigaTok scales visual tokenizers to the billion-parameter range, improving image generation quality. The CLIMB framework automatically searches for optimal data mixture ratios for LLM pre-training. Genius develops an unsupervised self-training method that enhances LLM reasoning without external ground truth. BitNet b1.58 2B4T implements a 1-bit architecture (ternary weights, roughly 1.58 bits per weight) at the 2B-parameter scale.
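
To make the BitNet b1.58 item more concrete, the following is a minimal sketch of ternary weight quantization in the absmean style associated with the BitNet b1.58 line of work: weights are divided by their mean absolute value, rounded, and clipped to {-1, 0, +1}. It is an illustration under stated assumptions, not the authors' implementation; the function names, the `eps` guard, and the example weights are hypothetical.

```python
def absmean_ternary_quantize(weights, eps=1e-8):
    """Map float weights to {-1, 0, +1} plus one shared scale (illustrative only)."""
    # Shared scale: mean absolute value of the weights (eps avoids division by zero).
    scale = max(sum(abs(w) for w in weights) / len(weights), eps)
    # Scale, round to the nearest integer, then clip into the ternary range.
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Approximate the original weights from the ternary codes and the scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    codes, scale = absmean_ternary_quantize([0.9, -0.05, -1.2, 0.3])
    print(codes)  # [1, 0, -1, 0]
    print(dequantize(codes, scale))
```

Each weight then carries one of three values (log2(3), about 1.58 bits), which is where the "b1.58" in the name comes from; the sketch ignores activation quantization and training, which a real model would also need to handle.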

Model Deployment and Efficiency: PRIMA.CPP enables large language models to run on clusters of commodity PCs, that is, in low-spec environments. ReTool uses reinforcement learning to teach LLMs to make strategic use of external tools such as code execution. Antidistillation Sampling guards against knowledge distillation and intellectual-property leakage through model outputs.

Model Evaluation and Analysis: xVerify judges the correctness of LLM answers within complex reasoning traces. ColorBench systematically measures the color recognition and color reasoning abilities of vision-language models. The GPT-4o Study provides an in-depth analysis of knowledge-grounded image generation and editing capabilities.