This week's review covers notable AI research from 2024 Week 34, spanning image generation, 3D reconstruction, computational imaging, LLM evaluation, multimodal learning, and automated AI system design.
Imagen 3 (Google DeepMind): Text-to-image generation model with markedly better prompt understanding, finer detail, richer lighting, and fewer artifacts than earlier Imagen versions. It ships in variants optimized for different tasks, from rapid sketching to high-resolution output, embeds SynthID watermarks so AI-generated images can be identified, and is available through ImageFX and Vertex AI.
MeshFormer: Builds realistic 3D models from sparse multi-view images using a transformer architecture. The work shows that attention mechanisms can reason effectively about 3D geometry from only a few viewpoints, which enables rapid 3D asset creation for games and industrial applications.
DifuzCam: Lensless camera system that replaces the lens with a flat mask and restores high-quality images from the raw sensor measurements using a diffusion model. Removing conventional optics allows ultra-thin cameras, with applications in medical imaging, robotics, and mobile devices.
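To make the lensless idea more concrete, here is a minimal sketch of a forward model often used for mask-based lensless cameras: the sensor records, approximately, the scene convolved with the mask's point spread function (PSF). The reconstruction below uses a classical Wiener filter as a simple stand-in; it is not the diffusion-based reconstruction described in the paper. The random PSF, the image size, the noise-free measurement, and the `reg` constant are all illustrative assumptions.

```python
import numpy as np

def simulate_measurement(scene, psf):
    """Toy forward model: circular convolution of the scene with the PSF (via FFT)."""
    return np.real(np.fft.ifft2(np.fft.fft2(scene) * np.fft.fft2(psf)))

def wiener_reconstruct(measurement, psf, reg=1e-8):
    """Classical Wiener-style deconvolution; `reg` keeps the division stable."""
    H = np.fft.fft2(psf)
    M = np.fft.fft2(measurement)
    estimate = np.conj(H) * M / (np.abs(H) ** 2 + reg)
    return np.real(np.fft.ifft2(estimate))

rng = np.random.default_rng(0)

# Hypothetical scene: a bright rectangle on a dark background.
scene = np.zeros((32, 32))
scene[8:16, 10:20] = 1.0

# Hypothetical pseudo-random mask PSF, normalized to unit total energy.
psf = rng.random((32, 32))
psf /= psf.sum()

measurement = simulate_measurement(scene, psf)  # looks nothing like the scene
recon = wiener_reconstruct(measurement, psf)    # noise-free case, so tiny reg suffices
error = np.abs(recon - scene).mean()
print(f"mean absolute reconstruction error: {error:.2e}")
```

With real sensor noise, a filter like this amplifies noise at frequencies where the PSF response is weak, which is one reason learned priors such as diffusion models are attractive for this problem.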
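Self-Taught Evaluator (Meta): Trains an LLM evaluator without human preference labels. Starting from unlabeled instructions, the method synthesizes contrasting response pairs, has the current judge model produce reasoning traces and verdicts, and retrains the judge on its own filtered judgments over several iterations. This addresses the human-annotation bottleneck in LLM quality assessment pipelines.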
BLIP-3 (xGen-MM, Salesforce): Framework for training large multimodal models efficiently at scale, balancing model capability against training compute and providing an accessible path to multimodal LLM development.
DEEM: Strengthens the visual perception of multimodal LLMs to make them more robust, targeting the systematic gap between language and vision processing that limits their reliability.
ADAS (Automated Design of Agentic Systems): AI systems that design stronger AI systems. In this meta-level approach, an AI searches over agent designs by proposing new agents, evaluating them, and building on the best ones found so far. The results show early potential for recursive improvement of AI capabilities, which carries important safety implications.
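The search over agent designs can be pictured with a toy sketch like the one below. It is only an illustration under strong assumptions: in ADAS an LLM-based meta agent writes new agents, whereas here `propose` merely mutates a small configuration dictionary, and `evaluate` is a made-up scoring function standing in for running an agent on benchmark tasks. All names (`evaluate`, `propose`, `meta_search`, the `steps` and `reflect` fields) are hypothetical.

```python
import random

def evaluate(design):
    """Hypothetical benchmark score in [0, 1]; a real system would run the agent on tasks."""
    score = 0.3 + 0.1 * min(design["steps"], 4) + (0.2 if design["reflect"] else 0.0)
    return min(score, 1.0)

def propose(archive, rng):
    """Stand-in for the meta agent: mutate the best design discovered so far."""
    best = max(archive, key=lambda entry: entry["score"])["design"]
    child = dict(best)
    if rng.random() < 0.5:
        child["steps"] = max(1, child["steps"] + rng.choice([-1, 1]))
    else:
        child["reflect"] = not child["reflect"]
    return child

def meta_search(iterations=20, seed=0):
    """Grow an archive of evaluated designs, each proposal conditioned on the archive."""
    rng = random.Random(seed)
    start = {"steps": 1, "reflect": False}
    archive = [{"design": start, "score": evaluate(start)}]
    for _ in range(iterations):
        design = propose(archive, rng)
        archive.append({"design": design, "score": evaluate(design)})
    return archive

archive = meta_search()
best = max(archive, key=lambda entry: entry["score"])
print(best)
```

The archive is the key design choice in this sketch: every evaluated design is kept, so later proposals can build on earlier discoveries rather than starting from scratch.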
![[24W34] Latest AI Paper Tech Trends (Imagen 3, MeshFormer, DifuzCam, Self-Taught Evaluators)](https://metax-images-bucket.s3.ap-southeast-2.amazonaws.com/articles/24w34-ai-imagen-3-meshformer-difuzcam-self-taught-evaluator-deem-adas-1065587657145967/img-1.webp)