This article reviews notable AI research papers published in Week 36 of 2024 (24W36), covering scientific literature LLMs, multimodal models, time series forecasting, and efficient architectures.

Language Models: SciLitLLM adapts LLMs to scientific literature understanding with a hybrid strategy that combines continued pretraining (CPT) and supervised fine-tuning (SFT). It builds high-quality science corpora through PDF text extraction, correction of parsing errors, and quality filtering, and it generates synthetic instructions for fine-tuning. Mini-Omni introduces an end-to-end audio-based dialogue model that supports real-time speech interaction. It relies on text-instructed speech generation with batch-parallel strategies to keep the base model's language capabilities with minimal degradation, and it is accompanied by the VoiceAssistant-400K dataset. OLMoE presents a fully open Mixture-of-Experts language model with transparent training data and methodology. LongRecipe proposes efficient training strategies for long-context generalization. LongCite improves fine-grained citation generation in long-context QA.
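For readers unfamiliar with the Mixture-of-Experts idea behind the OLMoE entry above, the following is a minimal sketch of generic top-k expert routing. It is not OLMoE's implementation: the function names, the scalar "experts", and the choice to renormalize the selected weights are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of router logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Pick the k most probable experts and renormalize their weights to sum to 1.

    Renormalizing is an illustrative choice; real models differ on this detail.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(
    x: float,
    router_logits: Sequence[float],
    experts: Sequence[Callable[[float], float]],
    k: int = 2,
) -> float:
    """Combine the outputs of the selected experts; unselected experts never run."""
    output = 0.0
    for index, weight in top_k_route(router_logits, k):
        output += weight * experts[index](x)
    return output


# Hypothetical toy experts operating on a single number instead of a hidden vector.
toy_experts = [lambda v: v, lambda v: 2 * v, lambda v: 10 * v]
result = moe_forward(2.0, [0.0, 0.0, -100.0], toy_experts, k=2)
```

The point of the sketch is sparsity: only k experts are evaluated per input, so the parameter count can grow without a proportional increase in compute per token.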

Vision/Multimodal: VisionTS reframes time series forecasting as an image reconstruction task solved by a pretrained visual masked autoencoder (MAE). It reports zero-shot forecasting that outperforms existing time series foundation models without further adaptation to the time series domain, which suggests that knowledge can transfer from computer vision to time series analysis. Kvasir-VQA provides a dataset of text-image pairs for visual question answering on gastrointestinal tract diagnosis. LongLLaVA scales to nearly 1,000 images through a hybrid Mamba-Transformer architecture. Loopy generates audio-driven portrait avatars with temporal consistency. Guide-and-Rescale enables effective editing of real images through a self-guidance mechanism and noise rescaling. Beyond vision, the Attention Heads Survey analyzes how attention heads in LLMs specialize, and FuzzCoder applies LLMs to fuzzing for security testing.
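The VisionTS idea of treating forecasting as image reconstruction can be illustrated with a small sketch of the data layout alone. It assumes the series is folded by a known period into a 2D grid, with the forecast horizon appended as masked columns for a reconstruction model to fill in. The function names are hypothetical, and rendering, resizing, normalization, and the MAE itself are omitted, so this is not the paper's code.

```python
from typing import List, Optional


def series_to_image(series: List[float], period: int) -> List[List[float]]:
    """Fold a 1D series into a grid with one column per period and one row per phase."""
    if period <= 0:
        raise ValueError("period must be positive")
    n_cols = len(series) // period
    if n_cols == 0:
        raise ValueError("series is shorter than one period")
    # Drop the oldest points that do not fill a whole period.
    trimmed = series[len(series) - n_cols * period:]
    return [
        [trimmed[col * period + row] for col in range(n_cols)]
        for row in range(period)
    ]


def append_masked_horizon(
    grid: List[List[float]], horizon_periods: int
) -> List[List[Optional[float]]]:
    """Add masked (None) columns on the right to stand for the values to forecast."""
    return [list(row) + [None] * horizon_periods for row in grid]


def image_to_series(grid: List[List[float]]) -> List[float]:
    """Unfold the grid column by column to recover the 1D ordering."""
    n_rows, n_cols = len(grid), len(grid[0])
    return [grid[row][col] for col in range(n_cols) for row in range(n_rows)]


history = [float(t) for t in range(12)]       # toy look-back window
grid = series_to_image(history, period=4)     # 4 rows (phases) x 3 columns (periods)
masked = append_masked_horizon(grid, 1)       # one extra masked period to reconstruct
```

In this layout, values at the same phase of consecutive periods sit next to each other in a row, so periodic structure appears as horizontal texture that an image model could plausibly continue into the masked columns.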