Scaling Spatial Intelligence with Multimodal Foundation Models • Paper • 2511.13719 • Published Nov 17, 2025 • 46
DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models • Paper • 2512.15713 • Published 20 days ago • 16
In Pursuit of Pixel Supervision for Visual Pre-training • Paper • 2512.15715 • Published 20 days ago • 8