Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published 27 days ago • 102
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders Paper • 2603.06569 • Published 24 days ago • 117
Grounding World Simulation Models in a Real-World Metropolis Paper • 2603.15583 • Published 15 days ago • 150
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published 19 days ago • 64
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale Paper • 2602.23866 • Published Feb 27 • 88
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? Paper • 2603.03241 • Published 28 days ago • 86
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings Paper • 2603.13594 • Published 17 days ago • 145
Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding Paper • 2603.13366 • Published 22 days ago • 94