UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist • Paper • 2511.08521 • Published Nov 11, 2025 • 37
Black-Box On-Policy Distillation of Large Language Models • Paper • 2511.10643 • Published Nov 13, 2025 • 49
Depth Anything 3: Recovering the Visual Space from Any Views • Paper • 2511.10647 • Published Nov 13, 2025 • 96
Music Flamingo: Scaling Music Understanding in Audio Language Models • Paper • 2511.10289 • Published Nov 13, 2025 • 10
Canvas-to-Image: Compositional Image Generation with Multimodal Controls • Paper • 2511.21691 • Published Nov 26, 2025 • 35
DiffusionBrowser: Interactive Diffusion Previews via Multi-Branch Decoders • Paper • 2512.13690 • Published 22 days ago • 2
SpotEdit: Selective Region Editing in Diffusion Transformers • Paper • 2512.22323 • Published 11 days ago • 37