Beyond Scalar Rewards by Internalizing Reasoning into Score Distributions Paper • 2606.09076 • Published 7 days ago • 57
Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models Paper • 2606.11025 • Published 6 days ago • 40
SANA-Streaming: Real-time Streaming Video Editing with Hybrid Diffusion Transformer Paper • 2605.30409 • Published 18 days ago • 38
Representation Forcing for Bottleneck-Free Unified Multimodal Models Paper • 2605.31604 • Published 17 days ago • 60
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments Paper • 2605.30280 • Published 18 days ago • 140
Geo-Align: Video Generation Alignment via Metric Geometry Reward Paper • 2605.23903 • Published 24 days ago • 10
Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models Paper • 2605.21573 • Published 26 days ago • 110
SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer Paper • 2605.15178 • Published May 14 • 86
Continuous-Time Distribution Matching for Few-Step Diffusion Distillation Paper • 2605.06376 • Published May 7 • 27
D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models Paper • 2605.05204 • Published May 6 • 27
Lightning Unified Video Editing via In-Context Sparse Attention Paper • 2605.04569 • Published May 6 • 18
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation Paper • 2604.24764 • Published Apr 27 • 118
OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation Paper • 2604.11804 • Published Apr 13 • 72
Gen-Searcher: Reinforcing Agentic Search for Image Generation Paper • 2603.28767 • Published Mar 30 • 58