OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer Paper • 2601.14250 • Published 3 days ago • 36
OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation Paper • 2601.15369 • Published 1 day ago • 9
Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation Paper • 2512.24271 • Published 24 days ago • 62
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos Paper • 2601.00393 • Published 22 days ago • 124
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models Paper • 2512.24165 • Published 24 days ago • 50
Efficient Memory Management for Large Language Model Serving with PagedAttention Paper • 2309.06180 • Published Sep 12, 2023 • 31
Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield Paper • 2511.22677 • Published Nov 27, 2025 • 31
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published Nov 27, 2025 • 228
TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space Paper • 2501.12224 • Published Jan 21, 2025 • 48
CreatiDesign: A Unified Multi-Conditional Diffusion Transformer for Creative Graphic Design Paper • 2505.19114 • Published May 25, 2025 • 1
IC-Custom: Diverse Image Customization via In-Context Learning Paper • 2507.01926 • Published Jul 2, 2025 • 1
LAMIC: Layout-Aware Multi-Image Composition via Scalability of Multimodal Diffusion Transformer Paper • 2508.00477 • Published Aug 1, 2025 • 10
Does FLUX Already Know How to Perform Physically Plausible Image Composition? Paper • 2509.21278 • Published Sep 25, 2025 • 16
PositionIC: Unified Position and Identity Consistency for Image Customization Paper • 2507.13861 • Published Jul 18, 2025 • 1
DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing Paper • 2510.02253 • Published Oct 2, 2025 • 15
BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing Paper • 2506.17450 • Published Jun 20, 2025 • 64
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance Paper • 2504.01724 • Published Apr 2, 2025 • 68