PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation Paper • 2509.20358 • Published Sep 24, 2025 • 14
MultiverSeg: Scalable Interactive Segmentation of Biomedical Imaging Datasets with In-Context Guidance Paper • 2412.15058 • Published Dec 19, 2024
Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning Paper • 2507.16784 • Published Jul 22, 2025 • 122
Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data Paper • 2507.07095 • Published Jul 9, 2025 • 54
Don't "Overthink" Passage Reranking: Is Reasoning Truly Necessary? Paper • 2505.16886 • Published May 22, 2025 • 6
Align3R: Aligned Monocular Depth Estimation for Dynamic Videos Paper • 2412.03079 • Published Dec 4, 2024 • 2
ProTracker: Probabilistic Integration for Robust and Accurate Point Tracking Paper • 2501.03220 • Published Jan 6, 2025 • 4
SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation Paper • 2411.19921 • Published Nov 29, 2024
MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow Paper • 2502.11697 • Published Feb 17, 2025
LaserHuman: Language-guided Scene-aware Human Motion Generation in Free Environment Paper • 2403.13307 • Published Mar 20, 2024
CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects Paper • 2505.21437 • Published May 27, 2025 • 21
Multi-OphthaLingua: A Multilingual Benchmark for Assessing and Debiasing LLM Ophthalmological QA in LMICs Paper • 2412.14304 • Published Dec 18, 2024 • 1
WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation Paper • 2410.12722 • Published Oct 16, 2024 • 5
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization Paper • 2503.19901 • Published Mar 25, 2025 • 41
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control Paper • 2501.03847 • Published Jan 7, 2025 • 23