Droplet3D: Commonsense Priors from Videos Facilitate 3D Generation Paper • 2508.20470 • Published Aug 28, 2025 • 75
DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation Paper • 2503.06053 • Published Mar 8, 2025 • 138
Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published Jun 28, 2024 • 104
CoSTA*: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing Paper • 2503.10613 • Published Mar 13, 2025 • 79
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published Mar 6, 2025 • 96
AudioX: Diffusion Transformer for Anything-to-Audio Generation Paper • 2503.10522 • Published Mar 13, 2025 • 27
Aligning Multimodal LLM with Human Preference: A Survey Paper • 2503.14504 • Published Mar 18, 2025 • 26
Frac-Connections: Fractional Extension of Hyper-Connections Paper • 2503.14125 • Published Mar 18, 2025 • 22