D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI Paper • 2510.05684 • Published Oct 7 • 141
MATRIX: Mask Track Alignment for Interaction-aware Video Generation Paper • 2510.07310 • Published Oct 8 • 35 • 3
Visual Representation Alignment for Multimodal Large Language Models Paper • 2509.07979 • Published Sep 9 • 83 • 7