VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos Paper • 2505.23693 • Published May 29, 2025 • 53
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13, 2025 • 191
RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models Paper • 2402.12908 • Published Feb 20, 2024 • 10
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization Paper • 2402.13249 • Published Feb 20, 2024 • 15
Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability Paper • 2402.12225 • Published Feb 19, 2024 • 9
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements Paper • 2402.10963 • Published Feb 13, 2024 • 12
DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation Paper • 2402.11929 • Published Feb 19, 2024 • 11
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration Paper • 2402.11550 • Published Feb 18, 2024 • 18
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots Paper • 2402.10329 • Published Feb 15, 2024 • 15
RLVF: Learning from Verbal Feedback without Overgeneralization Paper • 2402.10893 • Published Feb 16, 2024 • 12
PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter Paper • 2402.10896 • Published Feb 16, 2024 • 16
DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization Paper • 2402.09812 • Published Feb 15, 2024 • 16
Computing Power and the Governance of Artificial Intelligence Paper • 2402.08797 • Published Feb 13, 2024 • 15
PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models Paper • 2402.08714 • Published Feb 13, 2024 • 15