Nazzaroth2's Collections
VLM RL Reasoning
OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement
Paper • 2503.17352 • Published • 24
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation
Paper • 2503.16660 • Published • 72
CoMP: Continual Multimodal Pre-training for Vision Foundation Models
Paper • 2503.18931 • Published • 30
MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
Paper • 2503.13964 • Published • 20
Qwen2.5-Omni Technical Report
Paper • 2503.20215 • Published • 168
ViLBench: A Suite for Vision-Language Process Reward Modeling
Paper • 2503.20271 • Published • 7
Video-R1: Reinforcing Video Reasoning in MLLMs
Paper • 2503.21776 • Published • 79
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme
Paper • 2504.02587 • Published • 32
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement
Paper • 2504.07934 • Published • 20
Efficient Medical VIE via Reinforcement Learning
Paper • 2506.13363 • Published • 31
Aha Moment Revisited: Are VLMs Truly Capable of Self Verification in Inference-time Scaling?
Paper • 2506.17417 • Published • 11