VLM RL Reasoning - a Nazzaroth2 Collection

updated Jul 1, 2025

OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement
Paper • 2503.17352 • Published Mar 21, 2025 • 24

When Less is Enough: Adaptive Token Reduction for Efficient Image Representation
Paper • 2503.16660 • Published Mar 20, 2025 • 72

CoMP: Continual Multimodal Pre-training for Vision Foundation Models
Paper • 2503.18931 • Published Mar 24, 2025 • 30

MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
Paper • 2503.13964 • Published Mar 18, 2025 • 20

Qwen2.5-Omni Technical Report
Paper • 2503.20215 • Published Mar 26, 2025 • 168

ViLBench: A Suite for Vision-Language Process Reward Modeling
Paper • 2503.20271 • Published Mar 26, 2025 • 7

Video-R1: Reinforcing Video Reasoning in MLLMs
Paper • 2503.21776 • Published Mar 27, 2025 • 79

Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme
Paper • 2504.02587 • Published Apr 3, 2025 • 32

SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement
Paper • 2504.07934 • Published Apr 10, 2025 • 20

Efficient Medical VIE via Reinforcement Learning
Paper • 2506.13363 • Published Jun 16, 2025 • 31

Aha Moment Revisited: Are VLMs Truly Capable of Self Verification in Inference-time Scaling?
Paper • 2506.17417 • Published Jun 20, 2025 • 11