-
Diffusion Language Models Know the Answer Before Decoding
Paper • 2508.19982 • Published • 27 -
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
Paper • 2512.13586 • Published • 93 -
LSRIF: Logic-Structured Reinforcement Learning for Instruction Following
Paper • 2601.06431 • Published • 12 -
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning
Paper • 2601.09088 • Published • 62
Collections
Collections including paper arxiv:2601.22975
-
mHC: Manifold-Constrained Hyper-Connections
Paper • 2512.24880 • Published • 307 -
Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process
Paper • 2512.23988 • Published • 17 -
SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time
Paper • 2512.25075 • Published • 15 -
Guiding a Diffusion Transformer with the Internal Dynamics of Itself
Paper • 2512.24176 • Published • 8
-
Agentic Reasoning for Large Language Models
Paper • 2601.12538 • Published • 195 -
ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas
Paper • 2601.21558 • Published • 58 -
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text
Paper • 2601.22975 • Published • 100
-
V-Thinker: Interactive Thinking with Images
Paper • 2511.04460 • Published • 97 -
Visual Spatial Tuning
Paper • 2511.05491 • Published • 52 -
BabyVision: Visual Reasoning Beyond Language
Paper • 2601.06521 • Published • 196 -
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text
Paper • 2601.22975 • Published • 100
-
On Memorization of Large Language Models in Logical Reasoning
Paper • 2410.23123 • Published • 18 -
LLMs Do Not Think Step-by-step In Implicit Reasoning
Paper • 2411.15862 • Published • 9 -
Training Large Language Models to Reason in a Continuous Latent Space
Paper • 2412.06769 • Published • 94 -
Deliberation in Latent Space via Differentiable Cache Augmentation
Paper • 2412.17747 • Published • 32
-
MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods
Paper • 2601.21821 • Published • 59 -
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text
Paper • 2601.22975 • Published • 100 -
Reinforced Attention Learning
Paper • 2602.04884 • Published • 27 -
LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts
Paper • 2510.19363 • Published • 62
-
Behavior Knowledge Merge in Reinforced Agentic Models
Paper • 2601.13572 • Published • 24 -
Language of Thought Shapes Output Diversity in Large Language Models
Paper • 2601.11227 • Published • 9 -
Agentic-R: Learning to Retrieve for Agentic Search
Paper • 2601.11888 • Published • 19 -
RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System
Paper • 2602.02488 • Published • 32
-
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 627 -
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 300 -
Group Sequence Policy Optimization
Paper • 2507.18071 • Published • 316 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 211
-
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
Paper • 2407.20798 • Published • 24 -
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 38 -
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
Paper • 2501.03262 • Published • 104 -
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Paper • 2502.18449 • Published • 75