rl - a z1z2z3zyy Collection

z1z2z3zyy 's Collections

rl

rl

updated 30 days ago

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Paper • 2508.13167 • Published Aug 6, 2025 • 129
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 96
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

Paper • 2511.16043 • Published Nov 20, 2025 • 108
Agentic Entropy-Balanced Policy Optimization

Paper • 2510.14545 • Published Oct 16, 2025 • 104
Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published Jul 26, 2025 • 158
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search

Paper • 2506.11902 • Published Jun 13, 2025
Tree Search for LLM Agent Reinforcement Learning

Paper • 2509.21240 • Published Sep 25, 2025 • 89