z1z2z3zyy's Collections
rl

updated 30 days ago

  • Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

    Paper • 2508.13167 • Published Aug 6, 2025 • 129

  • Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

    Paper • 2512.01374 • Published Dec 1, 2025 • 96

  • Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

    Paper • 2511.16043 • Published Nov 20, 2025 • 108

  • Agentic Entropy-Balanced Policy Optimization

    Paper • 2510.14545 • Published Oct 16, 2025 • 104

  • Agentic Reinforced Policy Optimization

    Paper • 2507.19849 • Published Jul 26, 2025 • 158

  • TreeRL: LLM Reinforcement Learning with On-Policy Tree Search

    Paper • 2506.11902 • Published Jun 13, 2025

  • Tree Search for LLM Agent Reinforcement Learning

    Paper • 2509.21240 • Published Sep 25, 2025 • 89