FlowRL: Matching Reward Distributions for LLM Reasoning Paper • 2509.15207 • Published Sep 18, 2025 • 116
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21, 2025 • 262
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published Aug 7, 2025 • 181
A Survey of Context Engineering for Large Language Models Paper • 2507.13334 • Published Jul 17, 2025 • 260
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6, 2025 • 189
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models Paper • 2505.04921 • Published May 8, 2025 • 185
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11, 2025 • 130