rlhf/finetune
• A Critical Evaluation of AI Feedback for Aligning Large Language Models (arXiv:2402.12366)
• Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation (arXiv:2401.08417)
• Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks (arXiv:2404.14723)
• Self-Play Preference Optimization for Language Model Alignment (arXiv:2405.00675)
• Show, Don't Tell: Aligning Language Models with Demonstrated Feedback (arXiv:2406.00888)
• Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level (arXiv:2406.11817)
• Following Length Constraints in Instructions (arXiv:2406.17744)
• Understanding the performance gap between online and offline alignment algorithms (arXiv:2405.08448)
• Direct Language Model Alignment from Online AI Feedback (arXiv:2402.04792)
• Contrastive Prefence Learning: Learning from Human Feedback without RL (arXiv:2310.13639)
• arXiv:2408.02666
• Training Language Models to Self-Correct via Reinforcement Learning (arXiv:2409.12917)
• Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models (arXiv:2401.01335)
• The Differences Between Direct Alignment Algorithms are a Blur (arXiv:2502.01237)
• Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate (arXiv:2501.17703)
• SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training (arXiv:2501.17161)
• REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models (arXiv:2501.03262)
• Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training (arXiv:2501.11425)
• DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (arXiv:2501.12948)
• START: Self-taught Reasoner with Tools (arXiv:2503.04625)
• Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback (arXiv:2503.22230)
• Absolute Zero: Reinforced Self-play Reasoning with Zero Data (arXiv:2505.03335)