BLEUBERI: BLEU is a surprisingly effective reward for instruction following
Paper: 2505.11080
This collection contains datasets and models related to "BLEUBERI: BLEU is a surprisingly effective reward for instruction following".