Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities Paper • 2507.06261 • Published Jul 7, 2025 • 66
SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe Paper • 2410.05248 • Published Oct 7, 2024 • 9
WPO: Enhancing RLHF with Weighted Preference Optimization Paper • 2406.11827 • Published Jun 17, 2024 • 17