Triple Preference Optimization: Achieving Better Alignment with Less Data in a Single Step Optimization Paper • 2405.16681 • Published May 26, 2024
ToW: Thoughts of Words Improve Reasoning in Large Language Models Paper • 2410.16235 • Published Oct 21, 2024
Step-by-Step Reasoning to Solve Grid Puzzles: Where do LLMs Falter? Paper • 2407.14790 • Published Jul 20, 2024
ThinkTuning: Instilling Cognitive Reflections without Distillation Paper • 2508.07616 • Published Aug 11, 2025