ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models Paper • 2502.09696 • Published Feb 13, 2025 • 43
People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text Paper • 2501.15654 • Published Jan 26, 2025 • 15
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming Paper • 2501.18837 • Published Jan 31, 2025 • 10
Preference Datasets for DPO Collection This collection contains a list of curated preference datasets for DPO fine-tuning for intent alignment of LLMs • 7 items • Updated Dec 11, 2024 • 46