(Some) Emergent Misalignment from Reward Hacking in RL • Collection • Model checkpoints from the project "(Some) Natural Emergent Misalignment from Reward Hacking in Non-Production RL" • 228 items • Updated 29 days ago • 4
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2 • Image-Text-to-Text • 28B • Updated 22 days ago • 593k • 116
🇮🇹 Italian NLP Resources • Collection • Collection of models, datasets and demos relevant to Italian NLP 🇮🇹 • 300 items • Updated Mar 26 • 34
👤 Implicit Personalization in Language Models • Collection • Works on detecting, attributing and controlling implicit personalization in language models • 29 items • Updated Mar 20 • 3