Gabriele Sarti

gsarti

https://gsarti.com

AI & ML interests

Interpretability for generative language models

Recent Activity

liked a dataset 2 days ago

ai-safety-institute/propensity-inference

Viewer • Updated 13 days ago • 7 • 8 • 1

liked a dataset 3 days ago

evaleval/EEE_datastore

Viewer • Updated about 9 hours ago • 10.7k • 2.97k • 20

upvoted a collection 4 days ago

DeepSeek-V4

4 items • Updated 4 days ago • 553

upvoted a collection 7 days ago

(Some) Emergent Misalignment from Reward Hacking in RL

Model checkpoints from the project "(Some) Natural Emergent Misalignment from Reward Hacking in Non-Production RL" • 228 items • Updated 29 days ago • 4

liked a model 7 days ago

moonshotai/Kimi-K2.6

Image-Text-to-Text • 1.1T • Updated 5 days ago • 443k • 1.1k

liked a dataset 8 days ago

allenai/WildChat-4.8M

Viewer • Updated Aug 11, 2025 • 3.2M • 6.49k • 149

liked 2 models 11 days ago

SandyResearch/parcae-140m

Text Generation • Updated 11 days ago • 2.67k • 3

Qwen/Qwen3.6-35B-A3B

Image-Text-to-Text • 36B • Updated 4 days ago • 1.35M • 1.46k

liked a model 25 days ago

google/gemma-4-31B-it

Image-Text-to-Text • 33B • Updated 17 days ago • 6.31M • 2.4k

liked a model 27 days ago

LiquidAI/LFM2.5-350M

Text Generation • 0.4B • Updated 26 days ago • 57k • 284

liked 2 models 28 days ago

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2

Image-Text-to-Text • 28B • Updated 22 days ago • 593k • 116

microsoft/harrier-oss-v1-0.6b

Feature Extraction • 0.6B • Updated 29 days ago • 161k • 209

liked a model 29 days ago

facebook/tribev2

Updated Mar 27 • 166k • 443

updated a collection about 1 month ago

🇮🇹 Italian NLP Resources

Collection of models, datasets and demos relevant to Italian NLP 🇮🇹 • 300 items • Updated Mar 26 • 34

liked a model about 1 month ago

nickprock/zagreus-0.4B-ita-embeddings

Sentence Similarity • 0.4B • Updated Mar 23 • 48 • 5

updated a dataset about 1 month ago

gsarti/cruciverbit_augmented

Viewer • Updated Mar 24 • 1.56M • 20 • 1

published a dataset about 1 month ago

gsarti/cruciverbit_augmented

Viewer • Updated Mar 24 • 1.56M • 20 • 1

updated a collection about 1 month ago

👤 Implicit Personalization in Language Models

Works on detecting, attributing and controlling implicit personalization in language models • 29 items • Updated Mar 20 • 3

updated a dataset about 1 month ago

gsarti/temp_crosswords_train_v3

Viewer • Updated Mar 17 • 541k • 14