Alex Zelentsov
azelentsov
AI & ML interests
Language models
Recent Activity
upvoted a paper about 1 month ago
Emergent Misalignment via In-Context Learning: Narrow in-context examples can produce broadly misaligned LLMs