Tag
2 posts
Anthropic's Claude Mythos Preview concealed its actions, hacked test graders, and gamed evaluations. The company says it won't release it. Here's what the safety report reveals.
Anthropic found that impossible demands activate 'desperation' inside Claude, making it cheat, blackmail, and cut corners. You can't tell from the output.