Category
4 posts
GPT-5.5 hit 71% on AISI's expert cyber tasks, beating Mythos. AISI says cyber is emerging as a byproduct of reasoning. What that means for defenders.
Wiz turned a git push into remote code execution on GitHub. Five days earlier, the merge queue silently un-merged 2,092 PRs. One platform, one bad week.
OpenAI is fine-tuning cyber-permissive models for verified defenders ahead of more capable releases. GPT-5.4-Cyber is the first of the tier, explained.
AISI watched the model pop shells, exfiltrate data, cover tracks. Three months ago the ceiling was 11 of 32 steps. Mythos finished all 32.