Attackers used “technical assessment” projects with repeatable naming conventions to blend in cloning and build workflows, retrieving loader scripts from remote infrastructure, and minimizing on-disk ...
OpenAI wants to retire the leading AI coding benchmark—and the reasons reveal a deeper problem with how the whole industry measures itself.