OpenAI Researches AI Agents Detecting Smart Contract Flaws

by MISSISSIPPI DIGITAL MAGAZINE


OpenAI has launched a new benchmark that evaluates how well different AI models detect, patch, and even exploit security vulnerabilities found in crypto smart contracts.

OpenAI released the “EVMbench: Evaluating AI Agents on Smart Contract Security” paper on Wednesday, in collaboration with crypto investment firm Paradigm and crypto security firm OtterSec, to evaluate how much the AI agents could theoretically exploit from 120 smart contract vulnerabilities.

Anthropic’s Claude Opus 4.6 came out on top with an average “detect award” of $37,824, followed by OpenAI’s OC-GPT-5.2 and Google’s Gemini 3 Pro at $31,623 and $25,112, respectively.

Detect awards won by AI agents. Source: OpenAI

While AI agents are becoming increasingly efficient at handling basic tasks, OpenAI said it is becoming more important to evaluate their performance in “economically meaningful environments.”

“Smart contracts secure billions of dollars in assets, and AI agents are likely to be transformative for both attackers and defenders.”

“We expect agentic stablecoin payments to grow, and help ground it in a domain of emerging practical importance,” OpenAI added.