OpenAI Aardvark: GPT-5 Agentic Security Researcher Scanning Codebases

Aardvark, powered by GPT-5, autonomously scans repositories, models threats, validates exploits in sandboxes, and suggests Codex patches for human review. In private beta, it integrates with GitHub and CI/CD to speed vulnerability triage, shorten patch cycles, and bring AI-driven security directly into the developer workflow.


TL;DR

  • Aardvark (GPT‑5): autonomous security agent that continuously analyzes codebases to find and validate vulnerabilities and propose fixes; currently in private beta
  • Multi-stage pipeline: analysis (repository threat model), commit scanning (live and historical), sandboxed validation, and Codex-generated patching
  • LLM-driven reasoning and tool use rather than fuzzing/SCA — reads code, writes and runs tests, invokes tools, and annotates steps for reviewer consumption
  • 92% detection on benchmark “golden” repositories; deployed across OpenAI internal codebases and alpha partners; discovered vulnerabilities in open-source projects with ten CVEs assigned; pro-bono scans for selected non-commercial OSS
  • Integrates with GitHub, Codex, and CI/CD; explains exploitability, annotates code, and proposes Codex-generated patches for human review; also surfaces logic and privacy issues
  • Private beta for select partners; beta signup: https://www.openai.com/form/aardvark-beta-signup

OpenAI launches Aardvark, an agentic security researcher powered by GPT‑5

OpenAI has introduced Aardvark, an autonomous security agent powered by GPT‑5 that continuously analyzes codebases to find, validate, and propose fixes for vulnerabilities. Now in private beta, Aardvark is designed to integrate with GitHub and existing development workflows, delivering human-style reasoning over code and producing actionable remediation suggestions.

How Aardvark operates

Aardvark uses a multi-stage pipeline to move from understanding a project to proposing a patch:

  • Analysis: It builds a threat model of the full repository to capture security objectives and design assumptions.
  • Commit scanning: It monitors commits and inspects changes against the repository and threat model, also scanning historical commits when first connected.
  • Validation: Aardvark attempts to trigger each potential vulnerability in an isolated, sandboxed environment to confirm exploitability and reduce false positives.
  • Patching: Aardvark attaches Codex-generated patches to findings for human review and streamlined pull requests.

Rather than depending on traditional program-analysis techniques like fuzzing or software composition analysis, Aardvark relies on LLM-driven reasoning and tool use — reading code, writing and running tests, invoking tools, and explaining steps in an annotated, review-friendly format.

Measured performance and real-world use

In benchmark testing on “golden” repositories, Aardvark identified 92% of known and synthetically introduced vulnerabilities, indicating high recall in those scenarios. The agent has been running across OpenAI’s internal codebases and with alpha partners, surfacing vulnerabilities that occur under complex conditions and contributing to OpenAI’s defensive efforts.

Applied to open-source projects, Aardvark has discovered multiple vulnerabilities that were responsibly disclosed; ten of those have received CVE identifiers. OpenAI also intends to offer pro-bono scanning for selected non-commercial open-source repositories.

Workflow integration and practical benefits

Aardvark is built to work alongside engineers by integrating with GitHub, Codex, and existing CI/CD processes. For each finding, it explains exploitability steps, annotates code for reviewers, and proposes a Codex-generated patch, aiming to accelerate triage and remediation while preserving human oversight. In testing, the agent has also uncovered non-security issues such as logic flaws and privacy problems.
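A finding that carries exploitability steps, code annotations, and a proposed patch might be rendered for reviewers roughly as follows. The field names and layout are assumptions for illustration, not Aardvark's actual report format.

```python
def format_finding(title: str, exploit_steps: list[str],
                   annotated_lines: dict[int, str], patch: str) -> str:
    """Render one finding as a markdown-style comment for a pull request."""
    lines = [f"### {title}", "", "**Exploitability:**"]
    lines += [f"{i}. {step}" for i, step in enumerate(exploit_steps, 1)]
    lines += ["", "**Annotated code:**"]
    lines += [f"- line {n}: {note}" for n, note in sorted(annotated_lines.items())]
    lines += ["", "**Proposed patch (Codex-generated, pending human review):**",
              patch]
    return "\n".join(lines)
```

Keeping the patch clearly labeled as pending review mirrors the article's point: the agent accelerates triage, but a human still approves the fix.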

OpenAI has also updated its outbound coordinated disclosure policy to emphasize collaboration and scalable impact. The policy is available at https://openai.com/policies/outbound-coordinated-disclosure-policy/, with additional context on responsible disclosure at https://openai.com/index/scaling-coordinated-vulnerability-disclosure/

Private beta and participation

Aardvark is currently available in private beta. Select partners will receive early access and work directly with OpenAI to refine detection accuracy, validation workflows, and reporting. Organizations and open-source projects interested in participating can apply via the beta signup form: https://www.openai.com/form/aardvark-beta-signup

For full details, see the original OpenAI announcement: https://openai.com/index/introducing-aardvark/
