Mythos & Semgrep

For defenders to keep up, they must layer AI with deterministic tooling, fix issues quickly, and build real defense-in-depth into their programs.

April 17th, 2026

For users of Semgrep: how should we think about Mythos?

First, we're fortunate that several of the foundation labs have partnered with us to provide early access to new models before release; we've been able to give feedback to improve the models as well as use them to improve Semgrep. Yesterday Semgrep was publicly announced as one of OpenAI's Trusted Access for Cyber grant recipients!

A new kind of pentest

We think of Mythos, and models in general, as a new kind of pentest: sophisticated, but probabilistic and variably expensive per bug found. It lacks the speed, determinism, and guaranteed coverage of traditional SAST tooling. Semgrep, Anthropic, and OpenAI all explicitly recommend running both LLMs and SAST1,2. Semgrep's specific bet is different: we think the traditional standalone SAST approach is obsolete, which is why our new architecture is about making the models 10x better by giving them access to great tools like our Pro Engine.

Guidance for AppSec teams

AppSec teams need to:

  • Run the models that attackers are going to use, as quickly as possible, and rapidly fix the exploitable issues.
    Unfortunately, many organizations are already drowning in vulnerability noise. Having a new source of vulnerability data is not necessarily helpful without new capabilities for prioritizing and fixing.

    Because most organizations don't have access to Mythos today, we recommend using Semgrep Multimodal. Layering our capabilities on top of the existing models gets much closer to the frontier of what's possible. For more details, see the conclusion of today's post from one of our security researchers.

    How we can help: Semgrep's new Multimodal product makes LLMs 10x better. It's not just running LLMs + Semgrep: it finds vulnerabilities agentically, using a rotating cast of models and giving them access to Semgrep's Pro Engine. That means lower cost per vulnerability and faster runtimes, along with correctness, determinism, and provable coverage, which translates to up to 8x more true positives and 50% fewer false positives than the models alone. Since entering beta last year, Multimodal has already found many incident-level 0-days at customers, and it went GA last month.

  • Ensure that new code being written by models is well-tested as early as possible.
    The models write code that is significantly less secure than what the average developer writes. Semgrep has popular plugins for Claude, Cursor, and Codex that can run security checks at generation time.

  • Invest in defense-in-depth for AppSec.
    The best security programs have layers that prevent a single bug from becoming a showstopping single point of failure. This isn't traditional "find and fix a vulnerability" work; it's hardening, like using lockfiles so you know what your dependencies actually are during the next supply chain attack.

    Semgrep Workflows is in private preview with select customers. It takes our product from "find bugs" to "automate security engineering tasks". The power-user pattern of using Semgrep to find and eliminate bad patterns in a codebase (not just the SAST/SCA "scanning tool" use case of finding vulnerabilities) has historically required a dedicated security engineer; with Workflows, every one of our customers can work this way. We're super excited about expanding this to more customers.
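The generation-time checking in the second bullet above can be sketched generically: gate model-written code on a scan before it lands. This is an illustrative deny-list standing in for a real scanner like Semgrep; the rule patterns and the `gate` helper are hypothetical, not our plugin API:

```python
import re

# Hypothetical rules standing in for a real ruleset (illustration only).
RULES = {
    "tainted-subprocess": re.compile(r"subprocess\.(run|Popen)\([^)]*shell=True"),
    "hardcoded-secret": re.compile(r"(?i)(api_key|password)\s*=\s*['\"]\w+['\"]"),
}

def gate(generated_code: str) -> list[str]:
    """Return the IDs of rules that fire on model-generated code.

    A generation-time hook would block the edit (or ask the model to
    repair it) whenever this list is non-empty.
    """
    return [rule_id for rule_id, rx in RULES.items() if rx.search(generated_code)]

# A finding blocks the edit; clean code passes through.
assert gate("password = 'hunter2'") == ["hardcoded-secret"]
assert gate("print('hello')") == []
```

The point is where the check runs: at generation time, before the insecure code is ever committed, rather than in a later scan of the repository.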
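The lockfile hardening in the last bullet boils down to pinning each dependency to a content hash, so "what are my dependencies?" has a checkable answer during the next supply chain attack. A minimal sketch of the mechanism, with a hypothetical package name and `verify` helper (not any real package manager's implementation):

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Content hash used to pin an artifact in the lockfile."""
    return hashlib.sha256(data).hexdigest()

# A lockfile maps each dependency artifact to the hash of the bytes
# you originally audited (hypothetical package name).
LOCKFILE = {"leftpad-1.3.0.tgz": sha256_hex(b"the contents you audited")}

def verify(name: str, data: bytes) -> bool:
    """At install time, a swapped or tampered artifact fails the pin check."""
    return LOCKFILE.get(name) == sha256_hex(data)

# The audited artifact installs; a tampered one is rejected.
assert verify("leftpad-1.3.0.tgz", b"the contents you audited")
assert not verify("leftpad-1.3.0.tgz", b"malicious payload")
```

Real package managers implement this for you (hash-pinned lockfiles with hash-checking installs); the hardening work is making sure your builds actually refuse to proceed when verification fails.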

Defense-in-depth in action

We're using these capabilities ourselves in a way that may be illustrative: last month, Semgrep was targeted by TeamPCP (who successfully hacked two of our competitors). We found two significant vulnerabilities; we were protected thanks to defense-in-depth, but guess who the committer was on both of them? Claude. Semgrep had actually spotted the vulnerabilities, but Claude ignored the findings (something we're fixing now by moving from MCP to hooks). The best part: the fixes were fully automated through our platform, and we were also able to do proactive hardening with Semgrep Workflows to deepen our defense-in-depth.


Winning organizations won't be the ones chasing every new AI capability; they'll be the ones that execute well, keep things simple, and build systems that can take a hit. AI raises the stakes on both sides of the fight, but at the end of the day, good fundamentals and solid architecture are still what separate the best from the rest in cybersecurity.

The coming wave of vulnerabilities is going to be a challenge, but we are here to help!

1 https://openai.com/index/why-codex-security-doesnt-include-sast "SAST tools are still very important…We expect the security tooling ecosystem to keep improving: static analysis, fuzzing, runtime guards, and agentic workflows will all have roles."
2 https://claude.com/blog/preparing-your-security-program-for-ai-accelerated-offense "Add static analysis and AI-assisted code review to your continuous integration pipeline, and block merges on high-confidence findings."