Through a Scanner Falsely: When AI-reported Critical Vulnerabilities Aren’t

Investigating an AI coding agent Host Header Injection alert

Jonathan Werrett
November 4th, 2025

Security folks love automation. We have to. The game is asymmetric; our teams are buried under alerts, dashboards, and ever-expanding backlogs. Attackers aren't. But we've learnt over the years that automation without context can quickly become harmful noise rather than helpful signal.

Recently, one of our internal experiments with an AI-based code reviewer surfaced what it confidently labeled a "Host Header Injection: CRITICAL VULNERABILITY 🚨".

The explanation was thorough and the potential impact severe: OAuth hijacking, account takeover, the works. As a security engineer reading the analysis, I definitely felt my pulse spike. There was even an emoji. We're cooked. Page the engineering team!

Except… it wasn’t actually an exploitable issue. False positives like this waste time, erode trust in our security tools, and, probably worse, risk burning credibility with my peers in engineering.

The Host Header Injection “Vulnerability”

Here’s the gist of the AI’s finding:

  • The code read the hostname from the Host header of incoming requests.

  • That value was used in a way that could influence redirect URIs in an OAuth flow.

  • Therefore, an attacker could control the redirect target and steal OAuth codes.

On paper, this looks damning. The AI even helpfully provided a curl command that would demonstrate the issue.
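
To make that concrete, here is a minimal sketch of the shape of code that trips this alert. It is not our actual code; the framework (Flask), routes, client ID, and endpoints are all invented for illustration.

  from urllib.parse import urlencode

  from flask import Flask, redirect, request

  app = Flask(__name__)

  # Placeholder identity provider endpoint, not a real one.
  AUTHORIZE_ENDPOINT = "https://idp.example.com/oauth/authorize"

  @app.route("/login")
  def login():
      # request.host comes from the client-supplied Host header, so on paper
      # an attacker who controls that header also controls where the OAuth
      # authorization code gets sent.
      redirect_uri = f"https://{request.host}/oauth/callback"
      params = urlencode({
          "client_id": "example-client",  # placeholder
          "response_type": "code",
          "redirect_uri": redirect_uri,
      })
      return redirect(f"{AUTHORIZE_ENDPOINT}?{params}")

Read in isolation, that is exactly the "attacker-controlled value flows into a redirect" pattern a scanner is trained to pounce on.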

But, dear reader, there was no major concern. Our infrastructure won't accept requests with the wrong Host header. Browsers and modern stacks don't just blindly serve requests for unknown hosts. So while the code technically read as vulnerable, the exploit chain was dead on arrival. That semantic context was missing from the AI's analysis.
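
For a rough sketch of what that first layer looks like (illustrative middleware, not a description of our actual edge; the hostnames are placeholders), rejecting unknown hosts before any application logic runs is enough to kill the chain:

  from flask import Flask, abort, request

  app = Flask(__name__)

  # Placeholder allowlist; in practice this comes from deployment config.
  ALLOWED_HOSTS = {"app.example.com", "staging.example.com"}

  @app.before_request
  def reject_unknown_hosts():
      # Strip an optional port before comparing against the allowlist.
      host = request.host.split(":", 1)[0].lower()
      if host not in ALLOWED_HOSTS:
          # A request with a spoofed Host header never reaches the OAuth
          # handler, so the "vulnerable" code is never fed attacker input.
          abort(400)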

Vulnerable ≠ Exploitable

This is the teachable moment.

To be fair, the code could be considered vulnerable to Host header injection, not unlike an Open Redirect issue. But the way the OAuth protocol works, not to mention the surrounding ecosystem of load balancers, reverse proxies, and browser constraints, made exploitation impossible.
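
One reason the protocol gets in the way: authorization servers are expected to validate the redirect URI against what the client registered. Here is a toy sketch of that server-side check, assuming exact-match registration; the client IDs, URIs, and function names are placeholders, not any particular provider's implementation.

  import secrets

  # Placeholder registration data; real servers load this from client config.
  REGISTERED_REDIRECT_URIS = {
      "example-client": {"https://app.example.com/oauth/callback"},
  }

  def authorize(client_id: str, redirect_uri: str, state: str) -> str:
      # A redirect_uri poisoned via the Host header simply fails this check,
      # so no authorization code is ever sent to the attacker's host.
      if redirect_uri not in REGISTERED_REDIRECT_URIS.get(client_id, set()):
          raise ValueError("redirect_uri does not match client registration")
      code = secrets.token_urlsafe(24)  # stand-in for real code issuance
      return f"{redirect_uri}?code={code}&state={state}"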

That distinction matters:

  • Vulnerable: the code has a weakness.

  • Exploitable: that weakness can actually be leveraged to cause harm.

A similar pattern holds for our software supply chain checks (there's a short sketch of this after the list):

  • Vulnerable: the code has a weakness.

  • Reachable: our code actually calls the affected function, so the weakness could cause harm.
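
Here is that sketch: a contrived, self-contained example where vendor_lib stands in for a third-party dependency (it is not a real package) that ships one weak function alongside safe ones.

  class vendor_lib:
      # Invented stand-in for a third-party dependency.

      @staticmethod
      def parse_expr(expr: str):
          # Imagine this is the function named in the advisory: it evaluates
          # arbitrary input.
          return eval(expr)  # intentionally unsafe stand-in

      @staticmethod
      def parse_int(value: str) -> int:
          # A safe sibling function from the same package.
          return int(value)

  def handle_user_input(value: str) -> int:
      # Our code only ever calls the safe function. Dependency scanning will
      # still flag the package as vulnerable, but the weak function is not
      # reachable from any code path we actually ship.
      return vendor_lib.parse_int(value)

  if __name__ == "__main__":
      print(handle_user_input("42"))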

AI analysis (and fair cop, sometimes SAST tools) can blur that line. They’re great at spotting “weak patterns” but blind to real-world context. 

An experienced security engineer will quickly suspect that's what is going on in an example like this. A header controllable by end users? Nice catch! Using it to hijack a user's OAuth flow? Sounds suspect. But spare a thought for the software engineer who lands this scary-looking issue in their ticket queue.

What was missing from this workflow was the ability to customize and tune the policies and rules for what counts as exploitable.

The Cost of False Positives in AI Security

This wasn’t just a funny observation about an overeager LLM. It’s a reminder of why many engineering teams grow frustrated with security tools:

  • Backlogs fill up with issues that have no real impact.

  • Engineers lose trust because of the signal-to-noise ratio.

  • Security burns credibility asking teams to “investigate” things that don’t pose a risk.

If your developers are rolling their eyes at yet another “CRITICAL 🚨” ticket, your security program is losing ground.

What Security Leaders Should Take Away

  1. Defense in Depth Still Wins
    Yes, you should validate browser-provided Host headers. Yes, you should reject unrecognized hosts. But not every weakness needs to be a fire drill if the surrounding layers already protect you. That context matters when setting rules and policies. (There's a short sketch of the config-driven fix after this list.)

  2. Integrate, Don’t Backlog
    The real promise of AI in security isn't dumping alerts into a ticketing system. It's surfacing "belt and suspenders" fixes inline, where developers can apply them immediately while in flow. Taking off my skeptical security hat, this is the thing that excites me the most. Imagine fixing security issues (big and small) without a findings backlog and a loop through engineering's priority stack.

  3. Teach the Difference
    Customize your tools with context and policies. Tuning out false positives takes exactly that: tuning the system on what is and isn't above the cut line.
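
To pull point 1 back to this incident: the "belt and suspenders" fix is to stop deriving externally visible URLs from the request at all. A minimal sketch, assuming a per-environment config value (the variable name and URLs are invented for illustration):

  import os
  from urllib.parse import urljoin

  # Set once per environment, e.g. CANONICAL_BASE_URL=https://app.example.com
  CANONICAL_BASE_URL = os.environ.get("CANONICAL_BASE_URL", "https://app.example.com")

  def oauth_callback_url() -> str:
      # No request-derived input: a spoofed Host header cannot influence the
      # redirect target, regardless of what the upstream layers do.
      return urljoin(CANONICAL_BASE_URL, "/oauth/callback")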

Closing Thoughts

Building security tools has taught me a lot about alert fatigue, and I feel AI-based tools are at a fun inflection point. Are they going to be a security panacea or just another channel trying to manipulate us with "🚨🚨" emojis?

For the time being, I am prompting my little LLM mate to stick to surfacing code facts and suggesting what other context needs to be checked. I am shying away from piping its output straight to my peers in engineering. They have enough emoji-driven tasks in their day-to-day work.

Semgrep has customizable rules and policies to tune findings, with AI Memories that get better through iteration. Teams running the Semgrep MCP server within their IDE can interactively discover and remediate potential issues before committing code, without much time investment.

Learn more: https://semgrep.dev/solutions/secure-vibe-coding/


About


Semgrep enables teams to use industry-leading AI-assisted static application security testing (SAST), supply chain dependency scanning (SCA), and secrets detection. The Semgrep AppSec Platform is built for teams that struggle with noise, helping development teams apply secure coding practices.