Three key learnings for AppSec teams from the XZ backdoor

It’s been a week since the XZ backdoor dropped. As the dust settles, it’s a good time for security teams to review the lessons it holds.

The reality is that no amount of code analysis, secure builds, and/or security monitoring will stop a motivated insider like this. But that’s OK; never let a good crisis go to waste. As the Head of Security at an AppSec company, I see lessons in the XZ incident both for our internal security efforts at Semgrep and for the broader security industry.

The security industry, as a whole, tends to over-index on headline-grabbing issues. We wring our collective hands about why we didn’t catch this one. Vendors use tenuous claims and vacuous posts to get attention. While it’s important that we learn from novel, highly publicized incidents, it’s also very easy to get distracted by them. What never gets brought up in a media frenzy are the unsung engineers who continue to make steady, collective progress on the unglamorous things that actually move the needle when it comes to cybersecurity. From my view, the XZ incident underscores three things: AppSec teams are going to increasingly play a response role; supply chain security goes beyond bad dependencies; and we all need to do a better job securing builds. Damnit.


On March 29, 2024, details of a backdoored XZ package were publicized on the OSS Security mailing list. The backdoor subverts the SSH service on Linux systems that include the liblzma library provided by XZ. Several bleeding-edge Linux distributions were susceptible, including those from Debian, Fedora, and Alpine Linux. The backdoor targeted specific Linux operating systems. That said, the compromised XZ library was widely distributed on developer machines. Kudos go to Andres Freund, who originally found the issue, and Evan Boehs, who has put together an authoritative timeline.

At Semgrep, our mission is to profoundly improve software security, and the XZ vulnerability falls squarely in our sights. However, no code analysis product caught, nor could have caught, the vulnerability. ‘Jia Tan’ shipped key backdoor elements in XZ’s binary packages, outside of the project’s code repository. And they had plenty of time to circumvent any detection controls in either the XZ project or its downstream consumers, as illustrated by their efforts to avoid Google’s fuzzing.

So what lessons can we take away from it?

Lessons from XZ

What application security teams can do

The XZ incident underscores the big red circle around ‘software development’ in your risk matrix. It validates a bunch of the established concerns in this space. Vetting contributors. Having reproducible builds. Monitoring dependencies. All of these are areas where we’ve gained a lot of ground, but I suspect few of us feel like we’ve ‘solved’ them. That said, I’ve used XZ as a chance to reassess the priors that I had for Semgrep’s own internal Application Security program.

Three lessons I think the XZ incident holds for most AppSec teams:

  1. Herding dependencies also requires you to answer who, why, and how questions. XZ subverted sshd via Linux distros patching systemd. You’re not going to catch that via pure code analysis, but it underscores that dependencies are complex and they can wreck you. Improving supply chain security means more than knowing ‘what’ dependencies are in your code. A bunch of third-party dependency questions remain hard to answer, and they usually aren’t considered until Something Bad Happens. Surprise! You have a vulnerable dependency. Who owns it? Why are we using it? How do we get it quickly updated, tested, and deployed?

  2. Harden your builds, but also wrangle containers. The XZ project had a vanilla build pipeline and involved relatively few ‘privileged’ parties. Even the super sneaky autoconf files were slipped into binary tarballs, but ultimately were out there for anyone to see. Contrast XZ’s build pipeline with today’s flavor-of-the-month OSS product. It has hundreds of committers, its build process relies on a bunch of third-party JavaScript-based GitHub runners, and it’s shipping binary container images full of binaries, known and potentially unknown.

  3. AppSec goes on call (I’m sorry). Application Security teams will need to increasingly join their SWE and SE friends. Apologies for stating the obvious to anyone who’s ever worked on a PSIRT team. As software eats the world, AppSec teams will increasingly be on the hook for responding to security incidents. Welcome to tabletopping. Pop quiz – What products bring in this known bad dependency? The XYZ vulnerability is in the versions we use; should we care? Is this dependency deployed to production? Who’s responsible for fixing it?
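The pop quiz above is easier to pass if the answers live in data rather than in people’s heads. Here is a minimal Python sketch of that idea. The `Usage` record, the `INVENTORY` list, and every product and team name in it are hypothetical, made up for illustration; the only real-world fact baked in is that XZ versions 5.6.0 and 5.6.1 were the backdoored releases. Treat it as a tabletop aid under those assumptions, not as a description of how any particular tool stores this information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """One product's use of one dependency (hypothetical record shape)."""
    product: str         # what brings the dependency in
    package: str
    version: str
    owner: str           # who is responsible for fixing it
    in_production: bool  # is it actually deployed to production?


# Hypothetical inventory; in practice this would be fed from lockfiles,
# SBOMs, or deployment manifests.
INVENTORY = [
    Usage("billing-api", "xz-utils", "5.6.1", "payments-team", True),
    Usage("docs-site", "xz-utils", "5.4.6", "web-team", True),
    Usage("ml-sandbox", "xz-utils", "5.6.0", "research-team", False),
    Usage("billing-api", "openssl", "3.2.1", "payments-team", True),
]


def triage(inventory, package, bad_versions):
    """Return affected usages, production deployments first."""
    hits = [
        u for u in inventory
        if u.package == package and u.version in bad_versions
    ]
    return sorted(hits, key=lambda u: (not u.in_production, u.product))


affected = triage(INVENTORY, "xz-utils", {"5.6.0", "5.6.1"})
for u in affected:
    where = "production" if u.in_production else "non-production"
    print(f"{u.product}: {u.package} {u.version} ({where}), owner: {u.owner}")
```

The sort order is the design choice worth arguing about in a tabletop: this sketch assumes production exposure trumps everything else, which may not hold if the bad dependency lives mostly on developer machines, as XZ did.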

Hopes and prayers for the broader ecosystem

A lot of ink has already been spilled on what the XZ incident means for open source, software engineering, security vendors, nation state relations, etc. I don’t have much more to add, other than tapping the sign that says “securing software is important.” But if we are going to tap that sign, let’s be explicit about the things that go beyond a few AppSec teams doing good work.

Industry wide, if we want to improve supply chain security, we need to acknowledge:

  1. SBOMs are not enough. CISA wants your SBOMs but, honestly, let’s do more than that. Just listing dependencies is not sufficient. Who is allowed to bring in dependencies? What sources were used? Does the source match the distributed tarball? Can I validate what you gave me and independently verify it? That is what SLSA has been asking for. This is as good a time as any to justify the lift that secure builds will take.

  2. Open Source needs to be sustainable. You know these issues have gone mainstream when the XKCD cartoon on open source contributions appears in The Economist. Unfortunately, it’s also a measure of the importance of this unsolved problem. Semgrep publishes an open source static analysis tool and actively contributes to the various ecosystems we rely on. We think we are threading the needle of sustainability. If you’re a hobbyist, you can use our open source tool to secure your project. And if you’re an AppSec engineer or a developer in a large team, we strive to have products that you’ll pay for.
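To make the “does the source match the distributed tarball?” question concrete, here is a rough Python sketch that hashes every file in a release tarball and compares it against a source checkout. The function names (`tarball_hashes`, `tree_hashes`, `diff_release`) and the assumption that the tarball has a single top-level directory to strip are mine, not part of SLSA or any existing tool. It also only compares file contents; it says nothing about who built the tarball or how.

```python
import hashlib
import tarfile
from pathlib import Path, PurePosixPath


def tarball_hashes(tarball, strip_components=1):
    """Map each regular file in the tarball to its SHA-256 digest.

    Assumes the archive wraps everything in one top-level directory
    (e.g. pkg-1.0/), which is stripped so paths line up with a checkout.
    """
    hashes = {}
    with tarfile.open(tarball) as tar:
        for member in tar:
            if not member.isfile():
                continue
            parts = PurePosixPath(member.name).parts[strip_components:]
            if not parts:
                continue
            data = tar.extractfile(member).read()
            hashes["/".join(parts)] = hashlib.sha256(data).hexdigest()
    return hashes


def tree_hashes(root):
    """Map each file under a source checkout to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def diff_release(tarball, source_root):
    """Return (files only in the tarball, files whose contents differ)."""
    shipped = tarball_hashes(tarball)
    source = tree_hashes(source_root)
    extra = sorted(set(shipped) - set(source))
    changed = sorted(
        name for name in shipped.keys() & source.keys()
        if shipped[name] != source[name]
    )
    return extra, changed
```

Calling `diff_release("pkg-1.0.tar.gz", "path/to/checkout")` (both paths hypothetical) gives you a review list, not a verdict. Autotools-style release tarballs routinely ship generated files that never live in the repository, so some differences are expected. The point is that someone, or something, should be looking at that list.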

Lessons for Semgrep

I started out by stating we need to learn from novel incidents, but not get distracted by them, yet here we are 700 words later, all spent imploring AppSec and software engineering to Do Better. But the aspiring Bayesian in me is shouting – forget about XZ and well-funded sleeper agents. It’s the new hire tasked with implementing a webhook that will actually cause next week’s problem in your app. So, stick to grinding down that backlog of things that you already know about. Maybe bump up being prescriptive about dependencies and hardening your build pipeline.

XZ holds lessons for Semgrep’s own internal security practices. We were already on a path to hardening our build pipelines. XZ gives me very concrete motivation for our goal of hitting SLSA Level 3, which otherwise can be a fairly arcane standard that involves a lot of lift from our Engineering team. We also plan on simplifying our container-based release pipeline.

From a Semgrep product perspective, there are a few obvious places where code analysis and Semgrep might help with XZ-like backdoors. To quote Kurt Boberg, Staff Security Engineer, “RC4 is bad and if it’s not nefarious, someone should feel bad”. To that end, Kurt and Iago Abal, Senior Security Engineer, created a Semgrep rule to flag RC4 implementations. It’s experimental and currently too slow to enable by default. But it’s a great demonstration of Semgrep’s flexibility and something we’re actively working on.

rules:
- id: kb-xz-malicious-script
  patterns:
    - pattern: |
        $K = $S[($A+$B)%$M]
    - pattern-inside: |
        $I = ($I + 1)%$M
        ...
        $A = $S[$I]
        ...
        $J = ($J + $ANY)%$M
        ...
        $B = $S[$J]
        ...
  message: Semgrep found an RC4 primitive
  languages:
    - generic
  severity: WARNING
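For anyone who hasn’t stared at RC4 recently, here is the shape of code the rule is looking for. This is a minimal Python sketch of textbook RC4 keystream generation, with variable names picked to echo the rule’s metavariables (`i`, `j`, `a`, `b`, `k`, `s`, `m`). The function name and layout are my own choices for illustration. It is not the code from the XZ backdoor, and I’m not claiming the rule matches this exact file; it just shows the index-update, lookup, and combined-lookup sequence that makes RC4 recognizable.

```python
def rc4_keystream(key: bytes, n: int) -> bytes:
    """Generate n bytes of RC4 keystream (for illustration only)."""
    m = 256

    # Key-scheduling: permute s based on the key.
    s = list(range(m))
    j = 0
    for i in range(m):
        j = (j + s[i] + key[i % len(key)]) % m
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation: the part the rule's patterns describe.
    i = j = 0
    out = bytearray()
    for _ in range(n):
        i = (i + 1) % m
        a = s[i]
        j = (j + a) % m
        b = s[j]
        s[i], s[j] = b, a      # swap s[i] and s[j]
        k = s[(a + b) % m]     # keystream byte
        out.append(k)
    return bytes(out)


def rc4_xor(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing data with the keystream."""
    stream = rc4_keystream(key, len(data))
    return bytes(d ^ k for d, k in zip(data, stream))
```

Because the cipher is this small, it is easy to hand-roll and tuck inside a script, which is exactly why a structural pattern is more useful here than searching for the string “RC4”.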


Various mensches who deserve a call out

