Why SAST tools need to be customizable to be useful

Introduction

Scaling an effective application security program is difficult. An effective AppSec program requires code scanning tools that can surface issues early in the development lifecycle. It also requires developer buy-in, because these tools are embedded in the middle of the developer workflow, which means they can’t add friction or slow developers down.

Our experience tells us that customizability is the critical element that lets a SAST solution fit seamlessly into a developer's workflow and earn their buy-in. Once that buy-in and enthusiasm is achieved, something really special happens.

Most modern SAST tools only provide findings, with no controls over which ones are surfaced to developers (AppSec must manually validate each individual finding, or surface all of them). This forces an all-or-nothing approach to shifting left.

While tools like this might be fast enough to integrate into CI and development flows, they cannot earn developer buy-in, because they inundate developers with false positives and damage overall trust in security processes.

We believe customization in SAST tools is critical in two specific areas:

1.) The ability to customize and manage the behavior (policies) of security findings at different levels of granularity - for example, controlling which findings are automatically surfaced to developers and which are only reported to security teams (and being able to do this at the rule-level).

2.) The ability to see and customize the underlying rules used by the SAST engine to uncover findings - for example, adding an exception to a rule so its results are more accurate and generate fewer false positives.

Helpful Definitions

What is a rule? Rules are the pattern-matching and taint-tracking logic that SAST engines use when they scan code to flag vulnerabilities. Other SAST tools may refer to a rule as a query, detection, or detector.

What is a finding? Findings are security or performance issues detected by rules (they can also represent bad coding practices or anything else rules are written to detect).

What is a ruleset? Rulesets are groups of rules - they could be rules for a specific language, rules that cover a specific type of vulnerability, basic rules that scan for OWASP Top 10 vulnerabilities, etc.

Fix rate: Number of fixed findings / total number of findings

True positive fix rate: Number of fixed true positives / total number of true positives

Comment fix rate: The fix rate for findings that have been surfaced to developers via comments on pull requests. Think of this metric as the fix rate for issues you have decided to bring into the developer's view.
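
As a rough illustration, the three metrics above could be computed as follows. This is a minimal sketch: the Finding fields (fixed, true_positive, commented) are hypothetical names chosen for the example and are not part of any Semgrep API.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One finding with hypothetical triage fields (not a Semgrep data model)."""
    rule_id: str
    fixed: bool          # a developer resolved the finding
    true_positive: bool  # triage confirmed the finding is real
    commented: bool      # surfaced to developers via a PR/MR comment


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when there is nothing to count."""
    return numerator / denominator if denominator else 0.0


def fix_rate(findings) -> float:
    """Number of fixed findings / total number of findings."""
    return ratio(sum(f.fixed for f in findings), len(findings))


def true_positive_fix_rate(findings) -> float:
    """Number of fixed true positives / total number of true positives."""
    return fix_rate([f for f in findings if f.true_positive])


def comment_fix_rate(findings) -> float:
    """Fix rate restricted to findings surfaced as PR/MR comments."""
    return fix_rate([f for f in findings if f.commented])


findings = [
    Finding("ssrf", fixed=True, true_positive=True, commented=True),
    Finding("ssrf", fixed=False, true_positive=False, commented=True),
    Finding("sqli", fixed=True, true_positive=True, commented=False),
    Finding("sqli", fixed=False, true_positive=True, commented=False),
]

print(fix_rate(findings))                # 2 of 4 findings fixed
print(true_positive_fix_rate(findings))  # 2 of 3 true positives fixed
print(comment_fix_rate(findings))        # 1 of 2 commented findings fixed
```

Keeping the three rates separate matters because a false positive that nobody fixes lowers the plain fix rate but leaves the true positive fix rate untouched.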

Customizing policies

Semgrep has developed 3 distinct behaviors for rules and their findings:

Monitor: Findings for rules set to Monitor are not visible to developers and are restricted to users with access to the Semgrep app.

Comment: Findings for rules set to Comment will notify a developer via a comment in their pull/merge request. Comments include the full context developers need to fix the issue with minimal cognitive load and context-switching (data-flow analysis, vulnerability information, AI-generated auto-fixes, etc.).

Block: Findings from rules set to Block follow the same behavior as Comment, but also block the PR/MR until the findings are addressed.

We have found that the quickest way to improve fix rates and increase developer trust/engagement in security processes is to customize and set rule behaviors based on the accuracy and actionability of their findings. A great example can be seen with our customer, Tide, who achieved a remarkable 100% fix rate using Semgrep custom rules.

Core workflow:

Let’s say you’re an AppSec engineer at an organization with 100 engineers. You’re using an out-of-the-box ruleset to catch OWASP Top 10 vulnerabilities. You would start by running all rules in Monitor mode to observe their behavior. Once you have high confidence that a rule is generating true positives, you can set the rule to Comment mode so it notifies developers of findings. Assuming your developers don’t have negative feedback, you can then decide if any of those rules warrant breaking a build by setting them to Block. This is what the workflow would look like in practice:

[Image: Code policy workflow]

Why is this workflow critical?

We believe AppSec engineers must roll out tools that encourage findings to be fixed. Developers need to trust the accuracy of findings, and that trust depends on findings that come with enough context to show they are real. Any interruption to the developer workflow that is not helpful or clear (even an accurate finding without clear remediation steps) slowly chips away at developer trust in security processes. If you’re not confident about a rule's accuracy, you should strongly consider triaging and reviewing the rule's findings manually. If you can identify an issue with the rule, or with how it parses your codebase, that leads to false positives, you can then customize the rule to address the issue (more on that in the following section).

As you become more confident in the accuracy of findings at the rule-level, you can modify the rule behavior accordingly and eliminate the need for manual triage at the finding-level.
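
The Monitor, Comment, Block progression described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the thresholds (min_triaged, min_precision) and the input names are hypothetical, and in practice the mode is a setting an AppSec engineer changes per rule rather than something computed by code like this.

```python
from enum import Enum


class Mode(Enum):
    MONITOR = "monitor"  # findings visible to the security team only
    COMMENT = "comment"  # findings posted as PR/MR comments
    BLOCK = "block"      # findings also block the PR/MR


def next_mode(current, triaged, true_positives, negative_feedback,
              warrants_block, min_triaged=20, min_precision=0.9):
    """Suggest the next mode for one rule.

    triaged / true_positives: counts from manual triage of the rule's findings.
    negative_feedback: developers have complained about the rule's comments.
    warrants_block: the team decided this issue class should break a build.
    min_triaged / min_precision: hypothetical thresholds; tune per organization.
    """
    if current is Mode.MONITOR:
        enough_data = triaged >= min_triaged
        accurate = enough_data and true_positives / triaged >= min_precision
        return Mode.COMMENT if accurate else Mode.MONITOR
    if current is Mode.COMMENT:
        if not negative_feedback and warrants_block:
            return Mode.BLOCK
        return Mode.COMMENT
    return current  # already blocking; nothing further to promote to


# A rule with 28 true positives out of 30 triaged findings is ready for Comment.
print(next_mode(Mode.MONITOR, 30, 28, False, False))
# A rule with too little triage data stays in Monitor.
print(next_mode(Mode.MONITOR, 5, 5, False, False))
# A well-received rule for a severe issue class can move to Block.
print(next_mode(Mode.COMMENT, 30, 28, False, True))
```

The sketch only ever promotes one step at a time, which mirrors the advice to observe a rule's behavior in each mode before tightening it.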

Customizing rules

SAST tools that are customizable and transparent make it easy to optimize out-of-the-box performance - essentially giving organizations the power of a highly-optimized, custom SAST solution without the associated costs and resources required.

Semgrep rules are easy to understand and look like source code, so developers and engineers can easily tweak rule behaviors without needing to learn domain-specific languages or ASTs.

[Image: Example rule customization (added sanitizer). Customizing an out-of-the-box rule so it accounts for an internal safeguard.]

To see the benefit of customizing existing rules, let’s take an example of this rule that finds SSRF vulnerabilities (tainted data flowing into a piece of code that then makes an HTTP request to another server).

If an organization used a home-built library to check the target URL beforehand, this would cause any SAST tool to generate a ton of noise, since the tool would not be able to identify this internal safeguard. Rule customization makes it easy to add a sanitizer to this rule that verifies whether the call to the custom check happened before the sink, greatly reducing false positives.

Lines 24-28 in the image above define the custom sanitizer, which causes the rule to look for the internal safeguard mentioned above. It’s based on the internal isValidRecipient check.

Users of a SAST tool without rule-level visibility and customization must simply accept the fact that their procured SAST tool will generate noise in circumstances like this - they have no way of improving the accuracy of the tool, even if the fix is relatively simple and straightforward!
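
To make the effect of a sanitizer concrete, here is a toy taint tracker written in Python. It is a deliberately simplified sketch and not how the Semgrep engine works: a function body is modeled as a flat list of steps, and the step names, the http_get sink, and the scan function are all hypothetical. Only isValidRecipient comes from the example above.

```python
def scan(steps, sanitizers=frozenset()):
    """Return the sink steps reached by data that is still tainted."""
    tainted = set()
    findings = []
    for index, (op, *args) in enumerate(steps):
        if op == "source":    # ("source", var): var holds user-controlled input
            tainted.add(args[0])
        elif op == "assign":  # ("assign", dst, src): taint follows the data
            dst, src = args
            if src in tainted:
                tainted.add(dst)
            else:
                tainted.discard(dst)
        elif op == "call":    # ("call", func, var): known sanitizers clear taint
            func, var = args
            if func in sanitizers:
                tainted.discard(var)
        elif op == "sink":    # ("sink", func, var): e.g. an outbound HTTP request
            func, var = args
            if var in tainted:
                findings.append((index, func, var))
    return findings


steps = [
    ("source", "url"),                    # url comes from the request
    ("call", "isValidRecipient", "url"),  # internal safeguard
    ("assign", "target", "url"),
    ("sink", "http_get", "target"),       # hypothetical HTTP client call
]

# Out of the box, the rule does not know about the internal safeguard.
print(scan(steps))
# With the safeguard registered as a sanitizer, the same code is clean.
print(scan(steps, sanitizers={"isValidRecipient"}))
```

The first call reports the sink at step 3, and the second reports nothing. Order matters in this model: if the check ran after the sink, the finding would still be reported, which is the behavior you want from a sanitizer.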

The magic really starts when developers begin to customize the conditions of a rule to match the needs of their specific project/repository.

Fixing the right issues

Once rule behavior and the rule itself are customized, we recommend focusing on the integrity of the results shown and the fixes that need to be implemented. One of our favorite ways to think about fixes is stated well in this whitepaper from Google's security team: “A [suggested] Fix has two purposes; one is to make it easy for developers to apply the fix, but the other is to provide an explanation of analysis results.” We recommend you think of fixes in the same manner.

False positives are awful. Not only do they get in the way of teams fixing vulnerabilities, they slowly chip away at the trust developers have in both their SAST solution and their AppSec team. This is why we recommend AppSec teams work towards having fix rate be their north star metric.

For some organizations, a ‘good’ fix rate (not to be confused with a true positive fix rate) for a rule may be >=70% - meaning in most situations you’re pointing out an issue that is relevant and addressable. If you notice that a rule's performance starts to dip in terms of fix rate, don’t hesitate to disable the rule and manually analyze the false positive reports.
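
One way to spot rules whose fix rate has dipped is to group findings by rule and compare each rule against a threshold. The sketch below assumes findings are available as (rule_id, fixed) pairs, which is a hypothetical export format, and it uses the 70% figure above as a default that you would adjust for your own organization.

```python
from collections import defaultdict


def rules_needing_review(findings, threshold=0.70):
    """Return rule ids whose fix rate is below the threshold, sorted by name.

    findings: iterable of (rule_id, fixed) pairs (hypothetical export format).
    """
    counts = defaultdict(lambda: [0, 0])  # rule_id -> [fixed, total]
    for rule_id, fixed in findings:
        counts[rule_id][0] += int(fixed)
        counts[rule_id][1] += 1
    return sorted(
        rule_id
        for rule_id, (fixed, total) in counts.items()
        if fixed / total < threshold
    )


findings = (
    [("ssrf", True)] * 8 + [("ssrf", False)] * 2    # 80% fixed
    + [("sqli", True)] * 3 + [("sqli", False)] * 7  # 30% fixed
)

# Only the sqli rule falls below the 70% threshold.
print(rules_needing_review(findings))
```

A rule that shows up here is a candidate for being moved out of the developer's view while its false positive reports are analyzed.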

Our final piece of advice: regardless of how you’re scanning for vulnerabilities, make sure that your teams can understand why they’re being presented with a finding, and how they should go about implementing a fix. You don't need to provide a complete auto-fix to be helpful!

While Semgrep often does provide an auto-fix to developers alongside a finding, Semgrep also provides context and explainability that helps reduce the cognitive load on developers and security engineers (eliminating the need for them to search Stack Overflow, prompt ChatGPT, etc).

Conclusion

In conclusion, customizability and transparency are critical for Application Security (AppSec) teams of all maturity levels.

For less sophisticated/newer AppSec teams, customizability and visibility make it possible to gradually shift left and increase developer involvement (as confidence in SAST tooling and processes improves).

For more sophisticated AppSec teams, customizability should be considered a minimum requirement since they require a SAST tool that is highly optimized and can be tailored specifically to their codebase.

To try Semgrep Code in your environment, book a demo with one of our product advisors. To learn more about our Supply Chain (SCA) product, and why it's also trusted and engaging for developers, read this benchmark by Doyensec, which compares our tool against competitors like Snyk and Dependabot.
