This post describes the thought process behind writing a new Semgrep rule and gives a methodology for doing so effectively.
For walkthrough examples of writing custom Semgrep rules, see the interactive, example-based tutorial here: https://semgrep.dev/learn.
A rule writing methodology
There’s no one right way or process for writing a Semgrep rule, so do what works for you.
The following outlines a process we’ve found effective at r2c. At a high level, it’s:
Brainstorm what you want to find
Determine what that concretely looks like in code
Create a sample source file with example code snippets you’d like to find, and write an initial Semgrep rule that matches it
Test your custom Semgrep rule and refine it: first run it on one real code repository, then across many repositories
Once you’re happy with your rule’s performance, integrate it into your continuous integration (CI) systems so that it runs on every pull request to ensure your code maintains a high quality bar.
1. Brainstorm what you want to find
Determine what to look for. It could be something very concrete and demand-driven, like:
a. Finding other instances of known vulnerable code patterns (“variant analysis”), e.g.,
A pen test report / bug bounty program reported an issue <like this>. I want to find similar code across all my repos, because they might be vulnerable, too.
b. Finding business logic bugs
In my company, I know that our internal API method foo() must always be called a certain way. For example, one of the arguments must or must not be a hardcoded string, a certain flag or option should be set, etc. I want to find all the code locations where this implicit calling pattern is not being followed.
In my applications, I know that one particular API call, bar(), must always be called before another, baz(), or else it’s a bug. I want to find all the places this calling convention isn't followed.
Or, your search could start a bit more abstract and exploratory, where you don't know exactly what you're looking for, like:
c. Auditing dangerous function use
I know eval() is a potentially dangerous function; I want to see where it’s used and how.
d. Auditing API or technology uses
My company has a history of security issues with JWTs (or <technology>), so let’s audit all of the code that touches JWT logic.
e. Reviewing authentication or authorization logic
My company uses the acme_corp_auth library for all authentication or authorization logic. I want to review everywhere it's called to see whether the way the library is used makes sense, ensure that trust boundaries are enforced, confirm that enforcement is done consistently, and locate any places that are missing authn/authz.
These are a few examples, but in summary, the first step in writing a new rule is determining:
What is an interesting aspect of my code that I want to find?
2. Determine what that concretely looks like in code
After you have a rough idea of what you want to find, the next step is to identify what that concretely looks like in code. The more specific, the better.
What does “good” or “safe” code look like?
For example, after reading some internal code and API docs, you find that for the Python subprocess module, call() and other methods are more dangerous when passed the argument shell=True, so you decide:
I want to find all calls to subprocess.call() in which one of the keyword arguments is shell=True.
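As a minimal sketch of what this looks like concretely, the following Python file pairs one call that should be found with one that should be left alone. The function names greet_unsafe and greet_safe are hypothetical and exist only for illustration.

```python
import subprocess


def greet_unsafe(name: str) -> int:
    # Should be found: shell=True is one of the keyword arguments,
    # so the whole command string is interpreted by the shell.
    return subprocess.call("echo hello " + name, shell=True, stdout=subprocess.DEVNULL)


def greet_safe(name: str) -> int:
    # Should not be found: arguments are passed as a list and no shell is involved.
    return subprocess.call(["echo", "hello", name], stdout=subprocess.DEVNULL)
```

Note that shell=True appears next to another keyword argument here, which is a reminder that the rule should not depend on where the argument sits in the call.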
Or perhaps you’re auditing Java Spring applications and want to find all routes that don’t perform authorization checks. You review some example routes and see code like this:
@Controller
@RequestMapping("/api/")
public class AcmeController {
    @RequestMapping(method = RequestMethod.POST)
    @Authorize(Permissions.ADMIN)
    @ResponseBody
    public ResponseEntity<Map<String, Object>> createProfile() {
        return new ResponseEntity<>(result, HttpStatus.OK);
    }

    @RequestMapping(method = RequestMethod.GET)
    @ResponseBody
    public ResponseEntity<Map<String, Object>> showResults() {
        return new ResponseEntity<>(result, HttpStatus.OK);
    }
}
So you determine:
I want to find all routes, which are methods with the @RequestMapping annotation, that do not also have an @Authorize annotation.
The key at this stage is to go from an idea to one or more concrete example code snippets that demonstrate secure and/or insecure examples of what you’re looking for.
3. Create a test file and write an initial rule
Now the fun part begins!
Create a YAML file in your current working directory, for example, the project root of the repo you intend to scan. You can name the YAML file whatever you want, but Semgrep will look for rules defined in a file named .semgrep.yml by default.
rules:
  - id: my-pattern-name
    pattern: |
      TODO
    message: "Some message to display to the user"
    languages: [python]
    severity: ERROR
Create an example test file containing snippets of code that should and should not match.
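As an illustration, here is a sketch of what example_file.py might contain, assuming the rule is meant to flag calls to eval() (one of the brainstorming ideas above) with a pattern along the lines of eval(...). The function names are hypothetical. The "ruleid:" and "ok:" comments follow Semgrep's convention for annotating test files; check the rule-testing docs for your Semgrep version before relying on them.

```python
import ast


def parse_unsafe(expression: str):
    # ruleid: my-pattern-name
    return eval(expression)


def parse_safe(expression: str):
    # ok: my-pattern-name
    return ast.literal_eval(expression)
```

Keeping at least one snippet that should not match is what lets you notice when a later change to the pattern makes the rule too broad.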
Fill out the TODO in the above YAML file with a Semgrep pattern, using pattern syntax and rule syntax as needed.
For additional help, see these docs for a walkthrough of rule writing examples, or the interactive, example-based tutorial here: https://semgrep.dev/learn.
Check that your custom Semgrep rule matches the example file by running:
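# Uses the rules defined in .semgrep.yml
$ semgrep example_file.py
# Or point Semgrep at a specific rule file
$ semgrep --config my-rule-file.yml example_file.py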
Break it down
If you’re writing a complex, multi-part pattern, build it in pieces and test each step along the way, as you would with a complex chunk of code, rather than writing the whole pattern and then testing it.
pattern and pattern-not: If there are a number of cases that you want to filter out, first write an initial pattern (using pattern) that finds the general case, and ensure it works as expected.
Then add a series of pattern-not clauses to filter out the cases that you don't want to match, since those would cause your custom Semgrep rule to return false positives.
pattern-either: If there are code snippets you’d like to find that can’t all be matched by the same pattern, create a test file with each of these code snippets, and then write a pattern clause for each.
Then combine these clauses under a pattern-either clause, which will cause Semgrep to match code for which any subclause matches. Your Semgrep rule will look something like this:
rules:
  - id: my-pattern-name
    patterns:
      - pattern-either:
          - pattern: |
              <first pattern here>
          - pattern: |
              <second pattern here>
          - pattern: |
              <and however many more you want to match...>
    message: "Some message to display to the user"
    languages: [java]
    severity: ERROR
See here for an example of using pattern-either.
4. Iterate and refine
After you vet your Semgrep rule against some test examples, it’s time to see how it performs on real code.
4a. Test on one repo
If you don’t have a repo ready, clone one locally, then scan it:
$ semgrep --config path/to/my-rules.yml path/to/repo
Look through the results. Are you finding what you intended to find?
False positives
In simple terms, false positives are results that your tool gives you that are not code instances you care about.
Review Semgrep’s output: are the code snippets your pattern matched the “interesting” ones you intended to match?
If not, what makes them not interesting?
If there are multiple results that you’d like to filter, is there a common reason that makes them something you’d like to filter?
e.g., “Whenever foo() is called before check_auth(), the result is generally something I don’t care about, because of {business logic reasons},” or, “If the second argument is a hardcoded string, the method call is safe.”
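To make the second case concrete, the sketch below shows snippets you might add to your test file before writing the filter. Everything in it is hypothetical: foo() stands in for your internal API, and the annotations assume the rule is named my-pattern-name. A pattern-not clause such as foo($X, "...") is one way the filter might be expressed, but treat that as an assumption to verify against your own code.

```python
def foo(record: dict, field: str) -> str:
    # Hypothetical internal API: looks up `field` in `record`.
    return str(record.get(field, ""))


def lookup_fixed(record: dict) -> str:
    # Second argument is a hardcoded string: considered safe, should be filtered out.
    # ok: my-pattern-name
    return foo(record, "email")


def lookup_dynamic(record: dict, user_field: str) -> str:
    # Second argument comes from the caller: should still be reported.
    # ruleid: my-pattern-name
    return foo(record, user_field)
```

Writing both cases down first makes it easy to confirm that the new filter removes the uninteresting result without hiding the one you care about.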
False negatives
False negatives are code that you intend to find but that your pattern misses.
In the general case, regardless of the tool you use, it’s difficult to impossible to eliminate all false negatives. Provably finding every bug can be done in some specialized systems with a massive amount of work, using techniques like formal methods.
But in most contexts, in terms of “what can I practically do right now without person-years of effort,” you can use the trusty ripgrep to find code your Semgrep rule may have missed.
If your pattern involves a specific method call or annotation, search your code for all references to the string and manually audit them to determine if they are code locations your Semgrep rule should have matched.
# -C 5 returns five lines of context above
# and below each match
$ rg -C 5 "exec\(" .
$ rg -C 5 "@RequestMapping" .
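If the list of text hits is long, a small script can narrow down what needs manual review. The sketch below is one possible approach, not part of Semgrep: it assumes you have collected (path, line) pairs from a text search, for example with ripgrep's --line-number flag, and saved Semgrep's findings with --json. The helper name candidate_misses is hypothetical.

```python
import json


def candidate_misses(grep_hits, semgrep_json):
    """Return the (path, line) text hits that no Semgrep result covers."""
    covered = {}
    for result in json.loads(semgrep_json).get("results", []):
        # Each result records the file and the line span of the match.
        span = (result["start"]["line"], result["end"]["line"])
        covered.setdefault(result["path"], []).append(span)

    misses = []
    for path, line in grep_hits:
        spans = covered.get(path, [])
        if not any(start <= line <= end for start, end in spans):
            misses.append((path, line))
    return misses
```

The remaining locations are only candidates: some will be comments or unrelated strings. Paths must be written the same way in both inputs (for example, both relative to the repo root) for the comparison to work.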
4b. Test on multiple repos
In the previous step, you tested your Semgrep rule on one real code base and iterated on it so that it catches the code patterns you intended (decreasing false negatives) and matches less of the code you don’t care about (decreasing false positives).
Now it’s time to vet your rule on a bigger corpus.
Gather a number of additional repos that contain the programming language and functionality that are relevant to the rule you’ve been writing.
Note that if you put the target repos into the same directory, you can scan all of them at once by running Semgrep in the parent directory:
# Scans all repos inside the current working directory
$ semgrep --config path/to/my-rules.yml .
As before, review the results and iterate on the rule to make it more precise.
5. Add to CI
Finding individual instances of bugs or antipatterns is nice, but ultimately what’s more impactful is continuously scanning your code for issues, and either alerting on or blocking bad code as soon as it’s introduced.
Staying on top of this manually is a lot of work, so it’s easier to have this done automatically by including Semgrep in your continuous integration (CI) and letting your existing infrastructure handle scanning for you.
See the Semgrep CI docs for details on how to integrate Semgrep into platforms including AppVeyor, CircleCI, TravisCI, GitHub Actions, and GitLab.
You may also want to spend some time considering how you plan to handle Semgrep results:
Blocking the build: In some cases, you may want to fail the build if certain rules trigger, as you have high confidence that they've identified an impactful security issue that needs to be fixed.
Alerting: In other cases, Semgrep may identify code patterns that are security-relevant, but are not necessarily vulnerabilities. For these, it likely doesn't make sense to fail the build.
Instead, the security team can be notified, for example, in an #appsec Slack channel, that they may want to do a code review or reach out to the corresponding developer for more context.
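One way to implement this split is a small post-processing step over Semgrep's JSON output. The following is a sketch under stated assumptions rather than a built-in Semgrep feature: the rule IDs in BLOCKING_RULES and the helper names triage and exit_code are hypothetical, and depending on how rules are loaded, the reported check_id may carry a path-derived prefix that you would need to account for.

```python
import json

# Hypothetical rule IDs that are trusted enough to fail the build.
BLOCKING_RULES = {"subprocess-shell-true"}


def triage(semgrep_json, blocking_rules=BLOCKING_RULES):
    """Split Semgrep findings into build-blocking and alert-only lists."""
    blocking, alerts = [], []
    for result in json.loads(semgrep_json).get("results", []):
        finding = f'{result["check_id"]} at {result["path"]}:{result["start"]["line"]}'
        if result["check_id"] in blocking_rules:
            blocking.append(finding)
        else:
            alerts.append(finding)
    return blocking, alerts


def exit_code(blocking):
    # A non-zero exit status is how most CI systems detect a failed step.
    return 1 if blocking else 0
```

A CI step could exit with exit_code(blocking) and forward the alerts list to the security team's channel of choice.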
5a. Make data-driven improvements to rules based on user feedback
There will likely be edge cases that occur in practice and trip up your rule: either vulnerable code it misses, or safe code it incorrectly identifies.
The key is to build and maintain close relationships with engineering teams so that they feel comfortable giving you feedback when your rules aren’t performing as well as intended, so that the rules can be improved.
If possible, collect metrics around your continuous code scanning, like:
For each rule, how often does it fire? (in total and per repo)
For each rule, when it fires, how often do developers perceive it to be a real issue vs a false positive?
How often do developers fix the underlying code?
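As a rough sketch of how these metrics could be computed, assume each triaged finding is stored as a record with the hypothetical keys rule, repo, verdict ("true_positive" or "false_positive"), and fixed. How you collect those verdicts depends entirely on your own tooling.

```python
from collections import Counter


def rule_metrics(findings):
    """Summarize triaged findings per rule and per (rule, repo) pair."""
    fired = Counter(f["rule"] for f in findings)
    fired_per_repo = Counter((f["rule"], f["repo"]) for f in findings)
    false_pos = Counter(f["rule"] for f in findings if f["verdict"] == "false_positive")
    fixed = Counter(f["rule"] for f in findings if f["fixed"])

    summary = {
        rule: {
            "fired": count,
            "false_positive_rate": false_pos[rule] / count,
            "fix_rate": fixed[rule] / count,
        }
        for rule, count in fired.items()
    }
    return summary, dict(fired_per_repo)
```

A rule with a high false positive rate and a low fix rate is a natural candidate for another round of refinement.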
Making it easy for developers to provide feedback on the signal quality of a rule is quite valuable for building a continuous scanning system that both provides real security value and is appreciated by engineers. See the Tricorder: Building a Program Analysis Ecosystem whitepaper by Google for more details on creating analysis feedback loops.