Why write a fix?
Developers have a lot on their plate, and security tools finding issues means even more work for them. Traditional security tools scan (partially) completed code for vulnerabilities. They are slow and provide feedback to the developer after the fact. These tools are considered one of the biggest inhibitors of the developer’s productivity.
Semgrep uses fast analyses that can be easily customized to your organization and project. As a result, feedback arrives much sooner and is more relevant, leading to fewer Effective False Positives. A tool like this is more usable and better integrated into developers’ workflows. To improve the developer experience even more, the tool should not just slap developers on the wrist by pointing out their mistakes; it should actually help them remediate these problems. A fix that automatically resolves the problem is the most effective and convenient way to do this.
Applying a fix
Semgrep rules can contain a number of optional fields, one of which is fix. It provides rule-writers with a simple search-and-replace autofix functionality.
For rules with a fix field, this fix is shown as a preview when running Semgrep in the command line interface. If you run this example from the Semgrep Playground with the command
semgrep --config use-sys-exit.yaml use-sys-exit.py
the output will look like this:
use-sys-exit
Use "sys.exit" over the python shell "exit" built-in. "exit" is a helper for the interactive
shell and is not available on all Python implementations.
▶▶┆ Autofix ▶ sys.exit(3)
9┆ exit(3)
⋮┆----------------------------------------
▶▶┆ Autofix ▶ sys.exit(4)
14┆ exit(4)
To actually apply this fix and replace the marked code with the code from the fix field, add the flag --autofix to your command. Semgrep can also run in CI, where the autofix behavior can be used to create pull requests instead.
Writing your own fix
Metavariables
Writing a fix can be very simple, especially since metavariables allow you to reuse parts of the existing code. In the example above, the rule matches any call of the exit function with the pattern:
pattern: exit($X)
The metavariable $X captures the exit code that is being used. This is very handy, as metavariables can be reused in the fix field! If we want to replace every call of the exit function with a call to sys.exit that takes the same arguments, the fix looks like this:
fix: sys.exit($X)
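Put together, a complete rule file might look like the sketch below. The id and message are taken from the CLI output above; the languages and severity values are assumptions, since the original rule file is not reproduced here.

```yaml
rules:
  - id: use-sys-exit
    # languages and severity are assumed values for this sketch
    languages: [python]
    severity: WARNING
    message: >-
      Use "sys.exit" over the python shell "exit" built-in. "exit" is a helper
      for the interactive shell and is not available on all Python implementations.
    # $X binds to whatever argument is passed to exit(...)
    pattern: exit($X)
    # the text bound to $X is substituted into the replacement
    fix: sys.exit($X)
```

Note that the fix only replaces the matched call. It does not add an import sys statement, so the target file still needs that import for the rewritten code to run.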
Ellipsis Metavariables
Things can get a little more complicated when the pattern matches an unknown number of things. Take the following example, where the goal is to always call make_transaction with the secure option set to True.
# ruleid: transaction
make_transaction(transaction = "transaction", secure=False)
# ok: transaction
make_transaction(transaction = "transaction", secure=True)
# ok: transaction
make_transaction(transaction = "transaction", commit=False, secure=True)
Since we don’t know the number of arguments in the call, we need an ellipsis (...) to match this code, resulting in the following rule:
pattern: make_transaction(..., secure=False)
It is difficult to write a fix for this rule, because we don’t know what is caught by the ellipses. Luckily we have a feature called Ellipsis Metavariables that allows us to name and reuse these ellipses, similar to a regular metavariable. We can then rewrite the rule and add a fix as follows.
pattern: make_transaction($...ARGS, secure=False)
fix: make_transaction($...ARGS, secure=True)
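As a minimal sketch, these two lines could sit in a complete rule file like the one below. The id comes from the ruleid annotations in the test code above; the languages, severity, and message values are assumptions made for illustration.

```yaml
rules:
  - id: transaction
    # languages, severity, and message are assumed values for this sketch
    languages: [python]
    severity: ERROR
    message: make_transaction is called with secure=False; use secure=True.
    # $...ARGS names whatever arguments precede secure=False
    pattern: make_transaction($...ARGS, secure=False)
    # the same arguments are reused, only the secure option changes
    fix: make_transaction($...ARGS, secure=True)
```

With this rule, the call marked ruleid in the test code should be rewritten so that secure=False becomes secure=True, while its other arguments are carried over through $...ARGS.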
Unfortunately, Ellipsis Metavariables are not supported for every language. And if they are supported, they do not cover every construct in that programming language either. Check the docs to see if they cover your required patterns. If they don’t, consider filing an issue on GitHub, so we can add support for your use case.
Cleverly using pattern and pattern-inside
By using Ellipsis Metavariables we can rewrite unknown pieces of code. Unfortunately, the Ellipsis Metavariables are not supported for all languages.
To get around this, we can combine pattern with pattern-inside. We will use pattern-inside to match the actual insecure code construct and then use pattern to match the part in that construct that we actually want to rewrite. In our example, that means matching the False and rewriting it to True. The resulting rule and fix look like this:
patterns:
  - pattern-inside: |
      make_transaction(..., secure=False)
  - pattern: |
      False
fix: |
  True
Focus-metavariable
This requires some forward thinking about which code construct is insecure and which part of the code we actually want to rewrite. To make this easier, the focus-metavariable field was added. It allows us to use pattern and pattern-inside however we want, and then put the focus of the fix on a metavariable. Rewriting the rule and fix for our running example results in:
patterns:
  - pattern: |
      make_transaction($...BEFORE, secure=$VALUE)
  - metavariable-pattern:
      metavariable: $VALUE
      pattern: |
        False
  - focus-metavariable: $VALUE
fix: |
  True
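For reference, here is a sketch of how the patterns and fix above might be placed in a complete rule file. The patterns and fix keys sit side by side under the rule entry. The id is taken from the test annotations earlier; languages, severity, and message are assumed values.

```yaml
rules:
  - id: transaction
    # languages, severity, and message are assumed values for this sketch
    languages: [python]
    severity: ERROR
    message: make_transaction is called with secure=False; use secure=True.
    patterns:
      # match the whole call and name the value passed to secure
      - pattern: |
          make_transaction($...BEFORE, secure=$VALUE)
      # only keep matches where that value is False
      - metavariable-pattern:
          metavariable: $VALUE
          pattern: |
            False
      # narrow the match, and therefore the fix, to $VALUE
      - focus-metavariable: $VALUE
    fix: |
      True
```

The block scalars (|) are kept from the snippet above. Written bare, False and True would typically be parsed as YAML booleans rather than as pattern text.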
Conclusion
Writing fixes can really improve the developer experience of a security tool. With Semgrep we’ve got several tricks up our sleeve for writing these fixes: metavariables and ellipsis metavariables, a clever combination of pattern and pattern-inside, and of course the focus-metavariable option.