May 2022
Semgrep App
Additions
- Team and Enterprise tier users can now integrate Semgrep into their GitHub Enterprise (GHE) and GitLab Self-Managed (GLSM) repositories. See Integrating Semgrep into source code management (SCM) tools.
- You can now scan locally through Semgrep CLI and then upload findings to Semgrep App.
- Semgrep App now has a project setup page for integrating Semgrep with Jenkins. To create a new project with Jenkins, log in to Semgrep App and click Projects > Scan new project > Run scan in CI > Jenkins.
Changes
- The Playground UI is now similar to Semgrep App's Editor UI for a consistent experience.
Semgrep CLI and Semgrep in CI
These release notes cover all versions from 0.91.0 through 0.94.0.
Changes
- taint-mode: Suppose that the `taint(x)` function taints its `x` argument by side effect. Previously, Semgrep had to rely on a workaround that declared that any occurrence of `x` inside `taint(x); ...` was a taint source. If `x` was overwritten with safe data, this was not recognized by the taint engine. Also, if `taint(x)` occurred inside, for example, an `if` block, any occurrence of `x` outside that block was not considered tainted. Now, if you specify that the code variable itself is a taint source (using `focus-metavariable`), the taint engine handles this as expected and does not suffer from these limitations. We believe that this change should not break existing taint rules, but please report any regressions that you may find.
- taint-mode: Suppose that the `sanitize(x)` function sanitizes its `x` argument by side effect. Previously, Semgrep had to rely on a workaround that declared that any occurrence of `x` inside `sanitize(x); ...` was sanitized. If `x` was later overwritten with tainted data, the taint engine still considered `x` safe. Now, if you specify that the code variable itself is sanitized (using `focus-metavariable`), the taint engine handles this as expected and does not suffer from this limitation. We believe that this change should not break existing taint rules, but please report any regressions that you may find.
- The dot access ellipsis now matches field accesses in addition to method calls. See the following example in Semgrep Playground.
- In this version, we have made several performance improvements to the code that surrounds our source parsing and matching core. This includes file targeting, rule fetching, and similar parts of the codebase. When we tested `semgrep scan --config auto` on the Semgrep repository itself, the scan time improved from 50-54 seconds to 28-30 seconds.
  - As part of these changes, we removed `:include .gitignore` and `.git/` from the default `.semgrepignore` patterns. This should not cause any difference in which files are targeted, as other parts of Semgrep already ignore these files.
  - A full breakdown of our performance updates, including some upcoming ones, can be found in this GitHub comment.
- If a metrics event request times out, Semgrep no longer retries the request. This avoids Semgrep waiting 10-20 seconds before exiting if these requests are slow.
- The metrics collection timeout has been raised from 2 seconds to 3 seconds.
- Files where only part of the code was skipped due to a parse failure are now listed as `partially scanned` in the end-of-scan skip report.
- The `isAuthenticated` field was added to the metrics sent to the Semgrep backend. This is a boolean flag that is true if you are logged in.
- Semgrep in CI now prints out all findings instead of hiding nonblocking findings. (#5116)
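The side-effect taint changes in this list can be illustrated with a rule. The following is a minimal sketch rather than an official rule: the rule `id`, the message, the choice of Python as the target language, and the `sink` function are assumptions made for illustration, while `taint` and `sanitize` simply reuse the function names from the notes.

```yaml
rules:
  - id: example-taint-by-side-effect   # hypothetical rule id
    mode: taint
    languages: [python]                # assumption: Python chosen only for illustration
    severity: WARNING
    message: Data tainted by taint(x) reaches sink()
    pattern-sources:
      # taint(x) taints its argument by side effect, so focus on $X:
      # the variable itself, not the call, becomes the taint source.
      - patterns:
          - pattern: taint($X)
          - focus-metavariable: $X
    pattern-sanitizers:
      # sanitize(x) cleans its argument by side effect, expressed the same way.
      - patterns:
          - pattern: sanitize($X)
          - focus-metavariable: $X
    pattern-sinks:
      # hypothetical sink function
      - pattern: sink(...)
```

Under the behavior described in the two taint-mode notes, a rule of this shape would be expected to stop reporting `sink(x)` once `x` is reassigned to safe data or passed through `sanitize(x)`, and to keep tracking `x` as tainted outside the block in which `taint(x)` was called.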
Additions
- `metavariable-regex` now supports an optional `constant-propagation` key. When this is set to `true`, information learned from constant propagation is used when matching the metavariable against the regex. By default, it is set to `false`.
- Dockerfile: Constant propagation now works on variables declared with `ENV`.
- Added `shouldafound`. For more information, see Reporting false negatives.
- dataflow: The data-flow analysis engine now handles `if-then-else` expressions as in OCaml, Ruby, etc. Previously it only handled `if-then-else` statements. (#4965)
- taint-mode: Previously, to declare a function parameter as a taint source, Semgrep relied on a workaround that declared that any occurrence of the parameter was a taint source. If the parameter was overwritten with safe data, this was not recognized by the taint engine. Now, `focus-metavariable` can be used to specify that a function parameter is a source of taint, and the taint engine handles this as expected.
- taint-mode: Added basic support for object destructuring in languages such as JavaScript. For example, given `let {x} = E`, Semgrep now infers that `x` is tainted if `E` is tainted.
- The JSON output of the Semgrep scan is now fully specified using ATD and JSON Schema (https://json-schema.org/). See the semgrep-interfaces submodule under interfaces/ (for example, `interfaces/semgrep-interfaces/Semgrep_output_v0.atd` for the ATD specifications).
- The JSON output of `semgrep scan` now contains a `version` field with the version of Semgrep used to generate the match results.
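Two of the rule-writing additions in this list can be sketched as rules. These are illustrative sketches only: the rule ids, messages, the `/etc` regex, the `handler` and `sink` function names, and the choice of Python as the target language are all assumptions, not taken from the release notes.

```yaml
rules:
  # Sketch 1: metavariable-regex that also uses values learned from
  # constant propagation (for example, a variable assigned a string literal).
  - id: example-regex-with-constant-propagation   # hypothetical rule id
    languages: [python]
    severity: WARNING
    message: open() called with a path under /etc
    patterns:
      - pattern: open($PATH, ...)
      - metavariable-regex:
          metavariable: $PATH
          regex: '.*/etc/.*'             # hypothetical regex
          constant-propagation: true     # defaults to false when omitted

  # Sketch 2: a function parameter declared as a taint source
  # by focusing on the parameter's metavariable.
  - id: example-parameter-as-taint-source         # hypothetical rule id
    mode: taint
    languages: [python]
    severity: WARNING
    message: Parameter of handler() reaches sink()
    pattern-sources:
      - patterns:
          - pattern: |
              def handler($PARAM, ...):
                  ...
          - focus-metavariable: $PARAM
    pattern-sinks:
      # hypothetical sink function
      - pattern: sink(...)
```

With the first sketch, a call such as `open(p)` would be expected to match when constant propagation can determine that `p` holds a string containing `/etc/`. With the second, overwriting the parameter with safe data before the call to `sink` would be expected to suppress the finding, per the taint-mode note in this list.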
Additional information
To see the complete change notes which include fixed issues, visit the Semgrep changelog.