Skip to main content

July 2022

Semgrep App

Additions

  • Semgrep App now integrates with Slack through a Slack app. To create a new integration, go to Settings > Integrations > Add Integration > Slack. Previously, Semgrep App used Slack webhooks.
  • Enable autofix for all of your Projects (repositories connected to Semgrep App) by clicking on Settings > Deployment > Autofix.

Changes

  • Clicking on the Project Name in the Projects page now takes you to that project's Findings page. Click the gear icon at the end of the Project's row to go to the project's Settings page.
  • Semgrep App detects additional environment variables depending on your provider. This simplifies the creation and committing of the configuration file when adding a new Project (repository) in Semgrep App.
  • UI and UX improvements to Scan new project workflow.

Semgrep CLI

These release notes include upgrades for all versions ranging between 0.102.0 and 0.107.0.

Additions

  • Semgrep in CI:

    • Fail-open support: Added --suppress-errors and --no-suppress-errors (the default is --suppress-errors). See Configuring blocking findings and errors for more information.
    • Semgrep in CI does not block builds on triage ignored issues.
    • The timeout for Git commands Semgrep runs is now configurable. To configure the timeout, set the SEMGREP_GIT_COMMAND_TIMEOUT environment variable. The time unit used as a value for this key is in seconds. The default value is 300 which represents 5 minutes.
    • The SEMGREP_GHA_MIN_FETCH_DEPTH environment variable lets you set how many commits semgrep ci fetches from the remote at the minimum when calculating the merge-base in GitHub Actions. Having more commits available helps Semgrep determine what changes came from the current pull request, fixing issues where Semgrep would otherwise report findings that were not touched in a given pull request. This value is set to 0 by default. (Issue #5664)
    • The cli/scripts/compare.py to compare rules for different versions of Semgrep is now supported on Podman environments. For more information, see Contributing rules documentation.
  • Extract mode:

    • New Semgrep CLI experimental extract mode. This mode runs a Semgrep rule on a codebase and extracts code from matches, treating it as a different language. This allows you to supplement an existing set of rules, for example, by writing additional rules to find JavaScript in files of a different language than JavaScript. Among many possible use cases, this enables you to write rules for HTML code in JavaScript code or in template files. While this is somewhat possible with metavariable-pattern, this reduces the work from an M * N problem to an M + N. To know more about extract mode, see Extract mode documentation.
    • Extract mode now has a concatenation reduction (concat). Disjoint snippets within a file can be treated as one unified file.
    • You can use extract mode to scan for generic languages (use value generic in dest-language).
  • Taint mode:

    • Add experimental support for taint labels, which is the ability to attach labels to different kinds of taint. Both sources and sinks can restrict what labels are present in the data that passes through them in order to apply. This allows you to write more complex taint rules that previously required unappealing workarounds. Taint labels are also helpful for writing certain classes of typestate analyses (for example, check that a file descriptor is not used after being closed).
    • Introduced the --dataflow-traces flag, which directs the Semgrep CLI to explain how non-local values lead to a finding. Currently, this only applies to taint mode findings and it traces the path from the taint source to the taint sink.
    • Added taint traces as part of Semgrep JSON output. This helps explain how the sink became tainted.
  • General and language support additions:

    • Semgrep has an experimental support for Elixir language!
    • Scala: Ellipsis are now allowed in for loop function headers, allowing you to write patterns such as for (...; $X <- $Y if $COND; ...) { ... } to match nested for loops. (Issue #5650)
    • Kotlin: Support for ellipsis in field access (for example, obj. ... .bar()).
    • For users logged-in under semgrep login while using Semgrep App. Semgrep now reports file extensions from App-connected scans that do not match the language of any enabled rule. This addition can make the development of new rules more effective by improving language prioritization.
    • Previously, expression statement patterns (for example foo();) were always matching when the expression statement was a bit deeper in the expression (for example, x = foo();). This default behavior can now be disabled through rule options: with implicit_deep_exprstmt: false in rules YAML file. (Issue #5472)
    • LSP support: Improving experimental Language Server Protocol (LSP) support for metavariable inlay hints, hot reloading, App integration, scan commands, and much more!

Changes

  • Breaking changes in the dataflow_trace JSON output to make it more easily consumable by Semgrep App. Added content for taint_source and intermediate_vars, and collapsed the multiple taint_source locations into one.

  • General performance improvements:

    • Semgrep significantly reduced its memory consumption in large repositories!
  • metavariable-comparison:

    • The metavariable-comparison allows you to strip ', ", and ` from the metavariable content, enabling you to scan for strings containing integer or float data. See metavariable-comparison documentation to get more information. With this update, the metavariable field is now only required for strip: true. You are no longer required to include the metavariable field for the default strip: false.

    • The metavariable-comparison now also works on metavariables that cannot be evaluated as simple literals. In such cases, Semgrep takes the string representation of the code bound by the metavariable. Use this string representation through str($MVAR). For example:

      - metavariable-comparison:
      metavariable: $X
      comparison: str($X) == str($Y)

      In this example, $X and $Y can bind to two different code variables and Semgrep checks whether these two code variables have the same name (for example two different variables but both named x).

  • metavariable-pattern:

    • Metavariable-pattern now uses the same metavariable context as its parent. This can cause breaking changes for rules that reuse metavariables in the pattern. For example, consider the following formula:

      - patterns:
      - pattern-either:
      - pattern-inside: $OBJ.output($RESP)
      - pattern: $RESP
      - metavariable-pattern:
      metavariable: $RESP
      pattern: `...{ $OBJ }...`

      Previously, the $OBJ in the metavariable-pattern was a new metavariable. The formula behaved the same if that $OBJ was $A instead. Now, $OBJ unifies with the value bound by $OBJ in the pattern-inside.

  • Using the ellipses operator in XML or HTML elements is now more permissive of whitespace. Previously, in order to have an element with an ellipsis no leading or trailing whitespace was permitted in the element contents, for example <tag>...</tag> was the only permitted form. Now, leading or trailing whitespace is ignored when the substantive content of the element is only an ellipsis.

  • --verbose no longer displays timing information, use --verbose --time to display the timing.

  • The semgrep --test output produced expected lines and reported lines that were difficult to read and interpret. This change introduces missed and incorrect lines making it easier to see the differences in output. See more information about the semgrep --test in the Testing rules documentation.

Additional information

Bug fixes are not included in the release notes unless they are potentially breaking your workflow. To see the complete change notes for Semgrep CLI and CI that include fixes, visit the Semgrep changelog.