Skip to main content

June 2023 release notes

Semgrep OSS Engine

This section of release notes includes upgrades of Semgrep OSS Engine for versions ranging between 1.27.0 and 1.30.0.

Added

  • New --experimental flag to switch to a new implementation of Semgrep entirely written in OCaml with the following benefits:
    • Faster startup time
    • Incremental display of matches
    • AST (abstract syntax tree) and registry caching
    • A new interactive mode
caution

Not all Semgrep features have been ported to the OCaml implementation.

  • Added a new field metavariable-type to Semgrep rule syntax. We at Semgrep have added a dedicated field for annotating the type information of metavariables. By adopting this approach, instead of relying solely on language-specific casting syntax, such as Java's String class, we improve usability by eliminating the need to write redundant type cast expressions for a single metavariable (#8119).

    • The new syntax improves support for target languages that lack built-in casting syntax.
    • It also promotes a unified approach to expressing type, pattern, and regex constraints for metavariables, resulting in better consistency across rule definitions.
    • The following is an example rule written with the current and new metavariable-type syntax:
      # Current syntax without metavariable-type
    rules:
    - id: no-string-eqeq
    severity: WARNING
    message: find errors
    languages:
    - java
    patterns:
    - pattern-not: null == (String $Y)
    - pattern: $X == (String $Y)
    # New syntax with metavariable-type
    rules:
    - id: no-string-eqeq
    severity: WARNING
    message: find errors
    languages:
    - java
    patterns:
    - pattern-not: null == $Y
    - pattern: $X == $Y
    - metavariable-type:
    metavariable: $Y
    type: String
    • The metavariable-type field is now supported for the following languages (#8148, #8165,#8126):
      • Kotlin
      • Go
      • Scala
      • C#
      • TypeScript
      • PHP
      • Rust
      • Python
  • Pattern syntax: You can now introduce metavariables from parts of regular expressions using pattern-regex, by using regular expressions with named capturing groups. Such capture group metavariables must be explicitly named.

    • For example, the pattern:
      pattern-regex: "foo-(?P<X>.*)"
      binds what is matched by the capture group ?P<X> to the metavariable $X, which can be used as normal.
    • pattern-regex patterns with capture groups, such as pattern-regex: "(.*)" still introduce metavariables of the form $1, $2, and so on, but this should be considered deprecated behavior, and that functionality will be taken away in a future release. Named capturing groups should be used instead. (#8115)
  • Rule syntax: Error messages for rule parsing have been improved. For instance, parsing now complains if you miss a hyphen in a list of patterns, or if you try to give a string to patterns or pattern-either. (#8098).

  • JavaScript and TypeScript (#8112): Now, patterns of records with ellipses, such as:

    { $X: ... }

    Properly match to records of anonymous functions, such as:

    {
    func: () => { return 1; }
    }
  • Matching: Writing a pattern which is a sequence of statements, such as

    foo();
    ...
    bar();

    now allows matching to sequences of statements within objects, classes, and related language constructs, in all languages (#8052).

  • Added support for post-PEP 614 decorators.(#8100). Now Semgrep accepts decorators of the form @ named_expr_test NEWLINE, for example with the pattern:

    lambda $X:$X($X):
    #match 1
    @omega := lambda ha:ha(ha)
    def func():
    return None

    #match 2
    @omega[lambda a:a(a)].a.b.c.f("wahoo")
    def fun():
    return None
  • Constant propagation is now applied to stack array declarations in C. A pattern $TYPE $NAME[101]; now produces two matches in the following snippet (#8085):

    int main() {

    int bad_len = 101;
    /* match 1 */
    int arr1[101];
    /* match 2 */
    int arr2[bad_len];
    return 0;
    }
  • Solidity: Allow metavariables for versions. This can be used for the pragma directive in Solidity. For example: >= $VER; (#8105)

  • PHP (#8107): Added support for parsing patterns of the form: #[Attr1], #[Attr2] in code such as:

    #[Attr1]
    #[Attr2]
    function test ()
    {
    echo "Test";
    }

    Previously, to match against multiple attributes it was required to write #[Attr1, Attr2].

  • Added lone decorators as a valid Python Semgrep pattern (#8047), so for example $NAME($X) now generates two seperate findings here:

    @hello("world")
    @hi("semgrep!")
    def shift():
    return "left!"
  • JavaScript and TypeScript: Patterns for class properties can now have the static and async modifiers. For instance:

    @Foo(...)
    async bar(...) {
    ...
    }

    or

    @Foo(...)
    static bar(...) {
    ...
    }
  • Semgrep VS Code extension: Semgrep Language Server now supports multi-folder workspaces. (#7966)

  • New pre-commit hook semgrep-ci to use CI rules in pre-commit, which uses rules from the Policies page and blocks PRs when detecting findings from rules in Block mode (#7973).

  • Added support for date comparison and functionality to get current date. This requires date strings to be in the format "yyyy-mm-dd" (#8053).

Changed

  • The output of --debug is now less verbose by default. It now shows internal warning and error messages. Alternatively, use --verbose for detailed scan information.
  • Updated the maximum number of cores autodetected to 16 to prevent overloading on large machines when users do not specify number of jobs themselves. (#8028)
  • Taint mode: Several improvements to taintassume_safe{booleans,numbers} options. Most notably, we now use type info provided by explicit type casts, and we also use const-prop info to infer types.

Fixed

  • Taint analysis: Improve handling of dataflow for tainted value propagation in class field definitions This change resolves an issue where dataflow was not correctly accounted for when tainted values flowed through field definitions in class/object definitions. For instance, in Kotlin or Scala, singleton objects are commonly used to encapsulate executable logic, where each field definition behaves like a statement during object initialization. In order to handle this scenario, we have introduced an additional step to analyze a sequence of field definitions as a sequence of statements for taint analysis. This enhancement allows us to accurately track tainted values during object initialization. (#7742)

  • Allow any characters in file paths used to create dotted rule IDs. File path characters that aren't allowed in rule IDs are simply removed. For example, a rule whose ID is my-rule found in the file hello/@world/rules.yaml becomes hello.world.my-rule. (#8057)

  • Fixed a typing issue with Go. Previously, the pattern ($VAR : *tau.rho).$F() wouldn't produce a match in the following:

    func f() {
    i_1 := &tau.rho{}
    i_2 := new(tau.rho)

    i_1.shift() //miss one
    i_2.left() //miss two

    return 101
    }

    but now Semgrep doesn't miss those two findings. (#8125)

Semgrep Cloud Platform

Added

  • You can now rename projects (repositories). This feature may be useful when you change the name of your repository in your source code manager. To rename a project from within Semgrep Cloud Platform, do the following:
    1. Click Projects.
    2. Click the gear icon on the entry of the project you want to rename.
    3. Click the three-dot icon > Rename project.
    4. Enter the name of the new project and click Rename. Screenshot of repository rename modal
  • You can now select a default RBAC (role-based access control) role. Refer to Setting a default role.
  • The Policies page is now the default page used to manage rules. See the Policies documentation for more information.

Changed

  • For organizations with multiple GitHub organizations, you can now add organizations as a newline separated list.

Removed

  • The Disliked column has been removed from the Dashboard.

Semgrep Code

Added

  • Semgrep Pro Engine for Java: Semgrep's taint mode can now relate Java properties and their corresponding getters or setters even when these are autogenerated, so the actual getters or setters are not declared in the sources.
  • Findings page: Added a badge to identify security-related findings. Screenshot of findings security badge
  • Findings page: You can now select a range of findings cards at once while pressing Shift.

Fixed

  • Cleaned up and updated wording and UI to reflect the updated Team tier.
  • Fixed many many issues in the Editor.

Removed

  • The Rule board has been deprecated and removed. To manage findings, use the Policies page. See the Policies documentation for more information.

Semgrep Supply Chain

  • Semgrep Supply Chain now supports pnpm lockfiles (pnpm-lock.yaml) version 8 and above.
  • CLI scans now include the following metadata:
    • Severity
    • Recommended version to upgrade a dependency to

Semgrep Assistant (beta)

  • A new setting lets you configure where Semgrep Assistant's triage suggestions appear. You can receive Semgrep Assistant messages within your Slack or GitHub PR comments.
  • Added a new drop-down menu to set a minimum autofix confidence level required to display autofix suggestions from Semgrep Assistant.

Documentation updates

Added

  • The Semgrep knowledge base is now live. The knowledge base hosts the following information:
    • Troubleshooting articles
    • How-to guides and tutorials for specific or custom user needs
  • Added a new document about Semgrep Assistant.
  • Minor updates and corrections.