July 2021
The following updates were made to Semgrep in July 2021.
Version 0.60.0
Additions
- Detect duplicate keys in YAML dictionaries when parsing a Semgrep rule (for example, multiple metavariable keys inside one metavariable-regex).
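As a rough illustration of the kind of mistake this check is meant to catch, the sketch below shows a rule whose metavariable-regex block repeats the metavariable key. The rule id, message, and patterns are hypothetical and exist only for illustration.

```yaml
rules:
  - id: duplicate-key-example        # hypothetical rule id
    languages: [python]
    severity: WARNING
    message: Call to foo() with a suspicious argument
    patterns:
      - pattern: foo($X, $Y)
      - metavariable-regex:
          metavariable: $X
          regex: "^a.*"
          metavariable: $Y           # duplicate key: the same mapping already defines "metavariable"
```

Many YAML parsers silently keep only one of the two values, so without a check like this the rule could filter on a different metavariable than the author intended. The usual fix is to split the filter into two separate metavariable-regex entries.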
Fixes
- C/C++: Fixed stack overflows (segmentation faults) when processing very large files (#3538)
- JavaScript: Fixed stack overflows (segmentation faults) when processing very large files (#3538)
- JavaScript: Detect numeric object keys 1 and 0x1 as equal (#3579)
- OCaml: improved parsing stats by using tree-sitter-ocaml (from 25% to 88%)
- taint-mode: Check nested functions
- taint-mode: foo.x is now detected as tainted if foo is a source of taint
- taint-mode: Do not crash when it is not possible to compute range info
- Rust: Recognize ellipsis in macro call patterns (#3600)
- Ruby: correctly represent a.(b) in the AST (#3603)
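To make the Rust entry above more concrete, here is a minimal sketch of a rule that uses an ellipsis inside a macro call pattern. The rule id and message are hypothetical, and println! is just a convenient macro to illustrate with.

```yaml
rules:
  - id: rust-macro-ellipsis-example  # hypothetical rule id
    languages: [rust]
    severity: INFO
    message: Found a println! call
    # The ellipsis stands for any arguments passed to the macro.
    pattern: println!(...)
```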
Changes
- Added precise error location for the Semgrep metachecker, for example to detect duplicate patterns in a rule.
Version 0.58.2
Additions
- New iteration of taint-mode that allows specifying sources, sanitizers, and sinks using arbitrary pattern formulas, which provides much more flexibility. Note that this breaks compatibility with the previous taint-mode format, e.g., - source(...) must now be written as - pattern: source(...).
- Experimental support for HTML. This does not rely on the generic mode but instead parses the HTML using tree-sitter-html. This allows some semantic matching (e.g., matching attributes in any order).
- js alpha support (#1751)
- New matching option implicit_ellipsis that allows disabling the implicit ... that is added to record patterns, plus allows matching "spread fields" (JS ...x) at any position (#3120)
- Support globstar (**) syntax in path include/exclude (#3173)
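The new taint-mode format and the globstar path syntax listed above can be sketched together in one rule. This is a minimal example, not an official one: the rule id, message, function names (source, sanitize, sink), and paths are hypothetical. It also assumes the globstar entry applies to the rule-level paths key.

```yaml
rules:
  - id: taint-format-example         # hypothetical rule id
    mode: taint
    languages: [python]
    severity: WARNING
    message: Data from source() reaches sink() without sanitization
    pattern-sources:
      - pattern: source(...)         # previously written as "- source(...)"
    pattern-sanitizers:
      - pattern: sanitize(...)
    pattern-sinks:
      - pattern: sink(...)
    paths:
      include:
        - "src/**/*.py"              # ** matches any number of nested directories
      exclude:
        - "**/tests/**"
```

Because each entry is now a pattern formula, a source or sink can also be expressed with operators such as patterns or pattern-either rather than a single expression.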
Fixes
- Apple M1: Semgrep installed from Homebrew no longer hangs (#2432)
- Ruby command shells are distinguished from strings (#3343)
- Java varargs are now correctly matched (#3455)
- Support for partial statements (e.g., try { ... }) for Java (#3417)
- Java generics are now correctly stored in the AST (#3505)
- Constant propagation now works inside Python with statements (#3402)
- Metavariable value replacement in message/autofix no longer mixes up short and long names like $X vs $X2 (#3458)
- Fixed metavariable name collision during interpolation of message / autofix (#3483). Thanks to @Justin Timmons for the fix!
- Revert pattern: $X optimization (#3476)
- metavariable-pattern: Allow filtering using a single pattern or pattern-regex
- Dataflow: Translate call chains into IL
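As an illustration of the metavariable-pattern entry above, the following sketch filters a metavariable with a single pattern placed directly under metavariable-pattern. The rule id, message, and function names are hypothetical.

```yaml
rules:
  - id: metavariable-pattern-example # hypothetical rule id
    languages: [python]
    severity: WARNING
    message: foo() is called with the result of bar()
    patterns:
      - pattern: foo($ARG)
      - metavariable-pattern:
          metavariable: $ARG
          pattern: bar(...)          # a single pattern, no nested "patterns" list needed
```

Replacing the pattern key with pattern-regex would filter $ARG by a regular expression instead.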
Changes
- Significant speed improvements (noted above)
- The semgrep-core binary is now 95 MB (was 170 MB in v0.58.0), and the Docker image is smaller (from 95 MB to 40 MB)
- The --debug option now displays which files are currently being processed incrementally; it no longer waits until semgrep-core completely finishes.
- Switched from OCaml 4.10.0 to OCaml 4.12.0
- Faster matching times for generic mode
- Better error message when a rule contains an empty pattern