Managing findings

Findings

A finding is the core result of Semgrep's analysis. Findings are generated when a Semgrep rule matches a piece of code. After matching, a finding can make its way through 3 parts of the Semgrep ecosystem: Semgrep CLI, Semgrep CI, and Semgrep App.

Semgrep CLI

Semgrep CLI findings are produced by a specific rule matching a piece of code. Multiple rules can match the same piece of code, even if they are effectively the same rule. For example, consider the following two rules and code snippet:

rules:
- id: finding-test
  pattern: $X == $X
  message: Finding test 1
  languages: [python]
  severity: WARNING
- id: finding-test
  pattern: $X == $X
  message: Finding test 2
  languages: [python]
  severity: WARNING

print(1 == 1)

Running Semgrep CLI produces the following findings:

$ semgrep --quiet --config test.yaml test.py
test.py
severity:warning rule:finding-test: Finding test 1
1:print(1 == 1)
--------------------------------------------------------------------------------
severity:warning rule:finding-test: Finding test 2
1:print(1 == 1)

For more information on writing rules, see Rule syntax.

Semgrep CI

Semgrep CI, designed to continuously scan commits and builds, extends Semgrep CLI findings so that the lifetime of each individual finding can be tracked. A Semgrep CI finding is defined by a 4-tuple:

(rule ID, file path, syntactic context, index)

These pieces of state correspond to:

  1. rule ID: the rule's ID within the Semgrep ecosystem.
  2. file path: the filesystem path where the finding occurred.
  3. syntactic context: the lines of code corresponding to the finding.
  4. index: the position of this finding among otherwise identical findings within a file. This is used to disambiguate them.

Note

The syntactic context is normalized by removing indentation, nosemgrep comments, and whitespace.

These four values are hashed together and returned as the syntactic identifier, syntactic_id. This is how Semgrep CI uniquely identifies findings and tracks them across state transitions. Semgrep CI does not store or transmit code contents: the syntactic context is hashed with a one-way hashing function, so the original contents cannot practically be recovered from the identifier.
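The identifier scheme described above can be sketched in a few lines of Python. This is an illustration only, not Semgrep's actual implementation: the function names normalize and syntactic_id, the choice of SHA-256, the field separator, and the exact normalization rules are all assumptions made for the example.

```python
import hashlib
import re

# Assumption: a nosemgrep marker is a trailing "#" or "//" comment.
# The real normalization rules may differ.
NOSEMGREP_COMMENT = re.compile(r"\s*(?:#|//)\s*nosemgrep.*$")


def normalize(context: str) -> str:
    """Drop nosemgrep comments, indentation, and surrounding whitespace."""
    cleaned = []
    for line in context.splitlines():
        line = NOSEMGREP_COMMENT.sub("", line).strip()
        if line:
            cleaned.append(line)
    return "\n".join(cleaned)


def syntactic_id(rule_id: str, path: str, context: str, index: int) -> str:
    """Hash the (rule ID, file path, syntactic context, index) 4-tuple."""
    # A NUL separator keeps adjacent fields from running together.
    payload = "\x00".join([rule_id, path, normalize(context), str(index)])
    # SHA-256 stands in for whichever one-way hash is actually used.
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Adding indentation or a nosemgrep comment leaves the identifier unchanged,
# while a second identical finding in the same file gets a different index.
first = syntactic_id("finding-test", "test.py", "print(1 == 1)", 0)
muted = syntactic_id("finding-test", "test.py", "    print(1 == 1)  # nosemgrep", 0)
second = syntactic_id("finding-test", "test.py", "print(1 == 1)", 1)
```

Because the identifier does not change when a nosemgrep comment is added, a muted finding can be tracked as the same finding it was while open.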

Semgrep App

Semgrep App builds on Semgrep CI findings to track state transitions and provide additional context for managing findings within your organization. Findings move between states according to their Semgrep CI syntactic_id, as mentioned above. A finding can occupy 3 states in Semgrep App: OPEN, FIXED, and MUTED.

Finding states

Semgrep App finding states are defined as follows:

  1. OPEN: the finding exists in the code and has not been muted.
  2. FIXED: the finding existed in the code, and is no longer found.
  3. MUTED: the finding has been ignored by a nosemgrep comment.

Findings move between states as follows:

Finding state transitions

These transitions are defined as follows:

  1. Fix: a previously identified syntactic_id no longer exists.
  2. Regression: a previously fixed syntactic_id has been reintroduced.
  3. Mute: a previously identified syntactic_id has been ignored.
  4. Unmute: a previously muted syntactic_id has been unignored.
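The states and transitions listed above can be modeled as a small lookup, shown here as a hedged sketch rather than Semgrep App's real logic. The names State, next_state, and transition_name are hypothetical, and only the four transitions whose endpoints are unambiguous from the definitions above are mapped; any other state change returns None.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    OPEN = "OPEN"
    FIXED = "FIXED"
    MUTED = "MUTED"


def next_state(found: bool, ignored: bool) -> State:
    """Derive the state of a syntactic_id from the latest scan.

    found:   the syntactic_id still matches in the code.
    ignored: the match is suppressed by a nosemgrep comment.
    """
    if not found:
        return State.FIXED
    return State.MUTED if ignored else State.OPEN


# Assumption: these endpoints are inferred from the prose definitions.
TRANSITIONS = {
    (State.OPEN, State.FIXED): "fix",
    (State.FIXED, State.OPEN): "regression",
    (State.OPEN, State.MUTED): "mute",
    (State.MUTED, State.OPEN): "unmute",
}


def transition_name(previous: State, current: State) -> Optional[str]:
    """Return the named transition, or None if the pair is not mapped."""
    return TRANSITIONS.get((previous, current))


# An open finding that disappears from the next scan counts as a fix.
after_scan = next_state(found=False, ignored=False)
name = transition_name(State.OPEN, after_scan)
```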

Analytics

Semgrep App provides analytics to measure Semgrep performance within your organization. Visit Manage > Analytics and use measurements like fix rate and findings over time to get the most out of your Semgrep deployment:

Blocking vs. non-blocking findings

Filter findings to drill down into specific areas:

Findings filters

View individual findings and their associated state:

Individual finding state

Track high- or low-performing policies, rulesets, and rules:

Ruleset performance

Note

The "rate" for any state is (state total / total of all states), e.g. fix rate = (fixed / (fixed + open + muted)).
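As a worked example of the rate formula, the following sketch computes the rate for any state from per-state totals. The function name state_rate and the counts are hypothetical values chosen for illustration, not data from a real deployment.

```python
from typing import Mapping


def state_rate(counts: Mapping[str, int], state: str) -> float:
    """Rate for one state = that state's total / total of all states."""
    total = sum(counts.values())
    if total == 0:
        # No findings at all: report 0.0 rather than dividing by zero.
        return 0.0
    return counts.get(state, 0) / total


# Hypothetical totals: 1 fixed, 2 open, 1 muted.
counts = {"fixed": 1, "open": 2, "muted": 1}
fix_rate = state_rate(counts, "fixed")  # 1 / (1 + 2 + 1) = 0.25
```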

For more information on blocking vs. non-blocking findings, see Managing CI policy.