A finding is the core result of Semgrep's analysis. Findings are generated when a Semgrep rule matches a piece of code. After matching, a finding can make its way through 3 parts of the Semgrep ecosystem: Semgrep, Semgrep CI, and Semgrep App.
Semgrep command line findings are produced by a specific rule matching a piece of code. Multiple rules can match the same piece of code, even if they are effectively the same rule. For example, consider the following rule and code snippet:
rules:
- id: finding-test
  pattern: $X == $X
  message: Finding test 1
  languages: [python]
  severity: WARNING
- id: finding-test
  pattern: $X == $X
  message: Finding test 2
  languages: [python]
  severity: WARNING
print(1 == 1)
Running Semgrep produces the following findings:
$ semgrep --quiet --config test.yaml test.py
test.py
severity:warning rule:finding-test: Finding test 1
1:print(1 == 1)
--------------------------------------------------------------------------------
severity:warning rule:finding-test: Finding test 2
1:print(1 == 1)
For more information on writing rules, see Rule syntax.
Semgrep CI, designed to continuously scan commits and builds, improves on Semgrep findings to track the lifetime of an individual finding. A Semgrep CI finding is defined by a 4-tuple:
(rule ID, file path, syntactic context, index)
These pieces of state correspond to:
rule ID: the rule's ID within the Semgrep ecosystem.
file path: the filesystem path where the finding occurred.
syntactic context: the lines of code corresponding to the finding.
index: an index into identical findings within a file. This is used to disambiguate findings.
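The role of the index can be illustrated with a minimal sketch. It assumes matches arrive in file order and that indexes start at 0; neither detail is stated above, and FindingKey and assign_indexes are hypothetical names, not part of Semgrep.

```python
from collections import defaultdict
from typing import NamedTuple


class FindingKey(NamedTuple):
    """Hypothetical container for the 4-tuple described above."""
    rule_id: str
    file_path: str
    syntactic_context: str
    index: int


def assign_indexes(matches):
    """Turn (rule_id, file_path, syntactic_context) triples into 4-tuples.

    Assumption: identical triples are numbered 0, 1, 2, ... in the order
    they are encountered, so each finding ends up with a distinct key.
    """
    seen = defaultdict(int)  # how many times each triple has appeared so far
    keys = []
    for rule_id, file_path, context in matches:
        triple = (rule_id, file_path, context)
        keys.append(FindingKey(rule_id, file_path, context, seen[triple]))
        seen[triple] += 1
    return keys


matches = [
    ("finding-test", "test.py", "print(1 == 1)"),
    ("finding-test", "test.py", "print(1 == 1)"),  # identical to the first
    ("finding-test", "test.py", "print(2 == 2)"),
]
keys = assign_indexes(matches)
```

Without the index, the first two matches would collapse into a single key; with it, all three findings stay distinguishable.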
The syntactic context is normalized by removing indentation, nosemgrep comments, and whitespace. These pieces of state are then hashed and returned as the syntactic identifier, syntactic_id. This is how Semgrep CI uniquely identifies findings and tracks them across state transitions. Semgrep CI does not store or transmit code contents. The syntactic context is hashed using a one-way hashing function, making it impossible to recover the original contents.
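The normalize-then-hash idea can be sketched as follows. This is an illustration only: the choice of SHA-256, the comment regex, the exact normalization rules, and the function names are all assumptions, not Semgrep CI's actual implementation.

```python
import hashlib
import re

# Assumption: nosemgrep comments start with "#" or "//" and run to end of line.
NOSEMGREP = re.compile(r"\s*(#|//)\s*nosemgrep.*$")


def normalize_context(lines):
    """Drop nosemgrep comments, indentation, and surrounding whitespace."""
    normalized = []
    for line in lines:
        line = NOSEMGREP.sub("", line)
        normalized.append(line.strip())
    return "\n".join(normalized)


def syntactic_id(rule_id, file_path, context_lines, index):
    """Hash the 4-tuple with a one-way function (SHA-256 is an assumption)."""
    payload = repr((rule_id, file_path, normalize_context(context_lines), index))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


plain = syntactic_id("finding-test", "test.py", ["print(1 == 1)"], 0)
muted = syntactic_id(
    "finding-test", "test.py", ["    print(1 == 1)  # nosemgrep: finding-test"], 0
)
second = syntactic_id("finding-test", "test.py", ["print(1 == 1)"], 1)
```

Under these assumptions, plain and muted are equal, so re-indenting the line or adding a nosemgrep comment does not create a new finding, while second differs because only its index changed.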
Semgrep App builds on Semgrep CI findings to track state transitions and provide additional context for managing findings within your organization. Findings move between states according to their Semgrep CI syntactic_id, as mentioned above. A finding can occupy 4 states in Semgrep App.
Semgrep App finding states are defined as follows:
OPEN: the finding exists in the code and has not been muted.
FIXED: the finding existed in the code, and is no longer found.
MUTED: the finding has been ignored by a nosemgrep comment or via
REMOVED: the finding's rule isn't enabled on the repository anymore. The rule was removed from the used ruleset, the rule was removed from the policy, or the containing policy was detached from the repo.
The possible transitions are defined as follows:
Fix: a previously identified syntactic_id no longer exists.
Regress: a previously fixed syntactic_id has been reintroduced.
Mute: a previously identified syntactic_id has been ignored.
Unmute: a previously muted syntactic_id has been unignored.
Remove: a previously identified or muted syntactic_id's rule is no longer part of the scan.
Readd: a previously removed syntactic_id's rule is part of the scan again. A readded issue can immediately be marked as fixed or muted.
Fixed issues will stay fixed even if their rule is removed.
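The states and transitions above can be modeled as a small lookup table. The sketch below reads "previously identified" as the OPEN state and has a readded finding land in OPEN before any follow-up fix or mute; both are interpretations, and State, TRANSITIONS, and apply_event are hypothetical names rather than Semgrep App internals.

```python
from enum import Enum


class State(Enum):
    OPEN = "open"
    FIXED = "fixed"
    MUTED = "muted"
    REMOVED = "removed"


# (event, current state) -> next state, following the definitions above.
TRANSITIONS = {
    ("fix", State.OPEN): State.FIXED,
    ("regress", State.FIXED): State.OPEN,
    ("mute", State.OPEN): State.MUTED,
    ("unmute", State.MUTED): State.OPEN,
    ("remove", State.OPEN): State.REMOVED,
    ("remove", State.MUTED): State.REMOVED,
    # Fixed issues stay fixed even if their rule is removed.
    ("remove", State.FIXED): State.FIXED,
    # Assumption: a readded finding is OPEN until it is fixed or muted again.
    ("readd", State.REMOVED): State.OPEN,
}


def apply_event(state, event):
    """Return the next state, or raise if the transition is not listed above."""
    try:
        return TRANSITIONS[(event, state)]
    except KeyError:
        raise ValueError(f"no {event!r} transition from {state.name}") from None


# A finding is muted, its rule is removed, then the rule is added back.
state = State.OPEN
for event in ("mute", "remove", "readd"):
    state = apply_event(state, event)
```

Keeping the transitions in one table makes it easy to see which combinations are undefined; for example, this sketch rejects unmuting a finding that was never muted.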
Semgrep App provides analytics to measure Semgrep performance within your organization. Visit Dashboard > Findings and use measurements like fix rate and findings over time to get the most out of your Semgrep deployment.
Filter findings to drill down into specific areas.
View individual findings and their associated state.
Track high- or low-performing policies, rulesets, and rules.
The "rate" for any state is (state total / total of all states), e.g., fix rate = (fixed / (fixed + open + muted)).
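As a worked example of the formula, here is a minimal sketch. The counts are invented for illustration, state_rate is a hypothetical helper, and the denominator follows the fix rate example above (fixed + open + muted).

```python
def state_rate(counts, state):
    """Return counts[state] / (fixed + open + muted), or 0.0 with no findings."""
    total = sum(counts.get(name, 0) for name in ("fixed", "open", "muted"))
    if total == 0:
        return 0.0  # avoid dividing by zero when nothing has been found yet
    return counts.get(state, 0) / total


# Invented numbers: 30 fixed, 50 open, and 20 muted findings.
counts = {"fixed": 30, "open": 50, "muted": 20}
fix_rate = state_rate(counts, "fixed")   # 30 / 100
mute_rate = state_rate(counts, "muted")  # 20 / 100
```

With these counts the fix rate is 30 / (30 + 50 + 20), that is, 30% of tracked findings have been fixed.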
For more information on blocking vs. non-blocking visit Managing CI policy.