Semgrep Pro Engine taint traces
Introduction
This article documents the dataflow analysis of Semgrep Pro Engine and cross-file (interfile) analysis in Semgrep Code. It helps you enable these features and gives an overview of their benefits compared to the analysis in Semgrep OSS.
Viewing the path of tainted data
Semgrep Code can visualize the path of untrusted (tainted) data in specific findings. Findings that display tainted data help you track that data from its source to its sink as it propagates through the body of a function or method. For general information about taint analysis, see the Taint tracking documentation.
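As a rough illustration of what a source, a propagation step, and a sink look like inside a single function, consider the following minimal Python sketch. It is not taken from any Semgrep rule or test case; the function names, the table, and the payload are hypothetical and exist only to make the three roles visible.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory database used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, username):
    # Source: `username` is assumed to be untrusted input,
    # for example a value read from an HTTP request.
    # Propagation: the tainted value flows into `query` via string formatting.
    query = "SELECT name FROM users WHERE name = '%s'" % username
    # Sink: the tainted string reaches the SQL execution call.
    return conn.execute(query).fetchall()


def find_user_safe(conn, username):
    # A parameterized query keeps the untrusted value out of the SQL text,
    # so no tainted string reaches the sink.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (username,)
    ).fetchall()
```

In the unsafe variant, a dataflow trace would connect the `username` parameter, the `query` variable, and the `conn.execute` call. Whether a given rule reports this exact code depends on how that rule defines its sources and sinks.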
With Semgrep Pro Engine, Semgrep Code can display findings that show the propagation of tainted data across multiple files. To get such findings, follow the required steps in the Enabling Semgrep Pro Engine documentation.
This feature is also called dataflow traces. The underlying CLI flag is --dataflow-traces.
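For readers who run Semgrep from scripts, the flag can be passed like any other CLI option. The sketch below only assembles a command line; the helper name `build_scan_command`, the `p/default` config value, and the target path are assumptions chosen for illustration, and actually running the command requires Semgrep to be installed.

```python
import shlex


def build_scan_command(config="p/default", target="."):
    # Hypothetical helper: returns an argument list that asks Semgrep
    # to include dataflow traces in its output for taint findings.
    return ["semgrep", "scan", "--config", config, "--dataflow-traces", target]


# Print the command so it can be inspected or copied into a shell.
print(shlex.join(build_scan_command()))
```

Building the command as a list, rather than as a single string, makes it safe to hand to `subprocess.run` without shell quoting problems.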
You can view dataflow traces in:
- The Findings page of Semgrep Code. For more details, see Path of tainted data in Semgrep Code.
- The PR or MR comments created by Semgrep Code running in your CI. To enable this feature, see the following documentation:
  - To see dataflow traces in GitHub PR comments, see the Dataflow traces in PR comments section.
  - To see dataflow traces in GitLab MR comments, see the Dataflow traces in MR comments section.
Displaying tainted data in Semgrep Code
Not all Semgrep rules or rulesets make use of taint tracking. Ensure that a ruleset that uses taint tracking, such as the default ruleset, is added on your Policies page. If this ruleset is not added, go to https://semgrep.dev/p/default, and then click Add to Policy. You can also add other rules that use taint tracking from Semgrep Registry.
To view a detailed path of tainted data with dataflow traces, follow these steps:
- Log in to Semgrep Cloud Platform, and then click Code in the left panel to view your findings.
- Select a finding where Semgrep Code detected tainted data, and then do one of the following actions:
  - If the default Group by Rule view is enabled, click the View details icon on the card of the finding.
  - If the No grouping view is enabled, click the header hyperlink on the card of the finding. In the example on the screenshot below, this is tainted-sql-string.
- In the Data flow section, you can see the source, traces, and sink of the tainted data. The example below displays the path of tainted data across multiple files because Semgrep Pro Engine was enabled.
- To get cross-file (interfile) findings in your organization, follow the steps in the Enabling Semgrep Pro Engine documentation. See the Semgrep Pro Engine overview for general information about Semgrep Pro Engine.