
Semgrep Pro Engine taint traces

Introduction

This article describes the dataflow analysis of Semgrep Pro Engine and cross-file (interfile) analysis in Semgrep Code. It explains how to enable these features and outlines their benefits compared to the analysis provided by Semgrep OSS.

Viewing the path of tainted data

With dataflow traces, Semgrep Code can show you the path that tainted, or untrusted, data takes in specific findings. This path helps you track how tainted data propagates from its source to its sink through the body of a function or a method. For general information about taint analysis, see Taint tracking.
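To make the terms source and sink concrete, here is a minimal Python sketch of the kind of flow a taint rule is designed to follow. Every name in it (make_db, find_user_unsafe, find_user_safe, the users table) is hypothetical and chosen only for illustration. Whether a particular Semgrep rule reports this exact code depends on how that rule defines its sources and sinks.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two rows (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "alice@example.com"), ("bob", "bob@example.com")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, username: str):
    # Source: `username` stands in for untrusted input (hypothetical).
    condition = "name = '" + username + "'"  # taint propagates into `condition`
    query = "SELECT email FROM users WHERE " + condition  # and then into `query`
    # Sink: the tainted string is executed as SQL.
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, username: str):
    # The input is bound as a parameter, so it never becomes part of the SQL text.
    return conn.execute(
        "SELECT email FROM users WHERE name = ?", (username,)
    ).fetchall()
```

In the unsafe function, the value passes through two intermediate variables before it reaches the sink. A dataflow trace lays out this kind of path step by step.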

When running Semgrep Code from the command line, you can pass in the flag --dataflow-traces to use this feature.
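As a rough sketch, the command line can be assembled and run from a Python script as shown here. The sketch assumes that the semgrep executable is on your PATH and that the scan subcommand and --config option behave as in recent releases. The function names are hypothetical, and p/default is the default ruleset named at https://semgrep.dev/p/default.

```python
import shutil
import subprocess
import sys
from typing import List


def build_semgrep_command(target: str, config: str = "p/default") -> List[str]:
    """Return an argument list for a scan that requests dataflow traces."""
    return ["semgrep", "scan", "--config", config, "--dataflow-traces", target]


def run_scan(target: str) -> int:
    """Run the scan and return its exit code (127 if semgrep is not found)."""
    if shutil.which("semgrep") is None:
        print("semgrep is not installed or not on PATH", file=sys.stderr)
        return 127
    return subprocess.run(build_semgrep_command(target)).returncode
```

Calling run_scan(".") scans the current directory. Only findings from rules that use taint tracking include a trace.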

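You can view dataflow traces in command-line output when you pass the --dataflow-traces flag, and in Semgrep Cloud Platform, as described in Displaying tainted data in Semgrep Code.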

Get cross-file findings

To get cross-file (interfile) findings in your organization, follow the steps in Enabling Semgrep Pro Engine. See Semgrep Pro Engine overview for general information about Semgrep Pro Engine.

Displaying tainted data in Semgrep Code

Prerequisite

Not all Semgrep rules or rulesets make use of dataflow traces, or taint tracking. Ensure that you have a ruleset that does, such as the default ruleset, added on your Policies page. If this ruleset is not added, go to https://semgrep.dev/p/default and click Add to Policy. You can also add rules that use taint tracking from Semgrep Registry.

To view the detailed path of tainted data with dataflow traces:

  1. Log in to Semgrep Cloud Platform, and click Code in the left panel to view your findings.
  2. Select the finding you're interested in, then do one of the following actions:
    • If the default Group by Rule view is enabled, click the View details icon on the card of the finding.
    • If the No grouping view is enabled, click the header hyperlink on the card of the finding, for example, tainted-sql-string.
  3. In the Data flow section, you can see the source, traces, and sink of the tainted data. When Semgrep Pro Engine is enabled, this path can span multiple files.
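The source, traces, and sink shown in the Data flow section can be thought of as an ordered list of steps. The sketch below models that idea in Python with a hypothetical TraceStep type and made-up file names. It is not Semgrep's output format. It only illustrates what it means for a trace to cross files.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TraceStep:
    role: str  # "source", "intermediate", or "sink" (hypothetical labels)
    path: str  # file that contains this step
    line: int  # line number within that file
    code: str  # the matched code at this step


def is_cross_file(steps: List[TraceStep]) -> bool:
    """Return True when the steps of a trace touch more than one file."""
    return len({step.path for step in steps}) > 1


def format_trace(steps: List[TraceStep]) -> List[str]:
    """Render each step as one line: role, location, then code."""
    return [f"{s.role:<12} {s.path}:{s.line}  {s.code}" for s in steps]


# Made-up trace: the source and the sink live in different files.
EXAMPLE_TRACE = [
    TraceStep("source", "handlers.py", 12, "username = request.args['user']"),
    TraceStep("intermediate", "handlers.py", 13, "query = build_query(username)"),
    TraceStep("sink", "db.py", 40, "cursor.execute(query)"),
]
```

A trace like EXAMPLE_TRACE, where the source and the sink are in different files, corresponds to a cross-file (interfile) finding.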
