Semgrep Pro Engine taint traces

Introduction

This article documents the dataflow analysis of Semgrep Pro Engine and the cross-file (interfile) analysis in Semgrep Code. It helps you enable these features and gives an overview of their benefits compared to the analysis of Semgrep OSS.

Viewing the path of tainted data

Semgrep Code can visualize the path of untrusted (tainted) data in specific findings. Findings that display tainted data help you track the sources and sinks of that data as it propagates through the body of a function or method. For general information about taint analysis, see the Taint tracking documentation.
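To make the terms source, propagation, and sink concrete, here is a minimal Python sketch of a tainted flow inside a single function. The function names, the `users` table, and the choice of `sqlite3` are illustrative assumptions and do not come from Semgrep's documentation. This is the general kind of flow that a taint rule such as tainted-sql-string is meant to report, though whether a particular rule flags this exact snippet is not confirmed here.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user(conn, username):
    # Source: `username` is assumed to be untrusted input,
    # for example a value taken from an HTTP request.
    # Propagation: the taint flows into `query` through string formatting.
    query = "SELECT name FROM users WHERE name = '%s'" % username
    # Sink: the tainted string reaches the SQL execution call.
    return conn.execute(query).fetchall()


def find_user_safe(conn, username):
    # A parameterized query keeps the untrusted value out of the SQL text,
    # so the value never reaches the sink as part of the statement.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (username,)
    ).fetchall()
```

Calling `find_user(make_db(), "x' OR '1'='1")` returns every row in the table, while `find_user_safe` with the same input returns no rows. A dataflow trace for the first function would connect the `username` parameter to the `conn.execute(query)` call through the intermediate `query` variable.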

With Semgrep Pro Engine, Semgrep Code can display findings that show the propagation of tainted data across multiple files. To get such findings, follow the required steps in the Enabling Semgrep Pro Engine documentation.

This feature is also called dataflow traces, and the underlying CLI flag is --dataflow-traces.
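As a rough illustration of where the flag fits, the following Python sketch assembles a Semgrep CLI invocation that includes --dataflow-traces. The helper name `build_scan_command`, the default `p/default` config, and the `.` target are assumptions chosen for the example. The sketch only builds and prints the command and does not run Semgrep.

```python
import shlex


def build_scan_command(config="p/default", target="."):
    """Return an argv list for a Semgrep scan with dataflow traces enabled.

    `config` and `target` are hypothetical defaults for this example.
    """
    return ["semgrep", "scan", "--config", config, "--dataflow-traces", target]


if __name__ == "__main__":
    # Print the command as a shell-safe string for inspection.
    print(shlex.join(build_scan_command()))
```

If Semgrep is installed, the returned list can be passed to `subprocess.run` to execute the scan. Cross-file traces additionally require Semgrep Pro Engine to be enabled, as described above.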

You can view dataflow traces in Semgrep Code, as described in the following section.

Displaying tainted data in Semgrep Code

Prerequisite

Not all Semgrep rules or rulesets make use of taint tracking. Ensure that a ruleset that does, such as the default ruleset, is added on your Policies page. If this ruleset is not added, go to https://semgrep.dev/p/default, and then click Add to Policy. You can also add rules that use taint tracking from Semgrep Registry.

To view a detailed path of tainted data with dataflow traces, follow these steps:

  1. Log in to Semgrep Cloud Platform, and then click Code in the left panel to view your findings.
  2. Select a finding where Semgrep Code detected tainted data, and then do one of the following actions:
    • If the default Group by Rule view is enabled, click the View details icon on the card of the finding.
    • If the No grouping view is enabled, click the header hyperlink on the card of the finding, for example tainted-sql-string.
  3. In the Data flow section, you can see the source, traces, and sink of the tainted data. With Semgrep Pro Engine enabled, this section can display the path of tainted data across multiple files.
