
    Cross-file analysis taint traces

    Introduction

    This article describes cross-file (interfile) dataflow analysis in Semgrep Code. It explains how to enable these features and gives an overview of their benefits compared to the analysis in Semgrep OSS.

    Viewing the path of tainted data

    With dataflow traces, Semgrep Code can visualize the path of tainted (untrusted) data in specific findings. This path helps you follow the tainted data from its source to its sink as it propagates through the body of a function or method. For general information about taint analysis, see Taint tracking.
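
    To make the terms source, propagation, and sink concrete, here is a minimal sketch of the kind of flow a taint rule is meant to report. The function names (get_username, build_query, find_user) and the users table are hypothetical and chosen only for illustration; whether a given rule reports this exact code depends on how that rule defines its sources and sinks.

```python
import sqlite3


def get_username(form: dict) -> str:
    # Source: the value comes from untrusted input (a hypothetical request form).
    return form["username"]


def build_query(username: str) -> str:
    # Propagation: the tainted value is concatenated into a SQL string.
    return "SELECT id FROM users WHERE name = '" + username + "'"


def find_user(conn: sqlite3.Connection, form: dict) -> list:
    query = build_query(get_username(form))
    # Sink: the tainted string reaches a database call without sanitization.
    return conn.execute(query).fetchall()
```

    A dataflow trace for a finding like this would list the source, the intermediate steps, and the sink, which is the path described above. Passing the value as a query parameter instead of concatenating it is the usual way to break the flow.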

    When running Semgrep Code from the command line, pass the --dataflow-traces flag to use this feature.
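
    As a rough sketch of how the flag might be used from a script, the helper below assembles a scan command that includes --dataflow-traces. The helper names (build_scan_command, run_scan), the use of the semgrep scan subcommand, and the p/default ruleset are assumptions for illustration; adjust them to match your own setup. Cross-file findings need the additional setup described in Perform cross-file analysis.

```python
import subprocess
from typing import List


def build_scan_command(target: str = ".", config: str = "p/default") -> List[str]:
    # Assumed invocation: scan `target` with the chosen ruleset and
    # ask for dataflow traces in the reported findings.
    return ["semgrep", "scan", "--config", config, "--dataflow-traces", target]


def run_scan(target: str = ".") -> int:
    # Requires the semgrep CLI to be installed and on PATH.
    # Returns the exit code of the process.
    return subprocess.run(build_scan_command(target), check=False).returncode
```

    Calling run_scan("src") would run the scan on a hypothetical src directory; build_scan_command can be used on its own to inspect the command first.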

    You can view dataflow traces in:

    Get cross-file findings

    To get cross-file (interfile) findings in your organization, follow the steps in Perform cross-file analysis.

    Displaying tainted data in Semgrep Code

    Prerequisite

    Not all Semgrep rules or rulesets use taint tracking, so not all of them produce dataflow traces. Ensure that a ruleset that does, such as the default ruleset, is added on your Policies page. If this ruleset is not added, go to https://semgrep.dev/p/default, and then click Add to Policy. You can also add rules that use taint tracking from the Semgrep Registry.

    To view the detailed path of tainted data with dataflow traces:

    1. Log in to Semgrep AppSec Platform, and click Code in the left panel to view your findings.
    2. Select the finding you're interested in, then do one of the following actions:
      • If the default Group by Rule view is enabled, click the View details icon on the card of the finding.
      • If the No grouping view is enabled, click the header hyperlink on the card of the finding, for example tainted-sql-string.
    3. In the Data flow section, you can see the source, traces, and sink of the tainted data. When Semgrep Pro Engine is enabled, this path can span multiple files.
