    How to exclude certain file types for a particular rule

    Certain file types can generate numerous false positives and slow down your triage process. This document shows how to reach a selective middle ground:

    • Continue scanning the file type with other rules
    • Reduce time spent triaging false positives

    Background

    This article uses a real-life case from scanning .svg files. .svg files that embed image data consist mostly of a single string of thousands of characters:

    <image id="image0" width="2896" height="998" xlink:href="data:image/png;
    base64,iVBORw0KGgoAAAANSUhEUgAAC1AAA**AP6*mCAYAAABQS58cAAABR2lDQ1BJQ0M
    gUHJvZmlsZQAAKJFjYGASSSwoyGFhYGDIzSspCnJ3UoiIjFJgf8LAzsDIwM1gwqCRmFxc4B
    gQ4ANUwgCjUcG3a0C1QHBZF2SW3AzZBT+7Sn68UphgqTU7fyemehTAlZJanAyk/wBxWnJBU
    QkDA2MKkK1cXlIAYncA2SJFQEcB2XNA7HQIewOInQRhHwGrCQlyBrJvANkCyRmJQDMYXwDZ
    OklI4ulIbKi9IMDj4urjoxBqZG5oEUjAuaSDktSKEhDtnF9QWZSZnlGi4AgMpVQFz7xkPR0F
    IwMjAwYGUJhDVH8OAoclo9g+hFj+EgYGi28MDMwTEWJJUxgYtrcxMEjcQ…..

    Semgrep’s standard Artifactory rule (see it in the Semgrep Registry), for example, reports on strings such as:

    # ruleid: detected-artifactory-password
    AP6xxxxxxxxxx

    # ruleid: detected-artifactory-password
    AP2xxxxxxxxxx

    ...

    Because the encoded data in an .svg file runs to thousands of characters, it can contain, purely by chance, a substring that looks like an Artifactory token, such as AP6*m... in the .svg snippet. The Artifactory rule reports that substring as a finding. It is a false positive because the .svg file does not leak any password.
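The effect can be reproduced without Semgrep. The following shell sketch searches an illustrative base64-like string for a token-shaped substring. The pattern is a simplified, hypothetical stand-in (AP, one digit, ten alphanumerics); it is not the regex the Registry rule uses, and the sample string is made up for this example.

```shell
# Illustrative base64-like data, loosely modeled on an embedded PNG payload.
sample='iVBORw0KGgoAAAANSUhEUgAAC1AAAAP6mCAYAAABQS58cAAABR2lDQ1BJQ0M'

# Hypothetical, simplified token shape: "AP", one digit, ten alphanumerics.
# This is NOT the regex used by the Registry rule.
pattern='AP[0-9][A-Za-z0-9]{10}'

# grep -o prints only the matching part; -E enables extended regex syntax.
match=$(printf '%s\n' "$sample" | grep -oE "$pattern" | head -n 1)
echo "token-shaped substring found in image data: $match"
```

With this sample the match is AP6mCAYAAABQS, which is image data rather than a credential. The longer the encoded string, the more likely such a chance match becomes.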

    Choosing the appropriate ignoring solution

    Semgrep offers several ways of ignoring false positives, but the most direct ones fit this case poorly:

    • Add nosemgrep as a comment on the first line of code in the file. You would have to track every new .svg file and edit each one accordingly, which requires constant maintenance.
    • Ignore the file entirely by adding it to a .semgrepignore file. This excludes the file from being scanned with all rules, not just the Artifactory rule.
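For comparison, the second option might look like the sketch below. It writes a one-line .semgrepignore into a scratch directory that stands in for a repository root (the directory is hypothetical). The single gitignore-style entry removes every .svg file from scanning by all rules, which is broader than needed here.

```shell
# Scratch directory standing in for a repository root (hypothetical layout).
repo=$(mktemp -d)

# .semgrepignore uses gitignore-style patterns; this one matches every .svg file.
printf '%s\n' '*.svg' > "$repo/.semgrepignore"

# Show the resulting ignore entry.
ignore_entry=$(cat "$repo/.semgrepignore")
echo "ignored for all rules: $ignore_entry"
```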

    Achieving a happy medium: creating a custom rule to exclude a file type

    You can safely assume that .svg files do not intentionally contain Artifactory passwords, so you can exclude this file type from the rule that generates the false positives. The following procedure creates a customized copy of that rule that skips the target file type.

    1. Download the rule generating false positives from the Registry.
    2. Change the rule ID to a custom value:
       - id: my_detected-artifactory-password
    3. Exclude the target file type from the rule through the paths field:
       % cat my_custom_artifactory.yml
       rules:
       - id: my_detected-artifactory-password
         options:
         .
         .
         .
           - metavariable-analysis:
               analyzer: entropy
               metavariable: $ITEM
         paths:
           exclude:
           - "*.svg"
         languages:
         - generic
         .
         .
         .
    4. Alter the scan command so that it still uses your default configuration, with the following changes:
       1. Exclude the original noisy rule that is reporting the false positives.
       2. Include the new custom rule that excludes your target paths.

    Your original semgrep scan or semgrep ci command can then look similar to the following:

    % semgrep scan --config=auto --config=my_custom_artifactory.yml --exclude-rule generic.secrets.security.detected-artifactory-password.detected-artifactory-password
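As a recap of the rule-editing step, the sketch below writes a complete but minimal rule file and checks that the exclusion is present. Everything except the rule ID, the paths block, and the generic language is a placeholder assumed for illustration: the message, the severity, and especially the pattern-regex are not taken from the Registry rule. In practice you would keep the patterns from the downloaded rule and only add the paths block.

```shell
# Write a minimal, hypothetical rule file to a scratch directory.
workdir=$(mktemp -d)
rule_file="$workdir/my_custom_artifactory.yml"

cat > "$rule_file" <<'EOF'
rules:
- id: my_detected-artifactory-password
  message: Possible Artifactory password (placeholder message)
  severity: ERROR
  languages:
  - generic
  # Placeholder pattern; keep the patterns from the downloaded rule instead.
  pattern-regex: 'AP[0-9][A-Za-z0-9]{10}'
  paths:
    exclude:
    - "*.svg"
EOF

# Confirm the exclusion made it into the file (prints the paths block).
grep -n -A 2 'paths:' "$rule_file"

# With Semgrep installed, the file can then be checked before scanning, e.g.:
#   semgrep scan --validate --config "$rule_file"
```

Note that paths sits at the same level as id and languages, so the exclusion applies to this rule only and every other rule continues to scan .svg files.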
