Semgrep OSS in CI

Semgrep OSS can be set up to run static application security testing (SAST) scans on repositories of any size.

This guide explains how to set up Semgrep OSS in your CI pipeline using entirely open source components, also known as a stand-alone CI setup. The preferred Semgrep OSS command is semgrep scan.

Prerequisites

  • Sufficient permissions in your repository to:
    • Commit a CI configuration file.
    • Start or stop a CI job.
  • Optional: Create environment variables.

Ensure your scans use open source components

This setup uses only the Semgrep CLI tool, which is licensed under LGPL 2.1. It is not subject to the usage limits of Semgrep Pro. To remain strictly open source, ensure that the rules you run either use open source licenses or are your own custom Semgrep rules.

To verify a rule's license, read the license key under the metadata of a Semgrep rule.

The last line of the following example rule is a license: MIT key-value pair.

rules:
- id: eslint.detect-object-injection
  patterns:
  - pattern: $O[$ARG]
  - pattern-not: $O["..."]
  - pattern-not: "$O[($ARG : float)]"
  - pattern-not-inside: |
      $ARG = [$V];
      ...
      <... $O[$ARG] ...>;
  - pattern-not-inside: |
      $ARG = $V;
      ...
      <... $O[$ARG] ...>;
  - metavariable-regex:
      metavariable: $ARG
      regex: (?![0-9]+)
  message: Bracket object notation with user input is present, this might allow an
    attacker to access all properties of the object and even it's prototype,
    leading to possible code execution.
  languages:
  - javascript
  - typescript
  severity: WARNING
  metadata:
    cwe: "CWE-94: Improper Control of Generation of Code ('Code Injection')"
    primary_identifier: eslint.detect-object-injection
    secondary_identifiers:
    - name: ESLint rule ID security/detect-object-injection
      type: eslint_rule_id
      value: security/detect-object-injection
    license: MIT
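
To check several rule files at once, a small shell helper can flag files that carry no license key. This is a minimal sketch, not part of Semgrep: the function name rules_missing_license is hypothetical, and it does a line-based text match rather than parsing YAML, so a file that holds several rules passes as soon as any one of them declares a license.

```shell
# List rule files under a directory that contain no `license:` line.
# Hypothetical helper: text match only, no YAML parsing.
rules_missing_license() {
  rules_dir="$1"
  find "$rules_dir" -type f \( -name '*.yml' -o -name '*.yaml' \) | sort |
    while IFS= read -r rule_file; do
      # A license line looks like "license: MIT", with optional leading spaces.
      if ! grep -Eq '^[[:space:]]*license:[[:space:]]*[^[:space:]]' "$rule_file"; then
        printf '%s\n' "$rule_file"
      fi
    done
}
```

For example, rules_missing_license rules/ prints the path of each file to review by hand. Your own custom rules do not need a license key, so treat the output as a review list rather than a failure.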

For a comparison of the behavior between Semgrep OSS CI scans and Semgrep Pro scans, see Semgrep Pro versus Semgrep OSS.

Set up the CI job

Use template configuration files

Select your CI provider in Sample CI configs to view a configuration file that you can commit to your repository to create a Semgrep job.

Use other methods

Use either of the following methods to run Semgrep on other CI providers.

Direct Docker usage

Reference or add the semgrep/semgrep Docker image directly. The method to add the Docker image varies based on the CI provider. This method is used in the Bitbucket Pipelines code snippet.
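
On a provider that lets a job run shell commands with access to Docker, the image can also be invoked directly. The sketch below assumes the docker CLI is installed on the runner; the function name scan_with_docker and the /src mount point are illustrative choices, not requirements stated in this guide.

```shell
# Scan a checked-out repository with the semgrep/semgrep image.
# Hypothetical helper; assumes the `docker` CLI is installed on the CI runner.
scan_with_docker() {
  repo_dir="${1:-$PWD}"
  # Mount the repository at /src and run the scan from that directory.
  docker run --rm \
    -v "${repo_dir}:/src" \
    -w /src \
    semgrep/semgrep \
    semgrep scan --config auto
}
```

Calling scan_with_docker with no argument scans the current working directory, which is usually the repository checkout in a CI job.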

Install Semgrep within your CI job

If you cannot use the Semgrep Docker image, install Semgrep as a step or command within your CI job:

  1. Add pip3 install semgrep into the configuration file as a step or command, depending on your CI provider's syntax.
  2. Run any valid semgrep scan command, such as semgrep scan --config auto.

For an example, see the Azure Pipelines code snippet.
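
The two steps above can be sketched as a shell helper for providers that run plain script blocks. This is an illustration under assumptions: the function name install_semgrep_if_missing is hypothetical, and the job image is assumed to provide pip3. The check for an existing installation is an addition that avoids reinstalling on images that already ship Semgrep.

```shell
# Install Semgrep with pip only when it is not already on PATH.
# Hypothetical helper; assumes pip3 is available in the job.
install_semgrep_if_missing() {
  if command -v semgrep >/dev/null 2>&1; then
    echo "semgrep already installed"
  else
    pip3 install semgrep
  fi
}
```

A CI step would call install_semgrep_if_missing and then run a scan command such as semgrep scan --config auto.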

Configure your CI job

The following sections describe methods to customize your CI job.

Schedule your scans

The following table is a summary of methods and resources to set up schedules for different CI providers.

| CI provider | Where to set schedule |
| --- | --- |
| GitHub Actions | See Sample CI configs for information on how to modify your semgrep.yml file |
| GitLab CI/CD | Refer to GitLab documentation |
| Jenkins | Refer to Jenkins documentation |
| Bitbucket Pipelines | Refer to Bitbucket documentation |
| CircleCI | Refer to CircleCI documentation |
| Buildkite | Refer to Buildkite documentation |
| Azure Pipelines | Refer to Azure documentation |

Customize rules and rulesets

Add rules to scan with semgrep scan

You can customize which rules run in your CI job. Rules and rulesets can come from the Semgrep Registry or from your own rule files. Semgrep reads the rules to scan with from two sources:

  • The value of the SEMGREP_RULES environment variable.
  • The value passed after --config. You can use multiple --config arguments, one per value. For example: semgrep scan --config p/default --config p/comment.

The SEMGREP_RULES environment variable accepts a list of local and remote rules and rulesets to run. When the variable is exported from a shell command or script block, the list is delimited by spaces. For example, see the following Bitbucket Pipelines snippet:

# ...
script:
- export SEMGREP_RULES="p/nginx p/ci no-exec.yml"
- semgrep scan
# ...

The export line sets three different sources, delimited by spaces:

- export SEMGREP_RULES="p/nginx p/ci no-exec.yml"

The example references two rulesets from Semgrep Registry (p/nginx and p/ci) and a rule available in the repository (no-exec.yml).

If the SEMGREP_RULES environment variable is defined in a YAML block, list one rule or ruleset per line. The folded block scalar (>-) joins the lines with single spaces when the YAML is parsed. See the following example of a GitLab CI/CD snippet:

# ...
variables:
  SEMGREP_RULES: >-
    p/nginx
    p/ci
    no-exec.yml
# ...
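
Because both delimiters are whitespace, a SEMGREP_RULES value maps directly onto repeated --config arguments. As an illustration only, the hypothetical helper below prints that mapping for a given value. It is not part of Semgrep, which reads the variable on its own.

```shell
# Print one "--config <source>" line per entry in a SEMGREP_RULES-style value.
# Hypothetical helper for illustration; Semgrep reads SEMGREP_RULES on its own.
rules_to_config_flags() {
  # Normalize spaces and newlines to one entry per line, then label each entry.
  printf '%s\n' "$1" | tr -s '[:space:]' '\n' |
    while IFS= read -r rule_source; do
      if [ -n "$rule_source" ]; then
        printf '%s %s\n' --config "$rule_source"
      fi
    done
}
```

Passing either the space-delimited or the newline-delimited form of p/nginx, p/ci, and no-exec.yml yields the same three lines, which correspond to semgrep scan --config p/nginx --config p/ci --config no-exec.yml.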

Write your own rules

Write custom rules to enforce your team's coding standards and security practices. Rules can be forked from existing community-written rules.

See Writing rules to learn how to write custom rules.

Ignore files

See Ignore files, folders, and code.

By default, Semgrep skips files and directories such as tests/, node_modules/, and vendor/. It uses the default .semgrepignore file, which you can find in the Semgrep GitHub repository, whenever there is no .semgrepignore file in the root of your repository.

Optional: Copy and commit the default .semgrepignore file to the root of your repository and extend it with your own entries, or write your own .semgrepignore file from scratch. If Semgrep detects a .semgrepignore file in your repository, it does not append entries from the default .semgrepignore file.

For a complete example, see the .semgrepignore file in Semgrep’s source code.
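
As a starting point, a custom .semgrepignore can be generated from a script step. The sketch below rests on several assumptions: the function name write_semgrepignore is hypothetical, the first three entries only repeat the defaults named in this section, and build/ and *.min.js are illustrative gitignore-style patterns that you should replace with paths from your own repository.

```shell
# Write a starter .semgrepignore into the root of a repository.
# Hypothetical helper; the entries are examples, not Semgrep's full default list.
write_semgrepignore() {
  repo_root="${1:-.}"
  cat > "$repo_root/.semgrepignore" <<'EOF'
# Directories named in the default ignore list
tests/
node_modules/
vendor/

# Example project-specific entries (replace with your own)
build/
*.min.js
EOF
}
```

Because a committed .semgrepignore replaces the default file rather than extending it, the starter file repeats the default entries it still needs.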

Caution: .semgrepignore is only used by Semgrep. Integrations such as GitLab's Semgrep SAST Analyzer do not use it.

Save or export findings to a file

To save or export findings, pass file format options and send the formatted findings to a file.

For example, to save to a JSON file:

semgrep scan --json > findings.json

The JSON schema for Semgrep's CLI output can be found in semgrep/semgrep-interfaces.

You can also use the SARIF format:

semgrep scan --sarif > findings.sarif

Refer to the CLI reference for output formats.
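
To keep both formats from one job, the two commands above can be wrapped in a small helper. This is a sketch under assumptions: the function name save_findings and the default directory semgrep-results are hypothetical, and the scan runs twice, once per format, which doubles scan time.

```shell
# Save findings in both JSON and SARIF formats under one directory.
# Hypothetical helper; runs the scan twice, once per format.
save_findings() {
  out_dir="${1:-semgrep-results}"
  mkdir -p "$out_dir" || return 1
  # Stop early if either scan command fails.
  semgrep scan --json > "$out_dir/findings.json" || return 1
  semgrep scan --sarif > "$out_dir/findings.sarif" || return 1
}
```

After save_findings returns, the output directory can be archived with whatever artifact mechanism your CI provider offers.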

Migrate to Semgrep AppSec Platform from a stand-alone CI setup

Migrate to Semgrep AppSec Platform to:

  • View and manage findings in a centralized location. You can ignore false positives through triage actions, individually or in bulk.
  • Configure rules and the action to take when a rule generates a finding. The available actions are:
    • Audit the rule. This means that findings are kept within Semgrep's Findings page and are not surfaced to your team's SCM.
    • Show the finding to your team through the use of PR and MR comments.
    • Block the pull or merge request.

To migrate to Semgrep AppSec Platform:

  1. Create an account in Semgrep AppSec Platform.
  2. Click Projects > Scan New Project > Run scan in CI.
  3. Follow the steps in the setup page to complete your migration.
  4. Optional: Remove the old CI job that does not use Semgrep AppSec Platform.

Semgrep OSS jobs versus Semgrep Pro jobs

| Feature | Semgrep Pro CI (semgrep ci) | Semgrep OSS CI (semgrep scan) |
| --- | --- | --- |
| Customized SAST scans | ✔️ | ✔️ |
| SCA (software composition analysis) scans | ✔️ | -- |
| Secrets scans | ✔️ | -- |
| PR (pull request) or MR (merge request) comments | ✔️ | -- |
| Finding status tracked over lifetime | ✔️ | -- |
