
Add Semgrep to CI

Semgrep is integrated into CI environments by creating a job that is run by the CI provider. After a scan, findings are sent to Semgrep AppSec Platform for triage and remediation.

By integrating Semgrep into your CI environment, your development cycle benefits from the automated scanning of repositories at various events, such as:

  • Push events
  • Pull or merge requests (PRs or MRs)
  • User-initiated events (such as GitHub Actions' workflow_dispatch)

Guided setup for CI providers in Semgrep AppSec Platform

This guide walks you through creating a Semgrep job in the following CI providers, which are explicitly supported in Semgrep AppSec Platform:

  • GitHub Actions
  • GitLab CI/CD
  • Jenkins
  • Bitbucket
  • CircleCI
  • Buildkite
  • Azure Pipelines

Figure. Semgrep AppSec Platform provides steps and configuration files to set up a Semgrep job for popular CI providers.

If your provider is not on this list, you can still integrate Semgrep into your CI workflows by following the steps in Add Semgrep to other CI providers.

Projects

Adding a Semgrep job to your CI provider also adds the repository's records, including findings, as a project in Semgrep AppSec Platform. Each Project can be individually configured to send notifications or tickets.

Figure. Semgrep Projects page, which displays all the repositories to which you have successfully added a Semgrep job.

Add Semgrep to CI

To add a Semgrep job to your CI provider:

  1. Ensure you are signed in to Semgrep AppSec Platform.
  2. Click Projects on the left sidebar.
  3. Click Scan new project > CI/CD.
  4. Click the name of the CI provider you use. You are taken to the Add job page.
  5. Follow the steps provided on the page. The process varies depending on your CI provider, but generally includes the following steps:
    1. Click Create new token to create a SEMGREP_APP_TOKEN, which is used when sending results to Semgrep AppSec Platform.
    2. Copy and paste the SEMGREP_APP_TOKEN and its value. Store it as an environment variable or secret in your CI provider.
    3. Optional: Click Review CI config to see Semgrep's default YAML configuration file for your CI provider.
    4. Click Copy snippet and paste it into your CI provider's configuration file (the filename is typically indicated on the page). Depending on your CI provider, you may have to create a new configuration file or edit an existing one.
    5. Commit the configuration file to your repository.
    6. Return to Semgrep AppSec Platform and click Check connection.

You have now added a Semgrep job to your CI provider; this starts your first full scan. Its findings are sent to Semgrep AppSec Platform for triage and remediation.
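As a rough sketch of what such a job boils down to on a provider that runs plain shell steps, the fragment below checks that the token is available and then starts the scan. SEMGREP_APP_TOKEN and semgrep ci come from the steps above; the helper names require_token and run_semgrep_job are hypothetical, and this is not the configuration that Semgrep AppSec Platform generates for you.

```shell
# Hypothetical helper: fail early if the CI secret was not exposed to the job.
require_token() {
  if [ -z "${SEMGREP_APP_TOKEN:-}" ]; then
    echo "SEMGREP_APP_TOKEN is not set; store it as a CI secret first" >&2
    return 1
  fi
}

# Hypothetical wrapper: run the scan only when the token is present.
# Assumes the semgrep CLI is already installed in the build environment.
run_semgrep_job() {
  require_token || return 1
  semgrep ci
}
```

Calling run_semgrep_job from the job's script step keeps a missing secret from surfacing as a less obvious failure later in the scan.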

Tip: You can edit your configuration files to send findings to GitHub Advanced Security Dashboard (GHAS) and GitLab SAST Dashboard. Refer to the following samples:

Sample CI configuration snippets

Refer to the following table for links to sample CI configuration snippets:

In-app CI provider     Sample CI configuration snippet
GitHub Actions         semgrep.yml
GitLab CI/CD           .gitlab-ci.yml
Jenkins                Jenkinsfile
Bitbucket Pipelines    bitbucket-pipelines.yml
CircleCI               config.yml
Buildkite              pipelines.yml
Azure Pipelines        azure-pipelines.yml

Data collected by Semgrep

When running in CI, Semgrep runs fully in the CI build environment. Unless you have explicitly granted code access to Semgrep, your code is not sent anywhere.

  • Semgrep collects findings data, which includes the line number of the code match, but not the code. It is hashed using a one-way hashing function.
  • Findings data is used to generate line-specific hyperlinks to your source code management system and support other Semgrep functions.

Delete a project

Deleting a project removes all of its findings, metadata, and other records from Semgrep AppSec Platform.

  1. In Semgrep AppSec Platform, click Projects.
  2. Search for your repository's name.
  3. Click the windows icon to access the settings page for that project.
  4. Click the three-dot (...) button at the header and click Delete project.

Info: It can take up to 24 hours for the Dashboard to update and remove findings associated with a recently deleted project.

Scan scope

Semgrep scans can be classified by scope. The scope of a scan refers to what lines of code are scanned in a codebase. When classifying scans by scope, there are two types of scans:

Full scan

A full scan runs on your entire codebase and reports every finding in it. It is recommended to perform a full scan of your default branch, such as main or master, at a regular cadence, such as every night or every week. This ensures that Semgrep AppSec Platform has a complete list of the findings in your codebase, regardless of when they were introduced. To run a full scan, run semgrep ci without setting the SEMGREP_BASELINE_REF environment variable. Full scans are triggered at a scheduled time, when the semgrep.yml file is edited, or manually by a user.
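A minimal shell sketch of a full scan, assuming a CI provider other than GitHub Actions or GitLab CI/CD, with the semgrep CLI installed and SEMGREP_APP_TOKEN already exported. The helper names is_full_scan and run_full_scan are hypothetical.

```shell
# Hypothetical helper: succeeds only when no baseline is set, which is the
# condition under which `semgrep ci` runs as a full scan in this setup.
is_full_scan() {
  [ -z "${SEMGREP_BASELINE_REF:-}" ]
}

# Hypothetical wrapper: clear any inherited baseline, then scan everything.
run_full_scan() {
  unset SEMGREP_BASELINE_REF
  semgrep ci
}
```

Unsetting the variable explicitly guards against a baseline leaking in from a shared pipeline environment, which would otherwise narrow a scheduled scan.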

Diff-aware scan

A diff-aware scan runs on your code before and after some "baseline" and only reports findings that are newly introduced in the commits after that baseline. Typically, Semgrep runs diff-aware scans upon the creation of a new pull request or merge request.

For example, imagine a hypothetical repository with 10 commits. You set commit number 8 as the baseline. Consequently, Semgrep only returns scan results introduced by changes in commits 9 and 10. This is how semgrep ci can run in pull requests and merge requests, since it reports only the findings that are created by those code changes.

To run a diff-aware scan, use SEMGREP_BASELINE_REF=REF semgrep ci, where REF can be a commit hash, branch name, or other Git reference. Note that SEMGREP_BASELINE_REF does not apply to GitHub Actions and GitLab CI/CD environments. In those environments, this variable cannot be set to turn a diff-aware scan into a full scan.
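The 10-commit example can be sketched in shell as follows. The helper names are hypothetical. Listing commits with git rev-list is only an illustration of which commits fall after the baseline in a linear history, not a description of how Semgrep computes the diff internally.

```shell
# Hypothetical helper: print the command line for a diff-aware scan against
# the given Git reference, without running it.
diff_scan_command() {
  printf 'SEMGREP_BASELINE_REF=%s semgrep ci\n' "$1"
}

# Hypothetical helper: list, oldest first, the commits reachable from HEAD
# but not from the baseline. In the example above, with commit 8 as the
# baseline, these are commits 9 and 10.
commits_after_baseline() {
  git rev-list --reverse "$1..HEAD"
}
```

For instance, diff_scan_command main prints the command to run in a job that should report only findings introduced since main.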

Default branch names

Branches with the following names are recognized as default branch names (also known as mainline or trunk branches). When you add a Semgrep CI job to your repository for the first time, Semgrep performs a full scan on these default branches.

Within Semgrep, default branches are also known as primary branches.

  • develop
  • development
  • main
  • master
  • trunk
  • staged
  • dev
  • production
  • prod
  • staging
  • HEAD
  • origin/stage
  • origin/master

You can also set the primary branch name yourself. This is useful for repositories whose default branch has a name that is not on this list, because it tells Semgrep which branch to prioritize and perform full scans on.
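To check whether a branch in your own pipeline would be treated as a default branch, the names listed above can be encoded as a shell pattern. The helper name is_default_branch is hypothetical, and exact, case-sensitive matching is an assumption made for this sketch.

```shell
# Hypothetical helper: succeed when the branch name is one of the default
# (primary) branch names listed above. Exact, case-sensitive matching is an
# assumption; the list itself is copied from this page.
is_default_branch() {
  case "$1" in
    develop|development|main|master|trunk|staged|dev|production|prod|staging|HEAD|origin/stage|origin/master)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

A job could use this to decide whether a branch needs its primary branch name set explicitly before relying on full scans.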

Next steps

You've set up Semgrep to scan in your repository and send findings after each scan. Your core deployment is almost complete.

Remaining steps include:

  • Optional: Customize your CI job.
  • For software composition analysis (SCA) scans using Jenkins or Maven: Set up SCA scans for your infrastructure.
  • For Jenkins users: Set up a separate CI job for diff-aware scans for feature branches (non-trunk branches) when a pull or merge request is open. This is a prerequisite to receiving PR or MR comments. See Set up diff-aware scans.
  • Set up PR or MR comments, which post findings to developers in your SCM. This involves developers in the security process as active participants. See PR or MR comments for next steps.
