Add Semgrep to CI
- You have the resource access and permissions required for deployment.
- You have created a Semgrep account and organization.
- For GitHub and GitLab users: You have connected your source code manager.
- Optionally, you have set up SSO.
Semgrep is integrated into CI environments by creating a job that is run by the CI provider. After a scan, findings are sent to Semgrep AppSec Platform for triage and remediation.
By integrating Semgrep into your CI environment, your development cycle benefits from the automated scanning of repositories at various events, such as:
- Push events
- Pull or merge requests (PRs or MRs)
- User-initiated events (such as GitHub Actions' workflow_dispatch)
Guided setup for CI providers in Semgrep AppSec Platform
This guide walks you through creating a Semgrep job in the following CI providers, which are explicitly supported in Semgrep AppSec Platform:
- GitHub Actions
- GitLab CI/CD
- Jenkins
- Bitbucket
- CircleCI
- Buildkite
- Azure Pipelines
Figure. Semgrep AppSec Platform provides steps and configuration files to easily set up a Semgrep job for popular CI providers.
If your provider is not on this list, you can still integrate Semgrep into your CI workflows by following the steps in Add Semgrep to other CI providers.
Projects
Adding a Semgrep job to your CI provider also adds the repository's records, including findings, as a project in Semgrep AppSec Platform. Each Project can be individually configured to send notifications or tickets.
Figure. Semgrep Projects page. This displays all the repositories you have successfully added a Semgrep job to.
Add Semgrep to CI
To add a CI job to GitHub Actions:
- Ensure you are signed in to Semgrep AppSec Platform.
- Click Projects on the left sidebar.
- Click Scan new project > CI/CD.
- Click GitHub Actions.
- A list of repositories appears. Select all the repositories you want to add a Semgrep job to.
- If you do not see the repository you want to add, adjust the GitHub Application's Repository Access configuration. See Detecting GitHub repositories for more information.
- Click Add CI job. You are taken to the Add CI job page.
- Optional: Click Review CI config to see Semgrep's default YAML configuration file.
- Click Commit file.
You have now added a Semgrep job to GitHub Actions. A full scan begins automatically after adding a new repository. Its findings are sent to Semgrep AppSec Platform for triage and remediation.
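For orientation, the workflow file committed in the steps above resembles the sketch below, which writes a minimal .github/workflows/semgrep.yml covering push, pull request, scheduled, and manual (workflow_dispatch) triggers. The branch name, cron schedule, and checkout action version are assumptions for illustration. The file that Semgrep AppSec Platform generates for your repository may differ, so treat Review CI config as the authoritative version.

```shell
#!/bin/sh
# Sketch only: writes an illustrative GitHub Actions workflow that runs `semgrep ci`.
# Assumptions: the default branch is `main`, the cron time is arbitrary, and
# SEMGREP_APP_TOKEN already exists as a repository secret.
mkdir -p .github/workflows

# The quoted 'EOF' keeps the shell from expanding ${{ ... }} inside the YAML.
cat > .github/workflows/semgrep.yml <<'EOF'
name: Semgrep
on:
  workflow_dispatch: {}        # user-initiated runs
  pull_request: {}             # pull request events
  push:
    branches:
      - main                   # push events on the default branch
  schedule:
    - cron: '20 17 * * *'      # scheduled run (time is arbitrary)
jobs:
  semgrep:
    name: semgrep/ci
    runs-on: ubuntu-latest
    container:
      image: semgrep/semgrep
    env:
      SEMGREP_APP_TOKEN: ${{ secrets.SEMGREP_APP_TOKEN }}
    steps:
      - uses: actions/checkout@v4
      - run: semgrep ci
EOF
```

Each entry under on: corresponds to one of the event types listed at the top of this guide, so you can remove a trigger if you do not want scans on that event.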
Detecting GitHub repositories
If you do not see your GitHub repositories in Semgrep AppSec Platform, complete the following steps to ensure that Semgrep AppSec Platform detects them:
- Log in to GitHub.
- Perform one of the following steps:
- For repositories in personal accounts: Click your profile photo > Settings > Applications.
- For repositories in org accounts: Click your profile photo > Your organizations > NAME_OF_ORG > Settings > GitHub Apps.
- On the semgrep-app entry, click Configure.
- Under Repository access, select an option to provide access:
- All repositories will display all current and future public and private repositories.
- Only select repositories will display explicitly selected repositories.
To add a Semgrep job to your CI provider:
- Ensure you are signed in to Semgrep AppSec Platform.
- Click Projects on the left sidebar.
- Click Scan new project > CI/CD.
- Click the name of the CI provider you use. You are taken to the Add job page.
- Follow the steps provided on the page. The process varies depending on your CI provider, but generally includes the following steps:
- Click Create new token to create a SEMGREP_APP_TOKEN, which is used when sending results to Semgrep AppSec Platform.
- Copy the SEMGREP_APP_TOKEN and its value. Store it as an environment variable or secret in your CI provider.
- Optional: Click Review CI config to see Semgrep's default YAML configuration file for your CI provider.
- Click Copy snippet and paste it into your CI provider's configuration file (the filename is typically indicated in the page). Depending on your CI provider, you may have to create a custom configuration file or use an existing one.
- Commit the configuration file to your repository.
- Return to Semgrep AppSec Platform and click Check connection.
You have now added a Semgrep job to your CI provider; this starts your first full scan. Its findings are sent to Semgrep AppSec Platform for triage and remediation.
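As a rough illustration of the steps above, the body of a job in a generic CI provider often reduces to a shell step like the one below. It assumes that semgrep is already installed in the build environment (for example, through the semgrep/semgrep container image) and that SEMGREP_APP_TOKEN was stored as a secret and exposed to the job as an environment variable. The name run_semgrep_job is a hypothetical helper, not part of Semgrep. The snippet that Semgrep AppSec Platform provides for your CI provider remains the reference.

```shell
#!/bin/sh
# Hypothetical helper for a generic CI step; not part of Semgrep itself.
run_semgrep_job() {
  # Fail early with a clear message if the secret was not exposed to the job.
  if [ -z "${SEMGREP_APP_TOKEN:-}" ]; then
    echo "SEMGREP_APP_TOKEN is not set; store it as a CI secret first." >&2
    return 1
  fi
  # Scan the checked-out repository and send findings to Semgrep AppSec Platform.
  semgrep ci
}

# In the CI job, call the helper from the repository root:
# run_semgrep_job
```

Checking for the token before the scan makes a misconfigured secret show up as an explicit message in the job log.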
You can edit your configuration files to send findings to GitHub Advanced Security Dashboard (GHAS) and GitLab SAST Dashboard. Refer to the following samples:
Sample CI configuration snippets
Refer to the following table for links to sample CI configuration snippets:
In-app CI provider | Sample CI configuration snippet |
---|---|
GitHub Actions | semgrep.yml |
GitLab CI/CD | .gitlab-ci.yml |
Jenkins | Jenkinsfile |
Bitbucket Pipelines | bitbucket-pipelines.yml |
CircleCI | config.yml |
Buildkite | pipelines.yml |
Azure Pipelines | azure-pipelines.yml |
Data collected by Semgrep
When running in CI, Semgrep runs fully in the CI build environment. Unless you have explicitly granted code access to Semgrep, your code is not sent anywhere.
- Semgrep collects findings data, which includes the line number of the code match, but not the code. It is hashed using a one-way hashing function.
- Findings data is used to generate line-specific hyperlinks to your source code management system and support other Semgrep functions.
Delete a project
Deleting a project removes all of its findings, metadata, and other records from Semgrep AppSec Platform.
- In Semgrep AppSec Platform, click Projects.
- Search for your repository's name.
- Click the windows icon to access the settings page for that project.
- Click the three-dot (...) button at the header and click Delete project.
It can take up to a day (24 hours) for the Dashboard to correctly update and remove findings associated with a recently deleted project.
Scan scope
Semgrep scans can be classified by scope. The scope of a scan refers to what lines of code are scanned in a codebase. When classifying scans by scope, there are two types of scans:
- Full scan
A full scan runs on your entire codebase and reports every finding in the codebase. It is recommended to perform a full scan of your default branch, such as main or master, at a regular cadence, such as every night or every week. This ensures that Semgrep AppSec Platform has a full list of all findings in your code base, regardless of when they were introduced. To run a full scan, run semgrep ci without setting the SEMGREP_BASELINE_REF environment variable. Full scans are triggered at a scheduled time, when the semgrep.yml file is edited, or manually by a user.
- Diff-aware scan
A diff-aware scan runs on your code before and after some "baseline" and only reports findings that are newly introduced in the commits after that baseline. Typically, Semgrep runs diff-aware scans upon the creation of a new pull request or merge request.
For example, imagine a hypothetical repository with 10 commits. You set commit number 8 as the baseline. Consequently, Semgrep only returns scan results introduced by changes in commits 9 and 10. This is how semgrep ci can run in pull requests and merge requests, since it reports only the findings that are created by those code changes.
To run a diff-aware scan, use SEMGREP_BASELINE_REF=REF semgrep ci, where REF can be a commit hash, branch name, or other Git reference. Note that SEMGREP_BASELINE_REF does not apply to GitHub Actions and GitLab CI/CD environments. This variable cannot be set to turn a diff-aware scan in GitHub Actions or GitLab CI/CD into a full scan.
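For CI providers other than GitHub Actions and GitLab CI/CD, the two scan scopes described above can be sketched as a single wrapper. The function name run_scan and the SEMGREP_BIN override are hypothetical names introduced here for illustration (the override only exists so the command can be swapped for a stub in a dry run). Only semgrep ci and SEMGREP_BASELINE_REF come from this guide.

```shell
#!/bin/sh
# Hypothetical wrapper: pass a Git ref for a diff-aware scan, or no argument for a full scan.
# SEMGREP_BIN is an illustrative override, not a Semgrep setting.
run_scan() {
  baseline_ref="${1:-}"
  semgrep_bin="${SEMGREP_BIN:-semgrep}"
  if [ -n "$baseline_ref" ]; then
    # Diff-aware scan: report only findings introduced after the baseline ref.
    SEMGREP_BASELINE_REF="$baseline_ref" "$semgrep_bin" ci
  else
    # Full scan: run in a subshell with the variable unset so that an
    # inherited SEMGREP_BASELINE_REF cannot narrow the scan.
    (
      unset SEMGREP_BASELINE_REF
      "$semgrep_bin" ci
    )
  fi
}

# Usage (REF is a commit hash, branch name, or other Git reference):
# run_scan origin/main   # diff-aware scan against origin/main
# run_scan               # full scan
```

Unsetting the variable in the full-scan branch is a defensive choice: some CI environments export variables globally, and a leftover baseline would otherwise turn the scheduled full scan into a diff-aware one.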
Default branch names
Branches with the following names are recognized as default branch names (also known as mainline or trunk branches). When you add a Semgrep CI job to your repository for the first time, Semgrep performs a full scan on these default branches.
Within Semgrep, default branches are also known as primary branches.
develop
development
main
master
trunk
staged
dev
production
prod
staging
HEAD
origin/stage
origin/master
You can also set the primary branch name. This is useful for repositories whose default branch has a unique name, because it tells Semgrep which branch to prioritize and perform full scans on.
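To check ahead of time whether a repository's default branch falls under the recognized names listed above, a small helper such as the following can be used. It is a sketch that assumes exact, case-sensitive matching against that list. Semgrep's actual matching logic is not described here and may differ, and is_default_branch_name is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given name is one of the recognized
# default branch names listed in this guide (exact match is an assumption).
is_default_branch_name() {
  case "$1" in
    develop|development|main|master|trunk|staged|dev|production|prod|staging|HEAD|origin/stage|origin/master)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Usage:
# if is_default_branch_name "release-trunk"; then
#   echo "recognized as a default branch name"
# else
#   echo "not recognized; consider setting the primary branch name"
# fi
```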
Next steps
You've set up Semgrep to scan in your repository and send findings after each scan. Your core deployment is almost complete.
Remaining steps include:
- Optional: Customize your CI job.
- For software composition analysis (SCA) scans using Jenkins or Maven: Set up SCA scans for your infrastructure.
- For Jenkins users: Set up a separate CI job for diff-aware scans for feature branches (non-trunk branches) when a pull or merge request is open. This is a prerequisite to receiving PR or MR comments. See Set up diff-aware scans.
- Set up PR or MR comments, which post findings to developers in your SCM. This involves developers in the security process as active participants. See PR or MR comments for next steps.