Customize your CI job
- You have the resource access and permissions required for deployment.
- You have created a Semgrep account and organization.
- For GitHub and GitLab users: You have connected your source code manager.
- Optionally, you have set up SSO.
- You have successfully added a Semgrep job to your CI workflow.
Customize your CI job to achieve the following goals:
- Run Semgrep on a schedule. Run full scans on main or trunk branches at the time that is least disruptive to developer teams.
- Run Semgrep when an event triggers. Run Semgrep when a pull or merge request (PR or MR) is created.
- Set a timeout to increase or decrease Semgrep's overall runtime. If scans are taking too long, or rules aren't running, customize your per-rule timeout.
Set up diff-aware scans
Follow the steps in this section only for the following CI providers:
- Jenkins
- CI providers without guidance from Semgrep AppSec Platform
Semgrep scans can be classified by scope. The scope of a scan refers to what lines of code are scanned in a codebase. When classifying scans by scope, there are two types of scans:
- Full scan
A full scan runs on your entire codebase and reports every finding in the codebase. It is recommended to perform a full scan of your default branch, such as main or master, at a regular cadence, such as every night or every week. This ensures that Semgrep AppSec Platform has a full list of all findings in your code base, regardless of when they were introduced. To run a full scan, run semgrep ci without setting the SEMGREP_BASELINE_REF environment variable. Full scans are triggered at a scheduled time, when the semgrep.yml file is edited, or manually by a user.
- Diff-aware scan
A diff-aware scan runs on your code before and after some "baseline" and only reports findings that are newly introduced in the commits after that baseline. Diff-aware scans are triggered upon creation of a new pull request or merge request.
For example, imagine a hypothetical repository with 10 commits. You set commit number 8 as the baseline. Consequently, Semgrep only returns scan results introduced by changes in commits 9 and 10. This is how semgrep ci can run in pull requests and merge requests, since it reports only the findings that are created by those code changes. To run a diff-aware scan, use SEMGREP_BASELINE_REF=REF semgrep ci, where REF can be a commit hash, branch name, or other Git reference.
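To make the 10-commit example concrete, the following shell sketch builds a throwaway Git repository and lists the commits that come after a baseline. It only illustrates which commits a baseline separates; it does not run Semgrep or reproduce how Semgrep computes findings. The function name demo_baseline_range and the commit messages are made up for illustration.

```shell
# Illustration only: show which commits fall after a baseline in a
# temporary repository. Nothing here calls Semgrep.
demo_baseline_range() {
  repo=$(mktemp -d) || return 1
  git -C "$repo" init -q

  # Create 10 empty commits named "commit 1" through "commit 10".
  i=1
  while [ "$i" -le 10 ]; do
    git -C "$repo" \
      -c user.name=demo -c user.email=demo@example.com \
      -c commit.gpgsign=false \
      commit -q --allow-empty -m "commit $i"
    i=$((i + 1))
  done

  # HEAD is commit 10, so HEAD~2 is commit 8: the baseline.
  baseline=$(git -C "$repo" rev-parse HEAD~2)

  # List the commits reachable from HEAD but not from the baseline.
  git -C "$repo" log --format=%s "$baseline..HEAD"

  rm -rf "$repo"
}
```

Calling demo_baseline_range lists only the two commits made after the baseline, newest first. A diff-aware scan with commit 8 as the baseline would report findings introduced by those same two commits.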
To configure a diff-aware scan:
- Create a separate CI job following the steps in Add Semgrep to CI through Semgrep AppSec Platform.
- Set the SEMGREP_BASELINE_REF variable in your CI configuration file. The value of this environment variable is typically your trunk branch, such as main or master.
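The two steps above might look like the following in a job's shell step. This is a sketch under assumptions: the function run_semgrep_scan and its arguments are hypothetical, and how you detect a pull or merge request depends on your CI provider's own variables, which are not shown here. Only semgrep ci and SEMGREP_BASELINE_REF come from this document.

```shell
# Hypothetical helper for a CI shell step.
# $1: "pr" for a pull or merge request job, anything else for a full scan.
# $2: baseline Git reference (assumed default for this sketch: main).
run_semgrep_scan() {
  scan_kind=$1
  baseline_ref=${2:-main}

  if [ "$scan_kind" = "pr" ]; then
    # Diff-aware scan: report only findings introduced after the baseline.
    SEMGREP_BASELINE_REF="$baseline_ref" semgrep ci
  else
    # Full scan: run in a subshell with the baseline unset, so a value
    # left in the environment cannot turn this into a diff-aware scan.
    ( unset SEMGREP_BASELINE_REF; semgrep ci )
  fi
}

# Example calls (commented out so that sourcing this file runs nothing):
# run_semgrep_scan pr main
# run_semgrep_scan full
```

Keeping the two scan types behind one function is only a convenience for this example. Separate CI jobs, as the steps above describe, achieve the same result.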
Set a scan schedule
The following table is a summary of methods and resources to set up schedules for different CI providers.
CI provider | Where to set schedule |
---|---|
GitHub Actions | See Sample CI configs for information on how to modify your semgrep.yml file |
GitLab CI/CD | Refer to GitLab documentation |
Jenkins | Refer to Jenkins documentation |
Bitbucket Pipelines | Refer to Bitbucket documentation |
CircleCI | Refer to CircleCI documentation |
Buildkite | Refer to Buildkite documentation |
Azure Pipelines | Refer to Azure documentation |
Set a custom timeout
By default, Semgrep allows each rule 5 seconds to run. To set a custom timeout for the Semgrep job, set the SEMGREP_TIMEOUT environment variable to a value in seconds. Decreasing this value speeds up your scans, but rules that exceed the limit may be skipped. Conversely, increasing this value gives your most complex rules more time to finish running. For example:
SEMGREP_TIMEOUT="3" # Sets the per-rule timeout to 3 seconds.
Setting this variable to 0 removes the time limit, meaning that rules can take any amount of time to run.
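As a sketch of how the variable might be set in a job's shell step, the helper below checks the value before passing it to the scan. The function name run_with_rule_timeout is hypothetical, and restricting the value to whole numbers is an assumption of this sketch rather than a documented Semgrep requirement. SEMGREP_TIMEOUT and semgrep ci are the only names taken from this document.

```shell
# Hypothetical helper: run semgrep ci with a per-rule timeout in seconds.
run_with_rule_timeout() {
  timeout_seconds=$1

  # This sketch accepts only non-negative whole numbers.
  # A value of 0 removes the time limit.
  case $timeout_seconds in
    ''|*[!0-9]*)
      echo "timeout must be a non-negative integer, got: '$timeout_seconds'" >&2
      return 2
      ;;
  esac

  # Set the variable only for this one semgrep invocation.
  SEMGREP_TIMEOUT="$timeout_seconds" semgrep ci
}

# Example call (commented out so that sourcing this file runs nothing):
# run_with_rule_timeout 10
```

Setting the variable on the command line rather than exporting it keeps the timeout from affecting later steps in the same job.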