Local scans with Semgrep Community Edition (CE)

Learn how to set up Semgrep CE, scan your codebase for security issues, and view your findings in the CLI.

Prerequisites

See Prerequisites to ensure that your machine meets Semgrep's requirements.

Set up Semgrep

Homebrew users: Ensure that you've added Homebrew to your PATH.
WSL users: Ensure that you have the Windows Subsystem for Linux installed before proceeding.

Install the Semgrep CLI tool and confirm the installation:

# macOS users only
brew install semgrep

# macOS, Linux, Windows, and Windows Subsystem for Linux (WSL) users
python3 -m pip install semgrep

# if you see "error: externally-managed-environment",
# see semgrep.dev/docs/kb/semgrep-appsec-platform/error-externally-managed-environment

# confirm
semgrep --version

Scan your codebase

Navigate to the root of your codebase, and run your first scan. The specific command you use depends on how you want to view the results.

To run a scan using recommended rules for your programming language, and view the results in the CLI:

semgrep scan

To export the results to a plain text file:

semgrep scan --text --text-output=semgrep.txt

To export the results to a SARIF file:

semgrep scan --sarif --sarif-output=semgrep.sarif

To export the results to a JSON file:

semgrep scan --json --json-output=semgrep.json

The JSON schema for Semgrep's CLI output can be found in semgrep/semgrep-interfaces.
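Once the JSON file exists, a short script can summarize it. The following is a minimal sketch, not part of Semgrep: `count_findings` is a hypothetical helper name, and it assumes that findings are listed under a top-level `results` array in the report. It relies on python3, which the pip-based installation already requires.

```shell
# Hypothetical helper (not a Semgrep command): print the number of findings
# recorded in a Semgrep JSON report.
# Assumption: findings are stored in a top-level "results" array.
count_findings() {
  python3 -c '
import json, sys

with open(sys.argv[1]) as report:
    data = json.load(report)
print(len(data.get("results", [])))
' "$1"
}

# Example usage, after running: semgrep scan --json --json-output=semgrep.json
#   count_findings semgrep.json
```

A report without a `results` key is counted as zero findings, which keeps the helper usable on partial output.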

The --text, --json, and --sarif flags set the primary output format, and the --output=<value> flag saves the results to a file or posts them to a URL. To produce additional output streams in the same scan, append --<format>-output=<file>:

# prints findings in SARIF format to standard output and writes them in JSON format to `findings.json`.
semgrep scan --sarif --json-output=findings.json

# prints findings as text to standard output and writes them in JSON format to `findings.json`.
semgrep scan --json-output=findings.json

# writes findings as text to `findings.txt` and in SARIF format to `findings.sarif`.
semgrep scan --output=findings.txt --sarif-output=findings.sarif

# writes text to `semgrep.txt`, JSON to `semgrep.json`, and SARIF to `semgrep.sarif`.
semgrep scan --text --output=semgrep.txt --json-output=semgrep.json --sarif-output=semgrep.sarif

Accepted values for <format>: text, json, sarif, gitlab-sast, gitlab-secrets, junit-xml, emacs, vim

Scan your codebase with a specific ruleset

You can scan your codebase using --config auto to run Semgrep with rules that apply to your programming languages and frameworks:

semgrep scan --config auto

Note: Semgrep collects pseudonymous metrics when you use rules from the Registry. You can turn this off with --metrics=off.

To scan your codebase with a specific ruleset, either one that you write or one that you obtain from the Semgrep Registry, use the --config flag.

# Scan with the JavaScript rules from Semgrep Registry
semgrep scan --config p/javascript

# Scan with the rules defined in your custom rules.yaml file
semgrep scan --config rules.yaml
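For readers who have not written a rule yet, the following sketch creates a minimal `rules.yaml` in a scratch directory. The rule ID `hypothetical-print-call`, its message, and the choice to flag Python `print(...)` calls are illustrative assumptions, not rules shipped with Semgrep.

```shell
# Create a scratch directory so the example does not overwrite existing files.
demo_dir=$(mktemp -d)

# Write a minimal custom rule. The rule ID and message are hypothetical.
cat > "$demo_dir/rules.yaml" <<'EOF'
rules:
  - id: hypothetical-print-call
    languages: [python]
    severity: WARNING
    message: Found a print() call.
    pattern: print(...)
EOF

echo "Wrote $demo_dir/rules.yaml"
# To scan the current directory with it (assumes Semgrep is installed):
#   semgrep scan --config "$demo_dir/rules.yaml"
```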

You can include as many configuration flags as necessary.

# Scan with rules defined in two separate config files
semgrep scan --config rules.yaml --config more_rules.yaml

Rules stored under a hidden directory, such as dir/.hidden/myrule.yml, are processed by Semgrep when scanning with the --config flag.

Scan with rules in a directory and all its subdirectories:

semgrep scan --config DIRECTORY_NAME

Scan with all YAML rules detected in the current working directory and all its subdirectories:

semgrep scan --config .

Test custom rules

Semgrep includes features to test the custom rules that you write:

semgrep scan --test
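Rule tests rely on annotated target files. As a sketch, assume a rule file named `rules.yaml` that defines a rule with the hypothetical ID `hypothetical-print-call`, matching `print(...)` in Python. A test target with the same base name then uses `ruleid:` and `ok:` comments to mark the lines where the rule should and should not match. The directory layout and the invocation in the final comment are assumptions to adapt to your project.

```shell
# Scratch directory for the example; in practice, use the directory that holds rules.yaml.
test_dir=$(mktemp -d)

# rules.py is the test target for a rule file named rules.yaml (assumed to exist).
# "ruleid:" marks a line the rule must flag; "ok:" marks a line it must not flag.
cat > "$test_dir/rules.py" <<'EOF'
# ruleid: hypothetical-print-call
print("debug output")

# ok: hypothetical-print-call
logger.info("structured output")
EOF

echo "Wrote $test_dir/rules.py"
# With rules.yaml placed alongside it (assumes Semgrep is installed):
#   semgrep scan --test "$test_dir"
```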

Improve performance for large codebases

You can set the number of subprocesses Semgrep uses to run checks in parallel:

semgrep scan -j NUMBER_OF_SUBPROCESSES

By default, Semgrep uses one job per core detected on the system.
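On a shared machine or CI runner, you may prefer to leave some cores free. The following sketch picks a job count of half the detected cores. The helper names `detect_cores` and `half_cores`, and the halving policy itself, are hypothetical choices rather than Semgrep recommendations.

```shell
# Print the number of logical cores, falling back to 1 if it cannot be detected.
detect_cores() {
  if command -v nproc >/dev/null 2>&1; then
    nproc
  else
    getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
  fi
}

# Hypothetical policy: use half the detected cores, but never fewer than one job.
half_cores() {
  cores=$(detect_cores)
  jobs=$((cores / 2))
  if [ "$jobs" -lt 1 ]; then
    jobs=1
  fi
  echo "$jobs"
}

# Example usage (assumes Semgrep is installed):
#   semgrep scan -j "$(half_cores)"
```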

Semgrep doesn't currently support parallelism on Windows. To run Semgrep on multiple cores, use Windows Subsystem for Linux (WSL).

Set log levels

Semgrep provides three levels of logging:

Log level	Flag	Description
Default	None	Prints scan progress, findings information, warnings, and errors.
Verbose	`-v` or `--verbose`	Includes everything printed when using the default logging level, adding a list of rules and details such as skipped files.
Debug	`--debug`	Logs the entire scan process at a high level of detail.

Example usage

To set the logging level for a scan, include the flag when scanning your project:

# run a scan and get debug logs
semgrep scan --debug

Exit codes

The semgrep scan command finishes with exit code 0 as long as the scan completes, regardless of whether there are findings. To finish with exit code 1 when there are findings, pass the --error flag.
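In a script or CI job, the exit code can gate later steps. The following sketch assumes the scan was run with --error, so that 0 means the scan completed without findings and 1 means findings were reported. `describe_scan_status` is a hypothetical helper, and it labels any other value as unexpected because this page does not describe other exit codes.

```shell
# Hypothetical helper: translate the exit code of `semgrep scan --error`
# into a short message.
describe_scan_status() {
  case "$1" in
    0) echo "no findings" ;;
    1) echo "findings reported" ;;
    *) echo "unexpected exit code $1" ;;
  esac
}

# Example usage (assumes Semgrep is installed):
#   semgrep scan --error
#   describe_scan_status "$?"
```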

