Generating Python manifest files for Semgrep Supply Chain scans

To correctly scan all dependencies in a project, Semgrep Supply Chain requires a Python manifest file. This article describes methods to generate the following Python manifest files or lockfiles:

  • requirements.txt, including those in a requirements folder, such as **/requirements/*.txt
  • requirements.pip
  • requirement.txt, including those in a requirement folder, such as **/requirement/*.txt
  • requirement.pip
  • Pipfile.lock
  • poetry.lock

You can use any of these files to get a successful Semgrep Supply Chain scan. To be scanned, a manifest file must have one of the names listed above, or be a .txt file inside a requirements or requirement folder in the project.
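
As a rough illustration of the naming rules above, the following Python sketch checks whether a path looks like one of the listed manifest files. The function name `looks_like_supported_manifest` is hypothetical, and the matching logic (including the case-insensitive comparison) is an assumption made for this example, not Semgrep's actual implementation.

```python
from pathlib import PurePosixPath

# File names taken from the list in this article. Comparing them
# case-insensitively is a simplification for this sketch.
EXACT_NAMES = {
    "requirements.txt",
    "requirements.pip",
    "requirement.txt",
    "requirement.pip",
    "pipfile.lock",
    "poetry.lock",
}
REQUIREMENT_DIRS = {"requirements", "requirement"}


def looks_like_supported_manifest(path: str) -> bool:
    """Return True if the path matches one of the naming patterns listed above."""
    p = PurePosixPath(path)
    if p.name.lower() in EXACT_NAMES:
        return True
    # Any .txt file directly inside a requirements/ or requirement/ folder.
    return p.suffix == ".txt" and p.parent.name in REQUIREMENT_DIRS
```

A check like this can be useful before a scan to confirm that a generated file carries a name the scan is expected to pick up.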

Generating requirements.txt

Using requirements.in

Prerequisites
  • A requirements.in file with direct Python packages. Do not include transitive packages in requirements.in.
  • pip-tools must be installed on your machine. See the pip-tools GitHub repository for installation instructions.

To generate a requirements.txt file from requirements.in, enter the following command in the root of your project directory:

pip-compile -o requirements.txt

Now, you have successfully generated a requirements.txt file with direct and transitive dependencies that Semgrep Supply Chain can scan.

Example of requirements.txt generated from requirements.in

In the Binder examples project, for instance, the requirements.in file contains the following direct dependencies:

numpy
matplotlib==3.*
seaborn==0.10.1
pandas

Executing the command pip-compile -o requirements.txt generates the following requirements.txt:

#
# This file is autogenerated by pip-compile with Python 3.10
# by the following command:
#
#    pip-compile --output-file=requirements.txt
#
contourpy==1.0.7
    # via matplotlib
cycler==0.11.0
    # via matplotlib
fonttools==4.39.4
    # via matplotlib
kiwisolver==1.4.4
    # via matplotlib
matplotlib==3.7.1
    # via
    #   -r requirements.in
    #   seaborn
numpy==1.24.3
    # via
    #   -r requirements.in
    #   contourpy
    #   matplotlib
    #   pandas
    #   scipy
    #   seaborn
packaging==23.1
    # via matplotlib
pandas==2.0.2
    # via
    #   -r requirements.in
    #   seaborn
pillow==9.5.0
    # via matplotlib
pyparsing==3.0.9
    # via matplotlib
python-dateutil==2.8.2
    # via
    #   matplotlib
    #   pandas
pytz==2023.3
    # via pandas
scipy==1.10.1
    # via seaborn
seaborn==0.10.1
    # via -r requirements.in
six==1.16.0
    # via python-dateutil
tzdata==2023.3
    # via pandas

This file has all direct and transitive dependencies of the example project and can be used by Semgrep as an entry point for the Supply Chain scan.
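
To see which pinned packages are direct and which are transitive, you can read the "# via" annotations that pip-compile writes. The following is a minimal sketch, assuming the file contains only `name==version` lines and pip-compile's default annotations; the function name `split_direct_and_transitive` is hypothetical and not part of pip-tools or Semgrep.

```python
def split_direct_and_transitive(requirements_text: str, source: str = "requirements.in"):
    """Split pinned packages into direct and transitive ones using '# via' comments."""
    direct, transitive = set(), set()
    current = None  # package whose annotations are currently being read
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # A '-r requirements.in' annotation marks the current package as direct.
            if current and f"-r {source}" in line:
                direct.add(current)
                transitive.discard(current)
            continue
        if "==" not in line:
            # Options or unpinned lines are ignored in this sketch.
            current = None
            continue
        # A pinned requirement such as 'numpy==1.24.3'.
        current = line.split("==", 1)[0].strip()
        if current not in direct:
            transitive.add(current)
    return sorted(direct), sorted(transitive)
```

Applied to the example output above, packages annotated with `-r requirements.in` (such as seaborn) would be reported as direct, and packages such as six as transitive.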

Using pip freeze

Prerequisites
  • An isolated or virtual environment. The pip freeze utility generates requirements.txt from the packages already installed in your current environment, so running it outside an isolated environment also captures packages that are unrelated to your project.
  • An existing setup.py file.

To generate requirements.txt through pip freeze, enter the following commands:

pip3 install .
pip freeze --all > requirements.txt
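
Because pip freeze lists whatever is installed in the active environment, it can help to confirm that a virtual environment is active first. The following is a minimal sketch of one common check; the function name `in_virtual_environment` is hypothetical, and the check covers environments created with venv or virtualenv but may not detect other isolation tools such as conda.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter runs inside a venv or virtualenv."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)
```

If the function returns False, create and activate a virtual environment before running the commands above.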

Example CI configuration

The following GitHub Actions workflow shows how to generate requirements.txt in a CI environment, using the pip-compile method described earlier.

In the following example there are two jobs:

  • my_first_job: Generating requirements.txt and uploading it as an artifact
  • my_second_job: Downloading the artifact and scanning it with Semgrep

on:
  pull_request: {}
  workflow_dispatch: {}
  push:
    branches:
      - master
    paths:
      - .github/workflows/semgrep.yml
  schedule:
    - cron: '0 1 * * 0'
name: Semgrep
jobs:
  my_first_job:
    name: requirementsGeneration
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Generate requirements txt
        run: |
          pip3 install pip-tools
          pip-compile -o requirements.txt
      - name: Upload Requirements File as Artifact
        uses: actions/upload-artifact@v4
        with:
          name: requirementstxt
          path: requirements.txt
  my_second_job:
    needs: my_first_job
    name: Scan
    runs-on: ubuntu-20.04
    env:
      SEMGREP_APP_TOKEN: ${{ secrets.SEMGREP_APP_TOKEN }}
    container:
      image: semgrep/semgrep
    steps:
      - uses: actions/checkout@v4
      - name: Download artifact from previous job
        uses: actions/download-artifact@v4
        with:
          name: requirementstxt
      - run: semgrep ci --supply-chain

Generating Pipfile.lock

Prerequisite

An existing Pipfile. Depending on your development environment, a Pipfile may already be automatically generated for you.

Example of Pipfile

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
flasgger = "==0.9.5"
flask = "==2.2.2"
flask-cors = "==3.0.10"
marshmallow = "==3.18.0"
requests = "==2.25.1"
sqlalchemy = "==1.4.41"
waitress = "==2.1.2"
psycopg2 = "==2.9.5"
defusedxml = "==0.7.1"

[dev-packages]

[requires]
python_version = "3.9"

Generating a Pipfile.lock

Generate a Pipfile.lock with the following commands:

pip install pipenv --user
pipenv lock

The newly generated Pipfile.lock is a JSON file that lists all Python dependencies (direct and transitive) along with their SHA-256 hashes.

The beginning of the file may look something like this:

{
    "_meta": {
        "hash": {
            "sha256": "af0d5c3f87bd23f340a214b12ad766ca83aead0c462aa08dbc4f012ac2796708"
        },
        "pipfile-spec": 6,
        "requires": {
            "python_version": "3.9"
        },
        "sources": [
            {
                "name": "pypi",
                "url": "https://pypi.org/simple",
                "verify_ssl": true
            }
        ]
    },
    "default": {
        "attrs": {
            "hashes": [
                "sha256:1f28b4522cdc2fb4256ac1a020c78acf9cba2c6b461ccd2c126f3aa8e8335d04",
                "sha256:6279836d581513a26f1bf235f9acd333bc9115683f14f7e8fae46c98fc50e015"
            ],
            "markers": "python_version >= '3.7'",
            "version": "==23.1.0"
        },
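
Since Pipfile.lock is plain JSON, you can inspect which versions it pins before scanning. The following is a minimal sketch, assuming the layout shown in the excerpt above, with packages under "default" and, for development dependencies, "develop". The function name `pinned_packages` is hypothetical.

```python
import json


def pinned_packages(pipfile_lock_text: str) -> dict[str, str]:
    """Map each package name to its pinned version from a Pipfile.lock document."""
    lock = json.loads(pipfile_lock_text)
    pins = {}
    for section in ("default", "develop"):
        for name, entry in lock.get(section, {}).items():
            # Some entries (for example VCS or local path installs) may lack "version".
            version = entry.get("version")
            if version:
                pins[name] = version.removeprefix("==")
    return pins
```

For the excerpt above, the result would include attrs mapped to 23.1.0.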

Generating poetry.lock

Poetry is a tool for dependency management and packaging in Python.

Prerequisite

A pyproject.toml file.

Example pyproject.toml

[build-system]
requires = ["poetry-core>=1.1.0"]
build-backend = "poetry.core.masonry.api"

[tool.poetry]
name = "example-project"
version = "1.0.0"
description = "An example project"
authors = ["Your Name <yourname@example.com>"]

[tool.poetry.dependencies]
python = "^3.9"
requests = "^2.25.1"
numpy = "^1.21.0"

[tool.poetry.dev-dependencies]
pytest = "^6.2.4"
flake8 = "^3.9.2"

Generating a poetry.lock

Generate a poetry.lock file with the following command:

poetry lock

The generated poetry.lock file contains all direct and transitive dependencies that the project uses.

Selecting a single manifest file among many

While there may already be a manifest file in the repository, such as a Pipfile.lock, you may want to generate a new one, for example a requirements.txt, to be sure it has the latest dependencies.

When scanning with Semgrep Supply Chain, you can use the flag --include to specify that only a single manifest file should be scanned. The manifest file must still have one of the supported names.

semgrep ci --supply-chain --include=requirements.txt

Conclusions

There are several ways to generate manifest files or lockfiles for Python dependencies, and you can choose whichever fits your project's tooling. Keep in mind that the manifest file should be generated before the Semgrep scan and within the proper environment. This ensures that you are scanning only the dependencies of your project and not all the Python dependencies of your system.

