Skip to main content

Command injection prevention for Python

This is a command/code injection prevention cheat sheet by r2c. It contains code patterns of potential ways to run an OS command or arbitrary code in an application. Instead of scrutinizing code for exploitable vulnerabilities, the recommendations in this cheat sheet pave a safe road for developers that mitigates the possibility of command/code injection in your code. By following these recommendations, you can be reasonably sure your code is free of command/code injection.

Mitigation summary

In general, try not to let dynamic content into APIs intended for code / OS command execution. If this is not an option then perform proper input validation and contextually escape user data.

Check your project for these conditions:

semgrep --config p/python-command-injection

1. Running an OS command

1.A. Using subprocess module

The subprocess module allows you to start new processes, connect to their input/output/error pipes, and obtain their return codes. Methods such as Popen, run, call, check_call, check_output are intended for running commands provided as an argument ('args'). Allowing user input in a command that is passed as an argument to one of these methods can create an opportunity for a command Injection vulnerability.

Example:

import subprocess
import sys

# Vulnerable
user_input = "foo && cat /etc/passwd" # value supplied by user
subprocess.call("grep -R {} .".format(user_input), shell=True)

# Vulnerable
user_input = "cat /etc/passwd" # value supplied by user
subprocess.run(["bash", "-c", user_input], shell=True)

# Not vulnerable
user_input = "cat /etc/passwd" # value supplied by user
subprocess.Popen(['ls', '-l', user_input])

# Not vulnerable
subprocess.check_output('ls -l dir/')

References:

Mitigation

Do not let user input into subprocess methods. Alternatively,

  • Always try to use an internal Python API (if it exists) instead of running an OS command.
  • Don’t pass user-controlled input.
  • If it is not possible, use an array with a sequence of program arguments instead of a single string.
  • Use shlex.split to correctly parse a command string into an array and shlex.quote to correctly sanitize input as a command-line parameter.
  • Avoid running sh as a command with arguments. If it is not possible, strip everything except alphanumeric characters from an input provided for the command string and arguments.

Semgrep rule

python.lang.security.audit.dangerous-subprocess-use

1.B. shell=True

Functions from subprocess module have the shell argument for specifying if the command should be executed through the shell. Using shell=True is dangerous because it propagates current shell settings and variables. This means that variables, glob patterns, and other special shell features in the command string are processed before the command is run, making it much easier for a malicious actor to execute commands. The subprocess module allows you to start new processes, connect to their input/output/error pipes, and obtain their return codes. Methods such as Popen, run, call, check_call, check_output are intended for running commands provided as an argument ('args'). Allowing user input in a command that is passed as an argument to one of these methods can create an opportunity for a command injection vulnerability.

Example:

# prints home directory
subprocess.call('echo $HOME', shell=True)

# throws an error
subprocess.call('echo $HOME', shell=False)

References:

Mitigation

Avoid using shell=True. Alternatively, use shell=False instead.

Semgrep rule

python.lang.security.audit.subprocess-shell-true

1.C. Using os module to execute commands

The os module provides a portable way of using operating system dependent functionality. Methods such as system, popen and deprecated popen2, popen3 and popen4 are intended for running commands provided as a string. Letting user supplied data into a command that is passed as an argument to one of these methods can create an opportunity for a command injection vulnerability.

Example:

import os

# Vulnerable
user_input = "foo && cat /etc/passwd" # value supplied by user
os.system("grep -R {} .".format(user_input))

# Vulnerable
user_input = "foo && cat /etc/passwd" # value supplied by user
os.popen("ls -l " + user_input)

References:

Mitigation

Do not let user input into os methods. Alternatively,

  • Always try to use an internal Python API (if it exists) instead of running an OS command.
  • Consider using subprocess functions with array of program arguments.
  • Don’t pass user-controlled input.
  • If it is not possible, then don’t let running arbitrary commands. Use an allow list for inputs.

Semgrep rule

python.lang.security.audit.dangerous-system-call

1.D. Using os module to spawn a process

The os module allows executing the program path in a new process. Variations of spawn method including spawnl, spawnle, spawnlp, spawnlpe, spawnv, spawnve, spawnvp, spawnvpe, posix_spawn and posix_spawnp are intended for spawning a process with a program passed as a string argument. Allowing spawning of arbitrary programs or running shell processes with arbitrary arguments may result in a command injection vulnerability.

Example:

import os

# Vulnerable
user_input = "/foo/bar" # value supplied by user
os.spawnlpe(os.P_WAIT, user_input, ["-a"], os.environ)

# Vulnerable
user_input = "cat /etc/passwd" # value supplied by user
os.spawnve(os.P_WAIT, "/bin/bash", ["-c", user_input], os.environ)

References:

Mitigation

Do not let user input into spawn methods. Alternatively,

  • Always try to use an internal Python API (if it exists) instead of running an OS command.
  • Don’t pass user-controlled input.
  • Use shlex.split to correctly parse a command string into an array and shlex.quote to correctly sanitize input as a command-line parameter.
  • Avoid running sh as a command with arguments. If it’s not possible to avoid, strip everything except alphanumeric characters from an input provided for the command string and arguments.

Semgrep rule

python.lang.security.audit.dangerous-spawn-process.dangerous-spawn-process

1.E. Replacing currnet process with exec

Execution methods of the os module are intended to execute a new program, replacing the current process. The available methods are execl, execle, execlp, execlpe, execv, execve, execvp, and execvpe. Allowing running of arbitrary programs or running shell processes with arbitrary arguments may result in a command injection vulnerability.

Example:

import os

# Vulnerable
user_input = "/evil/code" # value supplied by user
os.execl(user_input, '/foo/bar', '--do-smth')

# Vulnerable
user_input = "cat /etc/passwd" # value supplied by user
os.execve("/bin/bash", ["/bin/bash", "-c", user_input], os.environ)

References:

Mitigation

Do not let user input into exec methods. Alternatively,

  • Always try to use internal an Python API (if it exists) instead of running an OS command.
  • Don’t pass user-controlled input.
  • Use shlex.split to correctly parse a command string into an array and shlex.quote to correctly sanitize input as a command-line parameter.
  • Avoid running sh as a command with arguments. If it’s not possible to avoid, strip everything except alphanumeric characters from an input provided for the command string and arguments.

Semgrep rule

python.lang.security.audit.dangerous-spawn-process.dangerous-spawn-process

1.F. Wildcard character in a system call that spawns a shell

Spawning a shell or executing a Unix shell command with a wildcard leads to normal shell expansion, which can have unintended consequences if there exist any non-standard file names. Consider a file named '-e sh script.sh' -- this will execute a script when 'rsync' is called.

Example:

# directory example
[root@user public]# ls -al
total 20
drwxrwxr-x. 5 user user 4096 Oct 28 17:04 .
drwx------. 22 user user 4096 Oct 28 16:15 ..
drwxrwxr-x. 2 user user 4096 Oct 28 17:04 DIR1
drwxrwxr-x. 2 user user 4096 Oct 28 17:04 DIR2
drwxrwxr-x. 2 user user 4096 Oct 28 17:04 DIR3
-rw-rw-r--. 1 user user 0 Oct 28 17:03 file1.txt
-rw-rw-r--. 1 nobody nobody 0 Oct 28 16:38 -rf


# running Python code like this will use `-rf` as an argument for rm and force delete all directories
os.system("/bin/rm *")

References:

Mitigation

Avoid wildcards in Unix shell commands.

Semgrep rule

python.lang.security.audit.system-wildcard-detected

1.G. Running shell commands asynchronously

asyncio.subprocess is an async/await API to create and manage subprocesses. Such methods as create_subprocess_shell and Event Loop's subprocess_shell are intended for running shell commands provided as an argument 'cmd'. Allowing user input into a command that is passed as an argument to one of these methods can create an opportunity for a command injection vulnerability.

Example:

import asyncio

# Vulnerable
user_input = "cat /etc/passwd" # value supplied by user
loop = asyncio.new_event_loop()
# This is similar to the standard library subprocess.Popen class called with shell=True
loop.subprocess_shell(asyncio.SubprocessProtocol, user_input)

# Vulnerable
user_input = "cat /etc/passwd" # value supplied by user
asyncio.subprocess.create_subprocess_shell(user_input)

References:

Mitigation

Do not let user input into asyncio subprocess methods. Alternatively,

  • Always try to use an internal Python API (if it exists) instead of running an OS command.
  • Consider using asyncio.subprocess functions with array of program arguments (e.g. create_subprocess_exec).
  • Don’t pass user-controlled input.
  • If it’s not possible, then don’t let running arbitrary commands. Use an allow list for inputs.

Semgrep rule

python.lang.security.audit.dangerous-asyncio-shell.dangerous-asyncio-shell

1.H. Creating subprocesses asynchronously

asyncio.subprocess also allows the creation of subprocesses asynchronously. Such methods as create_subprocess_exec and Event Loop's subprocess_exec are intended for creating a subprocess from one or more string arguments specified by args. Allowing user input into a command that is passed as an argument to one of these methods can create an opportunity for a command injection vulnerability.

Example:

import asyncio

# Vulnerable
user_input = "/evil/exe" # value supplied by user
loop = asyncio.new_event_loop()
loop.subprocess_exec(asyncio.SubprocessProtocol, [user_input, "--parameter"])

# Vulnerable
user_input = "cat /etc/passwd" # value supplied by user
asyncio.subprocess.create_subprocess_exec("bash", ["bash", "-c", user_input])

# Not vulnerable
user_input = "/evil/exe" # value supplied by user
loop = asyncio.new_event_loop()
loop.subprocess_exec(asyncio.SubprocessProtocol, ['ls', '-l', user_input])

References:

Mitigation

Do not let user input into asyncio subprocess methods. Alternatively,

  • Always try to use an internal Python API (if it exists) instead of running an OS command.
  • Don’t pass user-controlled input.
  • Use shlex.split to correctly parse a command string into an array and shlex.quote to correctly sanitize input as a command-line parameter.
  • Avoid running sh as a command with arguments. If it’s not possible to avoid, strip everything except alphanumeric characters from an input provided for the command string and arguments.

Semgrep rule

python.lang.security.audit.dangerous-asyncio-exec.dangerous-asyncio-exec
python.lang.security.audit.dangerous-asyncio-create-exec.dangerous-asyncio-create-exec

2. Executing\evaluating code

2.A. Executing code with exec

exec() function supports dynamic execution of Python code. exec() can be dangerous if used to execute dynamic content. If this content can be input from outside the program, this may be a code injection vulnerability.

Example:

# Value supplied by user
user_input = "');import requests;requests.get('localhost:3000');print('"

# Vulnerable
exec("foobar('{}')".format(user_input))

References:

Mitigation

Do not use exec() for non literal values. Alternatively,

  • Ensure executed content is not definable by external sources.
  • If it’s not possible, strip everything except alphanumeric characters from an input provided for the command string and arguments.
  • Don’t try to make exec safe with tricks like {'__builtins__':{}}

Semgrep rule

python.lang.security.audit.exec-detected.exec-detected

2.B. Evaluating code with eval

eval() function evaluates string value as a Python expression. eval() can be dangerous if used to evaluate dynamic content. If this content can be input from outside the program, this may be a code injection vulnerability.

Example:

# Value supplied by user
user_input = "__import__('code').InteractiveInterpreter().runsource('import requests;requests.get(\'localhost:3000\')')"

# Vulnerable
eval(user_input)

References:

Mitigation

Do not use eval(). Alternatively,

  • Ensure evaluated content is not definable by external sources.
  • If it’s not possible, strip everything except alphanumeric characters from an input provided for the command string and arguments.
  • Don’t try to make eval safe with tricks like {'__builtins__':{}}

Semgrep rule

python.lang.security.audit.eval-detected.eval-detected

2.C. Accepting logging configuration with logging.config.listen()

logging.config.listen() starts up a socket server on the specified port, and listens for new configurations. Because portions of the logging configuration are passed through eval(), use of this function may open its users to a security risk. While the function only binds to a socket on localhost, and so does not accept connections from remote machines, there are scenarios where untrusted code could be run under the account of the process which calls listen().

Example:

# Server example: starting up a socket server on 9999 port, and listening for new configurations.
import logging
import logging.config

logging.config.fileConfig('logging.conf')
t = logging.config.listen(9999)
t.start()


# Client example: sending configuration from `data_to_send` variable to localhost:9999
import socket, sys, struct

# Config example: print("pwned") will be evaluated and "pwned" will be printed to the console
data_to_send = """
[loggers]
keys=root

[handlers]
keys=hand01

[formatters]
keys=form01

[logger_root]
level=NOTSET
handlers=hand01

[handler_hand01]
class=StreamHandler
level=NOTSET
formatter=form01
args=(print("pwned"),)

[formatter_form01]
format=F1 %(asctime)s %(levelname)s %(message)s
datefmt=
class=logging.Formatter
"""

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('localhost', 9999))
s.send(struct.pack('>L', len(data_to_send)))
s.send(data_to_send)
s.close()

References:

Mitigation

Verify what is sent across the socket. Alternatively,

  • To avoid the risk, use the verify argument to logging.config.listen() to prevent unrecognized configurations from being applied. This could be done by encrypting and/or signing what is sent across the socket, such that the verify callable can perform signature verification and/or decryption.

Semgrep rule

python.lang.security.audit.logging.listeneval.listen-eval

2.D. Running code in interactive interpreter

The code module provides facilities to implement read-eval-print loops in Python. Two classes: InteractiveInterpreter and InteractiveConsole are used for that. Both have methods that can execute Python code: InteractiveInterpreter.runcode executes a code object and InteractiveConsole.push interprets a string as Python code. This is dangerous if external data can reach these function calls because it allows a malicious actor to run arbitrary Python code.

Example:

import code

# Value supplied by user
user_input = "print('pwned')"
console = code.InteractiveConsole()
# Vulnerable
console.push(user_input)

# Value supplied by user
user_input = "print('pwned')"
interpreter = code.InteractiveInterpreter()
# Vulnerable
interpreter.runcode(code.compile_command(user_input))

References:

Mitigation

Do not let user input in InteractiveInterpreter/InteractiveConsole methods. Alternatively,

  • Ensure evaluated content is not definable by external sources.
  • If it’s not possible, strip everything except alphanumeric characters from an input provided for the command string and arguments.

Semgrep rule

python.lang.security.audit.dangerous-code-run.dangerous-interactive-code-run

2.E. Using subinterpreter to run code

_xxsubinterpreters.run_string is an internal Python function that interprets the string as Python code. If unverified user data can reach run_string, this is a command injection vulnerability. A malicious actor can inject a malicious string to execute arbitrary Python code.

Example:

import _xxsubinterpreters

# Value supplied by user
user_input = "print('pwned')"

# Vulnerable
_xxsubinterpreters.run_string(_xxsubinterpreters.create(), user_input)

References:

Mitigation

Do not let user input in _xxsubinterpreters methods. Alternatively,

  • Ensure evaluated content is not definable by external sources.
  • If it’s not possible, strip everything except alphanumeric characters from an input provided for the command string and arguments.

Semgrep rule

python.lang.security.audit.dangerous-subinterpreters-run-string.dangerous-subinterpreters-run-string

2.F. Runing subinterpreter from regression tests package

run_in_subinterp is a funciton from regression tests package for Python that runs code in subinterpreter. This is dangerous if external data can reach this function call because it allows a malicious actor to run arbitrary Python code.

Example:

import _testcapi

# Value supplied by user
user_input = "print('pwned')"

# Vulnerable
_testcapi.run_in_subinterp(user_input)


from test import support

# Value supplied by user
user_input = "print('pwned')"

# Vulnerable
support.run_in_subinterp(user_input)

References:

Mitigation

Do not let user input in run_in_subinterp function. Alternatively,

  • Ensure evaluated content is not definable by external sources.
  • If it’s not possible, strip everything except alphanumeric characters from an input provided for the command string and arguments.

Semgrep rule

python.lang.security.audit.dangerous-testcapi-run-in-subinterp

3. Abusing built-in functions

3.A. Accessing dictionary with current global/local symbol table

globals() and locals() return a dictionary representing the current global/local symbol table. Using non-static data to retrieve values from this table is extremely dangerous because it may allow an attacker to execute arbitrary code on the system.

Example:

# Name of the arbitrary function supplied by user
user_input = "Name of the function"

# Vulnerable call of arbitrary function
function = locals().get(user_input)
function()

# Name of the arbitrary function supplied by user
user_input = "Name of the function"

# Vulnerable call of arbitrary function
function = test1.__globals__[user_input]
function()

References:

Mitigation

Do not access global/local symbol tables. Refactor your code not to use globals() and locals().

Semgrep rule

python.lang.security.dangerous-globals-use.dangerous-globals-use

3.B. Dynamically updating and accessing code annotations

Annotations passed to typing.get_type_hints() are evaluated in globals and locals namespaces. Make sure that no arbitrary value can be written as the annotation and passed to the typing.get_type_hints function.

Example:

from typing import get_type_hints

class C:
member: int = 0

def smth():
# Changing annotation for `member` property of class C
C.__annotations__["member"] = "print('pwn')"

# Annotations are evaluated and `print('pwn')` code gets executed
get_type_hints(C)

References:

Mitigation

Do not programmatically rewrite code annotations. Alternatively,

  • Ensure that annotations are not definable by external sources.

Semgrep rule

python.lang.security.audit.dangerous-annotations-usage