    Code injection prevention for Python

    This is a code injection prevention cheat sheet by Semgrep, Inc. It lists code patterns that can allow arbitrary code to run in an application. Rather than asking you to scrutinize code for exploitable vulnerabilities, the recommendations in this cheat sheet pave a safe road that reduces the possibility of code injection. By following these recommendations, you can be reasonably sure your code is free of code injection.

    Check your project using Semgrep

    The following command runs an optimized set of rules for your project:

    semgrep --config p/default

    1. Executing or evaluating code

    1.A. Executing code with exec

    The exec() function supports the dynamic execution of Python code. It is dangerous when used to execute dynamic (non-literal) content: if any part of that content is controllable by a user, the result is a code injection vulnerability.

    Example:

    # Value supplied by user
    user_input = "');import requests;requests.get('http://localhost:3000');print('"

    # Vulnerable
    exec("foobar('{}')".format(user_input))

    Mitigation

    Do not use exec() for non-literal values. Alternatively:

    • Ensure executed content is not controllable by external sources.
    • If it's not possible, strip everything except alphanumeric characters from the input.
    • Don't try to make exec safe with tricks such as {'__builtins__':{}}.
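
The first two recommendations can be sketched as follows. This is a minimal illustration, not a complete defense: `foobar` is a hypothetical stand-in for the function named in the example above, and `keep_alphanumeric` is an assumed helper name. The preferred fix is to pass the value as ordinary data; the alphanumeric filter is the fallback for cases where exec() cannot be removed.

```python
import re


def keep_alphanumeric(value: str) -> str:
    """Drop every character that is not an ASCII letter or digit."""
    return re.sub(r"[^A-Za-z0-9]", "", value)


def foobar(arg: str) -> str:
    # Hypothetical stand-in for the function called in the vulnerable example.
    return "foobar received: " + arg


# Value supplied by user
user_input = "');import requests;requests.get('http://localhost:3000');print('"

# Preferred: pass the value as data instead of splicing it into source code.
result = foobar(user_input)

# Fallback: reduce the input to alphanumeric characters before any use in exec().
cleaned = keep_alphanumeric(user_input)
```

After filtering, the quotes, parentheses, and semicolons that the payload relies on are gone, so it can no longer break out of the string literal.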

    Semgrep rule

    python.lang.security.audit.exec-detected.exec-detected

    1.B. Evaluating code with eval

    The eval() function evaluates a string as a Python expression. It is dangerous when used to evaluate dynamic (non-literal) content: if any part of that content is controllable by a user, the result is a code injection vulnerability.

    Example:

    # Value supplied by user
    user_input = "__import__('code').InteractiveInterpreter().runsource('import requests;requests.get(\"http://localhost:3000\")')"

    # Vulnerable
    eval(user_input)

    Mitigation

    Do not use eval(). Alternatively:

    • If you need to use eval() with non-literal values, ensure that executed content is not controllable by external sources.
    • If it's not possible, strip everything except alphanumeric characters from the input.
    • Don't try to make eval safe with tricks such as {'__builtins__':{}}.
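
When eval() is only being used to turn text into a data value, one common substitute (not named in this cheat sheet, so treat it as a suggestion) is the standard library's ast.literal_eval(), which accepts Python literals such as strings, numbers, tuples, lists, dicts, sets, booleans, and None, and rejects function calls and attribute access. A minimal sketch, where `parse_literal` is an assumed helper name:

```python
import ast


def parse_literal(text: str):
    """Return the Python literal described by text, or raise ValueError."""
    try:
        return ast.literal_eval(text)
    except (ValueError, TypeError, SyntaxError, MemoryError, RecursionError) as exc:
        # Anything that is not a plain literal ends up here.
        raise ValueError("input is not a plain Python literal") from exc


# A literal is parsed into a normal Python object.
settings = parse_literal("{'retries': 3, 'hosts': ['a', 'b']}")

# An expression that calls a function is rejected instead of executed.
try:
    parse_literal("__import__('os').system('id')")
    rejected = False
except ValueError:
    rejected = True
```

Note that ast.literal_eval() can still consume a lot of memory or CPU on very large or deeply nested input, so limiting the input size remains a good idea.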

    Semgrep rule

    python.lang.security.audit.eval-detected.eval-detected

    1.C. Accepting logging configuration with logging.config.listen()

    The logging.config.listen() function starts a socket server on the specified port and listens for new configurations. Because portions of the received configuration are passed through eval(), using this function is a security risk. Although the function binds only to a socket on localhost, and so does not accept connections from remote machines, there are scenarios where untrusted code can run under the account of the process that calls listen().

    Example:

    # Server example: starting up a socket server on 9999 port, and listening for new configurations.
    import logging
    import logging.config

    logging.config.fileConfig('logging.conf')
    t = logging.config.listen(9999)
    t.start()


    # Client example: sending configuration from `data_to_send` variable to localhost:9999
    import socket
    import struct

    # Config example: print("pwned") is evaluated and "pwned" is printed to the console
    data_to_send = """
    [loggers]
    keys=root

    [handlers]
    keys=hand01

    [formatters]
    keys=form01

    [logger_root]
    level=NOTSET
    handlers=hand01

    [handler_hand01]
    class=StreamHandler
    level=NOTSET
    formatter=form01
    args=(print("pwned"),)

    [formatter_form01]
    format=F1 %(asctime)s %(levelname)s %(message)s
    datefmt=
    class=logging.Formatter
    """

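    # Sockets carry bytes, so encode the configuration before sending it
    payload = data_to_send.encode('utf-8')

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(('localhost', 9999))
    # The server expects a 4-byte big-endian length prefix, then the configuration
    s.sendall(struct.pack('>L', len(payload)))
    s.sendall(payload)
    s.close()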

    Mitigation

    • Verify what is sent across the socket before it is applied.
    • To do so, pass the verify argument to logging.config.listen() to prevent unrecognized configurations from being applied. For example, encrypt or sign what is sent across the socket so that the verify callable can perform decryption or signature verification.
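
A signature-checking verify callable might look like the sketch below. The wire layout (a 32-byte HMAC-SHA256 tag followed by the configuration), the names `sign_config` and `verify_config`, and the `SHARED_KEY` value are all assumptions made for illustration. The only part taken from the logging API is the contract of the verify callable: it receives the bytes read from the socket and returns either the bytes to process or None to discard them.

```python
import hashlib
import hmac

# Hypothetical shared secret; in practice, load it from a secret store.
SHARED_KEY = b"replace-with-a-real-secret"
DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def sign_config(config: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Client side: prepend an HMAC-SHA256 tag to the configuration."""
    tag = hmac.new(key, config, hashlib.sha256).digest()
    return tag + config


def verify_config(data: bytes, key: bytes = SHARED_KEY):
    """Server side: return the configuration if the tag matches, else None."""
    if len(data) < DIGEST_SIZE:
        return None
    tag, config = data[:DIGEST_SIZE], data[DIGEST_SIZE:]
    expected = hmac.new(key, config, hashlib.sha256).digest()
    # compare_digest avoids leaking the match position through timing.
    if hmac.compare_digest(tag, expected):
        return config
    return None


# Server usage (not started here):
#   t = logging.config.listen(9999, verify=verify_config)
```

With this in place, the client sends `sign_config(payload)` instead of the raw configuration, and the length prefix must cover the tag as well.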

    Semgrep rule

    python.lang.security.audit.logging.listeneval.listen-eval

    1.D. Running code in an interactive interpreter

    The code module provides read-eval-print loops in Python. It includes two classes for building interactive prompts, InteractiveInterpreter and InteractiveConsole. Both classes can execute Python code: InteractiveInterpreter.runcode executes a code object, and InteractiveConsole.push interprets a string as Python code. This is dangerous if external data reaches these calls, because it allows a malicious actor to run arbitrary Python code.

    Example:

    import code

    # Value supplied by user
    user_input = "print('pwned')"
    console = code.InteractiveConsole()
    # Vulnerable
    console.push(user_input)

    # Value supplied by user
    user_input = "print('pwned')"
    interpreter = code.InteractiveInterpreter()
    # Vulnerable
    interpreter.runcode(code.compile_command(user_input))

    Mitigation

    Do not pass user input to InteractiveInterpreter or InteractiveConsole methods. Alternatively:

    • Ensure that content that Python interprets is not controllable by external sources.
    • If it's not possible, strip everything except alphanumeric characters from the input.

    Semgrep rule

    python.lang.security.audit.dangerous-code-run.dangerous-interactive-code-run

    1.E. Using subinterpreter to run code

    The _xxsubinterpreters.run_string function is an internal Python function that interprets a string as Python code. It causes a code injection vulnerability when unverified user data reaches run_string, because a malicious actor can supply a string that executes arbitrary Python code.

    Example:

    import _xxsubinterpreters

    # Value supplied by user
    user_input = "print('pwned')"

    # Vulnerable
    _xxsubinterpreters.run_string(_xxsubinterpreters.create(), user_input)

    Mitigation

    Do not pass user input to _xxsubinterpreters functions. Alternatively:

    • Ensure that content that Python interprets is not controllable by external sources.
    • If it’s not possible, strip everything except alphanumeric characters from the input.

    Semgrep rule

    python.lang.security.audit.dangerous-subinterpreters-run-string.dangerous-subinterpreters-run-string

    1.F. Running subinterpreter from regression tests package

    The run_in_subinterp function, from Python's regression test package (test), runs code in a subinterpreter. This is dangerous if external data reaches the run_in_subinterp call, because it allows a malicious actor to run arbitrary Python code.

    Example:

    import _testcapi

    # Value supplied by user
    user_input = "print('pwned')"

    # Vulnerable
    _testcapi.run_in_subinterp(user_input)


    from test import support

    # Value supplied by user
    user_input = "print('pwned')"

    # Vulnerable
    support.run_in_subinterp(user_input)

    Mitigation

    Do not pass user input to the run_in_subinterp function. Alternatively:

    • Ensure that content that Python interprets is not controllable by external sources.
    • If it's not possible, strip everything except alphanumeric characters from the input.

    Semgrep rule

    python.lang.security.audit.dangerous-testcapi-run-in-subinterp

    2. Abusing built-in functions

    2.A. Accessing dictionary with current global or local symbol table

    The globals() and locals() functions each return a dictionary representing the current global or local symbol table. Using non-static data to look up values in these tables is extremely dangerous, because it lets an attacker choose which object the program retrieves and calls, which can lead to arbitrary code execution.

    Example:

    # Name of the arbitrary function supplied by user
    user_input = "Name of the function"

    # Vulnerable call of arbitrary function
    function = locals().get(user_input)
    function()

    # Name of the arbitrary function supplied by user
    user_input = "Name of the function"

    # test1 stands for any function defined in this module
    def test1():
        pass

    # Vulnerable call of arbitrary function
    function = test1.__globals__[user_input]
    function()

    Mitigation

    Do not access global or local symbol tables. Refactor your code not to use globals() and locals().
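
One way to refactor such code is an explicit dispatch table: a dictionary that maps the names a user may request to the functions they are allowed to call. The sketch below is illustrative only; `start_job`, `stop_job`, `ALLOWED_ACTIONS`, and `run_action` are hypothetical names.

```python
def start_job() -> str:
    return "started"


def stop_job() -> str:
    return "stopped"


# Only the functions listed here can ever be reached from user input.
ALLOWED_ACTIONS = {
    "start": start_job,
    "stop": stop_job,
}


def run_action(name: str) -> str:
    """Call the handler registered under name, or raise ValueError."""
    try:
        handler = ALLOWED_ACTIONS[name]
    except KeyError:
        raise ValueError(f"unknown action: {name!r}") from None
    return handler()
```

Unlike a lookup in globals() or locals(), a name that is not in the table fails with an error rather than resolving to some other object in the module.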

    Semgrep rule

    python.lang.security.dangerous-globals-use.dangerous-globals-use

    2.B. Dynamically updating and accessing code annotations

    Annotations given as strings are evaluated by the typing.get_type_hints() function in the globals and locals namespaces. Ensure that no arbitrary value can be written as an annotation and then passed to typing.get_type_hints().

    Example:

    from typing import get_type_hints

    class C:
        member: int = 0

    def smth():
        # Changing annotation for `member` property of class C
        C.__annotations__["member"] = "print('pwn')"

        # Annotations are evaluated and `print('pwn')` code gets executed
        get_type_hints(C)

    Mitigation

    Do not programmatically rewrite code annotations. Alternatively:

    • Ensure that annotations are not controllable by external sources.
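
If annotations might have been modified at runtime, one defensive option is to inspect them before they are evaluated. The sketch below is an assumption-laden illustration rather than a complete safeguard: `safe_type_hints` and `ALLOWED_ANNOTATION_NAMES` are hypothetical names, and the check covers only the object's own `__annotations__`, not those inherited from base classes.

```python
from typing import get_type_hints

# Hypothetical allowlist of string annotations that may be evaluated.
ALLOWED_ANNOTATION_NAMES = {"int", "str", "float", "bool"}


def safe_type_hints(obj):
    """Reject unexpected string annotations before get_type_hints() evaluates them."""
    annotations = getattr(obj, "__annotations__", {})
    for name, annotation in annotations.items():
        if isinstance(annotation, str) and annotation not in ALLOWED_ANNOTATION_NAMES:
            raise ValueError(f"refusing to evaluate annotation for {name!r}")
    return get_type_hints(obj)


class C:
    member: int = 0


hints = safe_type_hints(C)
```

A string annotation outside the allowlist raises ValueError before any evaluation takes place.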

    Semgrep rule

    python.lang.security.audit.dangerous-annotations-usage