Contribute rules to the Semgrep Registry

Publish rules to the Semgrep Registry to share them with the Semgrep community and contribute to the field of software security. There are two ways in which you can contribute rules to the Semgrep Registry:

- For users of Semgrep AppSec Platform: Contribute new rules to the Semgrep Registry through Semgrep AppSec Platform. This is the recommended workflow because it creates the necessary pull request for you and streamlines the whole process. See Contribute through Semgrep AppSec Platform (recommended).
- For contributors to the repository through GitHub: Contribute rules to the Semgrep Registry, or suggest changes to existing rules, through a pull request to semgrep-rules. See the Contribute through GitHub section for detailed information.

Contribute through Semgrep AppSec Platform (recommended)

This is the recommended path for adding a new rule. To suggest a change to an existing rule, see Update existing rules in Semgrep Registry.

Sign in to Semgrep AppSec Platform.
Go to the Semgrep Playground.
Click Create New Rule.
Choose one of the following:
- Create a new rule and test code: click the plus icon, select New rule, and then click Save. Note: To be approved, the test file must contain at least one true positive and one true negative test case. See the Tests section of this document for more information.
- In the Library panel, select a rule from a category in Semgrep Registry. Click Fork, modify the rule or test code, and then click Save.
Click Share.
Click Publish to Registry.
Fill in the required and optional fields.
Click Continue, and then click Create PR.

This workflow automatically creates a pull request in the semgrep-rules repository on GitHub. To learn more about what the Semgrep Registry expects from a rule, read the Rule writing and Tests sections.

You can also publish rules as private rules outside of Semgrep Registry. These rules are not included in the Semgrep Registry, but they are accessible to your Semgrep organization. See the Private rules documentation for more information.

Contribute through GitHub

Create a pull request in the semgrep/semgrep-rules repository. The pull request requires two files:
- The Semgrep rule saved as a YAML file.
- The test file with the file extension of the language or framework. The test file must contain at least one true positive and one true negative test case to be approved. See the Tests section of this document for more information.
Sign the Contributor License Agreement (CLA) on GitHub; this is required before Semgrep can accept your contributions.

Pull requests require the approval of at least one maintainer and successfully passed CI jobs.

Find more about the Semgrep Registry by reading the Rule writing and Tests sections.

Licensing

The Semgrep Registry can import rules from different repositories. These repositories can enforce their own licensing for rules. If you'd like to enforce a specific license, such as the MIT license or GNU Lesser GPL:

Create a GitHub repository and store your rules there.
Reach out to the Semgrep team through the Community Slack or Support.

Write a rule for Semgrep Registry

The following sections document the required fields in Semgrep Registry rule files, explain how to write rule messages and test files, introduce the rule quality checkers, and describe the additional fields required by rules in the security category.

General rule requirements

All rules, whether they are intended only as local rules or for the Semgrep Registry, share the same initial requirements. The following table is also included in the Rule Syntax article.

All required fields must be present at the top level of a rule, immediately under the rules key.

Field	Type	Description
`id`	`string`	Unique, descriptive identifier, for example: `no-unused-variable`
`message`	`string`	Message that includes why Semgrep matched this pattern and how to remediate it. See also Rule messages.
`severity`	`string`	One of the following values: `Low`, `Medium`, `High`, `Critical`. The `severity` key specifies how critical the issues that a rule potentially detects are. Note: Semgrep Supply Chain differs, because its rules use CVE assignments for severity. For more information, see the Filters section in the Semgrep Supply Chain documentation.
`languages`	`array`	See language extensions and tags
`pattern`*	`string`	Find code matching this expression
`patterns`*	`array`	Logical AND of multiple patterns
`pattern-either`*	`array`	Logical OR of multiple patterns
`pattern-regex`*	`string`	Find code matching this PCRE2-compatible pattern in multiline mode

info

Only one of the fields marked with an asterisk (*) is required: pattern, patterns, pattern-either, or pattern-regex.

Every rule also requires a test file in the language that the rule is targeting. See Tests for more details.
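
As a rough illustration of the table above, a minimal rule carrying every required top-level field might look like the following sketch. The rule id and the message text are hypothetical, and the pattern is borrowed from the test example later in this document. The uppercase spelling of the severity value is an assumption based on how published rules commonly write it; the table lists the level as `Medium`.

```yaml
rules:
  - id: no-string-literal-var        # hypothetical id, for illustration only
    message: >-
      A variable is initialized with a string literal.
      Consider replacing the literal with a named constant.
    severity: MEDIUM                 # assumed spelling; see the severity row above
    languages:
      - javascript
    # Exactly one of pattern, patterns, pattern-either, or pattern-regex
    pattern: var $X = "...";
```

All five keys sit at the same indentation level, directly under the list item that belongs to the rules key.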

Semgrep registry rule requirements

In addition to the fields mentioned above, rules submitted to Semgrep Registry have additional required fields:

Field	Description	Possible values	Example
`metadata`	All rules require `technology`, `category`, and `references`. Rules with `category: security` have more requirements. See Including fields required by security category.	Required by all Semgrep Registry rules: `references`, `category`, `technology`	`metadata:` `cwe:` `- "CWE-94: (...)"` `category: security` `technology:` `- unicode` `references:` `- https://trojansource.codes/`
`metadata`		Additional keys required when `category` is `security`: `cwe`, `owasp`, `confidence`, `subcategory`, `likelihood`, `impact`, `vulnerability_class`
`technology`	Nested under the `metadata` field. Additional information about the technology. This helps to specify rulesets in Semgrep Registry.	`django`, `docker`, `express`, `kubernetes`, `nginx`, `react`, `terraform`, `--no-technology--`	`metadata:` `technology:` `- react`
`category`	Nested under the `metadata` field. If you use category `security`, include additional metadata. See Including fields required by security category.	`best-practice`, `correctness`, `maintainability`, `performance`, `portability`, `security`	`category: security`
`references`	Additional information that gives more context to the user of the rule. This helps developers understand the issue and how to fix it.	No finite value. Any additional information that gives more context.	`references:` `- OWASP DOM based XSS Prevention Cheat Sheet`

info

If you use category security, include additional metadata. See Including fields required by security category.
Cross-file (interfile) analysis requires interfile: true under the options key in YAML rules. For more information, see Creating rules that analyze across files.
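
To show how these fields nest, here is a sketch of a rule outside the security category. The rule id is hypothetical, the technology and reference values are reused from examples elsewhere in this document, and the options block is only relevant if the rule needs cross-file analysis.

```yaml
rules:
  - id: example-react-rule           # hypothetical id, for illustration only
    # message, severity, languages, and a pattern key go here as usual
    options:
      interfile: true                # include only for cross-file (interfile) rules
    metadata:
      category: best-practice
      technology:
        - react
      references:
        - https://reactjs.org/blog/2019/08/08/react-v16.9.0.html#deprecating-javascript-urls
```

Note that technology and references are lists, while category is a single value.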

Rule namespace

The namespacing format for contributing rules to the Semgrep Registry is <language>/<framework>/<category>/$MORE. If the rule does not belong to a particular framework, add it to the language directory, which uses the word lang in place of <framework>: <language>/lang/<category>/$MORE. For example, the rule python.lang.security.dangerous-system-call.dangerous-system-call follows this second form.

Tests

Include a test file in the language that your rule is targeting. A test file includes the following:

At least one test where the rule detects a finding. This is called a true positive finding.
At least one test where the rule does not detect a finding. This is called a true negative finding.

Test file names must match the rule filename, except for the file extension. For example, if the rule is in my-rule.yaml, the test filename must be my-rule.js. Use any valid extension for the target language.

Requirements of test files

In the test file, include examples that mark:
- What is expected to be a finding.
- What is not a finding.
The test filename must match the rule filename, except for the file extension.

See the examples of the rule and test file below:

Rule file:

rules:
- id: my-rule
  pattern: var $X = "...";
  …

In the test file, mark each expected finding with a comment on the line before it that contains the tag ruleid followed by the id of your rule. Mark code that must not produce a finding with a comment that contains the tag ok followed by the same rule id. See the example below:

// ruleid: my-rule
var strdata = "hello";
// ok: my-rule
var numdata = 1;

For more information, visit Testing rules.

Rule messages

Include a rule message that provides details about the matched pattern and informs about how to mitigate any related issues. Provide the following information in a rule message:

Description of the pattern. For example: missing parameter, dangerous flag, out-of-order function calls.
Description of why this pattern was detected. For example: logic bug, introduces a security vulnerability, bad practice.
An alternative that resolves the issue. For example: use another function, validate data first, discard the dangerous flag.

Use the YAML multiline string operator >- when rule messages span multiple lines. This produces the best-looking rule message on the command line, and you do not have to worry about line wrapping, escaping quotes, or using backslashes.

For an example of a good rule message, see: this rule for Django's mark_safe.

Rule message example

mark_safe() is used to mark a string as safe for HTML output. This disables escaping and may expose the content to XSS attacks. Instead, use django.utils.html.format_html() to build HTML for rendering.
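
As a sketch of how the message above could be written with the >- operator (the line breaks chosen here are arbitrary): the > indicator folds single line breaks into spaces, and the - indicator strips the trailing newline, so the message renders as one continuous paragraph.

```yaml
message: >-
  mark_safe() is used to mark a string as safe for HTML output.
  This disables escaping and may expose the content to XSS attacks.
  Instead, use django.utils.html.format_html() to build HTML for
  rendering.
```

Because the text is a block scalar, quotes and colons inside the message need no escaping.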

Rule quality checker

When you contribute rules to the Semgrep Registry, our quality checkers (linters) evaluate if the rule conforms to Semgrep, Inc. standards. The semgrep-rule-lints job runs linters on a new rule to check for mistakes, performance problems, and best practices for submitting to the Semgrep Registry. To improve your rule writing, use Semgrep itself to scan semgrep-rules.

Fields required by the `security` category

Rules in category security in the Semgrep Registry require specific metadata fields that ensure consistency across the ecosystem in both Semgrep AppSec Platform and Semgrep CLI. Nest these metadata under the metadata field.

If your rule has category: security, the following metadata fields are required:

Required metadata field	Values	Example use
`cwe`	A Common Weakness Enumeration (CWE)	cwe: "CWE-502: Deserialization of Untrusted Data"
`owasp`	An OWASP Top 10 category	owasp: - A05:2021 - Security Misconfiguration
`confidence`	`HIGH`, `MEDIUM`, `LOW`	confidence: MEDIUM
`likelihood`	`HIGH`, `MEDIUM`, `LOW`	likelihood: MEDIUM
`impact`	`HIGH`, `MEDIUM`, `LOW`	impact: HIGH
`subcategory`	`vuln`, `audit`, `secure default`	subcategory: - vuln
`vulnerability_class`	See Vulnerability class for a list of sample values. Accepts custom values.	vulnerability_class: - Hard-coded Secrets

These fields help you to find rules in different categories, such as:

High confidence security rules for CI pipelines.
OWASP Top 10 or CWE Top 25 rulesets.
Rules for a specific technology. For example, the react value makes it easy to find React rulesets.
Audit rules with lower confidence, intended for code auditors.

Examples of rules with a full list of required metadata:

High confidence JavaScript and TypeScript rule: javascript.express.security.audit.express-open-redirect.express-open-redirect
Medium confidence Python rule: python.lang.security.dangerous-system-call.dangerous-system-call
Low confidence C# rule: csharp.lang.security.ssrf.rest-client.ssrf
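
For orientation, a complete metadata block for a hypothetical SQL injection rule might look like the sketch below. It is not copied from any of the rules listed above. The CWE string, reference URL, technology, and vulnerability class are reused from examples in this document, the OWASP entry follows the format of the table's example, and the confidence, likelihood, and impact levels are assumptions chosen for illustration.

```yaml
metadata:
  # Required by all Semgrep Registry rules
  category: security
  technology:
    - express
  references:
    - https://sequelize.org/docs/v6/core-concepts/raw-queries/#replacements
  # Additionally required because category is security
  cwe:
    - "CWE-89: Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')"
  owasp:
    - A03:2021 - Injection
  confidence: HIGH        # assumed levels, for illustration only
  likelihood: HIGH
  impact: HIGH
  subcategory:
    - vuln
  vulnerability_class:
    - SQL Injection
```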

note

Details of each field mentioned above are provided in the subsections below with examples.

CWE

Include the appropriate Common Weakness Enumeration (CWE). The CWE explains what vulnerability your rule is trying to find. Examples:

If you write an SQL Injection rule, use the following:

cwe:
  - "CWE-89: Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')"

If you write an XSS rule, use the following:

cwe:
  - "CWE-79: Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')"

Confidence

Indicate how confident you are that the rule detects true positives. See the possible options below:

HIGH - Security concern, with a high rate of true positives. Useful in CI/CD pipelines.
MEDIUM - Security concern, but with some false positives. Useful in CI/CD pipelines.
LOW - Expect a fair amount of false positives, similar to audit-style rules. These rules can report many false positives.

HIGH

HIGH confidence rules can use Semgrep advanced features such as metavariable-comparison or taint mode, to detect true positives. See examples below:

confidence: HIGH

MEDIUM

MEDIUM confidence rules can use Semgrep advanced features such as metavariable-comparison or taint mode, but with some false positives. See examples below:

confidence: MEDIUM

LOW

LOW confidence rules generally find something that appears to be dangerous, while reporting a lot of false positives. See examples below:

confidence: LOW

Likelihood

Specify how likely it is that an attacker can exploit the issue that has been found. The possible values are LOW, MEDIUM, HIGH.

HIGH

HIGH likelihood rules specify a very high concern that the vulnerability can be exploited. Examples:

The use of weak encryption: go.lang.security.audit.crypto.use_of_weak_rsa_key.use-of-weak-rsa-key
Disabled security feature in a configuration: javascript.angular.security.detect-angular-sce-disabled.detect-angular-sce-disabled
Hardcoded secrets that use a constant value "...": javascript.jose.security.jwt-hardcode.hardcoded-jwt-secret
Rules that leverage taint mode sources that can come from an attacker, such as HTTP POST, GET, PUT, and DELETE request values. For example: javascript.express.security.audit.express-open-redirect.express-open-redirect

likelihood: HIGH

MEDIUM

MEDIUM likelihood rules detect a vulnerability in most circumstances, although it can be hard for an attacker to exploit it. These rules can also detect part of a problem, but not the whole issue. Examples:

Taint mode sources that reach a taint mode sink, where the source is only vulnerable under certain conditions, for example OS environment variables or data loaded from disk: python.aws-lambda.security.dangerous-spawn-process.dangerous-spawn-process
Taint mode sources with a taint mode sink but no taint mode sanitizer, which can introduce more false positives: javascript.express.security.express-puppeteer-injection.express-puppeteer-injection

likelihood: MEDIUM

LOW

LOW likelihood rules tend to find something dangerous, but do not evaluate whether it is truly vulnerable. Examples:

Taint mode sources, such as function arguments, that may or may not be tainted and that reach a taint mode sink: typescript.react.security.audit.react-href-var.react-href-var
A rule that uses search mode to find the use of a dangerous function, for example trustAsHTML, bypassSecurityTrust(), eval(), or innerHTML: javascript.browser.security.dom-based-xss.dom-based-xss

likelihood: LOW

Impact

Indicate how much damage a vulnerability can cause. Use LOW, MEDIUM, or HIGH.

HIGH

HIGH impact rules can detect extremely damaging vulnerabilities, such as injection vulnerabilities. Examples:

impact: HIGH

MEDIUM

MEDIUM impact rules detect issues that are less likely to lead to full system compromise but are still fairly damaging. Examples:

impact: MEDIUM

LOW

LOW impact rules detect a security issue whose impact is not too damaging to the application, even if the issue is discovered.

impact: LOW

References

References give a developer more context on what the issue is and how to remediate the vulnerability. See the examples below:

A rule that is finding an issue in React: typescript.react.security.audit.react-href-var.react-href-var
```
references:
  - https://reactjs.org/blog/2019/08/08/react-v16.9.0.html#deprecating-javascript-urls
```
A rule that is detecting an issue in Express: javascript.sequelize.security.audit.sequelize-injection-express.express-sequelize-injection
```
references:
  - https://sequelize.org/docs/v6/core-concepts/raw-queries/#replacements
```

Subcategory

Include a subcategory to explain what type of rule it is. See the subsections below for more details.

vuln

A vulnerability rule is something that developers certainly want to resolve. For example, an SQL Injection rule that uses taint mode. Example:

javascript.sequelize.security.audit.sequelize-injection-express.express-sequelize-injection

subcategory:
  - vuln

audit

An audit rule is useful for code auditors. For example, an SQL rule that finds all uses of database.exec(...), any of which can be problematic. Example:

generic.html-templates.security.unquoted-attribute-var.unquoted-attribute-var

subcategory:
  - audit

secure default

A secure default rule makes use of inherently secure libraries, frameworks, configurations, or settings. These rules enforce the mitigation of common security concerns, such as preventing cross-site request forgery (CSRF) by properly verifying inbound requests in Django or Flask applications.

A secure default rule must contain remediation that suggests applying a one-time setting that ensures security throughout the codebase without the need for repeated application by developers. For example, configuring a global security setting in a web application framework that applies to all routes and inputs.

subcategory:
  - secure default

Technology

Technology helps to define specific rulesets for languages, libraries, and frameworks that are available in Semgrep Registry. For example, a rule with the express technology is included in the p/express ruleset.

javascript.express.security.audit.express-open-redirect.express-open-redirect

technology:
  - express

Vulnerability class

The vulnerability class defines the category to which a rule and its resulting findings belong. The categories are used to group rules in Semgrep AppSec Platform's Policies page to help find similar rules. The category is also displayed on the Finding Details pages.

You can provide custom values. Sample values include:

Active Debug Code
Code Injection
Command Injection
Cookie Security
Cross-Site Request Forgery (CSRF)
Cross-Site-Scripting (XSS)
Cryptographic Issues
Dangerous Method or Function
Denial-of-Service (DoS)
Hard-coded Secrets
Improper Authentication
Improper Authorization
Improper Encoding
Improper Validation
Insecure Deserialization
Insecure Hashing Algorithm
Insufficient Logging
LDAP Injection
Mass Assignment
Memory Issues
Mishandled Sensitive information
Open Redirect
Other Security
Path Traversal
SQL Injection
Server-Side Request Forgery (SSRF)
XML Injection
XPath Injection

Update existing rules in Semgrep Registry

Find a rule you want to update in the semgrep-rules repository.
Submit a PR to the repository with your new update.
Follow the same instructions and recommendations that apply in the rest of this document. For example, the security category has specific metadata requirements.
Leave a message in the PR that explains why you are making the changes and what motivates the update.

See a PR example.

The repository’s pipeline can post messages that inform you about specific details of your rule. Ensure that your rule fulfills all of the requirements. Sometimes, however, the pipeline running in the semgrep-rules repository has issues of its own. In such a case, wait for a Semgrep reviewer's help.

Not finding what you need in this doc? Ask questions in our Community Slack group, or see Support for other ways to get help.
