Publish rules in the open-source Semgrep Registry and share them with the Semgrep community to help others benefit from your rule-writing efforts and contribute to the field of software security. There are two ways in which you can contribute rules to the Semgrep Registry:
- For users of Semgrep Cloud Platform
- Contribute rules to the Semgrep Registry through Semgrep Cloud Platform. This is the recommended workflow: it creates the necessary pull request for you and streamlines the whole process. See Contributing through Semgrep Cloud Platform (recommended).
- For contributors to the repository through GitHub
- Contribute rules to the Semgrep Registry through a pull request. See the Contributing through GitHub section for detailed information.
Contributing through Semgrep Cloud Platform (recommended)
To contribute and publish rules to the Semgrep Registry through Semgrep Cloud Platform, follow these steps:
- Go to Playground.
- Click Create New Rule.
- Choose one of the following:
- Create a new rule and test code by clicking the plus icon, selecting New rule, and then clicking Save. Note: The test file must contain at least one true positive and one true negative test case to be approved. See the Tests section of this document for more information.
- In the Library panel, select a rule from a category in Semgrep Registry. Click Fork, modify the rule or test code, and then click Save.
- Click Share.
- Click Publish to Registry.
- Fill in the required and optional fields.
- Click Continue, and then click Create PR.
You can also publish rules as private rules outside of Semgrep Registry. These rules are not included in the Semgrep Registry, but they are accessible to your Semgrep organisation. See the Private rules documentation for more information.
Contributing through GitHub
Fork our repository and sign our Contributor License Agreement (CLA) on GitHub. Semgrep, Inc. cannot accept your contributions until the CLA is signed. Then make a pull request to the Semgrep Registry with two files:
- The Semgrep rule (as a YAML file).
- The test file (with the file extension of the language or framework). The test file must contain at least one true positive and one true negative test case to be approved. See the Tests section of this document for more information.
Pull requests require the approval of at least one maintainer and passing CI jobs.
Writing a rule for Semgrep Registry
The following sections document necessary fields in rule files of Semgrep Registry, provide information about rule messages, inform about test files, mention rule quality checkers, and describe additional fields required by rules in the security category.
General rule requirements
All rules, regardless of whether they are intended only as local rules or for Semgrep Registry, have the same initial requirements. The following table is also included in the Rule Syntax article.
All required fields must be present at the top-level of a rule, immediately under the
|Unique, descriptive identifier, for example: |
|Message that includes why Semgrep matched this pattern and how to remediate it. See also Rule messages.|
|One of the following values: |
|See language extensions and tags|
|Find code matching this expression|
|Logical AND of multiple patterns|
|Logical OR of multiple patterns|
|Find code matching this PCRE-compatible pattern in multiline mode|
Only one of the following is required:
Every rule also requires a test file in the language that the rule is targeting. See Tests for more details.
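As a rough sketch of how the required fields can fit together, the following rule file shows one possible minimal layout. The rule ID, message text, and pattern are hypothetical and chosen only for illustration. The severity value shown is one that Semgrep has accepted historically, so check the Rule Syntax article for the current list of values.

```yaml
rules:
  # Hypothetical rule, for illustration only.
  - id: example-no-string-var
    # Explain why the pattern matched and how to remediate it.
    message: >-
      The variable $X is initialized with a string literal. If the value is
      reused elsewhere, move it into a named constant.
    # Assumed value; see the Rule Syntax article for accepted values.
    severity: WARNING
    languages:
      - javascript
    # Exactly one pattern field is used here.
    pattern: var $X = "...";
```

Only one of the pattern fields is needed, so this sketch uses a single pattern key and omits the others.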
Semgrep registry rule requirements
In addition to the fields mentioned above, rules submitted to Semgrep Registry have additional required fields:
|All rules require ||Required by all Semgrep Registry rules:|
|Additionally required by |
|Nested under the |
|Nested under the |
|Additional information that gives more context to the user of the rule. This helps developers understand the issue and how to fix it.||No finite value. Any additional information that gives more context.|
- If you use category security, include additional metadata. See Including fields required by security category.
- Semgrep Pro Engine rules that leverage cross-file (interfile) analysis also require a metadata key in YAML rules. For more information, see Creating rules that analyze across files.
Understanding rule namespacing
The namespacing format for contributing rules in the Semgrep Registry is <language>/<framework>/<category>/$MORE. If the rule does not belong to a particular framework, add it to the language directory, which uses the word lang in place of the framework.
Include a test file in the language that your rule is targeting. A test file includes the following:
- At least one test where the rule detects a finding. This is called a true positive finding.
- At least one test where the rule does not detect a finding. This is called a true negative finding.
Test file names must match the rule file name, except for the file extension. For example, if the rule is in my-rule.yaml, the test file name must be my-rule.js. Use any valid extension for the target language.
- In the test file, include examples that mark:
- What is expected to be a finding.
- What is not a finding.
See the examples of the rule and test file below:
- id: my-rule
  pattern: var $X = "...";
In the test file, mark an expected finding with a comment that contains ruleid: followed by the ID of your rule, placed on the line before the expected finding. Mark code that is expected not to be a finding with a comment that contains ok: followed by the same rule ID. See the example below:
// ruleid: my-rule
var strdata = "hello";
// ok: my-rule
var numdata = 1;
For more information, visit Testing rules.
Include a rule message that provides details about the matched pattern and informs about how to mitigate any related issues. Provide the following information in a rule message:
- Description of the pattern. For example: missing parameter, dangerous flag, out-of-order function calls.
- Description of why this pattern was detected. For example: logic bug, introduces a security vulnerability, bad practice.
- An alternative that resolves the issue. For example: use another function, validate data first, or discard the dangerous flag.
Use the YAML multiline string operator >- when rule messages span multiple lines. This presents the best-looking rule message on the command line without having to worry about line wrapping, escaping quotes, or using backslashes.
For an example of a good rule message, see this rule for Django's mark_safe:
mark_safe() is used to mark a string as safe for HTML output. This disables escaping and may expose the content to XSS attacks. Instead, use django.utils.html.format_html() to build HTML for rendering.
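To show how the >- operator fits into a rule file, here is a minimal sketch that reuses the message text above. It is not the actual Registry rule: the rule ID, severity, and pattern are assumptions made for illustration.

```yaml
rules:
  # Illustrative sketch; not the actual Registry rule for mark_safe.
  - id: example-avoid-mark-safe
    languages:
      - python
    severity: WARNING
    pattern: django.utils.safestring.mark_safe(...)
    # ">-" folds the lines below into a single line and strips the
    # trailing newline, so quotes need no escaping.
    message: >-
      'mark_safe()' is used to mark a string as safe for HTML output.
      This disables escaping and may expose the content to XSS attacks.
      Instead, use 'django.utils.html.format_html()' to build HTML for
      rendering.
```

The message covers all three points listed above: what was matched, why it is a problem, and which alternative to use.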
Rule quality checker
When you contribute rules to the Semgrep Registry, our quality checkers (linters) evaluate if the rule conforms to Semgrep, Inc. standards. The semgrep-rule-lints job runs linters on a new rule to check for mistakes, performance problems, and best practices for submitting to the Semgrep Registry. To improve your rule writing, use Semgrep itself to scan semgrep-rules.
Including fields required by security category
Rules in category
security in the Semgrep Registry require specific metadata fields that ensure consistency across the ecosystem in both Semgrep Cloud Platform and Semgrep CLI. Nest these metadata under the
If your rule has category: security, the following metadata fields are required:
|Required metadata field||Values||Example use|
|A Common Weakness Enumeration (CWE).|
These fields help you to find rules in different categories such as:
- High confidence security rules for CI pipelines.
- OWASP Top 10 or CWE Top 25 rulesets.
- Technology. For example, react so it is easy to find React rulesets.
- Audit rules with lower confidence, intended for code auditors.
Examples of rules with a full list of required metadata:
- Medium confidence Python rule: python.lang.security.dangerous-system-call.dangerous-system-call
- Low confidence C# rule: csharp.lang.security.ssrf.rest-client.ssrf
Details of each field mentioned above are provided in the subsections below with examples.
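Before the field-by-field details, here is a minimal sketch of how the security metadata might look when nested in a rule file. The key names (cwe, confidence, likelihood, impact, subcategory, technology, references) are assumed from the field descriptions in this document and may not be exhaustive. The rule ID, pattern, message, and chosen values are hypothetical.

```yaml
rules:
  # Hypothetical audit-style rule, for illustration only.
  - id: example-database-exec-audit
    languages:
      - python
    severity: WARNING
    message: >-
      Detected a call to 'database.exec()'. If the query is built from
      user input, this can lead to SQL injection. Use parameterized
      queries instead.
    pattern: database.exec(...)
    metadata:
      category: security
      cwe:
        - "CWE-89: Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')"
      # Search-mode audit rule, so expect false positives.
      confidence: LOW
      likelihood: LOW
      impact: HIGH
      subcategory:
        - audit
      technology:
        - python
      references:
        - https://cwe.mitre.org/data/definitions/89.html
```

The LOW confidence and LOW likelihood values reflect that this sketch flags every call to the function without checking whether the input is attacker-controlled.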
Include the appropriate Common Weakness Enumeration (CWE). The CWE explains what vulnerability your rule is trying to find. Examples:
If you write an SQL Injection rule, use the following:
- "CWE-89: Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')"
If you write an XSS rule, use the following:
- "CWE-79: Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')"
Indicate how confident the rule is in detecting true positives. See the possible options below:
- HIGH - Security concern, with a high rate of true positives. Useful in CI/CD pipelines.
- MEDIUM - Security concern, but with some false positives. Useful in CI/CD pipelines.
- LOW - Expect a fair number of false positives, similar to audit-style rules.
HIGH confidence rules can use Semgrep advanced features such as taint mode to detect true positives. See examples below:
MEDIUM confidence rules can use Semgrep advanced features such as taint mode, but with some false positives. See examples below:
LOW confidence rules generally find something that appears to be dangerous while reporting many false positives. See examples below:
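Taint mode, mentioned above, tracks data from a source to a sink. The sketch below assumes the mode: taint, pattern-sources, and pattern-sinks keys of Semgrep's rule syntax. The rule ID, the chosen source and sink, the message, and the confidence value are hypothetical and only illustrate the shape of such a rule.

```yaml
rules:
  # Hypothetical taint-mode rule, for illustration only.
  - id: example-tainted-os-system
    mode: taint
    languages:
      - python
    severity: ERROR
    message: >-
      User-controlled request data reaches 'os.system()', which can lead
      to command injection. Validate the input, or use 'subprocess.run()'
      with a list of arguments instead.
    # Where attacker-controlled data can enter.
    pattern-sources:
      - pattern: flask.request.args.get(...)
    # Where that data becomes dangerous.
    pattern-sinks:
      - pattern: os.system(...)
    metadata:
      category: security
      # Assumed value; no sanitizers are modeled here, so some false
      # positives are likely.
      confidence: MEDIUM
```

Because a finding is reported only when data flows from the source to the sink, a rule like this usually produces fewer false positives than a search-mode rule that flags every call to the sink.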
Specify how likely it is that an attacker can exploit the issue that has been found. The possible values are LOW, MEDIUM, and HIGH.
HIGH likelihood rules specify a very high concern that the vulnerability can be exploited. Examples:
- The use of weak encryption: go.lang.security.audit.crypto.use_of_weak_rsa_key.use-of-weak-rsa-key
- Hardcoded secrets that use a constant value
- Rules that leverage taint mode sources which indicate sources that can come from an attacker, such as HTTP
MEDIUM likelihood rules detect a vulnerability in most circumstances, although it can be hard for an attacker to exploit it. These rules can also detect part of a problem, but not the whole issue. Examples:
- taint mode sources that reach a taint mode sink, but the source is only vulnerable in certain conditions, for example OS environment variables or loading from disk: python.aws-lambda.security.dangerous-spawn-process.dangerous-spawn-process
- taint mode sources with a taint mode sink but is missing a
LOW likelihood rules tend to find something dangerous, but do not evaluate whether something is truly vulnerable, for example:
- taint mode sources, such as function arguments, which may or may not be tainted and which reach a taint mode sink: typescript.react.security.audit.react-href-var.react-href-var
- A rule which uses search mode to find the use of a dangerous function for example:
Indicate how much damage a vulnerability can cause. Use LOW, MEDIUM, or HIGH.
HIGH impact rules can detect extremely damaging vulnerabilities, such as injection vulnerabilities. Examples:
MEDIUM impact rules are issues that are less likely to lead to full system compromise but still are fairly damaging. Examples:
LOW impact rules detect a security issue whose impact is not too damaging to the application if discovered.
Include a subcategory to explain the type of the rule. See the subsections below for more details.
A vulnerability rule is something that developers certainly want to resolve. For example, an SQL Injection rule that uses taint mode. Example:
An audit rule is useful for code auditors. For example, an SQL rule which finds all uses of database.exec(...) that can be problematic. Example:
A guardrail rule is useful for companies writing custom rules. For example, finding all usages of non-standard XML parsing libraries within the company. The rule message can also tell the developer to use only a company-approved library.
Technology helps to define specific rulesets for languages, libraries, and frameworks that are available in Semgrep Registry, for example
express will be included in the
References give a developer more context on what the issue is and how to remediate the vulnerability. See examples below:
- A rule that is finding an issue in React: typescript.react.security.audit.react-href-var.react-href-var
Updating existing open-source rules in Semgrep Registry
To update an existing open-source rule, follow these steps:
- Find a rule you want to update in the semgrep-rules repository.
- Submit a PR to the repository with your new update.
- Follow the same instructions and recommendations as in the rest of this document. For example, the security category has specific metadata requirements.
- Leave a message in the PR. Explain why you are making changes and what motivates the update.
See a PR example.
The repository’s pipeline may report messages about specific details of your rule. Ensure that your rule fulfills all requirements. Sometimes, however, the pipeline running in the semgrep-rules repository has issues of its own. In such a case, wait for a Semgrep reviewer's help.