Match comments with Semgrep
When Semgrep rules target specific languages, they generally do not match comments in the targeted code files. Comments are not part of the semantic and syntactic structure of the document, so in most cases they are ignored.
However, it's sometimes useful to match comments. For example, comments can control the behavior of other linters, such as type checkers. You might also have formatting standards for comments, such as requiring that every TODO comment reference a ticket that captures the required work.
To match comments with Semgrep, use the generic language target to invoke generic pattern matching. (Alternatively, you can use pattern-regex, which matches text at the file level rather than semantically or syntactically. That approach is beyond the scope of this article.)
Example rule
Suppose that your organization requires all TODO comments to have an associated Jira ticket. This rule finds TODO lines and reports those that do not contain atlassian.net, that is, lines without a Jira Cloud ticket link.
rules:
- id: no-todo-without-jira
  patterns:
  - pattern: TODO $...ACTION
  - pattern-not: TODO ... atlassian.net ...
  options:
    generic_ellipsis_max_span: 0
  message: The TODO comment "$...ACTION" does not contain a Jira ticket to resolve the issue
  languages:
  - generic
  severity: INFO
  metadata:
    category: best-practice
Try this pattern in the Semgrep Playground.
This rule also includes the generic_ellipsis_max_span option. Setting it to 0 limits each ellipsis to matching on the same line, which prevents it from over-matching in this generic context.
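To make the intent of the rule concrete, here is a hypothetical target file. It is written as YAML purely for illustration, since generic mode treats any file as plain text. The file name, keys, and ticket URL are all invented for this sketch.

```yaml
# deploy.yaml (hypothetical file scanned with the no-todo-without-jira rule)
service:
  name: billing
  # TODO remove the legacy retry settings
  retries: 3
  # TODO raise the timeout, see https://example.atlassian.net/browse/OPS-123
  timeout_seconds: 30
```

The first TODO line has no atlassian.net link, so it is the kind of line the rule is meant to report. The second includes a link, so the pattern-not clause is meant to exclude it.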
Limiting the match to certain file types
If particular types of comments are only relevant for certain files, you can use the paths: key to limit the rule to files of that type. For example, mypy type-ignore comments are only relevant in Python files.
...
rules:
- id: no-mypy-ignore
  ...
  paths:
    include:
    - "*.py"
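As a rough sketch of how a complete version of such a rule might look, the following fills in the elided parts with assumed values. The pattern, message, and severity are illustrative choices, not the original rule's contents. The pattern is quoted because an unquoted # would start a YAML comment.

```yaml
rules:
- id: no-mypy-ignore
  # Assumed pattern: the comment mypy uses to suppress a type error
  pattern: "# type: ignore"
  # Assumed message and severity, chosen only for illustration
  message: Avoid suppressing mypy errors with a type-ignore comment
  languages:
  - generic
  severity: INFO
  paths:
    include:
    - "*.py"
```

Because the rule uses the generic language target, the paths: filter is what keeps it from scanning non-Python files where the same text would be irrelevant.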
Ignoring some comments in generic mode
To ignore comments of a particular style in generic mode, use the generic_comment_style option. For example, to ignore C-style comments but match any other style:
rules:
- id: css-blue-is-not-allowed
  pattern: |
    color: blue
  options:
    # ignore comments of the form /* ... */
    generic_comment_style: c
  message: |
    Blue is not allowed.
  languages:
  - generic
  severity: INFO
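The same option can be sketched for other comment syntaxes. The example below assumes the shell value, which, to the best of our knowledge, the Semgrep rule options reference lists alongside c and cpp; verify it against the current documentation before relying on it. The rule id, pattern, and message are hypothetical.

```yaml
rules:
- id: debug-enabled-in-config  # hypothetical rule id
  pattern: |
    debug = true
  options:
    # assumed value: ignore comments of the form "# ..."
    generic_comment_style: shell
  message: |
    Debug mode should not be enabled in committed configuration.
  languages:
  - generic
  severity: INFO
```

With this setting, a commented-out line such as "# debug = true" is intended to be skipped, while an active "debug = true" setting would still be matched.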
Additional resources
Not finding what you need in this doc? Ask questions in our Community Slack group, or see Support for other ways to get help.