tl;dr:
HTML injection is a vulnerability in which attacker-provided input is rendered as HTML. HTML injection in emails can lead to attackers phishing users from a legitimate email address.
Subtle Flask defaults can lead to HTML injections in emails--Flask only escapes templates with certain file extensions.
You can automatically detect and prevent email HTML injection in your code.
HTML Injection in Email
Did you know that injection vulnerabilities can occur in HTML-formatted emails?
Emails can be sent with the text/plain or text/html MIME types (and more). Text emails are just that: plain text. No fancy markup, no formatting, no images. Email clients render text emails exactly as written--HTML tags, for instance, are not processed.
On the other hand, clients will process HTML tags in an HTML email, allowing for the rich, colorful email experience we have today. HTML in emails, though, means your emails are vulnerable to some of the same issues as web pages.
This post demonstrates how an attacker can inject HTML into emails sent by Python web apps to phish users, how apps can prevent this, and how to automatically detect email HTML injection and keep it from entering your code.
Sending stylized emails from the backend
Some email libraries send text emails by default, such as the built-in email.message module in Python. Developers must explicitly opt in to HTML emails. This avoids most problems when whipping up quick notifications. But what if you want those clean, professional-looking emails coming from your app? You'll have to style your email using HTML. Beware though: if you let user data creep into your auto-generated emails, you have potentially introduced a vulnerability.
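To make the opt-in concrete, here is a minimal sketch using the standard library's email.message.EmailMessage. The addresses and body text are made up for illustration. set_content() with a string produces a text/plain message; an HTML part only appears if you ask for one, here through add_alternative(..., subtype="html").

```python
from email.message import EmailMessage

# Hypothetical addresses and content, for illustration only.
text_only = EmailMessage()
text_only["Subject"] = "Welcome"
text_only["From"] = "noreply@example.com"
text_only["To"] = "user@example.com"
# A string body becomes a text/plain part; clients show any tags literally.
text_only.set_content("Hello, <b>Jerry</b>!")

html_mail = EmailMessage()
html_mail["Subject"] = "Welcome"
html_mail["From"] = "noreply@example.com"
html_mail["To"] = "user@example.com"
html_mail.set_content("Hello, Jerry!")
# Explicit opt-in: this converts the message to multipart/alternative
# and adds a text/html part that clients will render as markup.
html_mail.add_alternative("<h2>Hello, Jerry!</h2>", subtype="html")

print(text_only.get_content_type())
print(html_mail.get_content_type())
print([part.get_content_type() for part in html_mail.iter_parts()])
```

Only the second message carries a part that an email client will interpret as HTML, so only that one needs its dynamic content escaped.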
Let's see an example of how this works using this sample Flask app. It's actually a variation of a real Flask app where I made this exact mistake. Here's the winning formula:
I write up a really nice-looking HTML email template, templates/welcome_message.email. (Note the file extension. It matters later.)
<h2>
Hello, {{ name }}!
</h2>
Hey! You're all signed up to get matched with a friend-of-a-friend to offset
housing costs. You can reach out to anyone on this site! Or, if you wait a bit,
we will make introductions with someone on your behalf. :) When you're all done,
click here to delete your entry.
<a href="{{ delete_link }}">{{ delete_link }}</a>
<em>We can't wait to see you!</em>
I want to personalize my app's email based on the new signup's name using form data. Who doesn't like polite computers?
name = request.form.get('name', "")
...
render_template("welcome_message.email", name=name, delete_link=delete_link)
I fire off my newly minted, custom-tailored email to the new signup with this snippet of code:
import smtplib
from flask import render_template
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
message = MIMEMultipart("alternative")
message['Subject'] = config.get('subject', 'Successful Signup for Roomshare')
message['From'] = config.get('smtp_sender_email', "noreply")
message['To'] = email
message.attach(MIMEText(render_template("welcome_message.email", name=name, delete_link=delete_link), "plain"))
message.attach(MIMEText(render_template("welcome_message.email", name=name, delete_link=delete_link), "html"))
...
s = smtplib.SMTP(smtp_host, 587)
...
s.sendmail(config.get('smtp_sender_email', "noreply"), email, message.as_string())
What could go wrong?
...What if my new signup's name looks like this?
Jerry!</h2><a href="http://www.evil.xyz/give_me_your_password">Click here</a> to view your registration! <div style="display:none">
In Gmail, the result looks like this:
What happened? The closing </h2> tag ensures the rest of the text looks normal. Now, a relatively benign-looking message is displayed with a link going to http://www.evil.xyz/give_me_your_password.
Further, <div style="display:none"> hides all the content of the original email template. So, now I have a legitimate-looking, attacker-controlled email coming from a real, recognizable email address--my email address--encouraging my signup to click through to an attacker-controlled page. Yikes.
Impact
An attacker with control of an email from a legitimate domain can create phishing emails that trick users into opening attacker-controlled pages. In all likelihood, such a page would look similar to the original site and ask users for sensitive information, such as passwords or account information.
Big-name email providers will strip out <script> tags and will dump javascript: URIs from anchor tags (I tried many variants), so the impact of email HTML injection is limited to phishing emails.
Preventing HTML injection in emails
So, how do we prevent this? The same way we prevent HTML issues normally: by escaping HTML characters!
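As a rough illustration of what escaping buys you, the sketch below fills a simplified stand-in for the welcome template with the malicious name from earlier, once as-is and once passed through the standard library's html.escape. The TEMPLATE string and str.format call are assumptions made for the demo; the real app renders a Jinja template, and Jinja's autoescaping uses its own escaping routine with equivalent effect.

```python
from html import escape

# Simplified stand-in for the welcome template (not the real Jinja template).
TEMPLATE = "<h2>Hello, {name}!</h2><p>Click here to delete your entry.</p>"

# The malicious "name" from the signup form shown earlier.
payload = (
    'Jerry!</h2><a href="http://www.evil.xyz/give_me_your_password">'
    'Click here</a> to view your registration! <div style="display:none">'
)

# Without escaping, the attacker's tags become live markup in the email.
unescaped = TEMPLATE.format(name=payload)

# With escaping, <, >, and quotes are replaced by HTML entities,
# so the client displays them as text instead of interpreting them.
escaped = TEMPLATE.format(name=escape(payload))

print(unescaped)
print(escaped)
```

In the escaped version the recipient would see an odd-looking name full of angle brackets, but no injected link and no hidden content.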
The above vulnerability is actually possible due to a comedy of errors. Both of these conditions must be true for an attacker to control the HTML contents of an email:
1. The email template is not HTML escaped when rendered.
2. The email is an HTML email.
Since the demo app is a Flask app, I'll focus on why this happened in Flask.
Escaping behavior of Flask templates
Regarding (1), Flask only autoescapes templates whose names end in one of a short list of extensions, such as .html, .htm, .xml, and .xhtml. By simply changing the extension of our email template from .email to .html, we have mitigated the problem. However, if you didn't immediately say to yourself "obviously .email extensions aren't escaped" while reading... you can understand how easy it is to make this mistake. (We wrote about this subtle escaping behavior previously, and you can read more about this escaping behavior in Flask here.)
from flask import render_template, request
name = request.form.get('name', "")
...
# Templates with '.html' are escaped.
render_template("email.html", name=name, delete_link=delete_link)
Prevent this in your code
If you have a Flask app, you can scan your code for rendered templates without escaped extensions using Semgrep. You can scan your project if it's on GitHub or locally:
$ semgrep --config=https://semgrep.dev/c/r/python.flask.security.unescaped-template-extension
Further, you can keep this problem from ever happening again by integrating this check into CI. This way, you can always ensure HTML escaping is applied, which is especially helpful when working with a team.
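To show roughly what such a check looks for, here is a small stand-in written with Python's ast module. It is not the Semgrep rule and is far less thorough: it only flags render_template calls whose first argument is a string literal with an extension outside an assumed list. The function name find_unescaped_templates and the ESCAPED_EXTENSIONS tuple are hypothetical; check your Flask version's documentation for the exact list of autoescaped extensions.

```python
import ast

# Assumption: extensions Flask autoescapes by default. Newer Flask
# releases may include more; verify against your installed version.
ESCAPED_EXTENSIONS = (".html", ".htm", ".xml", ".xhtml")


def find_unescaped_templates(source):
    """Return (line, template_name) pairs for render_template calls whose
    literal template name does not end in an autoescaped extension."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call) or not node.args:
            continue
        func = node.func
        # Match both render_template(...) and flask.render_template(...).
        if isinstance(func, ast.Name):
            name = func.id
        else:
            name = getattr(func, "attr", None)
        if name != "render_template":
            continue
        first = node.args[0]
        # Only string literals can be checked statically in this sketch.
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            if not first.value.endswith(ESCAPED_EXTENSIONS):
                findings.append((node.lineno, first.value))
    return sorted(findings)


sample = '''
render_template("welcome_message.email", name=name)
render_template("email.html", name=name)
'''
print(find_unescaped_templates(sample))
```

A sketch like this misses template names built at runtime and apps that customize autoescaping, which is why a maintained rule in CI is the better long-term option.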
Only use text emails
Regarding (2), the issue occurs because the email is explicitly given an HTML portion. The application uses Python's built-in email library, using email.mime.multipart.MIMEMultipart and email.mime.text.MIMEText objects to construct an HTML portion. Had the email not included this portion, the email would only be text. Therefore, it would be safe because email clients would not process the injected HTML.
The code without an HTML portion would look like this:
import smtplib
from flask import render_template
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
message = MIMEMultipart("alternative")
message['Subject'] = config.get('subject', 'Successful Signup for Roomshare')
message['From'] = config.get('smtp_sender_email', "noreply")
message['To'] = email
message.attach(MIMEText(render_template("welcome_message.email", name=name, delete_link=delete_link), "plain"))
...
s = smtplib.SMTP(smtp_host, 587)
...
s.sendmail(config.get('smtp_sender_email', "noreply"), email, message.as_string())
A Django example
Flask's autoescaping behavior caught me by surprise, leading to the issue above. However, sometimes developers will throw data straight into an email! No escaping, no templates! This is a snippet that resembles real code I have encountered in Django apps. This send_email function lets an attacker completely control the contents of an HTML email.
from django.http import HttpResponse
from django.http import HttpResponseBadRequest
from django.core.mail import EmailMessage
from app.models import Recipients

def send_email(request):
    subj = "Daily Crossword"
    from_email = "daily_crossword@example.com"
    recipient_objs = Recipients.objects.all()
    recipients = [recip.email_address for recip in recipient_objs]
    message = request.POST.get('message')
    if not message:
        return HttpResponseBadRequest("No message set.")
    email = EmailMessage(subj, message, from_email, recipients)
    email.content_subtype = "html" # Sets the email to HTML
    email.send()
    return HttpResponse("Email sent successfully!")
This could easily be prevented by sending a text email instead (simply delete email.content_subtype = "html").
If HTML emails are needed, use an automatically escaping template engine (like the one Django provides) instead of reflecting user data directly into an email. The code to render an email body with a template looks like this:
from django.http import HttpResponse, HttpResponseBadRequest
from django.core.mail import EmailMessage
from app.models import Recipients
from django.template.loader import render_to_string

def send_email(request):
    subj = "Daily Crossword"
    from_email = "daily_crossword@example.com"
    recipient_objs = Recipients.objects.all()
    recipients = [recip.email_address for recip in recipient_objs]
    try:
        puzzle = request.POST.get('puzzle')
        # The template engine escapes the user-supplied value when rendering.
        message = render_to_string("emails/crossword.html", {"puzzle": puzzle})
    except Exception:
        return HttpResponseBadRequest("Problem generating email.")
    email = EmailMessage(subj, message, from_email, recipients)
    email.content_subtype = "html"
    email.send()
    return HttpResponse("Email sent successfully!")
You can scan a Django app for EmailMessages directly using request data with Semgrep on GitHub or locally:
$ semgrep --config=https://semgrep.dev/c/r/python.django.security.injection.email
And as before, you can keep request data from flowing directly into EmailMessage for good by integrating checks for this issue into CI.
Conclusion
In summary, be careful when auto-generating HTML emails!
HTML injection in emails can lead to attackers phishing from legitimate domains.
Make sure your email content is escaped. Read your template engine's documentation to understand its escaping behavior.
If you're paranoid, consider using text-only emails.
Set up automatic scanning, such as Semgrep, for your code to prevent dangerous code from entering the codebase.
And for me, personally, I'm switching my email client to text only!