
    XML External entity prevention for Java

    This article documents two attacks related to XML external entities: XML exponential entity expansion and XML external entity injection. In Java, the JDK detects exponential entity expansion by default and throws an exception, so no explicit parser configuration is necessary. However, if this exception is not caught and handled, an attack can still cause a denial of service (DoS). In contrast, there are various countermeasures to protect a Java XML parser from XML external entity injection, but not all of them are effective. This cheat sheet also provides test results, availability, and the effect of different security measures on several Java classes that process XML documents in section 3. Overview of the effect of security measures for each class.

    Check your project using Semgrep

    You may use Pro rules to check your project for XXE vulnerabilities.

    Catching XXE bugs in Java with Semgrep taint labels

    See the process behind creating a Semgrep rule that detects XXE vulnerabilities. You can create and improve your own rules using Semgrep Playground.

    Mitigation summary

    It is not easy to summarize Java XML security. The XML standard allows multiple ways to include external content, including but not limited to external Document Type Definitions (DTD), external Entity References to external data, general entity references, and external parameter references. There are also several ways to include external content in XML Schemas and stylesheets.

    The Java API for XML Processing (JAXP) contains multiple interfaces to process XML content:

    • The Document Object Model (DOM) interface
    • The Simple API for XML (SAX) interface
    • The Streaming API for XML (StAX) interface

    Each of these interfaces has its own mitigation measures to disable the use of different sources of external content. The inconsistency in the availability and even the effect of these measures across the different APIs, combined with the variety of ways to include external content, makes it difficult to provide general mitigation recommendations. Many configurations that are secure for one parser are not secure for another. However, the following security settings have a consistent effect on specific parsers:

    • For SAXBuilder, SAXParser, and XMLReader use:
      setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
    • For DocumentBuilderFactory, TransformerFactory, SAXTransformerFactory, SchemaFactory, and Validator use:
      setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
    • For SAXReader the only secure configuration is:
      setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

    The complete Semgrep set of rules that covers vulnerabilities mentioned in this cheat sheet is distributed as part of the rules available in the Team tier. This set of rules verifies the settings mentioned in the list above and also includes combinations of several other security features.
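
    As a rough illustration of the first setting in the list above, the following sketch restricts external DTD access on an XMLReader obtained from the JDK's default SAXParserFactory. XMLReader exposes this setting through setProperty. The class name RestrictedXmlReader and its method names are hypothetical, and the sketch assumes the JDK's built-in parser implementation is in use.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named for illustration only.
class RestrictedXmlReader {

    // Creates an XMLReader that may not fetch external DTDs or external entities.
    static XMLReader create() throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // An empty value permits no protocol at all (no file, http, jar, ...).
        reader.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        // DefaultHandler ignores content and rethrows fatal errors.
        DefaultHandler handler = new DefaultHandler();
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);
        return reader;
    }

    // Returns true if parsing is aborted with a SAXException, false if the document parses.
    static boolean isRejected(String xml) throws Exception {
        try {
            create().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            return true;
        }
    }
}
```

    Calling RestrictedXmlReader.isRejected with a document that declares an external entity, for example one that points at a file: URI, is expected to return true, while a document without external references should parse normally.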

    The following infographic provides an overview of Java XXE security features:

    [Infographic: XXE Java security features overview]


    1. Exponential entity expansion

    Exponential entity expansion occurs when several layers of nested entities each reference several other entities. See an example of such a payload, also called an XML bomb, in Semgrep's research project on GitHub.

    When a document that contains an XML bomb is parsed, the parser expands each of the nested references. Consequently, this expansion becomes exponential leading to excessive use of resources and denial-of-service attacks (DoS). In Java, the JDK throws an exception when such excessive use of resources is detected. If this JDK exception is carefully caught and handled, the attack can be prevented. In such a case, no explicit security measures need to be taken. This is the case for all recent versions of Java (17, 18, 19) and all older versions with long-term support (8, 11).

    To catch and handle the exception, log the event and display in the UI that an upload failed. This example is for XMLReader. The same actions can be taken for any of the XML processing classes.

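    // Requires java.io.File, java.io.FileInputStream, java.io.InputStream,
    // javax.xml.parsers.SAXParserFactory, org.xml.sax.InputSource, and org.xml.sax.XMLReader.
    public boolean uploadXmlFile(File xmlFile) {
        // try-with-resources closes the stream even when parsing fails.
        try (InputStream is = new FileInputStream(xmlFile)) {
            // Creating the reader inside the try block also covers configuration errors.
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.parse(new InputSource(is));
        } catch (Exception e) {
            // The JDK reports an exceeded entity expansion limit as a parse exception.
            logger.error("The parser has encountered more than \"64000\" entity expansions in this document; this is the limit imposed by the JDK.");
            return false;
        }
        logger.info("XML File uploaded successfully.");
        return true;
    }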
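
    To observe this behavior without an external payload file, the sketch below builds a small nested-entity document in memory and checks whether the JDK parser aborts. The class XmlBombDemo and its methods are hypothetical names, and the exact limit that triggers the exception depends on the JDK version.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, named for illustration only.
class XmlBombDemo {

    // Builds a payload in which each entity level references the previous level ten times.
    static String buildBomb(int levels) {
        StringBuilder sb = new StringBuilder("<?xml version=\"1.0\"?><!DOCTYPE bomb [");
        sb.append("<!ENTITY e0 \"x\">");
        for (int i = 1; i <= levels; i++) {
            sb.append("<!ENTITY e").append(i).append(" \"");
            for (int j = 0; j < 10; j++) {
                sb.append("&e").append(i - 1).append(';');
            }
            sb.append("\">");
        }
        sb.append("]><bomb>&e").append(levels).append(";</bomb>");
        return sb.toString();
    }

    // Returns true if the default JDK SAX parser aborts with a SAXParseException.
    static boolean parserRejects(String xml) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return false;
        } catch (SAXParseException e) {
            return true;
        }
    }
}
```

    With nine levels the payload would expand to a billion characters, so parserRejects(buildBomb(9)) is expected to return true, whereas a single level stays far below the JDK limits and parses normally.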

    2. XML external entity injection

    XML external entity injection occurs when the system identifier of some external content contains data controlled by an attacker. The parser then dereferences this identifier containing attacker-controlled XML code, leading to the disclosure of confidential data or even to arbitrary code execution.

    2.A Ways to include external content in XML documents

    There are many potential ways in which external resources can be referenced in XML documents, XML Schemas, and XSLT stylesheets. External content can be loaded through an External Document Type Definition (DTD), an External Entity Reference to external data, a General Entity reference, or an External Parameter Entity reference. Additional XML code can also be included through XInclude, or references to XML Schema components using the schemaLocation attribute of import or include elements. In stylesheets, multiple sheets can be combined using the xsl:include element, the ?xml-stylesheet processing instruction, or the document() function.

    For examples of each of these ways of including external resources, see the Java API for XML Processing (JAXP) Security Guide in Oracle documentation. In addition, you can review Semgrep's research project on GitHub.

    2.B Security measures

    There are several security measures, but not all of them are available for each class. See an overview of which measures are available for which class in section 3. Overview of the effect of security measures for each class with Semgrep research results.

    Feature for Secure Processing (FSP)

    Feature for Secure Processing (FSP) is considered the central mechanism for secure XML processing. It is defined as javax.xml.XMLConstants.FEATURE_SECURE_PROCESSING. For various classes, this feature can protect the parser from all available payloads. However, this is not the case for all of the classes that have been tested to create this cheat sheet.
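
    A minimal sketch of enabling FSP explicitly on DocumentBuilderFactory follows. The class name SecureProcessingDom and its methods are hypothetical, and the sketch assumes the JDK's default parser implementation with no overriding system properties or jaxp.properties file.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
class SecureProcessingDom {

    // Creates a DocumentBuilder with FSP set explicitly on the factory.
    static DocumentBuilder newBuilder() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        return dbf.newDocumentBuilder();
    }

    // Parses the document and returns the text of the root element.
    // A rejected document surfaces as an org.xml.sax.SAXException.
    static String rootText(String xml) throws Exception {
        Document doc = newBuilder().parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }
}
```

    For a document whose DOCTYPE declares an external entity, rootText is expected to throw instead of returning the referenced content. Callers should catch and handle that exception in the same way as in section 1.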

    Disabling DTD processing

    Disable DTD processing with one of the available mechanisms:

    • The setFeature method with the following URL: http://apache.org/xml/features/disallow-doctype-decl
    • The setProperty method to set XMLInputFactory.SUPPORT_DTD to false.
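    • The setAttribute method (or setProperty on classes such as SAXParser and XMLReader, which expose properties rather than attributes) to set XMLConstants.ACCESS_EXTERNAL_DTD to an empty string ("").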

    Some parsers offer more than one of these mechanisms. For most parsers, disabling DTD processing hardens the parsing of XML documents, but it does not affect the processing of schemas or stylesheets. Be aware that for classes such as DocumentBuilderFactory, Validator, and SchemaFactory, several of these settings can be applied without any exception being thrown, yet only one of them has any effect on the security of the parser.
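
    The first two mechanisms can be sketched as follows, using DocumentBuilderFactory for the disallow-doctype-decl feature and XMLInputFactory for SUPPORT_DTD. The class NoDoctype and its method names are hypothetical, and the sketch assumes the JDK's default implementations. Only the DOM variant is exercised here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLInputFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named for illustration only.
class NoDoctype {

    static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    // DOM: any document that contains a DOCTYPE declaration is rejected.
    static DocumentBuilder newDomBuilder() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature(DISALLOW_DOCTYPE, true);
        DocumentBuilder builder = dbf.newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing to stderr.
        builder.setErrorHandler(new DefaultHandler());
        return builder;
    }

    // StAX: DTD support is switched off on the factory.
    static XMLInputFactory newStaxFactory() {
        XMLInputFactory xif = XMLInputFactory.newInstance();
        xif.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        return xif;
    }

    // Returns true if the DOM builder aborts with a SAXParseException.
    static boolean isRejected(String xml) throws Exception {
        try {
            newDomBuilder().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXParseException e) {
            return true;
        }
    }
}
```

    Because disallow-doctype-decl rejects every DOCTYPE, even a document that declares only internal entities is refused. This is the trade-off to consider when legitimate input relies on a DTD.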

    Disabling external schema and stylesheet processing

    The setAttribute method can be used to set XMLConstants.ACCESS_EXTERNAL_SCHEMA and XMLConstants.ACCESS_EXTERNAL_STYLESHEET to an empty string to disable these features. The effects on security, however, are often limited with no observed effect for several classes.
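
    As a hedged sketch of these settings, the code below applies ACCESS_EXTERNAL_STYLESHEET to a TransformerFactory and ACCESS_EXTERNAL_SCHEMA to a SchemaFactory. Note that SchemaFactory exposes the setting through setProperty rather than setAttribute. The class ExternalResourceRestrictions and its methods are hypothetical names, and the sketch assumes the JDK's built-in XSLT and schema implementations.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;

// Hypothetical helper class, named for illustration only.
class ExternalResourceRestrictions {

    // No protocol is allowed for xsl:include, xsl:import, or document().
    static TransformerFactory newTransformerFactory() {
        TransformerFactory tf = TransformerFactory.newInstance();
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return tf;
    }

    // No protocol is allowed for schemaLocation references.
    static SchemaFactory newSchemaFactory() throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        sf.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        return sf;
    }

    // Returns true if the stylesheet cannot be compiled by the restricted factory.
    static boolean stylesheetRejected(String xsl) {
        try {
            newTransformerFactory().newTransformer(new StreamSource(new StringReader(xsl)));
            return false;
        } catch (TransformerConfigurationException e) {
            return true;
        }
    }
}
```

    These two properties do not restrict DTDs or external entities, so they are usually combined with one of the DTD-related measures from this section.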

    Disabling external entities

    Call setFeature with the value false to disable the http://xml.org/sax/features/external-general-entities and http://xml.org/sax/features/external-parameter-entities features. When both of these features are disabled, all parsers are protected against all of the tested payloads, except for the Validator class, where they have no effect at all. There is also setExpandEntityReferences, which can be set to false and, based on our testing, seems to have the same effect as disabling external general entities.
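
    The following sketch disables both features on a SAXParserFactory and collects the character data the parser delivers. The class NoExternalEntities and its method are hypothetical names, and the sketch assumes the JDK's default SAX implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named for illustration only.
class NoExternalEntities {

    static final String GENERAL = "http://xml.org/sax/features/external-general-entities";
    static final String PARAMETER = "http://xml.org/sax/features/external-parameter-entities";

    // Parses the document with both external entity features disabled
    // and returns all character data that the parser reported.
    static String parseText(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setFeature(GENERAL, false);
        spf.setFeature(PARAMETER, false);
        StringBuilder text = new StringBuilder();
        spf.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        return text.toString();
    }
}
```

    With this configuration a reference to an external general entity is expected to be skipped rather than resolved, so the returned text should not contain the content of the referenced resource.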

    3. Overview of the effect of security measures for each class

    Summarizing security measures against XXE is not easy as not all security measures work for every class (as mentioned in section 2.B Security measures). The tables below this section can help you to choose the best security measures against a specific vulnerability.

    Table legend
    • Each table represents one Java class. For example DocumentBuilderFactory.
    • Columns display attack payloads that can potentially be used to exploit a vulnerability (for example DTD or XML bomb).
    • Rows represent ways to configure the parsers.

    ✅ - Secure
    ❌ - Not Secure
    ⚠️ - An exception is thrown
    N/A - Not available

    3.A DocumentBuilderFactory

    javax.xml.parsers.DocumentBuilderFactory can be used to process all payload types. By default, it is vulnerable to 4 of the attack payloads.

    Parser configuration | Attack payload
    XML: XML Bomb, DTD, Parameter entity | XSL: DTD, import, include, document | XSD: DTD, import, include

    Default configuration ✅⚠️
    setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true) ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "") ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "") ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "") N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setFeature("http://apache.org/xml/features/disallow-doctype-decl", true) ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setFeature("http://xml.org/sax/features/external-general-entities", false) ✅⚠️
    setFeature("http://xml.org/sax/features/external-parameter-entities", false) ✅⚠️
    setValidating(false) ✅⚠️
    setExpandEntityReferences(false) ✅⚠️
    setXIncludeAware(false) ✅⚠️
    setFeature("http://apache.org/xml/features/xinclude", false) ✅⚠️
    setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false) ✅⚠️
    setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false) ✅⚠️
    setFeature("http://apache.org/xml/features/namespaces", false) ✅⚠️
    setNamespaceAware(true) ✅⚠️

    3.B SAXBuilder

    org.jdom2.input.SAXBuilder can process all 10 payloads. In the default configuration, it is vulnerable to 4 of the attack payloads.

    Parser configuration | Attack payload
    XML: XML Bomb, DTD, Parameter entity | XSL: DTD, import, include, document | XSD: DTD, import, include

    Default configuration ✅⚠️
    Default configuration (DTDValidating) ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    Default configuration (XSDValidating) ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true) ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "") ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "") ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "") N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setFeature("http://apache.org/xml/features/disallow-doctype-decl", true) ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setFeature("http://xml.org/sax/features/external-general-entities", false) ✅⚠️
    setFeature("http://xml.org/sax/features/external-parameter-entities", false) ✅⚠️
    setExpandEntityReferences(false) ✅⚠️
    setFeature("http://apache.org/xml/features/xinclude", false) ✅⚠️
    setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false) ✅⚠️

    3.C SAXParserFactory

    javax.xml.parsers.SAXParserFactory can process all 10 attack payloads. In its default configuration, it is vulnerable to 4 of the attack payloads.

    Parser configuration | Attack payload
    XML: XML Bomb, DTD, Parameter entity | XSL: DTD, import, include, document | XSD: DTD, import, include

    Default configuration ✅⚠️
    setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true) ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "") N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "") N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "") N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setFeature("http://apache.org/xml/features/disallow-doctype-decl", true) ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setFeature("http://xml.org/sax/features/external-general-entities", false) ✅⚠️
    setFeature("http://xml.org/sax/features/external-parameter-entities", false) ✅⚠️
    setValidating(false) ✅⚠️
    setExpandEntityReferences(false) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setXIncludeAware(false) ✅⚠️
    setFeature("http://apache.org/xml/features/xinclude", false) ✅⚠️
    setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false) ✅⚠️
    setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setFeature("http://apache.org/xml/features/namespaces", false) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setNamespaceAware(true) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A

    3.D SAXParser

    javax.xml.parsers.SAXParser can process all 10 payloads. In its default configuration, it is vulnerable to 4 attack payloads.

    Parser configuration | Attack payload
    XML: XML Bomb, DTD, Parameter entity | XSL: DTD, import, include, document | XSD: DTD, import, include

    Default configuration

    setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "") ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "") ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "") N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    All other features N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A

    3.E SAXReader

    org.dom4j.io.SAXReader can process all 10 payloads. In its default configuration, it is vulnerable to 4 of the attack payloads.

    Parser configuration | Attack payload
    XML: XML Bomb, DTD, Parameter entity | XSL: DTD, import, include, document | XSD: DTD, import, include

    Default configuration ✅⚠️
    setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true) ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "") ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "") ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "") N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setFeature("http://apache.org/xml/features/disallow-doctype-decl", true) ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setFeature("http://xml.org/sax/features/external-general-entities", false) ✅⚠️
    setFeature("http://xml.org/sax/features/external-parameter-entities", false) ✅⚠️
    setValidating(false) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setExpandEntityReferences(false) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setXIncludeAware(false) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setFeature("http://apache.org/xml/features/xinclude", false) ✅⚠️
    setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false) ✅⚠️
    setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setFeature("http://apache.org/xml/features/namespaces", false) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setNamespaceAware(true) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setIncludeExternalDTDDeclarations(false) ✅⚠️

    3.F TransformerFactory & SAXTransformerFactory

    javax.xml.transform.TransformerFactory and javax.xml.transform.sax.SAXTransformerFactory are not able to process XML Schema Definition (XSD) payloads. Out of the 7 payloads they can process, the default configuration parses 6 payloads insecurely.

    Parser configuration | Attack payload
    XML: XML Bomb, DTD, Parameter entity | XSL: DTD, import, include, document

    Default configuration ✅⚠️
    setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true) ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "") ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "") ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    All other features N/A N/A N/A N/A N/A N/A N/A

    3.G SchemaFactory

    javax.xml.validation.SchemaFactory is unable to process Extensible Stylesheet Language (XSL) documents. Out of the 6 payloads it can process, the default configuration is vulnerable to 5.

    Parser configuration | Attack payload
    XML: XML Bomb, DTD, Parameter entity | XSD: DTD, import, include

    Default configuration ✅⚠️
    setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true) ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "") ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "") ✅⚠️ ✅⚠️ ✅⚠️
    setFeature("http://apache.org/xml/features/disallow-doctype-decl", true) ✅⚠️ ✅⚠️
    All other features N/A N/A N/A N/A N/A N/A

    3.H Validator

    javax.xml.validation.Validator can only process XML documents, not XSL or XSD. Out of the 3 attack payloads it can process, the default configuration is vulnerable to 2.

    Parser configuration | Attack payload
    XML: XML Bomb, DTD, Parameter entity

    Default configuration ✅⚠️
    setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true) ✅⚠️ ✅⚠️ ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "") ✅⚠️ ✅⚠️ ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "") ✅⚠️
    setFeature("http://apache.org/xml/features/disallow-doctype-decl", true) ✅⚠️
    setFeature("http://xml.org/sax/features/external-general-entities", false) ✅⚠️
    setFeature("http://xml.org/sax/features/external-parameter-entities", false) ✅⚠️

    All other features N/A N/A N/A

    3.I XMLReader

    org.xml.sax.XMLReader can process all 10 payloads. The default configuration is vulnerable to 4 of the attack payloads.

    Parser configuration | Attack payload
    XML: XML Bomb, DTD, Parameter entity | XSL: DTD, import, include, document | XSD: DTD, import, include

    Default configuration ✅⚠️
    setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true) ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "") ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "") ✅⚠️
    setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "") N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setFeature("http://apache.org/xml/features/disallow-doctype-decl", true) ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️ ✅⚠️
    setFeature("http://xml.org/sax/features/external-general-entities", false) ✅⚠️
    setFeature("http://xml.org/sax/features/external-parameter-entities", false) ✅⚠️
    setValidating(false) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setExpandEntityReferences(false) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setXIncludeAware(false) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setFeature("http://apache.org/xml/features/xinclude", false) ✅⚠️
    setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false) ✅⚠️
    setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setFeature("http://apache.org/xml/features/namespaces", false) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
    setNamespaceAware(true) N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
