A Breakdown of the New SAML Authentication Bypass Vulnerability

Randall Degges

February 27, 2018

Several weeks ago a new critical vulnerability was discovered that affects many SAML implementations. This vulnerability was first reported by Kelby Ludwig of Duo Security and is particularly interesting to us (as a user management company) as it can be used to bypass authentication in a sinisterly simplistic way.

In this post, we’ll take an in-depth look at this new SAML vulnerability, what it is, how it works, and what you need to know to protect yourself.

NOTE: Just in case you’re wondering whether or not Okta is vulnerable to this new issue: we aren’t >;)

What is the New SAML Authentication Bypass Vulnerability?

It is a new attack that can directly compromise the security of single sign-on (SSO) systems built on SAML.

If you’re not familiar with SAML (short for Security Assertion Markup Language), it’s an open standard that lets users sign in once with an identity provider and then access multiple web apps, so they don’t need to log in to each web service manually. Many vendors use SAML to handle user authentication and authorization so their users can access web applications without requiring individual credentials for each.

The new SAML vulnerability allows an attacker to bypass authentication and directly assume the role of an authenticated user as part of the SAML flow. This is a BIG DEAL.

How the new SAML Authentication Bypass Vulnerability Works

When a user is authenticating to a website using SAML, there are always three parties involved:

A user in a web browser
A service provider running a website that the user is trying to access (e.g., Salesforce)
An identity provider that stores and manages the user’s account and credentials (e.g., Okta)

At some point during SAML authentication the service provider (e.g., Salesforce) will ask the identity provider (e.g., Okta): who is this person that’s trying to log into me? It’s then the identity provider’s (e.g., Okta) job to generate a SAML assertion (a big XML document) and send it back to the service provider. This SAML assertion contains all the user information necessary for the service provider (e.g., Salesforce) to log this user in without ever needing their password.

NOTE: This is an extreme oversimplification of how SAML works, but accurately covers the important bits needed to understand more about the new vulnerability.

Here’s what a simplified SAML assertion might look like:

<saml:Assertion ID="123">
  <saml:Issuer>www.okta.com</saml:Issuer>
  <saml:Subject>
    <saml:NameID>randall.degges@okta.com</saml:NameID>
  </saml:Subject>
  <saml:Conditions
                   NotBefore="2018-02-27T00:00:00Z"
                   NotOnOrAfter="2018-02-28T00:00:00Z">
    <saml:AudienceRestriction>
      <saml:Audience>www.salesforce.com</saml:Audience>
    </saml:AudienceRestriction>
  </saml:Conditions>
</saml:Assertion>

Looking at the XML sample above, one thing stands out above all else: the saml:NameID element. This element is responsible for identifying the user who’s about to be logged in. In an actual SAML assertion, there’d be a lot of other stuff included. You’d also see things like:

User attributes (first name, last name, etc.)
User roles/permissions (is this user an admin? can they access a certain feature? etc.)
An XML signature (this is required by almost all SAML vendors). XML signatures are frequently used to cryptographically sign the assertion element so that the service provider (e.g., Salesforce) can ensure the SAML assertion it receives from the identity provider is valid and accurate (and has not been modified by an attacker).
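
To make the structure concrete, here is a minimal Python sketch of how a service provider might read the fields from the simplified assertion above. It is an illustration only: the function names (read_assertion, parse_utc) are hypothetical, the xmlns:saml declaration is added because the simplified sample leaves it out, and the sketch performs no signature validation at all, which a real SAML library must do before trusting any of these values.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Assumption: the standard SAML 2.0 assertion namespace. The simplified sample
# in the article omits the declaration, so it is added here to make it parse.
NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}

ASSERTION = """
<saml:Assertion xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" ID="123">
  <saml:Issuer>www.okta.com</saml:Issuer>
  <saml:Subject>
    <saml:NameID>randall.degges@okta.com</saml:NameID>
  </saml:Subject>
  <saml:Conditions NotBefore="2018-02-27T00:00:00Z"
                   NotOnOrAfter="2018-02-28T00:00:00Z">
    <saml:AudienceRestriction>
      <saml:Audience>www.salesforce.com</saml:Audience>
    </saml:AudienceRestriction>
  </saml:Conditions>
</saml:Assertion>
"""


def parse_utc(value):
    """Parse a timestamp such as 2018-02-27T00:00:00Z into an aware datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def read_assertion(xml_text, now):
    """Hypothetical helper: pull out the fields discussed above (no signature check)."""
    root = ET.fromstring(xml_text.strip())
    conditions = root.find("saml:Conditions", NS)
    not_before = parse_utc(conditions.get("NotBefore"))
    not_on_or_after = parse_utc(conditions.get("NotOnOrAfter"))
    return {
        "issuer": root.findtext("saml:Issuer", namespaces=NS),
        "name_id": root.findtext("saml:Subject/saml:NameID", namespaces=NS),
        "audience": root.findtext(
            "saml:Conditions/saml:AudienceRestriction/saml:Audience", namespaces=NS
        ),
        # The assertion is only valid inside [NotBefore, NotOnOrAfter).
        "within_validity_window": not_before <= now < not_on_or_after,
    }


info = read_assertion(ASSERTION, datetime(2018, 2, 27, 12, 0, tzinfo=timezone.utc))
print(info["name_id"])  # randall.degges@okta.com
```

Everything the service provider decides about the user flows from values read this way, which is why the way the NameID text is extracted matters so much.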

The service provider (e.g., Salesforce) relies on the data in this XML document to properly authenticate a user, so it’s critical that this information be accurate. Anddddd, as you might suspect, this is exactly where the new vulnerability comes into play.

It takes advantage of two potential issues in a SAML library implementation:

XML parsing issues and
Cryptographic signing issues in XMLDSIG (a specification which is used to generate and validate XML signatures)

XML Parsing Issues

XML can be quite complicated.

Let’s say, for a moment, that a SAML assertion contains a saml:NameID element like the one below:

<saml:NameID>not-an-admin@okta.com</saml:NameID>

What do you think happens to the XML tree when this element is parsed? It looks something like this:

NameID
|_ Text: not-an-admin@okta.com

Most SAML libraries will parse the saml:NameID element out of the XML tree, extract the last text node inside it, and use that value to identify the user logging in.

But… What happens if you break the saml:NameID element up such that it contains an XML comment?

<saml:NameID>not-an-<!-- this is a comment -->admin@okta.com</saml:NameID>

In this scenario, the XML tree, when parsed, will look like so:

NameID
|_ Text: not-an-
|_ Comment: this is a comment
|_ Text: admin@okta.com

You can probably see where this is headed: depending on the XML parsing logic used in the SAML library and on where you insert a comment, you can dramatically change the identity of the user that’s being logged in!

This is bad news for web security as a simple XML comment can cause a SAML implementation to incorrectly authenticate a user.
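
The tree shapes above can be reproduced with Python’s standard-library xml.dom.minidom, which keeps comment nodes in the parsed tree. This is a small sketch, not code from any real SAML library: child_summary and naive_last_text are hypothetical helpers, and the saml: prefix is dropped so the snippets parse without a namespace declaration.

```python
from xml.dom import minidom

CLEAN = "<NameID>not-an-admin@okta.com</NameID>"
TAMPERED = "<NameID>not-an-<!-- this is a comment -->admin@okta.com</NameID>"


def child_summary(xml_text):
    """Return (node kind, value) pairs for the children of the root element."""
    root = minidom.parseString(xml_text).documentElement
    kinds = {root.TEXT_NODE: "Text", root.COMMENT_NODE: "Comment"}
    return [(kinds.get(c.nodeType, "Other"), c.nodeValue) for c in root.childNodes]


def naive_last_text(xml_text):
    """Hypothetical, UNSAFE extraction: trust only the last text node."""
    root = minidom.parseString(xml_text).documentElement
    texts = [c.data for c in root.childNodes if c.nodeType == c.TEXT_NODE]
    return texts[-1]


print(child_summary(TAMPERED))
# [('Text', 'not-an-'), ('Comment', ' this is a comment '), ('Text', 'admin@okta.com')]

print(naive_last_text(CLEAN))     # not-an-admin@okta.com
print(naive_last_text(TAMPERED))  # admin@okta.com
```

The comment splits one text node into two, and an extraction routine that looks at only one of them ends up with a different user than the one the identity provider asserted.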

Cryptographic Signing Issues

The second issue the new SAML vulnerability takes advantage of is the relatively “weak” protection provided by XMLDSIG’s canonicalization algorithms that are used to create and validate XML signatures.

Typically, when a SAML assertion is created, the assertion element itself is cryptographically signed. This helps the service provider (e.g., Salesforce) trust that the user they’re about to log in is valid. Unfortunately, however, several different canonicalization algorithms may be used when XML signatures are created, and most of them are not well suited to creating tamper-proof data.

Let’s take some basic XML, for example:

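<p   class = 'greeting'  >hi</p  >

The XML document above would have the exact same cryptographic signature as the XML document below:

<p class="greeting">hi</p>

This is because canonicalization discards details that don’t change what the document means. Before a signature is created, the XML document is converted to a canonical form: extra whitespace inside tags is removed, attribute quoting is normalized, and so on. (Whitespace inside text content, by contrast, is preserved.)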

What this new vulnerability takes advantage of, specifically, is the fact that most canonicalization algorithms also don’t care about comments. This means that the following two XML documents (while vastly different when parsed, as noted in the previous section) will have identical cryptographic signatures, allowing an attacker to sneak a comment into a SAML assertion and bypass authentication without ever raising any red flags in signature checks. :(

<saml:NameID>not-an-<!-- this is a comment -->admin@okta.com</saml:NameID>

and

<saml:NameID>not-an-admin@okta.com</saml:NameID>

While it is theoretically possible for all SAML vendors to only allow a specific canonicalization algorithm which DOES leave XML documents unmodified (so that they contain things like whitespace, comments, etc.), in practice this would be hard to enforce across vendors and implementations.
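
The effect of comment handling can be sketched with Python’s standard library. Two assumptions apply: xml.etree.ElementTree.canonicalize (Python 3.8+) implements C14N 2.0, used here as a stand-in for whichever canonicalization algorithm a given SAML deployment actually uses, and a plain SHA-256 digest of the canonical form stands in for the full XMLDSIG signature. The digest helper is hypothetical, and the saml: prefix is dropped so the snippets parse without a namespace declaration.

```python
import hashlib
from xml.etree.ElementTree import canonicalize  # available in Python 3.8+

ORIGINAL = "<NameID>not-an-admin@okta.com</NameID>"
TAMPERED = "<NameID>not-an-<!-- this is a comment -->admin@okta.com</NameID>"


def digest(xml_text, with_comments=False):
    """Hypothetical stand-in for signing: hash the canonical form of the XML."""
    canonical = canonicalize(xml_text, with_comments=with_comments)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Comments are dropped during canonicalization, so the digests match and the
# injected comment is invisible to the signature check.
print(digest(ORIGINAL) == digest(TAMPERED))  # True

# A comment-preserving canonicalization would notice the change.
print(digest(ORIGINAL, with_comments=True) == digest(TAMPERED, with_comments=True))  # False
```

The first comparison is the heart of the problem: the signature covers the comment-free form, while the parser may hand the SAML library a tree that still contains the comment.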

To read more about XML canonicalization issues you might want to check out the Wikipedia article on the topic.

How to Protect Yourself from the New SAML Authentication Bypass Vulnerability

If your company is using SAML, you will want to check with your SAML vendors to ensure they are not susceptible to this new vulnerability. Okta, by the way, is not vulnerable. >:)

If you’re a developer building systems to integrate with SAML vendors or single sign-on providers, try to NOT integrate using SAML if possible. Instead: use a more modern identity standard such as OpenID Connect.

OpenID Connect is a lot simpler than SAML, doesn’t use XML or XMLDSIG, and is far less susceptible to attacks like these since it doesn’t rely on XML parsers. There’s a long, complex history with XML and SAML – many of the SAML vulnerabilities that have been found over the last ~15 years are centered around XML parsing (just like this one).

Take, for instance, XML Signature Wrapping attacks, which have been known about since ~2002 but were still finding their way into modern SAML implementations as recently as 2014.

If possible, avoid SAML and use OpenID Connect whenever you can. If you must use SAML, check your SAML library (and its underlying XML parsing/validation/signature libraries) to ensure it is NOT vulnerable to the new attack.

Duo has graciously audited several popular SAML libraries and found that:

OneLogin’s python-saml,
OneLogin’s ruby-saml,
Clever’s saml2,
omniauth-saml, and
Shibboleth’s openSAML C++

are all vulnerable, with many other libraries also likely to be affected.

If you are the author of a SAML library, there are two important steps you can take immediately to protect your users:

Look for an option in whatever XML library you’re using to remove all comments when creating and parsing XML documents. By strategically purging all XML comments upfront you’ll be able to avoid this issue entirely by never allowing a comment to leak into your XML tree.

If the XML library you’re using doesn’t support purging comments, you can always work with the XML library maintainers to add this functionality or swap to a new XML library.

When parsing nodes from an XML element tree using your XML parsing library, immediately bail out if you detect that more than one child node exists for SAML nodes. This way, if you do stumble across a saml:NameID node (or any other SAML node) that contains more than one child, you won’t make the mistake of using just the first or last child values.

You should also avoid concatenating child values together as there may be a potential way to attack these values in the future.
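
The "bail out" advice above could look something like the following Python sketch, using the standard-library xml.dom.minidom. The names strict_text and AmbiguousNodeError are hypothetical, not part of any SAML library, and the saml: prefix is dropped so the snippets parse without a namespace declaration.

```python
from xml.dom import Node, minidom


class AmbiguousNodeError(ValueError):
    """Raised when an element does not contain exactly one text node."""


def strict_text(element):
    """Return the element's text only if it has a single text child.

    Anything else (comments, nested elements, several text nodes, no
    children at all) is rejected rather than guessed at or concatenated.
    """
    children = element.childNodes
    if len(children) != 1 or children[0].nodeType != Node.TEXT_NODE:
        raise AmbiguousNodeError(
            f"<{element.tagName}> has {len(children)} child nodes; expected one text node"
        )
    return children[0].data


clean = minidom.parseString("<NameID>not-an-admin@okta.com</NameID>").documentElement
print(strict_text(clean))  # not-an-admin@okta.com

tampered = minidom.parseString(
    "<NameID>not-an-<!-- this is a comment -->admin@okta.com</NameID>"
).documentElement
try:
    strict_text(tampered)
except AmbiguousNodeError as error:
    print("rejected:", error)
```

Failing closed like this means a tampered assertion results in a failed login instead of a login as the wrong user.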

Be safe out there.

If you have any questions, comments, or suggestions: please drop us a note in a comment below or hit us up on Twitter. Or, if you’re interested in other security pieces like this, you may enjoy reading some of the articles on our new security site.

Randall Degges

Randall Degges runs Evangelism at Okta where he works on security research, development, and education. In his spare time, Randall writes articles and gives talks advocating for security best practices. Randall also builds and contributes to various open-source security tools.

Randall's areas of expertise include Python, JavaScript, and Go development, web security, cryptography, and infrastructure security. Randall has been writing software for ~20 years and has built some of the most-used API services on the internet.
