A Crash Course in OAuth Demonstrating Proof of Possession (DPoP)

Proof? You can’t handle the proof!

James Collerton
Sep 6, 2023
Stopping people taking your tokens

Audience

This article is aimed at developers with a very solid understanding of OAuth 2.0. My article here should give you a good start.

You will also need a good understanding of how JWTs work, which you can get from reading my article here.

Within the piece we are going to be exploring Demonstrating Proof of Possession, or DPoP.

Previously, delegated authorization worked like a hotel key. Anyone who had the key (access token) could get in the room (access protected resources).

With sender-constraining mechanisms such as DPoP or mTLS there is an additional identity check on the door: you must prove the key was issued to you (the client).

This prevents just anyone using it!

Argument

OAuth 2.0 runs over HTTP, and DPoP manifests itself as an additional HTTP header on certain calls. This header is a JWT. The JWT allows the authorization server to constrain access tokens or authorization codes so they can only be used by someone who can prove they hold the private key in a specified public/private key pair.

Previously, if someone stole your access token they could go around using it pretending to be you! With DPoP they would also need your private key, which is a lot harder to get at.

DPoP is particularly useful for single-page applications, which can generate a key pair on the fly.

It’s important to recognise DPoP is not a client authentication method (like a client secret).

The Overall Flow

The overall flow of DPoP

This diagram represents everything we’re going to try and explain throughout the article. I’m putting it first so you can get the gist and have something to refer back to as we explore the different parts.

The DPoP Proof JWT

This is the core data structure. As we covered in the overview, DPoP is a way of tying tokens to a private key. The proof is how the client communicates the public key that matches its private key, so the server can make the connection between the two. It is sent in an HTTP header named DPoP.

There are three components to a JWT: the header, the payload and the signature.

In DPoP the header contains the JWT type (typ), the signing algorithm (alg) and the public key (jwk) that verifies the signature. The payload contains a unique ID (jti), the HTTP method (htm) and target URI (htu) of the request the proof is attached to, and the time the proof was created (iat). Tying the proof to one method and URI prevents it being replayed against a different endpoint.
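To make the structure concrete, here is a minimal sketch of how a client might assemble the header and payload of a proof. It uses only the Python standard library, which has no asymmetric signing, so it stops at the string that would be signed. The function names, the choice of ES256 and the placeholder key are assumptions for illustration, not part of the spec.

```python
import base64
import json
import time
import uuid


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def encode_part(part: dict) -> str:
    """Serialise one JWT part (header or payload) compactly and encode it."""
    return b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))


def dpop_signing_input(public_jwk: dict, method: str, url: str) -> str:
    """Return '<header>.<payload>' for a DPoP proof, ready to be signed."""
    header = {
        "typ": "dpop+jwt",  # marks this JWT as a DPoP proof
        "alg": "ES256",     # assumed algorithm; it must be asymmetric
        "jwk": public_jwk,  # the public key only, never the private part
    }
    payload = {
        "jti": str(uuid.uuid4()),                # unique ID for this proof
        "htm": method.upper(),                   # HTTP method of the request
        "htu": url.split("#")[0].split("?")[0],  # target URI, no query or fragment
        "iat": int(time.time()),                 # creation time in seconds
    }
    return f"{encode_part(header)}.{encode_part(payload)}"


# Illustrative public key; the coordinates are placeholders, not a real key.
example_jwk = {"kty": "EC", "crv": "P-256", "x": "placeholder-x", "y": "placeholder-y"}
signing_input = dpop_signing_input(example_jwk, "POST", "https://server.example.com/token")
# The client would now sign `signing_input` with its private key and append
# the base64url-encoded signature as the third dot-separated part.
```

In practice you would reach for a JOSE library to do the signing rather than building the JWT by hand.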

If you refer back to the flow diagram, there are two main places we will use a DPoP Proof JWT:

  1. When requesting an access token, to tie it to a private key.
  2. When using an access token, to prove we have the private key.

In the latter case the JWT payload must also contain a hash of the access token sent along with the request (the ath claim).

This addresses the case where a malicious party gets hold of an access token and a proof that are tied to the same key pair, but where the proof was created for a different token.
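The access token hash is simple to compute: it is the base64url encoding, without padding, of the SHA-256 digest of the token’s ASCII value. A minimal sketch, where the function name and the sample token are made up for illustration:

```python
import base64
import hashlib


def access_token_hash(access_token: str) -> str:
    """Compute the value a client would put in the proof's `ath` claim."""
    digest = hashlib.sha256(access_token.encode("ascii")).digest()
    # Base64url without '=' padding, as used throughout JWTs
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


# Hypothetical token value, purely for illustration
ath = access_token_hash("example-access-token")
```

The resource server runs the same calculation over the token it received and rejects the request if the result differs from the ath in the proof.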

The final payload field is the nonce. We will cover that comprehensively in a following section. For the time being it’s sufficient to understand it’s a way of limiting the lifetime of a proof.

A complete description of the proof contents is here.

How do we bind a public key to an access token?

Binding the public key to an access token associates the two, and means the latter can only be used with a DPoP proof signed by the private key.

If you consult the previous flow diagram you can see there are two options for binding the public key to an access token:

  1. Transparently, as a claim in a JWT access token. The value is a base64url encoding of the JWK SHA-256 Thumbprint of the public key, and it is carried as the jkt member of the cnf (confirmation) claim.
  2. Opaquely by returning the exact same information from the token introspection endpoint.

Both of these allow the resource server to make sure the public key the client provides in the DPoP proof:

  1. Verifies the proof’s signature, which was made with the private key.
  2. Matches the public key that was provided when the access token was issued.
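The thumbprint itself follows RFC 7638: take only the required members of the JWK, serialise them as JSON with sorted keys and no whitespace, hash with SHA-256 and base64url-encode the result. Below is a sketch of that calculation and of the comparison a resource server might make against a JWT access token. The function names and the placeholder key are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json

# Members that make up the thumbprint for each key type
REQUIRED_MEMBERS = {
    "EC": ("crv", "kty", "x", "y"),
    "RSA": ("e", "kty", "n"),
    "OKP": ("crv", "kty", "x"),
}


def jwk_thumbprint(jwk: dict) -> str:
    """Compute the base64url-encoded SHA-256 thumbprint of a public JWK."""
    members = REQUIRED_MEMBERS[jwk["kty"]]
    canonical = json.dumps(
        {name: jwk[name] for name in members},
        sort_keys=True,          # lexicographic member order
        separators=(",", ":"),   # no whitespace
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def token_bound_to_key(access_token_claims: dict, proof_jwk: dict) -> bool:
    """Check the token's cnf.jkt against the key presented in the proof."""
    bound_jkt = access_token_claims.get("cnf", {}).get("jkt", "")
    presented = jwk_thumbprint(proof_jwk)
    return hmac.compare_digest(bound_jkt.encode(), presented.encode())


# Placeholder key, not a real one
example_jwk = {"kty": "EC", "crv": "P-256", "x": "placeholder-x", "y": "placeholder-y"}
example_claims = {"sub": "someone", "cnf": {"jkt": jwk_thumbprint(example_jwk)}}
```

For an opaque token the comparison is the same, only the cnf value would come from the introspection response rather than from the token itself.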

How do we bind a public key to an authorization code?

Binding a public key to an authorization code is a way of making sure the initial request for the authorization code comes from the same client as the eventual request for a token.

We do this by sending the same JWK SHA-256 Thumbprint of the public key (jkt) as before. It is sent as a dpop_jkt query parameter on the request to the /authorize endpoint that generates the authorization code.

Then, when the client redeems the authorization code for an access token at the /token endpoint, the authorization server must calculate the thumbprint of the public key in the provided proof and compare it to the jkt given at the start.
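As a rough sketch, the two halves of this look something like the following. The client adds dpop_jkt to the authorization request, and the authorization server later compares the stored value with the thumbprint of the key in the proof. The function names and example URLs are hypothetical, and a real request would carry more parameters (state, PKCE and so on) than shown here.

```python
import hmac
from urllib.parse import urlencode


def build_authorize_url(endpoint: str, client_id: str, redirect_uri: str, dpop_jkt: str) -> str:
    """Client side: attach the key thumbprint to the authorization request."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "dpop_jkt": dpop_jkt,  # thumbprint of the client's public key
    }
    return f"{endpoint}?{urlencode(params)}"


def code_matches_proof_key(stored_dpop_jkt: str, proof_key_thumbprint: str) -> bool:
    """Server side: compare the jkt stored with the code to the proof's key.

    `proof_key_thumbprint` is assumed to have been computed from the jwk in
    the DPoP proof sent to the token endpoint.
    """
    return hmac.compare_digest(stored_dpop_jkt.encode(), proof_key_thumbprint.encode())


url = build_authorize_url(
    "https://server.example.com/authorize",
    "example-client",
    "https://app.example.com/callback",
    "example-thumbprint",
)
```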

What makes a valid DPoP proof JWT?

Earlier we saw what a client puts into a proof. In this section we’ll outline how an authorization or resource server might check one. The exhaustive list is here, but the main checks are:

  1. All the required claims described earlier are present.
  2. The HTTP method and target URI match the request that carried the proof.
  3. The signature verifies with the public key contained in the header.
  4. The nonce, if the server requires one, is valid.
  5. If there is an access token, the hash in the proof matches the token sent in the request, and the public key bound to the access token matches the public key from the DPoP proof.
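The claim checks above can be sketched as a single function over an already-decoded header and payload. This is a simplified illustration: it assumes the signature has been verified elsewhere, it skips the comparison against the key bound to the access token, and the allowed algorithms, the 60-second window and the in-memory set of seen IDs are assumed server policy rather than anything the spec mandates.

```python
import base64
import hashlib
import time

ALLOWED_ALGS = {"ES256", "RS256", "PS256", "EdDSA"}  # assumed server policy
MAX_AGE_SECONDS = 60                                  # assumed acceptance window


def check_proof_claims(header, payload, method, url, seen_jti,
                       access_token=None, expected_nonce=None, now=None):
    """Return a list of problems; an empty list means the claim checks passed."""
    now = time.time() if now is None else now
    problems = []

    if header.get("typ") != "dpop+jwt":
        problems.append("typ must be dpop+jwt")
    if header.get("alg") not in ALLOWED_ALGS:
        problems.append("alg is not an allowed asymmetric algorithm")
    jwk = header.get("jwk")
    if not isinstance(jwk, dict) or "d" in jwk:
        problems.append("jwk must be present and must be a public key")

    if payload.get("htm") != method:
        problems.append("htm does not match the request method")
    # Simplified comparison: strip query and fragment, no further normalisation
    if payload.get("htu") != url.split("#")[0].split("?")[0]:
        problems.append("htu does not match the request URI")

    iat = payload.get("iat")
    if not isinstance(iat, (int, float)) or abs(now - iat) > MAX_AGE_SECONDS:
        problems.append("iat is missing or outside the acceptance window")

    jti = payload.get("jti")
    if not jti:
        problems.append("jti is missing")
    elif jti in seen_jti:
        problems.append("jti has been seen before")
    else:
        seen_jti.add(jti)  # remember it so the proof cannot be replayed

    if expected_nonce is not None and payload.get("nonce") != expected_nonce:
        problems.append("nonce is missing or not the expected value")

    if access_token is not None:
        digest = hashlib.sha256(access_token.encode("ascii")).digest()
        expected_ath = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
        if payload.get("ath") != expected_ath:
            problems.append("ath does not match the presented access token")

    return problems
```

Returning a list rather than raising on the first failure is just a choice to make the sketch easy to experiment with. A real server would reject the request as soon as any check fails.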

But this token isn’t just ‘bearer’ any more!

Something you may have noticed is this is no longer a bearer token. It is not enough just to have the token; we also need a key to match. Therefore it no longer makes sense to use the Bearer authentication scheme, and instead we introduce a new one, DPoP. The client sends the token in the Authorization header with the DPoP scheme, alongside the proof in the DPoP header.
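On the wire this means a protected resource request carries two headers: Authorization with the DPoP scheme, and DPoP with the proof. A tiny sketch, where the helper name and the sample values are made up:

```python
def dpop_request_headers(access_token: str, proof_jwt: str) -> dict:
    """Build the two headers for a DPoP-protected resource request."""
    return {
        "Authorization": f"DPoP {access_token}",  # scheme is DPoP, not Bearer
        "DPoP": proof_jwt,                        # the signed proof JWT
    }


# Placeholder values for illustration only
headers = dpop_request_headers("example-access-token", "header.payload.signature")
```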

The role of the nonce

The nonce stops DPoP proofs being created far in advance or used for too long.

There are two places a nonce can be required or issued:

  1. The authorization server: If the authorization server requires a nonce and a client requests an access token without one in their DPoP proof, the server returns an error (use_dpop_nonce) and a nonce to use in a DPoP-Nonce header. The client must then include the nonce in a new proof, re-sign it (to show they still hold the private key) and request again.
  2. The resource server: Exactly the same as above, except the nonce is scoped to the resource server.

We do this so that if a malicious party temporarily gains control of a client, they can only generate usable proofs for a limited time. Once they lose control of the client they can no longer sign proofs containing newly issued nonces.

I’m sure you’ll have picked up on the inefficiency of the ‘wait-till-I-get-an-error’ methodology. The spec also defines a more efficient manner of procuring a new nonce, where the server supplies it in a DPoP-Nonce header on its response to a valid request that used the old one.
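From the client’s side, both behaviours can be handled in one place: remember whatever nonce the server last sent, and retry once if the server asks for one. The sketch below models this without any real HTTP. The class name, the shape of the response tuple and the make_proof and send callables are all hypothetical, and header lookup is simplified to exact names.

```python
from typing import Callable, Dict, Optional, Tuple

# A response is modelled as (status, headers, JSON body); a hypothetical shape
Response = Tuple[int, Dict[str, str], Dict[str, str]]


class NonceAwareClient:
    """Tracks the latest nonce from ONE server and retries once when asked."""

    def __init__(self,
                 make_proof: Callable[[Optional[str]], str],
                 send: Callable[[Dict[str, str]], Response]):
        self.make_proof = make_proof  # builds and signs a proof with the given nonce
        self.send = send              # performs the request with the given headers
        self.nonce: Optional[str] = None

    def _attempt(self) -> Response:
        response = self.send({"DPoP": self.make_proof(self.nonce)})
        # Pick up a fresh nonce from any response, successful or not
        fresh = response[1].get("DPoP-Nonce")
        if fresh:
            self.nonce = fresh
        return response

    def request(self) -> Response:
        response = self._attempt()
        status, headers, body = response
        demanded = (
            body.get("error") == "use_dpop_nonce"                        # token endpoint
            or "use_dpop_nonce" in headers.get("WWW-Authenticate", "")   # resource server
        )
        if status in (400, 401) and demanded and "DPoP-Nonce" in headers:
            response = self._attempt()  # new proof, signed with the new nonce
        return response
```

Keeping one instance per server keeps authorization server and resource server nonces from being mixed up.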

We also don’t want to confuse nonces, so each one should only be used with the server (resource or authorization) that issued it.

Conclusion

In conclusion we have covered the main ideas behind Demonstrating Proof of Possession. However, there are many others, including its use with pushed authorization requests, and authorization server and client registration metadata. Dig into the original RFC to find out!

James Collerton

Senior Software Engineer at Spotify, Ex-Principal Engineer at the BBC