Acquiring and Using Access Tokens in OAuth2.0
Basic Choreographies🕺
Audience
In a previous article we gave an overview of OAuth2.0 and its use case. Here we explore how the framework is implemented and how it functions in practice.
We assume a solid understanding of the HTTP protocol and familiarity with core OAuth2.0 concepts. The latter can be gleaned from the previous article.
This is designed as a practical introduction to the major OAuth2.0 flows. For comprehensive coverage please refer to the original specification here.
Argument
Before we continue, let’s clarify the core entities in OAuth2.0.
The Core OAuth2.0 Entities
- Resource Owner: An entity that can grant access to a protected resource. For example, a user (the resource owner) whose email address (their resource) is stored with Google can give a third-party application access to it.
- Resource Server: The server holding the resource owner’s resource. It must be able to accept access token-bearing requests for the resource. In our example this would be a server owned by Google.
- Client: This is any application making requests for the protected resource on behalf of the owner. Remember, the idea of OAuth2.0 is that the owner delegates authorisation to another application to access their protected resources.
- Authorization Server: This is responsible for issuing access tokens to the client after the resource owner has authenticated and granted access.
Registering a Client with the Authorisation Server
The first thing we need to do is register our client with the authorisation server. How this happens isn’t defined directly in the spec; however, there are the following requirements:
- You must specify a client type (covered shortly).
- You must provide the client redirection URIs (also covered shortly).
- You must include any other information the authorization server needs (client name, description etc.).
As you can register multiple clients with an authorisation server, the authorisation server needs a way of differentiating and authenticating them. This is the role of client credentials.
The different client types are based on the ability of a client to keep their credentials safe.
- Confidential: Clients who can keep their client credentials safe. One example would be a service running on a secure server. Here the client secrets would only be available to the privileged few who can access the box.
- Public: Clients who can’t keep their client credentials safe. Examples include a native or browser-based app. In a native app the secret would need to be written into the code, which could be extracted. In a browser-based app we would somehow need to send the credentials to the browser, where they could also be exposed.
The OAuth2 spec is aimed at the following client groups:
- Web Application (Confidential): Client sits on a web server, exactly as in the confidential client example.
- User-Agent Based Application (Public): Client downloaded from a web server and executed on the resource owner’s device. A good example is a web page downloaded and executed on the browser (as referred to in the public client example).
- Native Application (Public): Client installed and executed on the resource owner’s device (again, referred to in the public client example, mobile and desktop apps offer concrete illustrations).
So what are client credentials?
Overall they are a way for a client to identify itself to the authorisation server. Both public and confidential clients are issued a client_id, but only confidential clients (ones who can keep a secret!) are issued a secure method of authenticating.
The most common method of secure authentication is using a client_secret. The Id in itself isn’t a secret and can be put into apps, sent to the browser etc. However, the client_secret must be kept hidden.
The Id can be sent as part of the URL, but the original OAuth spec recommends sending the secret (if you have one) using the HTTP Basic authentication scheme.
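As a minimal sketch of the HTTP Basic scheme mentioned above, the snippet below builds the Authorization header value from a client_id and client_secret. The function name and the credentials are hypothetical and only there for illustration.

```python
import base64
from urllib.parse import quote_plus


def basic_auth_header(client_id: str, client_secret: str) -> str:
    """Build the value of an HTTP Basic Authorization header for a client."""
    # RFC 6749 section 2.3.1 asks for each value to be form-urlencoded
    # before the two are joined with a colon and base64-encoded.
    pair = f"{quote_plus(client_id)}:{quote_plus(client_secret)}"
    return "Basic " + base64.b64encode(pair.encode("utf-8")).decode("ascii")


# Hypothetical credentials, for illustration only.
header = basic_auth_header("example-client", "s3cr3t")
```

The resulting string would be sent as the Authorization header on requests to the token endpoint.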
The other point we said we would cover was the redirection URIs. Let’s explore the below diagram.
Ultimately what we want from OAuth is to be able to get tokens into our client such that we can access the protected resources we need. To do this we need to pass over to our authorisation server somehow, allow it to issue us the tokens, then redirect back into our application.
This is the role of redirection URIs. We need to tell our authorisation servers where we might go back to once we have our tokens. This prevents malicious parties redirecting in a nefarious manner.
One interesting thing to note is the use of the state parameter. Imagine you are on a news site and visiting a specific news article page. If you log in from that page you most likely want to be redirected back to that same page.
However, according to the spec so far, you would need to register every single page and every single news article as redirect URIs with the authorisation server!
The state parameter solves this. The client sends a state value along with the redirection URI, and the authorisation server returns it unchanged to the client app in the redirect, so a single registered URI can serve every page.
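To make the round trip concrete, here is a rough sketch of how a client might attach a state value to the authorisation request and read it back on the redirect. The URLs, client_id and the in-memory pending dictionary are all assumptions for the example; the query parameter names follow the authorisation code request in the spec.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

AUTHORISE_URL = "https://auth.example.com/authorise"  # hypothetical endpoint


def build_authorisation_url(client_id: str, redirect_uri: str, scope: str, state: str) -> str:
    """Build the URL the user's browser is sent to in order to grant access."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    }
    return f"{AUTHORISE_URL}?{urlencode(params)}"


def read_redirect(redirect_url: str, expected_state: str) -> str:
    """Return the authorisation code, rejecting redirects with the wrong state."""
    query = parse_qs(urlparse(redirect_url).query)
    returned_state = query.get("state", [""])[0]
    if not secrets.compare_digest(returned_state.encode(), expected_state.encode()):
        raise ValueError("state mismatch")
    return query["code"][0]


# Remember which page the user was on, keyed by an unguessable state value.
state = secrets.token_urlsafe(16)
pending = {state: "/news/article-42"}

url = build_authorisation_url(
    "example-client", "https://app.example.com/callback", "email", state
)

# Simulated redirect back from the authorisation server.
code = read_redirect(f"https://app.example.com/callback?code=abc123&state={state}", state)
return_to = pending.pop(state)
```

Using a random key rather than the page path itself as the state value is a design choice in this sketch: it keeps the value opaque and lets the client reject redirects it did not initiate.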
Endpoints
There are three endpoints we need to worry about. On the authorisation server side there is:
The Authorisation Endpoint: This handles the process whereby the user gives consent to having their resources shared (issues an authorisation grant). This includes limiting the scope of the resource-sharing.
Somehow the authorisation endpoint needs to verify the identity of the resource owner. This could be done in multiple ways: making them log in, asking them to enter an authentication code, or checking their cookies. The OAuth2.0 spec doesn’t care how it happens, just that it does.
The Token Endpoint: This is responsible for issuing access tokens when given either an authorization grant or a refresh token.
On the client side there is:
The Redirection endpoint: This is the URI mentioned earlier, which we return to with our credentials.
The Basic Choreography
We outline the basic steps to acquire and use a working access token below.
- The client makes a request to the resource owner to get their authorisation to access their data.
- The resource owner issues them an authorisation grant. This differs depending on the grant type, which we will cover in the following section.
- The client authenticates with the authorisation server (using the client credentials) and tries to use the authorisation grant to get an access token.
- If the authorisation grant is valid, the authorisation server gives the client an access token.
- The client makes a request to the resource server along with its new access token. Note that the spec does not define how the access token is used; there are different options.
- If the access token is valid the resource server returns the resource. Note that the resource server is also responsible for checking the scope of the access token to make sure that it is sufficient to access the requested resource.
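The last two steps can be sketched as follows. One common option for presenting the token is the Bearer scheme in the Authorization header, and scopes are conventionally a space-delimited string. The resource URL, token value and function names here are assumptions for illustration, and no request is actually sent.

```python
from urllib.request import Request

RESOURCE_URL = "https://api.example.com/me/email"  # hypothetical resource


def build_resource_request(access_token: str) -> Request:
    """Client side: attach the access token as a bearer token."""
    return Request(RESOURCE_URL, headers={"Authorization": f"Bearer {access_token}"})


def scope_is_sufficient(granted_scope: str, required: set) -> bool:
    """Server side: check every required scope appears in the granted scope string."""
    return required.issubset(granted_scope.split())


request = build_resource_request("example-access-token")
allowed = scope_is_sufficient("email profile", {"email"})
denied = scope_is_sufficient("profile", {"email"})
```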
So far so good! As long as you understand the flow of things we’re off to a great start.
Different Grant Types
Now we move onto the grant types in detail. The reason they are called ‘grant types’ is that they reflect the different authorisation grants we can receive as part of the above flow.
The eagle-eyed amongst you will notice that the original spec has four flows: authorisation code, implicit, resource owner password credentials, and client credentials.
However, the implicit flow has since been replaced by authorisation code with Proof Key for Code Exchange (PKCE), and the resource owner password credentials flow has been deprecated, so we will ignore them.
Instead we will cover the authorisation code (with and without PKCE), refresh token, and client credentials grants.
Authorisation Code Grant
Both with and without PKCE, the authorisation code grant is used to acquire access (and optionally refresh) tokens. The PKCE extension is used when we have a public client, but isn’t needed for confidential ones (although it’s still recommended).
In each of the following sections we will go over the flow of requests. For their exact contents (the parameters they use, etc.) I will link to the relevant section of the original spec document.
The plain authorisation code details can be found here. Relevant additional information for PKCE can be found here.
You will notice that the authorisation code grant requires redirects, so whichever client we use must be capable of interacting with the resource owner’s user-agent (formally defined in the previous client groups section, but usually a web browser) and of receiving incoming redirect requests from the authorisation server.
Although this may seem a bit abstract, it is purposefully so. We can implement the flow in a variety of ways, but to make the ideas concrete let’s think about this in terms of two applications: Example App (client) and Authorise Company (providing the authorisation server).
You are the resource owner visiting the Example App site, when you carry out some interaction that requires Example App to access your email address, which you store on Authorise Company’s servers. Example App would like to query Authorise Company’s resource server to get your email address. But first, it requires an access token.
Example App redirects you in the browser to Authorise Company’s /authorise endpoint, where you authenticate and give Example App access to your email address. Authorise Company then redirects you back to an Example App URI, supplying an authorisation code along with it.
Note: how this redirection happens is not specified in the RFC; it does not necessarily use the /authorise or /token endpoints.
Your browser loads the Example App URI which uses the authorisation code to retrieve an access and refresh token, which can now be used by the Example App client to get your email address!
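As a rough sketch of that final exchange, the snippet below builds (but does not send) the token request a confidential client might make, then parses a made-up response. The token URL, credentials and token values are hypothetical; the form fields and response keys follow the names used in the spec.

```python
import base64
import json
from urllib.parse import urlencode
from urllib.request import Request

TOKEN_URL = "https://auth.example.com/token"  # hypothetical endpoint


def build_code_exchange(code: str, redirect_uri: str, client_id: str, client_secret: str) -> Request:
    """Build the POST that swaps an authorisation code for tokens."""
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
    }).encode("ascii")
    # Assumes the id and secret contain only URL-safe characters.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


exchange = build_code_exchange(
    "abc123", "https://app.example.com/callback", "example-client", "s3cr3t"
)

# A made-up response body, shaped like a successful token response.
sample_response = json.dumps({
    "access_token": "example-access-token",
    "token_type": "Bearer",
    "expires_in": 3600,
    "refresh_token": "example-refresh-token",
})
tokens = json.loads(sample_response)
```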
Authorisation codes don’t even need to be stored. It is enough to sign and encrypt them, as we can use this to verify their origin.
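One way to picture a code that needs no storage is a signed, self-describing value. The sketch below only signs the code with an HMAC and omits the encryption step, so treat it as an illustration of the verification idea rather than a complete scheme. The key, claim names and lifetime are assumptions. Note also that a stateless code cannot by itself detect that it has been redeemed twice.

```python
import base64
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-signing-key"  # hypothetical; load from secure config in practice


def _sign(payload: bytes) -> bytes:
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()


def issue_code(client_id: str, user_id: str, now: float, ttl: float = 60.0) -> str:
    """Issue a signed code carrying its own client, user and expiry."""
    payload = json.dumps({"cid": client_id, "uid": user_id, "exp": now + ttl}).encode("utf-8")
    parts = (payload, _sign(payload))
    return ".".join(base64.urlsafe_b64encode(p).decode("ascii") for p in parts)


def verify_code(code: str, now: float):
    """Return the claims if the signature is valid and the code is unexpired, else None."""
    try:
        payload_b64, signature_b64 = code.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return None
    if not hmac.compare_digest(signature, _sign(payload)):
        return None
    claims = json.loads(payload)
    return claims if claims["exp"] > now else None


code = issue_code("example-client", "user-1", now=1000.0)
claims = verify_code(code, now=1030.0)
expired = verify_code(code, now=1061.0)

# Swapping in a different payload while keeping the old signature must fail.
forged = issue_code("example-client", "mallory", now=1000.0).split(".")[0] + "." + code.split(".")[1]
```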
The next big question is, how does PKCE change this? If we have a confidential client, then it can authenticate itself to the authorisation server using its client secret. If we have a public one then it can’t do so.
When this is the case the client creates a secret (the code_verifier) and does a one-way transformation on it to create a code_challenge. When the client asks for the authorisation code they send the code_challenge and the authorisation server stores it.
On redeeming the authorisation code they send the initial code_verifier and the authorisation server performs the transformation. Finally it compares the transformed verifier and challenge to make sure they are equal!
This is obviously very high level. For more detail you’ll need to dive into the original spec here.
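As a small illustration of the verifier and challenge, the sketch below assumes the S256 transformation from the PKCE spec (SHA-256 followed by unpadded URL-safe base64). The function names are my own, and the server-side check is reduced to a single comparison.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Client: create a high-entropy secret (43 URL-safe characters)."""
    return secrets.token_urlsafe(32)


def make_code_challenge(verifier: str) -> str:
    """One-way transformation: SHA-256, then base64url without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verifier_matches(verifier: str, stored_challenge: str) -> bool:
    """Server: transform the presented verifier and compare with the stored challenge."""
    return secrets.compare_digest(make_code_challenge(verifier), stored_challenge)


# Client side, before the authorisation request.
code_verifier = make_code_verifier()
code_challenge = make_code_challenge(code_verifier)

# Server side, when the code is redeemed.
accepted = verifier_matches(code_verifier, code_challenge)
rejected = verifier_matches(make_code_verifier(), code_challenge)
```

Because only the challenge travels with the first request, someone who intercepts the authorisation code still cannot redeem it without the original verifier.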
Refresh Token
Although we’ve covered the notion of refresh tokens, we haven’t looked too hard at how they are used to retrieve an access token. Initially, let’s examine why they are useful.
Access tokens are bearer tokens, in that if you’ve got it, you can use it (hence Authorization: Bearer). There’s no verification of the person employing it. This means we need to be extra careful it doesn’t get stolen and misused.
One of the methods to prevent this is to introduce a short time to live (TTL). If bad actors get hold of it, they won’t be able to use it for long. However, if we make the user log in each time they want a new access token, that’s bad user experience.
To bridge the gap, we introduce the refresh token. When our user logs in they are issued an access and refresh token, the latter of which they are asked to keep safe. If our access token exhausts its TTL they can use the refresh token to request a new one.
‘But wait!’ I hear you cry, surely the refresh token suffers from the same issue, if a bad actor gets hold of it they can use it to get new access tokens!
To an extent this is true, but you’re not sending the refresh token with every authorised request (like the access token). Additionally, you’re only exchanging it in one place, with the authorisation server.
Another layer of protection some implementers add to the refresh token is to make it single use. Whenever you get a new access token you get a new refresh token too. However, beware that this sometimes removes your capacity to retry getting a new access token if this fails, as your refresh token will be used up. The user will need to log in again!
Now we have covered the motivation behind refresh tokens, let’s see how we might make use of them.
Initially, our client requests access from the resource owner, and is issued the authorisation grant, which they use to retrieve the access and refresh tokens. This is the same as the general flow. Our path then bifurcates.
In the case of a valid access token the flow acts as usual. However, when our access token is invalid (it has exceeded its TTL in this case), we send our refresh token to the authorisation server and receive a new access token (and, where the server chooses to issue one, a new refresh token).
We can then carry on as normal!
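The branch described above might look something like this on the client. It is a sketch only: the token dictionary layout, the 30-second leeway and the helper names are assumptions, and the request body is built rather than sent.

```python
from urllib.parse import urlencode


def is_expired(expires_at: float, now: float, leeway: float = 30.0) -> bool:
    """Treat the token as expired slightly early to allow for clock drift."""
    return now >= expires_at - leeway


def build_refresh_body(refresh_token: str) -> bytes:
    """Form body for a refresh request to the token endpoint."""
    return urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    }).encode("ascii")


def merge_token_response(current: dict, response: dict, now: float) -> dict:
    """Fold a token response into the client's stored tokens."""
    return {
        "access_token": response["access_token"],
        "expires_at": now + response.get("expires_in", 0),
        # Keep the old refresh token unless the server rotated it.
        "refresh_token": response.get("refresh_token", current["refresh_token"]),
    }


now = 1000.0
current = {"access_token": "old", "expires_at": 990.0, "refresh_token": "r1"}

if is_expired(current["expires_at"], now):
    body = build_refresh_body(current["refresh_token"])

# Two made-up responses: one rotating the refresh token, one not.
rotated = merge_token_response(
    current, {"access_token": "new", "expires_in": 3600, "refresh_token": "r2"}, now
)
kept = merge_token_response(current, {"access_token": "new", "expires_in": 3600}, now)
```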
Again, this is an overview. For full implementation details consult the spec here.
Client Credentials Grant
The final (and most straightforward) grant we will look at is the client credentials grant. This is only used for confidential clients, which request an access token by authenticating with the authorisation server (using their client credentials or otherwise). More information on how client authentication works is found in the spec here.
We make a single request to the /token endpoint, authenticating as described above. We can optionally include a list of scopes. From there the authorisation server authenticates the client and returns an access token if the client is allowed access.
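For completeness, here is a minimal sketch of that single request, built but not sent. The token URL, credentials and scope names are hypothetical, and the client is assumed to authenticate with HTTP Basic.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

TOKEN_URL = "https://auth.example.com/token"  # hypothetical endpoint


def build_client_credentials_request(client_id: str, client_secret: str, scopes=()) -> Request:
    """Build the token request for the client credentials grant."""
    params = {"grant_type": "client_credentials"}
    if scopes:
        # Scopes travel as a single space-delimited string.
        params["scope"] = " ".join(scopes)
    # Assumes the id and secret contain only URL-safe characters.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return Request(
        TOKEN_URL,
        data=urlencode(params).encode("ascii"),
        method="POST",
        headers={"Authorization": f"Basic {basic}"},
    )


with_scopes = build_client_credentials_request("example-client", "s3cr3t", ("email", "profile"))
without_scopes = build_client_credentials_request("example-client", "s3cr3t")
```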
Conclusion
In conclusion, we have briefly looked over the different ways we can acquire and use access tokens using OAuth2.0.