A Crash Course in the Grant Negotiation and Authorization Protocol (GNAP)

Flexible authorization for all!

James Collerton
16 min read · Aug 13, 2023
Apparently it also stands for the charisma hack ‘greeting, name, affiliation, purpose’

Audience

This article is aimed at those looking for an overview of the draft IETF GNAP core protocol.

It’s going to be very broad brushstrokes, but hopefully enough to give you a flavour of what GNAP is, and specifically how it differs to OAuth2.0.

The reason that the piece is written through a comparative lens is that the authorization paradigm most engineers (myself included) are familiar with is OAuth2.0. Therefore it can provide the base we build from.

This means you’ll need a fairly solid understanding of OAuth2.0. Luckily, this can be gleaned from an excellent article here. We’ll also be referencing some of its extensions, but won’t refer to them as separate entities.

Why GNAP?

GNAP was introduced to solve the same problems as OAuth2.0 and OIDC, and does so in a similar way. It still requests delegated authorization from a resource owner via a grant, and still utilises an authorization server.

I can immediately hear you asking ‘why do we need it then?’ which is incredibly fair. However, GNAP does address some important concerns, which we’ll explore next.

Consent and authorization flexibility

In OAuth2.0 we generally assume we have access to a web-browser. Additionally how our flow works is dictated by the grant type at the start. For example when we start the device authorisation grant for limited input devices (e.g. smart TVs) we make a request to a certain endpoint. Once we’ve begun this grant we can’t swap to another.

On top of this, the user employing the client is the same user who will approve access (i.e. the authorization server won’t send something out to another party requiring authorization on their behalf).

GNAP aims to address all of this. It is less tied to the web-browser, more flexible, and allows for the case where the user employing the client needs to request authorization from another party (as we’ll see in the ‘cross-user authentication’ section).

Intent registration and inline negotiation

Depending on the OAuth2.0 grant type we use, we tend to start in different places, and the following steps are rigidly defined. GNAP, on the other hand, always starts in the same place (which we’ll cover in the ‘requesting access’ section).

It is also more flexible in its choreography to provide the different grants. This flexibility is provided by thinking of requests for access in terms of a state machine and transitions between states, rather than as a set sequence of requests. We’ll cover this in the ‘protocol flow’ section.

Client instances

In OAuth2.0 we require a client to be registered and to have a client Id. GNAP aims to remove this requirement. This is covered in the ‘identifying the client instance’ section.

Expanded delegation

OAuth2.0 relies on scopes to decide what kind of access we need, which are oft-misused. GNAP has a slightly more involved method of deciding what authorization you can delegate. This is covered in the ‘resource access rights’ section.

Cryptography-based security

In GNAP the client has a key, and communication is bound to this. We’ll explore this in the ‘identifying the client instance’ section.

Privacy and usable security

OAuth2.0 also assumes a strong relationship between the authorization server and the resource server. An example would be the resource server needing to use the authorization server for token introspection. GNAP tries to remove this dependency by supporting more flexible communication methods.

Although some of this reasoning may seem quite oblique without fully understanding GNAP, we will relate the rest of the article back to these points.

It’s also worth noting that the document acknowledges OAuth2.0 and GNAP are separate, and that the latter is not an extension of the former. The authors therefore expect that you may run both simultaneously.

OK, let’s dive in!

Core Roles in GNAP

As in OAuth2.0, there are a number of core roles we need to understand to appreciate GNAP.

  • End User: The person wishing to access protected resources.
  • Authorization Server: This is responsible for granting privileges (whether by issuing access tokens or otherwise), and exposes the grant endpoint URI (the entrypoint to GNAP). In OAuth2.0 we have the token and authorize endpoints, which allow us to begin grants and acquire tokens.
  • Resource Owner: An entity with a protected resource they can grant access to. For example a user (the resource owner) with an email address (their resource) stored on a resource server, such that they can give access to their email address to an application.
  • Resource Server: The server holding the resource owner’s resource. It requires a valid access token with requests for protected resources.
  • Client: This is how the end user interacts with the authorisation server and resource server. GNAP differentiates between client software and client instances. Your application may be considered client software, and the unique case of it running on a machine (say a particular installation running on a particular device) would be a client instance. Client instances can have unique IDs, which is different to OAuth2.0 where each one would share the same client Id.

GNAP only states that these roles must be fulfilled. It doesn’t say anything about how and who by. It’s perfectly fine for one entity to carry out multiple roles, or for the entity doing so to change throughout the process.

We now need to define some elements.

Elements in GNAP

As well as the roles played by the different physical parts of our system, we also need to explain some other terms.

  • Subject: A person, organisation or device.
  • Subject Information: Information provided by the authorization server on the subject. This might be things like an OIDC ID Token.
  • Attribute: These are characteristics of a subject. The subject is responsible for deciding who is allowed access to its attributes, and how.
    I’m not going to lie, I found it a bit tricky finding the difference between an attribute and subject information. I think attributes are just more general properties of a subject (what kind of device it is etc.), whereas subject information is more strictly defined.
  • Right: Ability of a subject to perform a given operation on a given protected resource.
  • Access Token: Similar to OAuth2.0, this represents a set of rights. In GNAP this can also represent attributes.
  • Protected Resource: An endpoint provided by the resource server which can only be accessed with a valid access token.
  • Grant: Permitting (and defining the conditions of) the client receiving attributes or delegated authorization permissions.
  • Privileges: Right or attribute associated with the subject. Similar to scopes in OAuth2.0.

Protocol Flow

One of the motivations of GNAP was to provide a higher level of flexibility than OAuth2.0. We always start in the same place, and the flow evolves from there. This is quite separate to OAuth2.0, where depending on the grant selected at the start we know exactly which steps we will take.

Because of this it’s easier to conceptualise the flow using a state diagram: a list of states and rules for how to transition between them.

The entity that travels between the states is a grant request (defined previously). For example, we say the grant request is processing, then the grant request is pending etc.

The GNAP states and transitions between them

Let’s go through each state.

Processing

The core point of GNAP is for an end-user to get access to a protected resource. Therefore it makes sense each flow will be started with a request for access.

The authorization server examines the context of the request to see if any end-user interactions are required, and if it should transition the grant to the pending stage. If this is the case the authorization server returns a response with information on how to carry out the interactions.

If we don’t require interaction and are happy to issue access tokens, we transition to the approved state. Otherwise, if we hit an error, we transition to the finalized state.

Pending

At this point the authorization server requires consent and authorization to allow the requested access. A grant request in the pending state is always associated with a continuation access token, which is bound to the client instance’s key (mentioned in ‘cryptography-based security’ and covered in the ‘identifying the client instance’ section).

There are two ways of continuing from this pending state. If the client has the capacity to receive requests, the authorisation server can reach out and send an interaction finish method. If not, the client can poll the authorization server, waiting for a continue response which tells the client where it needs to navigate to next.

When all of the interactions have finished, the grant moves back to the processing state so that the authorization server can take stock of the grant’s entire context before transitioning it to another state.

If all of the interaction requests time out, or the client revokes the access request, then it moves to a finalized state.
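As a rough sketch of the polling path, the loop below keeps sending continuation requests until the response stops looking like a plain ‘keep waiting’ answer. The `send_continue` callable, the response shape (a `continue` object with an optional `wait` in seconds), the default wait and the attempt limit are all assumptions for illustration, so check them against the spec before relying on them.

```python
import time
from typing import Callable


def poll_grant(
    send_continue: Callable[[], dict],
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll the authorization server until the grant leaves the pending state.

    `send_continue` is a hypothetical function that performs one signed
    continuation request and returns the parsed JSON response.
    """
    for _ in range(max_attempts):
        response = send_continue()
        # Tokens, subject information or an error all mean the grant moved on.
        if any(key in response for key in ("access_token", "subject", "error")):
            return response
        if "continue" not in response:
            # Nothing left to continue, so there is no point polling again.
            return response
        # Assumed field: the server may hint how long to wait before retrying.
        sleep(response["continue"].get("wait", 5))
    raise TimeoutError("grant request still pending after polling")
```

Passing `sleep` in as a parameter is only there to make the loop easy to exercise without real delays.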

Approved

Hurray! You may now retrieve your resources! The authorisation server can communicate access tokens and subject information.

It may be that we allow updates of the request from the approved state. We can also request new access tokens by sending a continue polling request or using the token management API.

We can also move back from approved to processing by sending an update continuation request, which means the authorization server needs to re-evaluate the request.

If the continuation access token expires this means we can no longer accept these update requests from the client. If the authorization server also determines that it won’t issue new access tokens then the grant now has no other state to transition to and can go to finalized.

Finalized

This pretty much does what it says on the tin. The grant request cannot escape from this state, and for any new access you’ll need to create a new grant request.

Once in this state, the grant request is dead and cannot be revived. If future access is desired by the client instance, a new grant request can be created, unrelated to this grant request.
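To make the state machine idea concrete, here is a minimal sketch of a grant request that only permits the transitions walked through in this section. The class and method names are my own invention for illustration. The spec describes the states, not an implementation.

```python
from enum import Enum


class GrantState(Enum):
    PROCESSING = "processing"
    PENDING = "pending"
    APPROVED = "approved"
    FINALIZED = "finalized"


# Allowed transitions between the four states.
TRANSITIONS = {
    GrantState.PROCESSING: {GrantState.PENDING, GrantState.APPROVED, GrantState.FINALIZED},
    GrantState.PENDING: {GrantState.PROCESSING, GrantState.FINALIZED},
    GrantState.APPROVED: {GrantState.PROCESSING, GrantState.FINALIZED},
    GrantState.FINALIZED: set(),  # terminal: a new grant request is needed
}


class GrantRequest:
    """Hypothetical holder for a grant request's current state."""

    def __init__(self) -> None:
        # Every grant request starts life in the processing state.
        self.state = GrantState.PROCESSING

    def move_to(self, new_state: GrantState) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}"
            )
        self.state = new_state
```

Trying to move a finalized grant anywhere raises a `ValueError`, which mirrors the ‘dead and cannot be revived’ rule.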

Using GNAP

So far we’ve covered a lot of GNAP theory. We understand the trust relationships we’re trying to build, and how we implement a flexible protocol through a state machine. Now let’s take a step in a more applicable direction and think about what this might look like in terms of requests between roles.

Note, these protocols aren’t hard and fast, and we may not do all of the steps every time. For example, if we’re just looking for subject information, this is provided by the authorization server, and so we wouldn’t require contact with the resource server.

Overall Protocol Sequence

If you’re familiar with OAuth2.0 this is comparable to the protocol flow defined at the beginning of the RFC. Here we define the overall shape of what’s going on, then extend it with more detail later.

Hopefully now things will be beginning to take shape! There is a lot of detail missing, but in the following sections we will expand these core concepts.

Interactions

Interactions are a way of saying ‘someone needs to do something to get this grant approved’. They come in a number of forms. We’ll expand the previous protocol in order to demonstrate a few.

Redirect-Based Interaction

For this interaction the end user employing the client instance is the same as the resource owner, and the client instance must be a web-browser (allowing us to do redirects).

The client instance must also be able to store a persistent session to confirm the user starting the interaction is the one it returns to.

A redirect-based interaction protocol

We might use this in similar situations to the authorisation code with PKCE flow from OAuth2.0.

User-Code Interaction

For those of you familiar with the device authorization grant in OAuth2.0 this should be reasonably straightforward! It allows us to issue access tokens to any device which can display a user code.

This is another interaction where the end user and resource owner are the same person.

A user-code interaction protocol

Asynchronous Authorization Interaction

This is the first flow we will look at where the resource owner and end user are not the same person.

We use this flow when a client instance would like access to resources but wants to defer to the resource owner to grant access separately through a different channel.

One example could be an app on your phone which receives notifications when someone tries to use one of your media streaming accounts. When someone logs in on the media streaming application (the client instance) the authorization server will reach out to you as the resource owner to approve the request.

An asynchronous interaction

Software Only Authorization

This is the GNAP equivalent of the OAuth2.0 Client Credentials grant and is so straightforward it almost doesn’t bear discussing! The overview of the grant is below.

Software only authorization interaction

Refreshing an Expired Access Token

The core responsibility of GNAP is issuing an access token. However, to limit the damage if a token is stolen, tokens must expire. This means we require a way of procuring a new one (refreshing the token). There is an OAuth2.0 equivalent using refresh tokens, but with GNAP we no longer use a separate refresh credential.

Requesting Only Subject Information

Subject information pertains to information surrounding the resource owner, requested by the client instance via the authorization server. You’ll notice the end user and the resource owner are the same person.

This works slightly differently to some of the other protocols as we are not requesting an access token, only subject information.

How a client instance requests subject information

Cross-User Authentication

You’ll notice in the other flows we either have the resource owner and the end user as the same person, or only one of the two parties exists in the flow. We can use cross-user authentication in the case where we require resource owner subject information for an end user, but they are separate entities.

The requirements for using this flow are:

  • The client instance must be able to receive requests from the authorization server.
  • The end user must be pre-authenticated with the client instance.

This allows the end user to access the resource owner’s information on their behalf.

Cross-user authentication

Requesting Access

As we covered previously, GNAP flows always start with the same thing: a request for access. Here I will talk high-level about what the requests involve, but for full details consult the spec here.

Initially, we can request two things: access tokens (the plural is important, this is different to OAuth2.0 where we only request one) and/or subject information.

If we request access tokens we need to define the rights (what the access token provides us access to, scopes in OAuth2.0 parlance and covered in the ‘resource access rights’ section). We can also indicate whether we would like the returned token to be bound to our client key (covered in the ‘identifying the client instance’ section), or a bearer token.

We additionally need to tell the authorization server about our client. This includes information on the client key, how our client can continue requests, and any additional info required by interactions.

Optionally, we may tell the authorization server about our end user and the interactions they support. This will help the Authorization Server decide which protocols it can use. For example, if the client supports redirects it may choose a redirect-based flow.

This entry point is equivalent to the start of the state diagram we demonstrated earlier. It is responsible for creating the grant request and moving it to the processing state.

Something interesting is the notion of labels for access tokens. These are assigned by the client and can be used to disambiguate the access tokens it receives back from the authorization server. This is especially useful in the case of requesting multiple access tokens.
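To give a feel for what such a request might contain, here is a sketch that builds a grant request body asking for two labelled access tokens. The field names (`access_token`, `label`, `flags`, `client`, `interact`) follow my reading of the spec, while the function name, the API types and the example URIs are made up for illustration.

```python
import json


def build_grant_request(jwk: dict, finish_uri: str, nonce: str) -> dict:
    """Build an illustrative grant request body (not sent anywhere)."""
    return {
        # Two tokens requested at once, told apart by their labels.
        "access_token": [
            {
                "label": "photos",
                "access": [{"type": "photo-api", "actions": ["read"]}],
            },
            {
                "label": "profile",
                "access": ["read-profile"],
                "flags": ["bearer"],  # ask for a bearer token, not a key-bound one
            },
        ],
        # Who is asking: the client's public key plus display information.
        "client": {
            "key": {"proof": "httpsig", "jwk": jwk},
            "display": {"name": "Example Client", "uri": "https://client.example.com"},
        },
        # Which interactions the client can start and how it wants to finish.
        "interact": {
            "start": ["redirect"],
            "finish": {"method": "redirect", "uri": finish_uri, "nonce": nonce},
        },
    }


if __name__ == "__main__":
    body = build_grant_request(
        {"kty": "EC", "crv": "P-256", "x": "...", "y": "..."},
        "https://client.example.com/finish",
        "example-nonce",
    )
    print(json.dumps(body, indent=2))
```

The whole body would be posted to the single grant endpoint, which is what lets every flow start in the same place.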

Requesting Subject Information

Requesting subject information (i.e. information on the resource owner) is slightly different. The client sends over fields declaring the:

  1. Subject Id: How we identify the resource owner we would like subject information on.
  2. Subject Id formats: Given we have retrieved the resource owner by their subject Id, we want to retrieve these other identifiers for the same subject.
  3. Assertion formats: An array of requested assertion formats, e.g. ID tokens or SAML 2 assertions.

The authorization server may then reach out to the resource owner in order to authenticate them and authorize the release of this information.
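The three fields above could be put together along these lines. The key names (`sub_ids`, `sub_id_formats`, `assertion_formats`) and the format values are taken from my reading of the spec, and the helper itself is hypothetical.

```python
def build_subject_request(known_ids: list) -> dict:
    """Build an illustrative request for subject information only."""
    return {
        "subject": {
            # How we already identify the resource owner.
            "sub_ids": known_ids,
            # Which other identifier formats we would like back.
            "sub_id_formats": ["iss_sub", "opaque"],
            # Which assertion formats we can consume.
            "assertion_formats": ["id_token", "saml2"],
        }
    }


if __name__ == "__main__":
    print(build_subject_request([{"format": "email", "email": "user@example.com"}]))
```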

Interacting with the User

One of the main properties of GNAP is the interaction functionality. Often we will want to use this to interact with a resource owner to approve access to resources.

We communicate interactions using three fields:

  1. Start: How a user can start an interaction. For example we may use redirect in the redirect-based flow we discussed earlier, or user code in the other example.
  2. Finish: How we can finish an interaction.
  3. Hints: These tell the authorization server a little about the interaction. For example, what locale to use.

I personally find the start, finish, continue concept a little difficult to grasp. It’s marginally easier to understand using the redirect-based flow. In that particular example the start would be ‘redirect’, demonstrating the flows we can use. The finish would be a URI on the client instance the authorization server can use when the resource owner has finished their interaction.

This endpoint is called and then the client is responsible for using the continuation endpoint (previously supplied by the authorization server) along with some information given to the finish endpoint to keep the grant moving through the states.
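As a sketch of that last step, the function below assembles the continuation request a client might send once its finish endpoint has been called. I’m assuming the authorization server’s earlier response carried a `continue` object with a `uri` and an `access_token`, and that the finish call delivered an interaction reference. The returned dictionary is just a description of the HTTP request, and a real client would also need to sign it with its key.

```python
def build_continue_request(grant_response: dict, interact_ref: str) -> dict:
    """Describe the continuation request that follows a finished interaction."""
    cont = grant_response["continue"]
    return {
        "method": "POST",
        "url": cont["uri"],
        "headers": {
            # The continuation access token from the earlier response.
            "Authorization": f"GNAP {cont['access_token']['value']}",
            "Content-Type": "application/json",
        },
        # The reference handed to the client's finish endpoint.
        "body": {"interact_ref": interact_ref},
    }


if __name__ == "__main__":
    earlier = {
        "continue": {
            "uri": "https://as.example.com/continue",
            "access_token": {"value": "continuation-token"},
        }
    }
    print(build_continue_request(earlier, "example-interact-ref"))
```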

Grant (Requesting Access Request) Response

When a client makes their initial request to begin the grant’s journey the authorization server needs to return information to facilitate this. This might include:

  1. Continuation information: This is where a client can make requests to keep the authorization going. For example, in the redirect-based flow the authorization server tells the client instance where it can continue the flow once it has made the end-user authenticate and authorize. More useful information on continuation is here.
  2. Interact: The client instance can indicate what kind of interaction it supports (for example, redirection to kick off the redirect-based flow). If the authorization server supports these interaction modes then it returns the necessary information to carry this out. Some more useful information on interactions is here.
  3. Access tokens: Access tokens if the authorization server deems the client deserves them.
  4. Subject information: This is information about a resource owner that a client may have requested.
  5. Instance Id: An identifier the client can use in future requests to the authorization server.
  6. Error: How we tell the client something has gone wrong.
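One way a client might act on such a response is to look at which of these fields came back. The sketch below is my own simplification: the step names are invented, and the order of the checks is a design choice, not something the spec prescribes.

```python
def next_step(response: dict) -> str:
    """Decide what a client should do with a grant response (simplified)."""
    if "error" in response:
        return "failed"
    if "interact" in response:
        # The user has to do something first, e.g. follow a redirect.
        return "interact"
    if "access_token" in response or "subject" in response:
        # We got what we asked for.
        return "done"
    if "continue" in response:
        # Nothing to show yet, but we are allowed to ask again.
        return "poll"
    return "unknown"
```

Checking `interact` before `continue` matters here, because a response that asks for interaction will usually also carry continuation information.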

The Token Management API

Once we have an access token we may want to revoke or rotate it. This is done via the token management API.

Rotating a token is the process of getting a new access token from an existing one, with the same rights and properties as previously, but an updated value and expiration time. You also have the opportunity to do things like bind a new client key.

This is also covered in the ‘refreshing an expired access token’ section.

Revoking a token does exactly what it says on the tin and cancels an existing token.
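Sketched very loosely, both operations are requests against a management URI that comes back alongside the token. I’m assuming here that the token response carries a `manage` object with a `uri` and its own `access_token`, that rotation is a POST and that revocation is a DELETE. Earlier drafts shaped this differently, so treat the details as illustrative.

```python
def manage_request(token: dict, action: str) -> dict:
    """Describe a token management request for 'rotate' or 'revoke'."""
    methods = {"rotate": "POST", "revoke": "DELETE"}
    if action not in methods:
        raise ValueError(f"unsupported action: {action}")
    manage = token["manage"]  # assumed shape, see the note above
    return {
        "method": methods[action],
        "url": manage["uri"],
        "headers": {"Authorization": f"GNAP {manage['access_token']['value']}"},
    }


if __name__ == "__main__":
    token = {
        "value": "access-token-value",
        "manage": {
            "uri": "https://as.example.com/token/abc",
            "access_token": {"value": "management-token"},
        },
    }
    print(manage_request(token, "rotate"))
    print(manage_request(token, "revoke"))
```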

Identifying the Client Instance

Previously we talked about how GNAP has moved away from the need for client registration, and towards a more flexible model. Client information is sent with the initial request to create a grant, rather than separately.

The two core groups of information communicated are:

  1. Information we may display to the resource owner to help with authorization (for example, the client’s name and URL).
  2. The client’s key. This is a bit more involved, I’ll expand on it below.

If you’re familiar with OAuth2.0 then you’ll be familiar with the notion of a bearer token. The operative word is ‘bearer’ — anyone who has the token may use it.

This presents issues in that if someone steals the token they can go around using it however they please! To combat this there is something called Demonstrating Proof of Possession (DPoP), which helps bind tokens to a public/private key-pair which the client owns, preventing stolen tokens being abused.

GNAP has a similar mechanic to prove requests are coming from the same source, and potentially authenticate clients. Here clients declare a method they would like to use for signing messages, and present their public key on creating the grant.

If the authorization server recognises the public key this can be seen to authenticate the client. The authorization server can use this information to impose limitations on the client request, or correlate different requests to the same client.

Additionally, because the client signs its requests with the private key, follow-ups to the initial grant-creation request included, the authorization server can verify that they all come from the same source.

This signature process also applies to access tokens. If we began the grant with a key, all requests using the access token must contain a relevant signature, unless we’ve specifically asked for a bearer token. Not too dissimilar to DPoP!
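One plausible way for an authorization server to ‘recognise’ a public key is to reduce it to a stable fingerprint and look that up. The sketch below computes a JWK thumbprint in the style of RFC 7638 using only the standard library. Using a thumbprint for the lookup is my assumption (GNAP doesn’t mandate it), and the registry of known clients is hypothetical.

```python
import base64
import hashlib
import json

# Members that define the key itself, per key type (RFC 7638, RFC 8037).
REQUIRED_MEMBERS = {
    "EC": ("crv", "kty", "x", "y"),
    "RSA": ("e", "kty", "n"),
    "OKP": ("crv", "kty", "x"),
}


def jwk_thumbprint(jwk: dict) -> str:
    """Return the base64url SHA-256 thumbprint of a public JWK."""
    members = {name: jwk[name] for name in REQUIRED_MEMBERS[jwk["kty"]]}
    # Canonical form: sorted member names, no whitespace.
    canonical = json.dumps(members, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def identify_client(jwk: dict, known_clients: dict) -> str:
    """Look the key up in a hypothetical thumbprint-to-name registry."""
    return known_clients.get(jwk_thumbprint(jwk), "unrecognised client")
```

Because only the key-defining members are hashed, extra fields such as a key ID don’t change the result, so the same key always maps to the same client.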

Resource Access Rights

As mentioned, GNAP eschews the concept of scopes for a more flexible model of access control. These ‘resource access rights’ are communicated by the client to say which permissions it would like a token to have, and back from the authorization server to describe the permissions of a token.

This is expressed as a JSON array containing the below recommended properties:

  • Actions: For example, read or write.
  • Locations: The locations (e.g. URIs) of resource servers we’d like to access.
  • Datatypes: This might be things like ‘metadata’ or ‘files’.
  • Identifier: Specific Ids of resources on the resource server.
  • Privileges: Admin, for example.

Or as a single string mapping to a set of the above properties that the authorization server is already aware of.

"access": [
    {
        "type": "file-api",
        "actions": [
            "read"
        ],
        "locations": [
            "https://app.example.com/"
        ],
        "datatypes": [
            "metadata",
            "files"
        ],
        "privileges": [
            "user"
        ]
    },
    "write"
]

In the above example we manually define the read permissions for a fictitious file API. We know that the authorization server pre-defines something similar for write, so can use the shortcut.
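On the authorization server side, handling that mixed array could look something like the sketch below, which swaps each string shortcut for the full object it stands for. The `known` mapping and the contents of the `write` entry are invented for illustration.

```python
def expand_access(access: list, known: dict) -> list:
    """Replace string shortcuts with the full rights objects they refer to."""
    expanded = []
    for item in access:
        if isinstance(item, str):
            if item not in known:
                raise KeyError(f"unknown access reference: {item}")
            expanded.append(known[item])
        else:
            # Already a fully spelled-out rights object.
            expanded.append(item)
    return expanded


if __name__ == "__main__":
    # Hypothetical pre-defined rights held by the authorization server.
    known = {"write": {"type": "file-api", "actions": ["write"]}}
    requested = [{"type": "file-api", "actions": ["read"]}, "write"]
    print(expand_access(requested, known))
```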

Discovery

The final thing we will discuss is how clients can learn about the authorization server.

One interesting thing about GNAP is the capacity for the resource server to tell a client missing a valid access token the location of the authorization server.

You can also use an HTTP OPTIONS request to the grant request endpoint in order to discover information like the supported interaction methods, and if you can rotate keys.

As GNAP only exposes the single endpoint, there isn’t the requirement for rich authorization server metadata like in OAuth2.0.
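As a small sketch of what a client might do with that discovery document, the function below picks the first interaction start mode that both sides support. The `interaction_start_modes_supported` field name is taken from my reading of the spec, and the example values are made up.

```python
from typing import Optional


def pick_start_mode(discovery: dict, preferred: list) -> Optional[str]:
    """Return the client's most preferred start mode the server supports."""
    supported = discovery.get("interaction_start_modes_supported", [])
    for mode in preferred:
        if mode in supported:
            return mode
    return None


if __name__ == "__main__":
    # Illustrative discovery response from the grant endpoint.
    discovery = {
        "grant_request_endpoint": "https://as.example.com/tx",
        "interaction_start_modes_supported": ["redirect", "user_code"],
        "key_rotation_supported": True,
    }
    print(pick_start_mode(discovery, ["app", "user_code", "redirect"]))
```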

Trust Relationships and the Trust Model

We will consider this out of scope for this article, but it’s covered here in the spec and discusses how each of the individual components establishes that the others can be relied upon.

Conclusion

In conclusion, we have covered the Grant Negotiation and Authorization Protocol at a very high level, particularly how it differs from OAuth2.0. Hope it was helpful!

James Collerton

Senior Software Engineer at Spotify, Ex-Principal Engineer at the BBC