A Crash Course in C4 Modelling

Representing complex architectures simply

Dreaded whiteboarding

Audience

This article is aimed at engineers or developers looking to gain a deeper understanding of C4 modelling, and how to employ it. We will also be exploring the PlantUML tool to create our diagrams.

We will begin with the motivation behind C4 modelling, before using a worked example to explore the details of the technique. Our demonstration will be based around a URL shortener, which is covered in a previous article here.

A later section relies on a little knowledge of STRIDE threat modelling. Readers should not be discouraged, however: the ideas are still approachable even with no grounding in the subject.

Argument

Any engineer who has ever needed to document a system will attest to the pain of finding the best method of visualising components and their relationships. The more rigidly defined techniques are cumbersome and inaccessible, whereas the oft-used semi-random boxes and arrows can quickly become impenetrable and confusing.

Additionally, diagrams are not necessarily one-size-fits-all. Explanations for your product manager, architect and engineers will all vary in their scope and detail, so capturing all of the requirements in one place becomes tricky.

C4 modelling helps to alleviate these issues by combining a flexible approach to representing the parts of our systems and their relationships with views at varying levels of granularity.

One of the core ideas is to split the levels of detail of our system over four layers.

  1. A system context diagram shows how our system fits into its surroundings.
  2. A container diagram zooms into our system, showing the high-level technical parts (containers).
  3. A component diagram zooms into an individual container, showing its components.
  4. A code diagram then zooms into a component and shows how it’s implemented.

How different levels of C4 modelling interplay

This ‘abstraction first’ way of thinking allows us to draw diagrams in the way we think of them, at varying levels of detail. We don’t even need to use all four levels; often the first two are sufficient!

The main entities of C4 modelling are the system, the containers, the components, the code, and the people who use the system.

As mentioned previously, we will be using a system for URL shortening to explain the concepts behind C4 modelling. As an overview, let’s say we own an application, shorturl.co.uk, which provides short versions of long URLs. For example, if someone has the URL

https://www.longurl.com/e2d89a01-ed9f-4796-a80e-3a13db3d0d3e

We want to give them an equivalent, more terse URL:

https://www.shorturl.com/c5b852e6

This redirects to the same page. A similar service already exists here; feel free to have a play to get your head round it!

This system is used by our internal CRM team in order to generate short links to put in emails, as well as by our website in order to shorten links pasted into text boxes.

Given that this is what we’re working with, let’s set up PlantUML and break down each of the C4 entities.

PlantUML is a programming language for drawing diagrams. It contains a useful C4 plugin, which we will be using today. Additionally, we will be employing VSCode in order to render the output.

The first thing we will need to do is download the PlantUML extension for VSCode. Having done this, we will also need to install Graphviz, which is responsible for the visualisation component of our work. Some additional configuration may be needed to point VSCode to Graphviz, as documented here, but once this is done you should be able to generate something similar to the below!

Generating PlantUML C4 diagrams in VSCode

The system is the top level of abstraction. This includes the one we’re building, as well as any that may interact with it. In our URL Shortener example we need to represent our CRM system and the website.

The system context diagram

We include all of the systems and people directly connecting to the one of interest. If we were modelling the website, we might well include a person connecting to it. However, as we are only looking at the URL shortener, we don’t need to capture this.

This kind of diagram is useful for technical and non-technical people, and should be approachable even by those outside of your team.

The PlantUML code is below. We can see it allows us to declare a diagram using terms common to our modelling process!
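A minimal sketch of such a system context diagram, written with the C4-PlantUML macros, might look like the following. The aliases, labels and descriptions are assumptions made for illustration, drawn from the URL shortener example above rather than from a definitive listing.

```plantuml
@startuml
' Pull in the context-level macros (System, Rel, ...) from C4-PlantUML
!include https://raw.githubusercontent.com/plantuml-stdlib/C4-PlantUML/master/C4_Context.puml

title System context diagram for the URL shortener (illustrative)

' The system we are modelling
System(url_shortener, "URL Shortener", "Provides short versions of long URLs")

' The systems that connect directly to it (aliases and wording are assumed)
System(crm, "CRM System", "Generates short links to put in emails")
System(website, "Website", "Shortens links pasted into text boxes")

' Relationships read as: source, target, label
Rel(crm, url_shortener, "Requests short URLs from")
Rel(website, url_shortener, "Requests short URLs from")

SHOW_LEGEND()
@enduml
```

Each macro takes an alias first, which is then used to wire up the relationships, followed by the human-readable label and description.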

We will not walk through the other code samples in detail, to avoid diluting the message of the article, but they are equally intuitive.

We now have a top-level view of how our system fits into our overall ecosystem. The next step is to drill down a layer to the containers. A container is something like a data store, application or file store. This diagram is mainly for technical people.

An example container diagram
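As a rough sketch of how the container level could be expressed in C4-PlantUML, consider the following. The two containers, their technologies and the protocols on the relationships are assumptions for illustration; the real system may be split differently.

```plantuml
@startuml
' Container-level macros; this file also brings in the context-level ones
!include https://raw.githubusercontent.com/plantuml-stdlib/C4-PlantUML/master/C4_Container.puml

title Container diagram for the URL shortener (illustrative)

System(crm, "CRM System", "Generates short links to put in emails")
System(website, "Website", "Shortens links pasted into text boxes")

' Zoom into the URL shortener: everything inside the boundary is a container
System_Boundary(url_shortener, "URL Shortener") {
  ' Container names and technologies below are assumed
  Container(url_service, "URL Service", "Java, Spring Boot", "Creates short URLs and redirects them to the original page")
  ContainerDb(url_store, "URL Store", "Relational database", "Maps short codes to long URLs")
}

Rel(crm, url_service, "Requests short URLs from", "HTTPS")
Rel(website, url_service, "Requests short URLs from", "HTTPS")
Rel(url_service, url_store, "Reads from and writes to")
@enduml
```

The surrounding systems are repeated from the context level, so the reader can see where the containers sit.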

At this juncture we depart from the most common levels of modelling. Generally we only produce component diagrams if they will really add value, and it is often recommended to consider automating their creation.

Components represent the next level of granularity within a container. For those familiar with Java Spring Boot (and perhaps other MVC frameworks), we can break down our URL service into the controller, service and repository layers.

An example component diagram for our URL service
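A hedged sketch of the controller, service and repository split in C4-PlantUML could look like this. The component names, technology labels and the database container are illustrative assumptions.

```plantuml
@startuml
' Component-level macros; container and context macros come along with them
!include https://raw.githubusercontent.com/plantuml-stdlib/C4-PlantUML/master/C4_Component.puml

title Component diagram for the URL service (illustrative)

ContainerDb(url_store, "URL Store", "Relational database", "Maps short codes to long URLs")

' Zoom into one container: the three layers below are assumed names
Container_Boundary(url_service, "URL Service") {
  Component(url_controller, "URL Controller", "Spring MVC REST controller", "Exposes the HTTP endpoints")
  Component(url_logic, "URL Service Layer", "Spring service", "Generates and resolves short codes")
  Component(url_repository, "URL Repository", "Spring repository", "Reads and writes URL mappings")
}

Rel(url_controller, url_logic, "Calls")
Rel(url_logic, url_repository, "Calls")
Rel(url_repository, url_store, "Reads from and writes to", "JDBC")
@enduml
```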

This final level of C4 is rarely used, primarily because code changes so frequently. At this point you may as well knuckle down and read through the repo! Here we deal with classes, interfaces, exceptions and so on.

Tragically, the C4 plugin we are using doesn’t cover this level of granularity, although PlantUML’s own class diagrams could. We will swap to another diagramming tool to cover off the final stage.

An example code diagram

There are a number of supplementary diagrams we can also use. We won’t go into detail within this article; it is enough to know they are there.

  • System Landscape Diagram: This is used to detail how a system fits into a wider enterprise. In the System Context diagram we are only worried about the immediately connecting systems, whereas in a System Landscape diagram we are concerned with all systems in the ecosystem.
  • Dynamic Diagram: This is used to explain how different parts of the system interact at runtime. For example, how a repository may query a database (SELECT * FROM...), or how a user authenticates themselves.
  • Deployment Diagram: This is used to map our systems to infrastructure. For example, a lot of our services are hosted on AWS. In our C4 diagrams we have captured this, but a deployment diagram would allow us to dig into the finer details, such as which Tomcat server, which machine type and which VPC.
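To give a flavour of one of these, a dynamic diagram for resolving a short URL might be sketched as follows. The component names and the steps of the flow are assumptions for illustration; with the dynamic macros, the relationships are numbered in the order they are declared.

```plantuml
@startuml
' Dynamic-diagram macros: Rel calls are numbered in declaration order
!include https://raw.githubusercontent.com/plantuml-stdlib/C4-PlantUML/master/C4_Dynamic.puml

title Resolving a short URL at runtime (illustrative)

ContainerDb(url_store, "URL Store", "Relational database", "Maps short codes to long URLs")

' Component names are assumed, not taken from a real codebase
Container_Boundary(url_service, "URL Service") {
  Component(url_controller, "URL Controller", "Spring MVC REST controller", "Receives the short URL request")
  Component(url_logic, "URL Service Layer", "Spring service", "Resolves short codes")
  Component(url_repository, "URL Repository", "Spring repository", "Queries the URL mappings")
}

Rel(url_controller, url_logic, "Asks for the long URL behind the short code")
Rel(url_logic, url_repository, "Looks up the mapping")
Rel(url_repository, url_store, "Runs a SELECT query against", "JDBC")
@enduml
```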

The final note we will make is on STRIDE threat modelling, and how this plays into C4. To (somewhat reductively) summarise STRIDE, it is a way of identifying security weaknesses within a system. There are six elements: spoofing, tampering, repudiation, information disclosure, denial of service and elevation of privilege. It is enough to know that the threat modelling technique involves assigning each of these factors to any C4 entity we feel is necessary.

There is then the concept of a trust:value ratio. Trust represents how much we trust an entity, whereas value is how valuable we deem access to the entity to be. Finally, we denote the risks associated with each entity. For those who are more curious, an article written by my old colleague can be found here.

For our final example, we will generate a system context diagram with some of the STRIDE elements applied. Normally this may be done at the container level; however, to reduce work I have moved one layer up.

Applying STRIDE to C4 modelling
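One simple way to sketch this in C4-PlantUML is to fold the STRIDE letters into each entity's description. The notation and the letters chosen below are placeholders that show the idea, not the result of a real assessment, and trust and value scores could be added in the same way.

```plantuml
@startuml
!include https://raw.githubusercontent.com/plantuml-stdlib/C4-PlantUML/master/C4_Context.puml

title System context with STRIDE annotations (placeholder values)

' S = spoofing, T = tampering, R = repudiation, I = information disclosure,
' D = denial of service, E = elevation of privilege.
' The letters on each entity are illustrative placeholders only.
System(url_shortener, "URL Shortener", "Provides short versions of long URLs. STRIDE: T, I, D")
System(crm, "CRM System", "Generates short links to put in emails. STRIDE: S, R")
System(website, "Website", "Shortens links pasted into text boxes. STRIDE: S, E")

Rel(crm, url_shortener, "Requests short URLs from")
Rel(website, url_shortener, "Requests short URLs from")
@enduml
```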

Conclusion

We have covered the motivation behind C4 modelling and how to implement it, as well as how it can be employed for modelling threats.

Principal Software Engineer at the BBC