I’m a developer, I deploy to a VPC, but what’s going on in there…
Audience and Aim
This article is aimed at developers using AWS who want to understand how VPCs work. In many companies, complex infrastructure provisioning is separated from the development process: you may be deploying into a VPC without really knowing where your code is going.
Here we will explain at a very high level how a VPC functions. We will assume a small amount of networking knowledge, but the idea is to make the explanation as clear and practical as possible.
Above is a basic VPC setup which, admittedly, looks a bit daunting at first. Let’s break it down into the component parts and explain how they fit together.
Regions and Availability Zones
A Region is a physical location somewhere around the world where data centres are clustered. For example, Ireland is the eu-west-1 region. Within a region there are groups of logical data centres called Availability Zones. Each AWS region consists of multiple, isolated, and physically separate availability zones within a geographic area. For example, eu-west-1 has three separate availability zones: eu-west-1a, eu-west-1b and eu-west-1c.
Each VPC is based in a given region, but spans over all availability zones within that region. This allows you to architect for resilience in the case that one of the availability zones goes down.
In our example architecture we have a VPC within a region and then two availability zones within that VPC.
Virtual Private Cloud (VPC)
A VPC is essentially a group of IP addresses we reserve within a region. Initially, we need to select a CIDR block (a system I don’t think I’ll ever understand), which gives us the range of IP addresses we can use. It is common to use a /16 block for the VPC and then smaller blocks, such as /24, for the subnets.
For example, we may use 10.0.0.0/16. This gives us the IP addresses 10.0.0.0-10.0.255.255 (65,536 of them).
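We can sanity-check these numbers with Python’s standard ipaddress module (a quick sketch, nothing AWS-specific):

```python
import ipaddress

# The VPC's CIDR block from the example.
vpc = ipaddress.ip_network("10.0.0.0/16")

print(vpc.num_addresses)        # 65536 addresses in the block
print(vpc.network_address)      # 10.0.0.0, the first address
print(vpc.broadcast_address)    # 10.0.255.255, the last address
```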
Subnets
Subnets are a group of IP addresses we reserve within our VPC’s range. We normally follow the convention 10.0.x.0/24, where x increases for each new subnet. In this example we will use public subnets with CIDR blocks such as 10.0.2.0/24, each of which gives us a range of 256 addresses (10.0.2.0-10.0.2.255).
We would then associate the subnets with a route table that sends all non-local traffic to the internet gateway (covered below). It is this route out to the wider internet that makes a subnet public.
Similarly, we can make private subnets. In this example we have two types of private subnet: one for the application layer and one for the database layer. We assign them CIDR blocks in the same way, for example 10.0.6.0/24. All of these then route out through the NAT gateway using route tables (described later).
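To see how the subnet blocks relate to the VPC block, we can carve the /16 into /24s with the same ipaddress module; the index used here is just illustrative:

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")

# Carve the /16 into /24 subnets; each holds 256 addresses.
subnets = list(vpc.subnets(new_prefix=24))

print(len(subnets))              # 256 possible /24 subnets
print(subnets[6])                # 10.0.6.0/24, one of the blocks from the example
print(subnets[6].num_addresses)  # 256
```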
NAT Gateway
Network Address Translation (NAT) gateways are used to enable instances in a private subnet to connect to the internet or other AWS services, while preventing the internet from initiating a connection with those instances.
In our example we create a NAT gateway inside our public subnet. To do this we need an Elastic IP address for the gateway, in order to have a static IP address to route to; this is something we provision separately.
EC2
Amazon EC2 is ‘a web service that provides secure, resizable compute capacity in the cloud’. For the purpose of our example we will think of it as a server in the cloud we can deploy code onto. EC2 instances come in many shapes and sizes, though, so it is worth researching them independently to see what the offerings are.
Aurora
Amazon Aurora is a cloud-based relational database compatible with both MySQL and PostgreSQL. It is fully managed by the Amazon Relational Database Service (RDS), which essentially means we provision an Aurora cluster and AWS helps keep it alive.
In our example we have a master/replica setup, where the master database is in the first availability zone and the replica is in the second. This gives us a level of resilience: if the first availability zone goes down, we can fail over to the second.
Route 53
Route 53 is a way of registering a DNS entry for our VPC. We buy a domain name and then point it at our load balancer.
For example, we may buy vpcexample.com. It is then relatively easy to direct any traffic for that domain to our load balancer, a process we will cover below.
Internet Gateway
An internet gateway is the component responsible for allowing communication between your VPC and the internet.
An internet gateway serves two purposes:
- To provide a target in your VPC route tables (covered below) for internet-routable traffic.
- To perform network address translation (NAT) for instances that have been assigned public IPv4 addresses.
Route Tables
We use route tables in our example to direct traffic from a subnet. There are three subnets of two different types in our VPC design.
- Route table for the public subnet: This is used to route traffic from the public subnet out through the internet gateway. All traffic for IP addresses within the VPC goes to ‘local’, whereas anything else goes out through the internet gateway.
- Route table for the private subnets: This is used to route traffic from the private subnets to the NAT gateway. All traffic for IP addresses within the VPC goes to ‘local’, whereas anything else goes to the NAT gateway, which in turn routes out through the internet gateway.
Subnets are explicitly associated with a route table.
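The ‘most specific route wins’ behaviour of a route table can be sketched in a few lines of Python. This is a simplified model, not how AWS implements it; the target names are just labels:

```python
import ipaddress

# A simplified private-subnet route table: local traffic stays in the
# VPC, everything else goes to the NAT gateway.
route_table = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "nat-gateway"),
]

def route(destination_ip, table):
    """Pick the most specific (longest-prefix) route that matches."""
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in table
        if ipaddress.ip_address(destination_ip) in ipaddress.ip_network(cidr)
    ]
    # The longest prefix wins, mirroring how VPC route tables behave.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(route("10.0.4.10", route_table))      # local: inside the VPC
print(route("93.184.216.34", route_table))  # nat-gateway: the wider internet
```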
Security Groups
A security group controls inbound and outbound traffic for an instance. When you launch an instance in a VPC, you can assign up to five security groups to it. Security groups act at the instance level, not the subnet level, so each instance in a subnet can be assigned a different set of security groups.
For our example architecture we will need two security groups.
- Security Group A: For the load balancer. This will accept all incoming traffic (0.0.0.0/0) to the web ports.
- Security Group B: For the EC2 instances. This will accept traffic on the web ports only from the load balancer’s security group.
We then associate Security Group A with the load balancer and associate Security Group B with all of the EC2 instances.
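The allow-list behaviour of the two groups can be sketched as follows. This is a toy model: the group names, the rule format and the assumption that the web ports are 80 and 443 are all illustrative:

```python
import ipaddress

# Each rule allows traffic from either a CIDR range or another security
# group, on a set of ports. Names like "sg-a" are made up for this sketch.
SG_A_INBOUND = [{"source": "0.0.0.0/0", "ports": {80, 443}}]  # load balancer
SG_B_INBOUND = [{"source": "sg-a", "ports": {80, 443}}]       # EC2 instances

def allowed(rules, source, port):
    """Security groups are allow-lists: any matching rule permits the traffic."""
    for rule in rules:
        if port not in rule["ports"]:
            continue
        if rule["source"].startswith("sg-"):
            if source == rule["source"]:
                return True
        elif ipaddress.ip_address(source) in ipaddress.ip_network(rule["source"]):
            return True
    return False

print(allowed(SG_A_INBOUND, "203.0.113.9", 443))  # True: anyone can reach the LB
print(allowed(SG_B_INBOUND, "sg-a", 80))          # True: traffic from the LB's group
print(allowed(SG_B_INBOUND, "203.0.113.9", 80))   # False: direct internet traffic
```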
Network Access Control List (NACL)
Network Access Control Lists are attached to subnets and specify a range of IP-address-based rules. To use an example:
The example above demonstrates a NACL which allows all inbound and outbound IPv4 traffic for a subnet. Rules are processed in ascending rule-number order, and the first matching rule applies; in both cases rule 100 takes precedence over the starred (default) rule.
We could have NACLs for the private and public subnets with similar constraints to the security groups, only expressed as IP addresses. However, this duplicates effort, so a common pattern is simply to allow everything in the NACLs and then limit accessibility in the security groups.
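The rule-number ordering can be sketched as below; the allow-everything NACL described above reduces to a single numbered rule plus the default deny:

```python
import ipaddress

# Rules are evaluated in ascending rule-number order; the first match
# wins, and the final "*" rule denies whatever nothing else matched.
nacl_inbound = [
    (100, "0.0.0.0/0", "allow"),
    ("*", "0.0.0.0/0", "deny"),
]

def evaluate(nacl, source_ip):
    for number, cidr, action in nacl:  # the list is already sorted by rule number
        if ipaddress.ip_address(source_ip) in ipaddress.ip_network(cidr):
            return action
    return "deny"

print(evaluate(nacl_inbound, "203.0.113.9"))  # allow: rule 100 matches first
```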
Application Load Balancer
The application load balancer is used to distribute traffic amongst instances in our application subnet. For our example we have two EC2 instances spread over two availability zones and would like to share incoming requests between them.
Load balancers can be either internet-facing or internal. Internet-facing load balancers balance traffic coming in from the open internet (in our example via Route 53), whereas internal load balancers are for components contained within our VPC.
The configuration of a load balancer includes:
- A listener configuration: This describes which ports to listen on and which protocol to use.
- The availability zones and subnets: If it is an external load balancer it should be in the public subnet — an internal load balancer can be in the private one. A common pattern (and the one we use in our example) is to have the load balancer spread across multiple availability zones and for it to be a single point of contact for external connections.
- A security group: This ensures the load balancer only allows connections to and from specified locations. In our example we want to accept all incoming traffic, but only want to route outgoing traffic to our EC2 instances.
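Putting the listener and targets together, the balancing itself can be pictured as simple round robin (the default routing algorithm for an ALB target group). The instance names here are made up:

```python
from itertools import cycle

# Two EC2 targets, one per availability zone, as in the example.
targets = cycle(["i-instance-az1", "i-instance-az2"])

# Successive requests are handed to alternating targets.
requests = ["GET /users", "GET /orders", "GET /health"]
for request in requests:
    print(request, "->", next(targets))
```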
Autoscaling Groups
Autoscaling groups define a minimum, maximum and desired number of a certain type of EC2 instance we would like available. They work in conjunction with scaling policies to alter the number of these instances according to certain criteria.
The components of an autoscaling group are as below:
- Load Balancer: This is the load balancer we would like to aim towards the autoscaling group.
- Health Check: This is responsible for making sure each of the instances is up and alive, removing instances from the group if they are not. Often this comes in the form of an HTTP call to an endpoint on the box; if the box is unresponsive, it will return nothing from this endpoint.
- Launch Template: The idea of an autoscaling group is that we would like to keep a number of identical instances available. In order to be able to launch a new EC2 we need a description of what it should look like. This is the purpose of a launch template.
- Scaling Policies: Autoscaling groups are often used in order to be able to scale up our infrastructure in the case of increased traffic. Scaling policies define these rules. For example we may want to add a new instance if the CPU Utilisation across an autoscaling group is above 80%. We can then scale down if the CPU Utilisation goes below 80% again.
- Network: We know what we’re launching from the launch template. We know when we’re launching from the scaling policies. We just need to know where we’re launching! This is the duty of the network configuration, which specifies the subnets to launch to.
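A scaling policy like the CPU example above boils down to a decision such as this sketch; the thresholds and group sizes are illustrative:

```python
# A toy scaling decision: add an instance above 80% average CPU, remove
# one when utilisation drops back below 80%, staying within min/max.
MIN_SIZE, MAX_SIZE = 2, 6
CPU_THRESHOLD = 80.0

def desired_capacity(current, avg_cpu):
    if avg_cpu > CPU_THRESHOLD and current < MAX_SIZE:
        return current + 1  # scale out under load
    if avg_cpu < CPU_THRESHOLD and current > MIN_SIZE:
        return current - 1  # scale in when load subsides
    return current

print(desired_capacity(2, 91.0))  # 3: CPU above 80%, room to grow
print(desired_capacity(3, 42.0))  # 2: CPU back under 80%
print(desired_capacity(2, 42.0))  # 2: already at the minimum size
```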
Summary
To summarise, we will follow two imaginary requests through our system, one inbound and one outbound.
- A client makes a request in order to retrieve some information via an API in our application.
- This is routed through the DNS entry we created in Route 53 to our application load balancer sitting in our public subnets. The load balancer accepts this traffic as its security group accepts all incoming traffic.
- Our application load balancer forwards this on to one of our EC2 instances running the application in our private subnets. The EC2 instances accept this traffic as their security group accepts all requests from the load balancer.
- The application makes a request to the Aurora instance in order to retrieve the information from the database.
- The response is returned back through the system.
- Our application on our EC2 instance makes an outgoing request to an external API.
- This is routed to the NAT Gateway via the route table. The NAT Gateway then routes the request to the open internet via the Internet Gateway.
- The response is returned.