Site-to-Site VPN

Using Site-to-Site VPN to provide a network link between an AWS Environment and an on-premises system

SYMPOSIUM OF SERVICES

9/27/2023 · 6 min read


Learning the practical and theoretical underpinnings of advanced networking is no easy matter. Adrian Cantrill’s project on Site-to-Site VPN connections gave me a breadth of knowledge that boosted my confidence in understanding routing, data transfer and encryption over the public internet.


This particular project lays out a system design for a highly available Site-to-Site VPN connection between an AWS VPC and an on-premises environment, using a Transit Gateway and a Customer Gateway (CGW) coupled with a handful of other networking services.


Our first order of business was provisioning the base infrastructure, created reliably through CloudFormation (CF). This comprised both our AWS environment and our on-premises setup, the former consisting of 4 EC2 instances within a VPC alongside a Transit Gateway (TGW). Our on-premises environment resembled a small business network with 2 Ubuntu 18.04 LTS machines configured with StrongSwan (an IPSec solution for encryption and authentication) and FRR (FRRouting, an open-source routing suite for Linux and Unix systems).


Simply put, a Virtual Private Cloud (VPC) is a logically isolated section of the AWS cloud where our AWS resources (like EC2s, DBs, ELBs etc.) are provisioned and live. Every VPC holds a CIDR range - a block of IP addresses over which the VPC holds ‘domain’. For example, if a VPC CIDR range is “10.0.0.0/16”, it encompasses the IP addresses from 10.0.0.0 to 10.0.255.255. Now, it would be rather intimidating if you had to deal with ALL these IPs independently, right? That’s where subnets come in. You can imagine subnets as tiny assistants to the VPC who help organise the VPC CIDR range by breaking it down into many smaller CIDR compartments for greater security and organisation. Resources are provisioned within a subnet and given an IP address from within the subnet CIDR in a process called ‘Subnet Allocation’. Sounds confusing? Probably because it is.
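To make the CIDR arithmetic concrete, here is a minimal sketch using Python's standard `ipaddress` module. The /20 subnet size and the host address 10.0.17.25 are assumptions picked purely for illustration; they are not taken from the project.

```python
import ipaddress

# The example VPC range from the text.
vpc = ipaddress.ip_network("10.0.0.0/16")
print(vpc[0], vpc[-1], vpc.num_addresses)  # 10.0.0.0 10.0.255.255 65536

# Break the VPC range into smaller subnet ranges (a /20 size is an assumption).
subnets = list(vpc.subnets(new_prefix=20))
print(len(subnets))             # 16
print(subnets[0], subnets[1])   # 10.0.0.0/20 10.0.16.0/20

# Find which subnet a given (hypothetical) resource address falls into.
host = ipaddress.ip_address("10.0.17.25")
owner = next(s for s in subnets if host in s)
print(owner)                    # 10.0.16.0/20
```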


For now, if there’s one thing to remember, it’s this: within the wider AWS global architecture, a VPC lives inside a single region, and traffic headed for your resources in that region enters through the VPC.


Our next priority is setting up the CGW to allow a working connection between our local (on-prem) and remote (AWS) environments. Creating a CGW requires a custom BGP ASN (don’t worry, I’ll explain what this is). To establish actual connectivity between AWS and our on-premises environment, we provision TGW VPN attachments as dynamic VPNs with acceleration enabled. Once we supply the TGW ID and the CGW ID (provisioned through CF), we have the base infrastructure needed to start working towards a connection between local and remote!
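The project performs these steps in the AWS console, but the same two resources can be sketched with boto3's EC2 client. This is only an illustration under assumptions: the function names are mine, the ASN and IDs you pass in are placeholders, and `ec2` is expected to be something like `boto3.client("ec2")` that you create yourself (boto3 is deliberately not imported here).

```python
def create_cgw(ec2, public_ip, bgp_asn):
    """Create a Customer Gateway for an on-prem router with a custom BGP ASN."""
    resp = ec2.create_customer_gateway(
        Type="ipsec.1",        # the VPN type AWS supports
        PublicIp=public_ip,    # outside IP of the on-prem router
        BgpAsn=bgp_asn,        # custom ASN for the on-prem side
    )
    return resp["CustomerGateway"]["CustomerGatewayId"]


def create_accelerated_vpn(ec2, tgw_id, cgw_id):
    """Attach a dynamic (BGP), accelerated S2S VPN to a Transit Gateway."""
    resp = ec2.create_vpn_connection(
        Type="ipsec.1",
        TransitGatewayId=tgw_id,
        CustomerGatewayId=cgw_id,
        Options={
            "StaticRoutesOnly": False,   # False means dynamic routing via BGP
            "EnableAcceleration": True,  # accelerated S2S VPN
        },
    )
    return resp["VpnConnection"]["VpnConnectionId"]
```

With real credentials you would call `create_cgw(ec2, "<router public IP>", <your ASN>)` once per on-prem router, then `create_accelerated_vpn(ec2, tgw_id, cgw_id)` for each resulting CGW.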


A Virtual Private Network, or VPN, lets 2 or more networks hold a secure connection over the internet. Site-to-Site (S2S) VPNs provide a logical connection between our VPC and on-premises network. The connection itself (going over the public internet) is encrypted via IPSec - a suite of protocols for encrypting network traffic at the IP layer. Dynamic VPNs use what we call the Border Gateway Protocol (BGP). BGP is a routing protocol which determines HOW data flows from one network to another. BGP operates between Autonomous Systems (AS), each assigned its own AS Number (ASN). An AS is a network (or group of networks) under a single administrative control; in our case, AWS sits in one AS and the on-premises network in another. ASes peer with one another, and in doing so share the best paths they know to each destination in what we call the Autonomous System Path (ASPath). Let’s take an example. The cities of Japan each have an AS. Tokyo[ASN-100] peers with Nagoya[ASN-101]. ASN-100 now knows the ASPath to ASN-101. Now imagine Nagoya[ASN-101] also peers with Osaka[ASN-102]. Because Nagoya advertises the routes it has learnt, Tokyo[ASN-100] becomes aware of the network path to Osaka[ASN-102] via Nagoya[ASN-101]. This is all BGP at work: by exchanging ASPaths, routers can reach a myriad of other locations and, by default, prefer the shortest ASPath (which is not necessarily the fastest link)!
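The Tokyo, Nagoya and Osaka example can be played out in a few lines of Python. This is a toy model, not real BGP: it only captures the idea that paths spread between peers and that the shortest ASPath wins. The function name and the breadth-first search are my own simplification, and the path shown includes the starting AS for readability.

```python
from collections import deque


def learn_as_paths(peerings, origin):
    """Return the shortest AS path from `origin` to every reachable AS."""
    neighbours = {}
    for a, b in peerings:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    paths = {origin: [origin]}
    queue = deque([origin])
    while queue:
        asn = queue.popleft()
        for peer in sorted(neighbours.get(asn, ())):
            if peer not in paths:  # first arrival in a BFS is a shortest path
                paths[peer] = paths[asn] + [peer]
                queue.append(peer)
    return paths


# Tokyo=100, Nagoya=101, Osaka=102
via_nagoya = learn_as_paths([(100, 101), (101, 102)], origin=100)
print(via_nagoya[102])  # [100, 101, 102]

# If Tokyo later peers with Osaka directly, the shorter path is preferred.
direct = learn_as_paths([(100, 101), (101, 102), (100, 102)], origin=100)
print(direct[102])      # [100, 102]
```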


Performing a local → remote network connection involves downloading the configuration files generated for our S2S VPN connections, with IKEv1 selected as the IKE version. It is during IKE that public keys are exchanged. Before proceeding further, 2 core networking concepts surrounding routing and encryption must be understood.


Upon creating a TGW VPN attachment, you’re creating 2 tunnels - each a connection between AWS and the CGW. A tunnel is the encrypted connection between the two ends of the VPN, carried over the public internet. Each tunnel has two kinds of IP address: Outside IPs and Inside IPs. In the context of a VPN connection, the AWS Outside IP establishes a secure connection with your Customer Outside IP. On the local side, the Outside IP connects the on-premises network to the public internet (and to AWS thereafter). The tunnel that connects to the CGW router is established between these Outside IPs, and the Inside IPs live inside it. The Inside IPs communicate with one another to route data via BGP.
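As a small illustration of Inside IPs: AWS assigns each IPv4 tunnel a /30 inside range from 169.254.0.0/16, which leaves exactly two usable addresses. The sketch below assumes the usual convention that the first usable address belongs to the AWS end and the second to the customer end; treat the downloaded configuration file as the source of truth. The CIDR used is hypothetical.

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def inside_ips(tunnel_cidr):
    """Split a /30 tunnel inside CIDR into (aws_inside_ip, cgw_inside_ip)."""
    net = ipaddress.ip_network(tunnel_cidr)
    if net.prefixlen != 30 or not net.subnet_of(LINK_LOCAL):
        raise ValueError("expected a /30 inside 169.254.0.0/16")
    # A /30 has exactly two usable host addresses.
    aws_ip, cgw_ip = net.hosts()
    return aws_ip, cgw_ip


aws_ip, cgw_ip = inside_ips("169.254.10.0/30")  # hypothetical tunnel CIDR
print(aws_ip, cgw_ip)  # 169.254.10.1 169.254.10.2
```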


Further to this is the Internet Key Exchange (IKE), which occurs between the two ends of each tunnel. There are two phases to IKE: phase 1 handles authentication and key exchange, and phase 2 sets up the encryption used for data. During phase 1, the peers authenticate each other using their pre-shared key (proving that they are both part of the same VPN). Each side then generates a private Diffie-Hellman (DH) key, from which it derives a public key that is exchanged with the other peer. The result of this exchange is a shared DH key, which can be used to exchange key material and agreements, the result of which is a symmetric key. In phase 2, the symmetric key is used to protect the exchange of more key material. As a result, the shared DH key and the exchanged key material produce a symmetric IPSec key. It is this key which is used for bulk, large-scale data transfers within the tunnel.
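The Diffie-Hellman step is easier to believe once you see both sides arrive at the same secret without ever sending it. Below is a toy sketch: the prime, the generator and the SHA-256 key derivation are stand-ins I chose for illustration, not the DH groups or key derivation functions that IKE actually negotiates, and none of it is safe for real use.

```python
import hashlib
import secrets

P = 2**127 - 1  # a Mersenne prime; a toy modulus, far too weak for real DH
G = 3           # toy generator


def dh_keypair():
    """Generate a private DH value and derive the public value from it."""
    private = secrets.randbelow(P - 3) + 2  # random value in [2, P-2]
    public = pow(G, private, P)             # public = G^private mod P
    return private, public


def derive_key(shared_secret, key_material):
    """Toy stand-in for turning the DH secret plus key material into a key."""
    data = shared_secret.to_bytes(16, "big") + key_material
    return hashlib.sha256(data).digest()


# Each peer keeps its private value and sends only the public value.
aws_private, aws_public = dh_keypair()
cgw_private, cgw_public = dh_keypair()

# Each peer combines its own private value with the other's public value.
aws_shared = pow(cgw_public, aws_private, P)
cgw_shared = pow(aws_public, cgw_private, P)
print(aws_shared == cgw_shared)  # True

# Extra key material exchanged later yields the same symmetric key on both ends.
material = secrets.token_bytes(16)
print(derive_key(aws_shared, material) == derive_key(cgw_shared, material))  # True
```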


Establishing a working connection between our on-prem and AWS environments required entering three things into the configuration files on our on-prem routers (EC2 instances which stand in for the customer environment): the IP details of our VGW (the AWS end of each connection) and CGW, the pre-shared keys (which are stored in a secrets file) and, finally, the inside IP addresses of our VGW, used to provision a virtual tunnel interface for each tunnel. These files were then read by the StrongSwan software to establish a connection. With this system design in place, we could now prepare to route traffic and send data packets via BGP.
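For a feel of what the secrets file holds, the sketch below renders pre-shared key entries in the classic strongSwan `ipsec.secrets` style (`<local> <remote> : PSK "<key>"`). The helper name, the IP addresses (taken from documentation-only ranges) and the keys are all made up; the real values come from the configuration file downloaded from AWS, and the project's exact file layout is not shown in this post.

```python
def render_psk_line(cgw_outside_ip, aws_outside_ip, psk):
    """Render one ipsec.secrets-style pre-shared key entry for a tunnel."""
    return f'{cgw_outside_ip} {aws_outside_ip} : PSK "{psk}"'


# One on-prem router with two tunnels (all values are placeholders).
tunnels = [
    ("203.0.113.10", "198.51.100.5", "example-key-tunnel-1"),
    ("203.0.113.10", "198.51.100.77", "example-key-tunnel-2"),
]
secrets_file = "\n".join(render_psk_line(*t) for t in tunnels)
print(secrets_file)
```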


At this stage, 4 IPSec tunnels have been created, with 4 AWS endpoints pointing to our 2 customer routers. In our case, a prerequisite for BGP was installing FRR. The ASN of the local BGP process is then set on each on-prem router. Finally, peering is established with the AWS side over BGP which, by redistributing connected routes, ensures that each side learns the other’s networks across all of the tunnels.
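A rough idea of the BGP configuration that ends up on each on-prem router is sketched below by rendering FRR-style commands from Python. The commands themselves (`router bgp`, `neighbor ... remote-as`, `no bgp ebgp-requires-policy`, `redistribute connected`) are standard FRR syntax, but the post does not show the project's actual configuration, so the ASNs and neighbour addresses here are assumptions for illustration.

```python
def render_frr_bgp(local_asn, aws_asn, aws_inside_ips):
    """Render an FRR-style BGP config peering with each AWS tunnel inside IP."""
    lines = [f"router bgp {local_asn}"]
    for ip in aws_inside_ips:
        # One BGP neighbour per tunnel, reached via the AWS inside IP.
        lines.append(f" neighbor {ip} remote-as {aws_asn}")
    lines += [
        " no bgp ebgp-requires-policy",  # accept/advertise routes without route-maps
        " address-family ipv4 unicast",
        "  redistribute connected",      # advertise directly connected networks
        " exit-address-family",
    ]
    return "\n".join(lines)


# Hypothetical values: 65016 on-prem, 64512 on the AWS side, two tunnels.
config = render_frr_bgp(65016, 64512, ["169.254.10.1", "169.254.11.1"])
print(config)
```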

Heading over to S2S VPN Connections, we should now see 2 separate tunnels per connection, each up and populated with BGP routes. Additionally, thanks to dynamic routing across multiple tunnels, together with the acceleration we enabled when creating the VPN attachments, traffic can fail over between tunnels, ensuring high availability!


The various concepts surrounding advanced networking were at first quite challenging. There were several occasions where I had minor setbacks, an example being an initial error when attempting to have the IP routes read by the StrongSwan software (a result of incorrectly configuring my on-prem router’s IP address!). Retracing my steps resolved the issue and further solidified my understanding of the core differences between a VGW and a CGW. The concepts featured in today’s blog are a mere fraction of what you’ll find when exploring AWS’ networking capabilities (I didn’t even talk about Egress-Only Gateways, Accelerated S2S VPN, AWS Direct Connect or anycast IPs!).

Regardless, what you’ve learnt today is more than enough to give you a first step into the world of advanced networking within AWS. I hope you enjoyed reading this as much as I enjoyed writing it. If you also wish to try your hand at this project, then you can find it here.


Introduction

Initialisation & Setup

Detour 1: VPCs & Subnets

Local & Remote Connection PHASE 1

Detour 2: VPNs & BGPs

Detour 3.1: Outside & Inside IPs

Detour 3.2: IKE Phases

Local & Remote Connection PHASE 2

Local & Remote Connection PHASE 3

Conclusion