Pluralsight - Elastic Infrastructure (Scaling & Automation) Flashcards
Cloud VPN
- Connects on-premises network to Google Cloud network through IPsec VPN tunnel
- Data travels over the public internet
- Good for low volume data
- Supports static and dynamic routing (via Cloud Router): dynamic routing is like data travelling using a GPS that updates the route in real time as conditions change
- Two public IP addresses exist, one at each gateway end.
Eg:
You have a VPC network with 2 subnets, each subnet has resources (incl VMs). These can communicate with each other using internal IPs because they are part of the same network. But the on-premises resources cannot communicate with those on the VPC. Need VPN:
- Configure Cloud VPN gateway, on premises gateway and 2 tunnels.
Cloud VPN gateway - regional resource that uses regional external IP address
On-premises VPN gateway - can be a physical device in your datacentre or a software VPN gateway hosted by another cloud provider; this gateway also has an external IP address.
Tunnel - connects the two gateways and creates secure connection - Maximum Transmission Unit (MTU) for on-prem gateway cannot be > 1460 bytes
- Note: there is a second type of Cloud VPN Gateway - HA VPN Gateway (High Availability) (2 or 4 tunnels are required for config)
– to get the 99.99% SLA, both interfaces (each with its own external IP) must have active tunnels configured
– dynamic routing only
– can create active/active or active/passive routing configuration (In an active/active routing configuration, both paths or routes are actively used for traffic simultaneously)
HA VPN is a regional per VPC, VPN solution. HA VPN gateways have two interfaces, each with its own public IP address. When you create an HA VPN gateway, two public IP addresses are automatically chosen from different address pools. When HA VPN is configured with two tunnels, Cloud VPN offers a 99.99% service availability uptime.
There are 3 types of connections with HA VPN Gateway to Peer VPN:
1. HA VPN Gateway to 2 separate peer VPN devices (each with their own IP addresses)
2. HA VPN Gateway to 1 peer VPN device that has 2 IP addresses
3. HA VPN Gateway to 1 peer VPN device that uses 1 IP address
There are 2 types of connections from an HA VPN Gateway to an AWS gateway. Between them, 2 to 4 tunnels are used (e.g. 2 tunnels from one interface of the HA VPN gateway go to 2 interfaces on the AWS side):
1. Transit gateway (distributes traffic across all tunnels equally)
2. Virtual Private Gateway
There is also an HA VPN Gateway to HA VPN Gateway connection, used to connect two separate VPC networks.
Dynamic Routes using Cloud Routers
- Border Gateway Protocol (BGP) is used by Cloud Router to manage traffic in the tunnels
- Each BGP session requires its own pair of link-local IP addresses (169.254.x.x), one on each side of the tunnel
- To expand the address range you can simply add a new subnet (with its VMs); with dynamic routing the new range is advertised automatically and the other configs aren't affected
- Dynamic routing means that all another network needs to know is the ASN (autonomous system number) to connect to a group of subnets. It then doesn’t matter which subnet is available, or if more subnets are added to this group. Cloud Router will manage this.
- Command to create a custom VPC:
gcloud compute networks create vpc-demo --subnet-mode custom
- Command to add subnets and ranges:
gcloud compute networks subnets create vpc-demo-subnet1 \
    --network vpc-demo --range 10.1.1.0/24 --region us-west1
- Command to create the HA VPN gateways (one is needed on the cloud VPC and one on the simulated on-prem VPC, hence two commands):
gcloud compute vpn-gateways create vpc-demo-vpn-gw1 --network vpc-demo --region us-west1
gcloud compute vpn-gateways create on-prem-vpn-gw1 --network on-prem --region us-west1
- View details of the VPN gateway:
gcloud compute vpn-gateways describe vpc-demo-vpn-gw1 --region us-west1
- Create the Cloud Router:
gcloud compute routers create vpc-demo-router1 \
    --region us-west1 \
    --network vpc-demo \
    --asn 65001
- Create the tunnels (one per interface of the HA VPN gateway, and the same again on the on-prem side, so 4 in total):
gcloud compute vpn-tunnels create vpc-demo-tunnel0 \
    --peer-gcp-gateway on-prem-vpn-gw1 \
    --region us-west1 \
    --ike-version 2 \
    --shared-secret [SHARED_SECRET] \
    --router vpc-demo-router1 \
    --vpn-gateway vpc-demo-vpn-gw1 \
    --interface 0
- Create a Cloud Router interface + BGP peer for each tunnel (so for all 4 tunnels):
gcloud compute routers add-interface vpc-demo-router1 \
    --interface-name if-tunnel0-to-on-prem \
    --ip-address 169.254.0.1 \
    --mask-length 30 \
    --vpn-tunnel vpc-demo-tunnel0 \
    --region us-west1
gcloud compute routers add-bgp-peer vpc-demo-router1 \
    --peer-name bgp-on-prem-tunnel0 \
    --interface if-tunnel0-to-on-prem \
    --peer-ip-address 169.254.0.2 \
    --peer-asn 65002 \
    --region us-west1
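The on-prem side mirrors this. A hedged sketch of the matching tunnel for interface 0 (assuming an on-prem-router1 Cloud Router was created the same way and the same shared secret is used; repeat for interface 1):
gcloud compute vpn-tunnels create on-prem-tunnel0 \
    --peer-gcp-gateway vpc-demo-vpn-gw1 \
    --region us-west1 \
    --ike-version 2 \
    --shared-secret [SHARED_SECRET] \
    --router on-prem-router1 \
    --vpn-gateway on-prem-vpn-gw1 \
    --interface 0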
- Create VPCs with VMs
- Create HA VPN on each VPC (1 per VPC)
- Create Cloud Routers on each VPC (1 per VPC)
- Create Tunnels (2 per VPC on interface 0 and interface 1)
- Create the router interface and then the BGP peer for each tunnel (do interface + BGP peer for one tunnel, then for the next, etc.)
- Create firewall rules to allow traffic into the on-prem VPC and into the cloud VPC (see the example after the notes below)
Note on the 'Create Cloud Routers' step: this creates the basic router infrastructure in each VPC.
Note on the 'router interface + BGP peer' step: this configures the router for each specific VPN tunnel, ensuring it is aware of and can handle the traffic coming through that tunnel.
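A hedged example of such a firewall rule on the cloud VPC, assuming the simulated on-prem subnet range is 192.168.1.0/24 (the mirror-image rule on the on-prem VPC would allow the 10.1.1.0/24 range):
gcloud compute firewall-rules create vpc-demo-allow-subnets-from-on-prem \
    --network vpc-demo \
    --allow tcp,udp,icmp \
    --source-ranges 192.168.1.0/24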
Cloud Interconnect & Peering general descriptions
Dedicated connection - direct physical connection (transfer large amounts of data)
1. Direct Peering
- it is a direct, dedicated connection to Google's network edge, but uses public IPs
- need to configure BGP peering
- no SLA
- need to be close to the PoP Edge locations, if not, use Carrier Peering
2. Dedicated Interconnect
- allows connection through internal IPs
- requires a common colocation facility where your on-premises network is physically connected to Google's network
- need to configure BGP peering
Shared connection
1. Carrier Peering
- you go through a carrier/partner to reach Google's infrastructure (here Google cannot guarantee a 99.99% SLA)
- connects through public IPs
2. Partner Interconnect - 2nd best
- consider when dedicated interconnect isn’t an option (so you cannot physically tap into Google’s infrastructure)
- L2 Partner Interconnect requires BGP peering (L3 doesn’t)
- connect through Internal IPs
Cloud VPN - 2nd best
- traffic travels over public internet but allows connection to Internal IP
- good addition to Direct Peering and Carrier Peering
- Bandwidth is lower than any other connection (roughly 1.5 to 3 Gbps per tunnel)
Google recommends using Cloud (Dedicated) Interconnect instead of Direct Peering and Carrier Peering, which are used only in certain circumstances
Cloud VPN over Interconnect option must be chosen if you want Google to manage encryption keys
Configurations to share VPN connectivity between several GCP projects/VPC networks
Setting up ONE VPN connection for several VPCs:
Shared VPC
- share one VPN connection (and one VPC network) across several projects
- the projects are connected under one VPC and communicate using internal IPs
- one project is chosen to be the host project, the rest are service projects
- a project that isn't part of the Shared VPC is a standalone project with its own VPC
- all network policies/firewall rules are administered centrally in the host project
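A hedged sketch of the gcloud commands involved (project IDs are placeholders, and the caller needs the Shared VPC admin role):
gcloud compute shared-vpc enable HOST_PROJECT_ID
gcloud compute shared-vpc associated-projects add SERVICE_PROJECT_ID \
    --host-project HOST_PROJECT_ID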
VPC network Peering
- configure private communication between VPC networks in the same or different organisations
- allows internal IP connectivity regardless of whether the VPCs belong to the same project or organisation
- avoids the network latency and egress costs of communicating via external IPs or VPNs
- firewall rules don't need to be united, and admins can stay separate
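A hedged sketch of creating a peering (network and project names are illustrative; a matching peering must also be created from network-b back to network-a before the connection becomes active):
gcloud compute networks peerings create peer-ab \
    --network network-a \
    --peer-project project-b-id \
    --peer-network network-b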
Managed Instance Groups (MIGs) + Health Checks
- a group of identical VM instances created from a common instance template
- can be scaled up/down (autoscaling) as load demands, and works together with load balancing
- failed instances are recreated automatically (with the same name, from the template)
- Regional MIGs are recommended over zonal instance groups
Health Checks
- used for autohealing: instances that fail the health check are recreated (see the example commands below)
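A hedged sketch of the gcloud commands for a regional MIG with autohealing and autoscaling (all names, the region and the thresholds are illustrative):
- Create an instance template:
gcloud compute instance-templates create web-template --machine-type=e2-micro
- Create a regional MIG of 3 instances from the template:
gcloud compute instance-groups managed create web-mig \
    --region=us-central1 --template=web-template --size=3
- Create an HTTP health check and attach it for autohealing (unhealthy VMs get recreated):
gcloud compute health-checks create http http-basic-check --port=80
gcloud compute instance-groups managed update web-mig \
    --region=us-central1 --health-check=http-basic-check --initial-delay=300
- Enable autoscaling on CPU utilisation:
gcloud compute instance-groups managed set-autoscaling web-mig \
    --region=us-central1 --max-num-replicas=5 --target-cpu-utilization=0.6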
Terraform - automate and deploy infrastructure
(eg VPC networks, VM instances, firewall rules)
Lab example: add a firewall rule to allow HTTP, SSH, RDP and ICMP traffic on mynetwork (see the configs below)
- Configuration files describe the infrastructure to deploy
- Modules are used to modularise and reuse configurations (see the instance module below)
- Terraform is an open-source tool that lets you provision Google Cloud resources
- Instead of deploying resources one by one, you specify the whole set of resources and the configuration is applied to them all at once
- Unlike running commands sequentially in Cloud Shell, Terraform deploys resources in parallel
- terraform init - initialises a new Terraform configuration (must be run in the folder where the .tf files are created)
- terraform plan - refreshes state and previews the changes that would be made, without applying them
- terraform apply - creates the infrastructure defined in the .tf files
provider.tf
provider "google" {
  region = "us-central1"
}
Creating a VPC network
mynetwork.tf
# Create the mynetwork network
resource "google_compute_network" "mynetwork" {
  name = "mynetwork"
  # RESOURCE properties go here
  auto_create_subnetworks = "true"
}
resource "google_compute_firewall" "mynetwork-allow-http-ssh-rdp-icmp" {
  name    = "mynetwork-allow-http-ssh-rdp-icmp"
  # RESOURCE properties go here
  network = google_compute_network.mynetwork.self_link
  allow {
    protocol = "tcp"
    ports    = ["22", "80", "3389"]
  }
  allow {
    protocol = "icmp"
  }
  source_ranges = ["0.0.0.0/0"]
}
module "mynet-us-vm" {
  source           = "./instance"
  instance_name    = "mynet-us-vm"
  instance_zone    = "us-east1-d"
  instance_network = google_compute_network.mynetwork.self_link
}
module "mynet-eu-vm" {
  source           = "./instance"
  instance_name    = "mynet-eu-vm"
  instance_zone    = "europe-west1-d"
  instance_network = google_compute_network.mynetwork.self_link
}
# Illustrative output (would live inside the instance module, next to the vm_instance resource)
output "instance_ip" {
  value = google_compute_instance.vm_instance.network_interface[0].network_ip
}
main.tf
Adding details to create VMs
resource "google_compute_instance" "vm_instance" {
  name         = "${var.instance_name}"
  zone         = "${var.instance_zone}"
  machine_type = "${var.instance_type}"
  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }
  network_interface {
    network = "${var.instance_network}"
    access_config {
      # Allocate a one-to-one NAT IP to the instance
    }
  }
}
variables.tf
variable "instance_name" {}
variable "instance_zone" {}
variable "instance_type" {
  default = "e2-micro"
}
variable "instance_network" {}
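Variables with a default (like instance_type) can be overridden at apply time; an illustrative example:
terraform apply -var="instance_type=e2-small"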
Managed Services:
- BigQuery
Data transformation tools (to clean up the data, delete trailing spaces, nulls etc)
- Dataflow
- Dataprep
- Dataproc
Dataflow
- serverless data processing in both streaming and batch modes
- autoscales worker resources as the stream or batch job demands
- integrates closely with other Google Cloud products, e.g. Cloud Monitoring (Stackdriver) for alerting
Dataprep
- if used, it sits before Dataflow to clean and prepare raw data from BigQuery, Cloud Storage or file uploads from your PC
Dataproc
- NOT serverless because you can see and manage the underlying master and worker instances
- For running Apache Spark and Apache Hadoop clusters
- it allows these clusters to start quickly (on the order of 90 seconds, versus several minutes or more without Dataproc)
- Built-in integration with other gcloud services eg BigQuery, Cloud Storage (so it’s more than just a Spark/Hadoop cluster that don’t have this in-built integration)
NOTE:
- Dataflow & Dataproc are similar but:
– Dataflow provisions worker resources automatically (serverless), while with Dataproc you create and size the cluster yourself
– if you need specific tools/packages in the Spark/Hadoop ecosystem, use Dataproc
– basically Dataproc allows for more customisation
Dataproc clusters creation DEMO
Use the console –> Create cluster –> give it a name, region, etc.
Define the master node properties as well as the worker node properties.
This causes new VM instances to be created.
Once the cluster is ready, you can change the number of worker nodes in its configuration.
- Can also submit a new job by clicking on the ‘Jobs’ section of the Dataproc window.
- can specify job type (eg Spark)
- Once the job is done, delete the cluster if not needed
Dataproc allows you to easily create, configure, and manage clusters for big data processing.
Dataproc is specifically designed for running and managing distributed data processing frameworks on GCP, such as Spark and Hadoop.
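The same flow can be scripted with gcloud instead of the console; a hedged sketch (cluster name, region and machine types are illustrative; the SparkPi example jar ships with Dataproc images):
- Create the cluster:
gcloud dataproc clusters create example-cluster \
    --region=us-central1 \
    --num-workers=2 \
    --master-machine-type=n1-standard-2 \
    --worker-machine-type=n1-standard-2
- Submit a Spark job (the SparkPi example):
gcloud dataproc jobs submit spark \
    --cluster=example-cluster \
    --region=us-central1 \
    --class=org.apache.spark.examples.SparkPi \
    --jars=file:///usr/lib/spark/examples/jars/spark-examples.jar \
    -- 1000
- Delete the cluster when the job is done:
gcloud dataproc clusters delete example-cluster --region=us-central1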
Lab:
SQL for BigQuery and Cloud SQL
To use Cloud SQL and run queries there:
1. Create a Cloud SQL instance (create a password, can change to Multi-zones)
2. In Cloud Shell, run the following command to set your project ID as an environment variable:
export PROJECT_ID=$(gcloud config get-value project)
gcloud config set project $PROJECT_ID
3. Run the following command in Cloud Shell to setup auth without opening up a browser:
gcloud auth login --no-launch-browser
- This will give you a link to open in your browser. Open the link in the same browser where you are logged in to the lab (Qwiklabs) account. Once you log in, you will get a verification code to copy. Paste that code into Cloud Shell.
4. Connect to your SQL instance:
gcloud sql connect my-demo --user=root --quiet
- Enter the password you set earlier and you are connected to the SQL server; you can now run queries
5. From here you can IMPORT tables into Cloud SQL from the Cloud Storage bucket created previously (a hedged gcloud example is sketched below)
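A hedged sketch of such an import via gcloud rather than the console (bucket, file, database and table names are illustrative):
gcloud sql import csv my-demo gs://BUCKET_NAME/data.csv \
    --database=DATABASE_NAME --table=TABLE_NAME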
VPC networks
- Create a custom VPC network:
gcloud compute networks create privatenet --subnet-mode=custom
- Create the privatesubnet-europe-west1 and privatesubnet-us-east1 subnets:
gcloud compute networks subnets create privatesubnet-europe-west1 --network=privatenet --region=europe-west1 --range=172.16.0.0/24
gcloud compute networks subnets create privatesubnet-us-east1 --network=privatenet --region=us-east1 --range=172.20.0.0/20
- List all subnets sorted by VPC network:
gcloud compute networks subnets list --sort-by=NETWORK
- Create a firewall rule:
gcloud compute firewall-rules create privatenet-allow-icmp-ssh-rdp --direction=INGRESS --priority=1000 --network=privatenet --action=ALLOW --rules=icmp,tcp:22,tcp:3389 --source-ranges=0.0.0.0/0
- Create an instance:
gcloud compute instances create privatenet-europe-west1-vm --zone=europe-west1-d --machine-type=e2-micro --subnet=privatesubnet-europe-west1
Note: we can ping VMs by their name (ping -c 3 privatenet-europe-west1-vm) because VPC networks have an internal DNS service that allows you to address instances by their DNS names rather than their internal IP addresses.
When an internal DNS query is made with the instance hostname, it resolves to the primary interface (nic0) of the instance ONLY.
Lab:
Managing Deployments Using Kubernetes Engine
- Create a cluster with 3 nodes:
gcloud container clusters create bootcamp \
    --machine-type e2-small \
    --num-nodes 3 \
    --scopes "https://www.googleapis.com/auth/projecthosting,storage-rw"
- Create a deployment object based on its manifest:
kubectl create -f deployments/auth.yaml
- Verify deployment creation:
kubectl get deployments
- After a deployment is created, Kubernetes creates a ReplicaSet for it:
kubectl get replicasets
- Pods are then created automatically, one per replica in the ReplicaSet:
kubectl get pods
- Create a service based on its manifest:
kubectl create -f services/auth.yaml
This exposes the auth deployment as a service.
For hello deployment:
kubectl create -f deployments/hello.yaml
kubectl create -f services/hello.yaml
- Scale up a deployment:
kubectl scale deployment hello --replicas=5
Rolling update
Updating with new version without downtime: creates a new ReplicaSet and slowly increases the number of replicas in the new ReplicaSet as it decreases the replicas in the old ReplicaSet.
- Edit the deployment that needs the new version:
kubectl edit deployment hello    (change the 'image:' field to the new version)
- Check the rollout history:
kubectl rollout history deployment/hello
- Pause a rollout if needed:
kubectl rollout pause deployment/hello
- Check rollout status if needed:
kubectl rollout status deployment/hello
- Resume the rollout:
kubectl rollout resume deployment/hello
- Undo the rollout:
kubectl rollout undo deployment/hello
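An alternative to interactive editing is setting the new image directly; a hedged example (the container name 'hello' and the image tag are illustrative):
kubectl set image deployment/hello hello=kelseyhightower/hello:2.0.0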
Canary deployments
When you want to test a new deployment in production with a subset of your users, use a canary deployment. Canary deployments allow you to release a change to a small subset of your users to mitigate risk associated with new releases.
- Create the setup file hello-canary.yaml (same app label as hello so the existing service also selects the canary pods, but with only a small number of replicas)
- Create the canary deployment:
kubectl create -f deployments/hello-canary.yaml
- Now when you refresh the website, some requests are served by the new canary version
- In the spec: section of the service's yaml file, add the following so that a client with the same IP consistently lands on the same version:
spec:
sessionAffinity: ClientIP
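Equivalently, the field can be patched without editing the file (assuming the service is named hello):
kubectl patch service hello -p '{"spec":{"sessionAffinity":"ClientIP"}}'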
Blue-green deployments
There are instances where it is beneficial to modify the load balancers to point to that new version only after it has been fully deployed.
So in this rollout, traffic stays on the 'blue' version until all the resources of the 'green' version are fully provisioned; only then is the load balancer (service) switched over to green.
NOTE: make sure you have sufficient (x2) resources in your cluster to do this.
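A hedged sketch of the switch with kubectl, assuming the blue and green deployments carry version labels and the two service manifests differ only in that selector (file names are illustrative):
- Deploy the green version alongside the running blue one:
kubectl apply -f deployments/hello-green.yaml
- Once green is fully up, repoint the service's selector at the green version:
kubectl apply -f services/hello-green.yaml
- To roll back, re-apply the blue service definition:
kubectl apply -f services/hello-blue.yaml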
Lab:
Set Up and Configure a Cloud Environment in Google Cloud