CloudFront Flashcards
CloudFront
content delivery network or CDN
It improves read performance because content is distributed and cached at edge locations all around the world,
about 216 globally.
The idea is that the more users you have in a region, the more likely they are to request the same content. They all get that content served from close to them, even if the S3 bucket is in a totally different region, because it is fetched once from the origin and then cached and served locally.
on top of this caching at the edge CloudFront gives you
DDoS protection (distributed denial of service)
It also integrates with AWS Shield and AWS WAF (Web Application Firewall).
it’s a good way to front your applications when you deploy them globally
CloudFront allows you to expose
an HTTPS endpoint by loading certificates, and also to talk HTTPS internally to your applications if you need to encrypt that traffic as well.
CloudFront allows you to distribute
your reads all around the world across these edge locations. This improves latency
and reduces the load on your main S3 buckets.
what are the different CloudFront origins?
- using CloudFront in front of S3
- using CloudFront as an ingress, to upload files into S3 from anywhere in the world
- using a custom origin, which must be an HTTP endpoint, meaning anything that speaks the HTTP protocol: an Application Load Balancer, an EC2 instance, an S3 website, or any HTTP backend you want, for example one running on your on-premises infrastructure
using CloudFront in front of S3
is a very common pattern to distribute your files globally and cache them at the edge. You also get enhanced security between CloudFront and your S3 bucket by using a CloudFront OAI (origin access identity). This lets your S3 bucket accept requests only from CloudFront and from nowhere else.
how does it work
We have a bunch of edge locations all around the globe, and they're connected to the origin we defined.
It could be an S3 bucket or any HTTP endpoint.
- Our clients will send an HTTP request directly into CloudFront.
- The edge location will forward the request to your origin.
- Then your origin responds to the edge location.
- The edge location will cache the response based on the cache settings we’ve defined and return the response back to our clients.
- the next time another client makes a similar request,
the edge location will first look into the cache before forwarding the request to the origin.
That is the whole purpose of having a CDN.
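The request flow above can be sketched as a tiny simulation. This is only a toy model, not how CloudFront is implemented: `EdgeLocation`, `origin`, and the 60-second TTL are hypothetical names and values picked for the example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class EdgeLocation:
    """Toy model of one edge location with a TTL-based cache (hypothetical)."""

    def __init__(self, fetch_from_origin: Callable[[str], str], ttl_seconds: float):
        self.fetch_from_origin = fetch_from_origin
        self.ttl_seconds = ttl_seconds
        # path -> (response body, time at which the cached entry expires)
        self.cache: Dict[str, Tuple[str, float]] = {}
        self.origin_hits = 0

    def get(self, path: str, now: Optional[float] = None) -> str:
        now = time.monotonic() if now is None else now
        entry = self.cache.get(path)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: the origin is not contacted
        # Cache miss or expired entry: forward the request to the origin.
        body = self.fetch_from_origin(path)
        self.origin_hits += 1
        self.cache[path] = (body, now + self.ttl_seconds)
        return body


def origin(path: str) -> str:
    """Stand-in for an S3 bucket or HTTP backend."""
    return f"content of {path}"


edge = EdgeLocation(origin, ttl_seconds=60)
edge.get("/index.html", now=0)   # first request: fetched from the origin
edge.get("/index.html", now=30)  # within the TTL: served from the cache
edge.get("/index.html", now=61)  # TTL expired: fetched from the origin again
print(edge.origin_hits)
```

Passing `now` explicitly keeps the example deterministic; three client requests reach the origin only twice.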
S3 buckets as an origin
for example, you have an edge location in Los Angeles
and some users want to read some data from there.
So your edge location is going to fetch the data
from your S3 buckets over the private AWS network
and give you the results from that edge location.
For a CloudFront edge location to access your S3 bucket, it uses an OAI (origin access identity), a special CloudFront identity that plays the part of an IAM role for your CloudFront origin.
Using that identity, it accesses your S3 bucket,
and the bucket policy says yes, this identity is allowed, and yes, send the file to CloudFront.
This works the same way for other edge locations, for example in Sao Paulo in Brazil, or Mumbai, or Melbourne. So all around the world, your edge locations are going to serve cached content.
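The bucket policy side of this can be sketched as follows. It is a minimal example, assuming the usual OAI principal format; the bucket name `my-example-bucket` and the OAI ID `E2EXAMPLE1234` are made up, and `oai_bucket_policy` is a hypothetical helper.

```python
import json


def oai_bucket_policy(bucket_name: str, oai_id: str) -> dict:
    """Build a bucket policy that lets one CloudFront OAI read objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontOAIRead",
                "Effect": "Allow",
                # The OAI is referenced as a special CloudFront principal.
                "Principal": {
                    "AWS": "arn:aws:iam::cloudfront:user/"
                           f"CloudFront Origin Access Identity {oai_id}"
                },
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# Placeholder values for illustration only.
policy = oai_bucket_policy("my-example-bucket", "E2EXAMPLE1234")
print(json.dumps(policy, indent=2))
```

Since no other principal is granted access and the bucket stays private, requests that do not come through CloudFront are denied.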
EC2 as an origin
Our EC2 instances must be public, because the edge locations must be able to reach them over HTTP.
Our users all around the world access our edge location, the edge location accesses our EC2 instance, and that traffic traverses the security group.
So the security group must allow the IPs of the CloudFront edge locations into the EC2 instance. There is a published list of public IPs for edge locations that you can use for this.
The security group must allow all of these public IPs so that CloudFront can fetch content from your EC2 instances.
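Picking the CloudFront ranges out of a published IP list could look like the sketch below. It assumes the list has the shape of the JSON file AWS publishes for its IP ranges (a `prefixes` array with `ip_prefix` and `service` keys); the sample entries use documentation-only addresses and are not real CloudFront ranges, and both helper functions are hypothetical.

```python
import ipaddress
from typing import Iterable, List

# Made-up sample data in the assumed shape of the published IP range list.
SAMPLE_IP_RANGES = {
    "prefixes": [
        {"ip_prefix": "203.0.113.0/24", "region": "GLOBAL", "service": "CLOUDFRONT"},
        {"ip_prefix": "198.51.100.0/24", "region": "us-east-1", "service": "EC2"},
        {"ip_prefix": "192.0.2.0/25", "region": "GLOBAL", "service": "CLOUDFRONT"},
    ]
}


def cloudfront_cidrs(ip_ranges: dict) -> List[str]:
    """Return the CIDR blocks tagged with the CLOUDFRONT service."""
    return sorted(
        p["ip_prefix"] for p in ip_ranges["prefixes"] if p["service"] == "CLOUDFRONT"
    )


def is_allowed(source_ip: str, allowed_cidrs: Iterable[str]) -> bool:
    """Mimic a security group check: is the source IP inside any allowed CIDR?"""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


allowed = cloudfront_cidrs(SAMPLE_IP_RANGES)
print(allowed)
print(is_allowed("203.0.113.7", allowed))   # inside a CLOUDFRONT range
print(is_allowed("198.51.100.9", allowed))  # EC2 range, not allowed here
```

Each CIDR in `allowed` would become one inbound rule on the security group.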
Load Balancer as an origin
We have a security group for the load balancer, and the load balancer must be public to be accessible by CloudFront. But the backend EC2 instances can now be private.
security group for your ALB
must allow the public IPs of the edge locations
geo restriction
You can restrict who can access your distribution. You can provide a whitelist: only
users from this list of approved countries can access the distribution.
Or you can provide a blacklist: users from these countries
are not allowed to access the distribution.
The country is determined by matching the incoming IP against a third-party Geo-IP database.
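The whitelist and blacklist logic can be sketched like this. It is a simplified illustration: `GEO_IP_DB` stands in for the third-party Geo-IP database with made-up entries, and `is_country_allowed` and `can_access` are hypothetical helpers, not CloudFront APIs.

```python
from typing import Iterable

# Stand-in for the third-party Geo-IP database (made-up entries).
GEO_IP_DB = {
    "203.0.113.7": "FR",
    "198.51.100.9": "US",
}


def is_country_allowed(country_code: str, restriction_type: str,
                       countries: Iterable[str]) -> bool:
    """Apply a whitelist or blacklist of country codes to one request."""
    listed = country_code.upper() in {c.upper() for c in countries}
    if restriction_type == "whitelist":
        return listed        # only listed countries get in
    if restriction_type == "blacklist":
        return not listed    # listed countries are kept out
    if restriction_type == "none":
        return True          # no geo restriction configured
    raise ValueError(f"unknown restriction type: {restriction_type}")


def can_access(source_ip: str, restriction_type: str,
               countries: Iterable[str]) -> bool:
    """Look up the country for the incoming IP, then apply the restriction."""
    country = GEO_IP_DB.get(source_ip, "UNKNOWN")
    return is_country_allowed(country, restriction_type, countries)


print(can_access("203.0.113.7", "blacklist", ["FR"]))   # request from France is blocked
print(can_access("198.51.100.9", "whitelist", ["US"]))  # request from the US is allowed
```

In this sketch an IP that is missing from the database is treated as an unknown country, so it fails a whitelist and passes a blacklist; that behavior is a choice made for the example.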
use case for geo restriction
When copyright laws require you to restrict access to your content, and you want to prove to regulators
that you are indeed restricting content access from,
say, France when your content is in America.
CloudFront vs S3 cross region replication
CloudFront uses a global edge network, and files are cached for a TTL (time to live), maybe a day. So it's great when you have static content that must be available everywhere around the world, and you are okay with that content being a little outdated.
S3 cross-region replication must be set up for each region in which you want replication to happen. The files are updated in near real time,
and it's read only, so it helps you with read performance. S3 cross-region replication is great if you have dynamic content that needs to be available at low latency in a few regions.
CloudFront is for caching globally, and S3 cross-region replication is for replicating into select regions.
CloudFront signed URL and cookies
Use them when you want to make a CloudFront distribution private and give people all over the world access to premium, paid, shared content, while still being able to see and know who has access to what on your distribution.
CloudFront signed URL and cookies HOW
When you create a signed URL or a signed cookie, you need to attach a policy that says:
- when the URL or the cookie expires
- which IP ranges can access the data, so if you know the target IPs of your clients, you should definitely use that
- the trusted signers: which AWS accounts can create signed URLs for your users
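The policy described above can be sketched as a small JSON document. This assumes the custom policy layout CloudFront uses for signed URLs (`DateLessThan` with `AWS:EpochTime`, `IpAddress` with `AWS:SourceIp`); the distribution domain, path, and IP range are placeholders, `build_custom_policy` is a hypothetical helper, and signing the policy with the trusted signer's private key is left out.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_custom_policy(resource_url: str, expires_at: datetime,
                        source_cidr: Optional[str] = None) -> str:
    """Build the policy document that a signed URL or cookie would carry."""
    condition: dict = {
        # When the URL or the cookie expires, as a Unix timestamp.
        "DateLessThan": {"AWS:EpochTime": int(expires_at.timestamp())},
    }
    if source_cidr is not None:
        # Which IP range may access the data.
        condition["IpAddress"] = {"AWS:SourceIp": source_cidr}
    policy = {"Statement": [{"Resource": resource_url, "Condition": condition}]}
    # Compact separators keep the document free of extra whitespace.
    return json.dumps(policy, separators=(",", ":"))


# Placeholder values for illustration only.
expires = datetime(2030, 1, 1, tzinfo=timezone.utc)
policy_json = build_custom_policy(
    "https://d111111abcdef8.cloudfront.net/premium/*",
    expires,
    source_cidr="203.0.113.0/24",
)
print(policy_json)
```

The trusted signer does not appear in the policy itself; it is the account whose key pair signs this document.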