DevOps Flashcards

1
Q

Continuous Integration

A

Continuous integration (CI) is the automated process of integrating code from potentially multiple sources in order to build and test it at several levels: unit, service/API, and functional/GUI (UI interaction tests, e.g. with Capybara). It is a development practice that ensures your application is always in a “good” state.

Gets triggered by source code commits

Runs any tests for frontend or backend code

Builds any artifacts (production js files, rails assets)

Publishes the artifacts (test results to dashboard, or assets to S3)
Triggers a deploy (in Opsworks)
Many CI systems are configured by a config file in the source code repo
Can build multiple branches and only deploy specific branches to specific environments

2
Q

Continuous Delivery/Deployment

A

Continuous Delivery is an automated way to deploy your application to an environment. This can involve a number of automated or manual steps, including more integration testing, performance testing, or manual testing. The level of automation involved depends on your needs.

This includes setting up a brand new environment and taking the code from the repository through to a fully tested and verified distribution.

Your software is deployable throughout its life-cycle
Your team prioritizes keeping the software deployable over working on new features
Anybody can get fast, automated feedback on the production readiness of their systems any time somebody makes a change to them
You can perform push-button deployments of any version of the software to any environment on demand

Continuous Deployment means that every change goes through the pipeline and automatically gets put into production, resulting in many production deployments every day.
Continuous Delivery just means that you are able to do frequent deployments but may choose not to, usually because the business prefers a slower rate of deployment. In order to do Continuous Deployment you must be doing Continuous Delivery.

3
Q

Describe CI/CD pipeline

A

Your CI pipeline is usually triggered when code is checked into an integration branch by a developer. Unit tests are run to ensure basic functionality is correct, and then binaries are built. The binaries created could be a JAR or ZIP file or even a Docker image.

The CD pipeline can be triggered after a successful build, or it can be timed. Typically, for dev environments, your CD pipeline will be triggered by every successful build. Deployment to production can be an automatic process or can require manual sign-off.

4
Q

What is the point of CI/CD?

A

Instead of writing an entire app and then investing a lot of time debugging, CI/CD automates building and testing as we develop. This saves lots of time and speeds up onboarding too!

5
Q

Describe CI/CD best practices

A

It should run the tests after every commit.

6
Q

A single Process

A

Remember: when declaring variables, we are allocating a small space in RAM*

  • Each process has its own ID (PID - process ID)
  • Programs you write are typically “interpreted”
    • Even for high-level languages, the interpreter itself is typically written in C (including the calls that ask the OS to allocate memory)
  • Alternatively, “compiled” programs are usually written in C (or C++)
  • A program written in C has to make system calls to interact with system resources (e.g. malloc is a C library function that requests memory from the OS through system calls such as brk or mmap)
    • The OS vendor needs to supply an interface for this
  • System calls are functions provided by the operating system kernel; the C library (glibc on Linux) wraps them so that C programs can call them
7
Q

Processes as a tree

A

Processes - As a tree

  • Not a list, they are hierarchical
  • If you close the terminal, the processes started from it will usually stop too (they are sent SIGHUP)
  • Everything is ultimately a descendant of Process 1 (PID 1, e.g. init or systemd on Linux)
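
A minimal sketch of the parent/child relationship, assuming a POSIX-like system; the variable names are illustrative only. It prints this process's PID and its parent's PID, then starts a child that reports who its own parent is:

```python
import os
import subprocess
import sys

# PID of this Python process and of the process that started it (its parent).
print("my pid:", os.getpid())
print("parent pid:", os.getppid())

# Start a child process; the child prints the PID of its parent (us).
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getppid())"],
    capture_output=True,
    text=True,
    check=True,
)
child_parent_pid = int(child.stdout.strip())
print("child's parent pid:", child_parent_pid)
```

Following `getppid` upward from any process eventually reaches PID 1, which is why the processes form a tree rather than a list.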
8
Q

Processes - Interaction

A

Processes - Interaction

  • System Calls to access resources
  • Environment Variables are set when the process starts and cannot be changed from outside during the process lifetime
    • always available
  • Stdin, Stdout, Stderr are ways to accept text input from a keyboard, and write text output to a screen
    • one pipe in, 2 pipes out.
    • stdin = typing on a terminal and submitting
    • stdout = whatever the output is
    • we can chain the stdout of one process to the stdin of the next
  • Signals are ways for outside processes to trigger your process, similar to an event listener.
    • Ctrl+C and kill both use the OS concept of signals (Ctrl+C sends SIGINT; kill sends SIGTERM by default).
    • A signal interrupts the process, which can either handle it or fall back to the default action (usually terminating).
    • Ctrl+D is not a signal: it sends end-of-input (EOF) on stdin, which makes a shell exit.
  • Optionally, the process can ask the OS for access to the networking interface, which will assign a port number to your process.
    • To enable networking, the process has to make a system call that opens a network socket.
    • The port ultimately gives you access to the network.
    • Warcraft vs Slack? The port tells the OS which process each incoming packet belongs to.
9
Q

Networking

A
#Networking:
  *OSI Stack - common internet stack: 7 layers*

IP lives at the network layer (layer 3). It gives us our IP address.

  • Each process can open 1 or more ports
    • Not confined to just one port*
  • Ports determine which process will receive an inbound packet.
  • Ports are either TCP or UDP
    • TCP: connection-oriented (stateful); delivery is acknowledged, so the sender knows whether the message made it. (everything not streaming-like)
    • UDP: fire and forget; the sender gets no guarantee or knowledge that the message made it. (video streaming)
  • Each network device has a MAC address
    • A hardcoded ID - relatively random. No structure for addressing scheme.
    • We look up to the next layer: IP address
  • Any connected device also has an IP address
    • We can have 2 IP addresses. The cable and the wifi.*
    • You only get an IP address from someone else.
  • IP Routing:
    • Modem is given its IP by whatever it's connected to
    • Our IP is relatively close to our neighbor's
    • IPv4 addresses range from 0.0.0.0 to 255.255.255.255
    • Jurisdiction is split up over ranges of IP addresses.
    • A server that resides under our street corner will have its own unique IP.
    • The address is like the front door to our house, we only have one. (192 - 196)
    • NAT (Network Address Translation)
    • ISP divvies out pipes
    • Be very familiar with IP addresses*
  • Networking: DNS
    • DNS maps a human readable domain to an IP address.
    • DNS is a global directory
    • DNS is hierarchical
    • Domains can have subdomains as children
    • The root of this tree is the set of 13 root server names ([a-m].root-servers.net)
    • Multiple domains may map to the same IP address
    • A single DNS response can return one or more IP addresses for a name
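
To make the port bullets earlier in this card concrete, here is a minimal sketch using loopback sockets (assuming the loopback interface is available). One process opens both a TCP and a UDP port, and a UDP datagram reaches it purely because of the port number:

```python
import socket

# One process can hold several ports: here a TCP port and a UDP port.
tcp_listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_listener.bind(("127.0.0.1", 0))   # port 0 asks the OS to pick a free port
tcp_listener.listen()
tcp_port = tcp_listener.getsockname()[1]

udp_receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_receiver.bind(("127.0.0.1", 0))
udp_receiver.settimeout(5)
udp_port = udp_receiver.getsockname()[1]

# UDP: send a datagram at the port; there is no connection or acknowledgement.
udp_sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_sender.sendto(b"ping", ("127.0.0.1", udp_port))
payload, sender_address = udp_receiver.recvfrom(1024)

print("TCP port:", tcp_port, "UDP port:", udp_port, "received:", payload)

for sock in (tcp_listener, udp_receiver, udp_sender):
    sock.close()
```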

Networking: URL

  • protocol = language/format (http, email, etc)
  • port number
  • Full URL includes password, port, path etc.
  • combines all this information into one address.
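
As a quick illustration, Python's standard `urllib.parse` can split a URL back into these parts. The URL below is a made-up example that happens to use every component:

```python
from urllib.parse import urlsplit

# Hypothetical URL that uses every part mentioned above.
url = "https://alice:s3cret@example.com:8443/docs/index.html?lang=en#intro"
parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.username)  # alice
print(parts.password)  # s3cret
print(parts.hostname)  # example.com
print(parts.port)      # 8443
print(parts.path)      # /docs/index.html
print(parts.query)     # lang=en
print(parts.fragment)  # intro
```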
13
Q

What is a continuous integration server?

A

Jenkins: an open source tool written in Java with plugins (over 1000) built for CI. Plugins allow integrations at various DevOps stages.

It's the most widely accepted tool because of its flexibility and abundance of plugins. These plugins can help meet the individual needs of individual devs.

When devs make a change to the source code in the repo, the Jenkins server pulls the code and tries to make a build.

The built application is then deployed onto the test server for testing.

Devs are continuously notified of the results.

15
Q

What are shortcomings of using a single Jenkins server?

A

1) If you need IE tests, you need to run a Windows machine.
2) Another build job may require another Linux box.

Solution!

Jenkins distributed architecture.
The Jenkins controller (historically called the master) distributes workload to the agents (historically called slaves).
Agents provide the required environment.

16
Q

AWS

A
  • AWS allows you to get access to computing hardware, and pay per hour: A physical or virtual machine
  • They offer many different machine sizes (different resource allocations) at different pricing rates
  • On top of this base offering, there are many different services available
  • Other services though are typically just convenience or utility on top of these servers
  • S3 is a little different: it's storage that you pay for by data transfer and storage used. One big file storage system.
  • Most AWS services use open source technology at their core
17
Q

Regions

A
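  • Most things in AWS are specific to a region (exceptions are global services such as IAM, Route 53 and CloudFront; S3 bucket names are global, but each bucket still lives in one region)
  • AWS has regions across the world, each made up of several data centers, and each region is managed independently
  • For us in California, there are 2 regions close to us: N. California (us-west-1) and Oregon (us-west-2)
  • Of these 2, Oregon (us-west-2) is generally cheaper than N. California. Use the Oregon region.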
  • Make sure to always double check your region when you log in
  • There could be pricing differences between regions
18
Q

EC2 - Instances

A

The Instances page is the primary place to look at the state of the machines you’re using.

There are instance types optimized for memory-heavy workloads

IAM role

19
Q

EC2 - KeyPairs

A

SSH = secure shell, a remote login protocol that uses encryption
EC2 uses SSH keys to grant you access to the instances you create
On new instance creation, you’ll be asked to select a key
You can upload an existing public key (e.g. ~/.ssh/id_rsa.pub) to which you hold the private key
Or allow AWS to create a new one, and download the private key
You can generate a key yourself from the CLI with openssl genrsa (or ssh-keygen)

20
Q

EC2 - VPCs

A
  • VPCs are analogous to an office network
  • Faster way for machines to communicate with each other: each machine gets a private IP address that is only used locally
  • Inside the VPC, machines can access each other via an internal IP
  • Outside, machines are accessible according to security group rules
  • We will be using the default VPC setting for all of the exercises
  • Once we pass through one iteration of a CI cycle, could we use a new VPC as an isolated environment to test in?
21
Q

RDS (Relational DB Service):

A
  • RDS is a utility to automatically set up an EC2 instance with a database service running on it
  • RDS offers MySQL, Postgres, Oracle, SQL Server and others
  • RDS also enables you to configure the database settings via the UI
  • Big companies create backups/snapshots that take a copy of all the data and write it to a file (e.g. a SQL text dump).
  • RDS can also be used for common database maintenance tasks, such as backups, creating read replicas (slave DBs), or restoring snapshots.
  • Only supports relational (SQL) databases.
22
Q

EC2 - Volumes (EBS)

A
  • Storage on these machines:
  • It's like an external hard drive that can be attached to any of the instances. We can only plug it into one machine at a time.
  • By default, instances created are completely clean, and any data stored will not persist after machine termination
  • You can add persistent storage to any instance using EBS Volumes (so we don't lose data if the machine is terminated)
  • Volumes will be auto-mounted in ubuntu under /mount/
  • A volume can only be attached to a single instance at a time
  • We will not be using EBS storage during our exercises
23
Q

EC2 - Load Balancers

A
  • AWS offers load balancers known as ELBs (Elastic Load Balancers)
  • ELBs allow you to add instances to the pool of machines
  • Many services integrate with ELBs and will auto-add instances
  • ELBs allow you to configure request routing via inbound port/protocol and output port/protocol (located in the “Listeners”)
  • ELBs can automatically check the health of an instance by requesting a configurable URL and considering a 2XX status code healthy
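
The health-check idea can be sketched in a few lines. This is not how ELB is implemented; it is a toy model in which `fake_status` stands in for the HTTP status each (hypothetical) instance returns from its health-check URL:

```python
def is_healthy(status_code):
    """Treat any 2XX status code as healthy."""
    return 200 <= status_code <= 299


def healthy_instances(instances, probe):
    """Keep only the instances whose health-check probe returned a 2XX code."""
    return [name for name in instances if is_healthy(probe(name))]


def round_robin(instances, request_count):
    """Spread requests evenly over the given instances, in order."""
    return [instances[i % len(instances)] for i in range(request_count)]


# Hypothetical pool: web-2 is failing its health check.
fake_status = {"web-1": 200, "web-2": 503, "web-3": 204}
pool = healthy_instances(list(fake_status), fake_status.get)
routed = round_robin(pool, 4)

print(pool)    # ['web-1', 'web-3']
print(routed)  # ['web-1', 'web-3', 'web-1', 'web-3']
```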
24
Q

EC2 - Security Groups

A
  • All instances we create have security rules, kind of like firewalls. If we can't connect to our machine, there’s a chance that we are being blocked by the Security group.
  • Security Groups are applied to all instances and ELBs
  • They are similar to a firewall, and restrict all traffic destined to that machine according to configurable inbound rules (e.g. SSH, Custom TCP rule).
25
Q

S3

A
  • S3 is a distributed file storage system
  • You can upload files via the AWS API
  • Files can be private or public
  • The S3 console is outside of region settings (it lists all buckets), although each bucket is created in a specific region
  • These are static files.
26
Q

Cloudfront

A
  • Cloudfront is a distributed CDN (Content Delivery Network)
  • It is outside of region settings
  • Can be configured to serve content from an application server or an S3 bucket
  • Can be configured to use a custom domain
  • Expiry/Invalidation of cached content can be configured
  • Invalidation can also be caused by API
27
Q

Certificate Manager

A
  • The certificate manager stores any SSL certificates that you wish to use in other AWS services
  • We will be using this to enable HTTPS in our application
28
Q

Route 53

A
  • Route 53 is a domain name manager
  • You can register new domains, or transfer existing domains to AWS nameservers
  • Once you have a domain hosted on AWS, you can configure DNS records for the domain
  • Route 53 integrates gracefully with other AWS services, allowing you to do things that would be difficult on other hosting platforms (geodns, elb)
29
Q

Cloudwatch

A

Cloudwatch is used to monitor the health of your servers
Makes sure the website is up and working; the goal is to get to the bugs before the users do.
Monitored resources include:
CPU
RAM (on EC2 this requires the CloudWatch agent)
Network
Hard Disk (disk usage also requires the agent)
“Alarms” (alerts) are configurable in the AWS UI
Alarms notify devs by email when resource usage passes given thresholds
Can be integrated with other notification services (PagerDuty)

30
Q

Opsworks

A

Opsworks enables the automation of machine configuration
IT orchestration or IT automation
Uses “Layers” to specify roles for each machine
Can use chef or puppet to author setup scripts
Scripts get triggered by instance lifecycle events
Deployment scripts can be triggered by API
Integrates well with other AWS services
More detail to come!

31
Q

IAM

A

IAM (identity and access management) will come up frequently throughout your AWS usage
For our purposes we will ignore any IAM usage (for NOW, just default)
IAM allows teams of developers to use the same AWS account, with separate privileges, configured by an admin
IAM also allows you to create roles that are not intended for use by humans, but for API usage

32
Q

API, Tokens & CLI

A

AWS API requires you to authenticate via 2 tokens:
aws_access_key_id
aws_secret_access_key
Tokens can be managed via (Account Name) > “My Security Credentials”
The easiest way to interact with the AWS API from your local machine is to use the AWS CLI (the aws command)
The tool, as well as other AWS libraries, looks for credentials in 2 files under ~/.aws:
credentials
config
The config file stores the default region setting (us-west-2)
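
Both files use INI syntax, so their layout can be illustrated with Python's `configparser`. This sketch writes placeholder files into a temporary directory (not your real ~/.aws) and reads them back; the key values are obviously fake:

```python
import configparser
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    aws_dir = Path(tmp) / ".aws"          # stand-in for ~/.aws
    aws_dir.mkdir()
    (aws_dir / "credentials").write_text(
        "[default]\n"
        "aws_access_key_id = EXAMPLE_KEY_ID\n"
        "aws_secret_access_key = EXAMPLE_SECRET\n"
    )
    (aws_dir / "config").write_text("[default]\nregion = us-west-2\n")

    credentials = configparser.ConfigParser()
    credentials.read(aws_dir / "credentials")
    config = configparser.ConfigParser()
    config.read(aws_dir / "config")

    access_key_id = credentials["default"]["aws_access_key_id"]
    secret_key = credentials["default"]["aws_secret_access_key"]
    region = config["default"]["region"]

print(access_key_id, region)  # EXAMPLE_KEY_ID us-west-2
```

Never commit the real files to source control; the secret key grants full API access for that user.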

33
Q

Jenkins

A

Jenkins is an open source pluggable build workflow manager with a web UI

Plugins and a public plugin repository allow Jenkins to be integrated with many services

Projects are configured in workflow stages:
Source Control
Build Steps
Post Build Steps (even if these fail, it's still considered successful)

Each stage of the workflow can impact the next step (e.g. on failure)

Projects can be configured to trigger each other

The web UI shows a dashboard that can visualize build status, as well as test results

34
Q

What does it mean to “build”?

A

Uglify:

  • to save space (turn variables to one letter)
  • remove white space
  • condense repetition
  • make code a lot smaller
  • Take source code and create “binary”
35
Q

What is a binary?

A

Any command we run is a program, often a compiled one.
The binary is the output of a compilation process.

The CI/CD server builds a “sacred” version of our output for production after testing.

36
Q

Regression

A

Every time we find a bug, write a test that proves you’ve fixed it. Rerunning these tests on every build catches the bug if it ever comes back (a regression).
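
A minimal sketch of a regression test. The `average` function and its bug (crashing on an empty list) are hypothetical; the point is that the test pins the fix so CI fails if the bug returns:

```python
def average(values):
    """Return the mean of values; an empty list yields 0.0."""
    if not values:  # the fix: the hypothetical bug was a ZeroDivisionError here
        return 0.0
    return sum(values) / len(values)


def test_average_of_empty_list_regression():
    # Pins the bug fix: this input used to crash.
    assert average([]) == 0.0


def test_average_still_works():
    assert average([2, 4, 6]) == 4.0


test_average_of_empty_list_regression()
test_average_still_works()
print("regression tests passed")
```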

37
Q

Jenkins Plugins

A

Jenkins plugins allow you to write custom source control integrations, build steps or post build steps

Plugin developers are also able to integrate with the web UI to provide options for users

Plugins needed for our exercises:
Default Plugins
Install RVM plugin
Install NVM Wrapper plugin

38
Q

Version Locking

A

Make sure the same version we test on is ultimately the same version we are looking to deploy.

39
Q

Gem set

A

When we have multiple projects, we might have different versions or dependencies, so we will have different gem sets.

40
Q

Jenkins Source control

A

Jenkins can integrate with many different source control systems

With traditional source control systems, Jenkins would just poll for changes on a regular basis

GitHub uses webhooks (HTTP requests) to notify Jenkins of changes instead

The Jenkins GitHub plugin supports GitHub webhooks

Can be configured to only run on certain branches

41
Q

Build Steps

A

Build steps are steps in the workflow

These steps will be things like:
Run Tests
Build Production Assets
Publish Assets to S3
Deploy via Opsworks

We will use the “bash script” build step, although we could write our own Jenkins plugins to do each of these steps

42
Q

Post Build Steps

A

The post build stage of the workflow

These are activities that are not considered crucial to the build process

Examples include:
Source Control triggering (build status)
Test Results publishing
Notifying developers
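
The difference between build steps and post build steps can be sketched as a tiny runner. This is not Jenkins code, only an illustration under the assumptions stated in these cards: build steps stop at the first failure, while post build steps always run and never change the build status. All step names are made up:

```python
def run_pipeline(build_steps, post_build_steps):
    """Run build steps in order (stop on first failure), then all post build steps."""
    log = []
    status = "SUCCESS"
    for name, step in build_steps:
        ok = step()
        log.append((name, ok))
        if not ok:
            status = "FAILURE"
            break  # later build steps are skipped
    for name, step in post_build_steps:
        try:
            step(status)
            log.append((name, True))
        except Exception:
            log.append((name, False))  # a failing post build step does not change status
    return status, log


notifications = []
build_steps = [
    ("run tests", lambda: True),
    ("build production assets", lambda: False),  # simulated failure
    ("publish assets to S3", lambda: True),
]
post_build_steps = [("notify developers", notifications.append)]

status, log = run_pipeline(build_steps, post_build_steps)
print(status)         # FAILURE
print(log)
print(notifications)  # ['FAILURE']
```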

43
Q

RDS

A

“Relational” means the database can relate tables to each other, e.g. with a SQL join. MongoDB is a NoSQL database and therefore not relational; MySQL and Postgres are.

RDS is an interface on top of EC2 that allows us to automatically create a database instance.

  • RDS provides pre-configured instances of common relational databases
  • Because many of the features of these databases are shared across all implementations, the UI provides controls for common features
44
Q

Relational Databases

A
  • A DB is just a data structure that, instead of being stored in RAM, is stored on disk.
  • The primary key is the key in a BST (real databases typically use a B-tree, a disk-friendly relative of the BST).
  • Data is stored in tables
  • Tables are stored as rows and columns
  • Underlying data structure is typically a BST or Array (fixed fields)
  • Queries (searches) are typically performed via SQL
  • Able to relate different tables
  • Able to provide different ways to “Key” the BST
  • “Primary Key” is typically disk represented
  • Able to provide additional “indexes” to increase search speed
  • The BST does not have all the data, but it has all the keys, and these keys are associated with the data on record.
  • We want an auto-incrementing ID.
  • Every time the DB adds a record, it has to rearrange the search tree.
  • If we search by ID, we get it in O(log n). If we search by user name (with no index), it's much slower.
  • Using a hash map we give up order (we typically use them to cache data).
  • BST inserts are O(log n).
  • Think phonebook (contiguous like LL or array) and index to phonebook (BST)

Reads: O(log n)
Writes: O(log n)

  • When the DB gets big, you will see performance slow down.
  • Either use an index to find data or resort to a “table scan” if you need to search linearly.
  • Choose the opportunities to make indexes carefully. Having to write again and again will add up even if each write is O(log n): 7 × O(log n) or 7000 × O(log n) adds up.
  • Reads vs writes: a balancing act
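
The index-versus-table-scan trade-off can be sketched without a database. Here a sorted list of IDs searched with `bisect` stands in for the keyed tree (an assumption made for brevity; a real engine uses a B-tree), while the username lookup has to walk every row:

```python
import bisect

# 1000 hypothetical records with auto-incrementing IDs.
records = [{"id": i, "username": f"user{i}"} for i in range(1, 1001)]
ids = [record["id"] for record in records]  # sorted keys, standing in for the tree


def find_by_id(target_id):
    """Binary search over the sorted keys: O(log n) comparisons."""
    position = bisect.bisect_left(ids, target_id)
    if position < len(ids) and ids[position] == target_id:
        return records[position]
    return None


def find_by_username(username):
    """Table scan: examine rows one by one until a match is found, O(n)."""
    rows_examined = 0
    for record in records:
        rows_examined += 1
        if record["username"] == username:
            return record, rows_examined
    return None, rows_examined


print(find_by_id(750))
match, rows_examined = find_by_username("user750")
print(match, "after examining", rows_examined, "rows")
```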
45
Q

Table scan

A

A table scan means the database reads every row of a table because no index can answer the query. Think about apps by thinking about the data layer first (“app has users, which have songs that belong to artists”, etc.).

46
Q

Indexes

A

Indexes tell the database to also represent the table in a secondary BST (or Hash), with keys only, values pointing to the original record. When querying, the database will try to use the best index available for optimal speed.
Primary Key Index
Foreign Key Index
Prefix Index
Full Text Index
Multi-column Index
Indexes will increase lookup speed, but slow inserts/deletes.
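
A short sketch using the SQLite engine bundled with Python (an assumption made for convenience; RDS engines behave similarly but report plans differently). It asks the database how it would run the same query before and after an index is added. The table and index names are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)",
    [(f"user{i}",) for i in range(1000)],
)


def query_plan(sql, params=()):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the readable detail


lookup = "SELECT * FROM users WHERE username = ?"

plan_before = query_plan(lookup, ("user500",))   # no index yet: full table scan
conn.execute("CREATE INDEX idx_users_username ON users (username)")
plan_after = query_plan(lookup, ("user500",))    # now served by the index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan reports a scan and the second names the index.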

47
Q

Read Replica

A
  • Master/Slave model
    • One DB is the source of truth. We can write to the master, but can't write to the slave.
  • Master is read/write
  • Slave is read only, used for large reads, also used for backups
  • Master uses transaction log shipping to update slave
    • Indexes can vary between master and slaves
  • Build a one true location of the data (single source of truth) into the system.
  • When data has more than one source of truth, things get messy really quickly.
  • If we want to split up the data, we separate it onto different servers based on its joins/associations, so that CPU usage is spread evenly across the machines. This is what happens when we break the app into microservices or shard the DB (separate vertically: billing data vs app data; or horizontally: one user mapping to other items on other DBs across multiple servers, e.g. a DB per client).

You can do indexed reads on other databases while your source of truth isn’t indexed.

48
Q

Backups (Snapshots)

A
  • All database implementations support some way to take a backup of the data
  • AWS calls this feature “Snapshots”
  • AWS helps you take snapshots and restore them through the UI
  • You can also manually take backups in postgres:
  • pg_dump --host= --format=custom > latest.dump
  • pg_restore --host= --dbname= latest.dump
49
Q

Goal of using orchestration software:

A
  • Create Process to create a new application server…

* make sure everything is in sync.

50
Q

Deploying to a large fleet - Series

A
  • Process is error prone, especially if done manually.

* connect, run script, disconnect.

51
Q

Deploying to a large fleet - Parallel

A

*Best way to keep machines in sync.
*Need a way to tell the server if something is out of sync.
Faster
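
A rough sketch of series versus parallel deployment. The `deploy` function is a hypothetical stand-in for "connect, run script, disconnect" and just sleeps briefly; host names and the version string are made up:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def deploy(host):
    """Hypothetical stand-in for: connect, run script, disconnect."""
    time.sleep(0.05)  # pretend the remote script takes a while
    return host, "v2"


hosts = [f"app-{n}" for n in range(1, 9)]

# Series: one machine after another.
start = time.perf_counter()
series_result = [deploy(host) for host in hosts]
series_seconds = time.perf_counter() - start

# Parallel: all machines at once.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
    parallel_result = list(pool.map(deploy, hosts))
parallel_seconds = time.perf_counter() - start

# The check that tells us whether any machine is out of sync.
out_of_sync = [host for host, version in parallel_result if version != "v2"]

print(f"series: {series_seconds:.2f}s, parallel: {parallel_seconds:.2f}s")
print("out of sync:", out_of_sync)
```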

52
Q

Issues and Concerns

A

Downtime is always a concern, so speed is desired

We want all machines to be in a known state at all times.

We need 2 scripts.
- setup, deployment.

Deployment scripts are comprised of multiple steps

  • Every step has a way to move forward and a way to move back.

Failure of a single step can cause the machine to be in an unknown state

Failures can occur from:
Bugs in scripts
Network Issues
Version Locking Issues !!very person.
Resource Inconsistencies (bits can randomly flip from 1 to 0 for example).

We can mitigate risks by:
Minimizing network distance

Automating as much as possible
Idempotency with rollback
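
The "every step has a way forward and a way back" idea can be sketched as follows. This is an illustrative toy, not a real deployment tool; the step names and the `state` dictionary are invented:

```python
def run_deployment(steps, state):
    """Apply each step's forward action; on failure, undo completed steps in reverse."""
    completed = []
    for name, forward, back in steps:
        try:
            forward(state)
            completed.append((name, back))
        except Exception as error:
            print(f"step '{name}' failed: {error}; rolling back")
            for _done_name, undo in reversed(completed):
                undo(state)
            return False
    return True


def failing_health_check(_state):
    raise RuntimeError("health check failed")


state = {"release": "A", "downloaded": []}
steps = [
    ("download B", lambda s: s["downloaded"].append("B"), lambda s: s["downloaded"].remove("B")),
    ("switch to B", lambda s: s.update(release="B"), lambda s: s.update(release="A")),
    ("health check", failing_health_check, lambda s: None),
]

succeeded = run_deployment(steps, state)
print(succeeded, state)  # False {'release': 'A', 'downloaded': []}
```

Because the failed run leaves the machine exactly where it started, it is back in a known state and the deployment can safely be retried.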

53
Q

Deployment Strategies - Cutover

A

A manager or server that runs something like Puppet or Chef.

Example:
“run version A on all machines. Now run B”.

54
Q

Deployment Strategies - Pilot

A

Just do one test deployment to make sure it runs properly.

55
Q

Deployment Strategies - Parallel

A

Deploy to half of the servers and test for errors: is the new version of the software more error prone?

Then we switch and update the other half.

We will never have downtime.

56
Q

Other concerns

A

Each one is case by case:

Database Migrations
Check which version and what we want to update to.

Load Balancing
Have multiple load balancers.

A/B testing
Run two versions of our code and test them out. Business decisions?

Push vs Pull
Pull report back to check if the versions match.

Multiple server roles with multiple environments

57
Q

Puppet

A

Puppet scripts define the “final state” of the server

  • Blocks of code that have dependencies on each other, packaged as a script. Those dependencies are what needs to happen to move to the next stage: a “dependency graph”.

Each “step” in the script is modelled as a “resource”

Each resource can be dependent on other resources

Puppet makes a graph of changes to apply based on deps

Each resource type has the ability to compare the current state to the desired state

  • To do a Puppet deploy: point it at the new git version, and the server will make a graph to get us there.

We aren’t writing a script. We are defining the final state, and the server makes a graph that gets there.

Each resource type also has the ability to modify the server’s state, to deploy or rollback

Puppet server communicates bi-directionally with puppet agents on the machines

Can be run in solo mode

Ruby (DSL: Domain Specific Language)
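
The desired-state and dependency-graph ideas can be sketched in Python (not Puppet's own DSL). The resource names and states below are invented, and `graphlib` requires Python 3.9 or newer. Each resource compares current state to desired state and only changes what differs, so a second run does nothing:

```python
from graphlib import TopologicalSorter

# Desired final state: each resource lists the resources it depends on.
desired = {
    "package:nginx": {"state": "installed", "requires": []},
    "file:nginx.conf": {"state": "present", "requires": ["package:nginx"]},
    "service:nginx": {"state": "running", "requires": ["file:nginx.conf"]},
}

# Current state of a hypothetical server.
current = {
    "package:nginx": "installed",
    "file:nginx.conf": "absent",
    "service:nginx": "stopped",
}

# Build the dependency graph and derive an order in which to apply changes.
graph = {name: spec["requires"] for name, spec in desired.items()}
order = list(TopologicalSorter(graph).static_order())


def apply_desired_state(current_state):
    """Change only the resources whose current state differs from the desired one."""
    changed = []
    for name in order:
        wanted = desired[name]["state"]
        if current_state.get(name) != wanted:
            current_state[name] = wanted
            changed.append(name)
    return changed


first_run = apply_desired_state(current)
second_run = apply_desired_state(current)

print(order)       # ['package:nginx', 'file:nginx.conf', 'service:nginx']
print(first_run)   # ['file:nginx.conf', 'service:nginx']
print(second_run)  # []
```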

58
Q

Chef

A
  • Still defined in blocks of resources, but with no dependency graph (steps run in the order written).

Chef “recipes” define “repeatable steps” to get a server into a desired state.

Recipes are combined into cookbooks (pluggable)

Each “step” in the script is modeled as a “resource”

Each resource is intended to be parameterized and idempotent

Rollback is then just redeploy with previous params

Chef server communicates bi-directionally with chef agents on the machines

Can be run in solo mode

Ruby (DSL: Domain Specific language)

59
Q

Ansible

A

*Does not use a middle-man role: it has no central management server, and we run it from our local machine.

Connects to your nodes and pushes out small programs, called “Ansible modules”

Parallel

Modules are models of the desired state of the system

Ansible executes these modules (over SSH by default), and removes them when finished

No servers, daemons, or databases

Python

60
Q

SaltStack

A
Client/server model, similar to puppet and chef
Parallel or series
Plugin system
SLS scripts
Python
61
Q

Opsworks

A

AWS service

Uses client/server model

Uses custom server and custom client agent

Uses either puppet or chef scripts

Therefore uses puppet/chef in solo mode

Integrates with lifecycle of AWS EC2 Instances

62
Q

Opsworks Units

A
  • does not enforce how we set up our machines.

Stacks:
Group machines together based on shared resources

Layers:
Group machines together based on roles
Custom recipes per lifecycle event

Apps:
Deployable identifier, scripts can use app name

63
Q

Custom JSON

A

Opsworks (and Chef) use a JSON configuration scheme to apply Chef Attributes:

  • Stack level
    • Overwritten by Layer level
      • Overwritten by Deployment activity

Chef internally uses a similar override scheme in its recipes:

  • default
  • force_default
  • normal
  • override
  • force_override
  • automatic
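
The stack, layer, deployment override order can be illustrated with a small recursive merge. This is a sketch of the idea only, not Opsworks or Chef code, and the attribute names are made up:

```python
def deep_merge(base, override):
    """Return a copy of base with override applied; nested dicts are merged key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical custom JSON at each level.
stack_json = {"app": {"port": 3000, "env": "staging", "workers": 2}}
layer_json = {"app": {"workers": 4}}
deployment_json = {"app": {"env": "production"}}

# Later levels win: stack < layer < deployment.
effective = deep_merge(deep_merge(stack_json, layer_json), deployment_json)
print(effective)  # {'app': {'port': 3000, 'env': 'production', 'workers': 4}}
```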
64
Q

“Apps”

A

This is a string: simply a variable that's passed to a JSON file. In the case of AWS it is only a string.

65
Q

Domain Name Service (DNS)

A

When we register a domain, we are buying the right to point that name at a server

  • Hierarchical server structure (this is very important - need to be able to edit DNS records)
  • Responsible for mapping a domain name (the host part of a URL) to an IP address
  • 13 root server names, each served from many machines around the world (IANA)
    • [a-m].root-servers.net
  • nslookup (command-line tool for querying DNS)
66
Q

Domain Name service - Record types

A

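A - Address - Maps a name to an IPv4 address. It is not an alias (that is what CNAME is for).

CNAME - Canonical name - Maps a name to another name, i.e. an alias
(same site hosted with multiple names)

SOA - Start of Authority - Names the primary name server and admin contact for the zone, plus its serial number and refresh timers. (Proving that you own a domain is usually done with a TXT record.)

MX - Mail exchanger record. Names the mail servers that receive email for the domain, which is how different email providers reach each other (ex: hotmail sending to gmail).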

NS - Name Server records

TXT - Text records
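
How A and CNAME records work together can be sketched with a toy resolver. The zone data is invented (the `.test` names and 192.0.2.x documentation addresses are placeholders), and real resolvers do considerably more:

```python
# Toy zone data: (name, record type) -> list of values.
RECORDS = {
    ("example.test", "A"): ["192.0.2.10", "192.0.2.11"],
    ("www.example.test", "CNAME"): ["example.test"],
    ("blog.example.test", "CNAME"): ["www.example.test"],
    ("example.test", "MX"): ["10 mail.example.test"],
}


def resolve(name, max_hops=5):
    """Follow CNAME aliases until a name with A records is found."""
    for _ in range(max_hops):
        addresses = RECORDS.get((name, "A"))
        if addresses:
            return addresses
        alias = RECORDS.get((name, "CNAME"))
        if not alias:
            return []
        name = alias[0]  # continue with the canonical name
    return []


print(resolve("example.test"))       # ['192.0.2.10', '192.0.2.11']
print(resolve("blog.example.test"))  # ['192.0.2.10', '192.0.2.11']
print(resolve("missing.test"))       # []
```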

67
Q

Symmetric Cryptography

A

Whereas hashing goes one way, crypto takes in x and a key, and it can go both ways (encrypt and decrypt).

The function is symmetric if key1 === key2 (the same key encrypts and decrypts); otherwise, it's asymmetric.

Encryption is like being sent a locked box without the key.

If the keys are a public/private pair, the function is asymmetric.
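
A toy illustration of "one way" versus "both ways". The XOR cipher below is deliberately simplistic and NOT secure; it is only here to show that the same key both encrypts and decrypts, while a hash has no key and no reverse operation. The message and key are made up:

```python
import hashlib
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR each byte with the repeating key (not secure)."""
    return bytes(byte ^ key_byte for byte, key_byte in zip(data, cycle(key)))


message = b"deploy at noon"
key = b"shared-secret"

ciphertext = xor_cipher(message, key)     # encrypt with the key
recovered = xor_cipher(ciphertext, key)   # decrypt with the SAME key
digest = hashlib.sha256(message).hexdigest()  # one way: no key, no decrypt

print(ciphertext != message, recovered == message)  # True True
print(digest)
```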

68
Q

Asymmetric Cryptography

A

Asymmetric crypto uses a public/private key pair. In practice it is used to get to the point where both parties have a shared key.

69
Q

CDN Integration: Cache

A

Application Cache vs HTTP Cache

Very fast storage, typically in memory

Great for storing results to frequent queries

Invalidation can be a problem

Memcached

  • Older and larger community source base
  • Simple data types
  • HTTP Cache integration

Redis (Database mostly meant for caching)

  • More modern, but less robust
  • Complex data types
  • Often hand-rolled solutions
70
Q

HTTP: Hypertext Transfer Protocol

A

Typically available on port 80 (HTTPS: port 443)

Ascii/UTF8 data format

71
Q

CDNs

A

Spatially located proximal to large populations

Fewer hops, fewer delays

Typically backed by an application

If CDN doesn’t have the requested file in cache, get it from the application server

Uses http cache headers to decide how long to keep the file in cache

Some provide invalidation via API

Building your own is hard, best to use:
Akamai
Cloudfront
Cloudflare
ChinaCache
72
Q

Cache control/expiry - Cache Control

A

“Cache-Control” http header is used to direct caching rules for any device in the request/response route.
no-store - do not cache or even store the file in the client
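no-cache - may be stored, but must be revalidated with the server before each reuse
public - may be cached by any device (some status codes don't cache by default)
private - store in browser cache only
max-age= - set maximum cache time in seconds
must-revalidate - once the cached copy is stale, it must be revalidated (via the validation headers) before reuse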
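
A simplified sketch of how a cache might read these directives. The helper names are made up and the rules are reduced to the ones listed above; real caches follow the full HTTP caching specification:

```python
def parse_cache_control(header):
    """Split a Cache-Control header into a dict of directive -> value (or True)."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name] = value.strip('"') if value else True
    return directives


def may_store(directives):
    """no-store forbids keeping any copy at all."""
    return "no-store" not in directives


def fresh_for_seconds(directives):
    """How long a stored copy may be reused without asking the server again."""
    if "no-store" in directives or "no-cache" in directives:
        return 0
    return int(directives.get("max-age", 0))


long_lived = parse_cache_control("public, max-age=3600")
revalidate = parse_cache_control("no-cache")
secret = parse_cache_control("no-store")

print(long_lived, fresh_for_seconds(long_lived))  # {'public': True, 'max-age': '3600'} 3600
print(may_store(revalidate), fresh_for_seconds(revalidate))  # True 0
print(may_store(secret))  # False
```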

73
Q

Cache validation

A

Cache validation can be configured via a number of different headers.

ETag - fingerprints the response and returns the hash value
- If-None-Match client header

Last-Modified - weak validation
- If-Modified-Since client header
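
A minimal sketch of ETag validation from the server's point of view. The `respond` function and the truncated SHA-256 fingerprint are illustrative choices, not a standard: if the client's If-None-Match value matches the current ETag, the server answers 304 with an empty body instead of resending the content:

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Fingerprint the response body (here: first 16 hex chars of its SHA-256)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict):
    """Return (status, headers, body), using 304 when the client's copy is current."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


page = b"<h1>hello</h1>"

# First request: the client has nothing cached.
status_first, headers_first, body_first = respond(page, {})

# Second request: the client sends back the ETag it received.
status_second, _, body_second = respond(page, {"If-None-Match": headers_first["ETag"]})

# The page changed on the server: the old ETag no longer matches.
status_changed, _, _ = respond(b"<h1>bye</h1>", {"If-None-Match": headers_first["ETag"]})

print(status_first, status_second, status_changed)  # 200 304 200
```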

74
Q

Vary Header

A

The “vary” header tells any caching device to cache multiple copies based on another given header (the key).

75
Q

Cloudfront

A

Cloudfront is the AWS CDN

Cloudfront uses Squid under the hood

Can use all the previously mentioned headers

Can also override http headers

Provides “Web” and “RTMP”