General Knowledge Flashcards
What is Redshift?
A petabyte-scale, fully managed data warehouse.
Is Redshift designed for OLAP or OLTP?
It is specifically designed for OLAP.
A cluster is made up of what?
A leader node and one or more compute nodes
What is the maximum number of compute nodes you can have in a cluster?
128 per cluster (the exact limit depends on the node type).
Can clusters have more than one database?
Yes, a cluster can contain one or more databases
What node stores user data?
The compute node
What node is responsible for managing communication with the client programs?
The leader node
What node develops execution plans?
The leader node
What node has its own memory, CPU, and attached disk storage?
The compute node.
What are the two node types in Redshift?
DS (Dense Storage) node type. Uses HDDs and is a low-cost option. Comes in two sizes: xlarge and 8xlarge.
DC (Dense Compute) node type. Uses SSDs and is used to build high-performance data warehouses. Comes in two sizes: large and 8xlarge.
What is a node slice?
A share of a compute node's memory, CPU, and disk that processes a portion of the workload assigned to that node.
How is the number of node slices determined?
By the size of the node
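The division of labor between the leader node and the node slices can be sketched in a few lines of Python. This is only an illustration of the idea, not how Redshift is implemented: the function names `slice_partial_sum` and `leader_sum` are made up, threads stand in for slices, and the data is a toy list of integers.

```python
from concurrent.futures import ThreadPoolExecutor

def slice_partial_sum(slice_rows):
    """Work done by one slice: aggregate only its own portion of the data."""
    return sum(slice_rows)

def leader_sum(values, num_slices):
    """Leader role: split the work, run the slices in parallel, combine the results."""
    # Hypothetical split: deal the values out to the slices one at a time
    portions = [values[i::num_slices] for i in range(num_slices)]
    with ThreadPoolExecutor(max_workers=num_slices) as pool:
        partials = list(pool.map(slice_partial_sum, portions))
    return sum(partials), partials

total, partials = leader_sum(list(range(1, 101)), num_slices=4)
print(total)     # 5050
print(partials)  # [1225, 1250, 1275, 1300]
```

Each slice only touches its own portion, and the leader only combines small partial results, which is why adding slices lets more of the work run in parallel.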
What does Redshift Spectrum do?
It queries data directly in S3, up to exabytes of it, without loading it into Redshift.
What compression does Redshift Spectrum support?
Gzip and Snappy (Bzip2 is also supported).
Why is Redshift Spectrum so fast?
It uses massively parallel processing (MPP), columnar data storage, and column compression.
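To illustrate why columnar storage and column compression help, here is a minimal Python sketch that stores a small table column by column and run-length encodes one column. The sample rows and the helper names `to_columns` and `run_length_encode` are invented for illustration; Redshift's real storage engine and encodings are more sophisticated than this.

```python
from itertools import groupby

def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists (columnar layout)."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns

def run_length_encode(values):
    """Collapse runs of repeated values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

# Hypothetical sample data
rows = [
    {"region": "us-east", "amount": 10},
    {"region": "us-east", "amount": 25},
    {"region": "us-east", "amount": 7},
    {"region": "eu-west", "amount": 12},
    {"region": "eu-west", "amount": 30},
]

columns = to_columns(rows)
# An aggregate over one column only has to read that column
total = sum(columns["amount"])
# Repeated values in a column compress well
encoded_region = run_length_encode(columns["region"])
print(total)           # 84
print(encoded_region)  # [('us-east', 3), ('eu-west', 2)]
```

The aggregate never reads the region values, and the region column shrinks from five entries to two pairs, which is the same effect, in miniature, that reduces disk I/O in a columnar warehouse.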
Does Redshift Spectrum scale?
Yes, it scales to handle more parallel processing
How large is the block size for Redshift?
1 MB
Can you change a column's compression encoding after the table is created?
Historically, no. Current Redshift releases let you change it with ALTER TABLE ... ALTER COLUMN ... ENCODE.
Does Redshift replicate?
Yes, within the cluster
Where does Redshift back up to?
S3. Snapshots can also be asynchronously replicated to another region.
Does Redshift have automated snapshots?
Yes.
What happens when a drive or node fails?
It is automatically replaced
How many AZs can a single cluster span?
One. Redshift is a single-AZ service.
If your Redshift cluster goes offline because the AZ is down and you need to query your data ASAP, how can you make this happen?
You can restore the cluster using the data stored in S3 into a new AZ that is not being impacted.
How does Redshift scale?
Redshift scales both horizontally and vertically
What happens, from a process perspective, when you scale Redshift?
A new cluster is created while the old one remains available for reads. Data is moved in parallel to the new compute nodes, and the CNAME is then flipped to point at the new cluster.
How does the distribution style Auto work?
Redshift chooses the distribution style for you, based on the size of the table's data.
How does the distribution style Even work?
Rows are distributed across slices in round-robin fashion.
How does the distribution style Key work?
Rows are distributed based on the values in one column, so rows with the same value land on the same slice.
How does the distribution style All work?
The entire table is copied to every node.
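The Even, Key, and All distribution styles described above can be simulated with a short Python sketch. The function names, the sample rows, and the use of `crc32` as the hash are all assumptions made for illustration; Redshift's actual hash function and placement logic are not shown in these notes.

```python
from collections import defaultdict
from zlib import crc32

def distribute_even(rows, num_slices):
    """EVEN: assign rows to slices in round-robin order."""
    slices = defaultdict(list)
    for i, row in enumerate(rows):
        slices[i % num_slices].append(row)
    return dict(slices)

def distribute_key(rows, num_slices, key):
    """KEY: rows with the same value in `key` land on the same slice."""
    slices = defaultdict(list)
    for row in rows:
        # crc32 is a stand-in hash chosen for this sketch only
        slice_id = crc32(str(row[key]).encode()) % num_slices
        slices[slice_id].append(row)
    return dict(slices)

def distribute_all(rows, num_nodes):
    """ALL: every node receives a full copy of the table."""
    return {node: list(rows) for node in range(num_nodes)}

# Hypothetical orders table: six orders from three customers
rows = [{"order_id": o, "customer_id": c} for o, c in enumerate([1, 2, 1, 3, 2, 1])]

even = distribute_even(rows, num_slices=2)
by_key = distribute_key(rows, num_slices=2, key="customer_id")
everywhere = distribute_all(rows, num_nodes=3)

print({s: len(r) for s, r in even.items()})        # {0: 3, 1: 3}
print({n: len(r) for n, r in everywhere.items()})  # {0: 6, 1: 6, 2: 6}
```

Even gives balanced slices regardless of content, Key keeps matching values together (useful when joining on that column, at the risk of skew), and All trades storage for never having to move the table at query time.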
What are Redshift Sort Keys?
They are similar to indexes in a traditional relational database: they determine the order in which rows are stored on disk.
What is a single column sort key?
A sort key defined on a single column, such as a date column.
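Why a single-column sort key on a date speeds up range queries can be shown with a small Python sketch. It assumes, beyond what these cards state, that the engine records the minimum and maximum sort-key value of each storage block so that blocks outside the queried range can be skipped. The names `build_blocks` and `range_scan`, the tiny block size, and the sample sales data are all hypothetical.

```python
from datetime import date

def build_blocks(rows, sort_column, block_size):
    """Sort rows on the sort key, split them into blocks, record min/max per block."""
    ordered = sorted(rows, key=lambda r: r[sort_column])
    blocks = []
    for start in range(0, len(ordered), block_size):
        chunk = ordered[start:start + block_size]
        blocks.append({
            "min": chunk[0][sort_column],
            "max": chunk[-1][sort_column],
            "rows": chunk,
        })
    return blocks

def range_scan(blocks, sort_column, low, high):
    """Return the matching rows and the number of blocks actually read."""
    matches, blocks_read = [], 0
    for block in blocks:
        if block["max"] < low or block["min"] > high:
            continue  # block cannot contain a match, so skip it
        blocks_read += 1
        matches.extend(r for r in block["rows"] if low <= r[sort_column] <= high)
    return matches, blocks_read

# Hypothetical data: one sale per day for 12 days, inserted out of order
days = (5, 1, 9, 3, 12, 7, 2, 11, 4, 8, 6, 10)
rows = [{"sale_date": date(2024, 1, d), "amount": d * 10} for d in days]

blocks = build_blocks(rows, "sale_date", block_size=4)
matches, blocks_read = range_scan(blocks, "sale_date", date(2024, 1, 5), date(2024, 1, 6))
print(len(matches), blocks_read, len(blocks))  # 2 1 3
```

Because the rows are stored in date order, the two matching rows sit in one block and the other two blocks are never read.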