Redshift Flashcards

1
Q

What is Redshift?

A

A petabyte-scale data warehouse service

Starts at $0.25 / hour with no commitment

Scales to a petabyte or more for $1,000 / terabyte per year, less than one-tenth the cost of most other data warehouse solutions

2
Q

What is OLAP?

A

Online Analytical Processing

One transaction pulls in large numbers of records

3
Q

Which architecture does data warehousing use?

A

columnar

4
Q

Redshift Configuration (Nodes)

A

Start with Single Node

Grow to Multi Node

5
Q

What is in a Redshift multi-node configuration?

A

Leader Node: manages client connections, receives queries

Compute Node: stores data, performs queries and computations

6
Q

How many compute nodes can Redshift have?

A

128

7
Q

Why is columnar data storage efficient?

A

Only columns involved in queries are processed

Columnar data is stored sequentially on storage media

Requires fewer I/Os
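
A toy sketch of why this needs fewer reads. It is illustrative only and says nothing about Redshift's internals; the table, field names, and the "values read" counter are all made up for the example.

```python
# Hypothetical table; field names are assumptions for illustration.
rows = [
    {"order_id": 1, "region": "east", "amount": 20.0},
    {"order_id": 2, "region": "west", "amount": 35.5},
    {"order_id": 3, "region": "east", "amount": 12.5},
]

def sum_amount_row_store(rows):
    """Row layout: every field of every row is read to sum one column."""
    values_read = 0
    total = 0.0
    for row in rows:
        values_read += len(row)  # the whole row comes off storage
        total += row["amount"]
    return total, values_read

# Column layout: each column is kept as its own contiguous list.
columns = {name: [row[name] for row in rows] for name in rows[0]}

def sum_amount_column_store(columns):
    """Column layout: only the column named in the query is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)

row_total, row_reads = sum_amount_row_store(rows)
col_total, col_reads = sum_amount_column_store(columns)
print(row_total, row_reads)  # same total, 9 values read
print(col_total, col_reads)  # same total, 3 values read
```

Both layouts give the same answer; the column layout simply touches a third of the values in this three-column table.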

8
Q

Describe Redshift’s compression

A

Columnar data can be compressed more than row-based data because similar values are stored sequentially on disk

Redshift uses multiple compression techniques; it samples your data and selects the best one
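
A small sketch of why sequential columns compress well. Run-length encoding is used here only as a simple stand-in for one possible technique; the table contents are hypothetical, and this does not show how Redshift samples data or picks an encoding.

```python
from itertools import groupby

def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

# Hypothetical table: one (region, status) pair per row.
rows = [
    ("east", "shipped"), ("east", "shipped"), ("east", "pending"),
    ("west", "pending"), ("west", "pending"), ("west", "pending"),
]

# Row layout interleaves the two columns on disk.
row_layout = [value for row in rows for value in row]
# Column layout stores each column's values together.
column_layout = [row[0] for row in rows] + [row[1] for row in rows]

row_runs = run_length_encode(row_layout)
column_runs = run_length_encode(column_layout)
print(len(row_runs))     # 12 runs: nothing repeats back to back
print(len(column_runs))  # 4 runs: repeated values sit next to each other
```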

9
Q

Describe Redshift’s Massively Parallel Processing

A

Automatically distributes data and query load across all nodes

Makes it easy to add nodes and maintain fast performance as data grows
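
A minimal sketch of the idea, assuming a made-up 3-node cluster and a simple key-modulo distribution. The function names are hypothetical and threads stand in for compute nodes; Redshift's actual distribution and execution are not shown here.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 3  # hypothetical cluster size

def distribute(rows, num_nodes):
    """Assign each row's amount to a node by key modulo node count."""
    slices = [[] for _ in range(num_nodes)]
    for key, amount in rows:
        slices[key % num_nodes].append(amount)
    return slices

def compute_node_sum(amounts):
    """Work done independently on one compute node."""
    return sum(amounts)

# Hypothetical data: (order_id, amount) pairs for orders 1 through 9.
rows = [(order_id, float(order_id)) for order_id in range(1, 10)]
slices = distribute(rows, NUM_NODES)

# Each node sums its own slice in parallel.
with ThreadPoolExecutor(max_workers=NUM_NODES) as pool:
    partial_sums = list(pool.map(compute_node_sum, slices))

# The leader combines the partial results into the final answer.
total = sum(partial_sums)
print(partial_sums, total)
```

Adding a node in this sketch only means raising `NUM_NODES`; each slice gets smaller while the combine step stays the same.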

10
Q

Describe Redshift pricing for compute

A

Compute Node Hours

Total hours you run across all compute nodes for the billing period

Billed 1 unit per node per hour
E.g., a 3-node cluster running for a 30-day month = 3 x 24 x 30 = 2,160 instance hours

Not charged for leader node
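
The arithmetic above can be sketched as follows. The function names are hypothetical, a 30-day month is assumed, and the $0.25 hourly rate is only the starting price quoted in card 1, used here as an example rate.

```python
HOURS_PER_DAY = 24

def compute_node_hours(compute_nodes, days):
    """Billable hours: 1 unit per compute node per hour (leader node excluded)."""
    return compute_nodes * HOURS_PER_DAY * days

def compute_cost(compute_nodes, days, hourly_rate):
    """Compute charge for the period at an assumed per-node hourly rate."""
    return compute_node_hours(compute_nodes, days) * hourly_rate

node_hours = compute_node_hours(3, 30)
cost = compute_cost(3, 30, 0.25)  # 0.25 is an example rate, not a quote
print(node_hours)  # 2160
print(cost)        # 540.0
```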

11
Q

Describe Redshift pricing for backup and data transfer

A

You’re billed for backups and for data transfer within a VPC (not outside a VPC)

12
Q

Redshift security

A

Encrypted in transit using SSL
Encrypted at rest using AES-256

By default it takes care of keys for you

You can also manage your own keys with HSM or KMS

13
Q

Is Redshift multi-AZ?

A

No

Only available in one AZ

Can restore snapshots to other AZs if an outage occurs
