Redshift Flashcards

1
Q

What is Amazon Redshift?

A

Amazon Redshift is a fast, powerful, fully-managed, petabyte-scale data warehouse service in the cloud.

2
Q

What, at a high level, is the financial benefit of using Redshift?

A
  • Customers start at $0.25/hr with no commitments or upfront costs
  • They can scale to a petabyte or more for $1,000 per terabyte per year, less than one tenth the cost of most other data warehousing solutions
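
To make the figures above concrete, here is a rough worked example. It is only a sketch: the always-on, non-leap-year hour count and the decimal conversion of 1 PB = 1,000 TB are assumptions, not part of the card, and real prices vary by node type and region.

```python
# Figures quoted on the card.
HOURLY_RATE_USD = 0.25        # entry price per hour
RATE_PER_TB_YEAR_USD = 1000   # price per terabyte per year at scale

# Assumptions made for this example only.
HOURS_PER_YEAR = 24 * 365     # non-leap year, cluster running all the time
TB_PER_PB = 1000              # decimal petabyte


def yearly_cost_at_entry_rate() -> float:
    """Cost of running one year at the entry hourly rate."""
    return HOURLY_RATE_USD * HOURS_PER_YEAR


def yearly_cost_for_storage(terabytes: float) -> float:
    """Cost of one year at the quoted per-terabyte rate."""
    return terabytes * RATE_PER_TB_YEAR_USD


print(yearly_cost_at_entry_rate())         # 2190.0
print(yearly_cost_for_storage(TB_PER_PB))  # 1000000
```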
3
Q

What is the storage available when Amazon Redshift is configured for Single Node?

A

160GB

4
Q

What is the setup for Multi-Node AWS Redshift?

A
  • A Leader Node that manages client connections and receives queries
  • Compute Nodes that store data and perform queries and computations
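
The division of labor above can be pictured with a toy scatter-gather sketch. This is not how Redshift is implemented internally: the function names, the round-robin distribution, and the sequential execution are all assumptions chosen to keep the example small.

```python
from typing import List


def compute_node_partial_sum(rows: List[int]) -> int:
    """A compute node aggregates only the rows it stores."""
    return sum(rows)


def leader_node_total(rows: List[int], num_compute_nodes: int) -> int:
    """Hypothetical leader: spread rows round-robin, then combine partial results."""
    # Slice i holds every num_compute_nodes-th row, starting at index i.
    slices = [rows[i::num_compute_nodes] for i in range(num_compute_nodes)]
    # In a real cluster these would run in parallel; here they run one by one.
    partials = [compute_node_partial_sum(s) for s in slices]
    return sum(partials)


sales = [10, 20, 30, 40, 50]
print(leader_node_total(sales, num_compute_nodes=2))  # 150
```

The client only ever talks to the leader; the compute nodes each see a subset of the data and return a partial answer.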
5
Q

What is the maximum number of compute nodes you can have in Redshift?

A

128

6
Q

What are two built-in ways that Amazon Redshift maximizes performance?

A
  • Advanced Compression based around columns
  • Massively Parallel Processing
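
Column-based compression works well because values in a single column tend to be similar and often repeat. As a simplified illustration, the sketch below applies run-length encoding to one column. It is a generic textbook version written for this card, not Redshift's actual implementation, and the sample column is made up.

```python
from itertools import groupby
from typing import List, Tuple


def run_length_encode(column: List[str]) -> List[Tuple[str, int]]:
    """Collapse each run of identical adjacent values into (value, count)."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def run_length_decode(encoded: List[Tuple[str, int]]) -> List[str]:
    """Expand (value, count) pairs back into the original column."""
    return [value for value, count in encoded for _ in range(count)]


region_column = ["us", "us", "us", "eu", "eu", "us"]
encoded = run_length_encode(region_column)
print(encoded)  # [('us', 3), ('eu', 2), ('us', 1)]
```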
7
Q

What does MPP stand for?

A

Massively Parallel Processing

8
Q

Are AWS Redshift Backups enabled by default?

A

Yes, with a 1-day retention period

9
Q

What is the maximum retention period for AWS Redshift Backups?

A

35 days
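
As a small sketch of how the 1-day default and 35-day maximum might be enforced before changing a cluster, the helper below builds a settings dictionary. The function name and cluster identifier are hypothetical, and this sketch only accepts values from 1 to 35. The keys are intended to match the parameters of the boto3 Redshift client's `modify_cluster` call, which is not invoked here.

```python
DEFAULT_RETENTION_DAYS = 1   # default retention stated on the cards
MAX_RETENTION_DAYS = 35      # maximum retention stated on the cards


def snapshot_retention_settings(cluster_id: str,
                                days: int = DEFAULT_RETENTION_DAYS) -> dict:
    """Hypothetical helper: validate a retention period and build the settings."""
    if not 1 <= days <= MAX_RETENTION_DAYS:
        raise ValueError(
            f"retention must be between 1 and {MAX_RETENTION_DAYS} days, got {days}"
        )
    return {
        "ClusterIdentifier": cluster_id,
        "AutomatedSnapshotRetentionPeriod": days,
    }


# "my-cluster" is a placeholder name.
settings = snapshot_retention_settings("my-cluster", days=35)
print(settings["AutomatedSnapshotRetentionPeriod"])  # 35
# With AWS credentials configured, these could be passed along as:
#   boto3.client("redshift").modify_cluster(**settings)
```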

10
Q

What does AWS Redshift do to help ensure redundancy?

A

It always tries to maintain at least 3 copies of the data:

  • The original
  • A replica on the compute nodes
  • A backup in S3
11
Q

What does AWS Redshift do to help with disaster recovery of your data?

A

It can asynchronously replicate your snapshots to S3 in another region

12
Q

What is the pricing model for AWS Redshift?

A
  • Backups
  • Data transfer (only within a VPC, not outside it)
  • Compute node hours: the total number of hours run across all compute nodes in the given billing period (you are NOT charged for leader node hours)
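
The compute node hours definition above is simple arithmetic, sketched below. The node count, the 30-day month, and the hourly rate are illustrative assumptions. The leader node is left out of the count on purpose, since its hours are not charged.

```python
def compute_node_hours(num_compute_nodes: int, hours_running: float) -> float:
    """Total hours across all compute nodes; the leader node is not counted."""
    return num_compute_nodes * hours_running


def compute_charge(num_compute_nodes: int, hours_running: float,
                   hourly_rate_usd: float) -> float:
    """Compute portion of the bill at an assumed per-node hourly rate."""
    return compute_node_hours(num_compute_nodes, hours_running) * hourly_rate_usd


# Example: 4 compute nodes running for a full 30-day month.
hours_in_month = 24 * 30
print(compute_node_hours(4, hours_in_month))    # 2880
print(compute_charge(4, hours_in_month, 0.25))  # 720.0
```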
13
Q

How does AWS Redshift account for security considerations?

A
  • Communications with Redshift are encrypted in transit using SSL
  • Data is encrypted at rest using AES-256
  • By default, Redshift takes care of key management
    • However, you CAN manage your own keys through an HSM or AWS KMS
14
Q

What does HSM stand for?

A

Hardware Security Module

15
Q

How does Redshift handle availability concerns?

A
  • Currently, a Redshift cluster runs in only 1 AZ
  • You can restore snapshots to a new AZ in the event of an outage; snapshots copied to another region (cross-region snapshots) can likewise be restored in that region