Redshift Flashcards
What is Amazon Redshift?
A fully managed, petabyte-scale data warehouse service in the cloud.
What are some key benefits of using Amazon Redshift?
- High performance: Columnar storage and parallel query execution; AWS has claimed up to 10x faster than other data warehouse solutions.
- Cost-effective: Pay only for what you use.
- Scalable: Easily scale your cluster up or down to meet changing needs.
- Secure: Data is encrypted at rest and in transit.
- Durable: Data is replicated within the cluster and backed up to S3.
What is Redshift Spectrum?
Allows you to query exabytes of data stored in S3 without loading it into Redshift.
How does Redshift achieve high performance?
Utilizes Massively Parallel Processing (MPP) to distribute queries across multiple nodes.
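The scatter-gather idea behind MPP can be sketched in plain Python. This is a toy model, not Redshift's engine: the `parallel_sum` name, the slice count, and the round-robin placement are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(values, num_slices=4):
    """Toy scatter-gather: each 'slice' sums its share, a 'leader' combines them."""
    # Scatter: place values round-robin across the hypothetical slices.
    slices = [values[i::num_slices] for i in range(num_slices)]
    # Each slice computes its partial aggregate independently.
    with ThreadPoolExecutor(max_workers=num_slices) as pool:
        partials = list(pool.map(sum, slices))
    # Gather: the leader merges the partial results into the final answer.
    return sum(partials)
```

Calling `parallel_sum(list(range(1, 101)))` gives the same total as a single-threaded `sum`; the point is only that each partial result is computed without looking at the other slices.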
What are the different Redshift distribution styles?
- AUTO: Redshift automatically chooses the best distribution style.
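- EVEN: Rows are distributed across the slices in round-robin fashion, regardless of column values.
- KEY: Rows are distributed according to the values in the distribution key column, so rows with matching values land on the same slice.
- ALL: A full copy of the table is stored on every node.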
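A rough way to picture the three explicit styles is to place rows into numbered buckets, where each bucket stands in for a unit of storage in the cluster. The sketch below is a simplified model under stated assumptions: `distribute`, `num_buckets`, and the CRC32 hash are illustrative choices, not Redshift internals, and AUTO is not modeled.

```python
import zlib
from collections import defaultdict

STYLES = ("EVEN", "KEY", "ALL")


def distribute(rows, style, num_buckets=4, key=None):
    """Return {bucket_id: [rows]} for a toy model of EVEN, KEY, and ALL."""
    if style not in STYLES:
        raise ValueError(f"unknown distribution style: {style}")
    if style == "KEY" and key is None:
        raise ValueError("KEY distribution needs a key column")
    placement = defaultdict(list)
    for i, row in enumerate(rows):
        if style == "EVEN":
            # Round-robin: ignore the row's contents entirely.
            placement[i % num_buckets].append(row)
        elif style == "KEY":
            # Hash the key value so equal values always share a bucket.
            digest = zlib.crc32(str(row[key]).encode("utf-8"))
            placement[digest % num_buckets].append(row)
        else:  # ALL
            # Every bucket receives its own full copy of the table.
            for bucket in range(num_buckets):
                placement[bucket].append(row)
    return dict(placement)
```

With KEY, rows that share a key value are co-located, which is why joining two tables on their distribution key can avoid moving rows between buckets.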
What is the purpose of the COPY command?
Efficiently load large amounts of data from external sources into Redshift.
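As a minimal sketch, a COPY statement for loading from S3 can be assembled like this. The helper name `build_copy_sql`, the table name, the bucket path, and the role ARN are hypothetical placeholders; the helper does no identifier escaping, so it is suitable for illustration only.

```python
# Formats this sketch accepts; COPY supports others that need extra options.
ALLOWED_FORMATS = {"CSV", "PARQUET", "ORC"}


def build_copy_sql(table, s3_path, iam_role_arn, data_format="PARQUET"):
    """Assemble a COPY statement that loads files under an S3 prefix."""
    fmt = data_format.upper()
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format in this sketch: {data_format}")
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS {fmt};"
    )


# Hypothetical table, bucket, and role used purely as an example.
print(build_copy_sql(
    "sales",
    "s3://example-bucket/sales/",
    "arn:aws:iam::123456789012:role/ExampleRole",
))
```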
What is the function of the UNLOAD command?
Unload data from Redshift tables to files in S3.
What is Redshift Workload Management (WLM)?
Prioritizes different types of queries and manages resources to optimize performance.
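The prioritization idea can be pictured as a priority queue. The sketch below is a loose analogy, not how WLM is implemented: real WLM shares memory and concurrency among queries rather than running them strictly one after another, and the `dispatch_order` helper and the priority labels used here are assumptions for illustration.

```python
import heapq

# Lower rank is dispatched first; labels are assumed for this sketch.
PRIORITY_RANK = {"HIGHEST": 0, "HIGH": 1, "NORMAL": 2, "LOW": 3, "LOWEST": 4}


def dispatch_order(queries):
    """Order (name, priority) pairs by priority, then by arrival."""
    heap = [
        (PRIORITY_RANK[priority], arrival, name)
        for arrival, (name, priority) in enumerate(queries)
    ]
    heapq.heapify(heap)
    order = []
    while heap:
        # Ties on priority fall back to arrival order (first in, first out).
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order
```

For example, a LOW-priority ETL job that arrives first is still dispatched after HIGH-priority dashboard queries that arrive later.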
What is Concurrency Scaling?
Automatically adds cluster capacity to handle spikes in concurrent read queries.
What are the two types of cluster resizing in Redshift?
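- Elastic Resize: Quick resizing with minimal downtime; can change the number of nodes, the node type, or both.
- Classic Resize: More time-consuming; used for changes that elastic resize does not support, such as moving to or from a single-node cluster.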
What is the purpose of the VACUUM command?
Reclaims disk space occupied by rows marked for deletion and re-sorts the remaining rows by sort key.
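The two jobs VACUUM performs can be mimicked on an in-memory list. This is only a conceptual sketch: the `vacuum` function and the `deleted` flag are hypothetical stand-ins for rows that have been marked for deletion but still occupy space.

```python
def vacuum(rows, sort_key):
    """Toy VACUUM: drop rows flagged as deleted, then re-sort the survivors."""
    # Step 1: reclaim space by discarding rows marked for deletion.
    live = [row for row in rows if not row.get("deleted", False)]
    # Step 2: restore sort order so range scans on sort_key stay efficient.
    return sorted(live, key=lambda row: row[sort_key])
```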
What are RA3 nodes?
A generation of Redshift nodes that use managed storage, allowing compute and storage to be scaled independently.
What is Redshift Data Lake Export?
Enables unloading data from Redshift to S3 in Parquet format for efficient data lake integration.
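A Parquet export is written with the UNLOAD command. The sketch below assembles such a statement; `build_unload_sql`, the bucket prefix, the role ARN, and the column names are hypothetical placeholders, and the helper is meant for illustration rather than production use.

```python
def build_unload_sql(select_sql, s3_prefix, iam_role_arn, partition_by=None):
    """Assemble an UNLOAD statement that writes Parquet files to S3."""
    # UNLOAD takes the query as a quoted string, so inner quotes are doubled.
    escaped = select_sql.replace("'", "''")
    lines = [
        f"UNLOAD ('{escaped}')",
        f"TO '{s3_prefix}'",
        f"IAM_ROLE '{iam_role_arn}'",
        "FORMAT AS PARQUET",
    ]
    if partition_by:
        # Optional: write one S3 folder per distinct partition value.
        lines.append(f"PARTITION BY ({', '.join(partition_by)})")
    return "\n".join(lines) + ";"
```

Partitioning the output by a column such as a date lets downstream data lake queries skip folders they do not need.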
What are Materialized Views?
Pre-compute and store query results for faster performance on complex queries.
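The trade-off is easiest to see in a small model: reads are cheap because the result is stored, but the stored result goes stale until it is refreshed. The `MaterializedView` class and `total_by_region` helper below are hypothetical illustrations, not Redshift APIs.

```python
class MaterializedView:
    """Toy materialized view: stores a query result until refreshed."""

    def __init__(self, base_rows, query):
        self._base_rows = base_rows
        self._query = query
        # Pay the query cost once, up front.
        self._result = query(base_rows)

    def read(self):
        # Reads return the stored result without re-running the query.
        return self._result

    def refresh(self):
        # Recompute from the base data to pick up changes.
        self._result = self._query(self._base_rows)


def total_by_region(rows):
    """Example aggregate: sum of amount per region."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
    return totals
```

After new rows are appended to the base data, `read()` keeps returning the old totals until `refresh()` is called.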