AWS Redshift Flashcards
When you create a cluster, what do you get as a base configuration?
You get two nodes, a leader node and a data node, giving 160 GB of storage.
Do you get to select the disk size for Redshift?
No, you do not get to select the disk size. You do get to select the overall size of the Redshift cluster, through a slider in the console or a parameter in the CLI and API. AWS then works out the number of disks in each data node.
I need to add capacity to my Redshift cluster. How can I do this?
You have two options: scale up or scale out. Scaling up means changing the size of the instance; scaling out means adding more nodes.
What interfaces does Redshift support?
- ODBC
- JDBC
- Postgres
What is Redshift built on?
PostgreSQL. AWS separated the storage from the query engine and then replaced the storage engine with a columnar database.
What is Redshift used for?
- Data warehousing
- Analytics
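I have data in S3. Is it possible to query this data from Redshift?
Yes. Redshift has a feature called Redshift Spectrum that queries the data in place in S3. The data must be in a supported structured format such as CSV; columnar formats such as Parquet and ORC are also supported.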
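As a rough illustration of what pointing Spectrum at delimited files in S3 can look like, the sketch below builds a `CREATE EXTERNAL TABLE` statement as a string. It assumes an external schema has already been created; the schema, table, column, and bucket names are hypothetical, and the helper `external_table_sql` is not part of any AWS library.

```python
def external_table_sql(schema, table, columns, s3_location, delimiter=","):
    """Build a CREATE EXTERNAL TABLE statement for delimited text files in S3.

    `columns` is a list of (name, sql_type) tuples.
    """
    column_lines = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n"
        f"{column_lines}\n"
        ")\n"
        "ROW FORMAT DELIMITED\n"
        f"FIELDS TERMINATED BY '{delimiter}'\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{s3_location}';"
    )

# Hypothetical names, for illustration only.
spectrum_sql = external_table_sql(
    "spectrum_schema",
    "sales",
    [("sale_id", "INTEGER"), ("amount", "DECIMAL(8,2)")],
    "s3://example-bucket/sales/",
)
print(spectrum_sql)
```

Once such a table exists, it can be queried with ordinary SELECT statements alongside local Redshift tables.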
What type of database is Redshift?
It is a columnar database, designed to scan columns of data fast. With columnar storage, it is easy to sum a column or find its min and max quickly.
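To make the row-versus-column difference concrete, here is a toy Python sketch. It does not reflect how Redshift stores data internally; the sample `rows` and the `to_columns` helper are made up for illustration.

```python
# Illustrative only: a toy comparison of row-oriented and column-oriented layouts.
rows = [
    {"user_id": 1, "country": "IE", "amount": 10.0},
    {"user_id": 2, "country": "US", "amount": 25.5},
    {"user_id": 3, "country": "IE", "amount": 4.5},
]

def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns

columns = to_columns(rows)

# An aggregate reads one list instead of every field of every row.
total = sum(columns["amount"])
smallest, largest = min(columns["amount"]), max(columns["amount"])
print(total, smallest, largest)
```

The aggregate only touches the `amount` list, which is the property that makes column scans cheap for analytics.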
What is the architecture of a Redshift cluster?
You have a leader node and data nodes. Each data node is divided into slices, and the slices are where the data is stored and searched.
What is the purpose of the leader node?
The leader node is the query planner: it plans the query and distributes it to the data nodes in the cluster.
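The distribute-then-combine pattern can be sketched in a few lines of Python. This is a toy model under simplifying assumptions, not Redshift code; the function names and the sample `slices` are hypothetical.

```python
# Toy model of scatter-gather: the shape of the idea, not an implementation.
def compute_node_partial_sum(slice_values):
    """Each data node sums only the values stored on its own slice."""
    return sum(slice_values)

def leader_node_sum(slices):
    """The leader hands the work to every slice, then merges the partial results."""
    partials = [compute_node_partial_sum(values) for values in slices]
    return sum(partials)

# Hypothetical data spread across three slices.
slices = [[1, 2, 3], [4, 5], [6]]
print(leader_node_sum(slices))  # 21
```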
Is Redshift OLAP or OLTP?
It is OLAP (online analytical processing).
Is Redshift regional or global?
A Redshift cluster lives in a single subnet in a single AZ. The components need to communicate quickly, and that requires keeping them local to one another.
Is data compressed in Redshift?
Yes. Compression is not applied as a blanket setting; it is defined per column when you create a table.
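As a sketch of what per-column compression looks like in table DDL, the Python below builds a `CREATE TABLE` statement with an `ENCODE` clause on each column. The table, the column names, and the helper `create_table_sql` are hypothetical, and the encodings are examples rather than tuning advice.

```python
def create_table_sql(table, columns):
    """Build a CREATE TABLE statement with a per-column ENCODE clause.

    `columns` is a list of (name, sql_type, encoding) tuples.
    """
    column_lines = [
        f"    {name} {sql_type} ENCODE {encoding}"
        for name, sql_type, encoding in columns
    ]
    return f"CREATE TABLE {table} (\n" + ",\n".join(column_lines) + "\n);"

# Hypothetical table definition.
sql = create_table_sql(
    "sales",
    [
        ("sale_id", "BIGINT", "az64"),
        ("country", "VARCHAR(2)", "lzo"),
        ("notes", "VARCHAR(256)", "zstd"),
    ],
)
print(sql)
```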
Is Redshift a service or do you get a cluster of nodes?
You get a cluster of nodes, one leader and the rest are data nodes.
Does Redshift support encryption?
Yes, you can use KMS or CloudHSM. With KMS you can use an AWS managed CMK or your own CMK.
Can I resize a cluster?
You have two options: elastic resize and classic resize. Classic resize creates a new cluster and copies the data over to it. Elastic resize just adds nodes and rebalances the data.
I want to increase the size of the Redshift cluster nodes. How can I do this?
You have to use classic resize as it enables the resizing of nodes. A new cluster will be created and the data will be copied over to the new cluster.
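A minimal sketch of how the two resize styles might be expressed as request parameters. It only builds the keyword arguments intended for boto3's Redshift `resize_cluster` call and makes no AWS request; the cluster name, node type, and the helper `resize_request` are assumptions for illustration.

```python
def resize_request(cluster_id, number_of_nodes, node_type=None, classic=False):
    """Build keyword arguments for boto3's Redshift `resize_cluster` call."""
    if number_of_nodes < 1:
        raise ValueError("number_of_nodes must be at least 1")
    params = {
        "ClusterIdentifier": cluster_id,
        "NumberOfNodes": number_of_nodes,
        "Classic": classic,
    }
    if node_type is not None:
        params["NodeType"] = node_type
    return params

# Scale out: same node type, more nodes (elastic resize).
scale_out = resize_request("my-cluster", number_of_nodes=4)
# Scale up: a different node type, using classic resize.
scale_up = resize_request("my-cluster", 2, node_type="ds2.8xlarge", classic=True)
print(scale_out)
print(scale_up)
```

With boto3 installed and credentials configured, the dictionary would be passed as `boto3.client("redshift").resize_cluster(**scale_up)`.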
How is Redshift backed up?
Automatic backups are enabled by default when the cluster is created: snapshots of the Redshift cluster are taken automatically, and you can also take manual snapshots. Snapshot data is stored in S3.
What services can push or load data into Redshift?
- Kinesis
- S3
- Data Pipeline
How often does AWS take snapshots of the Redshift cluster?
Every 6 to 8 hours, or after every 5 GB of data changes.
Is it possible to take a manual snapshot of the Redshift cluster?
Yes. You also set how long you want the snapshot to be retained; -1 means forever.
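A small sketch of the retention setting, assuming the request is eventually sent with boto3's `create_cluster_snapshot`. The code only builds the parameters; the identifiers and the helper `manual_snapshot_request` are hypothetical.

```python
def manual_snapshot_request(cluster_id, snapshot_id, retention_days=-1):
    """Build keyword arguments for boto3's `create_cluster_snapshot` call.

    A retention of -1 keeps the manual snapshot indefinitely.
    """
    if retention_days != -1 and retention_days < 1:
        raise ValueError("retention_days must be -1 or a positive number of days")
    return {
        "ClusterIdentifier": cluster_id,
        "SnapshotIdentifier": snapshot_id,
        "ManualSnapshotRetentionPeriod": retention_days,
    }

keep_forever = manual_snapshot_request("my-cluster", "before-upgrade")
keep_30_days = manual_snapshot_request("my-cluster", "monthly", retention_days=30)
print(keep_forever)
print(keep_30_days)
```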
I am concerned about DR for my Redshift, what options do I have?
You can configure snapshots to be replicated to another region; you select the destination region and the retention period.
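The two choices named here, destination region and retention period, map onto the parameters of boto3's `enable_snapshot_copy`. The sketch below only assembles those parameters; the cluster name, the region, and the helper `snapshot_copy_request` are illustrative assumptions.

```python
def snapshot_copy_request(cluster_id, destination_region, retention_days=7):
    """Build keyword arguments for boto3's `enable_snapshot_copy` call."""
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    return {
        "ClusterIdentifier": cluster_id,
        "DestinationRegion": destination_region,
        "RetentionPeriod": retention_days,
    }

# Hypothetical cluster and destination region.
dr_copy = snapshot_copy_request("my-cluster", "eu-west-1", retention_days=14)
print(dr_copy)
```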
If I want to restore a table from a snapshot/backup, is this possible?
Yes, you can select the backup/snapshot and then the database and the table.
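The same selection (snapshot, then database, then table) can be sketched as parameters for boto3's `restore_table_from_cluster_snapshot`. Nothing is sent to AWS here; all identifiers and the helper `table_restore_request` are hypothetical.

```python
def table_restore_request(cluster_id, snapshot_id, database, table, new_table):
    """Build keyword arguments for boto3's `restore_table_from_cluster_snapshot`."""
    return {
        "ClusterIdentifier": cluster_id,
        "SnapshotIdentifier": snapshot_id,
        "SourceDatabaseName": database,
        "SourceTableName": table,
        # The restored copy is written under a new table name.
        "NewTableName": new_table,
    }

restore = table_restore_request(
    "my-cluster", "nightly-snapshot", "analytics", "users", "users_restored"
)
print(restore)
```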
I want to be able to restore my cluster in the event of a disaster. What options do I have?
You can have the Redshift cluster take snapshots/backups and then you will be able to restore.
What is the maximum amount of data that Redshift can manage?
2 PB
What type of database is Redshift?
Columnar database
Is Redshift OLAP or OLTP?
OLAP
I want to increase the amount of data in my Redshift cluster. How can I do this?
Increase the number of nodes, as each node is a compute and storage unit.
What types of nodes do you get in a Redshift cluster?
You get a leader node and data nodes.
For data nodes, are there different types of nodes?
Yes, you have two instance type options:
- DC2 (SSD)
- DS2 (magnetic)
I have one large file (1TB), what should I do when loading into Redshift and why?
You need to split the file into smaller files so that the files can be loaded in parallel, each onto a separate node in the Redshift cluster.
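One possible way to do the split, sketched in Python with only the standard library. It distributes lines round-robin across a fixed number of part files, so row order across files is not preserved; the file names and the helpers `split_file` and `count_lines` are hypothetical, and a small temporary file stands in for the 1 TB one.

```python
import os
import tempfile

def split_file(path, parts, out_dir):
    """Split a text file into `parts` files on line boundaries (round-robin)."""
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(path)
    out_paths = [os.path.join(out_dir, f"{base}.part{i:04d}") for i in range(parts)]
    outputs = [open(p, "w", encoding="utf-8") for p in out_paths]
    try:
        with open(path, encoding="utf-8") as source:
            for line_number, line in enumerate(source):
                outputs[line_number % parts].write(line)
    finally:
        for handle in outputs:
            handle.close()
    return out_paths

def count_lines(path):
    """Count the lines in a text file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)

# Demo: ten rows split into four parts.
with tempfile.TemporaryDirectory() as workdir:
    big = os.path.join(workdir, "sales.csv")
    with open(big, "w", encoding="utf-8") as f:
        f.writelines(f"{i},item{i}\n" for i in range(10))
    pieces = split_file(big, parts=4, out_dir=os.path.join(workdir, "split"))
    line_counts = [count_lines(p) for p in pieces]
    print(line_counts)  # [3, 3, 2, 2]
```

Choosing a number of parts that is a multiple of the number of slices in the cluster is a common rule of thumb, so every slice gets a similar share of the work.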
What are the two operations you perform on a Redshift cluster to get data in and out?
Load (the COPY command) and unload (the UNLOAD command).
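A hedged sketch of what the two statements can look like, built as strings in Python. The table name, S3 prefixes, and IAM role ARN are placeholders, and the helpers `copy_sql` and `unload_sql` are made up for illustration.

```python
def copy_sql(table, s3_prefix, iam_role):
    """Build a COPY statement that loads CSV files from an S3 prefix."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS CSV;"
    )

def unload_sql(query, s3_prefix, iam_role):
    """Build an UNLOAD statement that writes a query result to an S3 prefix."""
    # Single quotes inside the query are doubled because the query is itself quoted.
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}';"
    )

# Placeholder role ARN and bucket.
role = "arn:aws:iam::123456789012:role/ExampleRedshiftRole"
load_stmt = copy_sql("sales", "s3://example-bucket/sales/part", role)
unload_stmt = unload_sql(
    "SELECT * FROM sales WHERE country = 'IE'", "s3://example-bucket/export/sales_", role
)
print(load_stmt)
print(unload_stmt)
```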
Where is Redshift deployed to?
VPC
Can you purchase reservations?
Yes
I want to be able to store user information and update individual user data fields. Is Redshift suitable? Give a reason.
No. Redshift is an OLAP (columnar) database and is not suitable for OLTP-type data.
Can you make a Redshift cluster public?
Yes
How do backups work on Redshift?
Snapshots are taken both manually and automatically. They are incremental, and you restore by choosing one of the retained snapshots.
What are the AWS services that can put data into Redshift?
- Data Pipeline
- Kinesis Firehose
- S3
How can I increase the DR capabilities of Redshift?
Ensure automatic snapshots are configured, and enable cross-region snapshot copy so that the snapshots stored in S3 are copied to another region.
Can I restore just a table and not the whole database?
Yes, you have the ability to restore just a table.
What is Redshift?
Redshift is a fully managed, fast and powerful, petabyte-scale data warehouse service.
What is the smallest Redshift cluster you can have?
One node. It acts as both the leader node and the compute node.