AWS Redshift Flashcards
When you create a cluster, what do you get as a base configuration?
You get two nodes, a leader node and a data node, giving 160 GB of storage.
Do you get to select the disk size for Redshift?
No, you do not get to select the disk size. You select the overall size of the Redshift cluster through a slider in the console or a parameter in the CLI and API. AWS then works out the number of disks in each data node.
I need to add capacity to my Redshift cluster. How can I do this?
You have two options: scale up or scale out. Scaling up means changing the size of the nodes; scaling out means adding more nodes.
What interfaces does Redshift support?
- ODBC
- JDBC
- Postgres
What is Redshift built on?
PostgreSQL. AWS separated the storage from the query engine and then replaced the storage engine with a columnar store.
What is Redshift used for?
- Data warehousing
- Analytics
I have data in S3. Is it possible to query this data from Redshift?
Yes, Redshift has a feature called Redshift Spectrum. The data in S3 must be in a supported file format, such as CSV, Parquet, or ORC.
What type of database is Redshift?
It is a columnar database, designed to scan columns of data fast. With columnar data it is easy to sum a column or find its min and max quickly, because only that column's values need to be read.
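To make the columnar idea concrete, here is a toy sketch in plain Python. It does not use Redshift at all; the table, column names, and values are made up purely for illustration.

```python
# Toy comparison of a row-oriented and a column-oriented layout.
# The "orders" data below is hypothetical.

rows = [
    {"order_id": 1, "region": "eu", "amount": 120},
    {"order_id": 2, "region": "us", "amount": 75},
    {"order_id": 3, "region": "eu", "amount": 240},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every whole row is visited just to read one field.
row_total = sum(record["amount"] for record in rows)

# Column layout: only the "amount" values are scanned.
amounts = columns["amount"]
column_total = sum(amounts)
amount_min, amount_max = min(amounts), max(amounts)

print(column_total, amount_min, amount_max)
```

Both layouts give the same total; the difference is how much data has to be touched to get it.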
What is the architecture of a Redshift cluster?
You have a leader node and data nodes. Each data node is divided into slices, and the slices are where data is stored and searched.
What is the purpose of the leader node?
The leader node is the query planner: it distributes the query to the data nodes in the cluster.
Is Redshift OLAP or OLTP?
It is OLAP (online analytical processing).
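The leader and slice roles can be pictured with a small scatter-gather sketch. This is a toy model, not Redshift's actual planner: the modulo-based distribution and all function names are assumptions chosen for illustration.

```python
# Toy model: a leader fans a SUM query out to slices and merges the partial results.
# Distributing rows by key modulo slice count is an illustrative assumption.

def distribute(rows, slice_count):
    """Place each (key, amount) row on a slice chosen by its key."""
    slices = [[] for _ in range(slice_count)]
    for key, amount in rows:
        slices[key % slice_count].append((key, amount))
    return slices


def slice_partial_sum(slice_rows):
    """Work done locally on one slice: sum only the rows it holds."""
    return sum(amount for _, amount in slice_rows)


def leader_sum(slices):
    """Work done by the leader: collect partial sums and combine them."""
    partials = [slice_partial_sum(slice_rows) for slice_rows in slices]
    return sum(partials)


rows = [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)]
slices = distribute(rows, slice_count=2)
total = leader_sum(slices)
print(total)
```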
Is Redshift regional or global?
A Redshift cluster lives in a single subnet in a single AZ. The components need to communicate quickly, so they are kept close together.
Is data compressed in Redshift?
Yes. Compression is not applied as a blanket setting; it is defined per column when you create the table.
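As a sketch of what per-column compression looks like, the helper below builds a CREATE TABLE statement with an ENCODE clause on each column. The helper, table, and column names are hypothetical; `az64`, `zstd`, and `raw` are Redshift encoding names, but the choice per column here is only an example.

```python
def create_table_ddl(table, columns):
    """Build CREATE TABLE DDL from (name, sql_type, encoding) tuples."""
    lines = [
        f"  {name} {sql_type} ENCODE {encoding}"
        for name, sql_type, encoding in columns
    ]
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);"


# Hypothetical table: each column gets its own encoding.
ddl = create_table_ddl(
    "sales",
    [
        ("sale_id", "BIGINT", "az64"),
        ("region", "VARCHAR(16)", "zstd"),
        ("amount", "DECIMAL(12,2)", "raw"),  # raw means no compression
    ],
)
print(ddl)
```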
Is Redshift a service or do you get a cluster of nodes?
You get a cluster of nodes, one leader and the rest are data nodes.
Does Redshift support encryption?
Yes, you can use KMS or CloudHSM. With KMS you can use an AWS managed CMK or your own CMK.
Can I resize a cluster?
Yes. You have two options, elastic resize and classic resize. Classic resize makes a new cluster and copies the data over to it. Elastic resize just adds nodes and rebalances the data.
I want to increase the size of the Redshift cluster nodes. How can I do this?
You have to use classic resize, as it enables the resizing of nodes. A new cluster will be created and the data will be copied over to the new cluster.
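The two resize styles can be sketched as request parameters for boto3's Redshift `resize_cluster` call, which accepts a `Classic` flag. The helper function, cluster name, and node type below are hypothetical, and the actual API call is left commented out because it needs AWS credentials.

```python
def resize_request(cluster_id, node_count, node_type=None, classic=False):
    """Build keyword arguments for boto3's redshift `resize_cluster` call."""
    params = {
        "ClusterIdentifier": cluster_id,
        "NumberOfNodes": node_count,
        "Classic": classic,
    }
    if node_type is not None:
        # Changing the node type is the "resize the nodes" case.
        params["NodeType"] = node_type
    return params


# Elastic resize: keep the node type, change the node count.
elastic_resize = resize_request("example-cluster", node_count=4)

# Classic resize: change the node type (hypothetical target type).
classic_resize = resize_request(
    "example-cluster", node_count=2, node_type="ra3.4xlarge", classic=True
)

# With AWS credentials configured, a request could be sent like this:
# import boto3
# boto3.client("redshift").resize_cluster(**classic_resize)
```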
How is Redshift backed up?
Automatic backups are enabled by default when the cluster is created. Snapshots of the cluster are taken automatically, and you can also take manual snapshots. Snapshot data is stored in S3.
What services can push or load data into Redshift?
- Kinesis
- S3
- Data Pipeline
How often does AWS take snapshots of the Redshift cluster?
About every 8 hours or after every 5 GB of data changes per node, whichever comes first.
Is it possible to take a manual snapshot of the Redshift cluster?
Yes. You also set how long you want the snapshot to be retained; a retention period of -1 keeps it forever.
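A minimal sketch of a manual snapshot request, assuming boto3's Redshift `create_cluster_snapshot` call and its `ManualSnapshotRetentionPeriod` parameter. The helper and identifiers are hypothetical, and the call itself is commented out because it needs AWS credentials.

```python
def manual_snapshot_request(cluster_id, snapshot_id, retention_days=-1):
    """Build keyword arguments for boto3's redshift `create_cluster_snapshot`.

    retention_days=-1 keeps the snapshot indefinitely.
    """
    if retention_days != -1 and not 1 <= retention_days <= 3653:
        raise ValueError("retention_days must be -1 or between 1 and 3653")
    return {
        "ClusterIdentifier": cluster_id,
        "SnapshotIdentifier": snapshot_id,
        "ManualSnapshotRetentionPeriod": retention_days,
    }


keep_forever = manual_snapshot_request("example-cluster", "before-migration")
keep_30_days = manual_snapshot_request("example-cluster", "weekly", retention_days=30)

# import boto3
# boto3.client("redshift").create_cluster_snapshot(**keep_forever)
```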
I am concerned about DR for my Redshift, what options do I have?
You can configure snapshots to be copied automatically to another region. You select the destination region and the retention period.
If I want to restore a table from a snapshot/backup, is this possible?
Yes, you can select the backup/snapshot and then the database and the table to restore.
I want to be able to restore my cluster in the event of a disaster. What options do I have?
You can have the Redshift cluster take snapshots/backups, and you will then be able to restore the cluster from them.
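Cross-region copy can be sketched as parameters for boto3's Redshift `enable_snapshot_copy` call. The helper name, cluster name, and region are hypothetical examples, and the call is commented out because it needs AWS credentials.

```python
def snapshot_copy_request(cluster_id, destination_region, retention_days=7):
    """Build keyword arguments for boto3's redshift `enable_snapshot_copy`."""
    return {
        "ClusterIdentifier": cluster_id,
        "DestinationRegion": destination_region,
        # How long copied automated snapshots are kept in the destination region.
        "RetentionPeriod": retention_days,
    }


dr_copy = snapshot_copy_request("example-cluster", "eu-west-1", retention_days=14)

# import boto3
# boto3.client("redshift").enable_snapshot_copy(**dr_copy)
```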
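A table-level restore can be sketched as parameters for boto3's Redshift `restore_table_from_cluster_snapshot` call, which restores the table under a new name in the target cluster. The helper and all identifiers below are hypothetical, and the call is commented out because it needs AWS credentials.

```python
def table_restore_request(cluster_id, snapshot_id, database, table, new_table):
    """Build keyword arguments for `restore_table_from_cluster_snapshot`."""
    return {
        "ClusterIdentifier": cluster_id,
        "SnapshotIdentifier": snapshot_id,
        "SourceDatabaseName": database,
        "SourceTableName": table,
        # The restored copy gets a new name rather than overwriting the original.
        "NewTableName": new_table,
    }


restore = table_restore_request(
    "example-cluster", "before-migration", "dev", "sales", "sales_restored"
)

# import boto3
# boto3.client("redshift").restore_table_from_cluster_snapshot(**restore)
```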