Redshift Flashcards
What is Amazon Redshift?
A fully managed, petabyte-scale data warehouse service in the cloud.
What are some key benefits of using Amazon Redshift?
- High performance: Columnar storage, compression, and massively parallel processing deliver fast analytical queries (AWS cites up to 10x faster than traditional data warehouses).
- Cost-effective: Pay only for what you use.
- Scalable: Easily scale your cluster up or down to meet changing needs.
- Secure: Data is encrypted at rest and in transit.
- Durable: Data is replicated within the cluster and backed up to S3.
What is Redshift Spectrum?
Allows you to query exabytes of data stored in S3 without loading it into Redshift.
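Example (a hedged sketch; the Glue database, table, bucket contents, and role ARN are hypothetical):
  CREATE EXTERNAL SCHEMA spectrum_demo
  FROM DATA CATALOG
  DATABASE 'my_glue_db'
  IAM_ROLE 'arn:aws:iam::123456789012:role/MySpectrumRole'
  CREATE EXTERNAL DATABASE IF NOT EXISTS;

  -- Query files in S3 as if they were a local table
  SELECT event_date, COUNT(*) AS events
  FROM spectrum_demo.clickstream
  GROUP BY event_date;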
How does Redshift achieve high performance?
Utilizes Massively Parallel Processing (MPP) to distribute queries across multiple nodes.
What are the different Redshift distribution styles?
- AUTO: Redshift automatically chooses the best distribution style.
- EVEN: Data is distributed evenly across all nodes.
- KEY: Data is distributed based on a chosen key column.
- ALL: The entire table is copied to all nodes.
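Example (a hedged sketch of the KEY and ALL styles at table creation; table and column names are hypothetical):
  -- Large fact table: co-locate rows that join on customer_id
  CREATE TABLE sales (
    sale_id     BIGINT,
    customer_id BIGINT,
    sale_date   DATE,
    amount      DECIMAL(12,2)
  )
  DISTSTYLE KEY
  DISTKEY (customer_id)
  SORTKEY (sale_date);

  -- Small, frequently joined dimension table: copy to every node
  CREATE TABLE calendar (
    cal_date   DATE,
    is_holiday BOOLEAN
  )
  DISTSTYLE ALL;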
What is the purpose of the COPY command?
Efficiently load large amounts of data from external sources into Redshift.
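Example (a minimal sketch loading CSV files from S3; the bucket, table, role ARN, and region are hypothetical):
  COPY sales
  FROM 's3://my-bucket/sales/2024/'
  IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
  FORMAT AS CSV
  IGNOREHEADER 1
  REGION 'us-east-1';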
What is the function of the UNLOAD command?
Exports the results of a query from Redshift to files in S3.
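Example (a hedged sketch; the query, bucket, and role are hypothetical, and single quotes inside the query text are doubled):
  UNLOAD ('SELECT * FROM sales WHERE sale_date >= ''2024-01-01''')
  TO 's3://my-bucket/exports/sales_'
  IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
  GZIP
  PARALLEL ON;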
What is Redshift Workload Management (WLM)?
Prioritizes different types of queries and manages resources to optimize performance.
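One way queries land in a specific WLM queue is by setting a query group that matches a queue's configuration. A hedged sketch, assuming a queue has been configured for the query group 'dashboard':
  SET query_group TO 'dashboard';  -- route subsequent queries to the matching WLM queue
  SELECT region, SUM(amount) FROM sales GROUP BY region;
  RESET query_group;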
What is Concurrency Scaling?
Automatically adds cluster capacity to handle spikes in concurrent read queries.
What are the two types of cluster resizing in Redshift?
- Elastic Resize: Quick resizing with minimal downtime.
- Classic Resize: More time-consuming, but allows changing node types.
What is the purpose of the VACUUM command?
Reclaims disk space by removing deleted rows and sorting data.
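Typical usage, with a hypothetical table name:
  VACUUM FULL sales;        -- reclaim space from deleted rows and re-sort
  VACUUM DELETE ONLY sales; -- reclaim space without re-sorting
  VACUUM SORT ONLY sales;   -- re-sort without reclaiming space
  ANALYZE sales;            -- refresh planner statistics afterwards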
What are RA3 nodes?
A generation of Redshift nodes built on Redshift Managed Storage (RMS), which lets compute and storage scale independently.
What is Redshift Data Lake Export?
Enables unloading data from Redshift to S3 in Parquet format for efficient data lake integration.
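Example (a hedged sketch exporting a table to partitioned Parquet files; names and ARN are hypothetical):
  UNLOAD ('SELECT * FROM sales')
  TO 's3://my-bucket/datalake/sales/'
  IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
  FORMAT AS PARQUET
  PARTITION BY (sale_date);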
What are Materialized Views?
Pre-compute and store query results for faster performance on complex queries.
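Example (a minimal sketch; table and view names are hypothetical):
  CREATE MATERIALIZED VIEW mv_daily_sales AS
  SELECT sale_date, SUM(amount) AS total_amount
  FROM sales
  GROUP BY sale_date;

  -- Bring the pre-computed results up to date after base-table changes
  REFRESH MATERIALIZED VIEW mv_daily_sales;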
What is Redshift Data Sharing?
Allows you to securely share live data across Redshift clusters without copying or moving data.
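Example (a hedged sketch of the producer/consumer flow; the share, schema, table, and namespace IDs are placeholders):
  -- On the producer cluster
  CREATE DATASHARE sales_share;
  ALTER DATASHARE sales_share ADD SCHEMA public;
  ALTER DATASHARE sales_share ADD TABLE public.sales;
  GRANT USAGE ON DATASHARE sales_share TO NAMESPACE 'consumer-namespace-guid';

  -- On the consumer cluster
  CREATE DATABASE sales_from_share FROM DATASHARE sales_share OF NAMESPACE 'producer-namespace-guid';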
What are Redshift Lambda UDFs?
Scalar user-defined functions that invoke AWS Lambda functions from SQL, letting you integrate custom code written in any language Lambda supports.
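Example (a hedged sketch of registering and calling a Lambda-backed function; the function name, Lambda name, and role ARN are hypothetical):
  CREATE EXTERNAL FUNCTION f_mask_email(VARCHAR)
  RETURNS VARCHAR
  VOLATILE
  LAMBDA 'my-email-masking-fn'
  IAM_ROLE 'arn:aws:iam::123456789012:role/MyLambdaInvokeRole';

  SELECT f_mask_email(email) FROM customers LIMIT 10;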
What are Redshift Federated Queries?
Query and analyze data across databases, data warehouses, and data lakes without ETL.
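Example (a hedged sketch attaching an RDS PostgreSQL database as an external schema; the endpoint, names, and ARNs are hypothetical):
  CREATE EXTERNAL SCHEMA postgres_fed
  FROM POSTGRES
  DATABASE 'appdb' SCHEMA 'public'
  URI 'my-rds-instance.abc123.us-east-1.rds.amazonaws.com' PORT 5432
  IAM_ROLE 'arn:aws:iam::123456789012:role/MyFederatedRole'
  SECRET_ARN 'arn:aws:secretsmanager:us-east-1:123456789012:secret:rds-credentials';

  -- Query live operational data without ETL
  SELECT COUNT(*) FROM postgres_fed.orders WHERE order_date = CURRENT_DATE;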
What is the Redshift Data API?
Lets you run SQL statements against Redshift over HTTPS without managing database connections or drivers.
What is Redshift Serverless?
Automatically provisions and scales compute capacity based on workload demands, eliminating the need to manage infrastructure.
What are Redshift Processing Units (RPUs)?
A measure of compute capacity in Redshift Serverless.
What are some of the Redshift system tables and views?
- SVV_EXTERNAL_SCHEMAS: Information about external schemas.
- STL_QUERYTEXT: Text of queries executed on the system.
- SYS_QUERY_HISTORY: History of queries run on the system.
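Example (a hedged query against SYS_QUERY_HISTORY to inspect recent activity; column names may vary slightly by version):
  SELECT query_id, status, elapsed_time, query_text
  FROM sys_query_history
  ORDER BY start_time DESC
  LIMIT 10;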
What is DBLINK?
A PostgreSQL extension that connects a PostgreSQL database (e.g., in RDS) to Redshift so data can be copied and kept in sync between the two.
What is the benefit of Amazon Aurora zero-ETL integration with Redshift?
Enables automatic data replication from Aurora to Redshift without needing ETL processes.
What are some Redshift anti-patterns?
- Using Redshift for small datasets (better suited for RDS).
- Using Redshift for OLTP workloads (consider RDS or DynamoDB).
- Storing unstructured data directly in Redshift (use ETL and/or Redshift Spectrum).
- Storing BLOB data directly in Redshift (store references to files in S3 instead).
How can you enhance security in Redshift?
- Use a Hardware Security Module (HSM) for encryption key management.
- Define granular access privileges for users and groups using GRANT and REVOKE commands.
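Example (a minimal sketch of granular privileges with GRANT and REVOKE; the group, schema, and table names are hypothetical):
  CREATE GROUP analysts;
  GRANT USAGE ON SCHEMA sales_schema TO GROUP analysts;
  GRANT SELECT ON ALL TABLES IN SCHEMA sales_schema TO GROUP analysts;
  REVOKE ALL ON TABLE sales_schema.salaries FROM GROUP analysts;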
What are the different types of data shares in Redshift?
- Standard data shares: For sharing data across Redshift clusters.
- AWS Data Exchange data shares: For sharing data with subscribers.
- AWS Lake Formation-managed data shares: For sharing data with fine-grained access control.
What are some use cases for the Redshift Data API?
- Application integration: Build applications that interact with Redshift data.
- ETL orchestration: Use with AWS Step Functions to create serverless data processing workflows.
- Event-driven ETL: Trigger ETL jobs based on events using Amazon EventBridge.
- Access from SageMaker notebooks: Run queries and analyze data in Jupyter notebooks.
What are some limitations of Redshift Serverless?
- No support for parameter groups.
- No manual workload management (WLM) configuration.
- Limited AWS Partner integration.
- No public endpoints (must be accessed within a VPC).
How can you monitor Redshift Serverless?
- Monitoring views: SYS_QUERY_HISTORY, SYS_LOAD_HISTORY, SYS_SERVERLESS_USAGE.
- CloudWatch logs: Connection and user logs, optional user activity logs.
- CloudWatch metrics: QueriesCompletedPerSecond, QueryDuration, QueriesRunning.
What are the different ways to load data into Redshift?
- COPY command: For loading large datasets from external sources (S3, EMR, DynamoDB, etc.)
- INSERT INTO … SELECT: For loading data from existing tables within Redshift.
- CREATE TABLE AS: For creating a new table based on the results of a query.
- Streaming ingestion: For continuously loading data from Kinesis Data Streams or Amazon MSK.
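Besides COPY (shown earlier), the SQL-based options look roughly like this (table names are hypothetical):
  -- INSERT INTO ... SELECT: load from an existing table
  INSERT INTO sales_archive
  SELECT * FROM sales WHERE sale_date < '2023-01-01';

  -- CREATE TABLE AS: create a new table from a query result
  CREATE TABLE customer_totals AS
  SELECT customer_id, SUM(amount) AS total_spent
  FROM sales
  GROUP BY customer_id;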
What are some best practices for using the COPY command?
- Use a manifest file when loading from S3.
- Leverage IAM roles for secure access to data sources.
- Consider compressing input files (gzip, lzop, or bzip2) to speed up data loading.
- Use automatic compression (the COMPUPDATE option) to let Redshift determine the optimal column encodings.
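Example (a hedged sketch combining a manifest file, gzip-compressed input, and automatic compression analysis; bucket, table, and role are hypothetical):
  COPY sales
  FROM 's3://my-bucket/manifests/sales.manifest'
  IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
  MANIFEST
  GZIP
  FORMAT AS CSV
  COMPUPDATE ON;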
What is a Redshift snapshot?
A point-in-time backup of your Redshift cluster that can be used for disaster recovery or to create a new cluster.
How can you improve the performance of queries in Redshift?
- Use the appropriate distribution style for your tables.
- Choose the right sort keys for your tables.
- Use materialized views for frequently accessed data.
- Optimize your queries by avoiding unnecessary joins and scans.
- Utilize Redshift Workload Management (WLM) to prioritize queries.
What are some common Redshift performance metrics to monitor?
- CPU utilization
- Disk space usage
- Query runtime
- Throughput (queries per second)
- Concurrency (number of active queries)
What are some tools and services that can be integrated with Redshift?
- AWS S3: For data storage and loading.
- Amazon EMR: For data preprocessing and ETL.
- AWS Data Pipeline: For orchestrating data movement and transformation.
- AWS Database Migration Service (DMS): For migrating data to Redshift.
- Amazon QuickSight: For data visualization and business intelligence.
What is the difference between Redshift and Amazon Aurora?
- Redshift is a data warehouse optimized for analytical workloads.
- Aurora is a relational database optimized for transactional workloads.
What are some security best practices for Redshift?
- Encrypt your cluster using AWS KMS.
- Control network access using security groups and VPCs.
- Implement least privilege access control using IAM roles and policies.
- Regularly audit user activity and monitor for suspicious behavior.
What are some cost optimization strategies for Redshift?
- Choose the right node type and cluster size for your workload.
- Utilize Redshift Spectrum to query data directly in S3.
- Leverage reserved instances for predictable workloads.
- Monitor your cluster usage and identify opportunities for optimization.
- Consider using Redshift Serverless for variable workloads.
What are some resources for learning more about Redshift?
- AWS documentation
- AWS Redshift blogs and whitepapers
- AWS training courses and certifications
- AWS community forums and support