Chapter 5 - Analytics: Amazon Athena, Amazon EMR, Amazon Kinesis, Amazon Redshift, AWS Glue, AWS Data Pipeline, Amazon QuickSight, AWS Lake Formation, Amazon Elasticsearch Service Flashcards
Which AWS service would you use for real-time analytics of streaming data such as IoT telemetry, application logs, and website clickstreams?
- Amazon Athena
- Amazon Kinesis
- Amazon Elasticsearch Service
- Amazon QuickSight
Which of the following are Kinesis services? Choose 4.
- Kinesis Video Streams
- Kinesis Data Streams
- Kinesis Data Firehose
- Kinesis QuickSight
- Kinesis Data Analytics
You want to collect log and event data from sources such as servers, desktops, and mobile devices and then have a custom application continuously process the data, generate metrics, power live dashboards, and emit aggregated data into stores such as Amazon S3. Which is the main AWS service you will use?
- Kinesis Data Streams
- Kinesis Data Firehose
- Kinesis Video Streams
- Kinesis Data Analytics
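If Kinesis Data Streams is the service chosen for the scenario above, the producer side might look roughly like the following sketch. It assumes boto3 is installed and AWS credentials are configured; the stream name `app-logs`, the `device_id` field, and both helper functions are hypothetical names used only for illustration.

```python
import json


def build_put_record_args(stream_name, event, partition_key_field="device_id"):
    """Build keyword arguments for the Kinesis Data Streams PutRecord call.

    The partition key decides which shard receives the record, so events
    sharing a key (here, a hypothetical device_id) stay ordered per shard.
    """
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event).encode("utf-8"),
        "PartitionKey": str(event[partition_key_field]),
    }


def send_event(stream_name, event):
    """Send one event to the stream (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without AWS access

    client = boto3.client("kinesis")
    return client.put_record(**build_put_record_args(stream_name, event))


# Hypothetical log event from a device; only the arguments are built here.
args = build_put_record_args(
    "app-logs", {"device_id": "sensor-7", "level": "ERROR"}
)
```

A custom consumer application would then read from the same stream, compute metrics, and write aggregates to a store such as Amazon S3.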
Which of the following are ideal use cases for Kinesis Data Streams? Choose 3.
- Real-time data analytics
- Long-term data storage and analytics
- Log and data feed intake and processing
- Real-time metrics and reporting
- ETL batch jobs
What are features of Amazon Redshift? Choose 3.
- Fully managed data warehouse service.
- Allows you to run complex analytic queries against petabytes of structured data using sophisticated query optimization, columnar storage on high-performance storage, and massively parallel query execution.
- Also includes Amazon Athena, allowing you to directly run SQL queries against exabytes of unstructured data in Amazon S3 data lakes
- Also includes Amazon Redshift Spectrum, allowing you to directly run SQL queries against exabytes of unstructured data in Amazon S3 data lakes
- Fully managed data lake service.
You are working as a solutions architect for a financial services company that is planning a new data warehouse solution on Amazon Redshift. The raw data will first be exported to S3 and an EMR cluster and then copied into Redshift. The query results will be exported to another S3 data lake. How can you ensure that all data exchange (COPY, UNLOAD) between Redshift and other AWS resources does not traverse the internet, and that it can leverage VPC security and monitoring features?
- Use AWS Glue to copy and upload data to the Redshift cluster
- Use AWS Data Pipeline to copy and upload data to the Redshift cluster
- Enable enhanced VPC routing on your Redshift cluster
- Enable VPC flow logs on your Redshift cluster
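For reference, enhanced VPC routing is a cluster-level setting that can be changed through the Redshift `ModifyCluster` API. The sketch below shows how that might be done with boto3, assuming boto3 is installed and credentials are configured; the cluster identifier `finance-dw` and both helper functions are hypothetical.

```python
def build_enhanced_vpc_routing_args(cluster_id):
    """Build keyword arguments for the Redshift ModifyCluster call.

    With EnhancedVpcRouting set to True, COPY and UNLOAD traffic between the
    cluster and its data repositories is routed through the VPC.
    """
    return {"ClusterIdentifier": cluster_id, "EnhancedVpcRouting": True}


def enable_enhanced_vpc_routing(cluster_id):
    """Apply the setting to a cluster (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without AWS access

    client = boto3.client("redshift")
    return client.modify_cluster(**build_enhanced_vpc_routing_args(cluster_id))


# Hypothetical cluster identifier; only the arguments are built here.
args = build_enhanced_vpc_routing_args("finance-dw")
```

Once traffic stays inside the VPC, features such as security groups, VPC endpoints, and VPC flow logs can be applied to it.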
Which AWS service would you use for business analytics dashboards and visualizations?
- Amazon Athena
- Amazon EMR
- Amazon Elasticsearch Service
- Amazon QuickSight
You are the solutions architect for a national retail chain with stores in major cities. Each store uses an on-premises application for sales transactions. At the end of each day, at 11 pm, data from every store should be uploaded to AWS storage; in total this will exceed 30 TB. The data should then be processed in Hadoop and the results stored in a data warehouse. Which combination of AWS services would you use?
- AWS Data Pipeline, Amazon S3, Amazon EMR, Amazon DynamoDB
- AWS Data Pipeline, Amazon Elastic Block Store, Amazon S3, Amazon EMR, Amazon Redshift
- AWS Data Pipeline, Amazon S3, Amazon EMR, Amazon Redshift
- AWS Data Pipeline, Amazon Kinesis, Amazon S3, Amazon EMR, Amazon Redshift, Amazon EC2
Which AWS analytics service gives you the ability to process nearly unlimited streams of data?
- Amazon Kinesis Streams
- Amazon Kinesis Firehose
- Amazon EMR
- Amazon Redshift
Which of the following are scenarios where Amazon QuickSight cannot be used?
- Highly formatted canned reports
- Quick interactive ad hoc exploration and optimized visualization of data. Create and share dashboards and KPIs to provide insight into your data
- Analyze and visualize data in various AWS resources, e.g., Amazon RDS databases, Amazon Redshift, Amazon Athena, and Amazon S3.
- Analyze and visualize data from on-premises databases like SQL Server, Oracle, PostgreSQL, and MySQL
- Analyze and visualize data in data sources that can be reached over a JDBC/ODBC connection.
Which of the following AWS services can you leverage to analyze logs for customer-facing applications and websites? Choose 2.
- Amazon S3
- Amazon Elasticsearch Service
- Amazon Athena
- Amazon CloudWatch
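As an illustration of log analysis with one of the listed services, the sketch below prepares an Amazon Athena query over web access logs stored in S3. It assumes boto3 is installed and credentials are configured; the database `weblogs`, the table `access_logs`, its `status` column, the results bucket, and both helper functions are hypothetical.

```python
def build_athena_query_args(database, table, output_location, limit=10):
    """Build keyword arguments for the Athena StartQueryExecution call.

    The query counts requests per HTTP status code; the table is assumed
    (hypothetically) to expose a `status` column.
    """
    query = (
        f"SELECT status, COUNT(*) AS hits FROM {table} "
        f"GROUP BY status ORDER BY hits DESC LIMIT {int(limit)}"
    )
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_log_query(database, table, output_location):
    """Submit the query to Athena (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without AWS access

    client = boto3.client("athena")
    return client.start_query_execution(
        **build_athena_query_args(database, table, output_location)
    )


# Hypothetical names; only the arguments are built here.
args = build_athena_query_args(
    "weblogs", "access_logs", "s3://example-bucket/athena-results/"
)
```

Athena writes the query results to the given S3 location, so no cluster or server has to be provisioned for this kind of ad hoc log analysis.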
Which AWS service would you use for data warehousing and analytics requirements?
- DynamoDB
- Aurora
- Redshift
- S3
Which AWS database service will you choose for Online Analytical Processing (OLAP)?
- Amazon RDS
- Amazon Redshift
- Amazon Glacier
- Amazon DynamoDB
Which AWS service reduces the complexity and upfront costs of setting up Hadoop by providing you with a fully managed, on-demand Hadoop framework?
- Amazon Redshift
- Amazon Kinesis
- Amazon EMR
- Amazon Hadoop
Which of the following use cases is not well suited for Amazon EMR?
- Log processing and analytics
- Large extract, transform, and load (ETL) data movement
- Ad targeting and click stream analytics
- Genomics, predictive analytics, ad hoc data mining and analytics
- Small data sets and ACID transaction requirements
- Risk modeling and threat analytics