Missed Practice Exam Questions Flashcards
What are some tool options for public online interactive data visualization at low cost?
Highcharts and D3.js
What is the data capacity limit in Aurora?
64 TB
What are some ways that HBase can integrate with S3?
Read replicas on S3, HBase StoreFiles and metadata stored on S3, and snapshots of HBase data on S3
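As a rough illustration of the "StoreFiles on S3" and "read replica" options, the sketch below builds the EMR configuration classifications that switch HBase to S3 storage mode. The function name and the bucket path are hypothetical; the property names (`hbase.emr.storageMode`, `hbase.rootdir`, `hbase.emr.readreplica.enabled`) are the ones EMR documents for HBase on S3, but check them against the EMR release you use.

```python
def hbase_on_s3_configurations(root_dir: str, read_replica: bool = False) -> list:
    """Build EMR configuration classifications for HBase on S3.

    root_dir is a hypothetical S3 location such as "s3://example-bucket/hbase-root".
    """
    hbase_props = {"hbase.emr.storageMode": "s3"}
    if read_replica:
        # A read-replica cluster points at the same root directory as the primary.
        hbase_props["hbase.emr.readreplica.enabled"] = "true"
    return [
        {"Classification": "hbase", "Properties": hbase_props},
        {"Classification": "hbase-site", "Properties": {"hbase.rootdir": root_dir}},
    ]
```

The returned list is in the shape EMR expects for the `Configurations` parameter when a cluster is created.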
What needs to be done to ensure that all Redshift COPY and UNLOAD traffic is captured in VPC flow logs?
Enable Enhanced VPC Routing (forces all COPY and UNLOAD traffic through the VPC)
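A minimal sketch of turning Enhanced VPC Routing on for an existing cluster with boto3. The helper names and the cluster identifier are hypothetical; `modify_cluster` and its `EnhancedVpcRouting` parameter are part of the boto3 Redshift client. The SDK is imported lazily so the request builder can be inspected without AWS credentials.

```python
def enhanced_vpc_routing_request(cluster_id: str) -> dict:
    """Build the keyword arguments for Redshift's ModifyCluster call."""
    return {"ClusterIdentifier": cluster_id, "EnhancedVpcRouting": True}


def enable_enhanced_vpc_routing(cluster_id: str) -> dict:
    """Apply the change; requires boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder above works without the SDK

    client = boto3.client("redshift")
    return client.modify_cluster(**enhanced_vpc_routing_request(cluster_id))
```

VPC flow logs still have to be enabled on the VPC or subnet itself; this setting only routes the COPY and UNLOAD traffic through the VPC.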
What visualization type should be used when there is multi-dimensional data that needs to be analyzed for outliers and trends?
Heatmap
What should be done if you are using EMR with S3 data and you encounter consistency issues?
Enable EMRFS consistent view
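EMRFS exposes its consistency feature as the "consistent view" setting in the `emrfs-site` classification. The sketch below builds that classification; the function name is hypothetical, and it assumes an EMR release that still supports consistent view.

```python
from typing import Optional


def emrfs_consistent_view_configuration(metadata_table: Optional[str] = None) -> dict:
    """Build the emrfs-site classification that enables EMRFS consistent view."""
    props = {"fs.s3.consistent": "true"}
    if metadata_table is not None:
        # Optional: name of the DynamoDB table EMRFS uses to track S3 objects.
        props["fs.s3.consistent.metadata.tableName"] = metadata_table
    return {"Classification": "emrfs-site", "Properties": props}
```

The resulting dictionary goes into the cluster's `Configurations` list at creation time.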
How should you implement a real-time, multi-AZ replica of a Redshift cluster?
Spin up separate Redshift clusters in different AZs, use Kinesis Streams to write data to every cluster simultaneously, and use Route 53 to route users to the nearest cluster
What two methods can be used to access Kibana from outside when it is deployed within a VPC?
Set up an SSH tunnel with port forwarding to allow access on port 5601, or set up a reverse proxy server between your browser and Elasticsearch Service
What security option should be employed on an S3 bucket when you need to restrict user access at a file level?
SSE-KMS
What is the maximum buffer time in Kinesis Firehose?
15 minutes (900 seconds); the default is 5 minutes (300 seconds)
Does Glue integrate with Elasticsearch?
No
Which services should be avoided if requirements demand low maintenance?
Kinesis Data Streams, EMR
Which IoT authentication protocol is most popular with mobile devices?
Cognito
What are best practices for loading large amounts of data between S3 and Redshift regularly?
Split the data into multiple files of 1 to 124 MB each, compress them with GZIP, load them with a single COPY command, and load the data in sort key order
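The loading practices in this answer can be sketched as follows: rows that are already in sort key order are written to several gzip-compressed part files in contiguous chunks, and one COPY statement with a shared key prefix loads them all. The function names, table name, S3 prefix, and IAM role are hypothetical, and uploading the parts to S3 is left out.

```python
import gzip
from pathlib import Path


def write_gzip_parts(rows, out_dir, prefix, num_parts):
    """Write rows to up to num_parts gzip files, keeping the original row order."""
    rows = list(rows)
    chunk = -(-len(rows) // num_parts)  # ceiling division
    paths = []
    for i in range(num_parts):
        part = rows[i * chunk:(i + 1) * chunk]
        if not part:
            break
        path = Path(out_dir) / f"{prefix}.{i:04d}.gz"
        with gzip.open(path, "wt", encoding="utf-8") as fh:
            fh.writelines(line + "\n" for line in part)
        paths.append(path)
    return paths


def copy_statement(table, s3_prefix, iam_role):
    """Build one COPY statement that loads every file under the shared prefix."""
    return f"COPY {table} FROM '{s3_prefix}' IAM_ROLE '{iam_role}' GZIP;"
```

Because every part shares one key prefix, a single COPY lets Redshift load the files in parallel across slices rather than one statement per file.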
What tool should be used for integrating data between relational databases and EMR?
Sqoop
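For a sense of what a Sqoop transfer looks like, the sketch below assembles a `sqoop import` command line as an argument list that could be passed to `subprocess.run` on an EMR master node. The function name and all connection details are hypothetical; the flags (`--connect`, `--username`, `--password-file`, `--table`, `--target-dir`, `--num-mappers`) are standard Sqoop import options.

```python
def sqoop_import_command(jdbc_url, username, password_file, table, target_dir, num_mappers=4):
    """Build the argument list for importing one relational table with Sqoop."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,  # keeps the password off the command line
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]
```

Building the command as a list rather than a single string avoids shell quoting problems when the JDBC URL contains special characters.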