Analytics Flashcards
A ______ _______ is a data storage solution that aggregates massive amounts of historical data from disparate sources
Data Warehouse
Data Warehouses are not used for transaction processing. They are primarily used for _____ and _____
reporting, analytics
________ is AWS’s data warehousing solution. It improves querying speed and efficiency, and handles _____-____ data
Redshift, exabyte-scale
______ is a query service that allows you to analyze data in ___, as if it were _____ data, using standard ___
Athena, S3, relational, SQL
____ prepares your data for analytics, and is an ___ service that generates ___ code
Glue, ETL, ETL (ETL = Extract, Transform, and Load)
______ allows you to analyze data and video streams in real time, and also helps you analyze ___ and ____ _____ in near real time for application monitoring or fraud detection
Kinesis, logs, video streams
____ is used for processing large amounts of data, and simplifies running big data frameworks like _______ and ______
EMR (Elastic MapReduce), Hadoop, Apache Spark
____ ____ helps you move data between compute and storage services running either on AWS or on-premises
Data Pipeline
________ helps you visualize your data by building interactive ________ that can be embedded in your applications
QuickSight, dashboards
Data use case:
Search data in S3 as if it were ______
Athena, relational