AI, Machine Learning, Analytics Technology and Services Flashcards

Question

What tools can be used for analytics after data is loaded by Kinesis Data Firehose?

Answer 1

Business intelligence tools can be used for analytics after data is loaded into its final destination by Kinesis Data Firehose.

Answer 2

Kinesis Data Firehose includes integrated monitoring with CloudWatch.

Answer 3

Kinesis Data Firehose has automatic error retries if something goes wrong.

Answer 4

No, Kinesis Data Firehose is not a data store. It only buffers incoming data briefly before delivering it to the destination.

Answer 5

A data lake is a large-scale, centralized repository that stores structured and unstructured data in its raw form, including streaming data.

Answer 6

You can use AWS Lambda to transform data in Kinesis Data Firehose.
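
As a rough illustration of such a transformation, here is a minimal sketch of a Lambda handler that follows the record format Firehose passes to a transformation function: each incoming record carries a `recordId` and base64-encoded `data`, and each returned record carries the same `recordId`, a `result`, and the re-encoded `data`. The `message` field and the uppercase rule are hypothetical, chosen only to show the shape of the code.

```python
import base64
import json


def lambda_handler(event, context):
    """Uppercase the (hypothetical) 'message' field of each JSON record."""
    output = []
    for record in event["records"]:
        try:
            # Firehose delivers each payload base64-encoded
            payload = json.loads(base64.b64decode(record["data"]))
            payload["message"] = str(payload.get("message", "")).upper()
            encoded = base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")
            ).decode("utf-8")
            output.append(
                {"recordId": record["recordId"], "result": "Ok", "data": encoded}
            )
        except (ValueError, AttributeError, TypeError):
            # Return the original data and mark the record as failed
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    return {"records": output}
```

Every `recordId` received is returned exactly once, so Firehose can match each result to its input record.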

Answer 7

Use cases include real-time analytics, feeding data into data lakes, log data management, and IoT data integration.

Answer 8

Common destinations include Amazon S3, Amazon Redshift, and Amazon OpenSearch Service.

Answer 9

Kinesis Data Streams captures and stores streaming data, whereas Kinesis Data Firehose captures, transforms, and loads data continuously into data stores.

Answer 10

Amazon Athena is an interactive query service that enables you to run standard SQL queries on data stored in Amazon S3.

Answer 11

You can run standard SQL queries with Amazon Athena.

Answer 12

Amazon Athena is serverless, meaning there is nothing to provision and manage.

Answer 13

You pay per query, based on the amount of data scanned (priced per terabyte), when using Amazon Athena.
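
The arithmetic behind scan-based pricing can be sketched as follows. The `price_per_tb` default is an illustrative placeholder, not a quoted price, and treating 1 TB as 1024**4 bytes is also an assumption; check current Athena pricing for your Region.

```python
def estimate_query_cost(bytes_scanned, price_per_tb=5.00):
    """Return an estimated cost in USD for one query.

    price_per_tb is a placeholder value used for illustration only.
    """
    terabytes = bytes_scanned / 1024**4  # assumes 1 TB = 1024**4 bytes
    return terabytes * price_per_tb


# Estimated cost of a query that scans 200 GB at the placeholder rate
print(round(estimate_query_cost(200 * 1024**3), 4))
```

Because cost follows the amount of data scanned, anything that reduces the scan (compression, partitioning, columnar formats) lowers the estimate.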

Answer 14

No, there is no need for complex extract, transform, and load (ETL) processes when using Amazon Athena. It works directly with data stored in S3.

Answer 15

Use cases for Amazon Athena include querying log files stored in S3, analyzing AWS cost and usage reports, generating business reports on data stored in S3, and running queries on clickstream data stored in S3.
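
To make the log-querying use case concrete, here is a small sketch of the kind of standard SQL involved. It runs locally against Python's built-in `sqlite3` as a stand-in, since Athena itself queries files in S3 and is not used here. The `access_logs` table and its rows are made up for illustration.

```python
import sqlite3

# In-memory database standing in for log data that would live in S3
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE access_logs (path TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO access_logs VALUES (?, ?)",
    [("/home", 200), ("/cart", 500), ("/home", 200), ("/cart", 200), ("/login", 404)],
)

# Count requests per HTTP status code, most frequent first
query = """
    SELECT status, COUNT(*) AS requests
    FROM access_logs
    GROUP BY status
    ORDER BY requests DESC, status
"""
rows = conn.execute(query).fetchall()
print(rows)
```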

Answer 16

AWS Glue is used to prepare your data for analytics and machine learning.

Answer 17

AWS Glue is important because it prepares and transforms data, making it ready for use by analytics applications and machine learning models.

Answer 18

The data catalog serves as the central repository containing metadata about the data, including its type and format.

Answer 19

Transformed data can be loaded into AWS services like RDS, Redshift, S3, or Athena.

Answer 20

AWS Glue can categorize data, clean it, remove duplicates, and join multiple datasets.

Answer 21

AWS Glue crawls your data and creates the data catalog, which is the central repository containing the metadata, such as the type or format of your data.

Answer 22

After creating the data catalog, AWS Glue can extract data from various sources, transform it (e.g., categorize, clean, remove duplicates, or join multiple datasets), and then load it into other AWS services.
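
The transform steps named above (clean, remove duplicates, join) can be sketched in plain Python. This is not the AWS Glue API; the function names, field names, and sample rows are hypothetical and only illustrate what each step does to the data.

```python
def clean(rows):
    """Normalize the email field and drop rows that have no email."""
    cleaned = []
    for row in rows:
        email = (row.get("email") or "").strip().lower()
        if email:
            cleaned.append({**row, "email": email})
    return cleaned


def remove_duplicates(rows, key):
    """Keep the first row seen for each value of `key`."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def join(left, right, key):
    """Inner-join two lists of dicts on `key` (last match in `right` wins)."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


customers = [
    {"email": " Ann@Example.com ", "name": "Ann"},
    {"email": "ann@example.com", "name": "Ann"},
    {"email": None, "name": "Unknown"},
]
orders = [{"email": "ann@example.com", "total": 42}]

result = join(remove_duplicates(clean(customers), "email"), orders, "email")
print(result)
```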

Answer 23

AWS Data Exchange allows you to securely exchange and use data provided by third parties on a subscription basis.

Answer 24

Data products are available from a variety of suppliers, including financial services, healthcare, weather, manufacturing, and telecommunications.

Answer 25

The data can be used for analytics, machine learning workloads, and decision-making.

Answer 26

An example use case is analyzing customer spending patterns based on geographic location using data products provided by companies like MasterCard, Experian, and Equifax.

Answer 27

Amazon EMR (Elastic MapReduce) is a big data platform provided by AWS that supports large-scale parallel data processing and petabyte-scale interactive analysis.

Answer 28

EMR supports structured data (e.g., financial transaction data), semi-structured data (e.g., text or documentation), and unstructured data (e.g., application logs or click-stream data).

Answer 29

One example of a use case for EMR is processing genomic data using statistical algorithms and predictive models to discover hidden patterns and find correlations.

Answer 30

EMR can analyze click-stream data to understand customer preferences or market trends.

Answer 31

EMR can extract data from sources like S3, DynamoDB, or Redshift.

Answer 32

EMR can be used to analyze events from streaming data sources in real time using Amazon Kinesis.

Answer 33

EMR supports popular open-source frameworks like Apache Spark, Apache Hive, Presto, and Hadoop.

Answer 34

With EMR, you do not have to worry about provisioning and managing infrastructure, configuring and managing open-source applications, or capacity planning, and the service scales dynamically as the workload requires. It is also optimized for performance and is claimed to be faster and less costly than deploying an on-premises big data solution.

Answer 35

AWS claims that EMR is less than 50% of the cost of deploying your own big data solution on-premises.

Answer 36

Amazon OpenSearch is a fully-managed service based on open-source Elasticsearch technology, compatible with Elasticsearch open-source APIs, Logstash for data collection and processing, and Kibana for search and data visualization.

Answer 37

Amazon OpenSearch is compatible with industry-standard Elasticsearch open-source APIs, Logstash, and Kibana.

Answer 38

A business might choose Amazon OpenSearch because it is a fully managed service that simplifies the use of open-source Elasticsearch technology while also supporting data collection, processing, and visualization tools like Logstash and Kibana. It is suitable for various analytics use cases, including log, application, security, and business data analytics.

Answer 39

You can ingest data into Amazon OpenSearch from AWS services such as CloudWatch Logs, S3, DynamoDB, and Firehose.

Answer 40

Logstash is used for data collection and processing in conjunction with Amazon OpenSearch.

Answer 41

Kibana is used with Amazon OpenSearch for search and data visualization.

Answer 42

Use cases for Amazon OpenSearch include log analytics, application monitoring, security analytics, and business data analytics.

Answer 43

Amazon OpenSearch is a fully-managed service that is based on open-source Elasticsearch technology and is compatible with Elasticsearch open-source APIs.

Answer 44

Using Amazon OpenSearch, you can perform log analytics, application monitoring, security analytics, and business data analytics.

Answer 45

Yes, you can use Amazon OpenSearch with Amazon CloudWatch Logs by ingesting data from CloudWatch Logs into Amazon OpenSearch.

Answer 46

AWS Data Exchange

Answer 47

Amazon Comprehend

Answer 48

Kinesis Data Firehose

Answer 49

Amazon MSK (Managed Streaming for Apache Kafka)

Answer 50

Kinesis enables you to collect, process, and analyze streaming data in real time.

Answer 51

Amazon Textract

Answer 52

Athena is an interactive query service for data in S3. It enables you to query data stored in S3 using standard SQL.

Answer 53

Amazon EMR (Elastic MapReduce)

Answer 54

Amazon CloudWatch

Answer 55

Trusted Advisor

Answer 56

AppStream will handle hosting, scaling, and user management for your application and help you convert it into a SaaS product for your employees or customers.

Answer 57

Generate insights and recommendations to help you adhere to the Well-Architected Framework.

Answer 58

Indefinitely

Answer 59

The Well-Architected Tool helps you use the Well-Architected Framework as a set of lenses through which to analyze your workloads. You can use it to learn about the Well-Architected Framework and generate action plans to bring your architectures into alignment with it.

Answer 60

AWS Config allows you to set up account-wide rules and detect non-compliant resources.

Answer 61

AWS Health Dashboard will give you a view of all outages across AWS, as well as a personal dashboard that displays only those services and Regions that are relevant to your cloud resources.

Answer 62

CloudWatch alarms can be used to send notifications or trigger automated events when metrics reach defined thresholds.
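
The threshold idea can be sketched as a simplified model in plain Python. This is not the CloudWatch API; the function, its parameters, and the sample CPU values are hypothetical, and the rule shown (every datapoint in the most recent periods must exceed the threshold) is one simple way an alarm could be evaluated.

```python
def alarm_state(datapoints, threshold, periods):
    """Return a state string based on the most recent `periods` datapoints."""
    if periods < 1:
        raise ValueError("periods must be at least 1")
    recent = datapoints[-periods:]
    if len(recent) < periods:
        return "INSUFFICIENT_DATA"
    # Alarm only when every recent datapoint breaches the threshold
    return "ALARM" if all(value > threshold for value in recent) else "OK"


def notify_if_alarm(state, send):
    """Call the `send` callback (e.g. a notifier) when the state is ALARM."""
    if state == "ALARM":
        send("Threshold breached")


cpu_percent = [55, 82, 91, 95]
messages = []
notify_if_alarm(alarm_state(cpu_percent, threshold=80, periods=3), messages.append)
print(messages)
```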

(90 cards)