Analytics Flashcards

1
Q

Amazon Athena

A

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using
standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the
queries that you run.
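
As a rough illustration of "analyze data in Amazon S3 using standard SQL", the sketch below builds the parameters for Athena's StartQueryExecution API and, optionally, submits them through boto3. The table, database, and bucket names are hypothetical, and `run_query` assumes boto3 is installed and AWS credentials are configured, so it is defined but not called here.

```python
def build_athena_request(sql, database, output_s3_uri):
    """Return keyword arguments for Athena's StartQueryExecution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_s3_uri},
    }


def run_query(sql, database, output_s3_uri):
    """Submit the query. Needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the sketch loads without the SDK installed

    client = boto3.client("athena")
    response = client.start_query_execution(
        **build_athena_request(sql, database, output_s3_uri)
    )
    return response["QueryExecutionId"]


# Hypothetical table, database, and bucket names, for illustration only.
EXAMPLE_SQL = "SELECT status, COUNT(*) AS hits FROM access_logs GROUP BY status"
request = build_athena_request(
    EXAMPLE_SQL, "weblogs", "s3://example-bucket/athena-results/"
)
```

Athena writes query results to the S3 location given in `OutputLocation`, which is why the request carries an output URI even for a plain SELECT.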

2
Q

Amazon EMR

A

Amazon EMR provides a managed Hadoop framework that makes it easy, fast, and cost-effective to
process vast amounts of data across dynamically scalable Amazon EC2 instances. You can also run
other popular distributed frameworks such as Apache Spark, HBase, Presto, and Flink in Amazon EMR, and interact with data in other AWS data stores such as Amazon S3 and Amazon DynamoDB.

3
Q

Amazon CloudSearch

A

Amazon CloudSearch is a managed service in the AWS Cloud that makes it simple and cost-effective to set up, manage, and scale a search solution for your website or application. Amazon CloudSearch supports 34 languages and popular search features such as highlighting, autocomplete, and geospatial search.

4
Q

Amazon Elasticsearch Service

A

Amazon Elasticsearch Service makes it easy to deploy, secure, operate, and scale Elasticsearch to search,
analyze, and visualize data in real time. With Amazon Elasticsearch Service, you get easy-to-use APIs and real-time analytics capabilities to power use cases such as log analytics, full-text search, application
monitoring, and clickstream analytics, with enterprise-grade availability, scalability, and security. The service offers integrations with open-source tools like Kibana and Logstash for data ingestion and visualization. It also integrates seamlessly with other AWS services such as Amazon Virtual Private Cloud (Amazon VPC), AWS Key Management Service (AWS KMS), Amazon Kinesis Data Firehose, AWS Lambda, AWS Identity and Access Management (IAM), Amazon Cognito, and Amazon CloudWatch, so that you can go from raw data to actionable insights quickly.

5
Q

Amazon Kinesis

A

Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. Amazon Kinesis offers key capabilities to cost-effectively process streaming data at any scale, along with the flexibility to choose the tools that best suit the requirements of your application. With Amazon Kinesis, you can ingest real-time data such as video, audio, application logs, website clickstreams, and IoT telemetry data for machine learning, analytics, and other applications. Amazon Kinesis enables you to process and analyze data as it arrives and respond instantly instead of having to wait until all your data is collected before the processing can begin.

Amazon Kinesis currently offers four services: Kinesis Data Firehose, Kinesis Data Analytics, Kinesis Data
Streams, and Kinesis Video Streams.

6
Q

Amazon Kinesis Data Firehose

A

Amazon Kinesis Data Firehose is the easiest way to reliably load streaming data into data stores and analytics tools. It can capture, transform, and load streaming data into Amazon S3, Amazon Redshift, Amazon
Elasticsearch Service, and Splunk, enabling near real-time analytics with existing business intelligence
tools and dashboards you’re already using today. It is a fully managed service that automatically scales to match the throughput of your data and requires no ongoing administration. It can also batch, compress, transform, and encrypt the data before loading it, minimizing the amount of storage used at the destination and increasing security.
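
To make the "batch and compress before loading" idea concrete, here is a small local sketch. It is not how the service is implemented: it joins a set of hypothetical clickstream events into one newline-delimited JSON batch and gzips it with Python's standard library, to show why compressing batched records reduces storage at the destination.

```python
import gzip
import json


def batch_and_compress(events):
    """Join events into one newline-delimited JSON batch and gzip it.

    Returns (raw_bytes, compressed_bytes) so the two sizes can be compared.
    """
    raw = "".join(json.dumps(event) + "\n" for event in events).encode("utf-8")
    return raw, gzip.compress(raw)


# Hypothetical clickstream events, for illustration only.
events = [{"user_id": i % 10, "page": "/home", "status": 200} for i in range(1000)]
raw, compressed = batch_and_compress(events)

print(f"raw: {len(raw)} bytes, gzip: {len(compressed)} bytes")
```

Repetitive records such as logs and clickstreams tend to compress well, which is where the storage saving comes from.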

7
Q

Amazon Kinesis Data Analytics

A

Amazon Kinesis Data Analytics is the easiest way to analyze streaming data, gain actionable insights, and respond to your business and customer needs in real time. Amazon Kinesis Data Analytics reduces the complexity of building, managing, and integrating streaming applications with other AWS services. SQL users can easily query streaming data or build entire streaming applications using templates and an interactive SQL editor. Java developers can quickly build sophisticated streaming applications using
open-source Java libraries and AWS integrations to transform and analyze data in real time.

8
Q

Amazon Kinesis Data Streams

A

Amazon Kinesis Data Streams (KDS) is a massively scalable and durable real-time data streaming service.
KDS can continuously capture gigabytes of data per second from hundreds of thousands of sources such
as website clickstreams, database event streams, financial transactions, social media feeds, IT logs, and
location-tracking events. The data collected is available in milliseconds to enable real-time analytics use
cases such as real-time dashboards, real-time anomaly detection, dynamic pricing, and more.
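
A minimal sketch of what a producer might send to a stream, assuming a hypothetical stream name and event shape. It builds the parameters for the Kinesis PutRecord API (`StreamName`, `Data`, `PartitionKey`). `send` assumes boto3 is installed and AWS credentials are configured, so it is defined but not called here.

```python
import json


def build_put_record(stream_name, event, partition_key_field):
    """Return keyword arguments for the Kinesis PutRecord call.

    The event is serialized to JSON bytes; the chosen field becomes the
    partition key, so records with the same key go to the same shard.
    """
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event).encode("utf-8"),
        "PartitionKey": str(event[partition_key_field]),
    }


def send(stream_name, event, partition_key_field):
    """Send one record. Needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the sketch loads without the SDK installed

    client = boto3.client("kinesis")
    return client.put_record(
        **build_put_record(stream_name, event, partition_key_field)
    )


# Hypothetical stream name and clickstream event, for illustration only.
click = {"user_id": 42, "page": "/checkout", "ts": "2020-01-01T00:00:00Z"}
record = build_put_record("example-clickstream", click, "user_id")
```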

9
Q

Amazon Kinesis Video Streams

A

Amazon Kinesis Video Streams makes it easy to securely stream video from connected devices to AWS
for analytics, machine learning (ML), playback, and other processing. Kinesis Video Streams automatically
provisions and elastically scales all the infrastructure needed to ingest streaming video data from millions of devices. It also durably stores, encrypts, and indexes video data in your streams, and allows you to access your data through easy-to-use APIs. Kinesis Video Streams enables you to play back video for live and on-demand viewing, and quickly build applications that take advantage of computer vision and video analytics through integration with Amazon Rekognition Video, and libraries for ML frameworks such as Apache MXNet, TensorFlow, and OpenCV.

10
Q

Amazon Redshift

A

Amazon Redshift is a fast, scalable data warehouse that makes it simple and cost-effective to analyze all your data across your data warehouse and data lake. Redshift delivers ten times faster performance than other data warehouses by using machine learning, massively parallel query execution, and columnar
storage on high-performance disk. You can set up and deploy a new data warehouse in minutes, and run
queries across petabytes of data in your Redshift data warehouse, and exabytes of data in your data lake built on Amazon S3. You can start small for just $0.25 per hour and scale to $250 per terabyte per year,
less than one-tenth the cost of other solutions.

11
Q

Amazon QuickSight

A

Amazon QuickSight is a fast, cloud-powered business intelligence (BI) service that makes it easy for you to deliver insights to everyone in your organization. QuickSight lets you create and publish interactive
dashboards that can be accessed from browsers or mobile devices. You can embed dashboards into your
applications, providing your customers with powerful self-service analytics. QuickSight easily scales to tens of thousands of users without any software to install, servers to deploy, or infrastructure to manage.

12
Q

AWS Data Pipeline

A

AWS Data Pipeline is a web service that helps you reliably process and move data between different
AWS compute and storage services, as well as on-premises data sources, at specified intervals. With AWS Data Pipeline, you can regularly access your data where it’s stored, transform and process it at scale, and
efficiently transfer the results to AWS services such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon EMR.

13
Q

AWS Glue

A

AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers
to prepare and load their data for analytics. You can create and run an ETL job with a few clicks in the AWS Management Console. You simply point AWS Glue to your data stored on AWS, and AWS Glue discovers your data and stores the associated metadata (e.g. table definition and schema) in the AWS Glue Data Catalog. Once cataloged, your data is immediately searchable, queryable, and available for ETL.

14
Q

AWS Lake Formation

A

AWS Lake Formation is a service that makes it easy to set up a secure data lake in days. A data lake is a centralized, curated, and secured repository that stores all your data, both in its original form and
prepared for analysis. A data lake enables you to break down data silos and combine different types of
analytics to gain insights and guide better business decisions.

15
Q

Amazon Managed Streaming for Apache Kafka

Amazon MSK

A

Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that makes it easy for you to build and run applications that use Apache Kafka to process streaming data. Apache
Kafka is an open-source platform for building real-time streaming data pipelines and applications. With Amazon MSK, you can use Apache Kafka APIs to populate data lakes, stream changes to and from
databases, and power machine learning and analytics applications.
