Module 9: AWS analytics services Flashcards

Question 1

Q

What is Amazon Athena?

Answer

A

A service used to analyze data stored in S3. With Athena you can run standard SQL queries directly against data in S3, including data in formats such as JSON. It is serverless. It integrates with Amazon QuickSight to visualize your data. It can use KMS to encrypt and decrypt data. It stores query results in S3.
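
As a rough illustration of the card above, the sketch below builds a request for Athena's StartQueryExecution API and shows where the S3 results location goes. The database name, table name, and bucket are hypothetical placeholders, and the actual submission assumes boto3 is installed and AWS credentials are configured.

```python
def build_athena_query_request(sql, database, output_s3_uri):
    """Build keyword arguments for Athena's StartQueryExecution API."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes the query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_s3_uri},
    }


def run_athena_query(request):
    """Submit the query. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    client = boto3.client("athena")
    response = client.start_query_execution(**request)
    return response["QueryExecutionId"]


request = build_athena_query_request(
    sql="SELECT status, COUNT(*) AS hits FROM access_logs GROUP BY status",
    database="weblogs",  # hypothetical database name
    output_s3_uri="s3://example-athena-results/queries/",  # hypothetical bucket
)
```

Calling `run_athena_query(request)` would start the query and return its execution ID; the query runs asynchronously, so the results appear in the S3 output location once it finishes.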

Question 2

Q

What is Amazon CloudSearch?

Answer

A

A fully managed AWS service used to add search functionality to your website or application. It scales automatically and handles patching, so you do not have to manage the underlying infrastructure.

Question 3

Q

What is Amazon Elasticsearch Service?

Answer

A

A managed service that lets you search and analyze your data in real time. It has since been renamed Amazon OpenSearch Service.

Question 4

Q

What is Amazon EMR (Elastic MapReduce)?

Answer

A

A service used to analyze large amounts of data with frameworks such as Spark or Hadoop. It is integrated with CloudTrail.

Question 5

Q

What is Amazon Kinesis?

Answer

A

A service used to collect, process, and analyze real-time (streaming) data. It is used in the video game and movie industries and in IoT.

Question 6

Q

What is Amazon Kinesis Video Streams?

Answer

A

The Amazon Kinesis service for video: it collects, processes, and analyzes streaming video.

Question 7

Q

What is Amazon Kinesis Data Streams?

Answer

A

A serverless service used to collect streaming data.
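
To make "collecting streaming data" concrete, here is a minimal sketch of how a producer might prepare one record for the Kinesis PutRecord API. The stream name and event fields are hypothetical, and sending the record assumes boto3 is installed and AWS credentials are configured.

```python
import json


def build_kinesis_record(stream_name, event, partition_key):
    """Build keyword arguments for the Kinesis PutRecord API."""
    return {
        "StreamName": stream_name,
        # Kinesis carries an opaque blob of bytes, so serialize the event.
        "Data": json.dumps(event).encode("utf-8"),
        # Records that share a partition key are routed to the same shard.
        "PartitionKey": partition_key,
    }


def send_record(record):
    """Send the record. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    client = boto3.client("kinesis")
    return client.put_record(**record)


record = build_kinesis_record(
    stream_name="game-events",  # hypothetical stream name
    event={"player_id": "p-42", "action": "level_up"},  # hypothetical event
    partition_key="p-42",
)
```

Using the player ID as the partition key is one possible design choice: it keeps each player's events in order on a single shard.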

Question 8

Q

What is Kinesis Data Firehose?

Answer

A

A service used to load real-time data into destinations such as S3, Redshift, and Elasticsearch.

Question 9

Q

What is Kinesis Data Analytics?

Answer

A

A service used to analyze real-time data by running queries on it. It is also serverless.

Question 10

Q

What is Amazon QuickSight?

Answer

A

A business analytics tool. It provides data visualization and reporting tools, and it can also generate machine learning insights.

Question 11

Q

What is Amazon Redshift?

Answer

A

A fully managed data warehouse service used for OLAP applications (data reporting and analytics). It is also a relational database that you can query. Use Redshift Spectrum to run queries on data in S3.

Question 12

Q

What is AWS Data Pipeline?

Answer

A

A service used to move data between different stages. For example, it can pass data to an analysis phase and then pass the analyzed data to QuickSight for display.

Question 14

Q

What is AWS Glue?

Answer

A

ETL is the process of transferring data from a source database to a destination data warehouse. It has three sub-processes: E for Extract, T for Transform, and L for Load. The data is extracted from the source database, transformed into the required format, and then loaded into the destination data warehouse. The tools that perform these functions are called ETL tools.
AWS Glue is a serverless data integration and ETL service that makes it simple to discover, prepare, and combine data for data analysis, machine learning, and application development. To make the data integration process smoother, Glue offers both visual and code-based tools.

AWS Glue consists of three components: the AWS Glue Data Catalog, an ETL engine that automatically generates Python or Scala code, and a configurable scheduler that handles dependency resolution, job monitoring, and retries.

The Glue Data Catalog allows users to quickly locate and retrieve data. Customization, orchestration, and monitoring of complicated data streams are also available through the Glue service.
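
The three ETL steps described above can be sketched in plain Python. This is not AWS Glue code: it only illustrates extract, transform, and load under simple assumptions, with a hypothetical CSV string standing in for the source database and an in-memory SQLite database standing in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an export from a source database.
SOURCE_CSV = """order_id,amount,currency
1,10.50,usd
2,7.25,usd
3,,usd
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, then normalize types and casing."""
    return [
        (int(r["order_id"]), float(r["amount"]), r["currency"].upper())
        for r in rows
        if r["amount"]
    ]


def load(records, conn):
    """Load: write the cleaned records into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


warehouse = sqlite3.connect(":memory:")  # stand-in for the data warehouse
load(transform(extract(SOURCE_CSV)), warehouse)
row_count = warehouse.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
total = warehouse.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

Keeping the three steps as separate functions mirrors how an ETL job is usually reasoned about: each stage can be tested and replaced independently.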
