Module 9: AWS analytics services Flashcards
What is Amazon Athena?
Is a serverless service used to analyze data stored in S3. With Athena you can run standard SQL queries directly against S3 data in formats such as CSV and JSON. It integrates with Amazon QuickSight to visualize your data, uses KMS to encrypt and decrypt data, and stores query results in S3.
What is Amazon CloudSearch?
Is a fully managed AWS service used when you want to add search functionality to your website or application. It scales automatically, and AWS handles patching and infrastructure, so there is very little to configure.
What is Amazon Elasticsearch Service?
Is a managed service (since renamed Amazon OpenSearch Service) that allows you to search and analyze your data in near real time. AWS takes care of cluster management tasks such as provisioning and patching.
What is Amazon EMR (Elastic MapReduce)?
Is a service used to analyze large amounts of data with frameworks such as Spark or Hadoop. It is integrated with AWS CloudTrail.
What is Amazon Kinesis?
Is a service used to collect, process, and analyze real-time (streaming) data. It is used in the video game and movie industries and in IoT.
What is Amazon Kinesis Video Streams?
Is the Kinesis service for video: it ingests streaming video from devices so it can be stored, played back, and analyzed.
What is Amazon Kinesis Data Streams?
Is a serverless service used to collect streaming data.
What is Kinesis Data Firehose?
Is a service used to deliver real-time streaming data to destinations such as S3, Redshift, and Amazon Elasticsearch Service.
What is Kinesis Data Analytics?
Is a service used to analyze real-time data and run queries on it. It is also serverless.
What is Amazon QuickSight?
Is a business analytics service. It provides data visualization and reporting tools, and it can also generate machine learning insights.
What is Amazon Redshift?
Is a fully managed data warehouse service used for OLAP workloads (reporting and analytics). It is also a relational database that you can query with SQL. Use Redshift Spectrum to run queries on data in S3.
What is AWS Data Pipeline?
Is a service used to move data between different stages of processing. For example, it can pass data to an analysis phase and then pass the analyzed data to QuickSight to be displayed.
What is AWS Glue?
ETL is the process of transferring data from a source database to a destination data warehouse. It has three sub-processes: E for Extract, T for Transform, and L for Load. The data is extracted from the source database, transformed into the required format, and then loaded into the destination data warehouse. The tools that perform these functions are called ETL tools.
AWS Glue is a serverless data integration and ETL service that makes it simple to discover, prepare, and combine data for analytics, machine learning, and application development. To make data integration smoother, Glue offers both visual and code-based tools.
AWS Glue consists of three components: the AWS Glue Data Catalog, an ETL engine that automatically generates Python or Scala code, and a configurable scheduler that handles dependency resolution, job monitoring, and retries.
The Glue Data Catalog allows users to quickly locate and retrieve data. The Glue service also supports customization, orchestration, and monitoring of complex data flows.
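The three ETL steps described in this card can be sketched in plain Python. This is only an illustration of the extract, transform, load pattern, not how AWS Glue is implemented: it uses the standard library (csv and an in-memory SQLite database standing in for the warehouse), and the sample data, table name, and function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real job would read from a source database.
SOURCE_CSV = """order_id,amount,currency
1,10.50,usd
2,3.00,usd
3,,usd
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, convert types, uppercase currency."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), float(row["amount"]), row["currency"].upper())
        )
    return cleaned


def load(rows, conn):
    """Load: write the transformed rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text, conn):
    """Run the three steps in order."""
    load(transform(extract(text)), conn)


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")  # stand-in for a data warehouse
    run_etl(SOURCE_CSV, warehouse)
    print(warehouse.execute("SELECT * FROM orders").fetchall())
```

Keeping each step in its own function mirrors the E, T, L split: the transform logic can change without touching how data is read or written.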