Past paper 3 Flashcards
What is Apache Spark?
a distributed processing framework and programming model for machine learning, stream processing, and graph analytics; on AWS it can run on Amazon EMR clusters
What is Amazon EMR?
a web service that makes it easy for you to process and analyze vast amounts of data using applications in the Hadoop ecosystem
What could you use to transform data into RecordIO-Protobuf format?
Apache Spark
What is AWS Glue?
serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development
Can AWS Glue transform data into RecordIO-Protobuf format?
No, it cannot
What is AWS Step Functions?
a low-code visual workflow service used to orchestrate AWS services, automate business processes and build serverless applications
What is Lambda not suited for?
Long-running processes, such as transforming large datasets (a single Lambda invocation can run for at most 15 minutes)
What is Kinesis Firehose used for?
capture, transform, and load streaming data into Amazon repositories
Which Amazon repositories can Kinesis Firehose load streaming data into?
Amazon S3, Amazon Redshift, Amazon Elasticsearch Service (now Amazon OpenSearch Service), and Splunk
What type of processing should Kinesis Firehose not be used for?
Batch processing
What does a VPC endpoint allow connections between?
a virtual private cloud (VPC) and supported services, such as SageMaker, without requiring that you use an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.
What is an interface endpoint?
an elastic network interface with a private IP address from the IP address range of your subnet
What traffic does an interface endpoint serve?
traffic destined for a service that is owned by AWS or owned by an AWS customer or partner.
What is a gateway endpoint used for?
traffic destined for either Amazon S3 or DynamoDB
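The endpoint cards above can be summed up in a short sketch. It is only an illustration, not an AWS-provided helper: the function name `endpoint_request` and the example IDs are hypothetical, and the dictionary keys are assumed to mirror the parameters of the EC2 CreateVpcEndpoint API. It follows the flashcards' simplification that S3 and DynamoDB use gateway endpoints and every other service uses an interface endpoint (S3 can also be reached through an interface endpoint, which this sketch ignores).

```python
# Services the flashcards associate with gateway endpoints.
# Simplification: every other service is treated as interface-only.
GATEWAY_SERVICES = {"s3", "dynamodb"}


def endpoint_request(service, region, vpc_id, subnet_ids=(), route_table_ids=()):
    """Build keyword arguments shaped like an EC2 CreateVpcEndpoint request.

    Hypothetical helper: it only assembles a dictionary and never calls AWS.
    """
    params = {
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.{service}",
    }
    if service in GATEWAY_SERVICES:
        # Gateway endpoints are targets in route tables.
        params["VpcEndpointType"] = "Gateway"
        params["RouteTableIds"] = list(route_table_ids)
    else:
        # Interface endpoints place a network interface in each given subnet.
        params["VpcEndpointType"] = "Interface"
        params["SubnetIds"] = list(subnet_ids)
    return params


if __name__ == "__main__":
    # Placeholder IDs, for illustration only.
    print(endpoint_request("s3", "us-east-1", "vpc-123", route_table_ids=["rtb-1"]))
    print(endpoint_request("sagemaker.api", "us-east-1", "vpc-123", subnet_ids=["subnet-1"]))
```

With boto3 installed and credentials configured, a dictionary like this could in principle be passed as `**params` to an EC2 client's `create_vpc_endpoint` call, though that step is left out here so the sketch runs without an AWS account.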
What is the AWS Key Management Service used for?
to create and control the cryptographic keys used to protect your data