AWS Big Data, Serverless, Security, Automation Flashcards
What are the 3 V’s of big data?
Volume
Variety
Velocity
What is Redshift?
Redshift is a relational data warehousing service for BI applications that can store up to 16 petabytes of data. By default it is not highly available and runs in a single AZ.
What is ETL?
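Extract, Transform, Load is a data processing pipeline that extracts data from its sources, transforms it, and loads it into a target store such as a data warehouse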
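The three ETL stages can be sketched in plain Python. This is only an illustration under assumed inputs: the CSV text, the `user`/`amount` columns, and the dict standing in for the warehouse are all hypothetical, and no AWS service is involved.

```python
import csv
import io

# Hypothetical raw input; a real pipeline would read from S3, a database, etc.
RAW_CSV = """user,amount
alice,10.50
bob,not-a-number
alice,4.50
"""

def extract(text):
    """Extract: parse CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with invalid amounts and total the rest per user."""
    totals = {}
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip malformed records
        totals[row["user"]] = totals.get(row["user"], 0.0) + amount
    return totals

def load(totals, target):
    """Load: write the aggregated totals into the target store (a dict here)."""
    target.update(totals)
    return target

warehouse = {}
load(transform(extract(RAW_CSV)), warehouse)
print(warehouse)
```

Keeping each stage as its own function mirrors how managed ETL services let you swap sources and targets without touching the transformation logic.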
What is EMR?
Elastic MapReduce is an Amazon service for launching EC2 clusters for ETL processing using open-source ETL engines
What is Amazon Kinesis?
Kinesis is a service for streaming data in real time or near real time.
What is the difference between Kinesis Firehose and Kinesis Streams?
Kinesis Firehose is fully managed, scales automatically, and is easier to configure, but it is slower and only allows preconfigured consumers such as S3. Kinesis Streams is real time but requires you to scale the streams manually and develop your own consumers using the Kinesis SDK
What is Kinesis Analytics?
Kinesis Analytics allows you to transform your data as it passes through the stream using SQL
What is Amazon Athena?
It is a serverless SQL solution that allows you to query data stored in S3, e.g. logs or BI application data
What is AWS Glue?
Glue is a serverless ETL service that allows you to process your data without having to worry about EC2 instances and third-party software, unlike Elastic MapReduce
What is Amazon QuickSight?
Amazon QuickSight is a fully managed business intelligence (BI) data visualization service
What is AWS Data Pipeline?
A managed ETL service that automates movements and transformations of your data
What storage integrations does Data Pipeline support?
DynamoDB
RDS
Redshift
S3
What are four key features of AWS Data Pipeline?
Integrates with EC2 and EMR
Integrates with SNS
Data-driven workflows
Automatic retries
What is Amazon MSK?
Amazon Managed Streaming for Apache Kafka is a fully managed service for building and running Apache Kafka data streaming applications
What is Amazon OpenSearch?
It is a managed service that allows you to run open-source search and analytics engines for various use cases. It is the successor to Amazon Elasticsearch Service.
What is Lambda?
Lambda is a serverless compute service
What are the limitations of Lambda?
10 GB RAM
Max 15 minutes execution time
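The two limits above can be expressed as a small pre-deployment check. This is a minimal sketch: `check_lambda_limits` is a hypothetical helper, not part of any AWS SDK, and it assumes the limits map to 10,240 MB of memory and 900 seconds of timeout.

```python
MAX_MEMORY_MB = 10240      # 10 GB, the RAM ceiling from the card above
MAX_TIMEOUT_SECONDS = 900  # 15 minutes, the execution-time ceiling

def check_lambda_limits(memory_mb, timeout_seconds):
    """Return a list of human-readable problems; an empty list means the config fits."""
    problems = []
    if memory_mb > MAX_MEMORY_MB:
        problems.append(f"memory {memory_mb} MB exceeds {MAX_MEMORY_MB} MB")
    if timeout_seconds > MAX_TIMEOUT_SECONDS:
        problems.append(f"timeout {timeout_seconds} s exceeds {MAX_TIMEOUT_SECONDS} s")
    return problems

print(check_lambda_limits(512, 60))      # a config within both limits
print(check_lambda_limits(20480, 1200))  # a config that breaks both limits
```

A workload that fails this check is usually a candidate for a container or cluster service rather than Lambda.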
What are five configuration aspects that are vital for Lambda?
- Runtime
- Permissions (Defining access to other resources)
- Networking (Accessing Endpoint for other resources)
- Resources (CPU, RAM, Max execution time)
- Triggers
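To make the runtime and trigger aspects concrete, here is a minimal sketch of a Python handler invoked by a Kinesis trigger. The handler name `lambda_handler` and the JSON payload are assumptions for illustration; the sample event is hand-built and only mimics the base64-encoded `Records[].kinesis.data` field that Kinesis events carry.

```python
import base64
import json

def lambda_handler(event, context):
    """Entry point Lambda calls per invocation; the name is set in the function config."""
    payloads = []
    for record in event.get("Records", []):
        # Kinesis delivers each record's data base64-encoded
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(json.loads(raw))  # assumes producers send JSON
    return {"processed": len(payloads), "payloads": payloads}

# Local smoke test with a hand-built event; context is unused, so None is fine here
encoded = base64.b64encode(json.dumps({"id": 1}).encode()).decode()
sample_event = {"Records": [{"kinesis": {"data": encoded}}]}
result = lambda_handler(sample_event, None)
print(result)
```

Permissions, networking, and resources are set outside the code in the function configuration, which is why the handler itself stays this small.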
What is AWS Serverless Application Repository?
It is a repository solution for serverless applications which allows you to publish or deploy public or private serverless applications that use Lambda compute
What is an AWS SAM Template?
An AWS Serverless Application Model (SAM) template is used to define a serverless application stack. Applications published from it to the Serverless Application Repository are private by default
What is ECR?
Elastic Container Registry is a managed repository for storing your OCI and Docker container images, and it integrates with ECS and EKS
What is ECS?
Elastic Container Service is a fully managed AWS service that allows you to run and orchestrate large numbers of containers.
What is EKS?
Elastic Kubernetes Service allows you to run Kubernetes in the AWS cloud
What is the difference between EKS and ECS?
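ECS is proprietary to Amazon, so it can't be run on-premises without AWS Outposts or ECS Anywhere, but it provides quicker and easier integration with AWS services. EKS runs standard open-source Kubernetes, so workloads are portable to other Kubernetes environments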