Amazon Web Services Cloud Platform Flashcards
AWS Management Console
access and manage Amazon Web Services through a simple and intuitive user interface
AWS Console Mobile Application - quickly view resources on the go
AWS Command Line Interface
unified tool to manage your AWS services
one tool to download and configure - then control multiple AWS services from the command line and automate them through scripts
software development kits (SDKs)
simplify using AWS services in your applications with an application programming interface (API) tailored to your programming language or platform
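As a rough illustration of what a language-tailored API looks like, here is a minimal Python sketch. It assumes the AWS SDK for Python (boto3) and its S3 client method `list_buckets()`; the helper name `list_bucket_names` is made up for this card. The client is passed in as a parameter so the logic can be tried without AWS credentials.

```python
def list_bucket_names(s3_client):
    """Return the names of all S3 buckets visible to the given client."""
    # list_buckets() returns a dict whose "Buckets" entry is a list of
    # dicts, each carrying at least a "Name" key.
    response = s3_client.list_buckets()
    return [bucket["Name"] for bucket in response.get("Buckets", [])]


# Typical use with the real SDK (needs boto3 installed and AWS credentials):
#   import boto3
#   print(list_bucket_names(boto3.client("s3")))
```

The same call made from the AWS CLI would be a shell command; the SDK version returns ordinary Python lists and dicts instead.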
Amazon Athena
interactive query service - makes it easy to analyse data in Amazon S3 using standard SQL
serverless - you pay only for the queries you run
how to use Amazon Athena
- point to your data in Amazon S3
- define schema.
- start querying using standard SQL.
most results are delivered within seconds
no need for extract, transform, load (ETL) jobs to prepare data for analysis
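The three steps above can be sketched in Python. This is only an illustration under stated assumptions: it uses the boto3 Athena client method `start_query_execution`, comma-delimited text files, and made-up names (`weblogs`, `requests`, `s3://example-bucket/...`, and both helper functions are hypothetical).

```python
def build_create_table_ddl(database, table, columns, s3_location):
    """Steps 1 and 2: point at the data in S3 and define its schema."""
    column_defs = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"  {column_defs}\n"
        f")\n"
        # Assumption: the files are comma-delimited text.
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{s3_location}'"
    )


def run_query(athena_client, sql, database, output_location):
    """Step 3: submit a SQL statement and return its query execution ID."""
    response = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        # Athena writes query results to this S3 location.
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


ddl = build_create_table_ddl(
    "weblogs",
    "requests",
    [("ip", "string"), ("status", "int")],
    "s3://example-bucket/logs/",
)
query = "SELECT status, COUNT(*) AS hits FROM requests GROUP BY status"

# With the real SDK (needs boto3 installed and AWS credentials):
#   import boto3
#   athena = boto3.client("athena")
#   run_query(athena, ddl, "weblogs", "s3://example-bucket/results/")
#   run_query(athena, query, "weblogs", "s3://example-bucket/results/")
```

Queries run asynchronously, so a real script would also poll for completion before reading results.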
AWS Glue Data Catalog
out-of-the-box integration with Athena
create unified metadata repository across various services
crawl data sources to discover schemas and populate Catalog with new and modified table and partition definitions
maintain schema versioning
use Glue’s fully managed ETL capabilities to transform data or convert it to columnar formats to optimise cost and improve performance
Amazon EMR
provides a managed Hadoop framework that makes it easy, fast and cost-effective to process vast amounts of data across dynamically scalable Amazon EC2 instances
securely and reliably handles broad set of big data use cases including:
log analysis, web indexing, data transformations (ETL), machine learning, financial analysis, scientific simulation, and bioinformatics
Amazon EMR and other platforms
run other popular distributed frameworks: Apache Spark, HBase, Presto, and Flink
interact with data in AWS data stores e.g. Amazon S3 and Amazon DynamoDB
EMR Notebooks
based on Jupyter Notebook
provide development and collaboration environment for ad hoc querying and exploratory analysis
Amazon CloudSearch
managed service in the AWS Cloud that makes it simple and cost-effective to set up, manage, and scale a search solution for your website or application
supports 34 languages
features:
highlighting
autocomplete
geospatial search
Amazon Elasticsearch Service
makes it easy to deploy, secure, operate and scale Elasticsearch to search, analyse and visualise data in real-time
get easy-to-use APIs and real-time analytics capabilities to power use cases such as log analytics, full-text search, application monitoring, and clickstream analytics
(with enterprise-grade availability, scalability, security)
Amazon Elasticsearch integrations
open-source tools:
Kibana
Logstash
(for data ingestion and visualisation)
AWS services:
Amazon Virtual Private Cloud (Amazon VPC)
AWS Key Management Service (AWS KMS)
Amazon Kinesis Data Firehose
AWS Lambda
AWS Identity and Access Management (IAM)
Amazon Cognito
Amazon CloudWatch
(from raw data to actionable insights)
Amazon Kinesis
collect, process, and analyse real-time streaming data so you can get timely insights and react quickly to new information
Amazon Kinesis 4 services
Kinesis Data Firehose
Kinesis Data Analytics
Kinesis Data Streams
Kinesis Video Streams
Amazon Kinesis Data Streams (KDS)
massively scalable and durable real-time data streaming service. KDS can continuously capture gigabytes of data per second from hundreds of thousands of sources:
clickstreams, database event streams, financial transactions, social media feeds, IT logs, and location-tracking events
data collected is available in milliseconds to enable real-time analytics use cases:
real-time dashboards
real-time anomaly detection
dynamic pricing
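To make the "capture from many sources" idea concrete, here is a hedged Python sketch of a single producer sending one clickstream event. It assumes the boto3 Kinesis client method `put_record`; the stream name, event fields, and both helper functions are hypothetical, and JSON is only one possible encoding.

```python
import json


def build_put_record_request(stream_name, event, partition_key):
    """Build the keyword arguments for a Kinesis put_record call."""
    return {
        "StreamName": stream_name,
        # Kinesis stores the payload as opaque bytes; JSON is an assumption here.
        "Data": json.dumps(event).encode("utf-8"),
        # Records with the same partition key are routed to the same shard.
        "PartitionKey": str(partition_key),
    }


def send_event(kinesis_client, stream_name, event, partition_key):
    """Send one event and return the sequence number Kinesis assigned to it."""
    request = build_put_record_request(stream_name, event, partition_key)
    response = kinesis_client.put_record(**request)
    return response["SequenceNumber"]


click = {"user": "u1", "page": "/home"}
request = build_put_record_request("example-clickstream", click, click["user"])

# With the real SDK (needs boto3 installed and AWS credentials):
#   import boto3
#   send_event(boto3.client("kinesis"), "example-clickstream", click, click["user"])
```

Using the user ID as the partition key keeps each user's events in order within one shard, which is one common choice rather than a requirement.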
Amazon Kinesis Data Firehose
easiest way to reliably load streaming data into data stores and analytics tools.
Amazon Kinesis Data Analytics
easiest way to analyse streaming data, gain actionable insights, and respond to your business and customer needs in real time.
Amazon Kinesis Data Streams
massively scalable and durable real-time data streaming service.
Amazon Kinesis Video Streams
makes it easy to securely stream video from connected devices to AWS for analytics, machine learning (ML), playback, and other processing.
Amazon Redshift
fast, scalable data warehouse that makes it simple and cost-effective to analyse all your data across your data warehouse and data lake.
Amazon QuickSight
fast, cloud-powered business intelligence (BI) service that makes it easy for you to deliver insights to everyone in your organisation
AWS Data Pipeline
web service that helps you reliably process and move data, at specified intervals, between different AWS compute and storage services as well as on-premises data sources
AWS Glue
fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics.
Data lake
centralised, curated, and secured repository that stores all your data, both in its original form and prepared for analysis.
enables you to break down data silos and combine different types of analytics to gain insights and guide better business decisions
AWS Lake Formation
service that makes it easy to set up a secure data lake in days.
Apache Kafka
open-source platform for building real-time streaming data pipelines and applications
Amazon Managed Streaming for Apache Kafka (Amazon MSK)
fully managed service that makes it easy for you to build and run applications using Apache Kafka to process streaming data
can use Apache Kafka APIs to populate data lakes, stream changes to and from databases, and power machine learning and analytics applications