5 Vs of Analytics - Velocity Flashcards
Type of batch processing in which data is processed in large volumes on a regular basis.
Scheduled, Periodic, Real Time, Streaming
Scheduled
Type of batch processing workload that usually handles about the same amount of data each run. Often referred to as ‘predictable’.
Scheduled, Periodic, Near Real Time, Streaming, Real-Time
Scheduled
Type of batch processing that runs on an irregular schedule.
Scheduled, Periodic, Near Real Time, Streaming, Real-Time
Periodic
Type of processing in which small batches of streaming data are continually collected and processed within minutes of the data being generated.
Scheduled, Periodic, Near Real Time, Streaming, Real-Time
Near Real Time (streaming data)
Type of processing in which small batches of streaming data are continually collected and processed within milliseconds of the data being generated.
Scheduled, Periodic, Near Real Time, Streaming, Real-Time
Real-Time (streaming data)
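The four velocity categories on the cards above can be sketched as a small decision rule. This is only an illustration: the `Workload` class, the `classify_velocity` function, and the one-second cutoff between "milliseconds" and "minutes" are assumptions made for the example, not definitions from the course material.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Workload:
    """Hypothetical description of a data processing workload."""
    streaming: bool                           # True for continuously collected data
    regular_schedule: bool = False            # only meaningful for batch workloads
    latency_seconds: Optional[float] = None   # only meaningful for streaming workloads


def classify_velocity(workload: Workload) -> str:
    """Map a workload to one of the four velocity categories."""
    if not workload.streaming:
        # Batch: regular schedule -> Scheduled, irregular schedule -> Periodic.
        return "Scheduled" if workload.regular_schedule else "Periodic"
    # Streaming: assumed cutoff of 1 second separates "milliseconds" from "minutes".
    if workload.latency_seconds is not None and workload.latency_seconds < 1.0:
        return "Real-Time"
    return "Near Real Time"


nightly_report = Workload(streaming=False, regular_schedule=True)
fraud_check = Workload(streaming=True, latency_seconds=0.05)
print(classify_velocity(nightly_report))
print(classify_velocity(fraud_check))
```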
Big data solution for petabyte-scale data processing, interactive analytics, and machine learning.
Amazon EMR, Amazon MSK, Amazon Kinesis, AWS Lambda
Amazon EMR
Fully managed, highly available Apache Kafka service.
Amazon EMR, Amazon MSK, Amazon Kinesis, AWS Lambda
Amazon MSK
Cost-effective, fully managed service for processing and analyzing streaming data at any scale.
Amazon EMR, Amazon MSK, Amazon Kinesis, AWS Lambda
Amazon Kinesis
Serverless, event-driven compute service that lets you run code without provisioning or managing servers.
Amazon EMR, Amazon MSK, Amazon Kinesis, AWS Lambda
AWS Lambda
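As a rough illustration of "event-driven", the sketch below shows a Python Lambda handler that reacts to a batch of Kinesis stream records. The `lambda_handler(event, context)` signature and the base64-encoded `Records[].kinesis.data` field follow the standard Kinesis event shape; the JSON payload and the returned summary are assumptions made for the example.

```python
import base64
import json


def lambda_handler(event, context):
    """Decode each Kinesis record in the event and return a small summary."""
    payloads = []
    for record in event.get("Records", []):
        # Kinesis delivers each record's data base64-encoded.
        raw = base64.b64decode(record["kinesis"]["data"])
        # Assumption for this sketch: producers send JSON documents.
        payloads.append(json.loads(raw))
    return {"processed": len(payloads), "payloads": payloads}


# Local dry run with a hand-built sample event (no AWS account needed).
sample_data = base64.b64encode(json.dumps({"sensor": "s-1", "temp": 21.5}).encode("utf-8"))
sample_event = {"Records": [{"kinesis": {"data": sample_data.decode("ascii")}}]}
result = lambda_handler(sample_event, None)
print(result["processed"])
```

The code runs only when an event arrives, which is why no server needs to be provisioned or kept running.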
Runs big data applications and petabyte-scale analytics quickly, at less than half the cost of on-premises solutions. Integrates seamlessly with Amazon SageMaker. Performs machine learning tasks on large datasets. Uses open-source big data frameworks to distribute data processing tasks. Uses parallel processing capabilities so that data can be ingested, transformed, and analyzed rapidly.
Amazon EMR, Amazon MSK, Amazon Kinesis, AWS Lambda
Amazon EMR
Provisions servers, configures Apache Kafka clusters, and replaces servers when they fail. Orchestrates server patches and upgrades. Architects clusters for high availability and ensures data is durably stored and secured. Sets up monitoring and alarms. Runs scaling to support load changes. Provides the operations for creating, updating, and deleting Apache Kafka clusters.
Amazon EMR, Amazon MSK, Amazon Kinesis, AWS Lambda
Amazon MSK
Collects, processes, and analyzes real-time streaming data. Helps you receive timely insights and react quickly to new information. Makes it convenient to capture, process, and store data streams at any scale.
Amazon EMR, Amazon MSK, Amazon Kinesis, AWS Lambda
Amazon Kinesis
A data streaming service that continuously captures data in real time from hundreds of sources.
Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, Amazon Kinesis, Amazon Managed Service for Apache Flink
Amazon Kinesis Data Streams
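A minimal sketch of what a producer might send to a data stream. The helper `build_put_record_request`, the stream name `clickstream-demo`, and the event fields are hypothetical; `StreamName`, `Data`, and `PartitionKey` are the parameter names accepted by the boto3 Kinesis `put_record` call.

```python
import json
from typing import Any, Dict


def build_put_record_request(stream_name: str, event: Dict[str, Any],
                             partition_key: str) -> Dict[str, Any]:
    """Build the keyword arguments for a Kinesis put_record call."""
    return {
        "StreamName": stream_name,
        # Records are sent as bytes; JSON is an assumption for this sketch.
        "Data": json.dumps(event).encode("utf-8"),
        # Records with the same partition key go to the same shard.
        "PartitionKey": partition_key,
    }


request = build_put_record_request(
    "clickstream-demo",                       # hypothetical stream name
    {"user_id": "u-123", "action": "click"},  # hypothetical event
    partition_key="u-123",
)
# With boto3 installed and AWS credentials configured, the request could be sent with:
#   boto3.client("kinesis").put_record(**request)
print(request["StreamName"], len(request["Data"]))
```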
Captures, transforms, and loads data streams into AWS data stores for near-real-time analytics with existing BI tools.
Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, Amazon Kinesis, Amazon Managed Service for Apache Flink
Amazon Kinesis Data Firehose
Builds and runs Apache Flink applications, and queries and analyzes streaming data, without setting up infrastructure or clusters.
Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, Amazon Kinesis, Amazon Managed Service for Apache Flink
Amazon Managed Service for Apache Flink