Streaming Flashcards

1
Q

Why do we need streaming data processing?

A

1) Latency: switching to streaming achieves lower latency.
2) Workload balancing: data is processed as it arrives, which yields more consistent and predictable resource consumption.

2
Q

What is streaming, and what are its key characteristics?

A

A type of data processing engine designed with infinite datasets in mind.

1) Infinite data
2) Infinite computation
3) Low-latency results

3
Q

What are the data stream models?

A

1) Time series model: tracks the changes in an element's state over time.
2) Cash register model: tracks increments only (every update is positive).
3) Turnstile model: records updates that can be either positive or negative.
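
The three models can be compared with a small sketch. It assumes each update is a (key, value) pair and reads the time series model as "each update carries the element's new state". The function names are made up for illustration and are not taken from any library.

```python
from collections import defaultdict


def time_series(updates):
    """Time series model: each (key, value) update is the element's new state."""
    state = {}
    for key, value in updates:
        state[key] = value  # overwrite with the latest observed state
    return state


def cash_register(updates):
    """Cash register model: each (key, inc) update adds a non-negative increment."""
    state = defaultdict(int)
    for key, inc in updates:
        if inc < 0:
            raise ValueError("cash register updates must be non-negative")
        state[key] += inc
    return dict(state)


def turnstile(updates):
    """Turnstile model: each (key, delta) update may be positive or negative."""
    state = defaultdict(int)
    for key, delta in updates:
        state[key] += delta
    return dict(state)
```

For example, the updates ("a", 3) followed by ("a", -1) are valid in the turnstile model and leave "a" at 2, while the cash register sketch rejects the negative update.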

4
Q

What are the components of a streaming architecture?

A

1) Data provider
2) Collecting
3) Message queuing
4) Analysis
5) Data access
6) Data consumer
7) Long-term storage
8) In-memory storage

5
Q

What is message queuing, and what are its benefits?

A

It handles data exchange between the components of a streaming architecture, primarily moving data from the collection tier to the analysis tier.

1) Decoupling: separating the operations simplifies the system design and improves fault isolation.

2) Load management: it funnels multiple data streams to multiple consumers, so that the load can be distributed efficiently.

3) Safe communication: it provides reliable data transfer.

6
Q

What is the producer-broker-consumer model?

A

The producer generates data and sends it to the broker. The broker manages queues organized by topic and partitions the data for distribution.

The consumer retrieves data from the queues when it is ready.
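
A toy in-memory version of this model might look like the sketch below. The `Broker` class, its method names, and the key-based partitioning rule are assumptions made for illustration; they are not the API of any real message broker.

```python
from collections import defaultdict, deque


class Broker:
    """Toy in-memory broker: one FIFO queue per (topic, partition)."""

    def __init__(self, partitions=2):
        self.partitions = partitions
        self.queues = defaultdict(deque)

    def publish(self, topic, key, value):
        # Producer side: a deterministic hash of the key picks the partition,
        # so messages with the same key keep their relative order.
        partition = sum(key.encode()) % self.partitions
        self.queues[(topic, partition)].append(value)
        return partition

    def poll(self, topic, partition):
        # Consumer side: pull the next message when ready; None if nothing waits.
        queue = self.queues[(topic, partition)]
        return queue.popleft() if queue else None
```

The producer only calls `publish` and the consumer only calls `poll`, so neither needs to know about the other.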

7
Q

What are durable queues?

A

A durable queue ensures that data is stored until it has been safely consumed, which lets it support offline and slow consumers.
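
The "stored until safely consumed" behavior can be sketched as a queue that only drops a message once the consumer acknowledges it. This is an illustrative in-memory sketch with made-up names (`DurableQueue`, `read`, `ack`); a real durable queue would also persist the log to disk so that it survives a broker restart.

```python
class DurableQueue:
    """Toy queue that retains each message until it is acknowledged."""

    def __init__(self):
        self.log = {}         # offset -> message, kept until acked
        self.next_offset = 0

    def append(self, message):
        offset = self.next_offset
        self.log[offset] = message
        self.next_offset += 1
        return offset

    def read(self):
        # Return the oldest unacknowledged message without removing it.
        if not self.log:
            return None
        offset = min(self.log)
        return offset, self.log[offset]

    def ack(self, offset):
        # Only after the consumer confirms is the message discarded.
        self.log.pop(offset, None)
```

A slow or offline consumer can call `read` again later and still get the same message, because nothing is removed before `ack`.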

8
Q

What is the role of the analysis tier in a streaming architecture, and what are its key features?

A

It is the center of the architecture: it processes data streams in near real time using specialized algorithms and models, on a per-time or per-window basis.

It follows a continuous query model:

1) Queries are issued once and executed continuously as new data arrives.

2) Queries may need to maintain state.

3) Stateless queries: each execution is independent.

4) Stateful queries: state is maintained and updated during processing.
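
The difference between stateless and stateful continuous queries can be illustrated with Python generators, which are created once and then keep producing results as items arrive. Both functions are illustrative sketches. The threshold filter and the running mean are example queries chosen here, not ones named in the cards.

```python
def stateless_filter(stream, threshold):
    """Stateless query: each item is evaluated independently of the others."""
    for value in stream:
        if value > threshold:
            yield value


def stateful_running_mean(stream):
    """Stateful query: keeps a count and a sum that are updated per item."""
    count, total = 0, 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count
```

For instance, `list(stateful_running_mean([2, 4, 6]))` gives `[2.0, 3.0, 4.0]`, where every output depends on all earlier items, while the filter never looks beyond the current item.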

9
Q

What is windowing, and what are the different types of windows?

A

Windowing groups and processes data in manageable chunks, defined by a length and a processing period.

1) Sliding windows: fixed windows (period equal to length), overlapping windows (period shorter than length), and sampling windows (period longer than length).

2) Data-driven windows: the length is determined by patterns in the data, which is useful for user behavior analysis.
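
How the length and period produce the three sliding-window variants can be sketched with a small helper. It assumes non-negative integer timestamps and windows that start at multiples of the period. `assign_windows` is a hypothetical name, not a function from any streaming framework.

```python
def assign_windows(timestamp, length, period):
    """Return the [start, end) windows that contain `timestamp`.

    Windows have the given length and start every `period` time units from 0.
    """
    windows = []
    # Latest window start that is not after the timestamp.
    start = (timestamp // period) * period
    # Walk backwards while the window still covers the timestamp.
    while start >= 0 and start + length > timestamp:
        windows.append((start, start + length))
        start -= period
    return sorted(windows)
```

With length 10 and period 10 (fixed), timestamp 7 falls in exactly one window. With period 5 (overlapping) it falls in two. With length 5 and period 10 (sampling) it falls in none, because the data between windows is skipped.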

10
Q

What is stream time, and what is event time?

A

Event time is the actual time at which the event occurred, as recorded by its source.

Stream time is the time at which the event enters the streaming system.

11
Q

What is the difference between windowing by event time and windowing by stream time?

A

Windowing by stream time is more straightforward to implement, because there is no need to handle late or out-of-order data.

Windows always close according to system-defined timings, so results are available immediately.

However, ignoring event time can lead to inaccurate insights.

Windowing by event time is the gold standard of windowing.

However, it is impossible to know precisely when a window can be closed, and extending a window's lifetime means buffering more data.

In addition, many processing systems lack native support for it.
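
The effect of the two choices can be seen on a tiny made-up dataset. The sketch below assumes fixed windows of length 10 and events stored as dictionaries with hypothetical `event_time` and `stream_time` fields. The second event occurred at time 9 but only reached the system at time 12.

```python
from collections import defaultdict


def window_counts(events, time_field, length):
    """Count events per fixed window, keyed by the window's start time."""
    counts = defaultdict(int)
    for event in events:
        start = (event[time_field] // length) * length
        counts[start] += 1
    return dict(counts)


# Hypothetical events; the second one arrives late.
events = [
    {"event_time": 3, "stream_time": 4},
    {"event_time": 9, "stream_time": 12},
    {"event_time": 11, "stream_time": 13},
]

by_event = window_counts(events, "event_time", 10)    # {0: 2, 10: 1}
by_stream = window_counts(events, "stream_time", 10)  # {0: 1, 10: 2}
```

Windowing by stream time counts the late event in the wrong window, which is the inaccuracy mentioned above. Windowing by event time counts it correctly, but only if the window starting at 0 is still open when the event arrives at time 12.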
