1_Data Processing fundamentals Flashcards

1
Q

Data Lifecycle

  • Ingestion is the process of bringing application data, streaming data, and batch data into the cloud.
  • The storage stage focuses on persisting data in an appropriate storage system.
  • Processing and analyzing is about transforming data into a form suitable for analysis.
  • Exploring and visualizing focuses on testing hypotheses and drawing insights from data.
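
The four stages above can be sketched as a tiny end-to-end pipeline. This is only an illustration under stated assumptions: the sample CSV, the function names, and the use of an in-memory SQLite table as a stand-in for a cloud storage system are all hypothetical and not part of the original card.

```python
import csv
import io
import sqlite3

# Hypothetical batch export from an application (assumed sample data).
RAW_CSV = "sensor,reading\na,1.5\nb,4.0\na,3.5\n"


def ingest(raw: str) -> list[dict]:
    """Ingestion: bring raw application data into the system."""
    return list(csv.DictReader(io.StringIO(raw)))


def store(rows: list[dict]) -> sqlite3.Connection:
    """Storage: persist the rows (in-memory SQLite stands in for a cloud store)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor TEXT, reading REAL)")
    conn.executemany("INSERT INTO readings VALUES (:sensor, :reading)", rows)
    return conn


def process(conn: sqlite3.Connection) -> dict[str, float]:
    """Processing: transform stored data into a form suitable for analysis."""
    cur = conn.execute("SELECT sensor, AVG(reading) FROM readings GROUP BY sensor")
    return dict(cur.fetchall())


def explore(averages: dict[str, float]) -> str:
    """Exploring: answer a simple question, here which sensor reads highest on average."""
    return max(averages, key=averages.get)


averages = process(store(ingest(RAW_CSV)))
highest = explore(averages)
```

Each function maps to one lifecycle stage, so a stage can be swapped out (for example, a different storage system) without touching the others.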
A
2
Q

Batch Data

  • Batch data is ingested in bulk, typically in files.
    • Examples of batch data ingestion include uploading files of data exported from one application to be processed by another.
  • Batch data consists of large sets of data that "pool" up over time.
  • Low latency is less important than it is for streaming data.
  • Both batch and streaming data can be transformed and processed using Cloud Dataflow.
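
As a rough sketch of file-based batch ingestion, the snippet below loads every exported file in a directory in one bulk pass. The file names, the column names, and the `ingest_batch` helper are hypothetical, and a local temporary directory stands in for wherever the exporting application drops its files; it does not use Cloud Dataflow.

```python
import csv
import tempfile
from pathlib import Path


def ingest_batch(export_dir: Path) -> list[dict]:
    """Load every exported CSV file in a directory in one bulk pass."""
    rows = []
    for path in sorted(export_dir.glob("*.csv")):
        with path.open(newline="") as f:
            rows.extend(csv.DictReader(f))
    return rows


with tempfile.TemporaryDirectory() as tmp:
    export_dir = Path(tmp)
    # Files that have "pooled up" over two days (assumed sample data).
    (export_dir / "orders_day1.csv").write_text("order_id,amount\n1,10\n2,20\n")
    (export_dir / "orders_day2.csv").write_text("order_id,amount\n3,30\n")

    batch = ingest_batch(export_dir)
    total = sum(int(row["amount"]) for row in batch)
```

Nothing is processed until the whole set of files is available, which is why latency matters less here than throughput.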
A
3
Q

Streaming Data

  • Streaming data is sent as small messages that are transmitted continuously from the data source.
  • Streaming data may be telemetry data, which is generated at regular intervals, or event data, which is generated in response to a particular event.
  • Stream ingestion services need to deal with potentially late and missing data.
  • Requires low latency.
  • Streaming data is often ingested using Cloud Pub/Sub.
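
One way to picture how a stream consumer might deal with late data is a fixed-size window with an allowed-lateness threshold. The sketch below is a simplification in plain Python, not how Cloud Pub/Sub or Cloud Dataflow implement it: the `Message` type, the window size, the lateness threshold, and the "watermark equals the largest event time seen so far" rule are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Message:
    event_time: int  # seconds; when the event occurred at the source
    value: float


WINDOW = 10           # fixed window size in seconds (assumed)
ALLOWED_LATENESS = 5  # seconds after a window ends that late data is still accepted (assumed)


def process_stream(messages):
    """Group messages into fixed windows by event time, dropping data that is too late."""
    windows = defaultdict(list)
    dropped = []
    watermark = 0  # simplification: the largest event time seen so far
    for msg in messages:
        watermark = max(watermark, msg.event_time)
        window_start = (msg.event_time // WINDOW) * WINDOW
        window_end = window_start + WINDOW
        if watermark > window_end + ALLOWED_LATENESS:
            dropped.append(msg)  # arrived after the lateness threshold
        else:
            windows[window_start].append(msg.value)
    return dict(windows), dropped


stream = [
    Message(1, 1.0),
    Message(4, 2.0),
    Message(12, 3.0),
    Message(8, 4.0),   # out of order, but within allowed lateness
    Message(27, 5.0),
    Message(3, 6.0),   # too late: its window ended well before the watermark
]
windows, dropped = process_stream(stream)
```

The message with event time 8 arrives after one with event time 12 but is still counted in the first window, while the message with event time 3 arrives too late and is set aside.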
A
4
Q

Data Processing Solutions

A
5
Q

Levels of structure of data

  • These levels are structured, semi-structured, and unstructured.
  • Structured data has a fixed schema, such as a relational database table.
  • Semi-structured data has a schema that can vary from record to record; the schema is stored with the data, as in JSON documents.
  • Unstructured data has no defined schema that determines how it is stored; examples include free text, images, and audio.
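
The three levels can be compared side by side with a short sketch. The table name, field names, and sample values below are made up for illustration, and SQLite and JSON are just convenient stand-ins for a relational table and a semi-structured format.

```python
import json
import sqlite3

# Structured: a fixed schema is declared up front and every row must fit it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users VALUES (1, 'Ada')")
structured = conn.execute("SELECT id, name FROM users").fetchall()

# Semi-structured: each record carries its own field names, so they can differ.
raw_records = ['{"id": 1, "name": "Ada"}', '{"id": 2, "tags": ["admin"]}']
semi_structured = [json.loads(r) for r in raw_records]
fields_per_record = [sorted(record) for record in semi_structured]

# Unstructured: just bytes; nothing in the data says how to interpret or store it.
unstructured = "Ada emailed support about a late delivery.".encode("utf-8")
```

In the semi-structured case the second record has a `tags` field and no `name`, which the fixed `users` table would reject.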
A
6
Q

Choosing a datastore

A