01 - Explore Core Data Concepts Flashcards

1
Q

What is Data

A

Collection of facts, numbers, descriptions, and objects, stored in a structured, semi-structured, or unstructured way

2
Q

Online Transactional Processing (OLTP)

A

Data is stored one transaction at a time

3
Q

Online Analytical Processing (OLAP)

A

Data is periodically loaded, aggregated and stored in a cube

4
Q

Transactional Workloads

A

Atomicity

Consistency

Isolation

Durability

5
Q

What is Atomicity

A

Each transaction is treated as a single unit, which succeeds completely or fails completely
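
A minimal sketch of the all-or-nothing behaviour, using Python's built-in `sqlite3` module. The `accounts` table, the names, and the `fail_midway` switch are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical two-account ledger, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
            if fail_midway:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


failed = transfer(conn, "alice", "bob", 30, fail_midway=True)
after_failure = balances(conn)   # the half-finished debit was rolled back
succeeded = transfer(conn, "alice", "bob", 30)
after_success = balances(conn)   # both updates were applied together
```

The failed transfer leaves both balances untouched, because the debit is undone along with the rest of the transaction.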

6
Q

What is Consistency

A

Transactions can only take the data in the database from one valid state to another
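
One way to picture a "valid state" is a rule the database enforces itself. The sketch below assumes a made-up rule (a balance may never go negative) expressed as a `CHECK` constraint in SQLite; the table and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed rule for this sketch: a balance may never go negative.
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
conn.commit()


def withdraw(conn, name, amount):
    """Return True if the withdrawal keeps the database in a valid state."""
    try:
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, name),
            )
        return True
    except sqlite3.IntegrityError:  # raised when the CHECK constraint is violated
        return False


def balance(conn, name):
    """Read the current balance for one account."""
    row = conn.execute("SELECT balance FROM accounts WHERE name = ?", (name,)).fetchone()
    return row[0]


rejected = withdraw(conn, "alice", 150)        # would produce -50, so it is refused
balance_after_reject = balance(conn, "alice")
accepted = withdraw(conn, "alice", 40)         # 100 -> 60 is still valid
balance_after_accept = balance(conn, "alice")
```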

7
Q

What is Isolation

A

Concurrent execution of transactions leaves the database in the same state as if the transactions had been executed sequentially

8
Q

What is Durability

A

Once a transaction has been committed, it will remain committed

9
Q

What are Analytical Workloads

A

Used for data analysis and decision making

  • Summaries
  • Trends
  • Business information
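
A small sketch of what "summaries" and "trends" can look like in practice. The sales records are invented for illustration, and `total_by` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict

# Hypothetical sales records: (month, region, amount).
sales = [
    ("2024-01", "north", 120),
    ("2024-01", "south", 80),
    ("2024-02", "north", 150),
    ("2024-02", "south", 70),
    ("2024-03", "north", 180),
]


def total_by(records, key_index):
    """Sum the amount column, grouped by the column at key_index."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key_index]] += record[2]
    return dict(totals)


monthly = total_by(sales, 0)    # summary per month
regional = total_by(sales, 1)   # summary per region

# A crude trend: change in total sales from each month to the next.
months = sorted(monthly)
trend = [monthly[later] - monthly[earlier] for earlier, later in zip(months, months[1:])]
```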
10
Q

What is Data Processing

A

Convert Raw Data to Meaningful Information

11
Q

What is Batch Processing

A

Data elements are collected into a group. The whole group is then processed at a future time as a batch.

12
Q

What is Stream Processing

A

Each new piece of data is processed when it arrives
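
To contrast this with batch processing, here is a sketch that computes the same average both ways. The readings and the `RunningAverage` class are assumptions for illustration only.

```python
def process_batch(readings):
    """Batch: wait until the whole group is collected, then process it once."""
    return sum(readings) / len(readings)


class RunningAverage:
    """Stream: update the result each time a new value arrives."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def on_arrival(self, value):
        # Incremental mean update, so earlier values never need to be stored.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


readings = [10, 20, 30, 40]  # hypothetical sensor values

batch_result = process_batch(readings)  # one answer, available after all data is in

stream = RunningAverage()
stream_results = [stream.on_arrival(value) for value in readings]  # an answer per arrival
```

The final streamed value matches the batch result; the difference is when an answer becomes available.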

13
Q

Online Transaction Processing (OLTP)

A

For example order systems that perform many small transactional updates

14
Q

Data Warehousing

A

Large amount of

fill in

15
Q

Tables

A

Data is stored in a table

Table consists of rows and columns

All rows have same # of columns

Each column is defined by a datatype

16
Q

Entity

A

Representation of an item which can be physical (such as a customer or a product), or virtual (such as an order).

Connected by relations

fill in

17
Q

What is Normalization

A

Data is normalized to

Reduce storage
Avoid data duplication
Improve data quality

18
Q

Normalized database schema

A

Primary keys and foreign keys are used to define relationships

No data duplication exists (other than key values in 3rd Normal Form (3NF))

Data is retrieved by joining tables together in a query
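
The three points above can be sketched with SQLite. The `customers` and `orders` tables and their rows are hypothetical; the customer's name is stored once and reached from each order through the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        item        TEXT NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 'keyboard'), (11, 1, 'mouse'), (12, 2, 'monitor');
    """
)

# Only the key value is repeated in orders; the name comes back through a join.
rows = conn.execute(
    """
    SELECT c.name, o.item
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
    """
).fetchall()
```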

19
Q

Relational Database

A

Type of DB that uses the relational data model

20
Q

SQL

A

Structured Query Language

21
Q

Index

A

Optimizes search queries for faster data retrieval

Reduces the number of data pages that need to be read to retrieve the data in a SQL statement

Data is retrieved by joining tables together in a query
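
A rough way to see an index at work is to ask SQLite for its query plan before and after creating one. The `products` table and the index name are assumptions for this sketch, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, sku TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products (sku, price) VALUES (?, ?)",
    [(f"SKU-{i:05d}", i * 0.5) for i in range(1000)],
)


def query_plan(conn):
    """Return SQLite's plan description for a lookup by sku."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT price FROM products WHERE sku = 'SKU-00042'"
    ).fetchall()
    return " ".join(row[-1] for row in rows)  # the last column holds the description


plan_before = query_plan(conn)  # typically a full table scan
conn.execute("CREATE INDEX idx_products_sku ON products (sku)")
plan_after = query_plan(conn)   # typically a search using idx_products_sku
```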

22
Q

View

A

A view is a virtual table based on the result set of a query

Views are created to simplify queries

Combine relational data into a single pane view

Restrict access to a table while allowing users to access non-confidential data
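
A minimal sketch of a view that hides a confidential column. The `employees` table, the `employee_directory` view, and the choice of `salary` as the confidential field are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        dept   TEXT NOT NULL,
        salary INTEGER NOT NULL      -- treated as confidential in this sketch
    );
    INSERT INTO employees VALUES
        (1, 'Ada', 'Engineering', 120000),
        (2, 'Grace', 'Engineering', 130000),
        (3, 'Linus', 'Support', 90000);

    -- The view exposes only the non-confidential columns.
    CREATE VIEW employee_directory AS
        SELECT name, dept FROM employees;
    """
)

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY name")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
```

SQLite has no user permissions, so this only shows the column filtering. In a server database the restriction would usually come from granting users access to the view rather than to the underlying table.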

23
Q

Non-relational collections can have

A

Multiple entities in the same collection or container with different fields

Have a different, non-tabular schema

Are often defined by labeling each field with the name it represents

24
Q

What is semi-structured data

A

Data structure defined within the actual data by fields. Format/file types include

JSON
AVRO
ORC
Parquet
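
A short JSON sketch of structure carried inside the data itself. The customer records are made up; each one names its own fields, and the records in the same collection do not share a fixed set of columns.

```python
import json

# Hypothetical customer records: every record labels its own fields.
raw = """
[
  {"id": 1, "name": "Ada", "email": "ada@example.com"},
  {"id": 2, "name": "Grace", "phones": ["555-0100", "555-0101"]},
  {"id": 3, "name": "Linus", "address": {"city": "Helsinki"}}
]
"""

customers = json.loads(raw)

# Field names are discovered from the data, not from a table definition.
fields_per_record = [sorted(record) for record in customers]
all_fields = sorted({field for record in customers for field in record})
```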

25
Q

What is Unstructured Data

A

Does not naturally contain fields, e.g. video, audio, media streams, documents

Often used to extract data organization and categorize or identify "structures"

fill in

26
Q

What is NoSQL

A

Loose term to describe non-relational databases

Key-value stores
Document-based databases
Column family databases
Graph databases

27
Q

What is a graph database

A

Stores entities centered on their relationships

Enables applications to perform queries traversing a network of nodes and edges

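Traversing nodes and edges can be sketched with a plain adjacency list and a breadth-first search. This is not how a real graph database is queried; the graph and the `reachable` function are hypothetical and only illustrate the idea.

```python
from collections import deque

# Hypothetical graph: each node maps to the nodes its outgoing edges point at.
edges = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
    "dave": [],
    "erin": [],
}


def reachable(graph, start):
    """Breadth-first traversal: every node reachable from start, in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


from_alice = reachable(edges, "alice")
from_carol = reachable(edges, "carol")
```
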
28
Q

Steps in Data Journey

A

Data Ingestion
Data Processing
Data Exploration

29
Q

Data Ingestion

A

Process of obtaining and importing data for immediate use or storage in a database

30
Q

Data Processing

A

Takes the data in raw form, cleans it, and converts it into a more meaningful format

ETL - Extract, Transform, and Load
ELT - Extract, Load, and Transform

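A toy ETL pass, assuming a small made-up CSV as the raw source and an in-memory SQLite table as the target. The `extract`, `transform`, and `load` functions and the `payments` table are hypothetical names for this sketch.

```python
import csv
import io
import sqlite3

# Hypothetical raw input with untidy names and one missing amount.
RAW_CSV = """name,amount
 Ada ,10.50
grace,
LINUS,7
"""


def extract(text):
    """Extract: read the raw rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, tidy names, convert amounts to numbers."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue
        cleaned.append((row["name"].strip().title(), float(amount)))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute("SELECT name, amount FROM payments ORDER BY name").fetchall()
```

In an ELT flow the order of the last two steps is swapped: the raw rows are loaded first and the cleanup runs inside the target system.
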
31
Q

Data Exploration

A

Query the data and create graphical representations of information and data

32
Q

Data Visualization

A

A business model can contain an enormous amount of information. There are techniques to analyze and understand the information in your models:

Reporting
Business Intelligence (BI)
Data Visualization

33
Q

Data analytics

A

Discipline that covers the entire range of data management tasks: analysis, data collection, organization, storage, and the tools and techniques

Descriptive
Diagnostic
Predictive
Prescriptive
Cognitive

34
Q

Descriptive analytics

A

What HAPPENED?

Summarize historical data to describe what has already occurred

35
Q

Diagnostic analytics

A

WHY something happened?

Take what we collected from descriptive analytics and dive deeper

36
Q

Predictive analytics

A

What WILL happen?

Use historical data to determine what will happen in the future

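One very simple form of prediction is fitting a straight line to past values and extending it. The sketch below assumes invented monthly sales figures and a hand-written least-squares fit; real predictive analytics typically uses richer models.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: sales for months 1 to 4.
months = [1, 2, 3, 4]
sales = [100, 120, 140, 160]

slope, intercept = fit_line(months, sales)
forecast_month_5 = slope * 5 + intercept  # extend the fitted line one month ahead
```
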
37
Q

Prescriptive analytics

A

What ACTIONS to take to achieve goal or target?

38
Q

Cognitive analytics

A

What MIGHT happen if CIRCUMSTANCES change?

Intelligent technologies

Combine AI, ML, and Deep Learning to apply human brain-like intelligence to do certain tasks