01 - Explore Core Data Concepts Flashcards

1
Q

What is Data

A

A collection of facts, numbers, descriptions, and objects, stored in a structured, semi-structured, or unstructured way

2
Q

Online Transactional Processing (OLTP)

A

Data is stored one transaction at a time

3
Q

Online Analytical Processing (OLAP)

A

Data is periodically loaded, aggregated and stored in a cube
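As a rough illustration of "aggregated and stored in a cube", the sketch below pre-computes totals for every combination of two dimensions. The sales rows, the dimension names, and the "ALL" roll-up marker are assumptions made up for this example, not part of any particular OLAP product.

```python
from collections import defaultdict

# Hypothetical fact rows: (month, region, amount).
sales = [
    ("2024-01", "North", 100),
    ("2024-01", "South", 50),
    ("2024-02", "North", 70),
]

# The "cube" maps each (month, region) cell to a pre-aggregated total.
# "ALL" marks a dimension that has been rolled up.
cube = defaultdict(int)
for month, region, amount in sales:
    cube[(month, region)] += amount
    cube[(month, "ALL")] += amount
    cube[("ALL", region)] += amount
    cube[("ALL", "ALL")] += amount

january_total = cube[("2024-01", "ALL")]
north_total = cube[("ALL", "North")]
grand_total = cube[("ALL", "ALL")]
```

Because the totals are computed at load time, an analytical query becomes a lookup rather than a scan over the individual transactions.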

4
Q

Transactional Workloads

A

Atomicity

Consistency

Isolation

Durability

5
Q

What is Atomicity

A

Each transaction is treated as a single unit, which succeeds completely or fails completely
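A minimal sketch of atomicity using Python's built-in sqlite3 module. The accounts table, the names, and the transfer helper are hypothetical and exist only for illustration. The transfer credits one account and then tries to debit another. When the debit violates the CHECK constraint, the whole transaction is rolled back, including the credit that had already run.

```python
import sqlite3

# Hypothetical two-account ledger used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# Debiting 500 would overdraw alice, so the CHECK constraint rejects it.
ok = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer, both balances are unchanged: the credit to bob did not survive on its own.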

6
Q

What is Consistency

A

Transactions can only take the data in the database from one valid state to another

7
Q

What is Isolation

A

Concurrent execution of transactions leaves the database in the same state that sequential execution would have produced

8
Q

What is Durability

A

Once a transaction has been committed, it will remain committed

9
Q

What are Analytical Workloads

A

Used for data analysis and decision making

  • Summaries
  • Trends
  • Business information
10
Q

What is Data Processing

A

Convert Raw Data to Meaningful Information

11
Q

What is Batch Processing

A

Data elements are collected into a group. The whole group is then processed at a future time as a batch.

12
Q

What is Stream Processing

A

Each new piece of data is processed when it arrives
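To contrast the two processing styles, here is a small sketch that computes an average both ways. The sensor readings and the function names are hypothetical. The batch version waits for the whole group before producing one result. The stream version updates its result as each element arrives.

```python
from typing import Iterable, Iterator, List

def process_batch(readings: List[float]) -> float:
    """Batch: the whole group is collected first, then processed together."""
    return sum(readings) / len(readings)

def process_stream(readings: Iterable[float]) -> Iterator[float]:
    """Stream: each reading updates the result as soon as it arrives."""
    total, count = 0.0, 0
    for value in readings:
        total += value
        count += 1
        yield total / count  # running average after every new element

readings = [20.0, 22.0, 24.0]  # hypothetical sensor values
batch_result = process_batch(readings)
stream_results = list(process_stream(readings))
```

The batch call returns a single value once all data is available, while the stream generator yields an up-to-date value after every element.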

13
Q

Online Transaction Processing (OLTP)

A

For example, order systems that perform many small transactional updates

14
Q

Data Warehousing

A

Large amount of

fill in

15
Q

Tables

A

Data is stored in a table

Table consists of rows and columns

All rows have the same number of columns

Each column is defined by a data type

16
Q

Entity

A

Representation of an item which can be physical (such as a customer or a product), or virtual (such as an order).

Connected by relations

fill in

17
Q

What is Normalization

A

Data is normalized to

Reduce storage
Avoid data duplication
Improve data quality

18
Q

Normalized database schema

A

Primary Key and Foreign keys are used to define relationships

No data duplication exists (other than key values in 3rd Normal Form (3NF))

Data is retrieved by joining tables together in a query
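The points above can be sketched with Python's built-in sqlite3 module. The customers and orders tables and their contents are hypothetical. Customer names are stored once, orders refer to them through a foreign key, and a join puts the two back together at query time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    product     TEXT NOT NULL
);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders VALUES (10, 1, 'Laptop'), (11, 1, 'Mouse'), (12, 2, 'Monitor');
""")

# Only the key value is repeated in orders; the name is recovered with a join.
rows = conn.execute("""
    SELECT c.name, o.product
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

# An order that points at a non-existent customer is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (13, 99, 'Keyboard')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```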

19
Q

Relational Database

A

Type of DB that uses the relational data model

20
Q

SQL

A

Structured Query Language

21
Q

Index

A

Optimizes search queries for faster data retrieval

Reduces the amount of data pages that need to be read to retrieve the data in an SQL Statement
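One way to see the effect is to ask the database for its query plan before and after creating an index. The sketch below uses SQLite through Python's sqlite3 module. The table, the index name, and the plan helper are hypothetical, and the exact wording of the plan text varies between SQLite versions, so treat the comments as expectations rather than guaranteed output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER, product TEXT)"
)

def plan(conn, sql):
    """Return the human-readable detail text of SQLite's query plan."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT product FROM orders WHERE customer_id = 42"

plan_before = plan(conn, query)  # expected to describe a scan of the whole table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(conn, query)   # expected to describe a search using the index
```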

22
Q

View

A

A view is a virtual table based on the result set of a query

Views are created to simplify the query

Combine relational data into a single pane view

Restrict access to a table while allowing users to access non-confidential data
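A minimal sketch of a view that hides a confidential column, again using sqlite3. The employees table and the employee_directory view are hypothetical. SQLite itself has no per-user permissions, so this only shows what the view exposes. In a server database, access would typically be granted on the view instead of on the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT,
    department  TEXT,
    salary      INTEGER          -- confidential column
);
INSERT INTO employees VALUES (1, 'Ada', 'Engineering', 120000),
                             (2, 'Grace', 'Research', 130000);

-- The view exposes only the non-confidential columns.
CREATE VIEW employee_directory AS
    SELECT name, department FROM employees;
""")

# The view is queried exactly like a table.
cursor = conn.execute("SELECT * FROM employee_directory ORDER BY name")
columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
```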

23
Q

Non-relational collections can have

A

Multiple entities in the same collection or container with different fields

Have a different, non-tabular schema

Are often defined by labeling each field with the name it represents
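As an illustration, the sketch below keeps two different entities in one collection, each labeling its own fields. The documents and field names are made up for this example, and a plain Python list stands in for the collection of a real document store.

```python
import json

# Hypothetical documents in one collection; each carries its own field names.
collection = [
    {"id": 1, "type": "customer", "name": "Ada", "email": "ada@example.com"},
    {"id": 2, "type": "product", "name": "Laptop", "price": 999.0,
     "tags": ["electronics"]},
]

# The schema lives in the data: every document lists the fields it has.
fields_per_document = [sorted(doc.keys()) for doc in collection]

# Because the structure travels with the data, a document survives a JSON round trip.
serialized = json.dumps(collection[1])
restored = json.loads(serialized)
```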

24
Q

What is semi-structured data

A

Data structure defined within the actual data by fields. Format/file types include

JSON
AVRO
ORC
Parquet

25
Q

What is Unstructured Data

A

Does not naturally contain fields
E.g., video, audio, media streams, documents

Often processed to extract data organization and to categorize or identify “structures”

fill in

26
Q

What is NoSQL

A

Loose term used to describe non-relational databases:

Key-value stores
Document based
Column family databases
Graph databases

27
Q

What is a graph database

A

Stores entities, centered on the relationships between them

Enables applications to perform queries traversing a network of nodes and edges
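A traversal query can be sketched in plain Python with an adjacency map. The "reports to" graph and the reachable helper are hypothetical. A real graph database would run this kind of traversal through its own query language rather than application code.

```python
from collections import deque

# Hypothetical graph: nodes are people, edges point to their direct reports.
edges = {
    "Sue": ["Bob", "Ann"],
    "Bob": ["Tim"],
    "Ann": [],
    "Tim": [],
}

def reachable(graph, start):
    """Breadth-first traversal: every node reachable from `start`, in visit order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order

# Everyone in Sue's reporting chain, nearest levels first.
team = reachable(edges, "Sue")
```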

28
Q

Steps in Data Journey

A

Data Ingestion
Data Processing
Data Exploration

29
Q

Data Ingestion

A

Process of obtaining and importing data for immediate use or storage in a database

30
Q

Data Processing

A

Takes the data in raw form, cleans it, and converts it into a more meaningful format

ETL - Extract, Transform, and Load

ELT - Extract, Load, and Transform
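A minimal ETL sketch using only the Python standard library. The CSV text, the payments table, and the function names are hypothetical. The raw rows are extracted, then cleaned (names normalized, non-numeric amounts dropped), then loaded into a database table. In an ELT flow, the raw rows would be loaded first and the cleaning step would run inside the target system.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract with untidy names and one bad amount.
raw_csv = "name,amount\n ada ,10.5\nGRACE,n/a\nlinus,7\n"

def extract(text):
    """Extract: read the raw rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: clean names and drop rows whose amount is not numeric."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows that cannot be converted
        cleaned.append((row["name"].strip().title(), amount))
    return cleaned

def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(raw_csv)))
loaded = conn.execute("SELECT name, amount FROM payments ORDER BY name").fetchall()
```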

31
Q

Data Exploration

A

Query the data and create graphical representations of information and data

32
Q

Data Visualization

A

A business model can contain an enormous amount of information - there are techniques to analyze and understand the information in your models

Reporting

Business Intelligence (BI)

Data Visualization

33
Q

Data analytics

A

Discipline that covers the entire range of data management tasks - analysis, data collection, organization, storage, and the tools and techniques

Descriptive
Diagnostic
Predictive
Prescriptive
Cognitive
34
Q

Prescriptive analytics

A

What ACTIONS to take to achieve goal or target?

35
Q

Diagnostic analytics

A

WHY something happened?

Take what we collected from descriptive analytics and dive deeper

36
Q

Predictive analytics

A

What WILL happen

Use historical data to determine what will happen in future

38
Q

Cognitive analytics

A

What MIGHT happen if CIRCUMSTANCES change

Intelligent technologies

Combine AI, ML, Deep Learning to apply human brain-like intelligence to do certain tasks