Introduction to Data Literacy Flashcards

1
Q

What can help us learn how data can be used to connect the dots and create value?

A

Data Literacy

2
Q

The ability to read, work with, analyze, and communicate insights with data.

A

Data Literacy

3
Q

Three main components of data literacy?

A

Reading data
Working with and analyzing data
Communicating insights with data

4
Q

What does reading data consist of?

A

Identifying data sources
Collecting data
Managing data

5
Q

These allow you to store, organize, and share your data

A

Databases

6
Q

Main tools for communication?

A

Visualizations and Storytelling

7
Q

In the DIKW pyramid, this consists of raw observations or measurements.

A

Data

8
Q

In the DIKW pyramid, this is unorganized and unprocessed, and does not have meaning (yet).

A

Data

9
Q

In the DIKW pyramid, this refers to raw data placed into context.

A

Information

10
Q

In the DIKW pyramid, this is typically done by organizing or aggregating data.

A

Information
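
As an illustration of turning data into information by aggregating, here is a minimal Python sketch. The store names and sales figures are hypothetical and only serve to show raw records being organized into per-store totals.

```python
from collections import defaultdict

# Hypothetical raw observations: (store, units_sold) pairs with no context yet.
raw_sales = [("north", 12), ("south", 7), ("north", 8), ("south", 13)]

def total_by_store(records):
    """Aggregate raw records into one total per store."""
    totals = defaultdict(int)
    for store, units in records:
        totals[store] += units
    return dict(totals)

# The per-store totals are "information": raw data placed into context.
store_totals = total_by_store(raw_sales)
```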

11
Q

In the DIKW pyramid, this refers to combining information and making connections to learn and gain meaning.

A

Knowledge

12
Q

In the DIKW pyramid, this is typically done by detecting patterns, making generalizations or predictions.

A

Knowledge

13
Q

In the DIKW pyramid, this is applied knowledge, or knowledge in action, as it allows us to act proactively.

A

Wisdom

14
Q

In the DIKW pyramid, this is typically done by combining knowledge logically to determine the course of action.

A

Wisdom

15
Q

Characteristics of insights?

A

Allow us to get closer to wisdom
Valuable, realistically achieved
Apply knowledge and take action
Approached, but not quite reached

16
Q

The process of using data to make an informed decision about a specific problem and acting upon it.

A

Data-driven decision making

17
Q

5 main steps that underpin every data-driven process:

A

Problem statement
Data Collection
Data Analysis
Communication
Action and reflection

18
Q

Problem statement answers the question:

A

What is the problem that you want to solve?

19
Q

Step in data-driven decision making that guides the data-driven process?

A

Problem statement

20
Q

Typical problem categories:

A

Describing the state of an organization or process
Diagnosing causes of events
Detecting anomalies or predicting events

21
Q

Guiding questions on how to define a problem:

A

What is the current situation?
What do we need to know?
Where do we want to be?

22
Q

A good problem statement is:

A

Clearly defined
Actionable
Realistic

23
Q

Data comes in different forms

A

Images and text
Network and spatial data

24
Q

Different sources of data?

A

Open Data and Internal data

25
Q

Open data includes:

A

Public databases and records

26
Q

The data type has an effect on:

A

How to collect the data
How to store the data
How to analyze the data

27
Q

Data in tabular form

A

Structured Data

28
Q

Easy to search and organize

A

Structured Data

29
Q

Requires less preprocessing

A

Structured Data

30
Q

Stored in relational databases

A

Structured Data

31
Q

Data without pre-defined structure

A

Unstructured data

32
Q

Difficult to search and organize

A

Unstructured data

33
Q

Requires more preprocessing

A

Unstructured Data

34
Q

Stored in document databases

A

Unstructured Data

35
Q

Examples of structured data

A

Spreadsheets
Data tables

36
Q

Examples of unstructured data

A

Images
Videos
Sound
Text

37
Q

Describes something with numbers

A

Quantitative

38
Q

Can be measured or counted

A

Quantitative

39
Q

Wider range of statistics and analysis methods

A

Quantitative

40
Q

Describes something with categories

A

Qualitative

41
Q

Can be observed

A

Qualitative

42
Q

More restricted range of statistics and analysis methods

A

Qualitative

43
Q

Allows the user to store, retrieve, and access the data

A

Database management system (DBMS)

44
Q

Different types of databases

A

Relational vs. document databases
Data warehouse vs. data lake

44
Q

Document databases store what type of data?

A

Unstructured data

45
Q

Relational databases store what type of data?

A

Structured Data

46
Q

Contains processed, organized data in preparation for future analysis

A

Data warehouse

47
Q

Used to store raw data that has not been prepared yet.

A

Data Lake

48
Q

Designing and optimizing database systems is typically the responsibility of a _______

A

Data engineer

49
Q

Data is stored on remote servers and accessed over the internet

A

Data storage in the cloud

50
Q

Data storage in the cloud has services provided by a specialized third party

A

True

51
Q

Cost-effective, but you still depend on a third party for security

A

True

52
Q

The purpose of ___________ is to move data from one database to another.

A

Pipelines

53
Q

Pipelines can automate data collection and storage via the _____________

A

ETL Process

54
Q

ETL process stands for?

A

Extract, Transform, and Load
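
The three ETL stages can be sketched in a few lines of Python. This is only an illustration: the CSV text, the `payments` table, and the cleaning rules are hypothetical, and a real pipeline would read from and write to actual systems rather than an in-memory database.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice this would come from a file or another database.
SOURCE_CSV = "name,amount\nalice, 10 \nbob,5\n"

def extract(text):
    """Extract: read rows from the CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: tidy names and convert amounts to integers."""
    return [(row["name"].strip().title(), int(row["amount"].strip())) for row in rows]

def load(rows, conn):
    """Load: write the cleaned rows into the destination table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)
loaded = conn.execute("SELECT name, amount FROM payments ORDER BY name").fetchall()
```

Keeping each stage in its own function makes it easier to automate the pipeline and to test the stages separately.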

55
Q

Making use of pipelines ensures what?

A

The availability of up-to-date and accurate data

56
Q

Accessing and Retrieving data from databases?

A

Querying

57
Q

Industry standard for querying?

A

SQL

58
Q

SQL stands for?

A

Structured Query Language
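
To show what querying with SQL looks like, here is a small sketch using Python's built-in `sqlite3` module. The `orders` table and its rows are hypothetical; the query itself uses standard `SELECT`, `GROUP BY`, and `ORDER BY` clauses.

```python
import sqlite3

# Build a tiny in-memory database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ana", 20.0), ("ben", 35.5), ("ana", 4.5)],
)

# Retrieve total spending per customer, highest first.
query = """
SELECT customer, SUM(total) AS spent
FROM orders
GROUP BY customer
ORDER BY spent DESC
"""
spend_by_customer = conn.execute(query).fetchall()
```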

59
Q

Another way to leverage the data available in databases?

A

Dashboards

60
Q

An alternative, non-technical way of collecting, managing, and sharing data between teams.

A

Dashboards

61
Q

Provides information at a glance?

A

Dashboards

62
Q

Receives data from a linked database

A

Dashboards

63
Q

Data is presented in a very visual way

A

Dashboards

64
Q

A multipurpose tool used for exploratory data analysis and for communication

A

Dashboards

65
Q

Dirty data is categorized as what?

A

Incorrect
Incomplete
Inconsistent

66
Q

Caused by human error, technical issues, or issues with the data collection process

A

Dirty data

67
Q

Consists of data that is incorrect or inconsistent

A

Data Errors

68
Q

Data errors are typically caused by _____________ error in recording the value or the format

A

Human or technical

69
Q

Techniques to counter data errors:

A

If the original value or valid format is known: correct the data
If unknown: drop the data
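
The correct-or-drop rule could look like the following Python sketch. The country codes and the `KNOWN_FIXES` mapping are hypothetical; the point is only that values with a known valid form are corrected and the rest are dropped.

```python
# Hypothetical mapping from known mis-recorded forms to the valid format.
KNOWN_FIXES = {"us": "US", "u.s.": "US", "uk": "UK"}
VALID = {"US", "UK"}

def clean_countries(values):
    """Keep valid values, correct known errors, and drop everything else."""
    cleaned = []
    for value in values:
        key = value.strip().lower()
        if value in VALID:
            cleaned.append(value)             # already correct
        elif key in KNOWN_FIXES:
            cleaned.append(KNOWN_FIXES[key])  # valid format is known: correct
        # otherwise the original value is unknown: drop it
    return cleaned

result = clean_countries(["US", " us", "U.S.", "uk", "??"])
```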

70
Q

When data is incomplete, what do we call it?

A

Missing data

71
Q

Missing data will be problematic if:

A

Many data points are missing
There are underlying patterns in the missing data

72
Q

What techniques can we use to counter missing data?

A

Dropping data
Imputation
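
Both techniques can be illustrated with a short Python sketch. The readings are made up, missing values are represented as `None`, and mean imputation is just one simple choice among several possible imputation strategies.

```python
from statistics import mean

# Hypothetical sensor readings; None marks a missing value.
readings = [4.0, None, 6.0, 8.0, None]

def drop_missing(values):
    """Dropping: remove the missing data points entirely."""
    return [v for v in values if v is not None]

def impute_mean(values):
    """Imputation: replace each missing point with the mean of the observed ones."""
    fill = mean(drop_missing(values))
    return [fill if v is None else v for v in values]

dropped = drop_missing(readings)
imputed = impute_mean(readings)
```

Dropping shrinks the dataset, while imputation keeps its size but introduces estimated values, so the better choice depends on how much data is missing and why.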

73
Q

Societal bias can be reflected in data

A

Data Bias

74
Q

Leads to unrepresentative data and results

A

Data Bias

75
Q

Techniques to counter or avoid data bias:

A

Sound data collection process
Awareness in conclusions
Explainable AI models

76
Q

Set of techniques to counter data problems

A

Data Cleaning

77
Q

Important preparation step for any data analysis

A

Data Cleaning

78
Q

Not all data problems are completely solvable

A

True

79
Q

Four main types of analytics:

A

Descriptive Analytics
Diagnostic Analytics
Predictive Analytics
Prescriptive Analytics

80
Q

What is being asked in Descriptive analytics?

A

What happened?

81
Q

What is being asked in Diagnostic Analytics?

A

Why is it happening?

82
Q

What is being asked in Predictive Analytics?

A

What will happen?

83
Q

What is being asked in Prescriptive Analytics?

A

What should we do?

84
Q

What type of analytics is responsible for finding the root causes of events?

A

Diagnostic Analytics

85
Q

What type of analytics summarizes and visualizes the data?

A

Descriptive Analytics

86
Q

What type of analytics identifies the possible outcomes and the probability that they will happen?

A

Predictive Analytics

87
Q

What type of analytics determines the best course of action given the outcome we want to achieve?

A

Prescriptive Analytics

88
Q

Common techniques for Descriptive analytics

A

Descriptive statistics
Visualizations
Outlier Detection
EDA
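
Two of these techniques, descriptive statistics and outlier detection, can be sketched with Python's standard `statistics` module. The order counts are hypothetical, and the 1.5 × IQR rule used here is one common convention for flagging outliers, not the only one.

```python
from statistics import mean, median, quantiles

# Hypothetical daily order counts; the last value is suspiciously high.
orders = [10, 12, 11, 13, 12, 40]

def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }

def iqr_outliers(values):
    """Flag values more than 1.5 * IQR outside the first or third quartile."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return [v for v in values if v < q1 - 1.5 * spread or v > q3 + 1.5 * spread]

summary = summarize(orders)
outliers = iqr_outliers(orders)
```

Note how the single outlier pulls the mean well above the median, which is itself a useful descriptive finding.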

89
Q

Why should we use descriptive analytics?

A

Get to know the data
Investigate relationships in the data
Preparation for more advanced techniques

90
Q

Focus on exploring the data:
Assessing main characteristics
Finding relationships, patterns or groups
Suggesting hypotheses for future analysis

A

Exploratory Data Analysis

91
Q

Groundwork for further analysis but also valuable on its own

A

EDA

92
Q

Why use diagnostic analytics?

A

Find potential causes of events or reasons for behaviors
Investigate causal relationships
Suggest solutions based on the identified causes

93
Q

Common techniques of Diagnostic Analytics:

A

Drill-down analytics
Correlation and regression analysis
Hypothesis testing
Root cause analysis

94
Q

Formal set of steps to look beyond superficial causes that have a direct effect

A

Root cause Analysis

95
Q

Steps of Root cause analysis

A

Define the event
Collect relevant data
Determine Contributing factors
Find root causes
Recommend possible solutions

96
Q

Why use Predictive analytics?

A

Anticipate most likely outcomes
Forecast a process or sequence
Estimate an unknown based on the information that is available

97
Q

Two types of machine learning models:

A

Classification-Based
Regression-Based

98
Q

Common techniques used in Predictive Analytics:

A

Machine Learning Models
Time Series forecasting
Predictive text analysis

99
Q

Predicting housing prices based on neighborhood characteristics

A

Regression-based

99
Q

Predicting cancellation of subscriptions

A

Classification-based

99
Q

Predicting sales revenue over time

A

Time series Forecasting
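
As a rough illustration of forecasting a value over time, here is a naive moving-average baseline in Python. The revenue figures and the window size are assumptions, and real time series forecasting usually relies on more capable models than this.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = series[-window:]
    return sum(recent) / len(recent)

# Hypothetical monthly sales revenue.
revenue = [100, 110, 120, 130]
next_month = moving_average_forecast(revenue)
```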

100
Q

Predicting whether an email is spam or not

A

Predictive text analysis

101
Q

Steps in Predictive Modeling

A

Define the outcome
Collect and Prepare data
Build Predictive model
Interpret and evaluate the model
Implement / Fine-tune

102
Q

In the predictive modeling phase, data is split into __________________ to build the predictive model

A

Training and Test Set

103
Q

Predictions are interpreted and evaluated on the test data, using predetermined metrics such as accuracy (the percentage of correct predictions)

A

True
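
The split and the accuracy metric can be sketched in plain Python. The function names, the 25% test fraction, and the fixed seed are assumptions for illustration; libraries such as scikit-learn provide their own versions of both.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and split them into a training set and a test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

train, test = train_test_split(list(range(8)))
score = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```

The model is built on `train` only, so that `test` remains unseen data for an honest evaluation.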

104
Q

Primary purpose of prescriptive analytics

A

To help decide what best to do

105
Q

Why use prescriptive analytics?

A

Make informed, data-driven decisions
Optimize processes
Mitigate Risks

106
Q

Common techniques used in Prescriptive Analytics

A

Rule-based systems
Reinforcement Learning
Scenario and simulation analysis

107
Q

Consists of generating a set of rules or decision logic to get the best outcome

A

Rule-based systems
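
A rule-based system can be as simple as an ordered list of conditions. The sketch below uses hypothetical inventory rules and thresholds purely to show the idea of explicit decision logic.

```python
def restock_decision(stock, weekly_demand):
    """Return an action from hypothetical inventory rules, checked in order."""
    if stock == 0:
        return "order urgently"
    if stock < weekly_demand:
        return "order now"
    if stock < 2 * weekly_demand:
        return "monitor"
    return "no action"

decisions = [restock_decision(s, 10) for s in (0, 5, 15, 30)]
```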

108
Q

An algorithm learns to achieve a particular objective or optimize an outcome by receiving positive and negative feedback when running through a set of actions.

A

Reinforcement Learning

109
Q

Running through a set of pre-determined scenarios or simulating multiple outcomes to help select the decision that leads to the best outcome

A

Scenario and simulation analysis
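
Scenario analysis can be sketched by evaluating every candidate decision against each predetermined scenario and keeping the best one on average. The demand scenarios, order quantities, price, and unit cost below are all hypothetical.

```python
# Hypothetical demand scenarios and candidate order quantities.
SCENARIOS = {"low": 50, "medium": 100, "high": 150}
OPTIONS = [50, 100, 150]

def profit(order, demand, price=5, unit_cost=2):
    """Profit when we buy `order` units and can sell at most `demand` of them."""
    sold = min(order, demand)
    return sold * price - order * unit_cost

def best_option(options, scenarios):
    """Pick the order quantity with the highest average profit across scenarios."""
    def average_profit(order):
        return sum(profit(order, d) for d in scenarios.values()) / len(scenarios)
    return max(options, key=average_profit)

choice = best_option(OPTIONS, SCENARIOS)
```

Averaging treats every scenario as equally likely; weighting scenarios by probability would be a natural refinement.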

110
Q

Predicts interests based on past behavior and provides recommendations based on the predicted interests

A

Recommendation engine
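
A toy recommendation engine can be sketched as follows. It assumes a hypothetical purchase history and scores unseen items by how much the other users' histories overlap with the target user's; production systems use far more sophisticated models.

```python
from collections import Counter

# Hypothetical past behavior: which items each user has already bought.
history = {
    "ana": {"book_a", "book_b"},
    "ben": {"book_a", "book_b", "book_c"},
    "cy": {"book_b", "book_c"},
}

def recommend(user, history):
    """Rank items the user has not seen by overlap with similar users."""
    seen = history[user]
    scores = Counter()
    for other, items in history.items():
        if other == user:
            continue
        overlap = len(seen & items)   # shared items as a similarity signal
        for item in items - seen:     # only score items the user lacks
            scores[item] += overlap
    return [item for item, _ in scores.most_common()]

suggestions = recommend("ana", history)
```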