Introduction to Data Literacy Flashcards
What helps us learn how data can be used to connect the dots and create value?
Data Literacy
The ability to read, work with, analyze, and communicate insights with data.
Data Literacy
Three main components of data literacy?
Reading data
Working with and analyzing data
Communicating insights with data
What does reading data consist of?
Identifying data sources
Collecting data
Managing data
Allow you to store, organize, and share your data
Databases
Main tools for communication?
Visualizations and Storytelling
In the DIKW pyramid, this consists of raw observations or measurements.
Data
In the DIKW pyramid, this is unorganized and unprocessed, and does not have meaning (yet).
Data
In the DIKW pyramid, this refers to raw data placed into context.
Information
In the DIKW pyramid, this is typically done by organizing or aggregating data.
Information
In the DIKW pyramid, this refers to combining information and making connections to learn and gain meaning.
Knowledge
In the DIKW pyramid, this is typically done by detecting patterns, making generalizations or predictions.
Knowledge
In the DIKW pyramid, this is applied knowledge, or knowledge in action, as it allows you to act proactively.
Wisdom
In the DIKW pyramid, this is typically done by combining knowledge logically to determine the course of action.
Wisdom
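One way to picture the four DIKW levels is as a small pipeline. The sketch below is only an illustration: the temperature readings, the function names, and the "schedule extra cooling" decision rule are all hypothetical and are not part of the flashcards.

```python
from statistics import mean

# Data: raw observations (hypothetical temperature readings in degrees Celsius).
readings = [
    ("Mon", 18.0), ("Mon", 20.0),
    ("Tue", 21.0), ("Tue", 23.0),
    ("Wed", 24.0), ("Wed", 26.0),
]


def to_information(rows):
    """Information: place raw data into context by aggregating it per day."""
    by_day = {}
    for day, value in rows:
        by_day.setdefault(day, []).append(value)
    return {day: mean(values) for day, values in by_day.items()}


def to_knowledge(daily_means):
    """Knowledge: detect a pattern (is every day warmer than the one before?)."""
    values = list(daily_means.values())
    return all(later > earlier for earlier, later in zip(values, values[1:]))


def to_wisdom(is_warming):
    """Wisdom: combine knowledge logically to choose a course of action."""
    return "schedule extra cooling" if is_warming else "no change"


information = to_information(readings)
knowledge = to_knowledge(information)
action = to_wisdom(knowledge)
```

Each function consumes the output of the level below it, which mirrors how the pyramid builds from data up to wisdom.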
Characteristics of insights?
Allow you to get closer to wisdom
Valuable, realistically achieved
Apply knowledge and take action
Approached, but not quite reached
The process of using data to make an informed decision about a specific problem and acting upon it.
Data-driven decision making
5 main steps that underpin every data-driven process:
Problem statement
Data collection
Data analysis
Communication
Action and reflection
Problem statement answers the question:
What is the problem that you want to solve?
Step in data-driven decision making that guides the data-driven process?
Problem statement
Typical problem categories:
Describing the state of an organization or process
Diagnosing causes of events
Detecting anomalies or predicting events
Guiding questions on how to define a problem:
What is the current situation?
What do we need to know?
Where do we want to be?
A good problem statement is:
Clearly defined
Actionable
Realistic
Data comes in different forms
Images and text
Network and spatial data
Different sources of data?
Open data and internal data
Open data includes:
Public databases and records
The data type has an effect on:
How to collect the data
How to store the data
How to analyze the data
Data in tabular form
Structured Data
Easy to search and organize
Structured Data
Requires less preprocessing
Structured Data
Stored in relational databases
Structured Data
Data without pre-defined structure
Unstructured data
Difficult to search and organize
Unstructured data
Requires more preprocessing
Unstructured Data
Stored in document databases
Unstructured Data
Examples of structured data
Spreadsheets
Data tables
Examples of unstructured data
Images
Videos
Sound
Text
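The difference in preprocessing effort can be shown with a minimal sketch. The table, the sentence, and the variable names below are made up for illustration. The same two ages are stored once in tabular form and once in free text.

```python
import csv
import io
import re

# Structured: tabular text with named columns (hypothetical sample).
table_text = "name,age\nAda,36\nGrace,45\n"
rows = list(csv.DictReader(io.StringIO(table_text)))
ages_from_table = [int(row["age"]) for row in rows]  # look up by column name

# Unstructured: free text with no pre-defined structure (hypothetical sample).
note = "Ada is 36 years old, while Grace turned 45 last week."
# There is no column to query, so the numbers must be extracted with a pattern.
# This only works because we assume every number in the sentence is an age.
ages_from_text = [int(match) for match in re.findall(r"\d+", note)]
```

The structured version can be queried by column name directly, while the unstructured version needs an extraction step that depends on assumptions about the text.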
Describes something with numbers
Quantitative
Can be measured or counted
Quantitative
Wider range of statistics and analysis methods
Quantitative
Describes something with categories
Qualitative
Can be observed
Qualitative
More restricted range of statistics and analysis methods
Qualitative
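As a rough illustration of why quantitative data supports a wider range of statistics, the sketch below summarizes one quantitative and one qualitative variable using Python's standard library. The sample values and variable names are hypothetical.

```python
from collections import Counter
from statistics import mean, median, mode

# Quantitative: measured values (hypothetical heights in centimeters).
heights_cm = [160, 172, 168, 181, 172]

# Qualitative: observed categories (hypothetical eye colors).
eye_colors = ["brown", "blue", "brown", "green", "brown"]

# Numbers support arithmetic summaries such as mean, median, and range.
quantitative_summary = {
    "mean": mean(heights_cm),
    "median": median(heights_cm),
    "range": max(heights_cm) - min(heights_cm),
}

# Categories can be counted and the most frequent one reported,
# but a mean of eye colors would be meaningless.
qualitative_summary = {
    "mode": mode(eye_colors),
    "counts": Counter(eye_colors),
}
```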
Allows the user to store, retrieve, and access the data
Database management system (DBMS)
Different types of databases
Relational vs. document databases
Data warehouse vs. data lake
Document databases store what type of data?
Unstructured data
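The contrast between the two database types can be sketched as follows. The relational side uses Python's built-in sqlite3 module with an in-memory database. The document side is only a stand-in: a JSON-serialized dictionary is used here as an assumption to suggest how a document store keeps a record without a fixed schema, and it is not a real document database. The table, field names, and values are hypothetical.

```python
import json
import sqlite3

# Relational: rows must fit a schema that is declared in advance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Grace", "New York")],
)
london_names = [
    row[0]
    for row in conn.execute("SELECT name FROM customers WHERE city = ?", ("London",))
]
conn.close()

# Document-style (stand-in): each record carries its own fields,
# so free text and nested lists can be stored without a table definition.
document = {
    "id": 1,
    "name": "Ada",
    "notes": "Called about a late delivery.",
    "attachments": ["photo.jpg"],
}
stored = json.dumps(document)   # serialize the document for storage
restored = json.loads(stored)   # read it back unchanged
```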