Module 2 - Understand the power of data Flashcards
What is data-inspired decision-making?
Explores different data sources to find out what they have in common
What is an algorithm?
A process or set of rules to be followed for a specific task
What are the potential dangers of relying entirely on data-driven decision making?
- over-reliance on historical data
- a tendency to ignore qualitative insights
- potential biases in data collection and analysis
Quantitative Data
Specific and objective measures of numerical facts
Qualitative Data
Subjective or explanatory measures of qualities and characteristics
What are the two main types of data visualizations?
Reports and Dashboards
What is a report?
Static collection of data given to stakeholders periodically
What is a dashboard?
Monitors live, incoming data
Report pros and cons
Pros:
1. Designed and sent out periodically
2. High level historical data
3. Easy to design
4. Pre-cleaned and sorted data
Cons:
1. Continual maintenance
2. Less visually appealing
3. Static
Dashboard pros and cons
Pros:
1. Dynamic, automatic, and interactive
2. More stakeholder access
3. Low maintenance
4. Nice to look at (visually appealing)
Cons:
1. Labor-intensive design
2. Can be confusing
3. Potentially uncleaned data
Pivot table
A data summarization tool that is used in data processing. Pivot tables are used to summarize, sort, reorganize, group, count, total, or average data stored in a database
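A minimal sketch of the grouping and totaling a pivot table performs, written in plain Python. The sample rows and the function name `pivot_sum` are hypothetical and only illustrate the idea; a real spreadsheet or database does this for you.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, product, amount)
rows = [
    ("East", "Widget", 100),
    ("East", "Gadget", 150),
    ("West", "Widget", 200),
    ("East", "Widget", 50),
]

def pivot_sum(records):
    """Total the amount for each (region, product) pair."""
    table = defaultdict(lambda: defaultdict(int))
    for region, product, amount in records:
        table[region][product] += amount
    # Convert the nested defaultdicts to plain dicts for readability
    return {region: dict(products) for region, products in table.items()}

summary = pivot_sum(rows)
print(summary)
```

Regions act as the pivot table's rows, products as its columns, and the summed amounts as its values.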
Metric
Single, quantifiable type of data that can be used for measurement
Revenue
Number of sales x sales price
Return on Investment (ROI)
Net profit over a period of time / Cost of investment
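A short worked sketch of the two formulas above, assuming ROI is the net profit earned over a period divided by the cost of the investment. The function names and sample figures are made up for illustration.

```python
def revenue(units_sold, sale_price):
    """Revenue = number of sales x sales price."""
    return units_sold * sale_price

def roi(net_profit, cost_of_investment):
    """ROI = net profit over a period / cost of investment."""
    if cost_of_investment == 0:
        raise ValueError("cost_of_investment must be non-zero")
    return net_profit / cost_of_investment

# Hypothetical figures: 200 sales at 15.00 each
print(revenue(200, 15.0))    # 3000.0
# Hypothetical figures: 2,500 net profit on a 10,000 investment
print(roi(2500.0, 10000.0))  # 0.25, i.e. a 25% return
```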
Customer retention rate
A company's ability to keep its customers over time
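The card does not give a formula. One commonly used calculation, shown here as an assumption rather than the course's definition, compares the customers kept during a period with the customers at its start. The function name and numbers are hypothetical.

```python
def retention_rate(customers_at_start, customers_at_end, new_customers):
    """Share of starting customers still present at the end of the period.

    Assumed formula: (end - new) / start.
    """
    if customers_at_start == 0:
        raise ValueError("customers_at_start must be non-zero")
    return (customers_at_end - new_customers) / customers_at_start

# Hypothetical period: start with 100 customers, end with 90, 10 of them new
print(retention_rate(100, 90, 10))  # 0.8, i.e. 80% retained
```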
Metric Goal
A measurable goal set by a company and evaluated using metrics
Dashboard Centralization Benefit
Data Analysts:
- Share a single source of data with all stakeholders
Stakeholders:
- Work with a comprehensive view of data, initiatives, objectives, projects, processes, and more
Dashboard Visualization Benefit
Data Analysts:
- Show and update live, incoming data in real time
Stakeholders:
- Spot changing trends and patterns more quickly
Dashboard Insightfulness Benefit
Data Analysts:
- Pull relevant information from different datasets
Stakeholders:
- Understand the story behind the numbers to keep track of goals and make data-driven decisions
Dashboard Customization Benefit
Data Analysts:
- Create custom views dedicated to a specific person, project, or presentation of the data
Stakeholders:
- Drill down to more specific areas of specialized interest or concern
What are the four dashboard benefits?
- Centralization
- Visualization
- Insightfulness
- Customization
What are the three most common categories of business dashboards?
- Strategic: focuses on long term goals and strategies at the highest level of metrics
- Operational: short-term performance tracking and intermediate goals
- Analytical: consists of the datasets and the mathematics used in these datasets
Characteristics of small data
- Describes a dataset made up of specific metrics over a short, well-defined time period
- Usually organized and analyzed in spreadsheets
- Likely to be used by small and midsize businesses
- Simple to collect, store, manage, sort, and visually represent
- Usually already a manageable size for analysis
Characteristics of big data
- Describes large, less-specific datasets that cover a long time period
- Usually kept in a database and queried
- Likely to be used by large organizations
- Takes a lot of effort to collect, store, manage, sort, and visually represent
- Usually needs to be broken into smaller pieces in order to be organized and analyzed effectively for decision-making
The four Vs of big data
- Volume - the amount of data
- Variety - the different kinds of data
- Velocity - how fast the data can be processed
- Veracity - the quality and reliability of the data
Attribute
An attribute is a characteristic or quality of data used to label a column in a table
Operator
A symbol that names the type of operation or calculation to be performed
Examples: =, +
Cell reference
A cell or a range of cells in a worksheet that can be used in a formula
Range of cells
A collection of two or more cells
#DIV/0! error
A formula is trying to divide a value in a cell by 0 or by an empty cell
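A rough Python sketch of the behavior described above, assuming an empty cell is represented by `None` and treated as 0. The function `divide_cells` is hypothetical and only mimics how a spreadsheet reports this error.

```python
def divide_cells(numerator, denominator):
    """Mimic spreadsheet division: an empty cell (None) counts as 0."""
    if denominator is None or denominator == 0:
        return "#DIV/0!"
    return (numerator or 0) / denominator

print(divide_cells(10, 2))     # 5.0
print(divide_cells(10, 0))     # #DIV/0!
print(divide_cells(10, None))  # #DIV/0! (empty cell as the divisor)
```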
#ERROR! (in Google Sheets only)
A formula can’t be interpreted as input (also known as a parsing error)
#N/A
Data in a formula can’t be found by the spreadsheet
#NAME?
A formula or function name isn’t understood
#NUM!
A formula or function calculation can’t be performed as specified
#VALUE!
A general error that could indicate a problem with a formula or referenced cells
#REF!
A formula is referencing a cell that is no longer valid or has been deleted
Function
A preset command that automatically performs a specific process or task using the data
Problem Domain
The specific area of analysis that encompasses every activity affecting or affected by the problem
Structured Thinking
The process of recognizing the current problem or situation, organizing available information, revealing gaps and opportunities, and identifying the options
Scope of work (SOW)
An agreed-upon outline of the work you’re going to perform on a project
Statement of work (SOW)
A document that clearly identifies the products and services a vendor or contractor will provide to an organization. It includes objectives, guidelines, deliverables, schedule, and costs.
Context
The condition in which something exists or happens
Stakeholders
People who have invested time, interest, and resources into the projects you’ll be working on as a data analyst