Unit 5 Flashcards
4 Types of Data Visualization
- Bar Charts
- Histograms
- Cross Tabs
- Scatter Plots
Data wise, what are bar charts good for?
- To see which values are present, the range, clusters of values, etc. in one column
- Find how data is structured
- Few distinct values
Data wise, what are histograms good for?
- To see which values are present, the range, clusters of values, etc. in one column
- Find how data is structured
- Many distinct values
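A minimal sketch of the "few values" versus "many values" difference, using made-up columns (`colors` and `ages` are hypothetical, not from the unit). With few distinct values, each value can get its own bar; with many, the values are grouped into ranges (bins) first, which is what a histogram does.

```python
from collections import Counter

# Hypothetical column with few distinct values: one bar per value (bar chart).
colors = ["red", "blue", "red", "green", "blue", "red"]
bar_counts = Counter(colors)

# Hypothetical column with many distinct values: group into bins (histogram).
ages = [3, 7, 12, 15, 18, 21, 24, 29, 33, 38]
bin_width = 10  # assumed bin size, chosen only for illustration
hist_counts = Counter((age // bin_width) * bin_width for age in ages)

# Print a rough text version of each chart.
for value, count in bar_counts.most_common():
    print(f"{value:>6}: {'#' * count}")
for start in sorted(hist_counts):
    print(f"{start:>2}-{start + bin_width - 1:<2}: {'#' * hist_counts[start]}")
```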
Data wise, what are cross tabs good for?
- To find relationships and patterns across different columns and identify how changes to one value impact another
- String values and few distinct values
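As a rough illustration, a cross tab can be built by counting how often each pair of values appears together. The `rows` data below (pet type and where the pet lives) is invented for the example.

```python
from collections import Counter

# Hypothetical two-column data: (pet, home) for each survey response.
rows = [
    ("cat", "indoor"),
    ("dog", "outdoor"),
    ("cat", "indoor"),
    ("dog", "indoor"),
    ("cat", "outdoor"),
]

# Count each (pet, home) combination; this is the body of the cross tab.
crosstab = Counter(rows)

pets = sorted({pet for pet, _ in rows})
homes = sorted({home for _, home in rows})

# Print the table: one row per pet, one column per home.
print("pet".ljust(6) + "".join(home.ljust(9) for home in homes))
for pet in pets:
    cells = "".join(str(crosstab[(pet, home)]).ljust(9) for home in homes)
    print(pet.ljust(6) + cells)
```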
Data wise, what are scatter plots good for?
- To find relationships and patterns across different columns and identify how changes to one value impact another
- Numeric values (no strings) and many distinct values
What do single column visualizations help with?
Evaluating data structure
What do two column visualizations help with?
Finding the relationship and patterns between values
Crowdfunding
the practice of obtaining funds or money from a large number of people via the Internet
Causation
a relationship between two pieces of data in which a change in one is the cause of a change in the other
Crowdsourcing
the practice of obtaining input or information from a large number of people via the Internet
Correlation
a relationship between two pieces of data, typically referring to the amount that one varies in relation to the other.
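One common way to put a number on "the amount that one varies in relation to the other" is the Pearson correlation coefficient, sketched below. This is one possible measure, not necessarily the one the unit has in mind, and the `hours` and `scores` columns are made up. A value near 1 or -1 indicates a strong relationship, but it does not by itself show causation.

```python
import math

def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two columns vary together, and how each varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied and quiz score for five students.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 6, 8, 10]

r = pearson(hours, scores)
print(round(r, 3))
```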
Crowdsourced data
Collecting data from others so you can analyze it
Others give you data for you to analyze
Open data
Sharing data with others so they can analyze it
You give data to others for them to analyze
Big data
Collect huge amounts of data to learn from it
You collect data to learn from it
Citizen science
scientific research conducted in whole or part by distributed individuals, many of whom may not be scientists, who contribute relevant data to research using their own computing devices
Cleaning data
a process that makes the data uniform without changing its meaning (e.g., replacing all equivalent abbreviations, spellings, and capitalizations with the same word)
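A small sketch of cleaning, assuming a hypothetical column of city names entered in inconsistent ways. The `CANONICAL` lookup table and the `clean` helper are illustrative names, not part of any library.

```python
# Hypothetical lookup: every known variant maps to one agreed spelling.
CANONICAL = {
    "ny": "New York",
    "n.y.": "New York",
    "new york": "New York",
}

def clean(value):
    """Normalize spacing and capitalization, then replace known variants."""
    key = value.strip().lower()
    # Unknown values are kept (minus stray spaces) so their meaning is unchanged.
    return CANONICAL.get(key, value.strip())

raw = ["NY", "new york", " N.Y. ", "Boston"]
cleaned = [clean(v) for v in raw]
print(cleaned)
```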
Filtering data
choosing a smaller subset of a data set to use for analysis, for example by eliminating / keeping only certain rows in a table (does not change data, but looks at a specific part of it)
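For instance, filtering could look like the sketch below, which keeps only the rows that match a condition. The `projects` table and its field names are invented for illustration; note that the original list is left unchanged.

```python
# Hypothetical table: one dictionary per row.
projects = [
    {"name": "Board game", "state": "NY", "pledged": 500},
    {"name": "Art book", "state": "CA", "pledged": 1200},
    {"name": "Podcast", "state": "NY", "pledged": 300},
]

# Keep only the rows where state is "NY"; the other rows are ignored, not deleted.
ny_projects = [row for row in projects if row["state"] == "NY"]

print(len(projects), "rows in the full data set")
print(len(ny_projects), "rows after filtering")
```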
Data Bias
data that does not accurately reflect the full population or phenomenon being studied