QM - Data Visualization Flashcards
- is the graphical representation of data through visual elements such as charts, graphs, maps, or timelines.
Data visualization
- Human brains decode information through patterns. Researchers found that human brains try to detect patterns in our environment all the time because it _____________.
makes learning easier.
- helps you break down, process, and present information in a visual context. This way, the brain needs less time and effort to digest the information than it would to analyze the same data in a table.
Data visualization
- The effectiveness of data visualization: research by _______ at Stanford University.
Robert Horn
- ___% of participants made an immediate decision following presentations that used an overview map.
64%
- The use of visual language is proven to increase meeting effectiveness and efficiency, leading to ___% shorter meetings.
24%
- Groups using visual language experienced a ___% increase in reaching consensus compared to groups that did not use visuals.
21%
- Combined verbal and visual communication increases credibility and influence rate to ___%, compared to 17% when only verbal communication is used.
43%
Advantages of Data Visualization
* Easily sharing information.
* Interactively explore opportunities.
* Visualize patterns and relationships.
Disadvantages of Data Visualization
* Biased or inaccurate information.
* Correlation doesn’t always mean causation.
* Core messages can get lost in translation.
Types of Data Visualization
- tables
- pie charts and stacked bar charts
- line charts and area charts
- histograms
- scatter plots
- heat maps
- tree maps
- These graphs are divided into sections that represent parts of a whole. They provide a simple way to organize data and compare the sizes of the components to one another.
Pie charts and stacked bar charts
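As a rough illustration of the "parts of a whole" idea, the Python sketch below converts raw amounts into the percentage shares that a pie chart or stacked bar would display. The `sales` figures and region names are made up for the example.

```python
# Made-up amounts per category (hypothetical data for illustration).
sales = {"North": 50, "South": 30, "East": 20}

total = sum(sales.values())

# Each slice of a pie chart is sized by its share of the total.
shares = {region: 100 * amount / total for region, amount in sales.items()}

for region, share in shares.items():
    print(f"{region:<6} {share:5.1f}%")
```

Because every share is computed against the same total, the slices always add up to 100%, which is what makes these charts suitable only for parts of a single whole.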
- These visuals show change in one or more quantities by plotting a series of data points over time and are frequently used in predictive analytics. Line graphs use lines to show these changes, while area charts connect data points with line segments, stack variables on top of one another, and use color to distinguish between variables.
Line charts and area charts
- use lines to show changes in one or more quantities over time, while area charts connect data points with line segments, stack variables on top of one another, and use color to distinguish between variables.
Line graphs
- This graph plots a distribution of numbers using a bar chart (with no spaces between the bars), representing the quantity of data that falls within a particular range. This visual makes it easy for an end user to identify outliers within a given dataset.
Histograms
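The binning step behind a histogram can be sketched in a few lines of Python. This is a minimal illustration, not a plotting recipe: the helper name `bin_counts`, the sample `ages`, and the bin width of 10 are all assumptions chosen for the example, and the range labels assume integer data.

```python
from collections import Counter

def bin_counts(values, bin_width):
    """Count how many values fall into each equal-width range.

    Returns a dict mapping the lower edge of each range to its count.
    (Hypothetical helper for illustration only.)
    """
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Made-up sample data; 95 sits far from the other values.
ages = [21, 23, 25, 27, 31, 32, 34, 38, 41, 95]
histogram = bin_counts(ages, 10)

# Text rendering: one '#' per observation in each range.
for lower, count in histogram.items():
    print(f"{lower:>3}-{lower + 9:<3} {'#' * count}")
```

The value 95 lands in a range of its own, far from the others, which is how an outlier tends to show up in a histogram.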
- These visuals are beneficial in revealing the relationship between two variables, and they are commonly used within regression data analysis. However, these can sometimes be confused with bubble charts, which are used to visualize three variables via the x-axis, the y-axis, and the size of the bubble.
Scatter plots
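To make the link between scatter plots and regression concrete, here is a small Python sketch that computes the Pearson correlation and a least-squares line for paired data. The function name `pearson_and_line` and the sample points are assumptions for illustration; a statistics library would normally do this for you.

```python
import math

def pearson_and_line(xs, ys):
    """Return (r, slope, intercept) for paired observations.

    r is the Pearson correlation; slope and intercept describe the
    ordinary least-squares line y = slope * x + intercept.
    (Hypothetical helper for illustration only.)
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept

# Made-up points that would appear as dots on a scatter plot.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r, slope, intercept = pearson_and_line(xs, ys)
print(f"r={r:.3f}, line: y = {slope:.2f}x + {intercept:.2f}")
```

A bubble chart would add a third list of values and map it to marker size; the two-variable calculation above does not cover that case.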
- These graphical displays are helpful in visualizing behavioral data by location. This can be a location on a map, or even on a webpage.
Heat maps
- display hierarchical data as a set of nested shapes, typically rectangles. Treemaps are great for comparing the proportions between categories via their area size.
Tree maps
Continuous Numeric Data
* Appropriate charts: Histograms, box plots
* Appropriate descriptive statistics: Mean, Standard Deviation or Variance, Range (Maximum/Minimum), Median (more appropriate for skewed data)
- When your data is a set of numbers within a range. The data can take any value over that interval. If the interval is 0 to 10, then the data can take values of 2, 3.4, 7.6, 0.391234, and so on.
Continuous Numeric Data
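The descriptive statistics listed for continuous data can be computed with Python's standard `statistics` module. The sketch below uses made-up measurements with one large value to show why the median is often preferred for skewed data.

```python
import statistics

# Made-up continuous measurements; the last value skews the data to the right.
values = [2.0, 3.4, 7.6, 5.5, 4.1, 48.0]

mean = statistics.mean(values)
median = statistics.median(values)
stdev = statistics.stdev(values)        # sample standard deviation
variance = statistics.variance(values)  # sample variance
value_range = (min(values), max(values))

print(f"mean={mean:.2f} median={median:.2f}")
print(f"stdev={stdev:.2f} variance={variance:.2f} range={value_range}")
```

The single large value pulls the mean well above the median, while the median stays near the bulk of the data.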
- Is when your data fits into categories without ranks, for example: ‘Red’, ‘Green’, ‘Blue’, or ‘yes’/‘no’. While the colors or responses are different, Red is not higher or lower than Green.
Nominal Categorical Data
Nominal Categorical Data
* Appropriate charts: Bar charts
* Appropriate descriptive statistics: Frequency table, mode
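A frequency table and mode for nominal data can be sketched with `collections.Counter` from the Python standard library. The list of colors is made-up sample data.

```python
from collections import Counter

# Made-up nominal observations: categories with no inherent order.
colors = ["Red", "Green", "Blue", "Red", "Green", "Red"]

# Frequency table: category -> number of occurrences.
frequency = Counter(colors)

# The mode is the most frequent category.
mode, mode_count = frequency.most_common(1)[0]

for category, count in frequency.most_common():
    print(f"{category:<6} {count}")
print(f"mode: {mode} ({mode_count} occurrences)")
```

These counts are also the bar heights a bar chart of the same data would show.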
- This is when your data has distinct categories that have an order to them, like the ‘low’, ‘medium’, ‘high’ settings on a machine, or ‘3 months’, ‘4 months’, ‘5 months’ as distinct time units. Months would be Continuous if you were measuring survival time in months, but if they are recorded as set points, for example the month on a treatment, they are Ordinal.
Ordinal Categorical Data
Ordinal Categorical Data
* Appropriate charts: Bar charts
* Appropriate descriptive statistics: Frequency table, Median, 1st and 3rd Quartile
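For ordinal data, the median and quartiles can be found by ranking the categories first. The sketch below is one possible approach, assuming a ‘low’ < ‘medium’ < ‘high’ ordering and made-up observations. It takes the quartiles as the medians of the lower and upper halves, which is only one of several quartile conventions.

```python
import statistics

# Assumed category order, lowest to highest.
LEVELS = ["low", "medium", "high"]
RANK = {level: i for i, level in enumerate(LEVELS)}

# Made-up ordinal observations.
settings = ["low", "high", "medium", "medium", "low", "high", "medium"]

# Convert categories to ranks so they can be sorted and split.
ranks = sorted(RANK[s] for s in settings)

# median_low always returns an observed value, so it maps back to a category.
median_level = LEVELS[statistics.median_low(ranks)]

# Quartiles as medians of the lower and upper halves (middle value excluded
# when the number of observations is odd).
half = len(ranks) // 2
q1_level = LEVELS[statistics.median_low(ranks[:half])]
q3_level = LEVELS[statistics.median_low(ranks[-half:])]

print(f"Q1={q1_level}, median={median_level}, Q3={q3_level}")
```

A mean is deliberately not computed here, since the gaps between ordinal categories are not assumed to be equal.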
Relationship between 2 ordinal or nominal variables
* Appropriate charts: Grouped bar charts, side-by-side bar charts
* Appropriate descriptive statistics: Crosstabs
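A crosstab counts how often each combination of two categorical values occurs. The following Python sketch builds one from made-up (color, answer) pairs using only the standard library.

```python
from collections import Counter

# Made-up paired observations: (nominal color, yes/no answer).
responses = [
    ("Red", "yes"), ("Red", "no"), ("Green", "yes"),
    ("Red", "yes"), ("Blue", "no"), ("Green", "yes"),
]

# Count each (row, column) combination; missing combinations read as 0.
crosstab = Counter(responses)

rows = sorted({color for color, _ in responses})
cols = sorted({answer for _, answer in responses})

# Print a simple text table with one row per color.
print(f"{'':<6}" + "".join(f"{c:>5}" for c in cols))
for r in rows:
    print(f"{r:<6}" + "".join(f"{crosstab[(r, c)]:>5}" for c in cols))
```

Each row of this table corresponds to one cluster of bars in a grouped bar chart.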
Relationship between 1 ordinal or nominal variable and 1 continuous variable
* Appropriate charts: Side-by-side box plots, stacked or side-by-side histograms
* Appropriate descriptive statistics: Grouped means and standard deviations
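Grouped means and standard deviations can be sketched by collecting the continuous values under each category and summarizing each group. The observations below are made up, and the sample standard deviation is an assumption about which variant is wanted.

```python
import statistics
from collections import defaultdict

# Made-up observations: (ordinal machine setting, continuous measurement).
observations = [
    ("low", 4.0), ("low", 6.0), ("medium", 7.0),
    ("medium", 9.0), ("high", 12.0), ("high", 16.0),
]

# Collect the continuous values under each category.
groups = defaultdict(list)
for setting, value in observations:
    groups[setting].append(value)

# Mean and sample standard deviation per group.
summary = {
    setting: (statistics.mean(vals), statistics.stdev(vals))
    for setting, vals in groups.items()
}

for setting, (mean, stdev) in summary.items():
    print(f"{setting:<7} mean={mean:.2f} stdev={stdev:.2f}")
```

Side-by-side box plots show the same per-group comparison graphically.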
Data Visualization Best Practices
- know your audience
- choose an effective visual
- keep it simple
- Think about who your visualization is designed for and then make sure your data visualization fits their needs.
Know your audience(s)
- Specific visuals are designed for specific types of datasets.
Choose an effective visual
- Data visualization tools can make it easy to add all sorts of information to your visual.
Keep it simple