Lesson 3: Data Visualization Flashcards
Sources of Data
- Manual
- Automated
__________ is a measure that captures the typical user’s experience.
Median
__________ is a measure of symmetry.
Skewness
We use ____________ if the NUMBER OF DATA is BIGGER than the range.
Frequency distribution
We use _______________ if the RANGE is bigger than the number of data.
Histogram
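The rule on the two cards above can be sketched in Python. This is only an illustration: the function name `choose_chart` is hypothetical, and the cards do not say what to do when the two numbers are equal, so sending a tie to "histogram" is an assumption.

```python
def choose_chart(values):
    """Pick a chart type by comparing the number of data with the range."""
    number_of_data = len(values)
    data_range = max(values) - min(values)
    # Assumption: a tie (number of data == range) falls through to "histogram";
    # the flashcards do not cover that case.
    if number_of_data > data_range:
        return "frequency distribution"
    return "histogram"


print(choose_chart([1, 2, 2, 3, 3, 3, 4]))  # 7 values, range of 3
print(choose_chart([10, 250, 900]))         # 3 values, range of 890
```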
Title
Horizontal axis = Bins
Vertical axis = Frequency
Overflow
Underflow
Histogram Distribution
________ is used for qualitative data, particularly nominal values.
Mode
____________ The statistical measure that identifies a single value as representative of an entire distribution.
Central Tendency
____________ A graph or dataset organized to show the frequency of occurrence of each possible outcome of a repeatable event observed many times.
Frequency Distribution
The middle number in a sorted ascending or descending list of numbers.
Median
A type of distribution in which more values are concentrated on the right side of the distribution graph while the left tail of the distribution graph is longer.
Negatively skewed
The difference between the largest and smallest values in a set of data.
Range
The value that appears most frequently in a dataset.
Mode
Excel formula to get the mean of a dataset
=AVERAGE(Dataset)
The distribution that occurs when the long tail is on the right side.
Positively Skewed
The average of the given set of values
Mean
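As a quick check on the definitions of mean, median, mode, and range in the cards above, here is a minimal Python sketch. It uses the standard `statistics` module, and the sample list `scores` is made up for illustration.

```python
from statistics import mean, median, mode

# Hypothetical sample data, chosen only to illustrate the definitions.
scores = [2, 3, 3, 5, 9]

data_mean = mean(scores)                # average of the values
data_median = median(scores)            # middle number of the sorted list
data_mode = mode(scores)                # value that appears most frequently
data_range = max(scores) - min(scores)  # largest value minus smallest value

print(data_mean, data_median, data_mode, data_range)
```

The 9 pulls the mean above the median here, which is the "mean dragged to the right" effect in a small dataset.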
A chart that plots the distribution of a numeric variable’s values as a series of bars.
Histogram
Skewness > +1
Positively skewed
Skewness < -1
Negatively skewed
Skewness between -1 and +1
Relatively Symmetric
How to find the typical value?
- Get the range
- Central tendencies
- Number of data
- Check whether it is a frequency distribution or a histogram
- Visualize
Steps for Histogram distribution
- Define the column
- Find the range
- Count the data set
- If the range is bigger than the number of data, use a histogram
- Highlight the data set (the defined column)
- Insert - Histogram
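What the inserted histogram does with the data can be sketched in Python: values are grouped into bins, and anything below or above the chosen limits lands in the underflow or overflow bin. The function name `histogram_counts`, its parameters, and the bin convention (each bin includes its lower edge and excludes its upper edge) are assumptions made for this sketch; Excel's own bin edges may be handled differently.

```python
import math


def histogram_counts(values, low, high, width):
    """Count values into equal-width bins between low and high.

    Returns (underflow, bins, overflow), where bins[i] counts values in
    [low + i * width, low + (i + 1) * width).
    """
    n_bins = math.ceil((high - low) / width)
    underflow = 0
    overflow = 0
    bins = [0] * n_bins
    for x in values:
        if x < low:
            underflow += 1   # below the underflow limit
        elif x >= high:
            overflow += 1    # at or above the overflow limit
        else:
            bins[int((x - low) // width)] += 1
    return underflow, bins, overflow


# Hypothetical data: limits 10 and 30, bin width 10.
print(histogram_counts([1, 5, 12, 18, 25, 31, 40], low=10, high=30, width=10))
```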
Steps for Frequency Distribution
- Get the range
- Count the data with =COUNT(Dataset) to check whether to use a frequency distribution or a histogram
- Create a list from the minimum value to the maximum value
- Get the frequency of each value in a column titled "Frequency of (name of data)": =COUNTIF(Dataset, specific value)
- Highlight both columns: the minimum-to-maximum list and the frequency of the dataset
- Insert - frequency distribution
- Since we only need the frequency of the dataset, delete the min & max list
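The counting part of these steps (the min-to-max list plus one COUNTIF per value) can be mirrored in Python. This is a sketch that assumes whole-number data; `frequency_table` is a hypothetical helper name.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, frequency) pairs for every whole number from min to max."""
    counts = Counter(values)
    # A value that never occurs gets a frequency of 0, as COUNTIF would return.
    return [(v, counts[v]) for v in range(min(values), max(values) + 1)]


# Hypothetical data set of whole numbers.
for value, frequency in frequency_table([2, 3, 3, 5]):
    print(value, frequency)
```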
Format of skewness
=SKEW(Dataset)
answer,
comment: if within (-1, 1), symmetrical distribution, Typical Value = mean
if > 1, positively skewed, Typical Value = median
if < -1, negatively skewed, Typical Value = median
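The rule above can be sketched in Python. The skewness here is the adjusted sample skewness, which is the formula Excel documents for SKEW as far as I know; treat that match as an assumption and compare against a spreadsheet before relying on it. The function names are hypothetical.

```python
from statistics import mean, median, stdev


def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    if s == 0:
        raise ValueError("all values are identical; skewness is undefined")
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)


def typical_value(values):
    """Apply the card's rule: mean if skewness is within (-1, 1), else median."""
    skew = sample_skewness(values)
    if skew > 1:
        return "positively skewed", median(values)
    if skew < -1:
        return "negatively skewed", median(values)
    return "relatively symmetric", mean(values)


# Hypothetical data: one large value creates a long right tail.
print(typical_value([1, 1, 1, 1, 1, 2, 2, 3, 20]))
print(typical_value([1, 2, 3, 4, 5]))
```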
Visualization Histogram: Positively skewed.
• Skewed to the right
• The right tail is longer
• Mean dragged to the right
• Values of data extend to the right
Visualization Histogram: Negatively skewed.
• Skewed to the left
• The left tail is longer
• Mean to the left of the median
• Values of data extend to the left
Visualization Histogram: Symmetric Distribution
The mean, median, and mode almost coincide.
Scatter Plots and Correlation Examples:
- Positive Correlation
- Negative Correlation
- No Correlation
Scatter Plot Relationship
- Positive Relationship
- Negative Relationship
- No Relationship
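One common way to put a number on the three scatter plot relationships is the Pearson correlation coefficient, sketched below in Python. The cards do not name this measure or give any cutoff, so the coefficient, the function names, and the 0.1 threshold for "no correlation" are all illustrative assumptions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def describe_relationship(xs, ys, threshold=0.1):
    """Label the relationship by the sign of r.

    Assumption: |r| below `threshold` is called "no correlation"; the
    value 0.1 is only an illustrative cutoff, not from the lesson.
    """
    r = pearson_r(xs, ys)
    if r > threshold:
        return "positive correlation"
    if r < -threshold:
        return "negative correlation"
    return "no correlation"


# Hypothetical paired data.
print(describe_relationship([1, 2, 3, 4], [2, 4, 6, 8]))
print(describe_relationship([1, 2, 3, 4], [8, 6, 4, 2]))
print(describe_relationship([1, 2, 3, 4], [1, -1, -1, 1]))
```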