Class 3 Spring Flashcards
What are the three measures of Central Tendency?
Mean, median, mode
What are the measures of Dispersion?
Range, IQR, variance, standard deviation
What type of data is the mode primarily used for?
Categorical data
What is the definition of “mode”?
The value with the most occurrences in the data set
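A minimal Python sketch of finding the mode, assuming the standard-library `statistics` module. The `colors` and `scores` lists are made-up illustration data, not from the class.

```python
from statistics import mode, multimode

# Hypothetical categorical data: the mode is the most frequent category.
colors = ["red", "blue", "red", "green", "blue", "red"]
most_common = mode(colors)

# When several values tie for most frequent, multimode returns all of them.
scores = [1, 1, 2, 2, 3]
tied_modes = multimode(scores)

print(most_common)  # red
print(tied_modes)   # [1, 2]
```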
What is the formula to calculate the range?
Highest number - Lowest number
What is variance represented by?
s²
What is the relationship between variance and standard deviation?
Standard deviation is the square root of variance
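The range, variance, and standard deviation can be checked on a small example. This is a sketch using Python's standard-library `statistics` module. The `data` list is invented for illustration, and `variance` and `stdev` here are the sample versions, which divide by n - 1.

```python
from math import isclose, sqrt
from statistics import stdev, variance

# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest value minus lowest value.
data_range = max(data) - min(data)

# Sample variance (s squared): sum of squared deviations divided by n - 1.
s_squared = variance(data)

# Sample standard deviation (s): the square root of the variance.
s = stdev(data)

print(data_range)                  # 7
print(isclose(s, sqrt(s_squared))) # True
```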
What does the Interquartile Range (IQR) measure?
Dispersion around the median (the spread of the middle 50% of the data)
How is the median represented in a Box-and-Whisker plot?
A dark line inside the box
What percent of data falls between Q1 and the median in a Box-and-Whisker plot?
25%
What is a characteristic of a right (positive) skewed distribution?
Tail on the right side
Fill in the blank: The _______ is the average squared distance from the mean.
Variance
What percent of data typically falls within one standard deviation of the mean?
About 68% (often rounded to 70%)
What type of data is IQR used with?
Numerical data
What do the whiskers of a Box-and-Whisker plot represent?
The data outside of the box; the whiskers attempt to capture the spread beyond Q1 and Q3
True or False: Outliers are defined by hard-and-fast rules.
False
What is the purpose of identifying outliers in data?
Outliers can point to data collection or data entry errors, strong skew, or interesting properties of the data
What are the shapes/modalities that a distribution can have?
Uniform, unimodal, bimodal, multimodal
What does skewness describe in a dataset?
Asymmetry of the distribution
Common examples of right skewed data include _______.
People’s incomes, house prices, number of accident claims
Which measures of central tendency are primarily used for numerical data?
Mean and median
What is the main question to answer when describing a dataset regarding central tendency?
Where is the “middle” of the dataset?
What is the primary measure of dispersion for categorical data?
Range
What does the term βdeviationβ refer to in statistics?
Distance from the mean
What is the first step in building a Box-and-Whisker plot?
Drawing a line denoting the median
Fill in the blank: The _______ is the typical deviation of observations from the mean.
Standard deviation
What percent of data typically falls within two standard deviations of the mean?
About 95%
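The one- and two-standard-deviation shares can be checked empirically. The sketch below assumes roughly bell-shaped data, simulated here with `random.gauss`. The seed, sample size, and the helper name `share_within` are arbitrary choices for illustration, and skewed data can give noticeably different shares.

```python
import random
from statistics import mean, stdev

# Simulated bell-shaped data (an assumption; real data may not look like this).
random.seed(0)
sample = [random.gauss(0, 1) for _ in range(10_000)]

m = mean(sample)
s = stdev(sample)

def share_within(k):
    """Fraction of observations within k standard deviations of the mean."""
    return sum(abs(x - m) <= k * s for x in sample) / len(sample)

within_one = share_within(1)  # expected to land near 0.68
within_two = share_within(2)  # expected to land near 0.95

print(round(within_one, 2), round(within_two, 2))
```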
What statistical notation is used for the standard deviation of a sample?
s
What is a common method to visualize median and IQR?
Box-and-Whisker plots
What is the significance of the first and third quartiles in a Box-and-Whisker plot?
They define the boundaries of the box representing the middle 50% of the data
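The quartiles behind the box can be computed directly. This sketch uses `statistics.quantiles` from the Python standard library with its default method; other tools use slightly different quartile conventions and may return slightly different values. The `data` list is invented for illustration.

```python
from statistics import quantiles

# Hypothetical numerical data.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 returns the three cut points: Q1, the median, and Q3.
q1, med, q3 = quantiles(data, n=4)

# The IQR is the length of the box: Q3 minus Q1.
iqr = q3 - q1

print(q1, med, q3)  # 3.0 6.0 9.0
print(iqr)          # 6.0
```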
What is a common example of right/positively skewed data?
People’s incomes
Other examples include mileage on used cars, reaction times, house prices, and number of accident claims.
What is a common example of left/negatively skewed data?
Number of fingers
Most people have ten fingers, but some may lose one or more. The age at death in wealthy countries is also negatively skewed.
What are two top choices for visualizing skewed data?
- Histograms
- Box-and-whisker plots
In a skewed distribution, where does the mode typically lie?
Under the peak of the distribution.
What happens to the mean in a skewed distribution?
The mean gets pulled in the direction of the skew.
What is the relationship between skewness and the difference between the mean and median?
The greater the skewness, the greater the difference between the mean and the median.
If the data are skewed, which measure of central tendency may not provide a good estimate?
The mean.
Fill in the blank: The median and IQR are only sensitive to numbers near _______.
Q1, the median, and Q3.
What is the interquartile range (IQR)?
A measure of statistical dispersion: Q3 - Q1, the spread of the middle 50% of the data.
Which measure is likely more useful for understanding a typical individual loan?
The median.
Which measure is likely more useful for understanding the total amount needed for 1,000 loans?
The mean.
True or False: In very skewed data, the mean provides a good estimate of the data center.
False.
What happens to the mean and median in right-skewed data?
Median < Mean.
What happens to the mean and median in left-skewed data?
Mean < Median.
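A small sketch of how skew pulls the mean away from the median, using Python's `statistics` module. Both lists are made-up examples: `right_skewed` has one very large value and `left_skewed` has one very small value.

```python
from statistics import mean, median

# Hypothetical right-skewed data (e.g. incomes in thousands): long tail on the right.
right_skewed = [30, 35, 40, 45, 50, 55, 400]

# Hypothetical left-skewed data (e.g. test scores): long tail on the left.
left_skewed = [1, 60, 70, 75, 80, 85, 90]

# The mean is pulled toward the tail; the median stays near the bulk of the data.
print(median(right_skewed), round(mean(right_skewed), 1))  # 45 93.6
print(median(left_skewed), round(mean(left_skewed), 1))    # 75 65.9
```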
What is the summary statistic for centrality of data in symmetrical data?
Mean.
What is the summary statistic for data spread?
Standard deviation.
What statistical tools may not be usable with skewed data?
- t-test
- ANOVA
What does the median represent in skewed data?
A better estimate of the center than the mean.
What is a characteristic of robust statistics in relation to skewness?
They are stable in the presence of extreme observations.
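Robustness can be illustrated by replacing one value with an extreme observation and comparing summaries. This is a sketch with invented data; the helper `iqr` is a hypothetical convenience function built on `statistics.quantiles` with its default method.

```python
from statistics import mean, median, quantiles, stdev

def iqr(values):
    """Interquartile range: Q3 minus Q1."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

# Hypothetical data, then the same data with the largest value made extreme.
original = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
with_outlier = original[:-1] + [1000]

# Robust statistics: the median and IQR do not move.
print(median(original), median(with_outlier))  # 6 6
print(iqr(original), iqr(with_outlier))        # 6.0 6.0

# Non-robust statistics: the mean and standard deviation change a lot.
print(mean(original), round(mean(with_outlier), 1))  # 6 95.9
print(stdev(with_outlier) > stdev(original))         # True
```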
What are examples of potentially skewed datasets?
- Sea Turtle Sizes
- Stats Test Scores
- Swim Times