Lecture 2 Flashcards
central tendency and dispersion
What are the measures of central tendency?
-Mean
-Mode
-Median
Mean: define and how to calculate
Definition: the average value of the set.
How to calculate: add up all the sample values, then divide by the sample size (the number of values, N).
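A minimal sketch of the mean calculation described above. The function name and the data are made up for illustration.

```python
def mean(samples):
    """Sum of all values divided by the number of values (N)."""
    return sum(samples) / len(samples)


data = [2, 4, 4, 5, 10]  # hypothetical sample, N = 5
print(mean(data))  # 25 / 5 = 5.0
```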
Explain variation with sample size (N). (Mean)
Variation with sample size: as sample size increases, the sample mean gets closer to the population mean.
Median: define and how to calculate
Definition: the middle value of the sample when the values are ranked in order.
How to calculate: rank the values in order and take the middle one. If there are two middle values (even N), add them and divide by 2.
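A short sketch of the median rule above, covering both the odd and the even case. The function name and data are illustrative only.

```python
def median(samples):
    """Middle value of the ranked sample; average of the two middle values if N is even."""
    ranked = sorted(samples)
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]
    return (ranked[mid - 1] + ranked[mid]) / 2


print(median([7, 1, 5]))     # ranked: 1, 5, 7 -> 5
print(median([7, 1, 5, 3]))  # ranked: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0
```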
Explain variation with sample size (N). (median)
Variation with sample size: as sample size increases, the sample median gets closer to the population median.
Mode define and how to calculate
Definition: the most frequently repeated value in the sample.
How to calculate: find the most repeated value.
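One way to find the most repeated value, sketched with a counter. Returning a list is an assumption made here so that ties (more than one mode) are visible; the data are made up.

```python
from collections import Counter


def modes(samples):
    """Return every value that ties for the highest count, in sorted order."""
    counts = Counter(samples)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 2, 3, 3, 3]))  # [3]
print(modes([1, 1, 2, 2, 5]))     # [1, 2] (two values tie)
```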
Explain variation with sample size (N). (Mode)
Variation with sample size: as sample size increases, the sample mode gets closer to the population mode.
what are the measures of dispersion?
-Range
-Interquartile range
-Standard error (SE)
-Standard deviation (SD)
-Coefficient of variation (CV)
Define and how to calculate Range
Definition: the distance between the maximum and minimum values.
How to calculate: Maximum value - Minimum value
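The range calculation above as a one-line sketch. The name value_range is chosen here only to avoid clashing with Python's built-in range; the data are made up.

```python
def value_range(samples):
    """Maximum value minus minimum value."""
    return max(samples) - min(samples)


print(value_range([3, 9, 4, 12, 7]))  # 12 - 3 = 9
```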
Explain variation with sample size (N). (Range)
Variation with sample size: as sample size increases, the sample range tends to widen and gets closer to the population range.
Explain variation with magnitude of the mean (μ). (Range)
Variation with magnitude of the mean (μ): the range tends to be larger when the mean is higher.
Note: a larger range alongside a higher mean indicates that the data points are spread further apart from each other, even though the central tendency (represented by the mean) has shifted towards a higher value.
Define and calculate interquartile range
Definition: the distance from the first quartile (Q1) to the third quartile (Q3).
Calculation: median of the upper half of the data (Q3) - median of the lower half of the data (Q1)
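A sketch of the median-of-halves method, with the IQR taken as Q3 minus Q1. It assumes that when N is odd the overall median is left out of both halves; other quartile conventions (such as the interpolation used by many libraries) can give slightly different values. Function names and data are illustrative.

```python
def _median(ranked):
    """Median of an already sorted list."""
    n = len(ranked)
    mid = n // 2
    return ranked[mid] if n % 2 else (ranked[mid - 1] + ranked[mid]) / 2


def iqr(samples):
    """Q3 - Q1, where Q1 and Q3 are the medians of the lower and upper halves."""
    ranked = sorted(samples)
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]
    # Assumption: skip the middle value when N is odd.
    upper = ranked[half + 1:] if n % 2 else ranked[half:]
    return _median(upper) - _median(lower)


print(iqr([1, 3, 5, 7, 9, 11, 13, 15]))  # Q1 = 4.0, Q3 = 12.0 -> 8.0
```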
Variation with sample size (Interquartile range-IQR)
Variation with sample size: as sample size increases, the sample IQR gets closer to the population's IQR.
Variation with magnitude of mean (μ) (IQR)
Variation with magnitude of mean (μ): the IQR tends to be larger when the mean is higher.
Note: the IQR tells you how much variation exists within the middle half of your data.
Standard deviation (SD) define and what does it indicate for a data set?
Definition: roughly the average distance of the data points from the mean (strictly, the square root of the average squared distance).
Indicates how much variation exists within a dataset; a low standard deviation means data points are close to the mean, while a high standard deviation means data points are spread further from the mean.
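To make the low-versus-high contrast concrete, here is a sketch that computes the SD for two made-up datasets with the same mean. It assumes the sample convention (dividing by N - 1); dividing by N instead gives the population SD.

```python
import math


def sample_sd(samples):
    """Square root of the summed squared deviations from the mean, divided by N - 1."""
    n = len(samples)
    m = sum(samples) / n
    squared_devs = [(x - m) ** 2 for x in samples]
    return math.sqrt(sum(squared_devs) / (n - 1))


tight = [9, 10, 10, 11]   # mean 10, values close to the mean
spread = [2, 6, 14, 18]   # mean 10, values far from the mean

print(sample_sd(tight))   # small SD
print(sample_sd(spread))  # much larger SD
```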
Left skewed distributions
the peak is on the right and it has a long tail on the left.
mean<median<mode (on x-axis)
Symmetrical distribution
left and right sides mirror each other
mean=median=mode
Right-skewed distribution (e.g. a Poisson distribution)
the peak is on the left side and it has a long tail on the right.
mean>median>mode (on x-axis)
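A quick check of the right-skew ordering on a made-up dataset with a long right tail. The ordering is a rule of thumb that holds for this example; it is not guaranteed for every skewed dataset.

```python
import statistics

# Hypothetical right-skewed sample: most values are small, one is far to the right.
data = [1, 1, 1, 2, 2, 3, 4, 5, 10]

mode = statistics.mode(data)      # most frequent value: 1
median = statistics.median(data)  # middle of 9 ranked values: 2
mean = statistics.mean(data)      # 29 / 9, a bit above 3

print(mean > median > mode)  # True
```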
Uniform distribution
a probability distribution where all possible outcomes are equally likely
Bimodal distribution
a type of probability distribution that exhibits two distinct peaks indicating the presence of two separate groups within the same dataset.
Multimodal
Has more than two peaks on a graph, indicating multiple clusters or distinct groups within the data.
Standard error
a measure of the statistical accuracy of an estimate. The expected average difference between your sample mean and the population mean.
Which of the 3 measures of dispersion is negatively related to sample size?
(smaller dispersion for larger sample sizes)
Standard error
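The negative relation to sample size can be seen from the usual estimate of the standard error of the mean, SE = SD / sqrt(N). The sketch below holds a made-up SD of 10 fixed and only changes N.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of N."""
    return sd / math.sqrt(n)


for n in (4, 16, 64, 256):
    print(n, standard_error(10.0, n))  # 5.0, 2.5, 1.25, 0.625
```

Each fourfold increase in N halves the SE, while the SD itself is unchanged.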
Which of the measures of dispersion does not get bigger when the mean of the sample gets bigger?
Coefficient of variation (CV)
Coefficient of variation
The SD controlling for the size of the mean (SD divided by the mean); or the SD of the sample if its mean were 1.0.
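A sketch of why the CV does not grow with the mean: the second made-up dataset is the first multiplied by 10, so its SD is 10 times larger but its CV (SD divided by the mean) stays the same. The sample SD from Python's statistics module is assumed here.

```python
import statistics


def cv(samples):
    """Coefficient of variation: sample SD divided by the mean."""
    return statistics.stdev(samples) / statistics.mean(samples)


small = [8, 10, 12]     # mean 10
large = [80, 100, 120]  # same shape, mean 100

print(statistics.stdev(small), statistics.stdev(large))  # SD grows with the mean
print(cv(small), cv(large))                              # CV is about 0.2 for both
```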