Wk 8: All About Means Flashcards
4x participants will ______ (equal/double/triple/halve) the accuracy of the results because the standard error of the difference of the means between the two groups will halve (its variance falls to a quarter).
double
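The arithmetic behind this card can be checked with a short sketch. It assumes two independent groups of equal size with a common standard deviation; the values sigma = 10.0 and 25 participants per group are made up for illustration, and `se_diff_of_means` is a hypothetical helper name.

```python
import math

def se_diff_of_means(sigma, n_per_group):
    """Standard error of the difference between two group means,
    assuming independent groups of equal size and equal sigma."""
    return math.sqrt(sigma**2 / n_per_group + sigma**2 / n_per_group)

sigma = 10.0   # hypothetical population standard deviation
n = 25         # hypothetical number of participants per group

se_small = se_diff_of_means(sigma, n)       # original sample size
se_large = se_diff_of_means(sigma, 4 * n)   # four times the participants

# The standard error shrinks by a factor of about 2,
# while the variance (its square) shrinks by a factor of about 4.
se_ratio = se_small / se_large
var_ratio = se_small**2 / se_large**2
print(se_ratio, var_ratio)
```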
What are the 3 steps to measure the quality of a prediction value?
- Measure prediction errors - the difference between your prediction value and each existing data point.
- x = prediction value (can be any number)
- x̅ = prediction value that minimises the sum of squared prediction errors = sample mean
- Square each prediction error (so every term is positive and the sum is easy to differentiate later)
- Add them all together: the smaller the sum of squared prediction errors, the more accurate the prediction value. The sum is 0 only when the prediction matches every data point exactly.
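The three steps can be sketched in a few lines. The data values and the list of candidate predictions are made up for illustration, and `sum_squared_errors` is a hypothetical helper name; the sketch only compares a handful of candidates rather than proving the minimum.

```python
def sum_squared_errors(data, prediction):
    """Steps 1-3: take each prediction error, square it, add them up."""
    return sum((value - prediction) ** 2 for value in data)

data = [2.0, 4.0, 4.0, 5.0, 10.0]      # made-up data points
sample_mean = sum(data) / len(data)

# Try a few prediction values x, including the sample mean.
candidates = [3.0, 4.0, sample_mean, 6.0, 7.0]
for x in candidates:
    print(x, sum_squared_errors(data, x))

# Among these candidates, the sample mean gives the smallest sum.
best = min(candidates, key=lambda x: sum_squared_errors(data, x))
print("best prediction:", best)
```

Note that the smallest sum here is still well above 0: the sample mean is the best single prediction, not a perfect one.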
What is the equation of sample mean?
x̅ = (x1 + x2 + ... + xn) / n
- x̅ = sample mean
- x1, x2 etc. = value of each data point
- n = number of data points
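What is the equation of sample standard deviation?
s = √( Σ (xi − x̅)² / (n − 1) ), i.e. the square root of the sum of squared prediction errors about the sample mean divided by n-1.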
The sample mean is an ______ of the population mean
unbiased estimator
What are the 3 features of the sample mean as an unbiased estimator of the population mean?
- The variability (spread) of this estimate becomes smaller as the sample size (n) increases (proportional to 1/√n).
- This implies that the sample mean is a more precise estimator of the population mean for larger samples.
- The shape of its distribution depends on sample size. The larger the sample size, the more symmetrical and concentrated the shape.
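A small simulation can illustrate how the spread of the sample mean shrinks with n. This is a sketch under stated assumptions: the population is taken to be normal with mean 50 and standard deviation 10 (both made up), and `spread_of_sample_means` is a hypothetical helper name.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def spread_of_sample_means(n, repeats=2000, mu=50.0, sigma=10.0):
    """Draw `repeats` samples of size n from an assumed normal population
    and return the standard deviation of their sample means."""
    means = []
    for _ in range(repeats):
        sample = [random.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)

spread_25 = spread_of_sample_means(25)     # theory: 10 / sqrt(25)  = 2.0
spread_100 = spread_of_sample_means(100)   # theory: 10 / sqrt(100) = 1.0
print(spread_25, spread_100)
```

The simulated spreads should land close to, but not exactly on, the theoretical values in the comments.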
What are 2 features of robustness?
- Outliers can affect the sample mean so always visualise data first (boxplots) before analysis.
- The median is a more robust measure of centrality for skewed distributions
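A short sketch of why the median is called robust, using made-up values with a single extreme data point added:

```python
import statistics

typical = [48, 50, 51, 52, 49]    # made-up data without outliers
with_outlier = typical + [500]    # same data plus one extreme value

mean_before = statistics.mean(typical)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(typical)
median_after = statistics.median(with_outlier)

# The mean is pulled strongly towards the outlier; the median barely moves.
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```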
Outliers can affect the _____ so always visualise data first (boxplots) before analysis.
sample mean
_______ is a more robust measure of centrality for skewed distributions
Median
What are 3 features of outliers?
- Note the outliers and dig into them
- If they represent the population as an atypical member, then explore why they are an outlier
- If they do not represent the population, then exclude them from analysis
What are 2 simple measures of sample variability (spread)?
- Range
- Interquartile range (suitable for more data)
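Both measures can be computed with the standard library, as sketched below on made-up data. Quartiles can be defined in several ways; this sketch relies on the default method of `statistics.quantiles`, so other tools may report a slightly different interquartile range for the same data.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 10, 12]   # made-up data points

# Range: distance between the largest and smallest value.
data_range = max(data) - min(data)

# Interquartile range: distance between the third and first quartile.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print("range:", data_range)
print("IQR:  ", iqr)
```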
The sum of squared prediction errors around the sample mean, once normalised, is the ______. What does that mean?
sample variance
- The raw sum of squared prediction errors grows as more data points are added, so it needs normalisation before it can describe spread
What is normalisation?
Divide the sum of squared prediction errors by n-1.
What are 3 features of normalisation?
- As the deviations from the sample mean always sum to 0, the last deviation can be deduced from the others, so there are n-1 pieces of independent information.
- Degrees of freedom: The number of independent pieces of information.
- Need to take square root afterwards so the units are not squared.
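The points above can be traced on a small made-up data set. The sketch checks that the deviations sum to 0, recovers the last deviation from the others, and compares the hand-computed results with Python's `statistics.variance` and `statistics.stdev`, which also divide by n-1.

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 5.0, 10.0]   # made-up data points
n = len(data)
mean = sum(data) / n

# Deviations from the sample mean always sum to 0 ...
deviations = [x - mean for x in data]
total_deviation = sum(deviations)

# ... so the last one is fixed by the other n-1 (degrees of freedom).
last_from_others = -sum(deviations[:-1])

# Normalise the sum of squared deviations by n-1, then take the
# square root so the result is back in the original units.
sample_variance = sum(d**2 for d in deviations) / (n - 1)
sample_std = math.sqrt(sample_variance)

print(total_deviation, last_from_others, deviations[-1])
print(sample_variance, statistics.variance(data))
print(sample_std, statistics.stdev(data))
```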
What is a parameter?
A numerical characteristic of a population (e.g. the population mean height or weight).