Wk 8: All About Means Flashcards

1
Q

4x participants will ______ (equal/double/triple/half) the accuracy of the results because the standard error of the difference of the means between the two groups will halve (its variance falls to a quarter).

A

double
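
A quick numeric check of this card, as a sketch only. It assumes two independent groups of equal size that share the same standard deviation; the value of `sigma` and the group sizes are made up for illustration.

```python
import math

def se_diff_means(sigma: float, n_per_group: int) -> float:
    """Standard error of the difference between two independent group means,
    assuming equal group sizes and a common standard deviation sigma."""
    return sigma * math.sqrt(2 / n_per_group)

sigma = 10.0                           # hypothetical standard deviation
se_small = se_diff_means(sigma, 25)    # 25 participants per group
se_large = se_diff_means(sigma, 100)   # 4x the participants

se_ratio = se_small / se_large         # how much the standard error shrinks
var_ratio = se_ratio ** 2              # how much the variance shrinks
print(se_ratio, var_ratio)
```

The standard error shrinks by a factor of 2 while the variance shrinks by a factor of 4, which is why 4x participants doubles the accuracy.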

2
Q

What are the 3 steps to measure the quality of a prediction value?

A
  1. Measure the prediction errors: the difference between your prediction value and each existing data point.
    • x = prediction value (can be any number)
    • x̅ = the prediction value that minimises the sum of squared prediction errors = sample mean
  2. Square each prediction error (so every term is non-negative and the sum is easy to differentiate later).
  3. Add them all together: the smaller the sum of squared prediction errors, the better the prediction value. The sum is 0 only when every data point equals the prediction.
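
The three steps can be sketched in a few lines. The data values and the helper name `sum_squared_errors` are made up for illustration; the point is only that the sample mean gives a smaller sum than other prediction values.

```python
def sum_squared_errors(data, x):
    """Steps 1-3: error for each point, squared, then added together."""
    return sum((value - x) ** 2 for value in data)

data = [2.0, 4.0, 4.0, 5.0, 10.0]      # made-up data points
sample_mean = sum(data) / len(data)    # 5.0

# Compare the sample mean against two other candidate prediction values.
sse_at_mean = sum_squared_errors(data, sample_mean)
sse_below = sum_squared_errors(data, 3.0)
sse_above = sum_squared_errors(data, 7.0)
print(sse_below, sse_at_mean, sse_above)   # 56.0 36.0 56.0
```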
3
Q

What is the equation of the sample mean?

A
  • x̅ = (x1 + x2 + … + xn) / n
  • x̅ = sample mean
  • x1, x2 etc. = value of each data point
  • n = number of data points
4
Q

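What is the equation of the sample standard deviation?

A
  • s = √( Σ (xi − x̅)² / (n − 1) )
  • s = sample standard deviation
  • xi = value of each data point, x̅ = sample mean
  • n − 1 = degrees of freedom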
5
Q

The sample mean is an ______ of the population mean

A

unbiased estimator

6
Q

What are the 3 features of the sample mean as an unbiased estimator of the population mean?

A
  1. The variability (spread) of this estimate becomes smaller as the sample size (n) increases (proportional to 1/√n).
  2. This implies that the sample mean is a more precise estimator of the population mean for larger samples.
  3. The shape depends on sample size. The larger the sample size, the more symmetrical and concentrated the shape.
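
A small simulation can illustrate the 1/√n behaviour. This is a sketch under assumptions: the population is taken to be normal with mean 50 and standard deviation 10, and the sample sizes, replication count and seed are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def spread_of_sample_means(n, replications=2000):
    """Standard deviation of the sample mean across many simulated samples of size n."""
    means = [
        statistics.fmean(random.gauss(50, 10) for _ in range(n))
        for _ in range(replications)
    ]
    return statistics.stdev(means)

spread_25 = spread_of_sample_means(25)     # theory: 10 / sqrt(25) = 2
spread_100 = spread_of_sample_means(100)   # theory: 10 / sqrt(100) = 1
ratio = spread_100 / spread_25             # theory: about 0.5
print(spread_25, spread_100, ratio)
```

Quadrupling the sample size should roughly halve the spread of the sample mean; the simulated ratio will only be close to 0.5, not exactly equal.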
7
Q

What are 2 features of robustness?

A
  1. Outliers can affect the sample mean so always visualise data first (boxplots) before analysis.
  2. The median is a more robust measure of centrality when the data contain outliers or follow a skewed distribution.
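
A minimal illustration of robustness, using made-up numbers: one extreme value moves the mean a long way but barely moves the median.

```python
import statistics

values = [30, 32, 35, 36, 40]       # made-up data
with_outlier = values + [400]       # same data plus one extreme value

mean_before = statistics.mean(values)
median_before = statistics.median(values)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(mean_before, median_before)   # 34.6 35
print(mean_after, median_after)     # 95.5 35.5
```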
8
Q

Outliers can affect the _____ so always visualise data first (boxplots) before analysis.

A

sample mean

9
Q

_______ is a more robust measure of centrality when the data contain outliers or follow a skewed distribution.

A

Median

10
Q

What are 3 steps for handling outliers?

A
  1. Note the outliers and dig into them.
  2. If they represent the population as atypical members, then explore why they are outliers.
  3. If they do not represent the population, then exclude them from analysis.
11
Q

What are 2 simple measures of sample variability (spread)?

A
  1. Range
  2. Interquartile range (suitable for more data)
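
Both measures can be computed with the standard library. This is a sketch with made-up data; note that quartile conventions differ between tools, and `statistics.quantiles` uses its default "exclusive" method here, so other software may report a slightly different IQR.

```python
import statistics

data = [1, 3, 4, 6, 7, 8, 10, 12, 15, 40]   # made-up values, one extreme point

# Range: distance between the largest and smallest value.
data_range = max(data) - min(data)

# Interquartile range: distance between the third and first quartile.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(data_range, iqr)
```

The range is driven entirely by the two most extreme points, while the IQR only looks at the middle half of the data.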
12
Q

The sum of squared prediction errors (about the sample mean) is the basis of the ______. What does that mean?

A

sample variance

  • With more data the sum of squared prediction errors keeps growing, so it has to be normalised before it can be used as a measure of spread.
13
Q

What is normalisation?

A

Divide the sum of squared prediction errors by n-1.

14
Q

What are 3 features of normalisation?

A
  1. As the sum of the deviations from the sample mean is always 0, the last deviation can be deduced from the others, so there are n-1 pieces of independent information.
  2. Degrees of freedom: The number of independent pieces of information.
  3. Take the square root afterwards (giving the sample standard deviation) so the units are not squared.
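
The normalisation steps can be followed by hand on a small made-up data set. The sketch below computes each piece explicitly and then compares against `statistics.variance` and `statistics.stdev`, which also divide by n-1.

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 5.0, 10.0]   # made-up data points
n = len(data)
mean = sum(data) / n                 # 5.0

# Deviations from the sample mean always add up to 0,
# which is why only n-1 of them are independent.
deviations = [x - mean for x in data]
sum_of_deviations = sum(deviations)

sum_sq = sum(d ** 2 for d in deviations)   # sum of squared prediction errors
variance = sum_sq / (n - 1)                # normalise by the degrees of freedom
std_dev = math.sqrt(variance)              # back to the original units

print(sum_of_deviations, sum_sq, variance, std_dev)   # 0.0 36.0 9.0 3.0
```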
15
Q

What is a parameter?

A

a numerical characteristic of a population (e.g. the population mean height or weight).

16
Q

What is a statistic?

A

A numerical characteristic of a sample (e.g. the sample mean). We use statistics to estimate unknown population parameters.

17
Q

________ describes the shape of a population distribution

A

Normal (Gaussian) distribution

18
Q

What are 3 features of Normal (Gaussian) distribution?

A
  1. Describe the distribution of observations
  2. Describe the distribution of statistics (e.g. sample mean, sample proportion)
  3. Normal distribution is determined by mean (location) and standard deviation (spread).
19
Q

What does a normal quantile-quantile plot compare?

A

the quantiles from our data against the quantiles of the theoretical normal distribution.

  • If our data is normally distributed, the points should lie close to the reference line (drawn in black) on the normal quantile-quantile plot.
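
The pairs behind a normal quantile-quantile plot can be built without any plotting library. This is a sketch under assumptions: the data are simulated from a normal distribution, the plotting positions (i - 0.5)/n are one common convention among several, and `pearson_r` is a hypothetical helper used only to summarise how straight the points are.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable
sample = sorted(random.gauss(170, 8) for _ in range(200))   # simulated data

n = len(sample)
standard_normal = statistics.NormalDist()   # mean 0, standard deviation 1
# Theoretical quantile matched to each ordered data point.
theoretical = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def pearson_r(xs, ys):
    """Correlation between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Points (theoretical[i], sample[i]) are what the plot shows.
r = pearson_r(theoretical, sample)
print(r)   # close to 1 when the points fall near a straight line
```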
20
Q

What is the 68-95-99.7 Rule?

A
  1. If our data is normally distributed, its histogram should follow a bell curve.
  2. Within 1 standard deviation of the mean are about 68% of the values (area under the bell curve)
  3. Within 2 standard deviations of the mean are about 95% of the values
  4. Within 3 standard deviations of the mean are about 99.7% of the values
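
The three percentages can be checked against the standard normal distribution using `statistics.NormalDist` (Python 3.8+). The rule rounds the values: the share within 2 standard deviations is nearer 95.4% than 95%.

```python
from statistics import NormalDist

z = NormalDist()   # standard normal: mean 0, standard deviation 1

# Area under the bell curve within k standard deviations of the mean.
shares = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(k, round(share * 100, 1))   # 68.3, 95.4, 99.7
```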
21
Q

What are 2 features of Central Limit Theorem?

A
  1. If the population is normally distributed, then the sample mean is normal for any sample size.
  2. If the population is not normally distributed, then the sample mean is still approximately normal for large enough samples, and gets closer to normal as the sample size increases.
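
The second point can be illustrated by simulation. This sketch assumes a strongly right-skewed exponential population and uses skewness (0 for a symmetric distribution) as a rough measure of how far the sample means are from normal; the sample sizes, replication count and seed are arbitrary.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

def skewness(values):
    """Average cubed z-score: about 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def sample_means(n, replications=5000):
    """Means of many samples of size n drawn from an exponential population."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(replications)
    ]

skew_n2 = skewness(sample_means(2))     # small samples: still clearly skewed
skew_n50 = skewness(sample_means(50))   # larger samples: much closer to symmetric
print(skew_n2, skew_n50)
```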
22
Q

What are 5 features of confidence intervals?

A
  1. A confidence interval is an interval estimate that might contain the true value of an unknown population parameter.
  2. The confidence level (usually 95%) is the proportion of intervals, built this way from repeated samples, that would contain the true parameter value. It is not the probability that individual data points land inside the interval.
  3. The margin of error is half the width of the confidence interval: the distance from the estimate to either end.
  4. Increasing the sample size narrows the confidence interval.
  5. A higher confidence level gives a wider interval on the bell curve.
    • A 99% confidence interval is ~33% wider than a 95% confidence interval.
    • A 100% confidence interval is infinitely wide.
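
A sketch of how interval width responds to confidence level and sample size. It assumes a normal-based interval for a mean with a known standard deviation; `z_interval`, `width` and all the numbers are made up for illustration. With normal critical values the 99% interval comes out about 31% wider than the 95% one; intervals based on the t distribution give a somewhat larger ratio for smaller samples.

```python
import math
from statistics import NormalDist

def z_interval(mean, sigma, n, level):
    """Normal-based confidence interval for a mean, assuming sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # critical value for this level
    margin = z * sigma / math.sqrt(n)           # margin of error = half the width
    return mean - margin, mean + margin

def width(interval):
    low, high = interval
    return high - low

w95 = width(z_interval(100, 15, 36, 0.95))
w99 = width(z_interval(100, 15, 36, 0.99))
w95_big = width(z_interval(100, 15, 144, 0.95))   # 4x the sample size

level_ratio = w99 / w95       # higher confidence level: wider interval
size_ratio = w95_big / w95    # larger sample: narrower interval
print(level_ratio, size_ratio)
```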
23
Q

What is a confidence interval?

A

an interval estimate that might contain the true value of an unknown population parameter.

24
Q

What is a confidence level?

A

(usually 95%) the proportion of intervals, built this way from repeated samples, that would contain the true parameter value. It is not the probability that individual data points land inside the interval.
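
One way to see what a 95% confidence level refers to is to repeat the sampling many times and count how often the interval captures the true mean. This is a sketch under assumptions: a normal population with a known standard deviation, and made-up values for the true mean, sample size, replication count and seed.

```python
import math
import random
import statistics
from statistics import NormalDist

random.seed(3)  # fixed seed so the run is repeatable

TRUE_MEAN, SIGMA, N = 50.0, 10.0, 30       # hypothetical population and sample size
z = NormalDist().inv_cdf(0.975)            # critical value for a 95% interval
margin = z * SIGMA / math.sqrt(N)          # margin of error

replications = 2000
hits = 0
for _ in range(replications):
    sample_mean = statistics.fmean(random.gauss(TRUE_MEAN, SIGMA) for _ in range(N))
    # Does this sample's interval contain the true population mean?
    if sample_mean - margin <= TRUE_MEAN <= sample_mean + margin:
        hits += 1

coverage = hits / replications
print(coverage)   # expected to be close to 0.95
```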

25
Q

What is a margin of error of the interval?

A

half the width of the confidence interval: the distance from the estimate to either end.

26
Q

Increased sample size will _____ (narrow/widen) the confidence interval.

A

narrow

27
Q

A higher confidence level gives a ____ (wider/narrower) confidence interval on the bell curve. How do 99% and 100% intervals compare with a 95% interval?

A

wider

  • A 99% confidence interval is ~33% wider than a 95% confidence interval.
  • A 100% confidence interval is infinitely wide.