WEEK 2: Variance and Sampling Flashcards

1
Q

Learning objectives:

A
  • Understand and explain the difference between a standard deviation and a standard error
  • Understand and explain a 95% confidence interval
  • Understand and explain error bar graphs
  • Understand and explain why a t value less than 1 means there is more error than effect (so no reliable effect)
2
Q

What is variance?

A

Variance is the variability of the data: how spread out the scores are around the mean.

3
Q

How is variance calculated?

A

Calculated by working out how much each score differs from the sample mean, squaring each difference, then adding them all up and dividing by the number of scores

Squaring the differences stops the negative and positive deviations from cancelling each other out

Dividing by n gives the variance of the data itself (use this when the data are the whole population)

Dividing by n-1 gives an estimate of the variance in the population when working with a sample from that population

Variance is in squared units, so it is difficult to see how it relates to the measure you have (the dependent variable). Taking the square root gets you back to where you started before squaring everything: this is the standard deviation
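
The steps on this card can be sketched in a few lines of Python. This is only an illustration: the scores are made up, and the function name `variance` and its `sample` flag are hypothetical choices, not part of the course material.

```python
import math

def variance(scores, sample=True):
    """Sum of squared deviations from the mean, divided by n-1 (sample) or n (population)."""
    n = len(scores)
    mean = sum(scores) / n
    # Square each deviation so negatives and positives do not cancel out
    sum_of_squares = sum((x - mean) ** 2 for x in scores)
    return sum_of_squares / (n - 1 if sample else n)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example data, mean = 5

population_variance = variance(scores, sample=False)  # divide by n
sample_variance = variance(scores, sample=True)       # divide by n-1
population_sd = math.sqrt(population_variance)        # back to original units

print(population_variance, sample_variance, population_sd)
```

Dividing by n-1 always gives a slightly larger value than dividing by n, and the gap shrinks as the number of scores grows.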

4
Q

What is standard deviation?

How is it calculated?

A

Standard deviation describes how data are dispersed around the mean, in the same units as the original measurement. It is the square root of the variance

Gives a unit of measurement for spotting outliers e.g. ±1SD, ±2SD; any data point more than 3SD either side of the mean is commonly treated as a statistical outlier
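
As a rough sketch of the 3SD rule above, each score can be converted to a distance from the mean in SD units and flagged if that distance exceeds 3. The data are made up, and the 3SD cut-off is the convention from this card rather than a universal rule.

```python
import statistics

# Made-up scores: fifteen typical values and one extreme value
scores = [48, 50, 52, 49, 51, 50, 47, 53, 50, 50, 49, 51, 52, 48, 50, 90]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)  # sample SD (divides by n-1)

# Distance of each score from the mean, measured in SDs
sd_units = [(x - mean) / sd for x in scores]

# Flag anything more than 3 SD either side of the mean
outliers = [x for x, z in zip(scores, sd_units) if abs(z) > 3]

print(outliers)
```

Note that with very small samples no single score can reach 3SD from the mean, so this check only makes sense with a reasonable number of scores.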

5
Q

What are the aims of stats?

A
  • To generalise from a sample
  • If you could test an entire population there would be no need for inferential stats as any effects found would be found in the entire population
  • We usually take a sample from a population and test it using stats to assess the probability that any effects found will also be found in the whole population
6
Q

What is sampling error?

A

Sampling from a population introduces an element of error

We need to know how much error there is in the sample data (how much the data differ from what we would see in the entire population)

We estimate the amount of deviation between the population and the sample to get standard error

7
Q

What is standard error?

A

The estimated amount of ‘deviation’ between the population and the sample. Because we do not have lots of samples from the population, we estimate it using the SD of the one sample we have. It allows us to compare effects against error

It is the standard deviation of the sampling distribution

Standard error of the mean = Sample SD / √n
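
A minimal sketch of the formula above, using Python's standard `statistics` module and made-up scores:

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example data

sample_sd = statistics.stdev(scores)                 # sample SD (divides by n-1)
standard_error = sample_sd / math.sqrt(len(scores))  # SE of the mean = SD / sqrt(n)

print(round(sample_sd, 3), round(standard_error, 3))
```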

8
Q

What is sampling distribution?

A

Say we take several samples from a population and calculate their mean values e.g. how fast on average a person drives in mph. These mean values would be different in each sample.

We put these mean values into a histogram. The distribution of the sample means is the sampling distribution, and we check whether it is normally distributed (forms a bell curve).

To estimate the deviation we will find between our sample and the population we need to calculate the SD of this sampling distribution
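
The idea can be simulated. The sketch below assumes a hypothetical population of driving speeds (mean 60 mph, SD 10 mph, both made up), draws many samples, and compares the SD of the sample means with population SD / √n. The seed and sizes are arbitrary choices.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of driving speeds in mph (made-up parameters)
population = [random.gauss(60, 10) for _ in range(20_000)]

n = 25               # size of each sample
num_samples = 2_000  # how many samples we draw

# The mean of each sample becomes one data point in the sampling distribution
sample_means = [
    statistics.mean(random.sample(population, n))
    for _ in range(num_samples)
]

# SD of the sampling distribution, calculated like any other SD
sd_of_means = statistics.stdev(sample_means)

# What the standard error formula predicts
predicted_se = statistics.pstdev(population) / math.sqrt(n)

print(round(sd_of_means, 2), round(predicted_se, 2))
```

The two printed values should be close to each other (both near 2 mph here), which is why SD / √n can stand in for the SD of the sampling distribution when only one sample is available.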

9
Q

The standard deviation of the sampling distribution…

A

So imagine the mean of each sample is just one data point. It becomes its own sort of larger experiment concentrated down, and we calculate the standard deviation in exactly the same way we would for a normal experiment.

So once again…

Sum of scores divided by n = mean

Then we see how far the scores are from the mean

Deviations between the scores and the mean, squared and totalled = sum of squares

Divide sum of squares by n-1 = the mean square (this is the variance)

Standard dev - square root of variance

10
Q

Definitions for variance terms:

A

Sum of squares - deviations between the scores and the mean, squared and then added up

Mean square (variance) - the sum of squares divided by n-1

SD - square root of variance

Standard error = Sample SD / √n

11
Q

Effect vs Error

A

The effect is divided by the standard error (Effect / Error). This compares effect to error and establishes whether you have more effect than error

12
Q

Effect

A

Effect is the mean difference between scores

13
Q

Error (Revisited)

A

Standard error -> the sample standard deviation divided by the square root of the number of observations

This estimates how much our sample mean is expected to deviate from the population mean. It is the standard deviation of the sampling distribution of the mean

Looking at the t-test output:

  • -> t statistic = (mean 1 - mean 2) / standard error of the differences, i.e. Effect / Error
  • -> If error is larger than effect the t value will be less than one
  • -> meaning the amount of deviation that would be expected between a sample and the population is larger than your effect (mean difference between scores)
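
A small worked sketch of Effect / Error follows. It assumes a repeated-measures (paired) design, since the card refers to the standard error of the differences; the scores and variable names are made up for illustration.

```python
import math
import statistics

# Made-up scores for the same five participants in two conditions
condition_1 = [12, 15, 11, 14, 13]
condition_2 = [10, 14, 12, 11, 13]

# Difference score for each participant
differences = [a - b for a, b in zip(condition_1, condition_2)]

# Effect: mean 1 - mean 2
effect = statistics.mean(condition_1) - statistics.mean(condition_2)

# Error: standard error of the differences = SD of differences / sqrt(n)
error = statistics.stdev(differences) / math.sqrt(len(differences))

t_value = effect / error
print(round(effect, 3), round(error, 3), round(t_value, 3))
```

Here the effect (1.0) is larger than the error (about 0.71), so t is greater than one. If the differences were more variable, the error would grow and t would drop below one.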
14
Q

Socrative answers

A
  1. What does SD represent?
    The standardised amount of difference between scores and the mean
  2. Why do the graphs for SD vs SE look so different?
    SD measures the standardised amount of difference between scores and the mean within a SAMPLE, whereas SE is an estimate of how much the sample mean deviates from the POPULATION mean. Because SE = SD / √n, SE bars are narrower than SD bars for the same data
15
Q

Error bars and confidence intervals

A

Error bar graphs with 95% confidence intervals have an upper value, a lower value and the mean marked on each bar

When looking at 95% confidence intervals we assume that the data are normally distributed, so that about 95% of the data fall within 2SD of the mean.

95% CI and Standard error

So we can be 95% confident that the population mean will fall between the upper and lower boundaries (roughly the sample mean ± 2 standard errors)

–> How much do the groups overlap?

small effect/ large error = large overlap

large effect/ small error = no overlap
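
The interval and the overlap check can be sketched as below. This assumes the usual large-sample approximation of mean ± 1.96 standard errors; with small samples a t-based multiplier would give a somewhat wider interval. The data and the helper name `ci95` are made up for illustration.

```python
import math
import statistics

def ci95(scores):
    """Approximate 95% CI for the mean: mean +/- 1.96 standard errors."""
    mean = statistics.mean(scores)
    se = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean - 1.96 * se, mean + 1.96 * se

# Made-up groups: group_b is group_a shifted up by 8 (large effect, same error)
group_a = [2, 4, 4, 4, 5, 5, 7, 9]
group_b = [10, 12, 12, 12, 13, 13, 15, 17]

low_a, high_a = ci95(group_a)
low_b, high_b = ci95(group_b)

# Two intervals overlap if each one starts before the other ends
intervals_overlap = low_b <= high_a and low_a <= high_b

print((round(low_a, 2), round(high_a, 2)), (round(low_b, 2), round(high_b, 2)))
print(intervals_overlap)
```

With a large effect relative to the error, the two intervals do not overlap; shrinking the difference between the groups or increasing the spread would bring them back together.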

16
Q

Certain and Uncertain Data

A
  • If the amount of error introduced by the sampling method is large, you will have a large standard error.
  • If you have a large standard error you are less likely to find an effect (because the error can swamp the effect)
  • If you have a small sample size the data will be more variable and tend to have larger standard errors
  • -> This results in uncertain data
  • Larger samples tend to give sample means that are more normally distributed and less variable, so they have smaller standard errors
  • -> This results in more certain data
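
To see the sample-size point in numbers, the sketch below holds the sample SD fixed at an assumed value of 10 and only changes n. Real samples would also differ in SD, so this isolates just the √n part of the formula.

```python
import math

sample_sd = 10.0  # assumed value, held fixed purely for illustration

# Standard error = SD / sqrt(n) for a range of sample sizes
se_by_n = {n: sample_sd / math.sqrt(n) for n in (4, 16, 64, 256)}

for n, se in se_by_n.items():
    print(n, se)
```

Each time the sample size is quadrupled the standard error halves, which is why larger samples give more certain data.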
17
Q

Key points on sample size

A
  • The sample must be normally distributed
  • The more normal the distribution the better the estimate of error
  • The size of the sample must be sufficient to result in a normal distribution
  • The larger the better