STATS Flashcards

1
Q

what is the average deviation?

A

the mean of the absolute deviations from the mean across all scores (how far, on average, a score sits from the mean)

2
Q

how do you calculate the average deviation?

A
  1. calc mean
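  2. calc deviation of each score from the mean
  3. ignore the signs (take absolute values), otherwise the deviations sum to zero
  4. calc mean of the absolute deviations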
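The steps above can be sketched in Python. This is a minimal illustration, assuming "mean deviation" means the mean of the absolute deviations (signed deviations always average to zero); the scores are made up.

```python
from statistics import mean

def average_deviation(scores):
    """Mean of the absolute deviations from the mean."""
    m = mean(scores)                               # calc mean
    deviations = [x - m for x in scores]           # deviation of each score
    return mean([abs(d) for d in deviations])      # mean of the absolute deviations

scores = [2, 4, 6, 8, 10]   # hypothetical scores, mean = 6
print(average_deviation(scores))
```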
3
Q

how do you calculate the standard deviation?

A
  1. calc mean
  2. calc deviation of each score
  3. square your deviations
  4. calc variance - sum the squared deviations and divide by N-1
  5. square root the variance
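A short Python sketch of these steps, checked against the standard library's `statistics.stdev` (which also divides by N-1). The scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def sample_sd(scores):
    m = mean(scores)                                # calc mean
    squared = [(x - m) ** 2 for x in scores]        # deviations, then squared
    variance = sum(squared) / (len(scores) - 1)     # divide the total by N-1
    return sqrt(variance)                           # square root of the variance

scores = [2, 4, 6, 8, 10]    # hypothetical scores
print(sample_sd(scores))     # by-hand version
print(stdev(scores))         # library version, should agree
```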
4
Q

what is a sampling error?

A

error associated with examining stats from a sample rather than a population

5
Q

what is a sampling distribution?

A

the distribution of a sample statistic (e.g. the mean) across repeated representative samples of the same size from a population - tells us how a statistic changes between samples

6
Q

what is the standard deviation of the sampling distribution of the mean also known as?

A

standard error (of the mean)

7
Q

what does the standard error tell us about the means of a sample from a population?

A

tells us how much sample means vary around the population mean, i.e. how representative of its population a sample mean is likely to be

8
Q

what are the characteristics of a normal distribution curve?

A
  • bell shaped
  • area = 1
  • symmetrical, centered on pop mean
9
Q

what are z scores?

A

the position of a raw score relative to the mean, measured in standard deviation units: z = (x - mean)/s.d

10
Q

how would you answer this question: ‘what is the probability of picking someone with an IQ of more than 112 from a normally distributed population (mean 100, s.d 15)?’

A
  1. calc z score = (x - pop mean)/pop sd, here (112 - 100)/15 = 0.8
  2. use p table to find probability value
    as this is a 'more than' question, use the column that gives the area above z (the upper tail); if the z score is negative, use the opposite column, since the curve is symmetrical
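As a check on the table lookup, the same area can be computed with Python's `statistics.NormalDist`. This is a sketch, assuming (100, 15) means population mean 100 and s.d 15.

```python
from statistics import NormalDist

pop_mean, pop_sd = 100, 15
x = 112

z = (x - pop_mean) / pop_sd            # z score: (x - pop mean) / pop sd
p_above = 1 - NormalDist().cdf(z)      # area above z under the standard normal curve

print(round(z, 2), round(p_above, 4))
```

The result should agree with the table entry for z = 0.8 to the number of decimals the table prints.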
11
Q

how would you answer this question: ‘what is the probability of picking someone with an IQ between 70 and 115 from a normally distributed population (mean 100, s.d 15)?’

A
  1. calculate z scores for both values = (x - pop mean)/pop sd, here (70 - 100)/15 = -2 and (115 - 100)/15 = 1
  2. find the area below each z score in the p table
  3. subtract the smaller area from the larger to get the area between them
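The three steps can be sketched in Python using the standard library's normal distribution. The variable names are illustrative only.

```python
from statistics import NormalDist

pop_mean, pop_sd = 100, 15
low, high = 70, 115

z_low = (low - pop_mean) / pop_sd       # -2.0
z_high = (high - pop_mean) / pop_sd     #  1.0

snd = NormalDist()                      # standard normal distribution
area_between = snd.cdf(z_high) - snd.cdf(z_low)   # area below 115 minus area below 70

print(round(area_between, 4))
```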
12
Q

would we expect a larger or smaller standard error with a bigger sample?

A

smaller standard error, more representative of parent population

13
Q

how do you calculate the standard error?

A

standard error = s.d of parent pop / rootN (square root of the sample size)
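A small Python sketch of this formula, showing how the standard error shrinks as the sample size grows. The population s.d of 15 and the sample sizes are hypothetical.

```python
from math import sqrt

def standard_error(pop_sd, n):
    """Standard error of the mean: parent population s.d divided by root N."""
    return pop_sd / sqrt(n)

pop_sd = 15   # hypothetical population s.d
for n in (4, 25, 100):
    print(n, standard_error(pop_sd, n))
```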

14
Q

what does a 95% confidence interval tell you?

A

if we repeated our sampling over and over, calculating a new confidence interval each time from the new sample mean, about 95% of those intervals would contain the population mean
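This repeated-sampling idea can be illustrated with a small simulation. It is only a sketch: the population (mean 100, s.d 15), the sample size of 25 and the number of repeats are all assumptions chosen for the demonstration, and the interval uses the known population s.d.

```python
import random
from math import sqrt
from statistics import mean

random.seed(1)                       # fixed seed so the run is repeatable
pop_mean, pop_sd, n = 100, 15, 25    # hypothetical population and sample size
se = pop_sd / sqrt(n)                # standard error of the mean
repeats = 2000

hits = 0
for _ in range(repeats):
    sample = [random.gauss(pop_mean, pop_sd) for _ in range(n)]
    m = mean(sample)
    lower, upper = m - 1.96 * se, m + 1.96 * se   # 95% CI around this sample mean
    if lower <= pop_mean <= upper:                # does the interval capture the pop mean?
        hits += 1

coverage = hits / repeats
print(coverage)   # proportion of intervals containing the population mean
```

The printed proportion should sit close to 0.95, though it will not be exactly 0.95 for any finite number of repeats.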

15
Q

how do you calculate a 95% confidence interval?

A

LOWER = pop mean - (1.96 x standard error)
UPPER = pop mean + (1.96 x standard error)
where standard error = pop s.d/rootN
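A minimal Python sketch of the calculation, assuming the interval is built from the standard error (pop s.d/rootN) and that the lower end subtracts and the upper end adds. The numbers (mean 100, s.d 15, N = 25) are hypothetical.

```python
from math import sqrt

def ci95_known_sd(centre, pop_sd, n, z=1.96):
    """95% interval: centre -/+ 1.96 standard errors."""
    se = pop_sd / sqrt(n)             # standard error
    return centre - z * se, centre + z * se

lower, upper = ci95_known_sd(100, 15, 25)   # hypothetical values
print(round(lower, 2), round(upper, 2))
```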

16
Q

what do you do when you don’t have the population mean or standard dev?

A

use the sample mean and sample standard dev as estimates of what the population parameters would look like

17
Q

what is null hypothesis significance testing?

A

works out whether a sample mean differs from the theoretical population mean by more than sampling error alone would plausibly produce

18
Q

what is a type 1 error?

A

incorrect rejection of a true null - p was small only because the sample data happened to be unusually extreme

19
Q

what is a type 2 error?

A

not rejecting a null, even though it was false
- e.g. if the sample was somehow biased or unrepresentative and didn’t accurately reflect the population

20
Q

why do we set the significance value at 0.05?

A

if it was higher (e.g. 0.35) data would too easily seem extreme and significant = more T1
if it was lower (e.g. 0.00001) it would be too conservative and data would almost always seem non-significant = more T2

21
Q

what is a ‘p value’?

A

The p-value for a particular inferential statistical test is the probability of finding the pattern of results in a particular study (or one more extreme) if the relevant null hypothesis were true. This is a conditional probability.

22
Q

what is the alpha value in psychological testing?

A

Alpha (α) is the criterion for statistical significance that we set for our analyses. It is the probability level that we use as a cut-off below which we are happy to assume that our pattern of results is so unlikely as to render our research hypothesis as more plausible than the null hypothesis.

23
Q

summarise the logic behind null hypothesis significance testing:

A
  1. involves calculating the probability of observing a difference/relationship at least as large as ours in a sample if the null hypothesis were true
  2. to do this, we get a sample and calculate how far its sample mean is from the population mean, in standard error units
  3. if the sample mean is far from the population mean, there is only a small probability that this occurred by chance/sampling error alone, meaning we reject the null and conclude that there is a genuine difference/relationship in the population, i.e. the result is statistically significant
24
Q

how does null hypothesis significance testing work? (2 key points)

A
  • we measure the relationship between variables from one sample and work out the probability of obtaining a relationship at least that strong by sampling error alone if the null were true
  • if that probability is small (below alpha), we conclude that a genuine relationship exists in the population; if not, we fail to reject the null
25
Q

if a researcher concludes that his data is statistically significant (i.e probability of observing the change is lower than 0.05, or 5%), what does this actually mean?

A

means that if the null hypothesis were true, a difference this large would be expected to occur by chance (sampling error alone) in fewer than 1 in 20 repeats of the study

26
Q

how would you use null hypothesis significance testing to tackle the following problem:
- Test which measures statistics ability in the 1st year UK undergrads
- The population test scores follow N(67.5, 8.3)
- Hypothesised that your cohort is better at stats than the general UK 1st year undergrad population
- You check whether a sample mean of test scores looks significantly better than a typical sample of the same size coming from that population.

A
  1. form research hypothesis and null hypothesis
    - null = the sample doesn’t have an atypically high mean and therefore came from a population with mean of 67.5
    - research = the sample mean is significantly higher and therefore the sample came from a population with a mean greater than 67.5
  2. collect data
    - sample mean = in this case it’s 70.7
    - calculate the standard error (parent pop s.d/rootN) in this case it’s 1.66
  3. evaluate inconsistency with null
    - we want to see how close it is with the population mean, therefore we calculate a z score
    - (70.7-67.5)/1.66 = 1.93 (NB: we use 1.66 (the s.d of the sampling distribution, i.e. the standard error), not the population s.d, as this is data representing a sample mean, not individual scores)
    - use the ‘p’ table to find the probability that a sample mean would fall ABOVE ours = 0.0268
  4. do we accept or reject
    - to reject the null, the probability of obtaining a result as high or higher must be smaller than 0.05
    - in this example 0.0268 is smaller than 0.05, so we can reject the null and conclude that this sample is different/came from a population with a mean greater than 67.5
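The directional test above can be reproduced in Python as a rough check. The sample size is not stated on the card; N = 25 is an assumption here, chosen because 8.3/root25 gives the quoted standard error of 1.66.

```python
from math import sqrt
from statistics import NormalDist

pop_mean, pop_sd = 67.5, 8.3
sample_mean = 70.7
n = 25                                   # assumption: implied by 8.3 / sqrt(25) = 1.66

se = pop_sd / sqrt(n)                    # standard error (s.d of the sampling distribution)
z = (sample_mean - pop_mean) / se        # z score for the sample mean
p_one_tailed = 1 - NormalDist().cdf(z)   # area ABOVE our sample mean
reject_null = p_one_tailed < 0.05

print(round(se, 2), round(z, 2), round(p_one_tailed, 4), reject_null)
```

The computed p differs slightly from the table value of 0.0268 because the table lookup rounds z to 1.93 first.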
27
Q

how would you use null hypothesis significance testing to tackle the following problem:
- Test which measures statistics ability in the 1st year UK undergrads
- The population test scores follow N(67.5, 8.3)
- Hypothesised that your cohort will differ in stat scores than the general UK 1st year undergrad population
- You check whether a sample mean of test scores looks significantly different from a typical sample of the same size coming from that population.

A
  1. form your research hypothesis
    - this is a non-directional (two-tailed) hypothesis
    - null = there is no difference
    - research = there is a difference
  2. collect your data
    - calculate sample mean = 70.7
    - calculate SDM (standard error) = 1.66
    (as this is 2 tailed, the equivalent score on the other side of the pop mean is 64.3; as this is normally distributed data, the areas beyond these two scores are the same)
  3. evaluate inconsistency
    - calculate z score = (70.7-67.5)/1.66 = 1.93 (NB: we use 1.66 (the s.d of the sampling distribution, i.e. the standard error), not the population s.d, as this is data representing a sample mean, not individual scores)
    - use the ‘p’ table to find the probability that a sample mean would fall ABOVE our sample mean on one side = 0.0268 (the area BELOW the equivalent score on the other side is the same)
    - as this is non-directional, we double the p value (as there is double the area to consider)
    = 0.0536 (5.36%)
  4. accept or reject the null
    - p is larger than 0.05
    - can’t reject null hypothesis because it’s reasonably likely that we could have got a sample mean this extreme even if our sample came from the parent population
    - we fail to reject the null hypothesis; we can’t say we have evidence to suggest that our cohort is different in stats ability compared to the UK population of 1st year UG
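A sketch of the same comparison in Python, showing how doubling the tail area changes the decision. The helper `z_test` is a made-up name for this illustration, and the standard error of 1.66 is taken as given.

```python
from statistics import NormalDist

def z_test(sample_mean, pop_mean, se, two_tailed):
    """Return (z, p) for a sample mean, given the standard error."""
    z = (sample_mean - pop_mean) / se
    tail = 1 - NormalDist().cdf(abs(z))      # area beyond |z| in one tail
    return z, (2 * tail if two_tailed else tail)

z, p_one = z_test(70.7, 67.5, 1.66, two_tailed=False)
_, p_two = z_test(70.7, 67.5, 1.66, two_tailed=True)

print(round(z, 2), round(p_one, 4), round(p_two, 4))
print("reject null (one-tailed):", p_one < 0.05)
print("reject null (two-tailed):", p_two < 0.05)
```

The exact figures differ slightly from the table-based 0.0268 and 0.0536 because z is not rounded before the lookup, but the decisions are the same.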
28
Q

when are t distributions used?

A

act as a stand in for the standard normal distribution (SND) when we don’t know the population standard deviation and have to estimate it from the sample

29
Q

what do t distributions and normal distributions have in common?

A

both are bell-shaped and symmetrical, with area under the curve = 1; the t distribution differs in having heavier tails and a lower peak

30
Q

how do you calculate a t score?

A

t = (sample mean - population mean)/(sample s.d/rootN), where sample s.d/rootN is the estimated SDM (estimated standard error)

31
Q

what do t scores tell us? what are some key points about this score?

A

how far the sample mean is from the population mean, measured in estimated standard errors
will give us something that looks like a z-score, but it won’t follow a standard normal distribution, but rather a t distribution
the shape of this distribution depends on one parameter, v = N-1 (the degrees of freedom), where N is the sample size
So, if there was a sample size of 10, the t-distribution would have a v parameter of 9

32
Q

as the sample size increases, the t distribution becomes _____ to the SND

A

more similar

33
Q

why is t-distribution used for small sample sizes?

A

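with a small sample, the sample s.d is a less reliable estimate of the population s.d, and the heavier tails of the t distribution allow for this; once the sample size is more than about 30, the t distribution and the normal distribution are almost indistinguishable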

34
Q

how would you answer this question: We have a sample of 6 people from a normal population. What is the probability that the t score will be greater than 3.0? (i.e more than 3 estimated standard errors above the population mean?)

A
  1. Calculate your v parameter
    o V = 5 (N-1)
  2. Look at the area you are trying to calculate (demonstrated by diagram).
  3. Use your critical values table to calculate the area (in this case to the right of 3)
    o As there is no critical value ‘3’ in our table, we have to use the closest possible value
    o We know that a t-value of 2.571 isolates 2.5% of the area in the top tail.
    o We also know that the value 3.365 isolates 1% of the area in the top tail.
    o Therefore we can say for sure that the probability of the t-score being greater than 3 is less than 0.025 but more than 0.01
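The table-bracketing logic can be sketched in Python. The dictionary holds the upper-tail critical values for v = 5 as printed in standard t tables (rounded to three decimals); `bracket_upper_tail` is a hypothetical helper written for this example.

```python
# Upper-tail area -> critical t value for v = 5 (standard t table, rounded).
CRITICAL_V5 = {0.10: 1.476, 0.05: 2.015, 0.025: 2.571, 0.01: 3.365, 0.005: 4.032}

def bracket_upper_tail(t, table):
    """Bound P(T > t) between the two nearest tabled tail areas."""
    lower, upper = 0.0, 1.0
    for area, critical in table.items():
        if t >= critical:
            upper = min(upper, area)   # t is at or beyond this value: p is at most `area`
        else:
            lower = max(lower, area)   # t falls short of this value: p is more than `area`
    return lower, upper

print(bracket_upper_tail(3.0, CRITICAL_V5))   # bounds on P(T > 3.0) with v = 5
```

A table can only bracket the probability; an exact value would need the t distribution's cumulative function, which is not in the Python standard library.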
35
Q

how do we calculate the 95% interval when we don’t have the population parameters?

A
  • use a t score (rather than z statistic)
  • if we repeated our data collection with new, but same sized samples, in 95% of repeats the population mean would be within:
    o some number ‘c’ estimated standard errors of the sample mean m
    o ‘c’ being the value that isolates 2.5% of the area in the top tail of the t distribution for our sample size (rather than 1.96, as we can’t use z-scores)
36
Q

how do you calculate a 95% confidence interval using t distributions?

A

goes from: sample mean - (c x sample s.d/root N)
to: sample mean + (c x sample s.d/rootN)
key:
sample s.d/rootN = estimated standard error
c = t value that isolates 2.5% of the area in each tail for v = N-1
this is the equation for a confidence interval centered on a sample mean, as we don’t know the parameters of the parent pop

37
Q

how would you answer this question: You have a sample of N=6 with a mean of 7.33, sample s.d is 3.78 and the estimated standard error is 1.54. What is the 95% confidence interval for the population mean centred on the sample mean?

A
  • centered on the sample mean and parameters of parent pop not given, means a t distribution will be used
    1. confidence interval must go from:
    7.33 - (c x (3.78/root6))
    to:
    7.33 + (c x (3.78/root6))
    2. calculate value of c
    v parameter = 5 (n-1)
    using t-distributions table - go to V=5 and 0.05 significance (two-tailed) = 2.571 (value that isolates 2.5% in each tail, 5% in total)
    3. put into the formula
    7.33 - (2.571 x 1.54) to 7.33 + (2.571 x 1.54)
    LOWER END = 3.37, UPPER END = 11.29, so 95% CI (3.37, 11.29)
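The arithmetic can be checked with a few lines of Python. This sketch takes the estimated standard error (1.54) and the table value c = 2.571 as given in the card; the function name is made up for the example.

```python
def t_confidence_interval(sample_mean, est_se, c):
    """Interval from sample mean - c*SE to sample mean + c*SE."""
    margin = c * est_se
    return sample_mean - margin, sample_mean + margin

# Values from the card: N = 6, so v = 5 and c = 2.571 (2.5% in each tail).
lower, upper = t_confidence_interval(7.33, 1.54, 2.571)
print(round(lower, 2), round(upper, 2))
```

Using the unrounded standard error (3.78/root6, about 1.543) instead of 1.54 shifts each end by roughly 0.01, which is why hand-worked answers can differ in the last decimal.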
38
Q

how would you answer this question: You are interested in anxiety levels in UK undergraduates and collect data on a well-established anxiety questionnaire. The mean of your sample is 10.4, the sd is 3.4383
Using the critical values table, calculate the 95% confidence interval for the population mean anxiety centred on the sample mean

A
  • no pop parameters and centered on sample mean = t distribution
    1. calculate the estimated standard error = sample s.d/rootN = 3.4383/root10 = 1.087
    this means that the 95% CI goes from:
    10.4 - (c x 1.087) to
    10.4 + (c x 1.087)
    2. calculate v parameter
    = 9 (N-1, with N = 10)
    3. table at 0.05 significance
    = 2.262
    4. plug into the formula
    = 95% CI (7.94, 12.86)
39
Q

what is a 1 sample t test?

A

statistical hypothesis test used to determine whether an unknown population mean is different from a specific value.
used when the population s.d is unknown, so a z test cannot be conducted as part of NHST

40
Q

how would you answer this question: Population height on Ziltodia 10 is known to follow a normal distribution with mean 24.5cm, the s.d in the population is unknown. The Ziltodian president asks you to test his concern that his population is shorter on earth than on their home planet.

A
  • we don’t have the population s.d, so we can’t do a z test
  • means we must conduct a one sample t test
    1. formulate your hypothesis
    null = they are not shorter on earth and their sample came from a population with mean 24.5
    research = they are shorter on earth than their home planet and they came from a population with a mean less than 24.5
    2. collect data
    sample size = 36
    sample mean = 23.5
    sample s.d (as pop sd is unknown) = 2.4
    estimated standard error = 2.4/root36 = 0.4
    3. how inconsistent is data with the null
    can’t conduct z-test, t test used instead = (sample mean - pop mean)/estimated standard error = (23.5 - 24.5)/0.4 = -2.5
    using table at v=35, we read that p is definitely less than 0.05
    4. accept or reject
    because p is less than 5%, we can reject the null in favour of the research hypothesis
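A minimal Python sketch of this one sample t test. The critical value of 1.690 (one-tailed, 0.05, v = 35) is taken from a standard t table and is an assumption of this example rather than something stated on the card; `one_sample_t` is a made-up helper name.

```python
from math import sqrt

def one_sample_t(sample_mean, null_mean, sample_sd, n):
    """t = (sample mean - pop mean under the null) / estimated standard error."""
    est_se = sample_sd / sqrt(n)
    return (sample_mean - null_mean) / est_se

t = one_sample_t(23.5, 24.5, 2.4, 36)
critical = 1.690               # assumed table value: one-tailed, alpha 0.05, v = 35
reject_null = t < -critical    # directional test: "shorter" means the lower tail

print(round(t, 2), reject_null)
```

Because the hypothesis is directional (shorter), only the lower tail is checked; a non-directional test would compare the absolute t value with the two-tailed critical value instead.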
41
Q

how would you answer this question: Angela wants to test the hypothesis that move scores for children with ADHD will be significantly higher than 0.
aka she is suggesting that her sample came from a population with mean larger than 0
her sample mean and s.d for a sample of 10 children are 0.2310 and 0.2625
conduct a hypothesis test and pick an answer of how this should be reported

A
  • pop parameters are not mentioned, meaning a t test must be used
    1. formulate hypothesis
    null = the sample came from a population with a mean score of 0 (not higher than 0)
    research = the sample came from a population with a mean score higher than 0
    2. collect data
    sample mean = 0.2310
    sample s.d = 0.2625
    estimated standard error = sample s.d/rootN = 0.08300978857
    3. calculate inconsistency
    cannot do z test due to unknown parameters, conduct t test instead
    (sample mean - pop mean)/estimated SE
    = (0.2310-0)/0.08300978857 = 2.78
    using table at v = 9 and 0.05 significance, the critical value (1.833 one-tailed) is smaller than our t
    = t(9) = 2.78, p < 0.05
    4. reject or accept the null
    she can reject the null hypothesis because p is less than 0.05
42
Q

when we fail to reject the null, does that mean we are accepting it as true?

A

no, the null cannot be proven.
it is just saying that in this case, we have not got evidence that there is a statistically significant difference or relationship in the data

43
Q

what is the point in calculating the z score for sample means?

A

when you have multiple samples and want to describe the deviation of those sample means from the population mean

44
Q

how do you calculate the z score for an individual score from the parent population?
how do you calculate the z score for sample means?

A
  1. (x data point - parent pop mean)/sigma
  2. z = (sample mean - population mean)/(sigma/rootN)
45
Q

standard error must always be smaller than…

A

the standard deviation of the parent population (for any sample size N greater than 1, since standard error = sigma/rootN)