Stats only Flashcards

1
Q

What is important to consider when looking at sample size?

A
  • Size matters
  • Sampling error can result if your sample is not large enough
  • Trade off between size and time/cost
2
Q

Factors in deciding sample size?

A

o Design
o Response rate
o Heterogeneity of population

3
Q

What is a population parameter?

A
  • a quantity that describes some characteristic of a population with respect to a specific variable
  • E.g., population mean, population range etc.
  • Not usually possible to calculate
  • Might be given to you if available
4
Q

What is a sample statistic?

A

– a quantity that describes some characteristic of a sample with respect to a specific variable

  • E.g., sample mean, sample range etc.
  • We can always calculate these from a sample
  • Sample statistics provide an estimate of population parameters
5
Q

Why is it important to summarise data?

A
  • Data can be very complex and therefore it is useful to summarise it
  • Allows for interpretation
6
Q

What are measures of central tendency?

A

They provide an indication of a “typical” score in the data set

7
Q

What is the mean?

A

o Provides an estimate of the average score in the data set

o Is affected by extreme data points

8
Q

What is the median?

A

o Is insensitive to extreme scores in the data set

o Doesn’t reflect the shape of the scores e.g., doesn’t care how far away the extreme scores are

9
Q

What is the mode?

A

o Easy to calculate from a histogram and easy to understand – the most common value
o Data might have more than 1 mode or no mode at all

10
Q

What is the range?

A

o Difference between min and max scores

o Range doesn’t always change for distributions with different shapes

11
Q

What is a deviation?

A

o The signed distance of a score from the mean

12
Q

How to calculate simple variance?

A
o	Calc mean
o	Calc deviations
o	Square deviations
o	Calc a slightly adjusted average squared deviation
        - You divide by n-1
13
Q

What is the issue with simple variance?

A

A potential issue is the units: if the deviations are in hours, the squared deviations are in hours squared, which is hard to interpret

14
Q

How to calculate sample standard deviation?

A
o	Calc mean
o	Calc deviations
o	Square deviations
o	Calc sample variance
o	Take square root of sample variance – now back in comprehensible units
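
The steps above can be sketched in Python as follows. The function names and the `hours` data are hypothetical, chosen only to illustrate the calculation; the result is compared against the standard library's own `variance` and `stdev` as a sanity check.

```python
import math
from statistics import stdev, variance

def sample_variance(scores):
    n = len(scores)
    m = sum(scores) / n                      # 1. calculate the mean
    deviations = [x - m for x in scores]     # 2. signed distance of each score from the mean
    squared = [d ** 2 for d in deviations]   # 3. square the deviations
    return sum(squared) / (n - 1)            # 4. adjusted average: divide by n - 1

def sample_sd(scores):
    # Square root of the sample variance: back in the original units.
    return math.sqrt(sample_variance(scores))

hours = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, measured in hours

print(sample_variance(hours), variance(hours))  # both in hours squared
print(sample_sd(hours), stdev(hours))           # both in hours
```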
15
Q

What is a histogram?

A
-	Good way to inspect data
o	Can see if there’s any odd-looking scores
o	Can see the mode
o	Can see how spread out the scores are
o	Can see how the data is distributed
16
Q

What is a box plot?

A

Summarises a distribution using the median, the quartiles (the box) and whiskers showing the spread; usually plotted vertically rather than horizontally

17
Q

What is a scatter plot?

A

shows the relationship between variables

18
Q

What is a data summary plot?

A
  • Plot a bar showing the mean (categorical data) or a line graph (numerical data)
  • Plot error bars showing ± 1 s.d.
19
Q

What is distribution of data?

A

The manner in which data for a particular variable is spread over its range is commonly referred to as its distribution

20
Q

What is normally distributed data?

A
  • Many naturally occurring variables are Normal
  • E.g., height, IQ (not naturally occurring but has been defined as this)
  • If we don’t have much data then the normality can be difficult to see in a histogram
  • As sample size increases, the normality will emerge
21
Q

What is non-normal data?

A
  • Has a tail either to the right or left – skewed data
  • Positively skewed = long tail to the right, peaks at the left
  • Negatively skewed = long tail to left, peaks at the right
  • E.g., reaction time – tends to be positively skewed
22
Q

What is the danger of non-normal data?

A
  • Danger – mean is distorted by the tails which are the more extreme values
23
Q

What is the danger of bimodal data?

A
  • Danger – mean is not representative

- Tends to suggest an issue with your experiment – more than one underlying population

24
Q

What is bimodal data?

A

Data that has two modes

25
Q

What is the normal distribution?

A
  • Bell-shaped
  • Symmetric about the centre
  • Tails never reach 0 – they extend towards infinity in both directions
  • The area under the curve is always equal to 1
  • Very close to 0 by the time it gets to 3 SD from the mean – can use this to draw a rough idea of a normal distribution
26
Q

What is probability?

A

– a measure of how likely it is that an uncertain event will occur

27
Q

What is conditional probability?

A
  • Probability of an event given that something else is known/assumed e.g., A|B
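
As a small illustration, assuming a fair six-sided die (an example not taken from the cards), the conditional probability can be computed as P(A|B) = P(A and B) / P(B). The helper names below are hypothetical.

```python
from fractions import Fraction

outcomes = range(1, 7)  # faces of one fair die (assumed example)

def prob(event):
    """Probability of an event, given as a true/false test on each outcome."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def is_six(o):
    return o == 6

def is_even(o):
    return o % 2 == 0

p_a = prob(is_six)                                    # P(A) = 1/6
p_b = prob(is_even)                                   # P(B) = 1/2
p_a_and_b = prob(lambda o: is_six(o) and is_even(o))  # P(A and B) = 1/6
p_a_given_b = p_a_and_b / p_b                         # P(A|B) = 1/3

print(p_a, p_a_given_b)
```

Knowing that the roll is even raises the probability of a six from 1/6 to 1/3.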
28
Q

What is a z-score?

A
  • Z measures how far away a score is from the population mean in multiples of the SD
  • If you were to find z-scores for all points on a normal distribution, you would find that it would form a normal distribution with mean 0 and SD 1 – N (0, 1)
  • The area underneath a normal distribution above/below some variable value of x EQUALS the area underneath N (0, 1) above/below z
29
Q

How do you obtain a z-score?

A
  • Obtained by subtracting the population mean from x and then dividing by the population SD – (x-µ)/σ
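
A short sketch of the formula, using `statistics.NormalDist` from the Python standard library in place of a printed table. The values µ = 100, σ = 15 and x = 130 are assumptions for illustration, and the helper names are hypothetical.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    # (x - µ) / σ: distance from the population mean in multiples of the SD
    return (x - mu) / sigma

snd = NormalDist(0, 1)  # the standard normal distribution N(0, 1)

def area_below(z):
    return snd.cdf(z)

def area_above(z):
    return 1 - snd.cdf(z)

z = z_score(130, mu=100, sigma=15)  # 2.0
print(z, area_above(z))             # area above z = 2 is about 0.023

# The area above x under N(100, 15) equals the area above z under N(0, 1).
print(1 - NormalDist(100, 15).cdf(130))
```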
30
Q

What is a SND table and how do you use it?

A
  • Table that provides values of areas underneath the SND in different ranges
  • Find z-score (first column) then decide if you want the area above or below this score
  • If z-score is negative, use the positive value in the table but be careful when choosing above or below because the scores will be flipped
    o E.g., z-score = -2 and you want the area below. On table you will use z-score 2 but use the area above
  • If you have a range that is bounded e.g., 70
31
Q

What is a sampling error?

A

Sampling error – the error associated with examining statistics calculated from a sample rather than the population

32
Q

Why do sampling errors occur?

A
  • It occurs because in our sample we don’t have all the members of the population
33
Q

What does the magnitude of a sampling error depend on?

A

The sample size

  • Bigger sample = big sampling error less likely
  • Smaller sample = big sampling error more likely
34
Q

How do we generate a sampling distribution?

A
  • Take a sample (size N) from a population
  • Calculate a sample statistic (e.g., mean, SD etc.)
  • Add the new statistic to a frequency plot (a histogram) of the sample statistic
  • Repeat the above 3 steps multiple times
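
The procedure above can be sketched as a small simulation. This assumes a normal population with mean 100 and SD 15, samples of size 25 and 2000 repeats, all chosen only for illustration; instead of drawing a histogram it collects the statistics in a list and summarises them.

```python
import random
from statistics import mean, stdev

def sampling_distribution(draw, n, repeats, statistic=mean):
    """Take `repeats` samples of size n and return the statistic from each one."""
    results = []
    for _ in range(repeats):
        sample = [draw() for _ in range(n)]  # take a sample of size n
        results.append(statistic(sample))    # calculate the sample statistic
    return results                           # these values would fill the histogram

rng = random.Random(0)  # fixed seed so the run is repeatable
sample_means = sampling_distribution(lambda: rng.gauss(100, 15), n=25, repeats=2000)

print(mean(sample_means))   # close to the population mean of 100
print(stdev(sample_means))  # much smaller than the population SD of 15
```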
35
Q

What does the sampling distribution tell us?

A
  • Tells us important info about how a statistic changes from sample to sample
  • What is the mean value of the statistic over all samples?
  • How variable is the statistic over all samples?
  • What shape is the distribution of the statistic over all samples?
36
Q

What are the properties of the sampling distribution of the mean (SDM)?

A
  • Mean which is the same as the parent population
  • SD is different to that of the parent population – find by calculating σ/√N, where σ is the SD of the parent population and N is the sample size
  • SD is called the standard error of the mean (s.e.m.) or standard error (s.e.)
  • S.e.m. must be smaller than the SD of the parent population (for N > 1) because you are dividing by something that is bigger than one
37
Q

What is a parent population a distribution of?

A

Parent population is a distribution of individual scores x (e.g., from an individual person or thing)

38
Q

What is SDM a distribution of?

A

SDM is a distribution of sample means for samples of size N drawn at random from the parent population

39
Q

What is central limit theorem?

A
  • Given a population with mean µ and SD σ, the sampling distribution of the mean approaches a normal distribution with mean µ and SD σ/√N as N increases
  • This is true regardless of the underlying distribution – so even if your population is not normal, the distribution of means sampled from it will be approximately normal for large enough N
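
A rough simulation of this idea, assuming a positively skewed (exponential) population with mean 1 and SD 1 and samples of size 50. The `skewness` helper is a simple hypothetical measure where 0 means symmetric and larger positive values mean a longer right tail.

```python
import random
from statistics import mean, pstdev

def skewness(values):
    """Average cubed z-score: about 0 for symmetric data, positive for a right tail."""
    m = mean(values)
    s = pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

rng = random.Random(1)  # fixed seed so the run is repeatable

def draw():
    return rng.expovariate(1.0)  # skewed population with mean 1 and SD 1

raw_scores = [draw() for _ in range(5000)]
sample_means = [mean([draw() for _ in range(50)]) for _ in range(2000)]

print(skewness(raw_scores))    # clearly positive: individual scores are skewed
print(skewness(sample_means))  # much closer to 0: the means look roughly normal
print(mean(sample_means), pstdev(sample_means))  # near 1 and near 1/√50
```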
40
Q

How do you find a z-score for a SDM?

A

z-score = (m-µ)/(σ/√N), where m is the sample mean
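
A minimal sketch of this calculation, with the sample mean taking the place of an individual score. The numbers (sample mean 103, µ = 100, σ = 15, N = 25) and the function names are hypothetical.

```python
import math

def standard_error(sigma, n):
    # SD of the sampling distribution of the mean: σ / √N
    return sigma / math.sqrt(n)

def z_for_sample_mean(m, mu, sigma, n):
    # Distance of the sample mean from µ in multiples of the standard error
    return (m - mu) / standard_error(sigma, n)

print(standard_error(15, 25))               # 3.0
print(z_for_sample_mean(103, 100, 15, 25))  # 1.0
```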

41
Q

What is a point estimate?

A

– a single value estimate of a population parameter e.g., sample mean

42
Q

What is an interval estimate?

A

– a range of possible values of a population parameter e.g., confidence interval

43
Q

What is a confidence interval?

A

– describes an interval (e.g., a range) of values for our population parameter, together with a specified level of confidence that the parameter is in that range

44
Q

For a sample drawn at random from a normal population N (µ, σ) with known s.d. σ, the 95% CI for the population mean is centred on the sample mean m and goes from?

A

m – (1.96 x σ/√N) to m + (1.96 x σ/√N)
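
As an illustration of the interval above, the sketch below assumes a sample mean of 103 from N = 25 scores with known σ = 15 (made-up values). The multiplier is looked up with `NormalDist().inv_cdf`, which gives roughly 1.96 for 95%.

```python
import math
from statistics import NormalDist

def ci_known_sigma(m, sigma, n, confidence=0.95):
    # z value that leaves (1 - confidence) / 2 in each tail; about 1.96 for 95%
    c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = c * sigma / math.sqrt(n)
    return m - half_width, m + half_width

low, high = ci_known_sigma(m=103, sigma=15, n=25)
print(round(low, 2), round(high, 2))  # roughly 97.12 to 108.88
```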

45
Q

What does a 95% confidence interval mean?

A

A 95% confidence level means that if we repeated our sampling many times and worked out a new CI each time centred on our new sample mean we would expect the population mean to be in the interval on 95% of those repeats
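
This repeated-sampling interpretation can be checked with a simulation. The sketch assumes a normal population with µ = 100 and known σ = 15, samples of size 25 and 2000 repeats; all of these values and the `coverage` helper are hypothetical.

```python
import math
import random
from statistics import mean

def coverage(mu, sigma, n, repeats, rng):
    """Fraction of repeated 95% CIs (known sigma) that contain the population mean."""
    half_width = 1.96 * sigma / math.sqrt(n)
    hits = 0
    for _ in range(repeats):
        m = mean([rng.gauss(mu, sigma) for _ in range(n)])  # new sample mean
        if m - half_width <= mu <= m + half_width:          # new CI centred on it
            hits += 1
    return hits / repeats

rate = coverage(mu=100, sigma=15, n=25, repeats=2000, rng=random.Random(2))
print(rate)  # expected to be close to 0.95
```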

46
Q

True or False, if centred on sample mean, there is a 95% chance that the population mean is also in the range and vice versa (if looking for a 95% confidence interval)?

A

TRUE

47
Q

True or False, if centred on sample mean, there is a 5% chance that the population mean falls outside of this range and vice versa (for a 95% confidence interval)?

A

TRUE

48
Q

What are the steps for null hypothesis testing?

A
- Formulate research hypothesis  	
o	Null hypothesis (H0)
o	Research hypothesis (H1)
- Collect data
- Evaluate inconsistency between H0 and the data
o	How inconsistent are the data with H0?
- Reject or fail to reject H0?
- Interpret in context
49
Q

True or false, If we were able to reject the null (H0) in favour of the research hypothesis (H1) then we can claim to have evidence for the research hypothesis?

A

True

50
Q

True or false, If we fail to reject the null (H0) then we can claim to have evidence for the null hypothesis?

A

False

51
Q

What do values of p > α suggest?

A

Suggest the data are not inconsistent with H0: fail to reject the null

53
Q

What do values of p < α suggest?

A

Suggest the data are inconsistent with H0: reject the null

54
Q

What is the value of α in stats?

A

By convention, α = 0.05

55
Q

What is the p-value?

A

p-value = the conditional probability of obtaining a sample statistic at least as extreme as yours, given that H0 is true

56
Q

How do you conduct a z-test?

A
  • Use NHST framework
  • Calculate inconsistency with mean by calculating the z-score, use the table to find the associated p-value and compare this to 0.05 to decide whether to reject or fail to reject the null hypothesis
57
Q

When is a z-test used?

A
  • To check if a sample mean that has been obtained is different from some population mean
58
Q

What is a 1 tailed hypothesis that is right hand tailed?

A
  • Something is better than the population
  • H1: sample mean > population mean
  • Looking for the p-value as the area above the z-score
59
Q

What is a 1 tailed hypothesis that is left hand tailed?

A
  • Something is worse than the population
  • H1: sample mean < population mean
  • Looking for the p-value as the area below the z-score
60
Q

What is a two tailed hypothesis?

A
  • Something is different than the population
  • H1: sample mean ≠ population mean
  • Looking for p value above and below score – take the sample mean and also find the value the same distance from the population mean on the other side. E.g., population mean = 67.5, sample mean = 70.7, the difference is 3.2 so the other value you should consider is 64.3 (the two z-scores have the same size but opposite signs)
  • Two-tailed p-value = 2 x the one-tailed p-value
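
A sketch of a z-test covering the right-tailed, left-tailed and two-tailed cases. It reuses the example means above (67.5 and 70.7); the values σ = 8 and N = 25 are assumptions added here so the calculation can run, and `z_test` is a hypothetical helper, not a library function.

```python
import math
from statistics import NormalDist

ALPHA = 0.05

def z_test(m, mu, sigma, n, tail="two"):
    """Return (z, p) for a sample mean m tested against population mean mu."""
    z = (m - mu) / (sigma / math.sqrt(n))
    snd = NormalDist()
    if tail == "right":    # H1: sample mean > population mean
        p = 1 - snd.cdf(z)
    elif tail == "left":   # H1: sample mean < population mean
        p = snd.cdf(z)
    else:                  # H1: sample mean differs from population mean
        p = 2 * (1 - snd.cdf(abs(z)))
    return z, p

z, p = z_test(m=70.7, mu=67.5, sigma=8.0, n=25, tail="two")
print(z, p)                                  # z is about 2, p is about 0.046
print("reject H0" if p < ALPHA else "fail to reject H0")

# The mirrored value 64.3 gives a z-score of the same size but opposite sign.
z_mirror, _ = z_test(m=64.3, mu=67.5, sigma=8.0, n=25)
print(z_mirror)
```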
61
Q

When can you formulate a 1 tailed hypothesis?

A
  • There is previous research

- You can predict the effect

62
Q

What is a type I error? Why does it occur?

A
  • Rejecting the null hypothesis when it was correct – occurs due to sampling error
63
Q

What is a type II error? Why does it occur?

A
  • Failing to reject the null hypothesis when it was incorrect
  • Arises for a number of reasons such as a biased sample, an error in the experimental task, a sample size that was too small etc.
64
Q

Why do we use α = 0.05?

A
  • It is small so it is difficult to reject the null hypothesis but not so small that it is impossible to do so
  • It is a compromise between type I and type II errors
65
Q

How is a student’s t distribution similar to SND?

A
  • Bell-shaped, symmetric, uni-modal
66
Q

How is a student’s t distribution different to SND?

A
  • Has a lower peak, heavier tails and more variance
67
Q

When is a student’s t distribution used?

A
  • When population s.d. is unknown
68
Q

Does student’s t distribution include a variety of t tests?

A

yes

69
Q

How do you find the t statistic?

A

T(m) = (m-µ) / (s/√N)

70
Q

How do you find the estimated standard error?

A

(s/√N) – estimated standard error

71
Q

When using t distribution table, what value should you use for v?

A

When using the t table, use t (v = N-1): subtract 1 from the sample size

72
Q

How do you find confidence intervals when population s.d. is unknown?

A
  • For 95% of repeated samples, the sample mean m would be within:
    o Some number c of e.s.e.’s of µ
    o µ – (c x s/√N) to µ + (c x s/√N)
  • To find c:
    o Find the t value (v = N-1) for 2.5% in one tail (or 5% across 2 tails)
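
Turning the statement above around gives an interval centred on the sample mean m. The sketch below assumes a made-up sample of 10 scores; because the standard library has no t distribution, the values of c are copied from a standard t table (5% across 2 tails) for a few values of v only.

```python
import math
from statistics import mean, stdev

# Critical t values for 5% across 2 tails, keyed by v = N - 1 (from a standard t table).
T_CRIT_95 = {4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045}

def ci_unknown_sigma(sample):
    n = len(sample)
    m = mean(sample)
    ese = stdev(sample) / math.sqrt(n)  # estimated standard error: s / √N
    c = T_CRIT_95[n - 1]                # look up c using v = N - 1
    return m - c * ese, m + c * ese

sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]  # hypothetical scores
low, high = ci_unknown_sigma(sample)
print(round(low, 3), round(high, 3))  # roughly 12.369 to 14.631
```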
73
Q

How do you conduct a 1 sample t test?

A
  • Same as a z test except:
  • Work out e.s.e.
  • Find t statistic
  • Check whether the t statistic is beyond the critical value for the corresponding t(v = N-1) and significance level
  • Reject or fail to reject H0
  • Interpret in context
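
The steps above can be sketched as a two-tailed test at α = 0.05. The sample and the hypothesised population means are made up, `one_sample_t` is a hypothetical helper, and the critical values are copied from a standard t table because the standard library does not provide a t distribution.

```python
import math
from statistics import mean, stdev

# Two-tailed critical values at the 0.05 level, keyed by v = N - 1 (from a standard t table).
T_CRIT_TWO_TAILED_05 = {4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045}

def one_sample_t(sample, mu):
    """Return (t, v, reject) for H0: the population mean equals mu."""
    n = len(sample)
    ese = stdev(sample) / math.sqrt(n)  # work out the e.s.e.
    t = (mean(sample) - mu) / ese       # find the t statistic
    v = n - 1                           # degrees of freedom for the table
    reject = abs(t) > T_CRIT_TWO_TAILED_05[v]
    return t, v, reject

sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]  # hypothetical scores

print(one_sample_t(sample, mu=12))  # t is about 3.0 with v = 9: reject H0
print(one_sample_t(sample, mu=13))  # t is about 1.0 with v = 9: fail to reject H0
```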
74
Q

When do you use a 1 sample t test?

A
  • Use to test whether sample mean you have is different from some given or hypothetical population mean