Stats only Flashcards

Question 1

Q

What is important to consider when looking at sample size?

Answer

A

Size matters
Sampling error can result if your sample is not large enough
Trade off between size and time/cost

Question 2

Q

Factors in deciding sample size?

Answer

A

o Design
o Response rate
o Heterogeneity of population

Question 3

Q

What is a population parameter?

Answer

A

a quantity that describes some characteristic of a population with respect to a specific variable
E.g., population mean, population range etc.
Not usually possible to calculate
Might be given to you if available

Question 4

Q

What is a sample statistic?

Answer

A

– a quantity that describes some characteristic of a sample with respect to a specific variable

E.g., sample mean, sample range etc.
We can always calculate these from a sample
Sample statistics provide an estimate of population parameters

Question 5

Q

Why is it important to summarise data?

Answer

A

Data can be very complex and therefore it is useful to summarise it
Allows for interpretation

Question 6

Q

What are measures of central tendency?

Answer

A

They provide an indication of a “typical” score in the data set

Question 7

Q

What is the mean?

Answer

A

o Provides an estimate of the average score in the data set

o Is affected by extreme data points

Question 8

Q

What is the median?

Answer

A

o Is insensitive to extreme scores in the data set

o Doesn’t reflect the shape of the scores e.g., doesn’t care how far away the extreme scores are
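
A minimal sketch of how one extreme score moves the mean but barely moves the median. The scores are made-up values chosen for illustration, not data from the course.

```python
from statistics import mean, median

# Hypothetical scores; 60 is a deliberately extreme data point.
scores = [4, 5, 5, 6, 7]
scores_with_outlier = scores + [60]

mean_before = mean(scores)                   # 5.4
median_before = median(scores)               # 5
mean_after = mean(scores_with_outlier)       # 14.5, dragged up by the outlier
median_after = median(scores_with_outlier)   # 5.5, hardly changed

print(mean_before, median_before, mean_after, median_after)
```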

Question 9

Q

What is the mode?

Answer

A

o Easy to calculate from a histogram and easy to understand – the most common value
o Data might have more than 1 mode or no mode at all
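
A small sketch of finding the mode(s) with Python's standard library, using invented data. Note that `statistics.multimode` returns every value tied for most common, so when nothing repeats it returns all the values, which is one way of seeing "no mode at all".

```python
from statistics import multimode

# Hypothetical data sets for illustration.
one_mode = multimode([2, 3, 3, 4])        # [3]
two_modes = multimode([1, 1, 2, 3, 3])    # [1, 3] (bimodal)
no_repeats = multimode([1, 2, 3])         # [1, 2, 3]: every value is tied

print(one_mode, two_modes, no_repeats)
```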

Question 10

Q

What is the range?

Answer

A

o Difference between min and max scores

o Range doesn’t always change for distributions with different shapes

Question 11

Q

What is a deviation?

Answer

A

o The signed distance of a score from the mean

Question 12

Q

How to calculate simple variance?

Answer

A

o	Calc mean
o	Calc deviations
o	Square deviations
o	Calc a slightly adjusted average of the squared deviations
        - Sum the squared deviations and divide by n-1 rather than n
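
The steps above can be sketched in Python as follows. The `hours` data are hypothetical, and the last line simply cross-checks the hand calculation against the standard library's `statistics.variance`, which also divides by n-1.

```python
from statistics import variance

hours = [2, 4, 4, 6, 9]  # hypothetical sample, in hours

n = len(hours)
mean_hours = sum(hours) / n                        # step 1: mean = 5.0
deviations = [x - mean_hours for x in hours]       # step 2: -3, -1, -1, 1, 4
squared = [d ** 2 for d in deviations]             # step 3: 9, 1, 1, 1, 16
sample_variance = sum(squared) / (n - 1)           # step 4: 28 / 4 = 7.0

print(sample_variance, variance(hours))
```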

Question 13

Q

What is the issue with simple variance?

Answer

A

A potential issue is the units: if the deviations are in hours, squaring them gives units of hours squared, which are hard to interpret

Question 14

Q

How to calculate sample standard deviation?

Answer

A

o	Calc mean
o	Calc deviations
o	Square deviations
o	Calc sample variance
o	Take square root of sample variance – now back in comprehensible units
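
A short sketch of the same steps in Python, assuming a hypothetical sample measured in hours. The result is compared with `statistics.stdev`, which computes the sample standard deviation.

```python
from math import sqrt
from statistics import stdev

hours = [2, 4, 4, 6, 9]  # hypothetical sample, in hours

mean_hours = sum(hours) / len(hours)
# Sample variance: squared deviations summed, then divided by n-1 (hours squared).
sample_variance = sum((x - mean_hours) ** 2 for x in hours) / (len(hours) - 1)
# Square root brings the units back to hours.
sample_sd = sqrt(sample_variance)  # about 2.65 hours

print(sample_sd, stdev(hours))
```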

Question 15

Q

What is a histogram?

Answer

A

-	Good way to inspect data
o	Can see if there’s any odd-looking scores
o	Can see the mode
o	Can see how spread out the scores are
o	Can see how the data is distributed

Question 16

Q

What is a box plot?

Answer

A

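Shows the median as a line inside a box spanning the lower and upper quartiles, with whiskers out to the more extreme scores
Can be plotted vertically or horizontally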

Question 17

Q

What is a scatter plot?

Answer

A

shows the relationship between variables

Question 18

Q

What is a data summary plot?

Answer

A

Plot bar showing mean (categorical data) or line graph (numerical data)
Plot error bars showing +/- 1s.d.

Question 19

Q

What is distribution of data?

Answer

A

the manner in which data for a particular variable is spread over its range is commonly referred to as its distribution

Question 20

Q

What is normally distributed data?

Answer

A

Many naturally occurring variables are Normal
E.g., height, IQ (not naturally occurring but has been defined as this)
If we don’t have much data then the normality can be difficult to see in a histogram
As sample size increases, the normality will emerge

Question 21

Q

What is non-normal data?

Answer

A

Has a tail either to the right or left – skewed data
Positively skewed = long tail to the right, peaks at the left
Negatively skewed = long tail to left, peaks at the right
E.g., reaction time – tends to be positively skewed

Question 22

Q

What is the danger of non-normal data?

Answer

A

Danger – mean is distorted by the tails which are the more extreme values

Question 23

Q

What is the danger of bimodal data?

Answer

A

Danger – mean is not representative

- Tends to suggest an issue with your experiment – more than one underlying population

Question 24

Q

What is bimodal data?

Answer

A

Data that has two modes

Question 25

Q

What is the normal distribution?

Answer

A

Bell-shaped
Symmetric about the centre
Tails never reach 0 – they extend towards infinity in both directions
The area under the curve is always equal to 1
Very close to 0 by the time it gets to 3 SD from the mean – can use this to draw a rough idea of a normal distribution

Question 26

Q

What is probability?

Answer

A

– a measure of how likely it is that an uncertain event will occur

Question 27

Q

What is conditional probability?

Answer

A

Probability of an event given that something else is known/assumed e.g., A|B
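
A tiny worked example of A|B, using the standard relationship P(A|B) = P(A and B) / P(B). The events and counts are invented for illustration.

```python
# Hypothetical counts for 100 students.
n_total = 100
n_b = 40          # event B: the student revised
n_a_and_b = 30    # events A and B: the student revised and passed

p_b = n_b / n_total
p_a_and_b = n_a_and_b / n_total
# Probability of passing GIVEN that the student revised.
p_a_given_b = p_a_and_b / p_b   # about 0.75

print(round(p_a_given_b, 2))
```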

Question 28

Q

What is a z-score?

Answer

A

Z measures how far away a score x is from the population mean in multiples of the SD
If you were to find z-scores for all points on a normal distribution, you would find that it would form a normal distribution with mean 0 and SD 1 – N (0, 1)
The area underneath a normal distribution above/below some variable value of x EQUALS the area underneath N (0, 1) above/below z

Question 29

Q

How do you obtain a z-score?

Answer

A

Obtained by subtracting the population mean from x and then dividing by the population SD – (x-µ)/σ
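
The formula can be sketched as a one-line function. The function name and the example values (a score of 130 on a scale with mean 100 and SD 15) are assumptions for illustration.

```python
def z_score(x, mu, sigma):
    """Distance of x from the population mean mu, in multiples of sigma."""
    return (x - mu) / sigma

# Hypothetical example: score of 130, population mean 100, population SD 15.
z = z_score(130, 100, 15)   # 2.0, i.e. two SDs above the mean

print(z)
```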

Question 30

Q

What is a SND table and how do you use it?

Answer

A

Table that provides values of areas underneath the SND in different ranges
Find z-score (first column) then decide if you want the area above or below this score
If z-score is negative, use the positive value in the table but be careful when choosing above or below because the scores will be flipped
o E.g., z-score = -2 and you want the area below. On table you will use z-score 2 but use the area above
If you have a range that is bounded e.g., 70
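
As a rough software stand-in for the printed table, Python's `statistics.NormalDist` gives areas under N(0, 1) directly. The sketch below checks the negative z-score rule from the example above: the area below z = -2 equals the area above z = 2. The helper names are made up.

```python
from statistics import NormalDist

snd = NormalDist(mu=0, sigma=1)  # the standard normal distribution, N(0, 1)

def area_below(z):
    """Area under N(0, 1) to the left of z."""
    return snd.cdf(z)

def area_above(z):
    """Area under N(0, 1) to the right of z."""
    return 1 - snd.cdf(z)

below_neg2 = area_below(-2)   # about 0.0228
above_pos2 = area_above(2)    # the same area, by symmetry

print(round(below_neg2, 4), round(above_pos2, 4))
```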

Question 31

Q

What is a sampling error?

Answer

A

Sampling error – the error associated with examining statistics calculated from a sample rather than the population

Question 32

Q

Why do sampling errors occur?

Answer

A

It occurs because in our sample we don’t have all the members of the population

Question 33

Q

What does the magnitude of a sampling error depend on?

Answer

A

The sample size

Bigger sample = big sampling error less likely
Smaller sample = big sampling error more likely

Question 34

Q

How do we generate a sampling distribution?

Answer

A

Take a sample (size N) from a population
Calculate a sample statistic (e.g., mean, SD etc.)
Add the new statistic to a frequency plot (a histogram) of the sample statistic
Repeat the above 3 steps multiple times
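
The procedure above can be simulated. This is a sketch under stated assumptions: the population (normal, mean 50, SD 10), the sample size of 25 and the 2000 repeats are all invented, and the collected statistics are kept in a list that could be passed to any histogram routine.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical parent population of 10,000 scores.
population = [random.gauss(50, 10) for _ in range(10_000)]

def sampling_distribution(population, n, repeats, statistic=mean):
    """Repeatedly sample n scores and record the chosen sample statistic."""
    results = []
    for _ in range(repeats):
        sample = random.sample(population, n)   # take a sample of size n
        results.append(statistic(sample))       # calculate the statistic
    return results                              # values to plot as a histogram

sample_means = sampling_distribution(population, n=25, repeats=2000)

print(round(mean(sample_means), 2), round(stdev(sample_means), 2))
```

With these assumed values the mean of the sample means should sit near 50 and their SD near 10/√25 = 2.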

Question 35

Q

What does the sampling distribution tell us?

Answer

A

Tells us important info about how a statistic changes from sample to sample
What is the mean value of the statistic over all samples?
How variable is the statistic over all samples?
What shape is the distribution of the statistic over all samples?

Question 36

Q

What are the properties of the sampling distribution of the mean (SDM)?

Answer

A

Mean is the same as that of the parent population
SD is different to that of the parent population – find it by calculating σ/√N, where σ is the SD of the parent population and N is the sample size
This SD is called the standard error of the mean (s.e.m.) or standard error (s.e.)
For N > 1, the s.e.m. must be smaller than the SD of the parent population because you are dividing by something that is bigger than one

Question 37

Q

What is a parent population a distribution of?

Answer

A

Parent population is a distribution of individual scores x (e.g., from an individual person or thing)

Question 38

Q

What is SDM a distribution of?

Answer

A

SDM is a distribution of sample means for samples of size N drawn at random from the parent population

Question 39

Q

What is central limit theorem?

Answer

A

Given a population with mean µ and SD σ, the sampling distribution of the mean approaches a normal distribution with mean µ and SD σ/√N as N increases
This is true regardless of the underlying distribution – so even if your population is not normal, the distribution of means sampled from it will be approximately normal when N is large
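
A simulation sketch of the theorem, assuming a strongly skewed parent population (exponential with mean 1 and SD 1, chosen only for illustration). The gap between mean and median is used here as a crude skew indicator: it should shrink as N grows, while the SD of the sample means should approach σ/√N.

```python
import random
from math import sqrt
from statistics import mean, median, stdev

random.seed(2)  # fixed seed so the sketch is repeatable

def sample_mean(n):
    """Mean of n draws from a skewed population (exponential, mean 1, SD 1)."""
    return mean([random.expovariate(1.0) for _ in range(n)])

means_n2 = [sample_mean(2) for _ in range(5000)]
means_n50 = [sample_mean(50) for _ in range(5000)]

# Crude skew indicator: positive skew pulls the mean above the median.
gap_n2 = mean(means_n2) - median(means_n2)
gap_n50 = mean(means_n50) - median(means_n50)

sd_n50 = stdev(means_n50)        # should be close to 1 / sqrt(50)
expected_sd = 1 / sqrt(50)

print(round(gap_n2, 3), round(gap_n50, 3), round(sd_n50, 3), round(expected_sd, 3))
```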

Question 40

Q

How do you find a z-score for a SDM?

Answer

A

z-score = (m-µ)/(σ/√N), where m is the sample mean

Question 41

Q

What is a point estimate?

Answer

A

– a single value estimate of a population parameter e.g., sample mean

Question 42

Q

What is an interval estimate?

Answer

A

– a range of possible values of a population parameter e.g., confidence interval

Question 43

Q

What is a confidence interval?

Answer

A

– describes an interval (e.g., a range) of values for our population parameter, together with a specified level of confidence that the parameter is in that range

Question 44

Q

For a sample drawn at random from a normal population N (µ, σ) with known s.d. σ, the 95% CI for the population mean is centred on the sample mean m and goes from?

Answer

A

m – (1.96 x σ/√N) to m + (1.96 x σ/√N)
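
The interval can be sketched as a small function. The function name and the example numbers (m = 70.7, σ = 10, N = 25) are assumptions for illustration.

```python
from math import sqrt

def ci95_known_sigma(m, sigma, n):
    """95% CI for the population mean when the population SD sigma is known."""
    half_width = 1.96 * sigma / sqrt(n)   # 1.96 standard errors
    return m - half_width, m + half_width

# Hypothetical values: sample mean 70.7, population SD 10, sample size 25.
lower, upper = ci95_known_sigma(m=70.7, sigma=10, n=25)

print(round(lower, 2), round(upper, 2))   # 66.78 74.62
```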

Question 45

Q

What does a 95% confidence interval mean?

Answer

A

A 95% confidence level means that if we repeated our sampling many times and worked out a new CI each time centred on our new sample mean we would expect the population mean to be in the interval on 95% of those repeats

Question 46

Q

True or False, if centred on sample mean, there is a 95% chance that the population mean is also in the range and vice versa (if looking for a 95% confidence interval)?

Question 47

Q

True or False, if centred on sample mean, there is a 5% chance that the population mean falls outside of this range and vice versa (for a 95% confidence interval)?

Question 48

Q

What are the steps for null hypothesis testing?

Answer

A

- Formulate research hypothesis  	
o	Null hypothesis (H0)
o	Research hypothesis (H1)
- Collect data
- Evaluate inconsistency between H0 and the data
o	How inconsistent are the data with H0?
- Reject or fail to reject H0?
- Interpret in context

Question 49

Q

True or false, If we were able to reject the null (H0) in favour of the research hypothesis (H1) then we can claim to have evidence for the research hypothesis?

Question 50

Q

True or false, If we fail to reject the null (H0) then we can claim to have evidence for the null hypothesis?

Question 51

Q

What do values of p > α suggest?

Answer

A

suggest not inconsistent with H0: fail to reject null

Question 53

Q

What do values of p < α suggest?

Answer

A

suggest inconsistent with H0: reject the null

Question 54

Q

What is the value of α in stats?

Answer

A

α = 0.05 by convention

Question 55

Q

What is the p-value?

Answer

A

p-value = the conditional probability of obtaining a sample statistic at least as extreme as yours, given that H0 is true

Question 56

Q

How do you conduct a z-test?

Answer

A

Use NHST framework
Calculate inconsistency with mean by calculating the z-score, use the table to find the associated p-value and compare this to 0.05 to decide whether to reject or fail to reject the null hypothesis
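
A hedged sketch of a right-tailed z-test, with `statistics.NormalDist` standing in for the printed table. The function name and all numbers (m = 103, µ = 100, σ = 15, N = 100) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_right_tailed(m, mu, sigma, n, alpha=0.05):
    """Return (z, p, reject) for H1: sample mean > population mean."""
    z = (m - mu) / (sigma / sqrt(n))   # z-score of the sample mean
    p = 1 - NormalDist().cdf(z)        # area above z under N(0, 1)
    return z, p, p < alpha             # reject H0 when p < alpha

# Hypothetical values for illustration.
z, p, reject = z_test_right_tailed(m=103, mu=100, sigma=15, n=100)

print(round(z, 2), round(p, 4), reject)   # 2.0 0.0228 True
```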

Question 57

Q

When is a z-test used?

Answer

A

To check if a sample mean that has been obtained is different from some population mean

Question 58

Q

What is a 1 tailed hypothesis that is right hand tailed?

Answer

A

Something is better than the population
H1: sample mean > population mean
Looking for p-value above score

Question 59

Q

What is a 1 tailed hypothesis that is left hand tailed?

Answer

A

Something is worse than the population
H1: sample mean < population mean
Looking for p-value below score

Question 60

Q

What is a two tailed hypothesis?

Answer

A

Something is different than the population
H1: sample mean ≠ population mean
Looking for p value above and below score – have sample mean and then also find another value the same distance away from the population mean but on the other side. E.g., population mean = 67.5, sample mean = 70.7, the difference is 3.2 so the other value you should consider is 64.3 (z-score will be the same for the two)
Conditional probability (the two-tailed p-value) = 2 x the area in one tail
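
The mirrored-value reasoning above can be sketched numerically. The population mean 67.5 and sample mean 70.7 come from the example; the population SD of 10 and sample size of 25 are NOT given in the card and are assumed here purely so the calculation can run.

```python
from math import sqrt
from statistics import NormalDist

mu, m = 67.5, 70.7       # from the example above
sigma, n = 10, 25        # assumed values, not from the card

diff = m - mu                          # 3.2
mirror = mu - diff                     # 64.3, same distance on the other side
z = diff / (sigma / sqrt(n))           # 1.6 (the mirror value gives -1.6)

one_tail = 1 - NormalDist().cdf(abs(z))   # area beyond z in one tail
two_tailed_p = 2 * one_tail               # about 0.11 with these assumed values

print(round(mirror, 1), round(z, 2), round(two_tailed_p, 4))
```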

Question 61

Q

When can you formulate a 1 tailed hypothesis?

Answer

A

There is previous research

- You can predict the effect

Question 62

Q

What is a type I error? Why does it occur?

Answer

A

Rejecting the null hypothesis when it was correct – occurs due to sampling error

Question 63

Q

What is a type II error? Why does it occur?

Answer

A

Failing to reject the null hypothesis when it was incorrect
Arises for a number of reasons, such as a biased sample, an error in the experimental task, or a sample size that was too small

Question 64

Q

Why do we use α = 0.05?

Answer

A

It is small so it is difficult to reject the null hypothesis but not so small that it is impossible to do so
It is a compromise between type I and type II errors

Answer 61

A

Bell-shaped, symmetric, uni-modal

Answer 62

A

Has a lower peak, heavier tails and more variance

Answer 63

A

When population s.d. is unknown

Answer 64

A

T(m) = (m-µ) / (s/√N)

Answer 65

A

(s/√N) – estimated standard error

Answer 66

A

When using the t table, use t(v = N-1): the degrees of freedom v are the sample size minus 1

Answer 67

A

For 95% of repeated samples, the sample mean m would be within:
o Some number c of e.s.e.’s of µ
o (µ- (c x s/√N) to µ+ (c x s/√N))
To find c:
o Find t value for 0.025 (2.5%) in one tail (or 0.05 (5%) for 2 tails)
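
A minimal sketch of the interval, assuming N = 10 so that v = 9. Python's standard library has no t distribution, so the value of c is hard-coded from a printed t table (2.262 for v = 9 with 0.025 in one tail). The centre value and s are hypothetical.

```python
from math import sqrt

# c copied from a t table for v = 9, 0.025 in one tail; a hard-coded assumption.
C_DF9 = 2.262

def t_interval(centre, s, n, c):
    """Interval of c estimated standard errors either side of centre."""
    ese = s / sqrt(n)          # estimated standard error, s / sqrt(N)
    half_width = c * ese
    return centre - half_width, centre + half_width

# Hypothetical values: centre (µ in the card) 20, sample SD 4, N = 10.
low, high = t_interval(centre=20.0, s=4.0, n=10, c=C_DF9)

print(round(low, 2), round(high, 2))   # 17.14 22.86
```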

Answer 68

A

Same as a z test except:
Work out e.s.e.
Find t statistic
Check whether the t statistic is more extreme than the critical value for the corresponding t(v), where v = N-1, at the chosen significance level
Reject or fail to reject H0
Interpret in context
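
The steps above can be sketched for a two-tailed test at the 0.05 level. The sample and the hypothesised population mean are invented, and the critical value 2.262 (v = 9) is hard-coded from a printed t table because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu):
    """Return the t statistic and degrees of freedom for a one-sample t-test."""
    n = len(sample)
    ese = stdev(sample) / sqrt(n)       # estimated standard error, s / sqrt(N)
    t = (mean(sample) - mu) / ese       # t statistic
    return t, n - 1                     # degrees of freedom v = N - 1

# Hypothetical sample of N = 10 scores, tested against a population mean of 12.
sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
t_stat, df = one_sample_t(sample, mu=12)

T_CRITICAL = 2.262                      # from a t table: v = 9, two-tailed 0.05
reject = abs(t_stat) > T_CRITICAL       # reject H0 if t is beyond the critical value

print(round(t_stat, 2), df, reject)     # 3.0 9 True
```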

Answer 69

A

Use to test whether sample mean you have is different from some given or hypothetical population mean