Stats, Study Design and Power Flashcards

1
Q

What is the difference between the mean and the median?

A

Mean - the sum of all values divided by n. It is strongly affected by outliers.

Median - the middle value when the data are placed in order. It is far less affected by outliers, so it is suitable for skewed data sets as well as normally distributed ones.
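
A quick illustration of the point about outliers, as a minimal sketch using Python's standard `statistics` module. The numbers are made up for the example.

```python
from statistics import mean, median

# Hypothetical data set, then the same data with one extreme outlier added
values = [2, 3, 3, 4, 5]
with_outlier = values + [100]

# The mean is dragged towards the outlier; the median barely moves
print("mean:  ", mean(values), "->", mean(with_outlier))
print("median:", median(values), "->", median(with_outlier))
```

Here the mean jumps from 3.4 to 19.5, while the median only moves from 3 to 3.5.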

2
Q

What does a normal distribution graph look like?

A

Mean, median and mode are all equal
Symmetrical
Commonly described as a bell curve
About 95% of the population lies within 2 SD of the mean (more precisely, 1.96 SD)
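
The 95% figure can be checked numerically. This is a small sketch using `statistics.NormalDist` from the Python standard library; the variable names are only illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1
standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of the population within 2 SD and within 1.96 SD of the mean
within_2_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)
within_196_sd = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

print(round(within_2_sd, 4))    # slightly more than 95%
print(round(within_196_sd, 4))  # almost exactly 95%
```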

3
Q

What are the main parts on a box plot?

A

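The box: spans the lower quartile (Q1) to the upper quartile (Q3), i.e. the interquartile range
The median: the line inside the box
The two whiskers: extend to the min value and the max value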

4
Q

What is the null hypothesis?

A

The hypothesis that the test sets out to disprove.
It states that there is no difference between two data sets.

5
Q

What is the alternate hypothesis?

A

The hypothesis that the study tests for.
It states that there is a difference between two data sets, and it is supported when the null hypothesis is rejected.

6
Q

What is the equation for SD?

A

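$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
where μ (mu) = mean, x_i = each value and N = number of values. For a sample rather than a whole population, divide by n - 1 instead of N.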

7
Q

What is the variance?

A

This is the square of the SD - it is the value obtained before the square root is taken
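
The relationship between variance and SD can be seen in a short sketch using the standard `statistics` module. The heights are invented for illustration, and the population versions (`pvariance`, `pstdev`) are assumed here; `variance` and `stdev` are the sample (n - 1) equivalents.

```python
from math import isclose, sqrt
from statistics import pstdev, pvariance

# Hypothetical heights in cm
heights_cm = [150, 160, 170, 180, 190]

variance = pvariance(heights_cm)  # units: cm squared
sd = pstdev(heights_cm)           # units: cm, same as the data

print(variance, sd)
print(isclose(sd, sqrt(variance)))  # SD is the square root of the variance
```

The variance comes out in cm squared, whereas the SD is back in cm and can be compared directly with the mean.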

8
Q

Why isn’t the variance used?

A

Because it is squared, its units are also squared (e.g. cm^2 for data measured in cm), which makes it awkward to interpret

9
Q

What does the standard deviation measure?

A

It measures the variation around the mean within a population.

10
Q

What data is the T-test used for?

A

Continuous quantitative data

11
Q

What are the assumptions made with a T-test?

A

Scale of measurement: the data are continuous (interval or ratio scale)
Random sampling
Normality of data distribution
Adequacy of sample size
Equality of variance (and so of SD) between the groups

12
Q

What values are taken from the T-test? What do each of the values mean?

A

P value - the probability of getting a difference at least as large as the one observed if the null hypothesis were true (significant if < 0.05, the accepted rate of type I errors)
T statistic - the larger its absolute value, the bigger the difference between the two data sets relative to the variation within them
Degrees of freedom ((group 1 n - 1) + (group 2 n - 1)) - the number of values in a population that are free to vary
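
How the T statistic and degrees of freedom arise can be sketched by hand. This assumes an independent two-sample t-test with equal variances (the pooled form); the function name and the data are hypothetical. The P value is then looked up from the t distribution for that T and df, which needs a statistics package and is not shown here.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(group1, group2):
    """Return (t, df) for an equal-variance two-sample t-test."""
    n1, n2 = len(group1), len(group2)
    df = (n1 - 1) + (n2 - 1)
    # Pooled variance: sample variances weighted by their degrees of freedom
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(group1) - mean(group2)) / standard_error
    return t, df

# Hypothetical measurements
control = [5.1, 4.9, 5.0, 5.2, 4.8]
treated = [5.6, 5.8, 5.5, 5.9, 5.7]

t, df = pooled_t_statistic(treated, control)
print(round(t, 2), df)
```

With these numbers the means differ by 0.7 and the standard error is 0.1, giving T of about 7 on 8 degrees of freedom.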

13
Q

What data is the ANOVA test used for?

A

Continuous quantitative data - when there are more than 2 data sets (groups) to compare

14
Q

What does ANOVA stand for?

A

Analysis of variance

15
Q

What are the assumptions of the ANOVA test?

A

Continuous quantitative data
Must be normally distributed
The variance must be homogeneous (approx. equal level of variability in each group)
Samples must be independent of each other / not paired

16
Q

What data is the Chi-squared test used for?

A

Qualitative data / categorical

17
Q

What is the chi-squared test used for?

A

It is used to compare a set of observed data with an expected set of data
χ² = Σ (observed - expected)^2 / expected
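
A minimal sketch of the chi-squared statistic, summing (observed - expected)^2 / expected over every category. The function name and the counts are hypothetical.

```python
def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts: 100 coin tosses, 50 heads and 50 tails expected
observed = [45, 55]
expected = [50, 50]

stat = chi_squared(observed, expected)
print(stat)
```

Each category contributes 25 / 50 = 0.5, so the statistic is 1.0. With two categories (1 degree of freedom) the critical value at the 0.05 level is 3.84, so this result would not be significant.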

18
Q

What are non-parametric tests?

A

These are tests that are used for data sets that don’t meet the stringent requirements for parametric tests.
Parametric tests need data measured on an interval scale - the units go up in a linear fashion, so each step is of the same size. Non-parametric tests can also be used on ordinal (ranked) data.

19
Q

What is the Mann-Whitney U test used for?

A

To compare two independent groups of non-parametric data, in a similar manner to the T-test

20
Q

What is the Spearman correlation test used for?

A

To determine the strength and direction of a correlation between two variables, based on the ranks of the values.
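
As a rough sketch, the Spearman coefficient can be computed from the differences between ranks using rho = 1 - 6 Σd² / (n(n² - 1)). This shortcut assumes there are no tied values; the helper names and the data are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman rank correlation using the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical data: hours revised vs test score
hours = [1, 2, 3, 4, 5]
score = [10, 20, 15, 40, 50]

rho = spearman_rho(hours, score)
print(rho)
```

The sign gives the direction (positive here) and the size, between -1 and +1, gives the strength of the correlation.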

21
Q

What do you need to take into consideration when designing an experiment?

A

Number of samples
Type of data that will be collected
Controls needed when obtaining samples
A power calculation to work out the sample size (N) needed

22
Q

What are the methods for data analysis?

A

Averages - median/mean
Statistical tests

23
Q

What are the two different types of experiments?

A

Manipulative & Observational

24
Q

Explain what a manipulative type of experiment entails

A

It is when one or more factors are deliberately altered.
It explores cause and effect.
It includes treatments (a treatment = a set of conditions).
It requires blocks (a block = a group of replicates subjected to the same treatment).
Some examples: assessing DNA recovery from samples stored in wet and dry environments; assessing counts of a cancer cell line after incubation with differing concentrations of an inhibitor drug.

25
Q

What are factorial designs and what are they used for?

A

A factorial design varies several factors together, as an alternative to changing one factor at a time.
This enables the data to be collected in fewer experiments and therefore in a shorter period of time.

26
Q

Explain what an observational type of experiment entails

A

This is the investigation of the links between variables of interest occurring in natural conditions.
Allows comparisons between natural measurements
It uses sets of natural measurements
It uses groups of replicates
Some examples are: What are the consequences of smoking on lung function? Can you identify biomarkers that predict survival outcomes in cancer?
The experimenter does not manipulate anything; they only observe and take measurements.

27
Q

What is the take-home message on statistical power?

A

Do not let your study be a waste of time
Use the right number of samples
Do not waste resources
Too many samples is better than too few
Allow for attrition (loss of data)

28
Q

Why is it important to use the correct sample size?

A

If the sample is too small, the study may not be able to reject the null hypothesis even when a real difference exists
If the sample is too large, some subjects have been unnecessarily exposed to risk of harm and resources are wasted
We use a sample rather than the whole population - from this we draw inferences

29
Q

What is the null hypothesis?

A

This states there is no difference between two populations.
A properly designed study offers clarity: if the two populations differ, reject the null hypothesis; if the populations are equal / don’t differ, accept the null hypothesis.
If the null hypothesis is accepted, we need to know that we looked at enough subjects to have detected a difference had one existed.

30
Q

What is H0?

A

The null hypothesis

31
Q

What is a type I error?

A

False positive - the alternate was accepted but the null hypothesis is actually true

32
Q

What is the type I error rate set to? And what is it also known as?

A

It is conventionally set to 0.05. It is also known as alpha. This is why a result is called significant when p < 0.05

33
Q

What is a type II error?

A

False negative - the null was accepted but the alternate was actually true

34
Q

What is the type II error rate known as?

A

It is known as beta.
It is the probability that a false H0 is retained.

35
Q

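What are the 4 outcomes when testing a hypothesis?

A

H0 is true and is accepted - correct decision (true negative)
H0 is true but is rejected - type I error (false positive)
H0 is false and is rejected - correct decision (true positive)
H0 is false but is accepted - type II error (false negative)
There is always the risk of a false positive or false negative when accepting/rejecting the null hypothesis
Where the null hypothesis is false, if we reduce alpha (false positives) then we increase beta (false negatives)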

36
Q

How do we remember type I and Type II?

A

I -> False positive (P has one leg)
II -> False negative (N has two legs)

37
Q

What is the definition of alpha?

A

This is the probability of a false positive (type I error) - commonly set to 0.05
This is when the null hypothesis was rejected but should have been accepted

38
Q

What is the definition of beta?

A

The probability of a false negative (type II error)
This is when the null hypothesis was accepted but should not have been

39
Q

What is the definition of statistical power? What are their common accepted probabilities?

A

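Statistical power is the probability of correctly rejecting a false null hypothesis.
power = 1 - beta
Commonly accepted values: power = 0.8, so beta = 0.2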

40
Q

What is the effect size? What does it relate to?

A

Effect size is the measure of the size of the difference between two populations.
The larger the effect size, the fewer samples are needed
The smaller the effect size, the more samples are needed
It measures the strength of the result; it is solely magnitude based

41
Q

What does the value from the effect size mean (an example)?

A

An effect size of 0.7 means that the score of the average student in the intervention group is 0.7 standard deviations higher than that of the average student in the 'control group'
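
A standardised effect size of this kind can be sketched as the difference between two group means divided by their pooled SD (this form is known as Cohen's d). The function name and the scores are hypothetical, and the pooled SD assumes both groups have similar variances.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(control, intervention):
    """Difference in means expressed in pooled standard deviation units."""
    n1, n2 = len(control), len(intervention)
    pooled_var = ((n1 - 1) * variance(control)
                  + (n2 - 1) * variance(intervention)) / (n1 + n2 - 2)
    return (mean(intervention) - mean(control)) / sqrt(pooled_var)

# Hypothetical test scores
control_scores = [10, 12, 14, 16, 18]
intervention_scores = [12, 14, 16, 18, 20]

d = cohens_d(control_scores, intervention_scores)
print(round(d, 2))
```

Here the means differ by 2 points and the pooled SD is about 3.16, so the average intervention score is about 0.63 SD above the average control score.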

42
Q

What is the p-value? What is it dependent on?

A

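The P value is the probability of getting a result at least as extreme as the one found if the null hypothesis were true, i.e. how likely it is that the finding is due to chance alone
P values are very dependent on sample size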

43
Q

What is the best practice for calculating power?

A

Report effect sizes and P values, and conduct a power analysis.
Emphasise that the study has sufficient power to find differences based on a P value

44
Q

How are most power calculations carried out?

A

Through the relationship between effect size and sample size
Alpha is commonly set to 0.05
Power is usually set to 0.8 or 0.9
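
The link between effect size, alpha, power and sample size can be sketched with a common normal approximation for comparing two group means (two-sided test): n per group ≈ 2 × ((z for alpha + z for power) / effect size)². The function name is hypothetical, and exact calculations based on the t distribution give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate sample size per group for a two-sided two-sample test."""
    z = NormalDist()                    # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = z.inv_cdf(power)          # about 0.84 for power = 0.8
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Larger effect sizes need fewer samples per group
for es in (0.2, 0.5, 0.8):
    print(es, n_per_group(es))
```

Under these assumptions an effect size of 0.5 needs about 63 per group, whereas an effect size of 0.2 needs several hundred.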

45
Q

How can the effect size be predicted?

A

Pilot study - small-scale study
Literature research - see what others have done
It might need to be calculated from the results if not reported in a paper
A common understanding of a meaningful effect in your field of interest
Educated guess/ assumption

46
Q

What are the different types of power analysis?

A

A priori
Post-hoc
Sensitivity
Criterion ( rarely used)

47
Q

Why is effect size needed?

A

It is needed to design the experiment so that the value of N can be calculated - in an a priori power analysis

48
Q

What is the definition of effect size?

A

Effect size is the measure of the strength of the relationship between two variables

49
Q

How can effect size be calculated other than that mentioned before?

A

Cohen’s d is a widely used standardised mean effect size used when comparing two means in a standard deviation unit
Cohen’s d = (M2 - M1) / SD pooled
Interpretation of the results is subjective but Cohen suggested that: d = 0.2 is a small effect, d = 0.5 is a medium effect and d = 0.8 is a large effect

50
Q

What does a large effect size mean?

A

The higher the effect size the lower N has to be

51
Q

Influence of expected ES on N

A

The larger the expected effect size, the lower the N needed

52
Q

What effect does increasing sample size have?

A

It increases power, but the relationship is not linear
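
The non-linear relationship can be seen in a rough sketch using a normal approximation for a two-sided, two-sample comparison. The function name is hypothetical, an effect size of 0.5 is assumed, and the tiny contribution from the opposite tail is ignored.

```python
from statistics import NormalDist

def approx_power(n, effect_size=0.5, alpha=0.05):
    """Approximate power with n subjects in each of two groups."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    return z.cdf(effect_size * (n / 2) ** 0.5 - z_alpha)

# Doubling the sample size each time does not add the same amount of power
for n in (16, 32, 64, 128):
    print(n, round(approx_power(n), 2))
```

Power rises steeply at first and then levels off as it approaches 1, so each extra subject adds less power than the one before.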

53
Q

Why are power calculations important?

A

So that time + resources aren’t wasted
So that participants are not exposed to unnecessary risk, if the study is a clinical trial

54
Q

What are the consequences of a poorly planned study?

A

Waste of Time
Waste of Money

55
Q

What are the different types of power analysis?

A

A priori
Post hoc
Sensitivity
Criterion

57
Q

What are the variables necessary for a power calc. and how are they defined?

A

Effect size - measure of the strength of the relationship between two variables
Sample size - number of participants
Statistical power - the probability that the test correctly rejects a false null hypothesis (power = 1 - beta)
Alpha/statistical significance level - the probability of rejecting the null hypothesis when it is true

58
Q

What type of testing is statistical testing used in?

A

Hypothesis testing