Stats, Study Design and Power Flashcards

1
Q

What is the difference between the mean and the median?

A

The mean is the sum of all values divided by n; it is strongly affected by outliers.

The median is the middle value of the sorted data. It is much less affected by outliers because it depends on the order of the values rather than their size. It is suitable for both normally distributed and skewed data sets.
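
A minimal sketch of the effect of an outlier on the two averages, using Python's standard `statistics` module. The data values are made up for illustration.

```python
from statistics import mean, median

# Hypothetical data: five typical values, then the same values plus one outlier.
values = [4, 5, 5, 6, 7]
with_outlier = values + [100]

# Without the outlier the mean (5.4) and median (5) are close together.
print(mean(values), median(values))

# With the outlier the mean jumps to about 21.2, while the median only moves to 5.5.
print(mean(with_outlier), median(with_outlier))
```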

2
Q

What does a normal distribution graph look like?

A

Mean, median and mode are all equal
Symmetrical
Commonly described as a bell curve
Approximately 95% of the population lies within 2 SD of the mean
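
The 95% figure can be checked with a short sketch using `statistics.NormalDist` from the standard library. The helper name `fraction_within` is an illustrative choice, not a library function.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal population lying within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_2_sd = fraction_within(2)        # a little over 95%
within_1_96_sd = fraction_within(1.96)  # almost exactly 95%
print(round(within_2_sd, 4), round(within_1_96_sd, 4))
```

Strictly, 95% corresponds to 1.96 SD; 2 SD is the usual rounded rule of thumb.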

3
Q

What are the main parts on a box plot?

A

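The box: spans the lower quartile (Q1) to the upper quartile (Q3)
The median: the line inside the box
The two whiskers: extend to the minimum and maximum values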
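
A rough sketch of the numbers a box plot is drawn from, assuming the usual convention that the box runs from the lower to the upper quartile and the whiskers reach the minimum and maximum. The function name and data are hypothetical; note that different software uses slightly different quartile methods.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return the values a simple box plot displays."""
    ordered = sorted(data)
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, q2, q3 = quantiles(ordered, n=4)
    return {
        "min": ordered[0],    # end of the lower whisker
        "q1": q1,             # bottom of the box
        "median": q2,         # line inside the box
        "q3": q3,             # top of the box
        "max": ordered[-1],   # end of the upper whisker
    }

summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(summary)
```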

4
Q

What is the null hypothesis?

A

The hypothesis that the study sets out to disprove.
It states that there is no difference between the two data sets.

5
Q

What is the alternate hypothesis?

A

The hypothesis that the study tests for.
It states that there is a difference between the two data sets.

6
Q

What is the equation for SD?

A

SD = square root of ( sum of (x - μ)^2 / N )
where μ (mu) = mean, x = each value and N = number of values

7
Q

What is the variance?

A

The variance is the square of the SD; equivalently, it is the value obtained before the square root is taken to give the SD.
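
The link between variance and SD can be sketched as follows, assuming the population form (dividing by N; the sample form divides by N - 1 instead). Function names and data are illustrative only.

```python
from math import sqrt

def population_variance(values):
    """Mean of the squared deviations from the mean."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def population_sd(values):
    """The SD is the square root of the variance."""
    return sqrt(population_variance(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]       # hypothetical measurements, mean = 5
print(population_variance(data))      # 4.0 (in squared units)
print(population_sd(data))            # 2.0 (in the original units)
```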

8
Q

Why isn’t the variance used?

A

Because the deviations are squared, the variance is in squared units, which makes it awkward to interpret

9
Q

What does the standard deviation measure?

A

It measures the variation around the mean within a population.

10
Q

What data is the T-test used for?

A

Continuous quantitative data

11
Q

What are the assumptions made with a T-test?

A

Scale of measurement - the data are continuous
Random sampling
Normality of data distribution
Adequacy of sample size
Equality of variance (SD) between the groups

12
Q

What values are taken from the T-test? What do each of the values mean?

A

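P value - the probability of obtaining results at least as extreme as those observed if the null hypothesis were true (should be <0.05, the accepted type I error rate, for the result to be called significant)
T statistic - the larger its absolute value, the bigger the difference between the means of the two data sets relative to the variation within them
Degrees of freedom ((group 1 n - 1) + (group 2 n - 1)) - the number of values in a population that are free to vary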
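
A minimal sketch of how the T statistic and degrees of freedom could be computed for two independent groups, assuming equal variances (Student's unpaired T-test). The function name and data are hypothetical; the P value is left out because it needs the t distribution, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance

def students_t(group1, group2):
    """Return (t statistic, degrees of freedom) for an unpaired, equal-variance T-test."""
    n1, n2 = len(group1), len(group2)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(group1) - mean(group2)) / standard_error
    df = (n1 - 1) + (n2 - 1)
    return t, df

t_stat, dof = students_t([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
print(t_stat, dof)  # t = 4.0 with 8 degrees of freedom
```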

13
Q

What data is the ANOVA test used for?

A

Continuous quantitative data, when there are more than 2 data sets

14
Q

What does ANOVA stand for?

A

Analysis of variance
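
The name reflects how the test works: it compares the variance between the group means with the variance within the groups. A rough sketch of the one-way F statistic, with a hypothetical function name and made-up data, might look like this:

```python
from statistics import mean

def one_way_anova_f(*groups):
    """F = (variance between groups) / (variance within groups)."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    # Sum of squares between groups: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Sum of squares within groups: how far each value is from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

f_stat = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f_stat)  # 13.0 for this data, up to floating-point rounding
```

A large F means the group means differ by more than the spread inside the groups would explain.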

15
Q

What are the assumptions of the ANOVA test?

A

Continuous quantitative data
Must be normally distributed
The variance must be homogeneous (approximately equal level of variability)
Samples must be independent of each other / not paired

16
Q

What data is the Chi-squared test used for?

A

Qualitative data / categorical

17
Q

What is the chi-squared test used for?

A

It is used to compare a set of observed data with an expected set of data
Chi-squared = sum of (observed - expected)^2 / expected
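
A short sketch of the calculation, assuming the standard form of the statistic: the sum over categories of (observed - expected)^2 / expected. The counts are invented for illustration.

```python
def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example: 50 coin tosses, 25 heads and 25 tails expected.
observed_counts = [30, 20]
expected_counts = [25, 25]
statistic = chi_squared(observed_counts, expected_counts)
print(statistic)  # 2.0
```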

18
Q

What are non-parametric tests?

A

These are tests that are used for data sets that don't meet the stringent requirements for parametric tests.
Parametric tests need data measured on an interval scale - the units go up in a linear fashion and each step is of the same value. Non-parametric tests do not need this.

19
Q

What is the Mann-Whitney U test used for?

A

Comparing two independent non-parametric data sets, in a similar manner to the T-test
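
A minimal sketch of the U statistic, assuming two independent groups. It counts, for every pair of values, how often one group beats the other, so only the order of the values matters and not their size. The function name and data are hypothetical, and no P value is calculated.

```python
def mann_whitney_u(group1, group2):
    """Return the smaller of U1 and U2 for two independent groups."""
    u1 = 0.0
    for a in group1:
        for b in group2:
            if a > b:
                u1 += 1      # group1 value beats the group2 value
            elif a == b:
                u1 += 0.5    # ties count as half
    u2 = len(group1) * len(group2) - u1
    return min(u1, u2)

print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # 0.0: the groups do not overlap at all
print(mann_whitney_u([1, 4, 6], [2, 3, 5]))  # 4.0: the groups are well mixed
```

A small U points to well-separated groups; a U near (n1 x n2) / 2 points to groups that overlap heavily.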

20
Q

What is the Spearman correlation test used for?

A

To determine the strength and direction of a correlation between two variables.
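
A sketch of one common way to compute it, assuming the usual definition: rank both variables, then take the Pearson correlation of the ranks. All function names here are illustrative, not from a library.

```python
from math import sqrt

def ranks(values):
    """Rank the values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread = sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))
    return covariance / spread

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# +1 = perfect increasing relationship, -1 = perfect decreasing relationship.
print(spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
print(spearman([1, 2, 3], [3, 2, 1]))
```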

21
Q

What do you need to take into consideration when designing an experiment?

A

Number of samples
Type of data that will be collected
Controls needed when obtaining samples
A power calculation is needed to obtain the N values (sample sizes)

22
Q

What are the methods for data analysis?

A

Averages - median/mean
Statistical tests

23
Q

What are the two different types of experiments?

A

Manipulative & Observational

24
Q

Explain what a manipulative type of experiment entails

A

One or more factors are deliberately altered.
It explores cause and effect.
It can include treatments = sets of conditions
It requires blocks = groups of replicates subjected to the same treatment
Some examples might be: assess DNA recovery from samples stored in wet and dry environments; assess counts of a cancer cell line after incubation with differing concentrations of an inhibitor drug.

25
What are factorial designs and what are they used for?
This is an alternative to changing one factor at a time: several factors are varied together in the same experiment.
This enables the data to be collected in fewer experiments and, as a result, in a shorter period of time.
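
As a sketch, a full factorial design simply tests every combination of the factor levels. The factors and levels below are hypothetical, loosely based on a DNA storage experiment.

```python
from itertools import product

# Hypothetical factors and their levels.
factors = {
    "storage": ["wet", "dry"],
    "temperature_c": [4, 20, 37],
}

# Every combination of levels is one treatment: 2 x 3 = 6 treatments.
treatments = [dict(zip(factors, levels)) for levels in product(*factors.values())]

for treatment in treatments:
    print(treatment)
```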
26
Explain what an observational type of experiment entails
This is the investigation of the links between variables of interest occurring in natural conditions.
It allows comparisons between natural measurements.
It uses sets of natural measurements and groups of replicates.
Some examples are: what are the consequences of smoking on lung function? Can you identify biomarkers that predict survival outcomes in cancer?
There is no manipulation in the experiment; the researchers just observe and take measurements.
27
What is the take-home message on statistical power?
Do not let your study be a waste of time
Use the right number of samples
Do not waste resources
Too many samples are better than too few
Allow for attrition (loss of data)
28
Why is it important that the correct sample size is used?
If the sample is too small, the study may be unable to reject the null hypothesis even when a real difference exists
If the sample is too large, some subjects have been unnecessarily exposed to risk of harm and resources are wasted
We use a sample rather than the whole population, and from this we draw inferences
29
What is the null hypothesis?
This states that there is no difference between two populations.
A properly designed study offers clarity:
If the two populations differ - reject the null hypothesis
If the populations are equal/don't differ - accept the null hypothesis
If the null hypothesis is accepted, we need to know that we looked at enough subjects to have detected a difference had there been one.
30
What is H0?
The null hypothesis
31
What is a type I error?
False positive - the alternate was accepted but the null hypothesis is actually true
32
What is the type I error rate set to? And what is it also known as?
It is commonly set to 0.05. It is also known as alpha. This is why the threshold for significance is p < 0.05.
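
What an alpha of 0.05 means in practice can be sketched with a small simulation: when the null hypothesis is true, roughly 5% of tests still come out "significant". This sketch assumes a two-sided z-test with a known SD of 1 rather than a T-test, and the function name, trial count and seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist

def false_positive_rate(n_trials=2000, n=30, alpha=0.05, seed=1):
    """Fraction of tests that reject H0 when both groups come from the same population."""
    rng = random.Random(seed)
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(n_trials):
        # Both groups are drawn from the same normal population, so H0 is true.
        group_a = [rng.gauss(0, 1) for _ in range(n)]
        group_b = [rng.gauss(0, 1) for _ in range(n)]
        z = (sum(group_a) / n - sum(group_b) / n) / sqrt(2 / n)
        if abs(z) > z_critical:
            rejections += 1  # a type I error (false positive)
    return rejections / n_trials

print(false_positive_rate())  # expected to be close to 0.05
```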
33
What is a type II error?
False negative - the null was accepted but the alternate was actually true
34
What is the type II error rate known as?
It is known as beta. It is the probability that a false H0 is retained.
35
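What are the 4 outcomes when considering a hypothesis?
H0 true and retained - correct (true negative)
H0 true but rejected - type I error (false positive)
H0 false and rejected - correct (true positive)
H0 false but retained - type II error (false negative)
There is always the risk of a false positive or false negative when accepting/rejecting the null hypothesis.
Where the null hypothesis is false, if we reduce alpha (false positives) then we increase beta (false negatives).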
36
How do we remember type I and Type II?
I --> Positive (P has one leg)
II --> Negative (N has two legs)
37
What is the definition of alpha?
This is the probability of a false positive (type I error), commonly set to 0.05.
This is when the null hypothesis was rejected but should have been accepted.
38
What is the definition of beta?
This is the probability of a false negative (type II error).
This is when the null hypothesis was accepted but should not have been.
39
What is the definition of statistical power? What are the commonly accepted probabilities?
Power is the probability of correctly rejecting a false null hypothesis.
Power (0.8) = 1 - beta (0.2)
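
A rough sketch of how power could be estimated for a two-group comparison, assuming a normal (z) approximation with equal group sizes and an effect size expressed in SD units. Dedicated software uses the t distribution and gives slightly different numbers. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison (normal approximation)."""
    z = NormalDist()
    z_critical = z.inv_cdf(1 - alpha / 2)
    # The expected size of the test statistic if the effect is real.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the test statistic lands beyond either critical value.
    return z.cdf(shift - z_critical) + z.cdf(-shift - z_critical)

power = approx_power(effect_size=0.5, n_per_group=64)
beta = 1 - power
print(round(power, 2), round(beta, 2))  # roughly 0.8 and 0.2
```

With no real effect (effect size 0), the same function returns alpha, which ties power back to the type I error rate.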
40
What is the effect size? What does it relate to?
Effect size is the measure of the size of the difference between two populations.
The larger the effect size, the fewer samples are needed.
The smaller the effect size, the more samples are needed.
It measures the strength of the result and is solely magnitude based.
41
What does the value from the effect size mean (an example)?
An effect size of 0.7 means that the score of the average student in the intervention group is 0.7 standard deviations higher than that of the average student in the control group
42
What is the p-value? What is it dependent on?
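The P value is the probability of obtaining a result at least as extreme as the one found if it were due to chance alone (that is, if the null hypothesis were true). A small P value suggests that what you found is unlikely to be due to chance.
P values are very dependent on sample size.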
43
What is the best practice for calculating power?
Report effect sizes and P values, and conduct a power analysis.
Emphasise that the study has sufficient power to find differences based on a P value.
44
How are most power calculations calculated?
Through the relationship between effect size and sample size
Alpha is commonly set to 0.05
Power is usually set to 0.8 or 0.9
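
A sketch of an a priori calculation for two equal groups, assuming the normal-approximation formula n per group = 2 x ((z for alpha + z for power) / effect size)^2. The function name is hypothetical, and t-based software typically returns one or two more samples per group.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate sample size per group for a two-sided, two-group comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = z.inv_cdf(power)          # about 0.84 for power = 0.8
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

print(n_per_group(0.5))  # 63 per group for a medium effect
print(n_per_group(0.8))  # fewer samples for a larger effect
print(n_per_group(0.2))  # many more samples for a small effect
```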
45
How can the effect size be predicted?
Pilot study - a small-scale study
Literature research - see what others have done (the effect size might need to be calculated from the results if it is not reported in a paper)
A common understanding of a meaningful effect in your field of interest
Educated guess/assumption
46
What are the different types of power analysis?
A priori
Post-hoc
Sensitivity
Criterion (rarely used)
47
Why is effect size needed?
It is needed to design the experiment, so that the value of N can be calculated with an a priori power analysis
48
What is the definition of effect size?
Effect size is the measure of the strength of the relationship between two variables
49
How can effect size be calculated other than that mentioned before?
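Cohen's d is a widely used standardised mean effect size, used when comparing two means; it expresses the difference in standard deviation units
Cohen's d = (M2 - M1) / SD pooled
Interpretation of the results is subjective, but Cohen suggested that d = 0.2 is a small effect, d = 0.5 is a medium effect and d = 0.8 is a large effect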
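A minimal sketch of the calculation, assuming the pooled SD is built from the two sample variances weighted by their degrees of freedom. The function name and scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group1, group2):
    """Cohen's d = (M2 - M1) / pooled SD."""
    n1, n2 = len(group1), len(group2)
    sd_pooled = sqrt(
        ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    )
    return (mean(group2) - mean(group1)) / sd_pooled

control = [1, 2, 3, 4, 5]        # hypothetical scores, mean 3
intervention = [2, 3, 4, 5, 6]   # hypothetical scores, mean 4
d = cohens_d(control, intervention)
print(round(d, 2))  # about 0.63: the intervention mean is 0.63 SD higher
```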
50
What does a large effect size mean?
The higher the effect size the lower N has to be
51
What is the influence of the expected effect size (ES) on N?
A large expected effect size means a lower N; a small expected effect size means a higher N
52
What effect does increasing sample size have?
It increases power, but the relationship is not linear
53
Why are power calculations important?
So that time and resources aren't wasted
So that participants are not exposed to unnecessary risk in a clinical trial
54
What are the consequences of a poorly planned study?
Waste of time
Waste of money
55
What are the different types of power analysis?
A priori
Post-hoc
Sensitivity
Criterion
57
What are the variables necessary for a power calc. and how are they defined?
Effect size - measure of the strength of the relationship between two variables
Sample size - number of participants
Statistical power - the probability that the test correctly rejects a false null hypothesis (power = 1 - beta)
Alpha/statistical significance level - the probability of rejecting the null hypothesis when it is true
58
What type of testing is statistical testing used in?
Hypothesis testing