Statistics Flashcards

1
Q

Define variable

A

Aspect that can take different values for different participants

2
Q

What are the two main types of variable?

A

Categorical

Quantitative

3
Q

What are two types of categorical variable?

A

Nominal - unordered labelled characteristics

Ordinal - small set of ordered/ranked categories

4
Q

What descriptive statistics would you do for categorical data?

A

Frequency

Relative Frequency
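
As a rough illustration of the two summaries above, the sketch below counts each category (frequency) and divides by the sample size (relative frequency). The `frequency_table` helper and the blood-group sample are made up for this example.

```python
from collections import Counter


def frequency_table(values):
    """Map each category to (frequency, relative frequency)."""
    counts = Counter(values)
    total = len(values)
    return {category: (n, n / total) for category, n in counts.items()}


# Hypothetical sample of a nominal variable (blood group), for illustration only.
blood_groups = ["A", "O", "O", "B", "A", "O", "AB", "O"]
table = frequency_table(blood_groups)
for category in sorted(table):
    n, relative = table[category]
    print(f"{category}: frequency={n}, relative frequency={relative:.3f}")
```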

5
Q

What type of graph would you create for categorical data?

A

Bar chart

6
Q

What descriptive statistics would you do for quantitative data?

A

Averages
Variation
Symmetry

7
Q

What types of graph would you create for quantitative data?

A

Histogram
Box plot (also called a box-and-whisker plot)

8
Q

What is the normal distribution?

A

Mathematically defined theoretical distribution: symmetric and bell-shaped, fully described by its mean and SD

9
Q

Define descriptive statistics

A

Describe and summarise data in the sample

e.g. how common are certain characteristics, how are different characteristics related to each other

10
Q

Define inferential statistics

A

Using sample data to make inferences about characteristics and relationships in the population
e.g. standard errors, confidence intervals, p-values

11
Q

What is standard error?

A

Indicates how far, on average, the sample estimate is expected to be from the true population parameter value

12
Q

What is the difference between standard error and standard deviation?

A

SE summarises precision of an estimate

SD summarises variability of the individual observations in the sample
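
A minimal sketch of the distinction, assuming the estimate of interest is a sample mean, for which the standard formula is SE = SD / sqrt(n). The helper name and the height values are hypothetical.

```python
import math
import statistics


def sd_and_se(sample):
    """Return (SD of the observations, SE of the sample mean)."""
    sd = statistics.stdev(sample)      # sample SD, n - 1 denominator
    se = sd / math.sqrt(len(sample))   # shrinks as the sample size grows
    return sd, se


# Hypothetical heights in cm, for illustration only.
heights_cm = [160, 165, 170, 175, 180]
sd, se = sd_and_se(heights_cm)
print(f"SD = {sd:.2f} cm, SE of the mean = {se:.2f} cm")
```

The SD stays roughly the same however many people are measured, whereas the SE gets smaller as n increases.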

13
Q

What is a confidence interval?

A

Range of values within which we can be reasonably confident (e.g. 95% confident) that the true population value lies
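
One common way to build such a range for a mean is estimate ± multiplier × SE. The sketch below assumes a normal approximation (multiplier of about 1.96 for 95%); for small samples a t-distribution multiplier would usually be preferred. The function name and data are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(sample, level=0.95):
    """Approximate CI for a mean using a normal (z) multiplier."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return mean - z * se, mean + z * se


# Hypothetical heights in cm, for illustration only.
heights_cm = [160, 165, 170, 175, 180]
low, high = mean_confidence_interval(heights_cm)
print(f"95% CI for the mean: {low:.1f} to {high:.1f} cm")
```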

14
Q

What is a p-value?

A

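Quantifies the extent to which the sample estimate contradicts the null hypothesis: the probability of a result at least as extreme as the one observed if the null hypothesis were true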

15
Q

What is a null hypothesis?

A

The most boring truth imaginable, e.g. no difference between groups or no association between variables in the population

16
Q

What is an alternative hypothesis?

A

Opposite of the null hypothesis

Usually two tailed - contradictions to the null hypothesis in either direction

17
Q

What is a Type I and Type II error in hypothesis testing?

A

Type I error - null rejected when it is true, i.e. a ‘significant’ result due to chance
Type II error - null not rejected when it is false, e.g. because the study is not powerful enough

18
Q

What is a two sample (unpaired) t-test used for? What are the assumptions?

A
Tests for mean difference between two independent groups
Assumptions:
Variable is normally distributed
SD is similar
Observations are not paired
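
As an illustration, the sketch below computes the test statistic for an unpaired t-test using a pooled SD, which is where the "SD is similar" assumption comes in. It returns only t and the degrees of freedom; turning t into a p-value needs the t-distribution, which is left out here. The function name and data are hypothetical.

```python
import math
import statistics


def pooled_t_statistic(group1, group2):
    """Return (t, degrees of freedom) for an unpaired t-test with pooled SD."""
    n1, n2 = len(group1), len(group2)
    var1, var2 = statistics.variance(group1), statistics.variance(group2)
    # Pooling the two variances assumes the groups have a similar SD.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se_diff
    return t, n1 + n2 - 2


# Hypothetical scores in two independent groups, for illustration only.
t_unpaired, df_unpaired = pooled_t_statistic([5, 6, 7], [1, 2, 3])
print(f"t = {t_unpaired:.3f} on {df_unpaired} degrees of freedom")
```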
19
Q

What is a paired t-test used for? What are the assumptions?

A

Used when observations are linked in some way (e.g. before and after)
Analysis based on within-pair differences between groups
Assumption:
Within-pair differences are normally distributed
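
A minimal sketch of the "within-pair differences" idea: the paired t statistic is the mean difference divided by its standard error. Only t and the degrees of freedom are computed; the p-value step is omitted. The function name and the before/after readings are hypothetical.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) based on within-pair differences."""
    differences = [b - a for b, a in zip(before, after)]
    n = len(differences)
    se = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / se, n - 1


# Hypothetical before/after measurements on the same four people.
before = [140, 150, 160, 155]
after = [135, 145, 150, 150]
t_paired, df_paired = paired_t_statistic(before, after)
print(f"t = {t_paired:.2f} on {df_paired} degrees of freedom")
```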

20
Q

What is an ANOVA used for? What are the assumptions?

A
For comparing 3 or more independent groups
Assumptions:
Each group is normally distributed
SD is similar
Observations are independent
21
Q

What is a repeated measures ANOVA used for? What are the assumptions?

A

For comparing 3 or more paired groups
Assumptions:
Difference scores between any two groups are normally distributed
SD of difference scores should be the same for all combinations of groups

22
Q

How do non-parametric tests work?

A

Analyse rank ordering rather than actual scores

Compare distributions rather than means
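
To show what "analyse rank ordering" means in practice, the sketch below replaces scores with their ranks, giving tied scores the average of the ranks they span (a common convention, assumed here). The `average_ranks` helper is hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


# The extreme score 1000 simply becomes the top rank, so it has limited influence.
print(average_ranks([10, 20, 20, 1000]))
```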

23
Q

When do we use non-parametric tests?

A

When assumptions for parametric tests do not hold

e.g. variable is skewed, SD differs markedly, variable is more ordinal than quantitative

24
Q

What is a Wilcoxon (rank sum) test used for?

A

Compares two independent groups

25
What is a Kruskal-Wallis test used for?
Comparing three or more independent groups
26
What is a Wilcoxon signed rank test used for?
Compares two paired groups
27
What is a Friedman test used for?
Compares three or more paired groups
28
What are the advantages and disadvantages of non-parametric tests?
Advantages:
Always valid for quantitative data
Often provide similar p-values to parametric tests
Disadvantages:
Do not make direct inferences
Do not provide CIs
Based on analysis of ranks, not scores
When assumptions hold, not as powerful
29
How do you calculate proportion?
No. in category of interest/total no. of participants
30
What does the term risk mean?
The proportion/percentage of people with a specified disease in a population
31
How do you calculate odds?
No. in category of interest / No. in other category OR % in category of interest / (100 - % in category of interest)
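
The two routes to the odds can be sketched as follows, alongside the proportion from card 29 for comparison. The function names and the counts are hypothetical.

```python
def proportion(n_interest, n_total):
    """No. in category of interest / total no. of participants."""
    return n_interest / n_total


def odds_from_counts(n_interest, n_other):
    """No. in category of interest / no. in the other category."""
    return n_interest / n_other


def odds_from_percent(percent):
    """% in category of interest / (100 - % in category of interest)."""
    return percent / (100 - percent)


# Hypothetical example: 20 of 100 participants have the disease.
print(proportion(20, 100))       # proportion with the disease
print(odds_from_counts(20, 80))  # odds from counts
print(odds_from_percent(20))     # same odds from the percentage
```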
32
How are odds and % related?
The higher the %, the higher the odds
33
In what type of study are odds particularly useful?
Case-control studies
34
How would you summarise binary variables in 2 independent groups?
Cross tabulate - exposure variable and outcome variable
35
When calculating absolute measures, what indicates no difference?
0
36
How do you calculate risk difference?
% of people affected in one group - % in the other
37
When calculating relative measures, what indicates no difference?
1
38
How do you calculate risk ratio/relative risk?
% affected in intervention group / % in the control
39
How do you calculate odds ratio?
Odds in one group / odds in the other
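
A minimal sketch pulling the risk difference, risk ratio and odds ratio out of one cross-tabulation. The function name, argument names and counts are hypothetical; group 1 is treated as the intervention group and group 2 as the control.

```python
def two_by_two_measures(affected_1, total_1, affected_2, total_2):
    """Absolute and relative comparison measures for two independent groups."""
    risk_1 = 100 * affected_1 / total_1           # % affected in group 1
    risk_2 = 100 * affected_2 / total_2           # % affected in group 2
    odds_1 = affected_1 / (total_1 - affected_1)
    odds_2 = affected_2 / (total_2 - affected_2)
    return {
        "risk_difference": risk_1 - risk_2,  # percentage points; 0 = no difference
        "risk_ratio": risk_1 / risk_2,       # 1 = no difference
        "odds_ratio": odds_1 / odds_2,       # 1 = no difference
    }


# Hypothetical trial: 10/100 affected on intervention, 20/100 on control.
measures = two_by_two_measures(10, 100, 20, 100)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```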
40
What do 0, >0 and <0 mean in terms of risk difference?
0 = groups equally likely to have disease
>0 = first group more likely to have disease
<0 = second group more likely to have disease
41
How do you calculate absolute risk reduction?
Difference in % points between groups
42
What is number needed to treat?
Number of people that need to receive intervention before one person benefits from it
43
How do you calculate number needed to treat?
100 / risk difference (with the risk difference in percentage points)
44
What does a risk ratio of <1 mean?
Disease occurrence is lower in the intervention group
45
How do you calculate relative risk reduction?
(1 - risk ratio) x 100
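
The three formulas on cards 41, 43 and 45 can be sketched together. The sketch assumes risks are given as percentages, so the risk difference is in percentage points; the function names and figures are hypothetical.

```python
def absolute_risk_reduction(control_percent, intervention_percent):
    """Difference in percentage points between the groups."""
    return control_percent - intervention_percent


def number_needed_to_treat(risk_difference_points):
    """100 / risk difference, with the difference in percentage points."""
    return 100 / abs(risk_difference_points)


def relative_risk_reduction(risk_ratio):
    """(1 - risk ratio) x 100, as a percentage."""
    return (1 - risk_ratio) * 100


# Hypothetical trial: 20% affected on control, 10% on intervention.
arr = absolute_risk_reduction(20, 10)
nnt = number_needed_to_treat(arr)
rrr = relative_risk_reduction(10 / 20)
print(f"ARR = {arr} points, NNT = {nnt:.0f}, RRR = {rrr:.0f}%")
```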
46
How do you calculate odds ratio?
odds of disease in exposed group / odds of disease in non-exposed group
47
What are risk difference and NNT good at quantifying?
Impact of an intervention
48
What are risk ratio and odds ratio good at quantifying?
Strength of association between intervention and disease status
49
Which two tests give p-values when comparing binary variables between two groups?
Chi-squared test | Fisher's exact test
50
How do you calculate an 'expected value'?
(row total x column total) / total sample size
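
Applied to every cell of a cross-tabulation, the formula above gives the table of expected values. A minimal sketch, with a hypothetical helper name and made-up counts:

```python
def expected_counts(table):
    """Expected value per cell: (row total x column total) / total sample size."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2x2 table: rows = groups, columns = (affected, not affected).
observed = [[10, 90], [20, 80]]
expected = expected_counts(observed)
print(expected)
```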
51
What are the assumptions of the Chi-squared test?
Total sample size is at least 40, OR total sample size is between 20 and 39 and the expected value in each cell is at least 5
52
What are the assumptions of Fisher's exact test?
Fewer than 20 participants | Between 20 and 39 participants and the expected value in at least one cell is less than 5
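
The sample-size rules on cards 51 and 52 amount to a simple decision procedure. The sketch below encodes those rules as stated on the cards; the function names and return labels are hypothetical, and it only chooses a test rather than running one.

```python
def expected_counts(table):
    """Expected value per cell: (row total x column total) / total sample size."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def choose_test(table):
    """Pick chi-squared or Fisher's exact test using the cards' rules."""
    n = sum(sum(row) for row in table)
    smallest_expected = min(min(row) for row in expected_counts(table))
    if n >= 40:
        return "chi-squared"
    if n >= 20 and smallest_expected >= 5:
        return "chi-squared"
    # Fewer than 20 participants, or 20-39 with an expected value below 5.
    return "fisher"


print(choose_test([[10, 90], [20, 80]]))  # large sample
print(choose_test([[3, 7], [2, 8]]))      # 20 participants, small expected values
```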
53
What is the definition of correlation?
The association between two variables
54
How can correlation be summarised graphically and numerically?
Graphically - scatterplots | Numerically - correlation coefficients
55
What does a correlation coefficient do?
Quantifies the strength of association between two variables
56
What is the difference between Pearson and Spearman correlation coefficient?
Pearson - linear relationships | Spearman - monotonic associations (consistently increasing or consistently decreasing), which need not be linear
57
What does R Squared tell us?
The proportion of the variation in one variable that is explained by another variable
58
How do we calculate R squared?
Multiply Pearson's coefficient by itself
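
As a worked illustration, the sketch below computes Pearson's coefficient from its textbook definition and squares it. The function name and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient from sums of squares and products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
r_squared = r * r  # proportion of variation in y explained by x
print(f"r = {r:.3f}, R squared = {r_squared:.2f}")
```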
59
What is linear regression used for?
Estimating the mathematical equation that describes the linear relationship between a quantitative outcome and a quantitative predictor
60
What is the least squares estimation?
Method of estimating regression coefficients | Derives line of best fit/regression line | Estimate of the true regression line in the population
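
For a single predictor, the least squares coefficients have a closed form: slope = Sxy / Sxx and intercept = mean(y) - slope x mean(x). A minimal sketch with a hypothetical function name and made-up data:

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares regression line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical predictor and outcome values, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
intercept, slope = least_squares_line(x, y)
predicted_at_6 = intercept + slope * 6  # using the line to predict an outcome
print(f"y = {intercept:.2f} + {slope:.2f}x, prediction at x=6: {predicted_at_6:.2f}")
```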
61
What are the assumptions of regression?
Outcome is quantitative | Relationship is linear | Residuals are normally distributed | Constant variance
62
What are the similarities and differences between regression and Pearson's correlation coefficient?
Pearson's quantifies strength of association | Regression describes the relationship and can be used to predict outcome variable score
63
How do we calculate sensitivity?
TP / (TP + FN)
64
How do we calculate specificity?
TN / (TN + FP)
65
What is PPV?
Positive predictive value - proportion of those with a +ve result that have the condition
66
What is NPV?
Negative predictive value - proportion of those with a -ve result that do not have the condition
67
How do we calculate PPV?
TP / (TP + FP)
68
How do we calculate NPV?
TN / (TN + FN)
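
The four diagnostic accuracy measures from cards 63 to 68 can be computed from the same four counts. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 test-vs-condition table."""
    return {
        "sensitivity": tp / (tp + fn),  # of those with the condition, share testing +ve
        "specificity": tn / (tn + fp),  # of those without it, share testing -ve
        "ppv": tp / (tp + fp),          # of +ve results, share with the condition
        "npv": tn / (tn + fn),          # of -ve results, share without the condition
    }


# Hypothetical screening results for 1000 people, for illustration only.
result = diagnostic_accuracy(tp=90, fp=30, fn=10, tn=870)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```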