Statistics Flashcards

1
Q

Define variable

A

Aspect that can take different values for different participants

2
Q

What are the two main types of variable?

A

Categorical

Quantitative

3
Q

What are two types of categorical variable?

A

Nominal - unordered labelled characteristics

Ordinal - small set of ordered/ranked categories

4
Q

What descriptive statistics would you do for categorical data?

A

Frequency

Relative Frequency
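
As a minimal sketch of these two summaries, frequencies and relative frequencies could be tabulated as below. The blood-group data are invented purely for illustration.

```python
from collections import Counter

# Hypothetical sample: blood group recorded for 10 participants.
blood_groups = ["O", "A", "O", "B", "A", "O", "AB", "O", "A", "O"]

# Frequency: how many participants fall in each category.
frequency = Counter(blood_groups)

# Relative frequency: each count as a proportion of the sample size.
n = len(blood_groups)
relative_frequency = {group: count / n for group, count in frequency.items()}

for group, count in frequency.most_common():
    print(f"{group}: n={count}, {relative_frequency[group]:.0%}")
```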

5
Q

What type of graph would you create for categorical data?

A

Bar chart

6
Q

What descriptive statistics would you do for quantitative data?

A

Averages
Variation
Symmetry
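
A rough sketch of how these three kinds of summary might be computed with Python's standard library. The blood pressure values are invented, and comparing the mean with the median is only one informal way of judging symmetry.

```python
import statistics

# Hypothetical sample: systolic blood pressure (mmHg) for 8 participants.
values = [112, 118, 121, 124, 126, 130, 135, 162]

# Averages
mean = statistics.mean(values)
median = statistics.median(values)

# Variation
sd = statistics.stdev(values)                    # sample SD (n - 1 denominator)
q1, q2, q3 = statistics.quantiles(values, n=4)   # quartiles
iqr = q3 - q1

# Symmetry: a mean noticeably above the median suggests right (positive) skew.
print(f"mean={mean:.1f}, median={median}, SD={sd:.1f}, IQR={iqr:.1f}")
```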

7
Q

What types of graph would you create for quantitative data?

A

Histogram
Box plot (box and whisker plot)

8
Q

What is the normal distribution?

A

Mathematically defined theoretical distribution: symmetric and bell-shaped, fully described by its mean and SD

9
Q

Define descriptive statistics

A

Describe and summarise data in the sample

e.g. how common certain characteristics are, and how different characteristics are related to each other

10
Q

Define inferential statistics

A

Using sample data to make inferences about characteristics and relationships in the population
e.g. standard errors, confidence intervals, p-values

11
Q

What is standard error?

A

Indicates how far, on average, the sample estimate is expected to be from the true population parameter value

12
Q

What is the difference between standard error and standard deviation?

A

SE summarises the precision of a sample estimate (e.g. the sample mean)

SD summarises the variability of individual observations in the sample
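
To make the distinction concrete, here is a small sketch using the standard formula for the standard error of a mean, SE = SD / sqrt(n). The heights are made-up values.

```python
import math
import statistics

# Hypothetical sample: heights (cm) of 9 participants.
heights = [160, 165, 168, 170, 171, 173, 175, 178, 182]

n = len(heights)
sd = statistics.stdev(heights)   # spread of the individual observations
se = sd / math.sqrt(n)           # precision of the sample mean as an estimate

print(f"SD = {sd:.2f} cm, SE of the mean = {se:.2f} cm")
```

With a larger sample the SD would stay roughly the same, while the SE would shrink.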

13
Q

What is a confidence interval?

A

Range of values within which we can be reasonably confident (usually 95%) that the true population value lies
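
One common way to build such an interval for a mean is estimate +/- 1.96 x SE, sketched below. This assumes a large-sample normal approximation; for small samples a multiplier from the t-distribution is more usual. The function name `mean_confidence_interval` and the data are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(sample, level=0.95):
    """Approximate CI for a mean: estimate +/- z * standard error."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    return mean - z * se, mean + z * se

# Hypothetical sample of 40 measurements.
sample = [50 + (i % 8) for i in range(40)]
low, high = mean_confidence_interval(sample)
print(f"95% CI: {low:.2f} to {high:.2f}")
```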

14
Q

What is a p-value?

A

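Quantifies the extent to which the sample estimate contradicts the null hypothesis: the probability of a result at least as extreme as the one observed if the null hypothesis were true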
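
As an illustration only, a two-sided p-value can be approximated from an estimate and its standard error via a z statistic. This sketch assumes a large sample (normal approximation); the function name and the numbers are invented.

```python
from statistics import NormalDist

def two_sided_p_value(estimate, null_value, standard_error):
    """Two-sided p-value from a z statistic (large-sample approximation)."""
    z = (estimate - null_value) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical result: mean difference of 4.0 with a standard error of 2.0,
# tested against a null hypothesis of no difference (0).
p = two_sided_p_value(estimate=4.0, null_value=0.0, standard_error=2.0)
print(f"p = {p:.3f}")
```

The further the estimate lies from the null value, measured in standard errors, the smaller the p-value.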

15
Q

What is a null hypothesis?

A

The most boring truth imaginable, e.g. no difference between groups or no association between variables

16
Q

What is an alternative hypothesis?

A

Opposite of the null hypothesis

Usually two-tailed - allows contradictions of the null hypothesis in either direction

17
Q

What are Type I and Type II errors in hypothesis testing?

A

Type I error - null rejected when it is true, ‘significant’ result due to chance
Type II error - null not rejected when it is false, e.g. study not powerful enough

18
Q

What is a two sample (unpaired) t-test used for? What are the assumptions?

A
Tests for mean difference between two independent groups
Assumptions:
Variable is normally distributed in each group
SD is similar in the two groups
Observations are not paired
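
A minimal sketch of the test statistic, assuming the equal-SD (pooled variance) form of the test that matches the assumptions above. The function name and scores are hypothetical; the p-value would then come from a t-distribution with the returned degrees of freedom, which is not computed here.

```python
import math
import statistics

def pooled_t_statistic(group_a, group_b):
    """t statistic and degrees of freedom for an unpaired, equal-variance t-test."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se_diff
    return t, n_a + n_b - 2

# Hypothetical scores for two independent groups.
treatment = [23, 25, 28, 30, 32]
control = [18, 20, 22, 25, 27]
t, df = pooled_t_statistic(treatment, control)
print(f"t = {t:.2f} on {df} degrees of freedom")
```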
19
Q

What is a paired t-test used for? What are the assumptions?

A

Used when observations are linked in some way (e.g. before and after)
Analysis based on within-pair differences between groups
Assumption:
Within-pair differences are normally distributed
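
The idea of analysing within-pair differences can be sketched as follows: reduce each pair to a single difference, then test whether the mean difference is zero. The function name and measurements are invented for illustration.

```python
import math
import statistics

def paired_t_statistic(before, after):
    """t statistic and degrees of freedom from within-pair differences."""
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    # Standard error of the mean difference.
    se = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / se, n - 1

# Hypothetical before/after measurements on the same 6 participants.
before = [140, 152, 138, 147, 160, 155]
after = [135, 150, 136, 140, 151, 149]
t, df = paired_t_statistic(before, after)
print(f"t = {t:.2f} on {df} degrees of freedom")
```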

20
Q

What is an ANOVA used for? What are the assumptions?

A
For comparing means across 3 or more independent groups
Assumptions:
Variable is normally distributed within each group
SD is similar across groups
Observations are independent
21
Q

What is a repeated measures ANOVA used for? What are the assumptions?

A

For comparing 3 or more paired groups
Assumptions:
Difference scores between any two groups are normally distributed
SD of the difference scores should be the same for every pair of groups

22
Q

How do non-parametric tests work?

A

Analyse rank ordering rather than actual scores

Compare distributions rather than means
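
To show what "analysing ranks" means in practice, the sketch below pools two groups, ranks the values (ties share an average rank), and sums the ranks in one group. The helper `average_ranks` and the scores are hypothetical; this is only the ranking step, not a full test.

```python
def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks

# Hypothetical scores; the extreme value 250 only counts as "the largest".
group_a = [12, 15, 15, 250]
group_b = [9, 11, 14]
ranks = average_ranks(group_a + group_b)
rank_sum_a = sum(ranks[: len(group_a)])
print(ranks, rank_sum_a)
```

Because only the ordering is used, replacing 250 with 16 would leave every rank unchanged.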

23
Q

When do we use non-parametric tests?

A

When assumptions for parametric tests do not hold

e.g. variable is skewed, SD differs markedly, variable is more ordinal than quantitative

24
Q

What is a Wilcoxon (rank sum) test used for?

A

Compares two independent groups

25
Q

What is a Kruskal-Wallis test used for?

A

Comparing three or more independent groups

26
Q

What is a Wilcoxon signed rank test used for?

A

Compares two paired groups

27
Q

What is a Friedman test used for?

A

Compares three or more paired groups

28
Q

What are the advantages and disadvantages of non-parametric tests?

A
Advantages:
Always valid for quantitative data
Often provide similar p-values to parametric tests
Disadvantages:
Do not make direct inferences about means
Do not provide CIs
Based on analysis of ranks, not scores
Less powerful than parametric tests when the parametric assumptions hold
29
Q

How do you calculate proportion?

A

No. in category of interest / total no. of participants

30
Q

What does the term risk mean?

A

The proportion/percentage of people with a specified disease in a population

31
Q

How do you calculate odds?

A

No. in category of interest / No. in other category
OR
% in category of interest / (100 - % in category of interest)
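
Both routes give the same answer, as this short sketch suggests. The function names and the smoking example are made up for illustration.

```python
def odds_from_counts(in_category, in_other_category):
    """Odds = number in the category of interest / number in the other category."""
    return in_category / in_other_category

def odds_from_percentage(percentage):
    """Odds = % in the category of interest / (100 - that %)."""
    return percentage / (100 - percentage)

# Hypothetical sample: 20 of 100 participants are smokers, 80 are not.
print(odds_from_counts(20, 80))    # 0.25
print(odds_from_percentage(20))    # 0.25
print(odds_from_percentage(50))    # 1.0, i.e. "evens"
```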

32
Q

How are odds and % related?

A

The higher the %, the higher the odds

33
Q

In what type of study are odds particularly useful?

A

Case-control studies

34
Q

How would you summarise binary variables in 2 independent groups?

A

Cross tabulate - exposure variable and outcome variable

35
Q

When calculating absolute measures, what indicates no difference?

A

0

36
Q

How do you calculate risk difference?

A

% of people affected in one group - % in the other

37
Q

When calculating relative measures, what indicates no difference?

A

1

38
Q

How do you calculate risk ratio/relative risk?

A

% affected in intervention group / % affected in control group

39
Q

How do you calculate odds ratio?

A

Odds in one group / odds in the other
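
The absolute and relative measures from the last few cards can be computed side by side from the same 2x2 counts. This is a sketch with a hypothetical function name and invented trial numbers.

```python
def compare_two_groups(events_1, total_1, events_2, total_2):
    """Absolute and relative measures for a binary outcome in two groups."""
    risk_1 = events_1 / total_1
    risk_2 = events_2 / total_2
    odds_1 = events_1 / (total_1 - events_1)
    odds_2 = events_2 / (total_2 - events_2)
    return {
        # Absolute measure, in percentage points: 0 means no difference.
        "risk_difference_pct": 100 * (risk_1 - risk_2),
        # Relative measures: 1 means no difference.
        "risk_ratio": risk_1 / risk_2,
        "odds_ratio": odds_1 / odds_2,
    }

# Hypothetical trial: 10/100 affected on intervention, 20/100 on control.
result = compare_two_groups(10, 100, 20, 100)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```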

40
Q

What do 0, >0 and <0 mean in terms of risk difference?

A
0 = groups equally likely to have disease
>0 = first group more likely to have disease
<0 = second group more likely to have disease
41
Q

How do you calculate absolute risk reduction?

A

Difference in % points between groups

42
Q

What is number needed to treat?

A

Number of people that need to receive intervention before one person benefits from it

43
Q

How do you calculate number needed to treat?

A

100 / risk difference (in percentage points), usually rounded up to a whole number
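
A small worked sketch of this calculation, taking the risk difference as control % minus intervention % and rounding up (a common convention). The function name and trial figures are hypothetical.

```python
import math

def number_needed_to_treat(control_pct, intervention_pct):
    """NNT = 100 / absolute risk reduction (in percentage points), rounded up."""
    risk_difference = control_pct - intervention_pct
    if risk_difference <= 0:
        raise ValueError("intervention shows no benefit; NNT is not defined")
    return math.ceil(100 / risk_difference)

# Hypothetical trial: 20% affected on control, 12% on intervention.
print(number_needed_to_treat(20, 12))   # 100 / 8 = 12.5, rounded up to 13
```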

44
Q

What does a risk ratio of <1 mean?

A

Disease occurrence is lower in the intervention group

45
Q

How do you calculate relative risk reduction?

A

(1 - risk ratio) x 100
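
As a quick sketch of the formula, with a hypothetical function name and invented percentages:

```python
def relative_risk_reduction(intervention_pct, control_pct):
    """RRR as a percentage: (1 - risk ratio) x 100."""
    risk_ratio = intervention_pct / control_pct
    return (1 - risk_ratio) * 100

# Hypothetical trial: 12% affected on intervention, 20% on control.
rrr = relative_risk_reduction(12, 20)
print(f"Relative risk reduction: {rrr:.0f}%")
```

In this made-up trial the absolute reduction is only 8 percentage points, while the relative reduction is 40%, which is why the two are reported separately.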

46
Q

How do you calculate odds ratio?

A

odds of disease in exposed group / odds of disease in non-exposed group

47
Q

What are risk difference and NNT good at quantifying?

A

Impact of an intervention

48
Q

What are risk ratio and odds ratio good at quantifying?

A

Strength of association between intervention and disease status

49
Q

Which two tests give p-values when comparing binary variables between two groups?

A

Chi-squared test

Fisher's exact test

50
Q

How do you calculate an ‘expected value’?

A

(row total x column total) / total sample size
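
The formula can be applied to every cell of a table at once, as sketched below. The second function shows how expected values feed into the usual Pearson chi-squared statistic (without continuity correction, which is an assumption here). Function names and counts are hypothetical.

```python
def expected_counts(table):
    """Expected cell counts: (row total x column total) / total sample size."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in column_totals] for r in row_totals]

def chi_squared_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )

# Hypothetical 2x2 table: rows = exposed / not exposed,
# columns = disease / no disease.
observed = [[30, 20], [10, 40]]
print(expected_counts(observed))
print(round(chi_squared_statistic(observed), 2))
```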

51
Q

What are the assumptions of the Chi-squared test?

A

Total sample size is at least 40
OR
If sample is between 20 and 39, the expected value in each cell is at least 5

52
Q

What are the assumptions of Fisher’s exact test?

A

Fewer than 20 participants
OR
Between 20 and 39 participants and the expected value in at least one cell is less than 5
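
Taken together, the sample-size rules on this card and the previous one amount to a simple decision rule, sketched here. `choose_test` is a hypothetical helper that encodes only the rule of thumb stated on these cards.

```python
def choose_test(total_sample_size, smallest_expected_count):
    """Pick a test for a 2x2 table using the sample-size rule from the cards."""
    if total_sample_size >= 40:
        return "chi-squared"
    if total_sample_size >= 20 and smallest_expected_count >= 5:
        return "chi-squared"
    # Fewer than 20 participants, or 20-39 with an expected value below 5.
    return "Fisher's exact"

print(choose_test(100, 20))   # chi-squared
print(choose_test(30, 6))     # chi-squared
print(choose_test(30, 3))     # Fisher's exact
print(choose_test(15, 4))     # Fisher's exact
```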

53
Q

What is the definition of correlation?

A

The association between two variables

54
Q

How can correlation be summarised graphically and numerically?

A

Graphically - scatterplots

Numerically - correlation coefficients

55
Q

What does a correlation coefficient do?

A

Quantifies the strength of association between two variables

56
Q

What is the difference between Pearson and Spearman correlation coefficient?

A

Pearson - linear relationship

Spearman - monotonic associations, linear or not (consistently increasing or consistently decreasing)
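
The difference shows up on data that rise steadily but along a curve. In this sketch Spearman's coefficient is computed as Pearson's coefficient applied to ranks, with a simple ranking helper that assumes no tied values. The data are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient: strength of linear association."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman coefficient: Pearson applied to ranks (assumes no ties)."""
    def ranks(values):
        order = sorted(values)
        return [order.index(v) + 1 for v in values]
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y always increases with x, but not in a straight line.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]
print(f"Pearson  = {pearson(x, y):.3f}")
print(f"Spearman = {spearman(x, y):.3f}")
```

Spearman reaches 1 because the ordering is perfectly preserved, while Pearson falls short of 1 because the points do not lie on a straight line.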

57
Q

What does R Squared tell us?

A

The proportion of the variation in one variable that is explained by another variable

58
Q

How do we calculate R squared?

A

Multiply Pearson’s coefficient by itself
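
A tiny sketch of the arithmetic, using made-up coefficients:

```python
def r_squared(pearson_r):
    """Proportion of variation explained: Pearson's r multiplied by itself."""
    return pearson_r * pearson_r

# Hypothetical coefficients; the sign of r is lost when squaring.
for r in (0.9, 0.5, -0.5):
    print(f"r = {r:+.1f} -> R squared = {r_squared(r):.2f}")
```

A coefficient of 0.5 corresponds to only 25% of the variation being explained.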

59
Q

What is linear regression used for?

A

Estimating the mathematical equation that describes the linear relationship between a quantitative outcome and a quantitative predictor

60
Q

What is the least squares estimation?

A

Method of estimating regression coefficients
Derives line of best fit/regression line
Estimate of the true regression line in the population
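
For a single predictor, the least squares estimates have a closed form: slope = Sxy / Sxx and intercept = mean(y) - slope x mean(x). The sketch below applies these standard formulas; the function name and the age/blood pressure data are hypothetical.

```python
def least_squares_line(x, y):
    """Slope and intercept that minimise the sum of squared residuals."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: predictor x (e.g. age) and outcome y (e.g. blood pressure).
x = [30, 40, 50, 60, 70]
y = [118, 124, 131, 135, 142]
slope, intercept = least_squares_line(x, y)

# The fitted line can then be used to predict the outcome for a new x.
predicted = intercept + slope * 55
print(f"y = {intercept:.1f} + {slope:.2f} * x; predicted y at x = 55: {predicted:.2f}")
```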

61
Q

What are the assumptions of regression?

A

Outcome is quantitative
Relationship is linear
Residuals are normally distributed
Constant variance

62
Q

What are the similarities and differences between regression and Pearson's correlation coefficient?

A

Both concern the linear relationship between two quantitative variables

Pearson's quantifies strength of association

Regression describes the relationship and can be used to predict outcome variable score

63
Q

How do we calculate sensitivity?

A

TP / (TP + FN)

64
Q

How do we calculate specificity?

A

TN / (TN + FP)
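
Sensitivity and specificity can be sketched as two small functions over the counts of true/false positives and negatives. The screening numbers are invented.

```python
def sensitivity(tp, fn):
    """Proportion of people with the condition who test positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of people without the condition who test negative."""
    return tn / (tn + fp)

# Hypothetical screening study: 100 people with the condition, 900 without.
tp, fn, tn, fp = 90, 10, 810, 90
print(sensitivity(tp, fn))   # 0.9
print(specificity(tn, fp))   # 0.9
```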

65
Q

What is PPV?

A

Positive predictive value - proportion of those with a +ve result that have the condition

66
Q

What is NPV?

A

Negative predictive value - proportion of those with a -ve result that do not have the condition

67
Q

How do we calculate PPV?

A

TP / (TP + FP)

68
Q

How do we calculate NPV?

A

TN / (TN + FN)
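
The two predictive values can be sketched in the same way. The counts below are hypothetical (100 people with the condition, 900 without) and the function names are illustrative.

```python
def positive_predictive_value(tp, fp):
    """Proportion of positive results that are true positives."""
    return tp / (tp + fp)

def negative_predictive_value(tn, fn):
    """Proportion of negative results that are true negatives."""
    return tn / (tn + fn)

# Hypothetical screening study with a test that is right 90% of the time
# in both people with and people without the condition.
tp, fn, tn, fp = 90, 10, 810, 90
ppv = positive_predictive_value(tp, fp)
npv = negative_predictive_value(tn, fn)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

In this made-up example only half of the positive results are true positives, because the condition is uncommon in the sample; predictive values depend on prevalence.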