Choosing statistics Flashcards

E-modules 2018/19

1
Q

What is needed to test the hypothesis?

A

Choice of statistical test

Patient population/study sample selected allows for comparison (i.e. inclusion/exclusion criteria)

Patient outcome measures (i.e. variables)

2
Q

When the hypothesis proposes a correlation, what are the possible stats tests based on the variables?

A

Discrete
- Chi-Square

Continuous

  • Pearson (normally distributed)
  • Spearman rank (not normally distributed)
3
Q

When the hypothesis proposes a comparison between groups, what stats test do you use for discrete data?

A

Chi-Square

4
Q

When the hypothesis proposes a comparison between groups, what stats test do you use for continuous, normally distributed data based on number of groups?

A

> 2 groups
- ANOVA (one variable)

2 groups

  • paired t-test
  • independent t-test
5
Q

When the hypothesis proposes a comparison between groups, what stats test do you use for continuous, NOT normally distributed data based on number of groups?

A

> 2 groups
- Kruskal Wallis

2 groups

  • Wilcoxon (paired)
  • Mann Whitney (independent)
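
The choices on cards 2 to 5 can be read as one decision tree. The following is a minimal sketch of that tree; the function name `choose_test` and its parameters are hypothetical and only mirror the wording of the cards:

```python
def choose_test(hypothesis, data_type, normal=True, groups=2, paired=False):
    """Return the test named on the cards for a given study design.

    hypothesis: "correlation" or "comparison"
    data_type:  "discrete" or "continuous"
    normal:     True if continuous data are normally distributed
    groups:     number of groups being compared
    paired:     True if both data-sets come from the same subjects
    """
    if hypothesis not in ("correlation", "comparison"):
        raise ValueError("hypothesis must be 'correlation' or 'comparison'")
    if data_type == "discrete":
        # Discrete data use chi-square for both kinds of hypothesis
        return "Chi-square"
    if hypothesis == "correlation":
        return "Pearson" if normal else "Spearman rank"
    if groups > 2:
        return "ANOVA" if normal else "Kruskal-Wallis"
    if normal:
        return "paired t-test" if paired else "independent t-test"
    return "Wilcoxon" if paired else "Mann-Whitney"


print(choose_test("comparison", "continuous", normal=False, groups=2, paired=True))
```

This covers only the tests listed on these cards; real study designs may call for others.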
6
Q

Which statistical analysis tests for differences?

A
Chi-square
ANOVA
T-tests
Kruskal-Wallis
Wilcoxon
Mann-Whitney U-Test

*hypothesis proposes a comparison between groups

7
Q

Which statistical analysis tests for similarities?

A

Chi-Square
Pearson
Spearman rank

*hypothesis proposes a correlation

8
Q

What is quantitative data?

A

Numerical information about quantities

  • MEASURED: information can be measured and have continuous dimensions (height, temperature, BP)
  • COUNTED: information can be counted but is not continuous (no. of children in family, no. of patients in clinic)
9
Q

What is qualitative data?

A

Information about qualities; it cannot actually be measured

Deals with descriptive information such as free-text comments to open-ended question/response to interview

10
Q

What is categorical data?

A

In-between quantitative and qualitative

  • ORDINAL aspects can be easily converted into numerical data (e.g. a happiness scale can be given in numbers instead of words)
  • NOMINAL aspects consist of individual terms rather than sentences as in qualitative data
11
Q

Broadly compare quantitative, qualitative, and categorical data

A

Quantitative = when you measure something and give it a number value

Categorical = when you classify something

Qualitative = when you judge something

12
Q

Compare discrete and continuous data

A

Discrete data; counted

  • cannot be made more precise
  • e.g. number of children

Continuous data; measured

  • can be divided and reduced to finer and finer levels
  • e.g. height of a person
13
Q

Compare nominal and ordinal data

A

Nominal = items that are assigned individual named categories that do not have an implicit or natural value or rank
e.g. gender, fracture incidence

Ordinal = items which are assigned to categories that do have some kind of implicit or natural order
e.g. patient characteristics such as stage of hypertension, pain level, and satisfaction

14
Q

Broadly describe the mean and standard deviation

A

Mean is an average of the data

Standard deviation describes the width of the distribution, i.e. how spread out the data are around the mean

15
Q

What is normality?

A

It describes whether data follow a normal (symmetrical, bell-shaped) distribution, and is used to decide how to describe the central tendency and dispersion of large data-sets

16
Q

Describe the relative mean, median, and mode for the following skews:

a) positive skew
b) symmetrical distribution
c) negative skew

A

a) mean > median > mode
b) mean = median = mode
c) mean < median < mode

  • positive: >
  • negative: <
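
A quick numerical check of the positive-skew case, as a minimal sketch using Python's standard `statistics` module. The data values are invented for illustration:

```python
import statistics

# Invented data with a long right tail (positive skew)
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

mean = statistics.mean(right_skewed)
median = statistics.median(right_skewed)
mode = statistics.mode(right_skewed)

print("mean:", mean, "median:", median, "mode:", mode)
```

Here the long right tail pulls the mean (4.5) above the median (3) and the mode (2).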
17
Q

What is kurtosis?

A

Describes data that are heavy-tailed or light-tailed relative to a normal distribution

18
Q

Compare high and low kurtosis

A

High kurtosis
- tends to have heavy tails, or outliers, that create a very wide distribution

Low kurtosis
- tends to have light tails, or a lack of outliers, that create a very narrow distribution

19
Q

What statistical tests are used to test for normality?

A

Shapiro-Wilk test
- small sample size (n<50)

Kolmogorov-Smirnov
- large sample size (n>50)

20
Q

What are descriptive statistics?

A

Mean, mode or median

- used to summarise large data-sets in a tangible format

21
Q

Compare range, variance and standard deviation

A

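Range
- difference between the largest and smallest values in a set of numbers
Variance
- measure of how far the numbers are spread out from the mean value (the average squared deviation from the mean)
Standard deviation
- square root of the variance; measures the spread of a set of data in the same units as the data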
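
The three measures of spread can be compared on one small data-set. This is a minimal sketch with invented values; it uses the population formulas (`pvariance`, `pstdev`) from Python's standard `statistics` module, whereas a sample would normally use `variance` and `stdev`:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # invented values, mean = 5

data_range = max(data) - min(data)     # largest minus smallest value
variance = statistics.pvariance(data)  # mean squared deviation from the mean
std_dev = statistics.pstdev(data)      # square root of the variance

print("range:", data_range)
print("variance:", variance)
print("standard deviation:", std_dev)
```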

22
Q

Compare IQR, standard error of mean, and confidence intervals

A

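Interquartile range
- upper quartile (UQ) minus lower quartile (LQ), i.e. the spread of the middle 50% of the data
Standard error of mean
- measures how well the sample mean approximates to the population mean
Confidence intervals
- range of values in which the true mean value is likely to be found (e.g. a 95% confidence interval)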
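
As a rough illustration, the three quantities can be computed with Python's standard library. This is a sketch under stated assumptions: the helper name `summarise` is hypothetical, the 95% interval uses the normal approximation (mean ± 1.96 × SEM) rather than a t-distribution, and quartile values depend on the interpolation method that `statistics.quantiles` uses by default:

```python
import math
import statistics


def summarise(sample):
    """Return the IQR, standard error of the mean, and an approximate 95% CI."""
    n = len(sample)
    mean = statistics.mean(sample)
    q1, _, q3 = statistics.quantiles(sample, n=4)  # lower quartile, median, upper quartile
    sem = statistics.stdev(sample) / math.sqrt(n)  # sample SD divided by sqrt(n)
    ci95 = (mean - 1.96 * sem, mean + 1.96 * sem)  # normal approximation (assumption)
    return {"mean": mean, "iqr": q3 - q1, "sem": sem, "ci95": ci95}


result = summarise([2, 4, 4, 4, 5, 5, 7, 9])  # invented values
print(result)
```

For small samples, a t-distribution critical value would give a wider and more appropriate interval than 1.96.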

23
Q

How do you determine whether groups are paired or independent?

A

Look at whether each group is composed of the same subjects of interest or if they are different

Paired = two data-sets come from the same individual
- measure same variable in same subject at different time points (longitudinal study)

Independent = two data-sets from different individuals
- comparing two groups with no common factors (cross-sectional study)

24
Q

Compare when to use parametric and non-parametric statistics

A

Parametric = used when the data are normally distributed

Non-parametric = used when the data are not normally distributed

25
Q

Name parametric tests

A

Paired/independent t-tests

ANOVA

26
Q

Name non-parametric tests

A

Wilcoxon Signed Rank
Mann-Whitney U
Friedman (non-parametric equivalent of repeated measures one-way ANOVA)
Kruskal-Wallis

27
Q

When would you use the different t-tests?

A

Paired: the same variable is measured twice in the same sample (e.g. before and after treatment) and the two sets of measurements are compared

Independent: the same variable is compared between two different samples
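
To show why the distinction matters, here is a minimal sketch of both t statistics written out by hand with Python's standard library. The function names and the blood-pressure-style values are invented, and the independent version assumes equal variances (pooled variance). In practice a statistics package would also report the p-value:

```python
import math
import statistics


def paired_t(first, second):
    """t statistic for paired data: works on the per-subject differences."""
    diffs = [a - b for a, b in zip(first, second)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


def independent_t(group1, group2):
    """t statistic for two independent groups, assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    pooled_var = (
        (n1 - 1) * statistics.variance(group1)
        + (n2 - 1) * statistics.variance(group2)
    ) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(group1) - statistics.mean(group2)) / se


before = [120, 130, 125, 140]  # invented measurements
after = [115, 128, 120, 135]

print("paired t:", paired_t(before, after))
print("independent t:", independent_t(before, after))
```

With these numbers the paired statistic is much larger than the independent one, because pairing removes the variation between subjects.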

28
Q

What does a one-way ANOVA tell you?

A

Used to compare the means from more than two samples with a normal distribution and will only tell you if a difference exists between your samples

Further stat tests (i.e. post hoc test) are needed to calculate exactly where the difference is

29
Q

What can the Pearson correlation coefficient tell you about correlation?

A

How strong the relationship is, and its direction (positive or negative)

Varies from -1 to +1 (from perfect negative to perfect positive correlation)

30
Q

Approximately what are the r-values for the following correlations:

a) very low,
b) low,
c) reasonable,
d) high,
e) very high?

A

a) 0.0-0.2
b) 0.2-0.4
c) 0.4-0.6
d) 0.6-0.8
e) 0.8-1.0

*can be +/-
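
The coefficient and the bands above can be sketched in a few lines of Python. The function names are hypothetical, and because the bands on the card share their end points, this sketch assumes a boundary value such as 0.2 belongs to the higher band:

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def describe_strength(r):
    """Map r to the approximate bands on the card, ignoring the sign."""
    size = abs(r)
    if size < 0.2:
        return "very low"
    if size < 0.4:
        return "low"
    if size < 0.6:
        return "reasonable"
    if size < 0.8:
        return "high"
    return "very high"


r = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])  # invented, perfectly linear data
print(r, describe_strength(r))
```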

31
Q

What is the r^2-value from a Pearson correlation?

A

Represents how closely your data fit the correlation line, i.e. the proportion of the variation in one variable that is explained by the other

The higher the value (closer to 1), the more reliable your conclusion can be

32
Q

Compare correlation and regression

A

Correlation = indicates the strength of the relationship between two variables

Regression = quantifies the association between the two variables, i.e. tells us the impact that changing one variable will have on the other variable

33
Q

How is regression defined?

A

Equation of a straight line (as in GCSE maths)

y = a + bx

a = the y-axis intercept value
b = the gradient of the line, i.e. the regression coefficient
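
A minimal sketch of fitting y = a + bx by ordinary least squares, assuming simple linear regression with one predictor. The function name `fit_line` and the data are invented for illustration:

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Gradient: co-variation of x and y divided by the variation of x
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    # Intercept: the line passes through the point (mean_x, mean_y)
    a = mean_y - b * mean_x
    return a, b


a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print("intercept a:", a, "gradient b:", b)
```

For these points the fitted line is y = 1 + 2x, so each unit increase in x is associated with an increase of 2 in y.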
34
Q

What does the chi square test measure?

A

It is a measure of the differences between observed and expected frequencies

Represented as χ² (often typed as X^2)

35
Q

What does it mean when X^2 = 0?

A

The observed and expected frequencies are the same

36
Q

What does a higher X^2 value mean?

A

The bigger the difference between the observed and expected frequencies
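
The statistic is the sum of (observed - expected)^2 / expected over all categories. A minimal sketch with invented frequencies (the function name `chi_square` is hypothetical, and no p-value is computed here):

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E across categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


expected = [20, 20, 20]

print(chi_square([20, 20, 20], expected))  # observed matches expected
print(chi_square([10, 20, 30], expected))  # observed differs from expected
```

The first call gives 0 because observed and expected frequencies are identical; the second gives 10, reflecting the larger differences.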

37
Q

How can the size of a study affect the p-value?

A

Very small studies with few samples might be underpowered and not return a reliable p-value

Very large studies with many samples might be overpowered and find a statistically significant difference that is too small to be meaningful

38
Q

What is a type I error?

A

Incorrectly reject the null hypothesis when it is true (the probability of this is the significance level, α)

False positive

39
Q

What is a type II error?

A

Incorrectly fail to reject the null hypothesis when it is false

False negative

*the greater the power of the test, the lower the type II error rate (power = 1 - beta; the closer the power is to 1, the better the test is at detecting a false null hypothesis)