E-module 2 - Choosing statistics Flashcards

1
Q

Which statistical tests are used when the hypothesis proposes a correlation between continuous variables:

  • with a normal distribution?
  • without a normal distribution?
A

Hypothesis proposes correlation between continuous variables:

  • normal distribution: Pearson
  • not normal: Spearman rank
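
As a rough illustration of the two coefficients, the sketch below computes Pearson's r and Spearman's rho in plain Python. The function names (`pearson_r`, `ranks`, `spearman_rho`) and the sample values are made up for this example; in practice a statistics package would normally be used instead.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson's r: covariance scaled by the spread of each variable."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks of the data."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical data: y rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(pearson_r(x, y), spearman_rho(x, y))
```

Because y always increases with x, the rank-based coefficient reaches 1, while Pearson's r stays below 1 because the relationship is not linear.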
2
Q

Test for comparison between 2 groups with continuous variables with a normal distribution?
- Paired vs unpaired data

A

Paired: paired t-test
Unpaired: independent t-test
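
To show why pairing matters, here is a minimal sketch of the two t statistics in plain Python. The function names and the blood-pressure style numbers are hypothetical, the independent version assumes equal variances (pooled Student's t), and turning t into a p-value would additionally need the t distribution, which is not covered here.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(before, after):
    """Paired t statistic: test the per-subject differences against zero."""
    diffs = [b - a for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def independent_t(group1, group2):
    """Independent t statistic with pooled variance (equal variances assumed)."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))


# Hypothetical measurements on the same five subjects at two time-points.
before = [140, 150, 138, 160, 155]
after = [135, 145, 136, 150, 149]

print(paired_t(before, after))       # uses the within-subject differences
print(independent_t(before, after))  # ignores the pairing
```

On these numbers the paired statistic is several times larger than the independent one, because pairing removes the variation between subjects.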

3
Q

Which statistical test is used when the study uses discrete variables?

A

Chi-squared test
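
A small sketch of how the chi-squared statistic is built from a table of counts may help here. The function name and the 2x2 table are invented for illustration; expected counts are derived from the row and column totals.

```python
def chi_squared_statistic(observed):
    """Return (chi-squared statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obs - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Hypothetical counts: rows = two groups, columns = hypertension yes / no.
table = [[20, 30],
         [30, 20]]
chi2, dof = chi_squared_statistic(table)
print(chi2, dof)
```

With 1 degree of freedom the usual 5% critical value is 3.841, so a statistic above that would be reported as significant.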

4
Q

Test for comparison between more than 2 groups with continuous variables with a normal distribution with:

  • one variable?
  • multivariate?
A
  • One variable: ANOVA
  • Multivariate: Consult book
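
For the one-variable case, the following sketch shows how a one-way ANOVA F statistic compares the variation between group means with the variation within groups. The function name and the three small groups are hypothetical, and no p-value is computed.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n_total = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical measurements from three independent groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(one_way_anova_f(groups))
```

A large F means the group means differ by more than the scatter inside each group would explain.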

5
Q

Test for comparison between more than 2 groups with continuous variables without a normal distribution?

A

Kruskal-Wallis
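
As a hedged sketch of what the test does, the code below computes the Kruskal-Wallis H statistic from pooled ranks. The function name and data are made up, and this simplified version assumes no tied values (it raises an error otherwise) and applies no tie correction.

```python
def kruskal_wallis_h(groups):
    """H statistic from rank sums; this sketch assumes there are no ties."""
    pooled = sorted(v for g in groups for v in g)
    if len(set(pooled)) != len(pooled):
        raise ValueError("this sketch assumes no tied values")
    rank_of = {v: i + 1 for i, v in enumerate(pooled)}  # 1-based ranks
    n_total = len(pooled)
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)


# Hypothetical data: three groups whose values do not overlap at all.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(kruskal_wallis_h(groups))
```

H is usually compared against a chi-squared distribution with (number of groups - 1) degrees of freedom; for three groups the 5% critical value is 5.991.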

6
Q

Test for comparison between 2 groups with continuous variables without a normal distribution with:

  • paired data?
  • independent data?
A

Paired: Wilcoxon signed-rank
Independent: Mann-Whitney
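
The selection rules on these cards can be summarised as a small lookup. This is only a study aid: the function name `choose_test` and its parameters are invented for this sketch, and it encodes nothing beyond the rules stated on the cards.

```python
def choose_test(analysis, normal=True, groups=2, paired=False, discrete=False):
    """Return the test named on the cards for a given study design.

    analysis: "correlation" or "comparison"
    normal:   True if the continuous data are normally distributed
    groups:   number of groups being compared
    paired:   True if the same subjects appear in each group
    discrete: True if the study uses discrete variables
    """
    if discrete:
        return "Chi-squared"
    if analysis == "correlation":
        return "Pearson" if normal else "Spearman rank"
    if analysis != "comparison":
        raise ValueError("analysis must be 'correlation' or 'comparison'")
    if groups == 2:
        if normal:
            return "Paired t-test" if paired else "Independent t-test"
        return "Wilcoxon signed-rank" if paired else "Mann-Whitney"
    if groups > 2:
        return "ANOVA" if normal else "Kruskal-Wallis"
    raise ValueError("a comparison needs at least 2 groups")


print(choose_test("comparison", normal=False, groups=2, paired=True))
```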

7
Q

What are the 2 types of analysis that can be used to test the hypothesis?

A
  • Correlations i.e. hypothesis tests to evaluate relationships between variables
  • Comparisons i.e. hypothesis tests to evaluate differences between groups or populations
8
Q

What are the different types of qualitative data? Give examples of each.

A
  • nominal (unordered) e.g. gender, life status (alive/dead)
  • ordinal (ordered) e.g. fitness, stages of hypertension

Both are non-parametric.

9
Q

What are the different types of quantitative data? Give examples of each.

A
  • continuous (parametric) e.g. heart rate, age
  • discrete (non-parametric) e.g. no. of males/females in a group, no. of people with hypertension

10
Q

Define discrete data

A

Discrete data are counts that cannot be made more precise, e.g. a family cannot have 2.4 children.

11
Q

Define continuous data

A

Continuous data can take any value within a range, so it can be divided and reduced to finer and finer levels, e.g. height can be measured on progressively more precise scales: metres, centimetres, millimetres etc.

12
Q

Give an example of a variable that could be measured quantitatively or qualitatively

A

Eye colour can be measured quantitatively by assessing the RGB scale or qualitatively by categorising into blue, brown or green etc.

13
Q

Give an example of a variable that could be interpreted as discrete or continuous

A

Age is a discrete variable if counted in whole years and continuous if measured as the exact age in months, days, hours, minutes or seconds.

14
Q

Define nominal data

A

Items that are assigned to individual named categories that do not have an implicit or natural value or rank, e.g. gender (male or female) or fracture incidence (yes or no).

15
Q

Define ordinal data

A

Items which are assigned to categories that have some implicit or natural order, such as ‘small, medium, or large’.
Ordinal variables are often used to describe a patient’s characteristics e.g. stage of hypertension, pain level, and satisfaction.

16
Q

Define a normal distribution

A

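A normal distribution is a symmetrical, bell-shaped distribution that is fully described by its central tendency (mean) and dispersion (standard deviation). Whether data are normally distributed is used to decide how to describe the properties of large data-sets, i.e. which descriptive statistics are presented instead of the raw data.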

17
Q

How can you determine whether a distribution is normal?

A

By graphing data in a histogram (frequency distribution plot of the data points from a group or population) or a frequency bar chart

18
Q

Describe a normal curve

A

Symmetrical distribution with well-behaved tails, i.e. many data points in the central region of the range and a symmetrical distribution on either side.
Also called ‘Gaussian’ or ‘bell-shaped’.

19
Q

Define skewed data

- left and right?

A

Asymmetric with many data points in the high or low end of the range and an uneven tail (long on one side and short on the other).

  • left-skewed distribution = negatively skewed -> long left tail and mean to left of peak
  • right-skewed distribution = positively skewed -> long right tail and mean to right of peak
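
The direction of skew can be checked numerically. The sketch below uses the moment-based skewness coefficient (third central moment divided by the variance to the power 1.5); the function name and the data are hypothetical, and it compares the mean with the median rather than with the peak as a convenient proxy.

```python
from statistics import mean, median


def skewness(data):
    """Moment-based skewness: positive = long right tail, negative = long left tail."""
    m = mean(data)
    n = len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed sample: most values are low, one is very high.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Mirror image of the same sample, which is left-skewed.
left_skewed = [-x for x in right_skewed]

print(skewness(right_skewed), mean(right_skewed), median(right_skewed))
print(skewness(left_skewed), mean(left_skewed), median(left_skewed))
```

In the right-skewed sample the long tail drags the mean above the median; in the mirrored sample it drags the mean below it.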
20
Q

What is kurtosis?

A

Kurtosis describes whether data are heavy-tailed or light-tailed relative to a normal distribution.

  • high kurtosis = heavy tails, or outliers that create a very wide distribution
  • low kurtosis = light tails, or lack of outliers that create a very narrow distribution.
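
A short sketch of excess kurtosis (fourth central moment divided by the squared variance, minus 3 so that a normal distribution scores 0) illustrates the heavy-tailed versus light-tailed distinction. The function name and both samples are invented for this example.

```python
from statistics import mean


def excess_kurtosis(data):
    """Excess kurtosis: > 0 means heavier tails than normal, < 0 lighter tails."""
    m = mean(data)
    n = len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return m4 / m2 ** 2 - 3


# Hypothetical light-tailed sample: evenly spread, no outliers.
light_tailed = [1, 2, 3, 4, 5]
# Hypothetical heavy-tailed sample: tightly clustered with two extreme outliers.
heavy_tailed = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]

print(excess_kurtosis(light_tailed), excess_kurtosis(heavy_tailed))
```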
21
Q

Why is the distribution of data important for statistical analysis?

A

The mathematics underpinning most statistical tests relies on the data having a normal distribution (i.e. about two-thirds of the data lie within one standard deviation of the mean), and on the distribution being symmetrical (i.e. 50% of the data lie above the mean and 50% below).
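
The two figures quoted above can be checked against a standard normal distribution using `statistics.NormalDist` from the Python standard library. The variable names are just for this sketch.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of values within one standard deviation of the mean.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

# Proportion of values above the mean (symmetry implies one half).
above_mean = 1 - standard_normal.cdf(0)

print(round(within_one_sd, 4), above_mean)
```

The first value comes out at roughly 0.68, which is the "two-thirds" rule of thumb.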

22
Q

What are the statistical tests of normality?

A
  • Shapiro-Wilk test: used to test for normality with small sample sizes (n<50)
  • Kolmogorov-Smirnov test: used to test for normality with large sample sizes (n>50)

n = no. of samples in data set
p-value <0.05 means the data deviate significantly from a normal distribution, i.e. they are treated as not normally distributed
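
The rule of thumb above can be written as two tiny helpers. Both function names are hypothetical, and the handling of exactly n = 50 (sent to Kolmogorov-Smirnov here) is an assumption, since the card only covers n<50 and n>50. The p-value itself would come from a statistics package.

```python
def pick_normality_test(n):
    """Choose a normality test by sample size, following the card's rule of thumb."""
    if n < 50:
        return "Shapiro-Wilk"
    # Assumption: n == 50 is not covered by the card; it is grouped with n > 50 here.
    return "Kolmogorov-Smirnov"


def looks_normal(p_value, alpha=0.05):
    """True if the normality test did NOT reject normality at the given alpha."""
    return p_value >= alpha


print(pick_normality_test(20), looks_normal(0.30))
print(pick_normality_test(200), looks_normal(0.01))
```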

23
Q

What are descriptive statistics?

A

Raw data is usually summarised and presented as descriptive statistics, which condense large data-sets into a manageable format.

  • measures of central tendency: mean, median or mode
  • measures of dispersion of the data: variance, standard deviation (SD) or standard error (also known as the standard error of the mean, SEM)
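
These summaries are all available in, or easily derived from, Python's standard `statistics` module. The sample values below are made up; `variance` and `stdev` are the sample (n - 1) versions, and the SEM is computed as SD divided by the square root of n.

```python
from math import sqrt
from statistics import mean, median, mode, stdev, variance

# Hypothetical data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency.
central = {"mean": mean(data), "median": median(data), "mode": mode(data)}

# Measures of dispersion (sample variance and SD use n - 1 in the denominator).
sd = stdev(data)
dispersion = {
    "variance": variance(data),
    "sd": sd,
    "sem": sd / sqrt(len(data)),  # standard error of the mean
}

print(central)
print(dispersion)
```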
24
Q

Explain the difference between paired and unpaired data

A

Paired data is dependent and occurs when each group is composed of the same subjects of interest. Typically, paired observations arise from measuring the same variable in the same subject at different time-points, e.g. a longitudinal experiment.

Unpaired data is independent. Each group is composed of different subjects of interest. These observations are seen when comparing two groups with no common factors, e.g. a cross-sectional study.

25
Q

What is a correlation coefficient?

A

A correlation coefficient tells you how strong the relationship is and in which direction. It varies from 1 (perfect positive correlation) through 0 (no correlation) to -1 (perfect negative correlation).
e.g. Pearson’s r
Spearman’s rho (ρ)

26
Q

When are parametric and non-parametric tests used?

A

Parametric statistics (e.g. t-test, ANOVA) are used when the data is well described by the mean and standard deviation i.e. quantitative data which is normally distributed.

Non-parametric tests (e.g. Mann-Whitney, Wilcoxon signed-rank test) are adopted when the population is not well described by the mean and standard deviation, i.e. when quantitative data is not normally distributed or when data is qualitative.

27
Q

Why are parametric statistics better than non-parametric?

A

Parametric tests are easier to understand and, when their assumptions are met, the analyses are more powerful, so they are less likely to incorrectly reject or fail to reject a hypothesis.

28
Q

What is the most appropriate measure of central tendency for normally distributed data: mean, median or mode?

A

Mean

Median and mode should only be used for data that is not normally distributed
Mode is very rarely used