eM2 – Choosing statistics Flashcards

1
Q

In terms of analysis, what are correlations?

A

Hypothesis tests to evaluate relationships between variables

2
Q

What are comparisons?

A

Hypothesis tests to evaluate differences between groups or populations

3
Q

What is quantitative data?

A

Numeric information about quantities, e.g. height, width.

4
Q

What is qualitative / categorical data?

A

Information that is described by categories rather than measured numerically, e.g. gender, stage of disease.

5
Q

Give two types of quantitative data and an example for each:

A

Continuous: age
Counted (discrete): number of people with hypertension

6
Q

Give two types of qualitative data and an example for each:

A

Nominal: gender
Ordinal: fitness (not fit, quite fit, very fit)

7
Q

What is the difference between continuous and discrete data?

A

Continuous data can be divided into ever finer, more precise levels. Discrete data take only separate, countable values and cannot be made more precise.

8
Q

What is nominal data?

A

Qualitative data containing individual categories that cannot be put in an implicit rank/order

9
Q

What is ordinal data?

A

Categories that have an implicit/natural order.

10
Q

What is normality in terms of statistical analysis?

A

Normality describes how closely the data follow a normal (Gaussian) distribution, i.e. a symmetric distribution with “well behaved tails” that is well described by its central tendency (mean) and dispersion (SD).

11
Q

What is meant by left skewness?

A

Mean to the left of the peak, long tail in negative (decreasing) direction of curve

12
Q

What is meant by right skewness?

A

Mean to the right of the peak, long tail in positive (increasing) direction of the curve

13
Q

What is kurtosis?

A

The sharpness of the peak of a distribution curve (and, with it, how heavy the tails are) compared with a normal distribution

14
Q

What two factors do statistical tests rely on?

A

Symmetry: 50% of values lie above the mean and 50% below.
Normal distribution: about two-thirds of the data lie within 1 SD of the mean.

15
Q

How to assess normality of data quantitatively?

A

Shapiro-Wilk test: n < 50
Kolmogorov-Smirnov test: n > 50

16
Q

What is descriptive statistics?

A

A method of summarising large data sets in a format that is easy to read (tangible).

17
Q

What is the mean?

A

μ = ( Σ Xi ) / N

18
Q

What is the median?

A

The ((n+1)/2)-th number in the ordered data set. If n is even, take the average of the two middle numbers.

19
Q

What is the mode?

A

Most frequent data entry.

20
Q

What is the standard deviation in a data set?

A

σ = sqrt[ Σ ( Xi – μ )^2 / N ]
A measure of how dispersed the data are from the mean.

21
Q

What is meant by dependent (paired) data?

A

When repeated measurements are collected from the same subject, e.g. before and after a treatment

22
Q

What is meant by parametric statistics?

A

When the data from the population are well described by the mean and SD - normally distributed.

23
Q

What is meant by non-parametric statistics?

A

When the data are not well described by the mean and SD, i.e. non-normally distributed quantitative data. Note: non-parametric tests are also used for qualitative data.

24
Q

Parametric, 2 groups, paired

A

Paired t-test

25
Q

Parametric, 2 groups, unpaired

A

Independent (unpaired) t-test

26
Q

Parametric, 3+ groups, paired

A

Repeated-measures, one-way ANOVA

27
Q

Parametric, 3+ groups, unpaired

A

One-way ANOVA

28
Q

Non-parametric, 2 groups, paired

A

Wilcoxon Signed Rank test

29
Q

Non-parametric, 2 groups, unpaired

A

Mann-Whitney U test

30
Q

Non-parametric, 3+ groups, paired

A

Friedman test

31
Q

Non-parametric, 3+ groups, unpaired

A

Kruskal-Wallis test

32
Q

To test for a linear relationship in a normally distributed population:

A

Pearson’s Correlation test

33
Q

To test for a linear relationship in a non-normally distributed population:

A

Spearman’s Rank Correlation test

34
Q

Where are the mean, median and mode in skewed curves?

A

In a right-skewed curve, the mean and median lie to the right of the mode (the peak), with the mean furthest into the tail.

In a left-skewed curve, the mean and median lie to the left of the mode.
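The rule on this card can be turned into a rough check: if the mean sits above the median, the long tail is probably on the right, and vice versa. The sketch below is only a heuristic under that assumption (the function name and the data are made up for illustration), not a formal skewness test.

```python
import statistics


def skew_direction(values):
    """Guess the skew direction by comparing the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right"  # mean pulled towards a long right-hand tail
    if mean < median:
        return "left"   # mean pulled towards a long left-hand tail
    return "none"       # mean and median coincide


# Made-up data sets with one extreme value in each tail.
long_right_tail = [1, 2, 2, 3, 3, 3, 4, 20]
long_left_tail = [1, 17, 18, 18, 19, 19, 19, 20]

print(skew_direction(long_right_tail))
print(skew_direction(long_left_tail))
```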

35
Q

How do you calculate the mean, median and mode?

A

Mode: the most frequent number

Median: the ((n+1)/2)-th number in the ordered data set

Mean: add all the numbers together and divide by n
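As a quick check of the three calculations, here is a minimal sketch using Python's standard statistics module. The data values are made up for illustration.

```python
import statistics

values = [2, 3, 3, 5, 7, 8, 9]  # made-up data, n = 7

mean = statistics.mean(values)      # sum of the values divided by n
median = statistics.median(values)  # middle value of the ordered data, here the 4th of 7
mode = statistics.mode(values)      # most frequent value

print(mean, median, mode)
```

With an even number of values, statistics.median returns the average of the two middle values.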

36
Q

How do you calculate the range?

A

Largest value minus smallest value

37
Q

How do you calculate variance and standard deviation?

A

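Variance: σ^2 = Σ ( Xi – μ )^2 / N, i.e. the average of the squared differences from the mean.

Standard deviation: σ = sqrt(variance)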
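The population formula from the standard deviation card, σ = sqrt[Σ ( Xi – μ )^2 / N], can be worked through step by step. This is a minimal sketch with made-up data; it divides by N, as that card does.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data

n = len(values)
mu = sum(values) / n                               # mean
variance = sum((x - mu) ** 2 for x in values) / n  # average squared difference from the mean
sd = math.sqrt(variance)                           # standard deviation

print(variance, sd)

# The standard library's population functions give the same values.
assert math.isclose(variance, statistics.pvariance(values))
assert math.isclose(sd, statistics.pstdev(values))
```

Sample versions of these measures divide by N - 1 instead; in Python those are statistics.variance and statistics.stdev.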

38
Q

How do you calculate the interquartile range?

A

Order the data, then find the middle of the first half (lower quartile) and the middle of the second half (upper quartile).

If you have 11 numbers:

The lower quartile is the 3rd number of the ordered data

The upper quartile is the 9th number of the ordered data

Subtract the 3rd number from the 9th number to get the interquartile range
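The 11-number case can be sketched as follows. With 11 ordered values the median is the 6th, leaving five values on each side, so the middles of the two halves are the 3rd and 9th values overall. The data are made up, and this "median of each half" convention is an assumption; software packages often use slightly different quartile rules.

```python
data = [7, 1, 3, 12, 5, 9, 15, 4, 8, 11, 6]  # made-up data, n = 11

ordered = sorted(data)
n = len(ordered)

# For an odd n, leave the median out and split the rest into two halves.
lower_half = ordered[: n // 2]      # the 5 values below the median
upper_half = ordered[n // 2 + 1 :]  # the 5 values above the median

lower_quartile = lower_half[len(lower_half) // 2]  # 3rd value overall
upper_quartile = upper_half[len(upper_half) // 2]  # 9th value overall
iqr = upper_quartile - lower_quartile

print(lower_quartile, upper_quartile, iqr)
```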

39
Q

What are tests that can be used on parametric data?

A

t-test, ANOVA

40
Q

What are tests that can be used on non-parametric data?

A

Mann-Whitney, Wilcoxon signed rank test

41
Q

What statistical test is used to compare two variables that are parametric?

A

Paired or unpaired t-test

42
Q

What statistical test is used to compare more than 2 variables that are parametric?

A

ANOVA test

  1. One-way ANOVA: compares the effect of one independent variable on a dependent variable.
  2. Two-way ANOVA: compares the effects of two independent variables on a dependent variable.
  3. MANOVA: a multivariate ANOVA test, used when there is more than one dependent variable.
43
Q

What will the ANOVA test tell you?

What other tests can be used?

A

ANOVA will only tell you whether a difference exists between your samples, e.g. whether sample A, sample B and sample C have different means. It will not tell you where the difference is, i.e. whether it is between A & B, A & C or B & C.

To find out, use a post hoc test such as a Tukey post hoc test or a Bonferroni post hoc test.

44
Q

What is the equivalent test to the paired t-test for non parametric data?

A

Wilcoxon signed-rank test

It is used for paired quantitative data sets that do not have a normal distribution. Only the p-value needs to be reported.

45
Q

What is the equivalent non-parametric test for the unpaired t-test?

A

Mann-Whitney U test

46
Q

What is the non-parametric equivalent of the one-way ANOVA test?

A

Kruskal-Wallis test

47
Q

What is the non-parametric equivalent of the repeated-measures, one-way ANOVA?

A

Friedman test

48
Q

Describe the strength of the Pearson correlation

A

(±) 0-0.2: very low correlation

(±) 0.2-0.4: low correlation

(±) 0.4-0.6: reasonable correlation

(±) 0.6-0.8: high correlation

(±) 0.8-1.0: very high correlation
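The bands above can be written as a small lookup. This is a sketch only: the function name is made up, and because the bands on the card share their end points (0.2, 0.4 and so on), the code assumes a value exactly on a boundary falls into the higher band.

```python
def describe_correlation(r):
    """Return the card's verbal label for a correlation coefficient r."""
    size = abs(r)  # the sign only gives the direction, not the strength
    if size > 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if size < 0.2:
        return "very low"
    if size < 0.4:
        return "low"
    if size < 0.6:
        return "reasonable"
    if size < 0.8:
        return "high"
    return "very high"


print(describe_correlation(-0.75))
print(describe_correlation(0.1))
```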

49
Q

What does the Pearson correlation show and when is it used?

A

If your data are normally distributed, you should use a Pearson’s Correlation test to identify linear trends.

50
Q

What does the p-value indicate in the Pearson correlation test?

A

The p-value in this case tells you how reliable the r-value is. The smaller the p-value, the more reliable the r-value, i.e. the less likely it is that the correlation arose by chance.

51
Q

What does the r squared value of the Pearson correlation indicate?

A

This represents how closely your data fit the correlation line, i.e. the proportion of the variation in one variable that is explained by the other. A similar rule of thumb applies to both the r- and the r^2-values: the higher the r^2-value, the more reliable your conclusion can be.
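To make r and r^2 concrete, here is a minimal sketch that computes Pearson's r from its usual definition (the covariation of x and y divided by the square root of the product of their individual variations) and then squares it. The cards do not give this formula, so treat it as an assumption; the data are made up, and no p-value is calculated here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
r_squared = r ** 2

print(round(r, 3), round(r_squared, 3))
```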

52
Q

When is the Spearman rank correlation used?

A
  1. When the data are not normally distributed, and
  2. you want to identify trends between two variables.

It gives you a rho value (similar to the r-value in the Pearson correlation).

The p-value indicates how reliable the rho value is.
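Spearman's rho works on the ranks of the values rather than the values themselves, which is why it does not need normally distributed data. The sketch below uses the common shortcut formula rho = 1 - 6 Σd^2 / (n(n^2 - 1)), where d is the difference between the two ranks of each pair. The formula is not on the cards, the helper names and data are made up, and the shortcut assumes there are no tied values.

```python
def ranks(values):
    """Rank the values from 1 (smallest) upwards; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(xs, ys):
    """Spearman's rho via the rank-difference shortcut (no ties)."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


xs = [10, 20, 30, 40, 50]  # made-up data
ys = [1, 4, 9, 16, 200]    # always increasing with x, but not along a straight line

print(spearman_rho(xs, ys))
print(spearman_rho(xs, ys[::-1]))
```

Because only the ranks matter, any consistently increasing pattern gives the maximum rho, even when the points do not lie on a straight line.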

53
Q

What is a linear regression?

A

It is defined by a simple equation: y = a + bx

Where:

a = the y-axis intercept value

b = the gradient of the line, i.e. the regression coefficient

Using this equation, you can calculate the value on the y-axis if you know the value on the x-axis, or vice versa.
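A minimal sketch of fitting y = a + bx and then using it in both directions. It assumes the ordinary least-squares estimates (b is the covariation of x and y divided by the variation of x, and a = mean of y - b × mean of x), which the card does not spell out. The function name and the data are made up.

```python
def fit_line(xs, ys):
    """Least-squares estimates of the intercept a and the gradient b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # gradient (regression coefficient)
    a = mean_y - b * mean_x  # y-axis intercept
    return a, b


xs = [1, 2, 3, 4, 5]   # made-up data lying exactly on y = 1 + 2x
ys = [3, 5, 7, 9, 11]

a, b = fit_line(xs, ys)

y_at_6 = a + b * 6        # predict y from a known x
x_for_15 = (15 - a) / b   # or rearrange to find x from a known y

print(a, b, y_at_6, x_for_15)
```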

54
Q

What is the difference between the correlation and regression?

A
  1. Correlation indicates the strength of the relationship between two variables.
  2. Regression quantifies the association between the two variables, i.e. it tells us the impact that changing one variable will have on the other variable.
55
Q

What can the chi squared test be used for?

A

To test for similarities or differences between observed and expected values; it can also be used to evaluate qualitative data

56
Q

What does the X2 value in the chi-squared test mean?

A

X2 = 0 means that there is no difference between the expected and observed values.

The larger the X2 value, the larger the difference between the expected and observed values.
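The behaviour described on this card can be seen by computing the statistic directly. This sketch assumes the usual formula X2 = Σ (observed - expected)^2 / expected, which the cards do not state; the counts are made up and no p-value is calculated.

```python
def chi_squared(observed, expected):
    """Chi-squared statistic for matching lists of observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


expected = [25, 25]          # made-up expected counts
close_to_expected = [26, 24]
far_from_expected = [30, 20]

print(chi_squared(expected, expected))           # identical counts
print(chi_squared(close_to_expected, expected))  # small difference
print(chi_squared(far_from_expected, expected))  # larger difference
```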

57
Q

Since the X2 value in the chi-squared test is difficult to evaluate, what is used for evaluation?

A

the p value

58
Q

What statistical test would you use for blood pressure (Gaussian) between males pre-renal denervation and post renal denervation?

A

Paired t-test

59
Q

What statistical test would you use for blood pressure (Gaussian) between males and females?

A

Unpaired t-test

60
Q

What statistical test would you use for tumour size (not-normal) in men with prostate cancer in 3 age classes?

A

Kruskal-Wallis test

61
Q

What statistical test would you use for tumour size (not normal) in women with breast cancer before, during and 5 years after cancer treatment?

A

Friedman test

62
Q

What statistical test would you use for resting heart rate (Gaussian) in children, men and women?

A

One-way ANOVA

63
Q

What statistical test would you use for fear level (rated 1-4) in children before and after exposure therapy?

A

Wilcoxon Signed Rank Test

64
Q

What statistical test would you use for height (normal) in children before, during and after puberty?

A

Repeated-measures, one-way ANOVA

65
Q

What statistical test would you use for unpaired non-parametric data with 2 groups?

A

Mann-Whitney U test
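The scenario cards above all come down to the same three questions: is the data parametric, how many groups are there, and are the groups paired? A minimal sketch of that decision as a lookup table (the function name and argument names are made up for illustration):

```python
def choose_test(parametric, groups, paired):
    """Pick a comparison test from the three questions used on the cards."""
    if groups < 2:
        raise ValueError("a comparison needs at least 2 groups")
    three_or_more = groups >= 3
    table = {
        # (parametric, 3+ groups, paired): test
        (True, False, True): "Paired t-test",
        (True, False, False): "Unpaired t-test",
        (True, True, True): "Repeated-measures, one-way ANOVA",
        (True, True, False): "One-way ANOVA",
        (False, False, True): "Wilcoxon signed-rank test",
        (False, False, False): "Mann-Whitney U test",
        (False, True, True): "Friedman test",
        (False, True, False): "Kruskal-Wallis test",
    }
    return table[(parametric, three_or_more, paired)]


# Blood pressure (Gaussian) in the same males before and after a procedure.
print(choose_test(parametric=True, groups=2, paired=True))

# Tumour size (not normal) in men from 3 separate age classes.
print(choose_test(parametric=False, groups=3, paired=False))
```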

66
Q

What must figures and tables have?

A

Title
Labelled axes with units
Legend
[Plus: Annotations to describe certain elements, asterisks to denote significance]

67
Q

What must a legend include?

A

Title (descriptive or declarative)
Method of generation (brief, 1 sentence)
Result: brief explanation, sample size and p-values
Definition of symbols, scale bars, error bars and abbreviations

68
Q

What are the types of graph you can use?

A

Pie chart
Bar chart
Histogram
Dot-plot
Box and whiskers
Scatter plot
Line graph
Cumulative frequency
Bubble plot
Stem and Leaf plot

69
Q

When would you use a pie chart?

A

If you want to show pieces of a whole e.g. demographic breakdown.

70
Q

When would you use a bar chart?

A

When comparing categorical (x) and numerical (y) data.

71
Q

When would you use a Histogram?

A

When comparing continuous quantitative (x) and quantitative counted (y) (e.g. heart rate vs frequency)

72
Q

When would you use a dot-plot?

A

Similar to bar charts but with smaller data sets. More visually appealing.

73
Q

When would you use a box and whiskers chart?

A

To summarise a single data set (more for non parametric numerical data)

74
Q

What do box and whiskers charts show?

A

There are 5 numbers (Lower extreme, lower quartile, median, upper quartile, upper extreme). The box shows the interquartile range and the whiskers show the extremities.

75
Q

When would you use a scatter plot?

A

To show the relationship between two data sets. It is conventionally used for two continuous numerical variables. A line can be added to show correlation.

76
Q

What are line graphs and cumulative frequency curves?

A

Line graph: a line joining points, typically time (x) against a dependent variable (y)
Cumulative frequency graph: similar to a histogram but uses a curve (incl. dose-response)

77
Q

What are bubble plots and stem and leaf plots?

A

Bubble- similar to scatter but size of bubble represents a third variable
Stem and leaf- Displays general distribution, hybrid between table and graph. Used for moderately sized data sets.

78
Q

What is the one-way ANOVA?

A

Compares parametric means of data from more than 2 samples. It can be for paired data (repeated-measures, one-way ANOVA) or unpaired data (one-way ANOVA).

79
Q

What is the two way ANOVA?

A

Compares the effects of two independent variables on a dependent variable

80
Q

What is the MANOVA?

A

Multivariate ANOVA

81
Q

What is the problem with a one way ANOVA and how do we get over this?

A

You cannot tell which of the 3+ samples differ from each other.

You need to use a post hoc test (Tukey or Bonferroni)

82
Q

What are the non parametric tests and what data are they used for?

A

Wilcoxon signed-rank: non-parametric, paired, two samples

Mann-Whitney U: non-parametric, unpaired, two samples

Kruskal-Wallis: non-parametric, unpaired, 3+ samples

Friedman: non-parametric, paired, 3+ samples

83
Q

What is the difference between normal, kurtosis and skewness?

A

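Normal: a symmetric distribution with “well behaved tails”.

Skewness: the distribution is asymmetric, with the peak (mode) shifted to one side and a long tail on the other.

Kurtosis: describes the sharpness of the peak and the weight of the tails.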

84
Q

How do you measure central tendency?

A

Mean, median and mode

85
Q

How do you measure dispersion?

A

Standard deviation

Variance

Range

86
Q

What is variance?

A

A measure of the spread of the numbers away from the mean value. It is calculated by working out the average of the squared differences from the mean. You are not required to know how to calculate this for the RDS course.

87
Q

What is standard deviation?

A

Square root of the variance. Measures the spread of a set of data