eM2 – Choosing statistics Flashcards

Question 1

Q

In terms of analysis, what are correlations?

Answer

A

Hypothesis tests to evaluate relationships between variables

Question 2

Q

What are comparisons?

Answer

A

Hypothesis tests to evaluate differences between groups or populations

Question 3

Q

What is quantitative data?

Answer

A

Numeric information about quantities, e.g. height, width.

Question 4

Q

What is qualitative / categorical data?

Answer

A

Information that describes categories rather than measured quantities, e.g. gender, stage of disease.

Question 5

Q

Give two types of quantitative data and an example for each:

Answer

A

Continuous: age

Counted (discrete): number of people with hypertension

Question 6

Q

Give two types of qualitative data and an example for each:

Answer

A

Nominal: gender

Ordinal: fitness (not fit, quite fit, very fit)

Question 7

Q

What is the difference between continuous and discrete data?

Answer

A

Continuous data can be divided into finer and more precise levels. Discrete data take separate, countable values and cannot be made more precise.

Question 8

Q

What is nominal data?

Answer

A

Qualitative data containing individual categories that cannot be put in an implicit rank/order

Question 9

Q

What is ordinal data?

Answer

A

Categories that have an implicit/natural order.

Question 10

Q

What is normality in terms of statistical analysis?

Answer

A

Normality describes how closely the data follow a normal (Gaussian) distribution, i.e. a symmetric distribution with “well behaved tails” that is well described by its central tendency (mean) and dispersion (SD).

Question 11

Q

What is meant by left skewness?

Answer

A

Mean to the left of the peak, long tail in negative (decreasing) direction of curve

Question 12

Q

What is meant by right skewness?

Answer

A

Mean to the right of the peak, long tail in positive (increasing) direction of the curve
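A quick way to see the direction of skew is to compare the mean with the median, since the long tail pulls the mean towards it. The sketch below uses made-up values and a hypothetical helper, skew_direction; it is a rough heuristic, not a formal skewness statistic.

```python
from statistics import mean, median

# Hypothetical right-skewed sample: most values are small, one is very large.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Mirror image of the sample above, which gives a left-skewed sample.
left_skewed = [21 - x for x in right_skewed]


def skew_direction(values):
    """Return 'right', 'left' or 'none' by comparing the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "right"  # mean dragged towards a long positive tail
    if m < med:
        return "left"   # mean dragged towards a long negative tail
    return "none"


print(skew_direction(right_skewed))  # right
print(skew_direction(left_skewed))   # left
```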

Question 13

Q

What is kurtosis?

Answer

A

The sharpness of a peak of a distribution curve

Question 14

Q

What two factors do statistical tests rely on?

Answer

A

50% of values above and below the mean - symmetrical

2/3rds of data within 1 SD of the mean - normal distribution

Question 15

Q

How to assess normality of data quantitatively?

Answer

A

Shapiro-Wilk test - n<50

Kolmogorov-Smirnov test - n>50

Question 16

Q

What is descriptive statistics?

Answer

A

A method of summarising large data sets in a format that is easy to read (tangible).

Question 17

Q

What is the mean?

Answer

A

μ = ( Σ Xi ) / N

Question 18

Q

What is the median?

Answer

A

The (n+1)/2-th number in the ordered data set.

Question 19

Q

What is the mode?

Answer

A

Most frequent data entry.

Question 20

Q

What is the standard deviation in a data set?

Answer

A

σ = sqrt[ Σ ( Xi – μ )^2 / N ]

A measure of how dispersed the data are from the mean.
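The formula can be checked with a few lines of Python. This is a minimal sketch on made-up numbers; population_sd is a hypothetical helper name. It divides by N, as the formula above does, so it matches the standard library's pstdev (a sample standard deviation would divide by N - 1 instead).

```python
from math import sqrt
from statistics import pstdev


def population_sd(values):
    """Population standard deviation: sqrt of the mean squared distance from the mean."""
    n = len(values)
    mu = sum(values) / n
    return sqrt(sum((x - mu) ** 2 for x in values) / n)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, mean = 5

print(population_sd(data))  # 2.0
print(pstdev(data))         # 2.0
```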

Question 21

Q

What is meant by dependent (paired) data?

Answer

A

When the data are consistently collected from the same subject

Question 22

Q

What is meant by parametric statistics?

Answer

A

When the data from the population are well described by the mean and SD - normally distributed.

Question 23

Q

What is meant by non-parametric statistics?

Answer

A

When the data are not well described by the mean and SD - non-normally distributed quantitative data.

Note: non-parametric tests are used for qualitative data.

Question 24

Q

Parametric, 2 groups, paired

Answer

A

Paired t-test

Question 25

Q

Parametric, 2 groups, unpaired

Answer

A

Independent t-test

Question 26

Q

Parametric, 3+ groups, paired

Answer

A

Repeated-measures, one-way ANOVA

Question 27

Q

Parametric, 3+ groups, unpaired

Answer

A

One-way ANOVA

Question 28

Q

Non-parametric, 2 groups, paired

Answer

A

Wilcoxon Signed Rank test

Question 29

Q

Non-parametric, 2 groups, unpaired

Answer

A

Mann-Whitney U test

Question 30

Q

Non-parametric, 3+ groups, paired

Answer

A

Friedman test

Question 31

Q

Non-parametric, 3+ groups, unpaired

Answer

A

Kruskal-Wallis test
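The eight combinations of distribution, number of groups and pairing can be collected into one lookup. The sketch below is only a study aid; choose_test and its parameter names are hypothetical, and the mapping follows the standard pairing of each parametric test with its non-parametric equivalent.

```python
def choose_test(parametric: bool, groups: int, paired: bool) -> str:
    """Return the name of the comparison test for the given study design."""
    if groups < 2:
        raise ValueError("need at least two groups to compare")
    if parametric:
        if groups == 2:
            return "Paired t-test" if paired else "Independent t-test"
        return "Repeated-measures one-way ANOVA" if paired else "One-way ANOVA"
    if groups == 2:
        return "Wilcoxon signed-rank test" if paired else "Mann-Whitney U test"
    # Three or more groups, non-parametric.
    return "Friedman test" if paired else "Kruskal-Wallis test"


# Example: non-normal data, three time points measured in the same subjects.
print(choose_test(parametric=False, groups=3, paired=True))  # Friedman test
```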

Question 32

Q

To test for a linear relationship in a normally distributed population:

Answer

A

Pearson’s Correlation test

Question 33

Q

To test for a linear relationship in a non-normally distributed population:

Answer

A

Spearman’s Correlation test

Question 34

Q

Where are the mean, median and mode in the skewed curves?

Answer

A

The mean and median are to the right of the mode (the peak) in the right-skewed curve

and to the left of the mode in the left-skewed curve

Question 35

Q

How do you calculate the mean, median and mode?

Answer

A

Mode - the most frequent number

Median - the (n+1)/2-th number in the ordered data set

Mean - add all the numbers and divide by n
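These three calculations can be tried on a small made-up data set. The sketch below assumes an odd number of values so that the (n+1)/2 rule lands on a single value, and checks the manual median against Python's statistics module.

```python
from statistics import mean, median, mode

data = [3, 7, 7, 2, 9, 7, 4]  # illustrative values
ordered = sorted(data)        # [2, 3, 4, 7, 7, 7, 9]
n = len(ordered)

# (n+1)/2-th value, counting from 1; list indices count from 0, hence the -1.
manual_median = ordered[(n + 1) // 2 - 1]
manual_mean = sum(data) / n

print(mode(data))             # 7
print(manual_median)          # 7
print(round(manual_mean, 2))  # 5.57
```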

Question 36

Q

How do you calculate the range?

Answer

A

Largest minus smallest value

Question 37

Q

How do you calculate variance and standard deviation?

Question 38

Q

How do you calculate the interquartile range?

Answer

A

Find the middle value of the first half of the ordered data (lower quartile) and the middle value of the second half (upper quartile).

If you have 11 numbers:

The lower quartile is the 3rd number of the ordered range

The upper quartile is the 9th number of the ordered range

Subtract the 3rd number from the 9th number
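The median-of-halves method can be sketched as follows, using 11 made-up values. iqr_by_halves is a hypothetical helper; statistical packages often interpolate quartiles differently, so their results may not match this hand method exactly.

```python
from statistics import median


def iqr_by_halves(values):
    """Return (lower quartile, upper quartile, IQR) using the median of each half."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # values below the median position
    upper = ordered[half + (n % 2):]  # values above it (the median itself is skipped when n is odd)
    q1, q3 = median(lower), median(upper)
    return q1, q3, q3 - q1


data = [1, 3, 4, 6, 7, 8, 10, 12, 15, 18, 21]  # 11 illustrative values, already ordered

# With 11 values the halves are positions 1-5 and 7-11,
# so the quartiles are the 3rd and 9th ordered values.
print(iqr_by_halves(data))  # (4, 15, 11)
```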

Question 39

Q

What are tests that can be used on parametric data?

Answer

A

t-test, ANOVA

Question 40

Q

What are tests that can be used on non-parametric data?

Answer

A

Mann-Whitney, Wilcoxon signed rank test

Question 41

Q

What statistical test is used to compare two variables that are parametric?

Answer

A

Paired and unpaired t-tests

Question 42

Q

What statistical test is used to compare more than 2 variables that are parametric?

Answer

A

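ANOVA test

One-way ANOVA: extends the t-test to 3+ groups. It tests the effect of one independent variable on one dependent variable.
Two-way ANOVA: tests the effect of two independent variables on one dependent variable.
MANOVA: a multivariate ANOVA test, i.e. one with more than one dependent variable.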

Question 43

Q

What will the ANOVA test tell you?

What other tests can be used?

Answer

A

ANOVA will only tell you if a difference exists between your samples, e.g. it will inform you if sample A, sample B and sample C have different means. It will not tell you where the difference is, i.e. whether it is between A&B, A&C or B&C.

To find where the difference lies, use a post hoc test such as a Tukey post hoc test or a Bonferroni post hoc test.

Question 44

Q

What is the equivalent test to the paired t-test for non parametric data?

Answer

A

Wilcoxon signed-rank test

Used for quantitative data sets that do not have a normal distribution. Only the p-value needs to be reported.

Question 45

Q

What is the non-parametric equivalent of the unpaired t-test?

Answer

A

Mann-Whitney U test

Question 46

Q

What is the non-parametric equivalent of the one-way ANOVA test?

Answer

A

Kruskal-Wallis test

Question 47

Q

What is the non-parametric equivalent of the repeated-measures one-way ANOVA?

Answer

A

Friedman test

Question 48

Q

Describe the strength of the Pearson correlation

Answer

A

(±) 0-0.2: very low correlation

(±) 0.2-0.4: low correlation

(±) 0.4-0.6: reasonable correlation

(±) 0.6-0.8: high correlation

(±) 0.8-1.0: very high correlation
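The bands above can be written as a small lookup. This is a sketch only: describe_correlation is a hypothetical helper, and because the bands share their end points (0.2, 0.4, ...), it assumes a value exactly on a boundary belongs to the higher band.

```python
def describe_correlation(r: float) -> str:
    """Map an r-value to the descriptive band used above (sign is ignored)."""
    size = abs(r)
    if size > 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if size < 0.2:
        return "very low correlation"
    if size < 0.4:
        return "low correlation"
    if size < 0.6:
        return "reasonable correlation"
    if size < 0.8:
        return "high correlation"
    return "very high correlation"


print(describe_correlation(-0.85))  # very high correlation
print(describe_correlation(0.5))    # reasonable correlation
```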

Question 49

Q

What does the Pearson correlation show and when is it used?

Answer

A

If your data are normally distributed, you should use a Pearson’s Correlation test to identify linear trends.
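For reference, the r-value can be computed by hand from the deviations of each variable about its mean. The sketch below uses made-up data and a hypothetical helper, pearson_r; it gives only r and r squared, not the p-value, which statistical software would normally report alongside.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # co-variation of x and y
    sxx = sum((x - mx) ** 2 for x in xs)                    # variation in x
    syy = sum((y - my) ** 2 for y in ys)                    # variation in y
    return sxy / sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]  # illustrative values
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 3), round(r ** 2, 3))  # 0.775 0.6
```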

Question 50

Q

What does the p-value indicate in the Pearson correlation test?

Answer

A

The p-value in this case tells you how reliable the r-value is. The smaller the p-value, the more reliable the r-value, i.e. the less likely it is that a correlation this strong arose by chance.

Question 51

Q

What does the r squared value of the Pearson correlation indicate?

Answer

A

This represents how closely your data are fitted to the correlation line. A similar rule of thumb applies to both the r and the r2-values, i.e. the higher the r2-value, the more reliable your conclusion can be.

Question 52

Q

When is the Spearman rank correlation used?

Answer

A

When the data are not normally distributed,
to identify trends (strictly, monotonic trends rather than only linear ones)

It gives you a rho value (similar to the r-value in the Pearson correlation)

p-value - how reliable the rho value is
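One way to see how rho relates to r: Spearman's rho is the Pearson correlation applied to the ranks of the data rather than the raw values. The sketch below uses made-up data and hypothetical helper names (average_ranks, pearson_r, spearman_rho); tied values are given the mean of their ranks.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


dose = [1, 2, 3, 4, 5]            # illustrative values
response = [1, 4, 9, 16, 100]     # always increasing, but not in a straight line

print(spearman_rho(dose, response))  # 1.0
```

In this example the raw-value Pearson r is below 1 because the points do not lie on a straight line, while rho is exactly 1 because the ordering of the two variables agrees perfectly.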

Question 53

Q

What is a linear regression?

Answer

A

It is defined by a simple equation: y = a + bx

Where:

a = the y-axis intercept value

b = the gradient of the line, i.e. the regression coefficient

Using this equation, you can calculate the value on the y-axis if you know the value of the x-axis or vice versa.
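As an illustration, a and b can be estimated by ordinary least squares and then used to predict in either direction. The sketch below uses made-up data that lie exactly on y = 1 + 2x; fit_line is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares estimates of intercept a and gradient b for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx    # gradient (regression coefficient)
    a = my - b * mx  # intercept: the fitted line passes through (mean x, mean y)
    return a, b


x = [1, 2, 3, 4, 5]   # illustrative values
y = [3, 5, 7, 9, 11]  # exactly y = 1 + 2x

a, b = fit_line(x, y)
print(a, b)          # 1.0 2.0
print(a + b * 6)     # y when x = 6 -> 13.0
print((21 - a) / b)  # x when y = 21 -> 10.0
```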

Question 54

Q

What is the difference between the correlation and regression?

Answer

A

Correlation indicates the strength of the relationship between two variables.
Regression quantifies the association between the two variables, i.e. it tells us the impact that changing one variable will have on the other variable.

Question 55

Q

What can the chi squared test be used for?

Answer

A

To test for similarities between observed and expected values; it can also be used to evaluate qualitative data

Question 56

Q

What does the X2 value in the chi-squared test mean?

Answer

A

X2 = 0 means that there is no difference between expected and observed values

The larger the X2 value, the larger the difference between expected and observed values
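The statistic itself is the sum of (observed - expected)^2 / expected over all categories. The sketch below uses made-up counts and a hypothetical helper, chi_squared; it computes only the statistic, not the p-value.

```python
def chi_squared(observed, expected):
    """Chi-squared statistic: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of categories")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


expected = [20, 20, 20]  # illustrative counts expected in three categories
observed = [18, 22, 20]  # illustrative counts actually seen

print(chi_squared(expected, expected))            # 0.0 (no difference at all)
print(round(chi_squared(observed, expected), 3))  # 0.4
```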

Question 57

Q

Since the X2 value in the chi-squared test is difficult to evaluate, what is used for evaluation?

Answer

A

The p-value

Question 58

Q

What statistical test would you use for blood pressure (Gaussian) between males pre-renal denervation and post renal denervation?

Answer

A

Paired T-test

Question 59

Q

What statistical test would you use for blood pressure (Gaussian) between males and females?

Answer

A

Unpaired T-test

Question 60

Q

What statistical test would you use for tumour size (not-normal) in men with prostate cancer in 3 age classes?

Answer

A

Kruskal-Wallis test

Question 61

Q

What statistical test would you use for tumour size (not normal) in women with breast cancer before, during and 5 years after cancer treatment?

Answer

A

Friedman test

Question 62

Q

What statistical test would you use for resting heart rate (Gaussian) in children, men and women?

Answer

A

One-way ANOVA

Question 63

Q

What statistical test would you use for fear level (rated 1-4) in children before and after exposure therapy?

Answer

A

Wilcoxon Signed Rank Test

Question 64

Q

What statistical test would you use for height (normal) in children before, during and after puberty?

Answer

A

Repeated-measures, one-way ANOVA

Answer 64

A

Mann Whitney U

Answer 65

A

Title
Labelled axes with units
Legend
[Plus: Annotations to describe certain elements, asterisks to denote significance]

Answer 66

A

Title- descriptive or declarative title
Method of generation (brief, 1 sentence)
Result (brief) explanation, sample size and p values
Definition of symbols/ scale bars/ error bars/ abbreviations

Answer 67

A

Pie chart
Bar chart
Histogram
Dot-plot
Box and whiskers
Scatter plot
Line graph
Cumulative frequency
Bubble plot
Stem and Leaf plot

Answer 68

A

If you want to show pieces of a whole e.g. demographic breakdown.

Answer 69

A

When comparing categorical (x) and numerical (y) data.

Answer 70

A

When comparing continuous quantitative (x) and quantitative counted (y) (e.g. heart rate vs frequency)

Answer 71

A

Similar to bar charts but with smaller data sets. More visually appealing.

Answer 72

A

To summarise a single data set (more for non parametric numerical data)

Answer 73

A

There are 5 numbers (Lower extreme, lower quartile, median, upper quartile, upper extreme). The box shows the interquartile range and the whiskers show the extremities.

Answer 74

A

To show similarities between two data sets. It is conventionally used between two continuous numerical variables. A line can be added to show correlation.

Answer 75

A

Line graph- Line joining points, time and dependent variable
Cumulative frequency graph- Similar to histogram but uses curve (incl. dose-response)

Answer 76

A

Bubble- similar to scatter but size of bubble represents a third variable
Stem and leaf- Displays general distribution, hybrid between table and graph. Used for moderately sized data sets.

Answer 77

A

Compares parametric means of data from more than 2 samples. It can be for paired data (repeated measures, one-way ANOVA) or unpaired (one-way ANOVA)

Answer 78

A

Compares two independent variables

Answer 79

A

Multivariate ANOVA

Answer 80

A

You cannot tell which sets of data of the 3+ samples show difference.

You need to use a (Tukey/Bonferroni) post hoc test

Answer 81

A

Wilcoxon- non-parametric, paired, two sample

Mann-Whitney U- non-parametric, unpaired, two sample

Kruskal-Wallis- non-parametric, unpaired, 3+ sample

Friedman- non-parametric, paired, 3+ sample

Answer 82

A

Skewness is when the mode (the peak) is shifted to one side

Kurtosis describes the tails

Answer 83

A

Mean, median and mode

Answer 84

A

Standard deviation

Variance

Range

Answer 85

A

A measure of the spread of the numbers away from the mean value. It is calculated by working out the average of the squared differences from the mean. You are not required to know how to calculate this for the RDS course.

Answer 86

A

Square root of the variance. Measures the spread of a set of data