E-module 2 - Choosing statistics Flashcards by Shifa B

Which statistical tests are used when the hypothesis proposes a correlation between continuous variables:

with a normal distribution?
without a normal distribution?

Hypothesis proposes correlation between continuous variables:

normal distribution: Pearson
not normal: Spearman rank
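As a rough illustration of the difference between the two coefficients, the sketch below computes both with the Python standard library only. The function names and the `dose`/`response` values are made up for this example; Spearman's rho is obtained here by applying the Pearson formula to the ranks of the data.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical data: a curved but strictly increasing relationship.
dose = [1, 2, 3, 4, 5]
response = [1, 4, 9, 16, 25]

print(round(pearson_r(dose, response), 3))     # below 1: relationship is not linear
print(round(spearman_rho(dose, response), 3))  # 1.0: ordering is perfectly preserved
```

Because Spearman works on ranks rather than raw values, it does not depend on the data following a normal distribution.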

Test for comparison between 2 groups with continuous variables with a normal distribution?
- Paired vs unpaired data

Paired: paired t-test
Unpaired: independent t-test

Which statistical test is used when the study uses discrete variables?

Chi-squared test
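A minimal sketch of how the chi-squared statistic is built from a table of counts, using only the standard library. The 2x2 counts are invented for illustration (for example, hypertension yes/no in two groups), and no continuity correction is applied.

```python
def chi_squared_statistic(table):
    """Chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows are two groups, columns are yes / no.
counts = [[30, 20],
          [10, 40]]

stat = chi_squared_statistic(counts)
degrees_of_freedom = (len(counts) - 1) * (len(counts[0]) - 1)
print(round(stat, 2), degrees_of_freedom)
```

With 1 degree of freedom the critical value at the 0.05 level is 3.84, so a statistic above that would be read as a significant association between the two variables.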

Test for comparison between more than 2 groups with continuous variables with a normal distribution with:

one variable?
multivariate?

One variable: ANOVA
Multivariate: Consult book

Test for comparison between more than 2 groups with continuous variables without a normal distribution?

Kruskal-Wallis

Test for comparison between 2 groups with continuous variables without a normal distribution with:

paired variables?
independent variables?

Paired: Wilcoxon signed-rank
Independent: Mann-Whitney
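The test-selection cards above can be collapsed into one small lookup. This is only a sketch of the deck's own rules; the function name `choose_test` and its parameters are hypothetical, and the multivariate case is deliberately left out because the deck says to consult a book.

```python
def choose_test(goal, normal=True, n_groups=2, paired=False, discrete=False):
    """Return the test named on the cards for a given study design.

    goal      -- "correlation" or "comparison"
    normal    -- True if the continuous data are normally distributed
    n_groups  -- number of groups being compared
    paired    -- True if the same subjects appear in both groups
    discrete  -- True if the study uses discrete variables
    """
    if discrete:
        return "Chi-squared"
    if goal == "correlation":
        return "Pearson" if normal else "Spearman rank"
    if goal == "comparison":
        if n_groups == 2:
            if normal:
                return "paired t-test" if paired else "independent t-test"
            return "Wilcoxon" if paired else "Mann-Whitney"
        if n_groups > 2:
            # One variable only; multivariate designs are not covered here.
            return "ANOVA" if normal else "Kruskal-Wallis"
    raise ValueError("design not covered by the cards")


print(choose_test("comparison", normal=False, n_groups=2, paired=True))
print(choose_test("comparison", normal=True, n_groups=3))
```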

What are the 2 types of analysis that can be used to test a hypothesis?

Correlations i.e. hypothesis tests to evaluate relationships between variables
Comparisons i.e. hypothesis tests to evaluate differences between groups or populations

What are the different types of qualitative data and give examples of each?

nominal (unordered) e.g. gender, life status (alive/dead)
ordinal (ordered) e.g. fitness, stages of hypertension

both are non-parametric

What are the different types of quantitative data and give examples of each?

continuous (parametric) e.g. heart rate, age
discrete (non-parametric) e.g. number of males/females in a group, number of people with hypertension

Define discrete data

Discrete data is a count that cannot be made more precise e.g. a family cannot have 2.4 children

Define continuous data

Continuous data can take any value within a range, so it can be divided and reduced to finer and finer levels e.g. height can be measured on progressively more precise scales: metres, centimetres, millimetres etc.

Give an example of a variable that could be measured quantitatively or qualitatively

Eye colour can be measured quantitatively by assessing the RGB scale or qualitatively by categorising into blue, brown or green etc.

Give an example of a variable that could be interpreted as discrete or continuous

Age is a discrete variable if counted in whole years and continuous if measured as the exact age in months, days, hours, minutes or seconds.

Define nominal data

Items that are assigned individual named categories that do not have an implicit or natural value or rank, e.g. gender (male or female) or fracture incidence (yes or no).

Define ordinal data

Items which are assigned to categories that have some implicit or natural order, such as ‘small, medium, or large’.
Ordinal variables are often used to describe a patient’s characteristics e.g. stage of hypertension, pain level, and satisfaction.

Define a normal distribution

Normality describes the central tendency and dispersion of data and is used to decide how to describe the properties of large data-sets, i.e. which descriptive statistics are presented instead of the raw data.

How can you determine whether distribution is normal?

By graphing data in a histogram (frequency distribution plot of the data points from a group or population) or a frequency bar chart

Describe a normal curve

Symmetrical distribution with well-behaved tails i.e. many data points in the central region of the range and a symmetrical distribution on either side.
Also called 'Gaussian' or 'bell curve'

Define skewed data

- left and right?

Asymmetric with many data points in the high or low end of the range and an uneven tail (long on one side and short on the other).

left-skewed distribution = negatively skewed -> long left tail and mean to left of peak
right-skewed distribution = positively skewed -> long right tail and mean to right of peak
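A small sketch of how the direction of skew shows up in numbers, using only the standard library. The data values are invented, and `sample_skewness` is a hypothetical helper that uses the simple moment-based formula (third central moment divided by the second central moment to the power 1.5).

```python
from statistics import mean, median


def sample_skewness(data):
    """Moment-based skewness: positive for a long right tail, negative for a long left tail."""
    n = len(data)
    m = mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


# Hypothetical data: most values are low, one value is far out to the right.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]
# Mirror image: the long tail is now on the left.
left_skewed = [-v for v in right_skewed]

for name, data in [("right", right_skewed), ("left", left_skewed)]:
    print(name, mean(data), median(data), round(sample_skewness(data), 2))
```

In the right-skewed sample the mean is pulled above the median by the long right tail; in the mirrored sample it is pulled below.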

What is kurtosis?

Kurtosis describes whether data are heavy-tailed or light-tailed relative to a normal distribution.

high kurtosis = heavy tails, or outliers that create a very wide distribution
low kurtosis = light tails, or lack of outliers that create a very narrow distribution.
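To make heavy and light tails concrete, the sketch below computes excess kurtosis (fourth central moment divided by the squared second central moment, minus 3, so that a normal distribution scores 0). The helper name and both data sets are made up for illustration.

```python
from statistics import mean


def excess_kurtosis(data):
    """Moment-based excess kurtosis: above 0 is heavier-tailed than normal, below 0 lighter."""
    n = len(data)
    m = mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return m4 / m2 ** 2 - 3


# Hypothetical data: tightly clustered values plus two extreme outliers.
heavy_tailed = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]
# Hypothetical data: evenly spread values with no outliers.
light_tailed = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print(round(excess_kurtosis(heavy_tailed), 2))
print(round(excess_kurtosis(light_tailed), 2))
```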

Why is the distribution of data important for statistical analysis?

The mathematics underpinning most statistical tests relies on the data having a normal distribution (i.e. roughly two-thirds, about 68%, of the data lie within one standard deviation of the mean) and on the distribution being symmetrical (i.e. 50% of the data are above the mean and 50% below).
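The two properties quoted above can be checked against an ideal normal curve with the standard library's `statistics.NormalDist`. This is only a sanity check on the theoretical distribution, not a test of any real data set.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

# Share of the distribution lying within one standard deviation of the mean.
within_one_sd = std_normal.cdf(1) - std_normal.cdf(-1)

# Share of the distribution lying above the mean.
above_mean = 1 - std_normal.cdf(0)

print(round(within_one_sd, 4))
print(above_mean)
```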

What are the statistical tests of normality?

Shapiro-Wilk test: used to test for normality with small sample sizes (n<50)
Kolmogorov-Smirnov test: used to test for normality with large sample sizes (n>50)

n = number of samples in the data set
p-value <0.05 means the data differ significantly from a normal distribution, i.e. they are treated as not normally distributed

What are descriptive statistics?

Raw data is usually presented in the form of descriptive statistics which summarise the data. They categorise large data-sets into a tangible format.

measure of central tendency - mean, mode or median.
measures of dispersion of the data - variance, standard deviation (SD) or standard error (also known as the standard error of the mean, SEM)
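The measures listed above can all be computed with Python's `statistics` module, as in this sketch. The `heart_rates` values are invented, and the standard error of the mean is taken as the sample standard deviation divided by the square root of n.

```python
from math import sqrt
from statistics import mean, median, mode, stdev, variance

# Hypothetical resting heart rates (beats per minute).
heart_rates = [62, 70, 70, 74, 78, 81, 90]

summary = {
    # Measures of central tendency
    "mean": mean(heart_rates),
    "median": median(heart_rates),
    "mode": mode(heart_rates),
    # Measures of dispersion (sample variance and sample SD, n - 1 denominator)
    "variance": variance(heart_rates),
    "sd": stdev(heart_rates),
    # Standard error of the mean: SD divided by the square root of n
    "sem": stdev(heart_rates) / sqrt(len(heart_rates)),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```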

Explain the difference between paired and unpaired data

Paired data is dependent and occurs when each group is composed of the same subjects of interest. Typically, paired observations arise from measuring the same variable in the same subject at different time-points i.e. a longitudinal experiment.

Unpaired data is independent. Each group is composed of different subjects of interest. These observations are seen when comparing two groups with no common factors i.e. a cross-sectional study.
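Why the distinction matters can be seen by computing the t statistic both ways on the same numbers. The sketch below uses only the standard library; the blood-pressure values and function names are hypothetical, the independent version is Student's t with a pooled variance (assuming equal variances), and p-values are not computed.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(before, after):
    """Paired t statistic: based on each subject's own difference."""
    diffs = [a - b for a, b in zip(after, before)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def independent_t(group_a, group_b):
    """Independent (unpaired) t statistic with pooled variance."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical systolic blood pressure in the same five subjects, before and after treatment.
before = [140, 150, 160, 170, 180]
after = [135, 146, 154, 166, 173]

print(round(paired_t(before, after), 2))
print(round(independent_t(after, before), 2))
```

Every subject drops by a similar amount, so the paired statistic is large in magnitude. Treating the same values as two unrelated groups lets the wide spread between subjects swamp that consistent change.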

What is a correlation coefficient?

A correlation coefficient tells you how strong the relationship between two variables is and in which direction it runs. It varies from 1 (perfect positive correlation) to -1 (perfect negative correlation), with 0 indicating no correlation, e.g. Pearson's r, Spearman's rho (ρ).

When are parametric and non-parametric tests used?

Parametric statistics (e.g. t-test, ANOVA) are used when the data is well described by the mean and standard deviation, i.e. quantitative data which is normally distributed. Non-parametric tests (e.g. Mann-Whitney, Wilcoxon signed-rank test) are adopted when the population is not well described by the mean and standard deviation, i.e. when quantitative data is not normally distributed or when data is qualitative.

Why are parametric statistics better than non-parametric?

Parametric tests are easier to understand, and the analyses are more powerful and less likely to incorrectly reject or fail to reject a hypothesis.

What is the most appropriate measure of central tendency for normally distributed data: mean, median or mode?

Mean
Median and mode should only be used for data that is not normally distributed.
Mode is very rarely used.

(28 cards)