8. Medical Statistics 2 Flashcards by Ibby Khan

what is BIVARIATE ANALYSIS

ANALYSIS of the RELATIONSHIP between 2 Variables:
RESPONSE variable and EXPLANATORY variable

statistical methods analyse how the outcome on the RESPONSE VARIABLE DEPENDS ON or is EXPLAINED BY the value of the EXPLANATORY VARIABLE

needed to formally investigate whether observed differences are meaningful or simply a result of random chance

what is the RESPONSE VARIABLE

OUTCOME or DEPENDENT

the one on which COMPARISONS ARE MADE

what is the EXPLANATORY VARIABLE

INDEPENDENT or EXPOSURE

usually DEFINES the 2 GROUPS being COMPARED

STEPS in HYPOTHESIS TESTING

STATE the NULL and ALTERNATIVE HYPOTHESES
DECIDE what STATISTICAL TEST is appropriate
Use the test to CALCULATE the P-VALUE
WEIGH the EVIDENCE AGAINST the NULL
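
The four steps can be sketched in code. This is only an illustration: the exact permutation test is chosen here as an assumption (the cards do not name it), and the data values and the function name `permutation_p_value` are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of relabelling the pooled values into two groups.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count relabellings at least as extreme as the observed difference.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Step 1: H0 = no difference in means; H1 = the means differ.
# Step 2: chosen test = exact permutation test (assumption for this sketch).
treatment = [1, 2, 3, 4]  # hypothetical data
control = [5, 6, 7, 8]    # hypothetical data
# Step 3: calculate the p-value.
p = permutation_p_value(treatment, control)
# Step 4: weigh the evidence; a smaller p is stronger evidence against H0.
print(round(p, 4))
```

Only 2 of the 70 possible relabellings are as extreme as the observed split, so p = 2/70.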

what type of TEST is used for 2 NUMERICAL VARIABLES

CORRELATION / REGRESSION

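Correlation (2-sided association: neither variable is designated as the response)
Simple Linear Regression (1-sided association: the response is predicted from the explanatory variable)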

what type of TEST is used for 2 CATEGORICAL VARIABLES

CHI-SQUARED TESTS

chi-squared test (unpaired)
McNemar test (paired)

what type of TEST is used for 1 CATEGORICAL and 1 NUMERICAL VARIABLE

if 2 GROUPS: T-TEST (paired and unpaired)

if >2 GROUPS: ANOVA (unpaired) & ANOVA for repeated measures (paired)
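
The three "what type of TEST" cards amount to a small decision rule. A minimal sketch, assuming a hypothetical helper `choose_test` whose name, arguments and return strings are made up for illustration:

```python
def choose_test(response, explanatory, paired=False, n_groups=2):
    """Suggest a bivariate test from the two variable types.

    Each type is 'numerical' or 'categorical'. n_groups is only used
    when one variable is categorical and the other numerical.
    """
    types = {response, explanatory}
    if types == {"numerical"}:
        return "correlation / simple linear regression"
    if types == {"categorical"}:
        return "McNemar test" if paired else "chi-squared test"
    if types == {"numerical", "categorical"}:
        if n_groups == 2:
            return "paired t-test" if paired else "unpaired t-test"
        if n_groups > 2:
            return "repeated measures ANOVA" if paired else "ANOVA"
    raise ValueError("unsupported combination of variable types or groups")

print(choose_test("numerical", "categorical", paired=True))
print(choose_test("numerical", "categorical", n_groups=3))
```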

what are DEPENDENT SAMPLES

when the research hypothesis involves comparing the SAME PEOPLE who were measured twice or more often

(PAIRED DATA)

eg. a diet study in which subjects’ weights are measured before and after the diet.
the observations in the first (before diet) and second (after diet) samples are related because they refer to the same person

does analysis of DEPENDENT SAMPLES require the PAIRED or UNPAIRED version of the statistical tests

PAIRED VERSION

what are INDEPENDENT SAMPLES

and is the paired or unpaired version of the respective statistical test used

involve comparison of 2 or more groups who are INDEPENDENT from each other
- DIFFERENT INDIVIDUALS

eg. randomised trial that randomly allocates subjects to 2 treatments

eg. observational study that separates subjects into groups according to their value for an explanatory variable (eg. smoking status)

the UNPAIRED VERSION of the statistical test is used

CORRELATION / REGRESSION

in SCATTERPLOT what is on the HORIZONTAL and VERTICAL AXIS

Horizontal, x : PREDICTOR VARIABLE

Vertical, y : RESPONSE VARIABLE

CORRELATION / REGRESSION

what to look for in SCATTERPLOTS?

DIRECTION of the relationship
- NEGATIVE: as one goes up other goes down
- POSITIVE: as one goes up other also goes up

FORM of the relationship
- LINEAR? or not

STRENGTH of the relationship
- points appear tightly clustered in a single stream or form a vague cloud?

OUTLIERS

CORRELATION / REGRESSION
what measures the STRENGTH of the LINEAR ASSOCIATION between 2 NUMERICAL VARIABLES

CORRELATION COEFFICIENT (r)

r is always between -1 and +1

r>0 POSITIVE correlation
r<0 NEGATIVE correlation

r=0 NO LINEAR correlation (a non-linear relationship may still exist)
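
A minimal sketch of how r could be computed, assuming the usual Pearson formula (the cards do not write it out). The function name `pearson_r` and the data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])  # perfect positive line
r_negative = pearson_r(x, [10, 8, 6, 4, 2])  # perfect negative line
r_none = pearson_r(x, [2, 1, 0, 1, 2])       # symmetric V shape
print(r_positive, r_negative, r_none)
```

The V-shaped data give r = 0 even though y clearly depends on x, because r only measures linear association.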

CORRELATION / REGRESSION

do OUTLIERS AFFECT CORRELATION

Correlation is VERY SENSITIVE to OUTLIERS

an extreme outlier can cause a dramatic change in r
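
A small illustration of that sensitivity, using made-up data and a hypothetical helper `pearson_r`: one extra point turns a perfect positive correlation into a negative one.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Five points lying exactly on the line y = x (hypothetical data).
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 4, 5]
r_without = pearson_r(x, y)

# Same data plus a single extreme outlier at (10, 0).
r_with = pearson_r(x + [10], y + [0])

print(round(r_without, 2), round(r_with, 2))
```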

CORRELATION / REGRESSION

what is COEFFICIENT of DETERMINATION (r^2)

expresses the PROPORTION of the VARIANCE in one variable that is ACCOUNTED FOR or ‘EXPLAINED’ by the variance in the other variable

square of r

eg. a study finds an r=0.40 between salt intake and blood pressure.
It can be concluded that 0.40^2 = 0.16
or 16% of the variance in blood pressure in this study is accounted for by salt intake
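
As a rough check on the "proportion of variance explained" reading, the sketch below compares r^2 with 1 - (residual variation / total variation) for a fitted straight line. The data are hypothetical and the line is fitted by ordinary least squares, which is an assumption of this sketch.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]   # hypothetical predictor values
y = [2, 1, 4, 3, 5]   # hypothetical response values

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)   # total variation in y

r = sxy / sqrt(sxx * syy)

# Least-squares line, then the variation left over around it.
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
residual = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))

explained = 1 - residual / syy
print(round(r ** 2, 2), round(explained, 2))
```

Both numbers agree, which is why r^2 can be read as the share of the variance in y accounted for by x.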

CORRELATION / REGRESSION

what is SIMPLE LINEAR REGRESSION

we model the relationship between 2 QUANTITATIVE variables in such a way that we can PREDICT ONE VARIABLE FROM ANOTHER

it is an APPROXIMATION for the TRUE RELATIONSHIP

the 1ST STEP is to IDENTIFY the RESPONSE and PREDICTOR VARIABLE

RESPONSE VARIABLE (outcome or dependent)
- y variable, on vertical axis of scatterplot

PREDICTOR VARIABLE (explanatory or independent or exposure)
- x variable, on horizontal axis of scatterplot

CORRELATION / REGRESSION

how do REGRESSION and CORRELATION DIFFER

in REGRESSION, we MUST IDENTIFY RESPONSE and EXPLANATORY VARIABLES
CORRELATION does NOT REQUIRE one variable to be DESIGNATED as RESPONSE and the other as PREDICTOR

CORRELATION / REGRESSION

LINE OF BEST FIT equation with REGRESSION COEFFICIENTS B0 and B1

y = b0 + b1x

(y=mx+c)

b0 = INTERCEPT (where line cuts the y axis. value of y when x=0)

b1 = SLOPE (gradient, the change in y for every 1 unit increase in x)
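
A minimal sketch of how b0 and b1 could be estimated, assuming ordinary least squares (the usual method for a line of best fit, though the cards do not name it). The function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Return (b0, b1) of the least-squares line y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx              # slope: change in y per 1 unit increase in x
    b0 = mean_y - b1 * mean_x   # intercept: value of y when x = 0
    return b0, b1

x = [1, 2, 3, 4, 5]   # hypothetical predictor values
y = [2, 1, 4, 3, 5]   # hypothetical response values
b0, b1 = fit_line(x, y)
print(round(b0, 2), round(b1, 2))
```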

CORRELATION / REGRESSION

what does the SLOPE (how much y changes when x increases by 1 unit) DEPEND ON

the UNITS used to measure the variables

we can make the slope as large/small as we want by changing the units

CORRELATION / REGRESSION

what does the SLOPE tell us about ASSOCIATION STRENGTH

it DOES NOT TELL US whether the association is strong or weak

CORRELATION / REGRESSION

does CORRELATION depend on UNITS

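NO - changing the units of x or y does not change r

the correlation is a STANDARDIZED version of the slope (r = b1 x SD of x / SD of y), so it is unit-free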
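
A short sketch of the contrast between slope and correlation under a change of units. The weights and blood pressures are made-up values, and `slope` and `pearson_r` are hypothetical helper names.

```python
from math import sqrt

def _sums(x, y):
    """Sums of squares and cross-products of deviations from the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy, sxx, syy

def slope(x, y):
    sxy, sxx, _ = _sums(x, y)
    return sxy / sxx

def pearson_r(x, y):
    sxy, sxx, syy = _sums(x, y)
    return sxy / sqrt(sxx * syy)

weight_kg = [50, 60, 70, 80, 90]        # hypothetical weights
pressure = [110, 120, 125, 135, 140]    # hypothetical blood pressures
weight_g = [w * 1000 for w in weight_kg]  # same weights in grams

slope_kg = slope(weight_kg, pressure)
slope_g = slope(weight_g, pressure)
r_kg = pearson_r(weight_kg, pressure)
r_g = pearson_r(weight_g, pressure)

# The slope shrinks by a factor of 1000; r is unchanged.
print(slope_kg, slope_g, round(r_kg, 3), round(r_g, 3))
```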

CORRELATION / REGRESSION

when using a REGRESSION model for PREDICTION where can you predict

ONLY WITHIN relevant range of data

do not try to extrapolate beyond the range of observed X’s
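
One way to enforce this rule in code is to refuse predictions outside the observed range of x. A minimal sketch, assuming a least-squares line; `fit_line`, `predict` and the data are hypothetical.

```python
def fit_line(x, y):
    """Return (b0, b1) of the least-squares line y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def predict(x_new, x, y):
    """Predict y at x_new, but only inside the observed range of x."""
    lowest, highest = min(x), max(x)
    if not lowest <= x_new <= highest:
        raise ValueError(
            f"x = {x_new} is outside the observed range [{lowest}, {highest}]"
        )
    b0, b1 = fit_line(x, y)
    return b0 + b1 * x_new

xs = [1, 2, 3, 4, 5]   # hypothetical predictor values
ys = [2, 1, 4, 3, 5]   # hypothetical response values
print(round(predict(3, xs, ys), 2))   # inside the range: allowed
# predict(10, xs, ys) would raise ValueError: extrapolation
```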

CORRELATION / REGRESSION

what does a STRONG CORRELATION between x and y mean

that there is a STRONG LINEAR ASSOCIATION between the 2 variables

do not say that increasing x by one unit ‘causes’ or ‘results in’ a corresponding change in y

what does CHI-SQUARED TEST (X^2) ASSESS

WHETHER 2 CATEGORICAL VARIABLES are ASSOCIATED

indicates HOW CERTAIN we can be that the VARIABLES are ASSOCIATED (NOT how strong the association is)

it compares FREQUENCIES - OBSERVED vs EXPECTED under the null hypothesis that the variables are independent

measures HOW FAR the OBSERVED cell counts in a contingency table FALL FROM the EXPECTED cell counts (for a null hypothesis)

(assumes expected counts more than or equal to 5 in all cells)

CHI SQUARED TEST (X^2) EQUATION

$$X^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$
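
A minimal sketch of the calculation for a contingency table. Expected counts are taken as row total x column total / grand total, which is what independence of the two variables implies. The function name `chi_squared` and the table are hypothetical.

```python
def chi_squared(observed):
    """Chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under the null hypothesis of independence.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (obs - expected) ** 2 / expected
    return statistic

# Hypothetical 2x2 table: rows = exposed / unexposed, columns = disease yes / no.
table = [[20, 30],
         [30, 20]]
x2 = chi_squared(table)
print(x2)
```

Every expected count here is 25, so each of the four cells contributes (5^2)/25 = 1 to the sum.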

what is the MCNEMAR CHI-SQUARED TEST

the equivalent of chi-squared test when we want to COMPARE BEFORE and AFTER findings - PAIRED DATA
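
The cards do not give the McNemar formula. The sketch below assumes the standard version without continuity correction, which uses only the discordant pairs (subjects whose category changed between before and after). The function name and counts are hypothetical.

```python
def mcnemar_statistic(changed_yes_to_no, changed_no_to_yes):
    """McNemar chi-squared statistic (no continuity correction).

    Only discordant pairs are used; subjects whose category did not
    change carry no information about a before/after difference.
    """
    b = changed_yes_to_no
    c = changed_no_to_yes
    if b + c == 0:
        raise ValueError("no discordant pairs, statistic is undefined")
    return (b - c) ** 2 / (b + c)

# Hypothetical paired data: 15 subjects went from yes to no, 5 from no to yes.
stat = mcnemar_statistic(15, 5)
print(stat)
```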

what is a T-TEST (UNPAIRED or INDEPENDENT samples)

ASSESSES WHETHER a NUMERICAL RESPONSE VARIABLE DIFFERS between 2 GROUPS (of different individuals)

COMPARES the MEANS of the 2 groups, using their STANDARD DEVIATIONS to judge the size of the difference
- LIMITED to comparing ONLY 2 GROUPS

T-Test is a PARAMETRIC test and ASSUMES the NUMERICAL VARIABLE is NORMALLY DISTRIBUTED and has EQUAL VARIANCE in both groups
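
A minimal sketch of the test statistic, assuming the equal-variance (pooled) form that matches the assumption stated on this card. The function name `unpaired_t` and the data are hypothetical; turning t into a p-value needs the t distribution and is left out.

```python
from math import sqrt
from statistics import mean, variance

def unpaired_t(group_a, group_b):
    """Return (t, degrees of freedom) for the pooled-variance t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df

group_1 = [1, 2, 3, 4, 5]   # hypothetical measurements, group 1
group_2 = [3, 4, 5, 6, 7]   # hypothetical measurements, group 2
t_stat, dof = unpaired_t(group_1, group_2)
print(round(t_stat, 3), dof)
```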

when is a PAIRED T-TEST used

where the SAME GROUP contributes to REPEATED OBSERVATIONS (paired data) - SAME INDIVIDUALS
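
A minimal sketch of the paired statistic, assuming the standard approach of working on the within-person differences. The before/after weights and the function name `paired_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, degrees of freedom) for a paired t-test."""
    # Each person acts as their own control: analyse the differences.
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1

weight_before = [80, 85, 90, 95, 100]   # hypothetical weights before a diet
weight_after = [78, 84, 87, 93, 98]     # same people after the diet
t_stat, dof = paired_t(weight_before, weight_after)
print(round(t_stat, 3), dof)
```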

what is an ANOVA (ANALYSIS OF VARIANCE) TEST

when testing for SIGNIFICANT DIFFERENCE of a CONTINUOUS VARIABLE between 3 OR MORE GROUPS (t-test limited to only 2 groups)
- only informs us whether there is an OVERALL DIFFERENCE between the MEANS
- does NOT give us specific information about which groups are different
- PARAMETRIC TEST ASSUMING NORMAL DISTRIBUTION and EQUAL VARIANCES
- REPEATED MEASURES ANOVA test is performed when the SAME GROUP has contributed to REPEATED OBSERVATIONS (PAIRED DATA)
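
A minimal sketch of the one-way (unpaired) ANOVA F statistic, assuming the standard between-group / within-group variance ratio. The function name `one_way_anova_f` and the data are hypothetical; the repeated measures version is not covered here.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for independent groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical measurements in three independent groups.
data = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(data)
print(round(f_stat, 2), df_b, df_w)
```

A large F says the group means differ overall, but not which groups differ from each other.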