Study for SPSS Test Flashcards
Correlation
A measure of the degree of association between two variables (x and
y), initially assumed to be numerical
Pearson’s correlation
Investigates the linear relationship between x and y (scatterplot of x and y)
Pearson’s correlation assumptions
Assumes both x and y to be numerical and that at least one is
normally distributed.
Pearson’s correlation H0
There is no linear association in the
population between the two variables X and Y
(ρ=0).
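Example: a minimal sketch of how Pearson’s r is computed from paired data. The function name `pearson_r` and the data are made up for illustration; this is not SPSS output.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient r for paired samples x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data (assumption, not from the course)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 3))
```

A value of r near +1 or -1 suggests a strong linear association; a value near 0 suggests little linear association.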
Spearman’s rank correlation
Investigates any monotonic relationship between x and y
Spearman’s rank correlation assumptions
At least one variable should be continuous (the other could be ordinal) –
they do not have to be normal (ranks)
Spearman’s H0
There is no association in the population
between the two variables X and Y (ρs=0)
Monotonic
As one variable increases, the other consistently increases (or consistently decreases), but not necessarily at a constant rate
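Example: a rough sketch showing that a rank-based correlation picks up a monotonic relationship even when it is not linear. It assumes the rank correlation is computed as Pearson’s r on the ranks, with tied values given their average rank. The helper names and data are illustrative.

```python
import math

def ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson_r(x, y):
    """Pearson's correlation coefficient r for paired samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rs(x, y):
    """Rank correlation: Pearson's r applied to the ranks of x and y."""
    return pearson_r(ranks(x), ranks(y))

# Made-up data: y rises with x, but not at a constant rate
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rs = spearman_rs(x, y)       # perfect monotonic association
r_linear = pearson_r(x, y)   # below 1 because the relationship is curved
print(rs, round(r_linear, 3))
```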
Linear regression
The investigation of the relationship between continuous
variables, X and Y (several X’s for multiple linear regression). It can be used to predict Y
(dependent/response variable) from the X(s) (independent/predictor variable(s))
Simple linear regression equation
Y= β0 + β1X1 + e
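Example: a minimal sketch of estimating β0 and β1 by ordinary least squares, with e taken as the residual for each observation. The function name and data are assumptions for illustration; SPSS reports the same quantities in its Coefficients table.

```python
def fit_simple_linear_regression(x, y):
    """Return least-squares estimates (b0, b1) for Y = b0 + b1*X + e."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1

# Made-up illustrative data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_simple_linear_regression(x, y)

# Residuals e = observed Y minus predicted Y
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(round(b0, 2), round(b1, 2))
```

The residuals are what the assumption checks (histogram, P-P plot, scatterplot against predicted values) are run on.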
Simple linear regression assumptions
* Linear relationship between X(s) and Y – Check by Correlation, scatterplot of
X and Y, OR scatter plot of the Normalised Predicted values vs Dependent
values
* Normal distribution of RESIDUALS – Check by a Histogram of Residuals, or
by inspection of the P-P plots
* Constant variance – Check by a Scatterplot of Residuals vs Normalised
Predicted (sausage shaped, random scatter)
* Independent observations – Check by Scatterplot of Normalised Residuals
vs Predicted again (no pattern or structure)
Simple linear regression fit of model
The coefficient of determination, R², indicates how much of the variation in Y
is explained by the proposed model
Simple linear regression what does ANOVA do
This divides the overall variation into the
variation explained by the regression model and the residual (left over or not
explained) variation and compares them.
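Example: a rough sketch of how the total variation in Y splits into a regression part and a residual part, and how R² and the F statistic follow from that split. It assumes one predictor, so the regression has 1 degree of freedom and the residual has n - 2. Names and data are illustrative.

```python
def anova_for_simple_regression(x, y):
    """Partition the variation in y for a one-predictor least-squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    predicted = [b0 + b1 * a for a in x]

    ss_total = sum((b - mean_y) ** 2 for b in y)                 # overall variation
    ss_residual = sum((b - p) ** 2 for b, p in zip(y, predicted))  # left over
    ss_regression = ss_total - ss_residual                       # explained
    r_squared = ss_regression / ss_total
    # F compares explained variation (1 df) with residual variation (n - 2 df)
    f_stat = (ss_regression / 1) / (ss_residual / (n - 2))
    return {
        "ss_total": ss_total,
        "ss_regression": ss_regression,
        "ss_residual": ss_residual,
        "r_squared": r_squared,
        "f": f_stat,
    }

# Made-up illustrative data
table = anova_for_simple_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(table)
```

A large F means the model explains much more variation than it leaves unexplained.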
SLR ANOVA H0
The relationship between X and Y in the population is not informative. The
regression slope is zero, i.e. H0: β1 = 0 (the intercept β0 is not part of this test)
SLR tests of individual coefficients H0
H0β0: The intercept is zero, i.e. β0 = 0
H0β1: X has no effect on Y, i.e. β1 = 0 (the regression line is flat)
SLR t value for a coefficient to be significant
|t| > 2 (approximate rule of thumb at the 5% level)
Multiple linear regression H0
The relationship between all the X’s and Y in the population is not
informative. All the regression slopes are zero, i.e. β1 = β2 = … = βn = 0
Equation of multiple logistic regression model
logit(p) = β0 + β1X1 + β2X2 + … + βnXn
where logit(p) = ln(p/(1-p)) and p is the probability of the event
How to get odds ratio from logit
Exponential function of the coefficient, i.e. OR = e^β
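Example: a small sketch of turning a logit coefficient into an odds ratio, and a logit back into a probability. The coefficient value is hypothetical, chosen so the odds ratio comes out near 2.

```python
import math

def odds_ratio(beta):
    """Odds ratio for a one-unit increase in X, given its logit coefficient."""
    return math.exp(beta)

def logit(p):
    """Log odds of a probability p (0 < p < 1)."""
    return math.log(p / (1 - p))

def inverse_logit(z):
    """Probability corresponding to a log-odds value z."""
    return 1 / (1 + math.exp(-z))

beta_example = 0.693  # hypothetical coefficient, not from any real model
or_example = odds_ratio(beta_example)
print(round(or_example, 2))
```

A coefficient of 0 gives an odds ratio of 1, meaning no change in the odds.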
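Assumptions for logit regression
None about the distribution of the variables (no normality or constant variance needed); observations should still be independent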
Bland-Altman plot
Analyses agreement between 2 methods of measurement
Both variables should be continuous
Interpretation of Bland-Altman plots
The bias estimate (mean of the differences) shows how big the average discrepancy between the 2 methods is; most differences should lie within the limits of agreement, mean +/- 2SD
Checks whether the variability is consistent across the range of values
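Example: a minimal sketch of the numbers behind a Bland-Altman plot, namely the bias and the limits of agreement. It assumes paired measurements of the same subjects by two methods and uses 1.96 SD (the "2SD" above, unrounded). The data and function name are made up.

```python
import statistics

def bland_altman_summary(method_a, method_b, k=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(differences)   # average discrepancy
    sd = statistics.stdev(differences)    # sample SD of the differences
    return bias, bias - k * sd, bias + k * sd

# Made-up paired readings from two hypothetical methods
method_a = [10, 12, 14, 16, 18]
method_b = [9, 13, 13, 17, 17]
bias, lower, upper = bland_altman_summary(method_a, method_b)
print(round(bias, 3), round(lower, 3), round(upper, 3))
```

On the plot itself, each difference is drawn against the mean of the two readings, with horizontal lines at the bias and at both limits.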
When to use RR
Compare the risk of an event between 2 groups; used in cohort studies and RCTs as a measure of effect
RR H0
Risk of x is the same in both groups
When to use OR
Compare odds of exposure between those with the event and those without the event; used in case-control studies
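Example: a short sketch of RR and OR from a 2x2 table. The layout is an assumption: a and b are the exposed with and without the event, c and d are the unexposed with and without the event. The counts are made up.

```python
def relative_risk(a, b, c, d):
    """Risk of the event in the exposed divided by risk in the unexposed."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed

def odds_ratio_2x2(a, b, c, d):
    """Cross-product ratio: odds of exposure in cases (a/c) over non-cases (b/d)."""
    return (a * d) / (b * c)

# Made-up counts: 20 of 100 exposed and 10 of 100 unexposed had the event
a, b, c, d = 20, 80, 10, 90
rr = relative_risk(a, b, c, d)
odds = odds_ratio_2x2(a, b, c, d)
print(rr, odds)
```

A value of 1 for either measure means no difference between the groups.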
Calculate SE
SE = SD/√n
Calculate CI
95% CI = x̄ +/- 1.96 × SE
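Example: a minimal sketch of the SE and 95% CI calculation for a sample mean, using the sample SD and the 1.96 multiplier from the formula above. The data and function name are illustrative.

```python
import math
import statistics

def mean_with_ci(sample, z=1.96):
    """Return (mean, standard error, lower CI limit, upper CI limit)."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)   # sample standard deviation
    se = sd / math.sqrt(n)          # SE = SD / sqrt(n)
    return mean, se, mean - z * se, mean + z * se

# Made-up sample
sample = [2, 4, 4, 4, 5, 5, 7, 9]
mean, se, ci_low, ci_high = mean_with_ci(sample)
print(mean, round(se, 3), round(ci_low, 3), round(ci_high, 3))
```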