Statistics 2 Flashcards
correlation
the strength of association between two quantitative variables
correlation
describes the extent to which one variable is related to the other
correlation coefficient
Pearson's coefficient QUANTIFIES the strength of the LINEAR association between two quantitative variables
Pearson's coefficient ranges from
-1 to 1
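A minimal sketch of how Pearson's coefficient could be computed by hand, to show why it always falls between -1 and 1. The function name `pearson_r` and the height and weight values are made up for illustration.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's correlation: co-variation of x and y scaled by their spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of x
    dy = y - y.mean()  # deviations from the mean of y
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Made-up data, for illustration only.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 85]

r = pearson_r(heights_cm, weights_kg)
print(round(r, 3))  # close to 1: strong positive linear association
```

A perfectly increasing straight line gives 1, a perfectly decreasing one gives -1.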
if a graph has any curve in it
NOT LINEAR - SHOULD NOT CALCULATE PEARSON'S COEFFICIENT
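A small illustration of why a curved graph is a problem, using made-up data: y is completely determined by x, yet Pearson's coefficient comes out at about zero because the relationship is not linear.

```python
import numpy as np

# Made-up curved (U-shaped) data: y depends entirely on x, but not linearly.
x = np.arange(-3, 4, dtype=float)  # -3, -2, ..., 3
y = x ** 2

r_curved = np.corrcoef(x, y)[0, 1]
print(round(r_curved, 3))  # about 0, despite the perfect (curved) relationship
```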
when data is linear
when the points vary around and on a straight line, with no curve
linear regression
used to describe the linear relationship between a quantitative outcome and one or more predictor variables
linear regression can be used to
estimate mean scores on the outcome for subjects with a specific profile of scores on the predictors
error in predictions
- simple relationship between weight and height
- regression line fitted to data, actual points may not lie on the line - vertical differences are errors (residuals)
- each individual's data point will not necessarily lie on the line
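The points above can be sketched in code, assuming a simple least-squares line fitted with NumPy's `polyfit`. The height and weight values are made up for illustration.

```python
import numpy as np

# Made-up height (cm) and weight (kg) values, for illustration only.
height = np.array([150.0, 160.0, 170.0, 180.0, 190.0])
weight = np.array([52.0, 58.0, 69.0, 77.0, 85.0])

# Fit a straight line: weight = intercept + slope * height
slope, intercept = np.polyfit(height, weight, 1)

predicted = intercept + slope * height  # points on the regression line
residuals = weight - predicted          # vertical differences (errors)

print(np.round(residuals, 2))
```

Each residual is one person's observed value minus the value the line predicts for them. With an intercept in the model, the residuals sum to approximately zero.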
when using linear regression
we must look at the errors (residuals)
residuals must be
normally distributed
- with constant variance
if residuals have constant variance
size of error is unrelated to value of predictor variable
if regression coefficient is 1.4
with each unit increase in the independent (predictor) variable, the dependent (outcome) variable increases by 1.4 on average
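A minimal sketch of what a coefficient of 1.4 means, using made-up data that lie exactly on a line. The intercept of 2.0 and the helper `predict` are assumptions for illustration.

```python
import numpy as np

# Made-up data on an exact line with slope 1.4 (intercept 2.0 is hypothetical).
x = np.arange(0.0, 10.0)
y = 2.0 + 1.4 * x

slope, intercept = np.polyfit(x, y, 1)

def predict(value):
    """Predicted mean outcome for a given predictor value."""
    return intercept + slope * value

# Raising the predictor by one unit changes the predicted outcome by the slope.
difference = predict(6.0) - predict(5.0)
print(round(slope, 2), round(difference, 2))
```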
confounding factors
factors which distort relationships - meaning observed relationships may not be causative
–> sometimes looking at simple correlation will not tell you the whole story
what can help tell the whole story
causal diagrams
-which show all factors in the system
in linear regression what is diagnostic of confounding
differences between adjusted and unadjusted analyses
consequences of confounding
bias in estimates of exposure effect
e.g. stronger or weaker or opposite to true association
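A simulated sketch of how adjusted and unadjusted analyses can differ under confounding. Everything here is an assumption for illustration: the variable names, the effect sizes, and a setup in which the exposure has no true effect on the outcome.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical setup: the confounder drives both exposure and outcome;
# the exposure itself has NO true effect on the outcome.
confounder = rng.normal(size=n)
exposure = confounder + rng.normal(size=n)
outcome = 2.0 * confounder + rng.normal(size=n)

ones = np.ones(n)

# Unadjusted analysis: outcome ~ exposure
X_unadj = np.column_stack([ones, exposure])
coef_unadj, *_ = np.linalg.lstsq(X_unadj, outcome, rcond=None)
unadjusted = coef_unadj[1]

# Adjusted analysis: outcome ~ exposure + confounder
X_adj = np.column_stack([ones, exposure, confounder])
coef_adj, *_ = np.linalg.lstsq(X_adj, outcome, rcond=None)
adjusted = coef_adj[1]

print(round(unadjusted, 2), round(adjusted, 2))
```

In this simulation the unadjusted exposure coefficient is well away from zero while the adjusted one is close to zero. That gap between the two analyses is the sign of confounding.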
multiple regressions use
multiple predictor variables
regression assumptions
- linear relationship
- constant variance of residuals (homoscedasticity)
- normality of residuals
residuals are
the difference between the observed and predicted outcome values - error terms for each person
residuals play an important role in
checking regression assumptions
assumptions for regression must be met to ensure
CIs and P values are valid
–> especially important in small sample sizes
how to check assumptions for regression
using histograms of the residuals
checking for constant variance shows that
there is no relationship between the size of the residuals and the predicted outcomes
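A rough numeric sketch of both checks, on simulated data that meet the assumptions by construction. The sample size, coefficients and the name `spread_vs_fitted` are made up for illustration. In practice these checks are usually done by eye, with a histogram of the residuals and a plot of residuals against predicted values.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Simulated data with normal errors of constant variance (by construction).
x = rng.uniform(0, 10, size=n)
y = 3.0 + 1.4 * x + rng.normal(0, 1, size=n)

slope, intercept = np.polyfit(x, y, 1)
fitted = intercept + slope * x
residuals = y - fitted

# Normality check: histogram counts of the residuals (should look bell-shaped).
counts, bin_edges = np.histogram(residuals, bins=10)

# Constant variance check: the size of the residuals should be unrelated
# to the predicted outcomes, so this correlation should be near zero.
spread_vs_fitted = np.corrcoef(fitted, np.abs(residuals))[0, 1]

print(counts)
print(round(spread_vs_fitted, 3))
```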