Regression analysis Flashcards
What is the most common statistical method used to analyse data from epidemiological studies?
Regression analysis
What is the purpose of a regression analysis?
To assess the association between exposure and outcome while accounting for possible confounders
What does correlation measure?
Strength of linear association between two variables
What is linear regression?
- Nature of association = helps identify which is the outcome and which is the exposure
- More formal description of association between two variables when outcome and exposure can be identified e.g. birth length and childhood height
- Depends on explicitly defining line which best describes association
- Allows estimation of value of outcome per unit change in exposure
In the linear regression line y = a + bx what does each letter represent?
x = exposure
y = outcome
a = y intercept
b = slope of line (= REGRESSION COEFFICIENT)
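A minimal sketch of how a and b can be estimated by ordinary least squares, assuming a small made-up data set (the values of x and y and the helper name fit_line are illustrative only, not from the course material):

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares of the exposure
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope = regression coefficient
    a = mean_y - b * mean_x  # y intercept
    return a, b


# Hypothetical data: x = exposure, y = outcome
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

a, b = fit_line(x, y)
print(f"y = {a:.1f} + {b:.1f}x")
```

Here b is the estimated change in outcome per one-unit change in exposure, and a is the predicted outcome when the exposure is 0.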
What are the limitations of using regression lines as estimates?
They should only be used to make estimates within the range of data on which they were based (we have no information saying that the association is the same outside of this range - it could be totally different!)
What should you be aware of with regression lines?
UNITS
e.g. check whether a change of 1 unit in exposure is actually feasible!
What can you also do with a linear regression line?
Calculate confidence intervals
Perform hypothesis tests for regression coefficients (null hypothesis is that the coefficient value is 0)
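A rough sketch of both calculations for the slope, assuming made-up data and a critical value read from a t table (3.182 is the two-sided 95% value for 3 degrees of freedom; the helper name slope_inference is hypothetical):

```python
from math import sqrt


def slope_inference(x, y, t_crit):
    """Return slope, its standard error, t statistic and confidence interval."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual sum of squares around the fitted line
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope, using n - 2 degrees of freedom
    se_b = sqrt(ss_res / (n - 2) / sxx)
    # Test statistic for the null hypothesis that the coefficient is 0
    t_stat = b / se_b
    ci = (b - t_crit * se_b, b + t_crit * se_b)
    return b, se_b, t_stat, ci


# Hypothetical data: n = 5, so n - 2 = 3 degrees of freedom
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
T_CRIT_DF3 = 3.182  # two-sided 95% critical value, t distribution, 3 df

b, se_b, t_stat, (lower, upper) = slope_inference(x, y, T_CRIT_DF3)
print(f"b = {b:.2f}, SE = {se_b:.3f}, t = {t_stat:.2f}")
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

In this made-up example the confidence interval includes 0, so the data would not give strong evidence against the null hypothesis.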
What is R²?
Correlation coefficient squared
= proportion of variation in outcome explained by variation in exposure
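As an illustration, the two definitions can be checked against each other on a small made-up data set (the data and helper names are assumptions for this sketch): the squared correlation coefficient should equal 1 minus the residual variation divided by the total variation in the outcome.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def explained_proportion(x, y):
    """Proportion of variation in y explained by the line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data: x = exposure, y = outcome
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r2 = explained_proportion(x, y)
print(f"r = {r:.3f}, r squared = {r ** 2:.2f}, explained = {r2:.2f}")
```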
The regression equation derived for the association between chip consumption and serum cholesterol is calculated to be:
Serum cholesterol (mmol per L) = 4.5 + 1 × (100 g portions of chips per week)
If a person eats two 50 g portions per week, how much higher would their predicted cholesterol be than that of someone who ate no chips?
1 mmol per L
Because 4.5 is the intercept, i.e. the predicted serum cholesterol of someone who eats no chips
Two 50 g portions = one 100 g portion, so the extra serum cholesterol is: 1 × (number of 100 g portions per week) = 1 × 1 = 1 mmol per L
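The same arithmetic as a short sketch, using the intercept and slope from the equation above (the function and variable names are made up for illustration):

```python
INTERCEPT = 4.5  # predicted serum cholesterol (mmol per L) with no chips
SLOPE = 1.0      # extra mmol per L for each 100 g portion per week


def predicted_cholesterol(portions_100g_per_week):
    """Predicted serum cholesterol (mmol per L) from the regression line."""
    return INTERCEPT + SLOPE * portions_100g_per_week


# Convert two 50 g portions into the units the equation uses (100 g portions)
grams_per_week = 2 * 50
portions = grams_per_week / 100

difference = predicted_cholesterol(portions) - predicted_cholesterol(0)
print(f"Predicted difference: {difference} mmol per L")
```

The unit conversion is the step that is easy to miss: the exposure has to be expressed in 100 g portions before it is multiplied by the slope.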
Which of the following statements is false?
Both the correlation coefficient and the regression coefficient:
- Take the same value when there is no association between variables
- have the same associated p value when the null hypothesis of no association between variables is tested
- have the same sign (i.e. both +ve & both -ve)
- have units
THEY DO NOT BOTH HAVE UNITS
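A small sketch of why only the regression coefficient has units, assuming made-up data: re-expressing the same exposure in grams instead of 100 g portions changes the regression coefficient by a factor of 100 but leaves the correlation coefficient unchanged.

```python
from math import sqrt


def slope_and_r(x, y):
    """Return (regression coefficient, correlation coefficient)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Hypothetical data: chips eaten per week (grams) and serum cholesterol (mmol per L)
chips_grams = [0, 50, 100, 150, 200]
cholesterol = [4.4, 5.1, 5.4, 6.1, 6.5]

# Same exposure expressed in 100 g portions
chips_portions = [g / 100 for g in chips_grams]

b_grams, r_grams = slope_and_r(chips_grams, cholesterol)
b_portions, r_portions = slope_and_r(chips_portions, cholesterol)

print(f"per gram:          b = {b_grams:.4f}, r = {r_grams:.3f}")
print(f"per 100 g portion: b = {b_portions:.4f}, r = {r_portions:.3f}")
```

Both statistics keep the same sign whichever units are used.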
Correlation vs linear regression:
What do they measure?
What are their units?
What are their maximum and minimum?
What is their value when there is no association?
Does strength of evidence against null hypothesis depend on the statistic chosen (correlation vs. linear regression)?
No
Which provides more information correlation or regression coefficient?
regression coefficient
If exposure and outcome can be identified which is more appropriate correlation or regression coefficient?
Regression coefficient