Bio-statistics linear regression Flashcards

1
Q

The relationship between outcome (Y) and a covariate (X) can be

A

either linear or non‐linear

2
Q

In linear regression, the outcome is ……

and the exposure is ……

A

- The outcome is continuous.

- The exposure can be continuous or categorical.

3
Q

A scatter‐plot can help determine:

A

- Is the relationship between outcome and covariate linear?
- How strong is the relationship?

4
Q

correlation coefficient

A

The relationship can be negative, zero (no linear relationship) or positive.
Its direction and strength are assessed by the correlation coefficient.

5
Q

“Outcome” (Y) other names

A

- Response variable

- Dependent variable

6
Q

“Exposure” (X) other names

A
- Covariate
- Independent variable
- Predictor
- Explanatory variable
- Risk factor
7
Q

Similarities between CORRELATION AND REGRESSION

A

- Create a scatter plot of outcome vs exposure. Observe the pattern.
- Outcome is continuous.
- Exposure: continuous or categorical.
- A hypothesis test is used for both.
- Correlation: r = 0 vs r ≠ 0.
- Regression: B1 = 0 vs B1 ≠ 0 (B1 is the slope).
- AIM: To find out whether there is an association between the chosen exposure and outcome.

8
Q

Differences between CORRELATION AND REGRESSION

A
- Correlation:
  r ranges from -1 to +1.
  Measures the strength of the relationship.
- Regression:
  The B-coefficient can be any value.
  Gives an equation relating outcome and exposure.
  Predicts the value of the outcome from a certain exposure value.
  Two types of regression: simple and multiple.
9
Q

LINEAR REGRESSION – STEPS

A
  1. Graph the data. Check linear relationship.
  2. Calculate correlation coefficient
  3. Do linear regression analysis
  4. Evaluate the model
     - Coefficient of determination (R²)
     - Residual plot
     - Normal probability plot
10
Q

CORRELATION COEFFICIENT

A
- The correlation coefficient, r, quantifies the linear relationship between a pair of variables.
- The correlation coefficient can be between -1 and +1.
- A stats package (GraphPad, SPSS, Stata) is used to obtain “r”.
- Degrees of freedom: n - 2
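
As a rough illustration of what a stats package does to obtain r, here is a minimal sketch in plain Python. The function name `pearson_r` and the data values are made up for illustration and are not part of the flashcards.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient r for two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data (assumption, not from the source).
exposure = [1, 2, 3, 4, 5]
outcome = [2, 4, 5, 4, 5]

r = pearson_r(exposure, outcome)
degrees_of_freedom = len(exposure) - 2
print(round(r, 4), degrees_of_freedom)
```

With five pairs of observations the degrees of freedom are 5 - 2 = 3.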
11
Q

What is the hypothesis test for correlation?

A

- Null: Correlation = 0

- Alternative: Correlation ≠ 0
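
One common way to carry out this test is to convert r into a t statistic with n - 2 degrees of freedom, using t = r × √(n - 2) / √(1 - r²). The sketch below assumes that approach. The values of r and n are made up, and the critical value 3.182 is the standard two-sided 5% t-table value for 3 degrees of freedom.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic and degrees of freedom for testing correlation = 0."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df

# Made-up example: r = 0.7746 observed in n = 5 pairs (assumption).
t, df = correlation_t_statistic(r=0.7746, n=5)

# Two-sided 5% critical value for df = 3, read from a t-table.
critical_t = 3.182
reject_null = abs(t) > critical_t
print(round(t, 2), df, reject_null)
```

Here |t| is smaller than the critical value, so the null hypothesis of zero correlation would not be rejected despite the fairly large r, which reflects the very small sample.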

12
Q

HOW TO INTERPRET A CORRELATION COEFFICIENT?

A

- r < 0.00 (Negative numbers)
Negative relationship. As X increases, Y decreases.
- r > 0.00 (Positive numbers)
Positive relationship. As X increases, Y also increases.

13
Q

Ranges of r (magnitude)

A
- 0 to 0.3 = fairly weak
- 0.3 to 0.7 = fairly strong
- 0.7 to 0.9 = strong
- Above 0.9 = very strong
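
The bands above can be written as a small lookup. This is only a sketch: the function name `describe_strength` is hypothetical, and because the card does not say which band a boundary value belongs to, the code assumes that exactly 0.3 and 0.7 go to the higher band while exactly 0.9 stays "strong".

```python
def describe_strength(r):
    """Label the magnitude of a correlation coefficient, ignoring its sign."""
    size = abs(r)
    if size > 0.9:
        return "very strong"
    if size >= 0.7:   # assumption: 0.7 itself counts as "strong"
        return "strong"
    if size >= 0.3:   # assumption: 0.3 itself counts as "fairly strong"
        return "fairly strong"
    return "fairly weak"

print(describe_strength(-0.95))
print(describe_strength(0.5))
```

The sign of r is dropped first because the bands describe magnitude only. Direction is read separately from whether r is negative or positive.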
14
Q

THREE ASSUMPTIONS OF LINEAR REGRESSION

A

1. The outcome (Y) variable follows a normal distribution.
   - Check by histogram or boxplot.
2. The relationship between outcome (Y) and covariate (X) is linear.
   - Check with a scatterplot.
3. There is constant variance of the outcome across different values of the covariate.
   - Check with a residual plot.

15
Q

Two types of linear regression models:

A

- Simple – one risk factor.

- Multiple – at least two risk factors.

16
Q

Equation of a simple regression line (one x variable):

A
Y = B0 + B1 × X1
B0 = Y-intercept
B1 = slope of the line
X1 = the value of variable “X”
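
To make the equation concrete, the sketch below estimates B0 and B1 by ordinary least squares, which is the usual fitting method although the card does not name it, and then predicts the outcome for a new exposure value. The function name and the data are made up for illustration.

```python
def fit_simple_regression(x, y):
    """Least-squares estimates (b0, b1) for the line Y = B0 + B1 * X1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # Y-intercept
    return b0, b1

# Made-up illustrative data (assumption, not from the source).
exposure = [1, 2, 3, 4, 5]
outcome = [2, 4, 5, 4, 5]

b0, b1 = fit_simple_regression(exposure, outcome)
predicted_at_6 = b0 + b1 * 6   # predicted outcome when X1 = 6
print(round(b0, 2), round(b1, 2), round(predicted_at_6, 2))
```

For these values the fitted slope is 0.6, so each one-unit increase in exposure is associated with a 0.6-unit increase in the predicted outcome.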
17
Q

WHAT DOES B1 REPRESENT?

A

The beta coefficient (B1) represents the change in the outcome variable for every one-unit change in the covariate, that is, the effect of the covariate on the outcome.

18
Q

t‐score follows a t‐distribution with df = n – 2

A

t = B1 / SE, where SE is the standard error of B1
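
A minimal sketch of this calculation follows. It assumes the usual least-squares formula for the standard error of the slope, SE = √(SSE / (n - 2) / Sxx), which the card does not spell out. The function name and data are made up.

```python
import math

def slope_t_statistic(x, y):
    """Return (b1, se_b1, t, df) for a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Sum of squared residuals around the fitted line.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    se_b1 = math.sqrt(sse / df / sxx)
    return b1, se_b1, b1 / se_b1, df

# Made-up illustrative data (assumption, not from the source).
b1, se_b1, t, df = slope_t_statistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(b1, 2), round(se_b1, 4), round(t, 2), df)
```

The resulting t is compared against a t-distribution with n - 2 degrees of freedom to test B1 = 0.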

19
Q

The 95 % Confidence Interval

A

Statistic ± Multiplier × Standard Error

= B1 ± t × SE
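
As a worked example of the formula, the sketch below plugs in made-up values: B1 = 0.6, SE = 0.2828, and t = 3.182 (the two-sided 5% t-table value for 3 degrees of freedom, i.e. n = 5). The function name is hypothetical.

```python
def confidence_interval(b1, se, t_multiplier):
    """Interval b1 ± t_multiplier × se, returned as (lower, upper)."""
    margin = t_multiplier * se
    return b1 - margin, b1 + margin

# Made-up inputs (assumption): slope, its standard error, t-table multiplier.
lower, upper = confidence_interval(b1=0.6, se=0.2828, t_multiplier=3.182)
print(round(lower, 2), round(upper, 2))
```

The interval runs from about -0.30 to 1.50. Because it contains 0, these illustrative data would not give evidence that B1 differs from 0 at the 5% level.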

20
Q

What are the 3 ways to evaluate linear regression model?

A

There are 3 ways to evaluate the linear regression model:
1. Coefficient of determination (R²)
2. Residual plot
3. Normal probability plot
These help evaluate the fit and whether there are any outlier data points.
Outliers can have a large influence on the regression equation.

21
Q

COEFFICIENT OF DETERMINATION (R‐SQUARED)

A

- The coefficient of determination tells us the proportion of variation in the outcome variable that is explained by the covariate(s).
- In simple linear regression it is the square of the correlation coefficient, i.e. R² = r².
- R² can range from 0 to +1. (r ranges from -1 to +1.)
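
The sketch below checks the link between R² and r on made-up data with one covariate. It computes R² as 1 - SSE/SST from a least-squares fit (a standard definition, assumed here) and separately squares the correlation coefficient. The function name is hypothetical.

```python
def r_squared_two_ways(x, y):
    """Return (R² from the fitted line, r² from the correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)   # total variation (SST)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Unexplained variation: squared residuals around the fitted line.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    r2_from_fit = 1 - sse / syy
    r2_from_correlation = sxy ** 2 / (sxx * syy)
    return r2_from_fit, r2_from_correlation

# Made-up illustrative data (assumption, not from the source).
from_fit, from_r = r_squared_two_ways([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(from_fit, 3), round(from_r, 3))
```

Both routes give 0.6 here, meaning about 60% of the variation in the outcome is explained by the exposure in this toy example.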

22
Q

What is the “residual”?

And how to calculate it?

A

- The “error” between the “observed” and the “predicted” value,
i.e. how far away from the “line of best fit” is the point?

Residual = Observed value - Predicted value (from equation)
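
A short sketch of the calculation, using a made-up fitted line Y = 2.2 + 0.6 × X and made-up observations. The function name is hypothetical.

```python
def residuals(x, y, b0, b1):
    """Observed value minus predicted value for each data point."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Made-up data and coefficients (assumption, not from the source).
exposure = [1, 2, 3, 4, 5]
outcome = [2, 4, 5, 4, 5]

res = residuals(exposure, outcome, b0=2.2, b1=0.6)
print([round(e, 2) for e in res])
```

A positive residual means the point lies above the line of best fit and a negative one means it lies below. For a least-squares line the residuals sum to approximately zero.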

23
Q

Residual plot of a linear regression

A

For linear regression:
- The residuals are random.
- They follow a normal distribution.

24
Q

NORMAL PROBABILITY PLOT

A

Why is it done?
- To check if the outcome (Y) variable is normally distributed.
- If the dots follow a straight line, the data are normal.
- If the dots deviate from the line at either tail, the data are skewed.

Not possible in GraphPad.
Can be done in Excel (Data Analysis – Regression).
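
For readers without either tool, the points of a normal probability plot can be computed with Python's standard library. This is a sketch under assumptions: it pairs each sorted value with a standard normal quantile at the plotting position (i + 0.5) / n, which is one common convention and may differ from what Excel uses. The function name and data are made up.

```python
from statistics import NormalDist

def normal_probability_points(values):
    """Pair each sorted value with its theoretical standard normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i + 0.5) / n for i = 0 .. n-1 (assumed convention).
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Made-up outcome values (assumption, not from the source).
points = normal_probability_points([2, 4, 5, 4, 5])
for quantile, value in points:
    print(round(quantile, 2), value)
```

Plotting the second column against the first gives the dots described above. If they fall close to a straight line, the values are consistent with a normal distribution.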

25
Q

Stepwise regression

A

A method of selecting the significant factors to include in a multiple regression model.