Biostatistics: Linear Regression Flashcards

Question 1

Q

The relationship between outcome (Y) and a covariate (X) can be

Answer

A

either linear or non‐linear

Question 2

Q

In linear regression, the outcome is ……

and the exposure is ……

Answer

A

• The outcome is continuous.

• The exposure can be continuous or categorical.

Question 3

Q

A scatter‐plot can help determine:

Answer

A

• Is the relationship between outcome and covariate linear?
• How strong is the relationship?

Question 4

Q

correlation coefficient

Answer

A

The relationship can be negative, zero or positive.
Its direction and strength are assessed by the correlation coefficient.

Question 5

Q

“Outcome” (Y) other names

Answer

A

• Response variable

• Dependent variable

Question 6

Q

“Exposure” (X) other names

Answer

A

• Covariate
• Independent variable
• Predictor
• Explanatory variable
• Risk factor

Question 7

Q

Similarities between CORRELATION AND REGRESSION

Answer

A

• Create a scatter plot of outcome vs exposure. Observe the pattern.
• Outcome is continuous.
• Exposure: continuous or categorical.
• Hypothesis test used for both.
    • Correlation: r = 0 vs r ≠ 0.
    • Regression: B = 0 vs B ≠ 0.
• AIM: To find if there is an association between the chosen exposure and outcome.

Question 8

Q

Differences between CORRELATION AND REGRESSION

Answer

A

• Correlation:
    • r ranges from -1 to +1.
    • Strength of relationship.
• Regression:
    • B-coefficient can be any value.
    • Equation relating outcome & exposure.
    • Predict the value of the outcome from a certain exposure value.
    • Two types of regression: simple and multiple.

Question 9

Q

LINEAR REGRESSION – STEPS

Answer

A

1. Graph the data. Check for a linear relationship.
2. Calculate the correlation coefficient.
3. Do the linear regression analysis.
4. Evaluate the model:
    • Coefficient of determination (R²)
    • Residual plot
    • Normal probability plot

Question 10

Q

CORRELATION COEFFICIENT

Answer

A

• The correlation coefficient, r, quantifies the linear relationship between a pair of variables.
• The correlation coefficient can be between -1 and +1.
• A statistics package (GraphPad, SPSS, Stata) is used to obtain r.
Degrees of freedom: n ‐ 2
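
As a rough illustration of what a statistics package does to obtain r, the Python sketch below computes the Pearson correlation coefficient by hand. The data values and the function name `pearson_r` are made up for this example and are not part of the original cards.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, made up for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
df = len(x) - 2  # degrees of freedom: n - 2
print(round(r, 4), df)  # 0.7746 3
```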

Question 11

Q

What is the hypothesis test for correlation?

Answer

A

• Null: Correlation = 0

• Alternative: Correlation ≠ 0
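
One common way to carry out this test (a sketch, not necessarily the method used in the course) converts r into a t-score, t = r × √(n − 2) / √(1 − r²), and compares it with a critical value from a t-table with n − 2 degrees of freedom. The values of r and n, the function name, and the critical value 2.060 (two-sided 5% level, 25 df) are illustrative assumptions.

```python
import math

def correlation_t_score(r, n):
    """t-score for testing the null hypothesis that the correlation is 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical sample: r = 0.5 observed in n = 27 subjects
r, n = 0.5, 27
t = correlation_t_score(r, n)

# Assumed critical value from a standard t-table: two-sided 5% level, df = 25
t_critical = 2.060

reject_null = abs(t) > t_critical
print(round(t, 3), reject_null)  # 2.887 True
```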

Question 12

Q

HOW TO INTERPRET A CORRELATION COEFFICIENT?

Answer

A

• r < 0 (negative numbers): negative relationship. As X increases, Y decreases.
• r > 0 (positive numbers): positive relationship. As X increases, Y also increases.

Question 13

Q

Ranges of r (magnitude)

Answer

A

Ranges of r (magnitude)
• 0 to 0.3 = fairly weak
• 0.3 to 0.7 = fairly strong
• 0.7 to 0.9 = strong
• Above 0.9 = very strong
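
The ranges above could be turned into a small lookup, sketched here in Python. The card does not say which band a boundary value such as 0.3 or 0.7 belongs to, so placing each boundary in the higher band (with 0.9 counted as "strong") is an assumption, as is the function name `describe_strength`.

```python
def describe_strength(r):
    """Label the magnitude of r using the ranges on this card."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    magnitude = abs(r)  # the sign gives the direction, not the strength
    # Assumption: 0.3 and 0.7 fall in the higher band; 0.9 counts as "strong"
    if magnitude < 0.3:
        return "fairly weak"
    if magnitude < 0.7:
        return "fairly strong"
    if magnitude <= 0.9:
        return "strong"
    return "very strong"

print(describe_strength(-0.85))  # strong
print(describe_strength(0.12))   # fairly weak
```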

Question 14

Q

THREE ASSUMPTIONS OF LINEAR REGRESSION

Answer

A

1. The outcome (Y) variable follows a normal distribution.
    • Check with a histogram or boxplot.
2. The relationship between outcome (Y) and covariate (X) is linear.
    • Check with a scatterplot.
3. There is constant variance of the outcome across different values of the covariate.
    • Check with a residual plot.

Question 15

Q

Two types of linear regression models:

Answer

A

• Simple: one risk factor.

• Multiple: at least two risk factors.

Question 16

Q

Equation of a simple regression line (one x variable):

Answer

A

Y = B0 + B1 × X1
B0 = Y-intercept
B1 = slope of the line
X1 = the value of variable X
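
A minimal sketch of how B0 and B1 might be estimated, assuming the usual least-squares "line of best fit": B1 = Sxy / Sxx and B0 = mean(Y) − B1 × mean(X). The data and the function name `fit_simple_regression` are hypothetical.

```python
def fit_simple_regression(x, y):
    """Least-squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # Y-intercept
    return b0, b1

# Hypothetical data, made up for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_simple_regression(x, y)
predicted = b0 + b1 * 6  # predicted outcome at an exposure value of 6
print(round(b0, 2), round(b1, 2), round(predicted, 2))  # 2.2 0.6 5.8
```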

Question 17

Q

WHAT DOES B1 REPRESENT?

Answer

A

The beta coefficient (B1) represents the amount of change in the outcome variable for every one-unit change in the covariate, that is, the effect of the covariate on the outcome.

Question 18

Q

t‐score follows a t‐distribution with df = n – 2
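
A minimal sketch of where such a t-score could come from, assuming the usual test of the slope: t = B1 / SE(B1), with SE(B1) estimated from the residuals using n − 2 degrees of freedom. The data and the function name `slope_t_score` are hypothetical.

```python
import math

def slope_t_score(x, y):
    """Return (b1, se_b1, t) for a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Residual sum of squares; dividing by df = n - 2 estimates the error variance
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(ss_res / (n - 2) / sxx)
    return b1, se_b1, b1 / se_b1

# Hypothetical data, made up for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b1, se_b1, t = slope_t_score(x, y)
print(round(b1, 4), round(se_b1, 4), round(t, 4))  # 0.6 0.2828 2.1213
```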

Question 19

Q

The 95 % Confidence Interval

Answer

A

Statistic ± Multiplier × Standard Error

= B1 ± t × SE(B1)
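
The formula can be sketched in a few lines of Python. The slope, its standard error, and the multiplier 3.182 (two-sided 95% value from a t-table with df = 3) are assumed values for illustration, and `confidence_interval` is a hypothetical helper name.

```python
def confidence_interval(estimate, standard_error, multiplier):
    """Statistic ± multiplier × standard error, returned as (lower, upper)."""
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin

# Hypothetical slope and standard error from a regression on n = 5 points
b1, se_b1 = 0.6, 0.2828
# Assumed multiplier from a t-table: two-sided 95%, df = n - 2 = 3
t_multiplier = 3.182

lower, upper = confidence_interval(b1, se_b1, t_multiplier)
print(round(lower, 2), round(upper, 2))  # -0.3 1.5
```

Because this interval contains 0, these made-up numbers would not reject B1 = 0 at the 5% level.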

Question 20

Q

What are the 3 ways to evaluate linear regression model?

Answer

A

There are 3 ways to evaluate the linear regression model:
1. Coefficient of determination (R²)
2. Residual plot
3. Normal probability plot
These help to identify any outlier data points.
Outliers can have a large influence on the regression equation.

Question 21

Q

COEFFICIENT OF DETERMINATION (R‐SQUARED)

Answer

A

• The coefficient of determination tells us the proportion of variation in the outcome variable that is explained by the covariate(s).
• In simple linear regression it is the square of the correlation coefficient, i.e. R² = r².
• R² can range from 0 to +1. (r ranges from -1 to +1.)
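
As a quick numerical check (a sketch on made-up data, assuming a least-squares fit with one covariate), R² computed as the explained proportion of variation, 1 − SS_residual / SS_total, matches the square of r.

```python
import math

# Hypothetical data, made up for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)  # total variation in the outcome
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least-squares line
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Variation left unexplained by the line
ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - ss_res / syy
r = sxy / math.sqrt(sxx * syy)
print(round(r_squared, 4), round(r ** 2, 4))  # 0.6 0.6
```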

Question 22

Q

What is the “residual”?

And how to calculate it?

Answer

A

• The “error” between the observed value and the predicted value,
i.e. how far away from the “line of best fit” is the point?

Residual = Observed value - Predicted value (from the regression equation)
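
A short Python sketch of the calculation. The data and the fitted coefficients B0 = 2.2 and B1 = 0.6 are hypothetical values chosen for illustration.

```python
# Hypothetical observations and an assumed fitted line y = 2.2 + 0.6 * x
x = [1, 2, 3, 4, 5]
observed = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

predicted = [b0 + b1 * xi for xi in x]

# Residual = observed value - predicted value
residuals = [obs - pred for obs, pred in zip(observed, predicted)]
print([round(res, 2) for res in residuals])  # [-0.8, 0.6, 1.0, -0.6, -0.2]
```

Positive residuals are points above the line and negative residuals are points below it.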

Question 23

Q

Residual plot of a linear regression

Answer

A

For linear regression:
• The residuals are random.
• They follow a normal distribution.

Question 24

Q

NORMAL PROBABILITY PLOT

Answer

A

Why is it done?
• To check if the outcome (Y) variable is normally distributed.
• If the dots follow a straight line, the data are normal.
• If the dots stray from the line at either tail, the data are skewed.

Not possible in GraphPad.
Can be done in Excel (Data Analysis – Regression)
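
For readers without Excel, the points of a normal probability plot could also be computed with Python's standard library, as sketched below. Each sorted value is paired with the standard normal quantile it would be expected to match. The plotting position (i − 0.5) / n is one common convention and an assumption here, as are the data and the function name `normal_plot_points`.

```python
from statistics import NormalDist

def normal_plot_points(values):
    """Pair each sorted value with its theoretical standard normal quantile."""
    n = len(values)
    ordered = sorted(values)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting position (i - 0.5) / n is one common convention (an assumption)
    quantiles = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

# Hypothetical values, made up for illustration only
data = [-0.8, 0.6, 1.0, -0.6, -0.2]

points = normal_plot_points(data)
for quantile, value in points:
    print(f"{quantile:6.3f}  {value:5.2f}")
```

Plotting the second column against the first gives the dots described above; roughly normal data should fall close to a straight line.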

Question 25

Q

Stepwise regression

Answer

A

A method of selecting the significant factors in a multiple regression model.