Regression & Correlation Flashcards

1
Q

What does correlation refer to?

A

Degree to which two quantitative variables are related

2
Q

What is commonly used to measure correlation in quantitative parametric data?

A

Pearson's correlation coefficient

3
Q

Between what values does the correlation coefficient ‘r’ vary?

A

-1 to +1

4
Q

Units of ‘r’?

A

None
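
A minimal sketch of how r can be computed, to illustrate its range and its lack of units. The function name pearson_r and the height/weight figures are hypothetical, chosen only for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired observations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: height in metres and weight in kilograms.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weights_kg = [52.0, 58.0, 69.0, 74.0, 85.0]

r_metres = pearson_r(heights_m, weights_kg)
# Changing the unit of height (metres to centimetres) leaves r unchanged.
r_centimetres = pearson_r([h * 100 for h in heights_m], weights_kg)
```

Because the numerator and denominator carry the same units, they cancel, which is why r is a pure number.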

5
Q

When is the correlation coefficient not valid?

A

If the observations are not independent of each other (e.g. paired data)

6
Q

When can Fisher's transformation be used?

A

To compare two correlation coefficients for hypothesis testing
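
A rough sketch of such a comparison, assuming the two coefficients come from two independent samples. The function name compare_correlations is hypothetical; the transformation itself is z = atanh(r), with standard error sqrt(1/(n - 3)) for each sample.

```python
from math import atanh, erfc, sqrt


def compare_correlations(r1, n1, r2, n2):
    """Test H0: the two population correlations are equal.

    Assumes independent samples of size n1 and n2 (each greater than 3).
    Returns the z statistic and a two-sided p-value.
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each sample needs more than 3 observations")
    z1 = atanh(r1)  # Fisher's transformation of r1
    z2 = atanh(r2)  # Fisher's transformation of r2
    standard_error = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


z_same, p_same = compare_correlations(0.5, 50, 0.5, 60)
z_diff, p_diff = compare_correlations(0.8, 100, 0.2, 100)
```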

7
Q

What are Partial correlations?

A

Correlations between two variables after adjusting for a third variable
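
As an illustration, a first-order partial correlation can be computed from the three pairwise correlations. This is a sketch only; the function and argument names are hypothetical.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after adjusting for a third variable z."""
    denominator = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("undefined when z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


# If z is unrelated to both x and y, adjusting for it changes nothing.
unchanged = partial_correlation(0.5, 0.0, 0.0)
# If the x-y correlation is fully accounted for by z, it drops to zero.
explained_away = partial_correlation(0.64, 0.8, 0.8)
```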

8
Q

What is Spearman's correlation (rho)?

A

Non-parametric equivalent of Pearson's
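
One way to see the link with Pearson's coefficient: rho is Pearson's r applied to the ranks of the data rather than to the raw values. The sketch below assumes average ranks for ties; the helper names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank the values from 1 upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but non-linear relationship (y = x cubed).
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
```

Here rho reaches 1 because the ordering is preserved perfectly, whereas Pearson's r stays below 1 because the relationship is not a straight line.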

9
Q

What can Spearman's be used for?

A

To test the association between two variables if at least one is ordinal, or
if the sample size is small even though the variables are continuous, or
if non-linearity is suspected, or
if a non-normal distribution is noted for both variables

10
Q

What does Spearman's assume?

A

Difference between each pair of ordinal variables is the same i.e. the ranks are equidistant

11
Q

If the difference between each pair of ordinal variables is not the same, how can one calculate correlation?

A

Kendall's tau - an appropriate measure of nonparametric correlation
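
A small sketch of the idea: tau only counts whether each pair of observations is ordered the same way on both variables, so it does not depend on the spacing between ranks. The version below is tau-a, which makes no correction for ties; that simplification and the function name are assumptions of this example.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1  # the pair is ordered the same way on x and y
        elif product < 0:
            discordant += 1  # the pair is ordered in opposite ways
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# One swapped pair out of six: 5 concordant, 1 discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```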

12
Q

What do regression statistics help with?

A

Helps predict what value one variable will be if given a particular value of the other variable

13
Q

Explain the formula for simple linear regression

A

y = a + bx

b = regression coefficient
a = intercept on y axis

14
Q

What can simple linear regression predict?

A

The probable score on the Y axis from a known score on the X axis, i.e. the dependent variable can be predicted from the value of the independent variable

15
Q

How does one determine the value of a and b for regression?

A

Using a scattergram and method of least squares

16
Q

Explain the method of using a scattergram and method of least squares

A

A hypothetical straight line is constructed so that its vertical distance from the observed points on a scattergram is kept to a minimum; each such vertical distance is called a residual.
The sum of the squares of the residuals is kept to a minimum for a regression line of good fit
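
A minimal sketch of the least squares calculation for y = a + bx, using the closed-form solution b = Sxy / Sxx and a = mean(y) - b * mean(x). The data points and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope (regression coefficient)
    a = mean_y - b * mean_x  # intercept on the y axis
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)
# Any other line, e.g. one with a shifted intercept or slope, fits worse.
shifted_intercept = sum_squared_residuals(xs, ys, a + 0.1, b)
shifted_slope = sum_squared_residuals(xs, ys, a, b + 0.1)
```

Once a and b are known, a prediction for a new x is simply a + b * x.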

17
Q

What happens in multiple linear regression?

A

Several independent variables together predict a single dependent variable
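
A rough sketch of fitting y = a + b1*x1 + b2*x2 by least squares, solving the normal equations with plain Gaussian elimination. In practice a statistics package would be used; the helper names and the data here are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * unknowns = rhs by Gauss-Jordan elimination."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (perfectly collinear covariates?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_multiple(rows, ys):
    """Least squares coefficients [a, b1, b2, ...] for several predictors."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coefficients = fit_multiple(rows, ys)
```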

18
Q

What type of technique is multiple regression?

A

Multivariate

19
Q

What is the name of the independent variables in multiple regression?

A

Covariates

20
Q

What is it called when covariates are highly correlated with each other?

A

Collinearity

21
Q

Effect of collinearity?

A

May disturb the regression
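
One common way to quantify this disturbance is the variance inflation factor (VIF), which is not mentioned in these cards and is brought in here only as an illustration. For a model with two covariates, VIF = 1 / (1 - r squared), where r is the correlation between the two covariates. The function name is hypothetical.

```python
def variance_inflation_factor(r_between_covariates):
    """VIF for a two-covariate model, given the correlation between them."""
    r_squared = r_between_covariates ** 2
    if r_squared >= 1:
        raise ValueError("perfect collinearity: the coefficients are not estimable")
    return 1 / (1 - r_squared)


no_collinearity = variance_inflation_factor(0.0)      # variance not inflated
strong_collinearity = variance_inflation_factor(0.9)  # variance inflated about 5-fold
```

The larger the VIF, the less precisely each regression coefficient is estimated.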

22
Q

When is regression coefficient useful?

A

To examine confounders

23
Q

What can be used to express correlation coefficients?

24
Q

What does the square of the correlation coefficient test?

A

Goodness of fit of the final regression

25
What is goodness of fit?
Proportion of total variation in dependent variable that can be explained by the independent variable
26
What does goodness of fit measure?
How well actual outcome (dependent variable) and calculated dependent variable correspond
27
Values of goodness of fit
0 to 1
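
A sketch of goodness of fit as the proportion of variation explained, R squared = 1 - SS(residual) / SS(total). For a simple linear regression this equals the square of Pearson's r, which the example checks. The data and function names are hypothetical.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least squares intercept a and slope b for y = a + b * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def r_squared(xs, ys):
    """Proportion of the total variation in y explained by the fitted line."""
    a, b = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
goodness_of_fit = r_squared(xs, ys)
r = pearson_r(xs, ys)
```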
28
What is the coefficient of determination?
Goodness of fit calculated for Pearson's correlation coefficient (r squared)
29
What is needed for linear regression?
Continuous dependent variable
30
What type of regression is used if dependent variable is binary?
Logistic regression
31
Why is logistic regression popular?
It can give OR, RR and hazard ratio for independent variables that affect the dependent variable
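A small sketch of the odds ratio part only. With a single binary independent variable, the fitted logistic model reproduces the observed proportions of a 2x2 table, so its coefficients can be written down directly and exp(slope) is the odds ratio. The counts and function names below are hypothetical.

```python
from math import exp, log


def logistic_from_2x2(exposed_events, exposed_nonevents,
                      unexposed_events, unexposed_nonevents):
    """Intercept and slope of log-odds = intercept + slope * exposed."""
    intercept = log(unexposed_events / unexposed_nonevents)
    slope = log(exposed_events / exposed_nonevents) - intercept
    return intercept, slope


def predicted_probability(intercept, slope, exposed):
    """Probability of the event for exposed = 1 or exposed = 0."""
    return 1 / (1 + exp(-(intercept + slope * exposed)))


# Hypothetical table: 20/100 events if exposed, 10/100 if unexposed.
intercept, slope = logistic_from_2x2(20, 80, 10, 90)
odds_ratio = exp(slope)  # (20 * 90) / (80 * 10)
```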
32
Test used if data are continuous, with 1 independent and 1 dependent variable
Simple linear regression
33
Regression test used if continuous data with >1 independent variable and 1 dependent variable
Multiple linear regression
34
Regression test used if binary data, 1 independent variable and 1 dependent variable
Simple logistic regression
35
Test used if binary data, >1 independent variable and 1 dependent variable
Multiple logistic regression
36
What is log-linear analysis used for?
Categorical data
37
What must be noted in log-linear analysis?
No demarcation between the dependent and independent variable
38
What data can be used in logistic regression?
Continuous and categorical independent variables
39
What are Bernoulli random variables?
Variables that have dichotomous outcomes used in logistic regression
40
What is exponential correlation?
When one demonstrates the exponential relationship of a variable with a factor such as time using log-transformed values plotted against time
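As a sketch of the log-transform idea: if y = y0 * exp(k * t), then log(y) is a straight line in t with slope k and intercept log(y0), so an ordinary least squares line through the log-transformed values recovers both. The names and the synthetic data are hypothetical.

```python
from math import exp, log


def fit_exponential(times, values):
    """Fit y = y0 * exp(k * t) via a straight line through log(y) against t."""
    logs = [log(v) for v in values]  # requires all values to be positive
    n = len(times)
    mean_t = sum(times) / n
    mean_log = sum(logs) / n
    stl = sum((t - mean_t) * (l - mean_log) for t, l in zip(times, logs))
    stt = sum((t - mean_t) ** 2 for t in times)
    k = stl / stt                    # slope of the log-transformed line
    y0 = exp(mean_log - k * mean_t)  # back-transformed intercept
    return y0, k


# Synthetic values generated exactly from y = 3 * exp(0.5 * t).
times = [0, 1, 2, 3, 4]
values = [3 * exp(0.5 * t) for t in times]
y0, k = fit_exponential(times, values)
```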
41
What is polynomial regression?
When, in the presence of non-linearity, the relationship between the dependent variable (y) and the independent variable (x) is expressed as a power of x, e.g. y = x^n (x squared when n = 2)
42
What is the 1 in 10 rule?
Number of variables studied in multiple regression models must not be greater than 10% of sample size.
43
1 in 10 rule for logistic regression?
Number of variables must not be greater than 10% of number of events
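Both versions of the rule reduce to simple arithmetic, sketched here with hypothetical function names and figures.

```python
def max_variables_multiple_regression(sample_size):
    """1 in 10 rule: at most 10% of the sample size."""
    return sample_size // 10


def max_variables_logistic_regression(number_of_events):
    """1 in 10 rule for logistic regression: at most 10% of the events."""
    return number_of_events // 10


# A hypothetical study of 200 subjects in which 45 had the outcome event.
linear_limit = max_variables_multiple_regression(200)   # 20 variables
logistic_limit = max_variables_logistic_regression(45)  # 4 variables
```

Note that for logistic regression the limit is driven by the number of events, not by the total sample size.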
44
How can multiple regression be performed?
Stepwise regression, forward selection, backward elimination
45
What happens in stepwise regression?
The regression coefficients are calculated, and the independent variables are fitted into the regression equation in stepwise fashion, from the most significant to the least significant.
46
Disadvantage of stepwise regression
Sometimes statistically significant variables may not be clinically significant
47
Theory behind forward selection
A confounding factor is associated with both the independent and the dependent variable. If one does not know which variable is the confounder, the candidate variables are treated as covariates.
48
What does one often examine in multiple regression?
Which is the confounding variable
49
What happens in forward selection?
While constructing multiple regression equations, if the regression coefficient of a previously added variable changes when a new covariate is added, then one of the covariates is a confounder; such covariates are retained in the equation irrespective of statistical significance. The later-added covariate is discarded if no change occurs in the regression coefficient.
50
What is backward elimination?
Starts with final model - full equation - and tries to discard covariates one by one according to changes that occur in correlation coefficients
51
In the equation y=a+bx+e what is y?
Dependent variable - outcome of interest
52
In the equation y=a+bx+e what are a and b?
Constants
53
In the equation y=a+bx+e what is b?
Slope or regression coefficient
54
In the equation y=a+bx+e what is x?
Independent variable - predictor of outcome
55
In the equation y=a+bx+e what is e?
Error. Random variable with mean = 0
56
In the equation y=a+bx+e what does e represent?
Part of variability of Y which is not explained by relationship with x
57
What can method of least squares be used for with respect to e (error)?
We can find the best linear regression equation, with minimum variance of e
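A short sketch of these properties of e: for a least squares line with an intercept, the errors average to zero, and their variance is smaller than for a line with any other slope. The data and names are hypothetical.

```python
def fit_line(xs, ys):
    """Least squares intercept a and slope b for y = a + b * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def errors(xs, ys, a, b):
    """e = y - (a + b * x) for every observation."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


xs = [0, 1, 2, 3, 4, 5]
ys = [1.2, 2.9, 5.1, 7.2, 8.8, 11.1]

a, b = fit_line(xs, ys)
e = errors(xs, ys, a, b)
mean_error = sum(e) / len(e)
best_variance = variance(e)
# A line with a different slope leaves more unexplained variability in y.
other_variance = variance(errors(xs, ys, a, b + 0.2))
```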