Regression and Correlation Flashcards

1
Q

Correlation

A
  • degree to which two quantitative variables are related
  • does not imply causation

2
Q

Pearson’s correlation coefficient

A
  • commonly used measure of linear association for quantitative parametric data
  • the coefficient (r) ranges from -1 to +1
  • has no units
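
As a rough illustration of the points above, the coefficient can be computed from deviations about the two means. This is a minimal sketch using only the standard library; the function name `pearson_r` and the data are hypothetical and not part of the original cards.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours of study and test score
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]
r = pearson_r(hours, score)

# Changing the unit of measurement (hours to minutes) leaves r unchanged,
# up to floating-point rounding, which is why r has no units.
r_minutes = pearson_r([h * 60 for h in hours], score)
```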
3
Q

Regression

A

-predicts the value of a dependent variable from the value of a correlated independent variable

4
Q

Regression coefficient

A
  • y = a + bx

- b is the regression coefficient (the slope of the line) and a is the intercept on the y axis

5
Q

Fisher’s transformation

A

-converts a correlation coefficient r into a z value that is approximately normally distributed
-may be used to compare two correlation coefficients in hypothesis testing
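
One common way to carry out such a comparison is sketched below. It assumes the two correlations come from independent samples of sizes n1 and n2 and uses the usual standard error sqrt(1/(n1-3) + 1/(n2-3)). The function names and the numbers are hypothetical.

```python
import math

def fisher_z(r):
    """Fisher's z-transformation of a correlation coefficient r (|r| < 1)."""
    return math.atanh(r)  # equals 0.5 * ln((1 + r) / (1 - r))

def compare_correlations(r1, n1, r2, n2):
    """Two-sided z-test for the difference between two independent correlations."""
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (fisher_z(r1) - fisher_z(r2)) / se
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical example: r = 0.60 in 50 people versus r = 0.30 in 60 people
z_stat, p_value = compare_correlations(0.60, 50, 0.30, 60)
```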

6
Q

Partial correlations

A

-correlations between two variables after adjusting for a third variable
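
For a single adjusting variable, the first-order partial correlation can be worked out from the three pairwise correlations. The sketch below assumes those pairwise values are already known; the function name and the numbers are hypothetical.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, adjusting for z."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical pairwise correlations: x and y both correlate with z
r_adjusted = partial_correlation(r_xy=0.50, r_xz=0.60, r_yz=0.70)
```

In this made-up case the adjusted value is smaller than the raw 0.50, because part of the x-y association runs through z.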

7
Q

Spearman’s correlation

A

(rho)

  • non-parametric equivalent of Pearson’s correlation coefficient
  • used to test the association of variables if at least one is ordinal (ranked)
  • assumes ranks are equidistant
  • if this is not true, Kendall’s tau may be used instead
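
Spearman's rho can be obtained by ranking each variable and then applying Pearson's formula to the ranks. The sketch below takes that route and gives tied values the average of their ranks, which is an assumption about tie handling. All function names and data are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the ranks of the data."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: a monotonic but non-linear relationship
rho = spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
```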
8
Q

How to calculate the values of a and b in the regression equation

A
  • done using a scattergram and the ‘method of least squares’
  • vertical lines are drawn from each point on the scattergram to the line of best fit
  • these distances are called residuals; a and b are chosen so that the sum of their squares is as small as possible
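
The calculation described above can be sketched with the closed-form least-squares formulas for a straight line y = a + bx. The function names and the data points are hypothetical.

```python
def least_squares_fit(xs, ys):
    """Return (a, b) for the least-squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: cross-products of deviations over squared deviations of x
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    # Intercept: the fitted line passes through the point of means
    a = mean_y - b * mean_x
    return a, b

def residuals(xs, ys, a, b):
    """Vertical distance from each observed point to the line."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical scattergram points
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = least_squares_fit(xs, ys)
res = residuals(xs, ys, a, b)
sum_of_squares = sum(r ** 2 for r in res)
```

Any other choice of a or b gives a larger sum of squared residuals for the same points.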
9
Q

Multiple linear regression

A
  • several independent variables together predict a single dependent variable
  • multivariate technique
  • the independent variables are called covariates
10
Q

Collinearity

A
  • when two of the covariates studied are highly correlated with each other
  • may make the estimated regression coefficients unstable and hard to interpret
11
Q

R²

A
  • square of the correlation coefficient (r) in simple linear regression
  • also called the coefficient of determination
  • used to test the goodness of fit of the final regression
  • it is the proportion of total variation in the dependent variable that can be explained by the independent variable
  • measures how well the observed and predicted values of the dependent variable correspond to each other
  • ranges from 0 to 1
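
The 'proportion of variation explained' can be computed as one minus the ratio of residual to total sum of squares. This is a minimal sketch; the function name and the data are hypothetical, and the 0 to 1 range assumes the predictions come from a least-squares fit that includes an intercept.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    # Total variation of the observed values about their mean
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    # Variation left unexplained by the predictions
    ss_residual = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    return 1 - ss_residual / ss_total

observed = [2.0, 4.0, 6.0, 8.0]
perfect_fit = [2.0, 4.0, 6.0, 8.0]   # predictions match every observation
mean_only = [5.0, 5.0, 5.0, 5.0]     # predictions ignore the independent variable

r2_perfect = r_squared(observed, perfect_fit)
r2_mean_only = r_squared(observed, mean_only)
```

The two extreme cases mark the ends of the range: a perfect fit explains all of the variation, and predicting the mean for every case explains none of it.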
12
Q

Linear regression

A

-dependent variable must be continuous

13
Q

Logistic regression

A

-used if the dependent variable is binary
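
A fitted logistic model gives the probability of the binary outcome rather than the outcome itself. The sketch below shows only that final step, for one independent variable; the coefficients a and b are hypothetical values chosen for illustration, not estimates from any data.

```python
import math

def logistic_probability(x, a, b):
    """Predicted probability that the binary outcome equals 1."""
    return 1 / (1 + math.exp(-(a + b * x)))

# Hypothetical coefficients, chosen only for illustration
a, b = -4.0, 0.8
probs = [logistic_probability(x, a, b) for x in range(0, 11)]
```

Whatever the value of x, the predicted probability stays strictly between 0 and 1, which is why this form suits a binary dependent variable.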

14
Q

Log-linear analysis

A

-accommodates only categorical data

15
Q

Bernoulli random variables

A

-variables with dichotomous (two-valued) outcomes, such as the dependent variable in logistic regression

16
Q

Exponential correlation

A
  • used if one wants to demonstrate the exponential relationship of a variable with a factor such as time
  • log transformed values can be plotted against time
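
The log-transformation step can be sketched as follows: if a value grows as c * exp(k * t), then its natural log is a straight line in t with slope k and intercept ln(c). The function name and the noise-free synthetic data are hypothetical.

```python
import math

def fit_exponential(times, values):
    """Fit values = c * exp(k * t) by least squares on ln(values)."""
    logs = [math.log(v) for v in values]  # requires all values > 0
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    # Slope of the straight line through the log-transformed points
    k = (sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
         / sum((t - mean_t) ** 2 for t in times))
    # The intercept of that line is ln(c)
    c = math.exp(mean_l - k * mean_t)
    return c, k

# Synthetic, noise-free data generated with c = 3.0 and k = 0.5
times = [0, 1, 2, 3, 4]
values = [3.0 * math.exp(0.5 * t) for t in times]
c, k = fit_exponential(times, values)
```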
17
Q

Polynomial regression

A
  • in some cases of non-linearity, the relationship between the dependent variable Y and the independent variable X can be expressed as Y = X^n, where n may be 2, 3, 4, etc.
  • this is polynomial regression
18
Q

1 in 10 rule

A
  • the number of variables studied in multiple regression models must not be greater than 10% of the sample size
  • for logistic regression, the number of variables must not be greater than 10% of the number of events
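
The rule amounts to a simple piece of arithmetic, sketched below. The function name and the study figures are hypothetical.

```python
def max_variables(count):
    """Largest number of variables allowed by the 1 in 10 rule.

    `count` is the sample size for multiple linear regression,
    or the number of events for logistic regression.
    """
    return count // 10

# Hypothetical study: 200 participants, 35 of whom had the event of interest
limit_linear = max_variables(200)    # based on sample size
limit_logistic = max_variables(35)   # based on number of events
```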
19
Q

Stepwise regression

A
  • calculates the regression coefficients and fits the independent variables into the regression equation one step at a time, from the most significant to the least significant
  • some statistically significant variables may not be clinically relevant
20
Q

Forward selection

A

-starts with no covariates and adds independent variables one at a time, most significant first
-confounding variables are treated as covariates

21
Q

Backward elimination

A

-starts with the full equation and tries to discard covariates one by one according to changes that occur in correlation coefficients

22
Q

Y=a+bX+e

A
  • regression equation
  • Y is the dependent variable
  • a and b are constants; a is the intercept and b is the regression coefficient
  • X is the independent variable
  • e is the error term (a random variable with a mean of 0)
23
Q

Key point

A

-using the method of least squares, we can find the best-fitting linear regression equation, the one with the minimum variance of ‘e’