Linear regression and correlation Flashcards

1
Q

What do we mean when we talk about bivariate data?

A

Data where there are two variables.
The two variables can be either categorical or numerical.
This session deals with continuous bivariate data, i.e. both variables are continuous.

2
Q

When do you use correlation?

A

When there is no distinction between the two variables. No causation is implied, simply association.

3
Q

When do you use regression?

A

When one variable, y, is a response to another variable, x. You could use the value of x to predict what y would be.

4
Q

Properties of Pearson’s correlation coefficient (r)

A

r must be between -1 and +1
+1 = perfect positive linear association

-1 = perfect negative linear association

0 = no linear relation at all
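
These properties can be checked numerically. The sketch below computes r from its standard definition using only the Python standard library; the function name `pearson_r` and the small data sets are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient r for two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Note: r is undefined (division by zero) if either sample is constant
    return sxy / math.sqrt(sxx * syy)

# Illustrative data only
x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])    # points on a rising straight line
r_down = pearson_r(x, [10, 8, 6, 4, 2])  # points on a falling straight line
r_none = pearson_r(x, [1, 4, 9, 4, 1])   # symmetric curve with no linear trend
print(r_up, r_down, r_none)
```

The third data set has a clear pattern but r close to 0, a reminder that r measures only linear association.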

5
Q

Assumptions for the hypothesis test and confidence intervals for ρ (the population correlation coefficient)

A

Both variables are plausibly Normally distributed.
There is a linear relationship between them.
The null hypothesis is that there is no association between them.
Check the assumptions with a scatter diagram of the data, which should display a roughly elliptical pattern.
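
The card does not say how the test and interval are calculated. One common approach, shown here as a sketch only, uses the statistic t = r√((n − 2)/(1 − r²)) to test the null hypothesis and Fisher's z-transformation for an approximate 95% confidence interval. The function name and the values r = 0.6, n = 30 are hypothetical.

```python
import math

def correlation_test_summary(r, n, z_crit=1.96):
    """Return the t statistic for H0: rho = 0 and an approximate CI for rho.

    Assumes both variables are plausibly Normal and linearly related.
    z_crit = 1.96 gives an approximate 95% interval.
    """
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and +1")
    if n < 4:
        raise ValueError("need at least 4 observations")
    # Under H0 this statistic follows a t distribution with n - 2 degrees of freedom
    t_stat = r * math.sqrt((n - 2) / (1 - r * r))
    # Fisher's z-transformation: atanh(r) is roughly Normal with SE 1/sqrt(n - 3)
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return t_stat, (lower, upper)

# Hypothetical sample correlation and sample size
t_stat, (low, high) = correlation_test_summary(r=0.6, n=30)
print(round(t_stat, 2), round(low, 2), round(high, 2))
```

The t statistic would then be compared against a t distribution with n − 2 degrees of freedom to obtain a p-value.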

6
Q

What is the equation for estimating the best-fitting line?

A

y = a + bx

y = dependent variable
a = intercept: the value of y where the line crosses the vertical (y) axis, i.e. when x = 0
b = slope
x = independent variable
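
As a minimal sketch of how a and b might be estimated, assuming the usual least-squares criterion (the card does not name the method), the slope is the sum of cross-products divided by the sum of squares of x, and the intercept follows from the means. The function name `fit_line` and the data are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: value of y when x = 0
    return a, b

# Illustrative data only
xs = [0, 1, 2, 3, 4]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]
a, b = fit_line(xs, ys)
predicted = a + b * 5  # use the fitted line to predict y at x = 5
print(a, b, predicted)
```

The fitted line always passes through the point (mean of x, mean of y), which is why the intercept can be recovered from the two means once the slope is known.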

7
Q

What is multiple linear regression?

A

Sometimes there is more than one possible explanatory variable influencing the outcome variable.
Multiple linear regression can be used to investigate the influence of several explanatory variables simultaneously on the outcome.

8
Q

Why carry out a multiple regression analysis?

A

To identify any explanatory variables that may be associated with the y variable.

To investigate the extent to which one or more explanatory variables are linearly related to the y variable after adjusting for other variables that may be related to it.

To predict the value of the y variable from the explanatory x variables.

9
Q

Multiple regression equation

A

Suppose we are interested in the effect of p explanatory variables, x1, x2,…, xp, on the outcome variable y.
The estimated multiple regression equation would be:
y = b0 + b1x1 + b2x2 + … + bpxp
where xp is the pth explanatory variable;
y is the predicted value of the outcome given a particular set of values of x1, x2, …, xp;
b0 is the estimated intercept, a constant term equal to the predicted value of y when all the x's are zero;
b1, b2, …, bp are the estimated regression coefficients.
For example, b1 represents the amount by which y increases on average if we increase x1 by one unit while keeping all the other x's constant (that is, adjusting or controlling for them).

For example, suppose we want the predicted birthweight for a baby girl of 30 weeks gestation, born by normal delivery to a mother aged 40.

The equation would be

predicted birthweight = b0 + b1(age, here 40) + b2(gestation, here 30) + b3(sex) + b4(delivery)

Each b has its own estimated value, which is multiplied by the value of its variable. For example, if b1 were 1.3, the age term would be 1.3 x 40.
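
The birthweight prediction can be sketched in code. Apart from b1 = 1.3, which the card mentions, every number below is invented purely for illustration, and the 0/1 coding of sex and delivery is an assumption; real coefficients would come from a fitted model. The mother's age is taken as 40 and gestation as 30 weeks, as in the question.

```python
def predict(intercept, coefficients, values):
    """Predicted y = b0 + b1*x1 + ... + bp*xp for one set of x values."""
    return intercept + sum(coefficients[name] * values[name] for name in coefficients)

# HYPOTHETICAL coefficients, not estimates from any real data set
b0 = -2000.0
coefficients = {
    "age": 1.3,          # b1, per year of maternal age
    "gestation": 150.0,  # b2, per week of gestation
    "sex": -100.0,       # b3, assumed coding: 1 = girl, 0 = boy
    "delivery": 50.0,    # b4, assumed coding: 0 = normal, 1 = other
}

# Baby girl, 30 weeks gestation, normal delivery, mother aged 40
baby = {"age": 40, "gestation": 30, "sex": 1, "delivery": 0}
predicted_birthweight = predict(b0, coefficients, baby)
print(predicted_birthweight)
```

Each term is a coefficient multiplied by the value of its own variable, so the age term here is 1.3 x 40 and the gestation term is 150 x 30.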

10
Q

What is correlation?

A

Correlation is used to denote association between two quantitative variables. The degree of association is estimated using the correlation coefficient. It measures the level of linear association between the two variables.

11
Q

What is regression?

A

Regression quantifies the relationship between two quantitative variables. It involves estimating the best straight line with which to summarise the association. The relationship is represented by an equation, the regression equation. It is useful when we want to describe the relationship between the variables, or to predict the value of one variable for a given value of the other.
