Regression + Correlation Flashcards
Regression
Describes a non-deterministic relationship between two variables, at least one of which is continuous. Allows easy visual analysis of linear/non-linear relationships between the dependent and independent variables
Regression line
y = a + bx (y changes with x)
Straight line of best fit, where a = y-intercept and b = slope of the line. This dependence of the mean of the y variable on the x variable is known as the regression of y on x.
Easiest way to assess trend between variables
Scatter plot
Sum of squares (least squares)
Estimate a line; a vertical line is then drawn up/down from the fitted line to each individual point. Each difference is squared (to remove the negative signs) and the squares are added. The line giving the smallest sum is the line of best fit
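The least-squares calculation above can be sketched in Python; the data here are invented purely for illustration:

```python
# Sketch of a least-squares fit by hand (illustrative data, not from the notes).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# The slope b that minimises the sum of squared vertical distances
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
    sum((xi - xbar) ** 2 for xi in x)
a = ybar - b * xbar  # intercept: the fitted line passes through (xbar, ybar)

# Sum of squared residuals for the fitted line -- the smallest achievable value
ss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
print(a, b, ss)
```

Any other line through the same points would give a larger sum of squared residuals than `ss`.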
Correlation
Test of the relationship between two variables
r = 0 means no linear relationship between the variables
Correlation coefficient
r. Varies from +1 to -1; these are the two extremes of correlation
+1 = an increase in one variable leads to a linear increase in the other variable
-1 = an increase in one variable leads to a linear decrease in the other variable
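A minimal sketch of computing r (the Pearson correlation coefficient) from scratch, on made-up data showing the two extremes:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient, computed from first principles."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

print(pearson_r([1, 2, 3], [2, 4, 6]))  # perfectly linear increase -> 1
print(pearson_r([1, 2, 3], [6, 4, 2]))  # perfectly linear decrease -> -1
```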
Assumptions for regression analysis
The sample is representative of the population for the inference prediction.
The error is a random variable with a mean of zero conditional on the explanatory variables.
The independent variables are measured with no error. (Note: If this is not so, modeling may be done instead using errors-in-variables model techniques).
The independent variables (predictors) are linearly independent, i.e. it is not possible to express any predictor as a linear combination of the others.
The errors are uncorrelated
We also assume that the spread of FEV1 about this mean is measured by a standard deviation, σ, about the line and that this does not change with height.
The variance of the error is constant across observations
Regression of y on x
This dependence of the mean of the y variable on the x variable: μ(x) = α + βx
What does β measure
Measures the rate at which the mean of the y variable changes as the x variable changes.
β = 0
Mean of the y variable does not change with the x variable; hence there is no association between the y and x variables.
Outcomes to extract from regression analysis
- the estimated slope and intercept, given under Coef;
- the standard error of the slope, given under SE Coef;
- the P-value for the test of the hypothesis β=0;
- the standard deviation about the line, given as S.
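These four quantities could be computed by hand as below. The heights and FEV1 values are invented for illustration (FEV1 against height is the example the notes use); the names `Coef`, `SE Coef` and `S` in the comments mirror the output labels listed above:

```python
import math

# Invented FEV1 (litres) vs height (cm) data, for illustration only.
height = [160, 165, 170, 175, 180, 185]
fev1 = [3.0, 3.2, 3.6, 3.7, 4.1, 4.2]
n = len(height)
hbar = sum(height) / n
fbar = sum(fev1) / n
sxx = sum((h - hbar) ** 2 for h in height)

b = sum((h - hbar) * (f - fbar) for h, f in zip(height, fev1)) / sxx  # slope (Coef)
a = fbar - b * hbar                                                   # intercept (Coef)

resid_ss = sum((f - (a + b * h)) ** 2 for h, f in zip(height, fev1))
s = math.sqrt(resid_ss / (n - 2))   # S: standard deviation about the line
se_b = s / math.sqrt(sxx)           # SE Coef: standard error of the slope
t = b / se_b                        # compare to a t distribution with n-2 df
                                    # to get the P-value for testing beta = 0
print(b, a, s, se_b, t)
```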
a = y intercept
Mean value of y when x = 0, hence it may be negative; needed to position the line correctly for a given slope
Making predictions from regression
Natural variability needs to be taken into account, hence wide limits are often placed on the prediction made for an individual
The estimate for height h is α + βh
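The point prediction α + βh is a one-liner; the coefficient values below are illustrative placeholders, not figures from the notes:

```python
# Point prediction from a fitted line (alpha and beta assumed already fitted;
# these particular values are invented for illustration).
alpha, beta = -5.04, 0.0503

def predict_fev1(h):
    """Estimated mean FEV1 for a person of height h: alpha + beta * h."""
    return alpha + beta * h

print(predict_fev1(175))
```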
Intervals for prediction
Confidence interval - calculate the uncertainty in the sample mean to construct an interval within which we are 95% sure μ (the population mean) will lie
NB as the sample size increases, the interval will narrow
Prediction interval
Gives the range in which we can expect the next data point sampled to lie. Collect a sample of data and calculate a prediction interval, then sample one more value from the population. If you do this many times, you would expect that next value to lie within the prediction interval in 95% of the samples.
Crucially, the prediction interval tells you about the distribution of values, not the uncertainty in determining the population mean. Prediction intervals account for both the uncertainty in the mean and the scatter of the data, hence they are wider than confidence intervals
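The two intervals can be compared side by side at a single x value. Using the standard simple-regression formulae, the prediction interval's standard error has an extra "+1" term for the scatter of individual values, which is why it is always wider. The data and t multiplier below are illustrative (2.776 is the 95% two-sided t value for n - 2 = 4 degrees of freedom):

```python
import math

# Invented data for illustration; same structure as a FEV1-vs-height example.
x = [160, 165, 170, 175, 180, 185]
y = [3.0, 3.2, 3.6, 3.7, 4.1, 4.2]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
a = ybar - b * xbar
s = math.sqrt(sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2))

x0 = 175
t = 2.776                 # 95% t multiplier for n - 2 = 4 degrees of freedom
fit = a + b * x0          # point estimate of the mean at x0

# CI standard error: uncertainty in the mean only.
se_mean = s * math.sqrt(1 / n + (x0 - xbar) ** 2 / sxx)
# PI standard error: the extra "1" adds the scatter of individual values.
se_pred = s * math.sqrt(1 + 1 / n + (x0 - xbar) ** 2 / sxx)

ci = (fit - t * se_mean, fit + t * se_mean)
pi = (fit - t * se_pred, fit + t * se_pred)
print(ci, pi)  # the prediction interval is always the wider of the two
```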