Simple Linear Regression Flashcards

1
Q

What is the goal of regression?

A

To predict Y (outcome variable) from X (predictor).

2
Q

Which variable is fixed in a regression equation?

A

X is a fixed variable and Y is always the random variable.

3
Q

T or F: there is no sampling error involved in Y.

Why or why not?

A

F. X involves no sampling error because X is a fixed variable; Y is a random variable, so it does involve sampling error.

4
Q

In terms of the population parameters that go along with the sample statistics in a simple linear regression, what do we predict Y from?

A

We predict the outcome Y from beta-naught (the intercept) plus beta-1 (the slope) multiplied by the predictor, plus epsilon, the population error term: Y = β0 + β1X + ε. (The sample counterpart of epsilon is the residual, e.)

5
Q

What is our purpose in modeling error?

A

Our purpose is to find the line that best summarizes the relationship between X and Y.

6
Q

What is error called for the population and for the sample?

What is model error?

A

Epsilon for population, e or residual for sample.

Model error is the amount by which each person's observed score deviates from the model line.

7
Q

Define sampling error.

A

The difference between a population parameter and a sample statistic.

8
Q

What is the goal in simple linear regression, with regard to the line?

A

Our goal is to find the line of best fit.

We are trying to find, out of all possible lines, the one that results in the least difference between the observed data and the line.

9
Q

How do we use the regression line to predict values?

A

We fit a statistical model to the data in the form of a straight line. This line is the line that BEST FITS the pattern of data.

10
Q

What does y-hat indicate?

A

Y-hat denotes the line itself - the hat marks that these predicted values are distinct from the observed values, which contain error.

11
Q

Which contains error: The line or the model?

A

The model contains error.

12
Q

Why is Y-hat considered a predicted Y?

What does this have to do with residuals?

A

Y-hat is a predicted Y because it signifies the Y-values that are predicted from the line.

The difference between what’s predicted from the line and the observed value (Y) is the residual.

13
Q

What is y-hat’s equation?

A

ŷ = b0 + b1X

b0 = intercept
b1 = slope
X = predictor value
14
Q

How do we compute the correlation for a simple linear regression in R? What does it produce?

A

rcorr(as.matrix(dataset))

Produces r (the correlation), n, and the p-value for each pair of variables.
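rcorr() comes from the Hmisc package. A minimal sketch of the call, assuming a hypothetical data frame named dataset with numeric columns:

library(Hmisc)                 # provides rcorr()
rcorr(as.matrix(dataset))      # prints matrices of r, n, and p-values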

15
Q

What information do we need to create a regression equation?

A

We need to fill in the intercept (b0) and slope (b1) - so we need to determine the line of best fit.

16
Q

How is regression conceptually similar to ANOVA?

A

With an ANOVA, we compared MSbetween and MSwithin - we wanted to minimize MSwithin (error), and we want to do the same with regression by making the error as small as possible.

Before, we wanted to see how points deviated from the mean, but now we want to see how each point deviates from the regression line.

17
Q

Why do we create a sum of squares for a simple linear regression equation?

A

We want to minimize the sum of the squared residuals (OLS solution).

Each point's distance from the line gives a residual, and we add them up. The problem is that the distances of points above the line cancel the distances of points below the line, so the sum comes to zero. So we must square the residuals before summing.
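A small sketch of why the squaring is needed, using made-up x and y vectors (assumptions, not the deck's data):

x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 6)
fit <- lm(y ~ x)
sum(resid(fit))     # ~0: residuals above and below the line cancel
sum(resid(fit)^2)   # > 0: the sum of squared residuals that OLS minimizes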

18
Q

What are the similarities and differences between a correlation and the simple linear regression?

A

If there’s just 1 predictor, we see a lot of similarities. The only thing that changes is how we treat the variables (prediction vs. description).

19
Q

After we run the rcorr function in R and see a significant result for a predictor, what do we do next?

A

Since the variables are significantly correlated, we can use the predictor to predict the outcome in the direction of that relationship.

20
Q

Why do we compute the Ordinary least squares (OLS) solution? What is the criterion to be minimized in OLS?

A

We compute the OLS solution because it’s an estimation procedure done for regression where we minimize the sum of the squared residuals.

21
Q

As long as we can put numbers to b0 and b1, what information can we find?

Provide an example if b0 = 1 and b1 = 4.

A

If b0 equals 1 and b1 equals 4, the regression equation is ŷ = 1 + 4X. Filling in all of my X values gives the ŷ values; the difference between each observed Y and its ŷ gives the residuals; and once I have the residuals, I can compute the sum of squared residuals.
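A sketch of that chain of computations in R, with hypothetical x and y values added for illustration:

b0 <- 1; b1 <- 4
x <- c(1, 2, 3)
y <- c(6, 8, 14)
yhat <- b0 + b1 * x   # y-hat: values predicted from the line
e <- y - yhat         # residuals: observed minus predicted
sum(e^2)              # sum of squared residuals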

22
Q

Explain what derivative is and why we use the function of a derivative.

A

The derivative gives the slope of the line tangent to a curve at a particular location. We can never take the slope of a curve directly, but we can find the straight line that touches the curve at exactly one point, and the slope of that tangent line tells us how good our model is at that point.

The sum of squared residuals, as a function of the candidate b values, forms a curve. As we try different candidate lines, we move toward the minimum of that function - and the way we know we are at the minimum is that the tangent line there has a slope of zero.

This happens when we take the function and set its derivative to zero - and, by the magic of calculus, we end up with equations for b0 and b1.

23
Q

Why is SSy (the sum of squares of Y) missing from the bivariate information in SL regression?

Compare this to the correlation equation.

A

In the correlation equation, we divided SSCPxy by the square root of SSx times SSy, because we were interested in how our variables related to each other after removing what is unique to each variable.

When I'm interested in predicting Y, I don't care how Y varies with itself - only how Y varies along with X. We aren't interested in the univariate information of Y (SSy), just in how the two variables relate after removing what is unique to X, so I'm left with Y and everything that X shares with Y.

24
Q

Conceptually, what are b0 and b1?

What is the equation for both?

A

b1 (slope) tells me for every 1 unit increase in X, how much Y changes. It tells me the change in Y based on changes in X.

b1 = SSCPxy / SSx

b0 (intercept) is the mean of Y minus b1 times the mean of X.

To obtain b0, we compute the slope first; then we multiply it by the mean of X and subtract that from the mean of Y.

b0 = Y-bar - b1(X-bar)
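A sketch computing b1 and b0 by hand from these formulas and checking them against lm(); the x and y vectors are made up:

x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 6)
SSCPxy <- sum((x - mean(x)) * (y - mean(y)))  # sum of cross-products
SSx <- sum((x - mean(x))^2)                   # sum of squares of X
b1 <- SSCPxy / SSx             # slope
b0 <- mean(y) - b1 * mean(x)   # intercept
c(b0, b1)
coef(lm(y ~ x))                # should match the hand computation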

25
Q

If b1 equals 0.758, what does this mean?

Put this into context when predicting the number of doctor visits and health problems.

A

For every 1 unit increase in X, Y changes by .758.

For every 1 additional health problem, the number of doctor visits goes up by .758.

26
Q

In the context of health problems on doctor visits, what does b0, the intercept, tell us exactly?

A

It tells us if we had NO physical health problems, we would expect to go to the doctor .036 times (almost zero times).

27
Q

What does the output for regression in R show?

What is the function?

A

The model is fit with the lm() function and saved as an object (e.g., fit).

The output for regression in R shows the residuals and coefficients.
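A sketch of the call, using hypothetical names (visits, problems, health) for the doctor-visit example:

fit <- lm(visits ~ problems, data = health)  # fit the simple linear regression
summary(fit)                                 # residuals and coefficient table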

28
Q

What are the interpretations of b0 and b1 regression equations?

Utilize doctor visits and health problems in both interpretations:

b0 = .036
b1 = .758
A

b0 interpretation:
The expected number of dv is (b0 value) when no iv has been reported.
ex) The expected number of doctor visits is .036 when no physical health problems have been reported.

b1 interpretation:
The number of dv is expected to increase by (b1 value) for every additional iv.
ex) The number of doctor visits is expected to increase by .758 for every additional physical health problem.

29
Q

How do we find the best fit of a line?

A

By setting the derivative to 0, we find the best fit of the line, solving for the two unknowns, b0 and b1.

30
Q

Based on OLS, what is the equation for the residual quantity we minimize, given the information we have?

A

The SS residual (Σe²); in the information tables, the formula is Σ(Y − ŷ)².

31
Q

Which of the following indicates the best fitting line:

y
ŷ 
SSCPxy
b0
b1
A

ŷ. We compute b0 + b1X for each of the observations to get the best fitting line.

32
Q

What are given properties of regression equations BASED on OLS solutions being true?

A
  1. Sum of residuals = 0: Σ(y − ŷ) = 0
  2. Sum of the squared residuals, Σ(y − ŷ)², is at a minimum
  3. Sum of observed values equals sum of the fitted values: Σy = Σŷ - the sum of the observed values equals the sum of the fitted (predicted) values.
  4. The regression line always goes through the point (X-bar, Y-bar).
  5. Residuals are uncorrelated with the predictor - the correlation between the residuals and X is zero.
  6. The fitted Y value is less extreme on Y than the associated X value is on X (this property is called regression towards the mean).
33
Q

What does the 6th property of regression equations (regression towards the mean) mean: “The ŷ values are less extreme on Y than the associated X value is on X”?

A

If we plug an X value above or below the mean into the regression equation, the prediction is pulled closer to - regresses towards - the mean, so the outcome is less extreme.

This makes the ŷ values less extreme (located closer to the mean) on Y than their corresponding x-values on X.

Ex) ŷ = .036 + .758 (7.58)
= 5.78

34
Q

Define standard deviation and variance.

A

SD = roughly the average deviation from the mean (the square root of the average squared deviation)

Variance = the average SQUARED deviation from the mean

35
Q

If I want to know the variability of the points around the regression line (instead of the variability of points around the mean), what must I compute?

Provide the formula for the Standard Error of the Estimate (Residual standard error).

A

We estimate the spread of the scores around the regression line the same way we did around the mean: a sum of squares over its df gives the variance of the estimate, and its square root gives the SD-like quantity, the standard error of the estimate.

Formula for the standard error of the estimate:
sqrt( Σ(Y − ŷ)² / (n − k − 1) )

*n is the # of rows or observations

*k is the # of predictors (1 for simple linear regression).
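A sketch computing this by hand and checking it against R's residual standard error; the x and y vectors are made up:

x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 6)
fit <- lm(y ~ x)
n <- length(y); k <- 1                          # 1 predictor in SLR
sqrt(sum((y - fitted(fit))^2) / (n - k - 1))    # standard error of the estimate
summary(fit)$sigma                              # R reports the same value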

36
Q

What is the definition of variance of the estimate?

A

The average squared distance of each point from the regression line

37
Q

In order to use the standard error of the estimate in a meaningful way, what has to happen?

A

Our residuals have to be distributed normally.

38
Q

What is the formula for the standard error of the slope?

*it will be on the exam (is on the HW assignment).

A

The square root of MSresidual over SSx:

SE(b1) = sqrt( MSresidual / SSx )

Numerator:
MSresidual - the variability left in our model after removing everything associated with X.

Denominator:
SSx, the sum of squares of X.
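A sketch of the same computation, checked against the coefficient table; x and y are made-up vectors as before:

x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 6)
fit <- lm(y ~ x)
MSres <- sum(resid(fit)^2) / (length(y) - 1 - 1)  # SSres / (n - k - 1)
SSx <- sum((x - mean(x))^2)
sqrt(MSres / SSx)                                 # standard error of the slope
summary(fit)$coefficients["x", "Std. Error"]      # should match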

39
Q

Why do we compute a one-sample t-test for regression?

Conceptually, what’s the numerator and denominator indicative of?

What is the formula?

What does a significant t-value indicate?

A

T-tests are a measure of whether the predictor is making a significant contribution to the model.

We are testing the hypothesis that a b-coefficient significantly differs from zero.

> If the slope (b1) is not different from zero, then we can't use our x-variable to predict y, because the slope would be flat (if our x was 17, or any other value, the prediction would still be in the same place on y).

We are comparing the size of b to the amount of error in its estimate - if the standard error is small, we can conclude that the b-values across samples would all be similar to the b-value in our sample.

t = b-observed ÷ standard error of b

A larger t-value, or smaller p-value, indicates that the predictor has a significant effect on predicting the outcome.

40
Q

What does it mean when a t-test produces a significant effect for the slope?

What if the intercept is not significant?

A

Our slope is significantly different from zero, which is what we want.

If the intercept is not significant, it is not significantly different from zero - the line effectively passes through the origin.

41
Q

What are the various synonyms of ŷ?

A

Predicted value

Fitted value

42
Q

When we examine the model fit, what are we examining?

A

When we take off the hat and add an e for error, we want to know if we’re doing a good job predicting the y-value from our line (which is dependent on x-information).

43
Q

What are the identities of model fit?

What are we doing when we examine the model fit?

A

Total deviation = Regression + Residual

We are trying to see how good the predictor is.

44
Q

What is the formula for Regression deviation?

A

(ŷ − y-bar)

45
Q

Compare regression deviation and residual deviation to ANOVA.

What are we trying to do when we solve for regression deviation?

What’s the conceptual equation for the Regression deviation?

A

The regression deviation (or model) is like SSbetween. It’s the amount of variability in Y that’s explained by our line (predictor).

The residual deviation is what can’t be explained by the line.

I want to know how much my model has improved my ability to predict Y over just having the mean of Y. Because ŷ involves X information, I'm asking: am I able to improve my ability to predict Y by including the predictor, or is there no real difference between what I have based on the model and the mean of Y?

We are trying to see if ADDING the predictor improves my ability to predict Y from X, or if it is no different from just using the mean.

So it's the improvement due to the model over what's left unexplained by the model.

46
Q

In a graph of partitioned quantities, how would we identify whether the regression or residual explains the variability? (p.10)

How do we compute the variability of the points?

A

For an observed point, the distance between the observed value and the line is the residual, while the distance between the line and y-bar is the regression portion. If the regression portion is larger than the residual portion, the predictor is good - we are explaining more variability by including the predictor (e.g., health problems predicting doctor visits).

We compute the variability of the points by taking sums of squares / df, eventually computing MSregression / MSresidual (similar to ANOVA).

47
Q

Since we can partition the variability in regression (SS model and SSerror), what can we build?

A

A model fit ANOVA table

48
Q

Build the model fit ANOVA table.

A

Source     | SS        | df        | MS                  | F
Regression | Σ(ŷ − ȳ)² | k         | SSreg / k           | MSreg / MSres
Residual   | Σ(Y − ŷ)² | n − k − 1 | SSres / (n − k − 1) |
Total      | Σ(Y − ȳ)² | n − 1     |                     |

49
Q

What is the interpretation of a significant F-value following a model fit ANOVA table?

What does the significant F-value indicate?

A

The regression model significantly fits the data, such that the number of reported physical health problems significantly predicts the number of doctor visits, F(1, 8) = 5.85, p < .05.

50
Q

How do you run an ANOVA table in R?

A

anova(model), where model is the fitted regression object - e.g., model <- lm(y ~ x, data = dataset).

51
Q

If our predictor is found to be good, what do we compute next?

A

R² = The effect size, or the proportion of variance.

We divide SSregression by SStotal and report the number as a percentage.

52
Q

What’s the difference in r² and R²?

A

r² is correlation effect size, and we square the correlation.

R² is the effect size of regression, where we divide SSregression by SStotal. R² is also the squared correlation between the predictor and outcome.

53
Q

What is the interpretation for R² if it equals .4278?

A

43% of the variability in Y (number of doctor visits) can be explained by X (the number of reported physical health problems).

54
Q

Is 43% of the variability enough to explain the majority of the variability in Y?

A

No - we're not explaining the majority of the variability in Y from X; less than half of it is explained.

55
Q

List and describe the Gauss-Markov Assumptions. There are 7 properties.

A

Running regressions is contingent on the Gauss-Markov assumptions. We also need to know how to test them (which happens after we run the regression model and then evaluate it - after data screening). If any of these are violated, we can look to other tests.

The first 3 assumptions are about our variables:

  1. All predictors are quantitative (numeric) or dichotomous (e.g., gender dummy coded as 0 or 1), and the criterion is quantitative, continuous, and unbounded (-∞ to ∞). All variables are measured without error.
  2. All predictors must have non-zero variance. A zero variance indicates that the predictor is constant: we can't make a line, and mathematically the slope would be undefined.
  3. There is an absence of perfect multicollinearity. Multicollinearity means two or more predictor variables in a multiple regression model are highly correlated - predictors should not be (nearly) the same. On a graph, it will look platykurtic.

The remaining 4 assumptions are about our error:

  1. The expected (average) value of the error term is 0 at each value of the predictors. If x = 1, then across all points where x = 1 the average residual equals zero.
  2. Each predictor is uncorrelated with the error term.
  3. The variance of the error term is constant (homoscedasticity) at each value of the predictors.
  4. Error terms for different observations are uncorrelated (independence of observations).
56
Q

Why do we calculate F-tests?

What is the formula of doing the F-test of R²?

What is this indicating conceptually?

Why is this formula important to know?

A

The significance of R² can be tested using an F-ratio.

F = (R² / k) ÷ ((1 − R²) / (n − k − 1))

Conceptually, we take the variability explained by our model (R², per df) over the variability left over (1 − R², per df).

This formula is important because it lets us get the F-value without computing the SS or variances.
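A sketch of the formula as a small R function; the inputs below (R² = .4278, k = 1, and an assumed n = 10, so df = 8) come from the deck's example:

f_from_r2 <- function(R2, k, n) (R2 / k) / ((1 - R2) / (n - k - 1))
f_from_r2(0.4278, 1, 10)   # ~5.98, in the ballpark of the reported F(1, 8) = 5.85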

57
Q

Rather than looking at Sums of squares and computing variances, what can we do to obtain the F-value?

A

We can take the proportions of variability explained by the model and the proportions of variability unexplained by the model, both per degrees of freedom, to produce the F value.

58
Q

Why is meeting the Gauss-Markov assumptions important?

A

Meeting the 7 assumptions is what justifies running the regression: when they hold, the OLS estimates are unbiased and the model can be generalized from the sample to the population.

59
Q

What is the assumption of homoscedasticity?

What would homo and heteroscedasticity look like on a scatterplot?

A

It's the assumption that the variance of our error term is constant at each value of our predictor.

Homoscedasticity refers to the assumption that the dependent variable (Y) exhibits similar amounts of variance across the range of values of the independent variable (X).

On a scatterplot, observations that are shaped like a megaphone are heteroscedastic - we want an even spread at each value of X.
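A sketch of the usual visual check, with made-up x and y vectors as in the earlier sketches:

x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 6)
fit <- lm(y ~ x)
plot(fitted(fit), resid(fit),
     xlab = "Fitted values", ylab = "Residuals")  # a megaphone shape = heteroscedastic
abline(h = 0, lty = 2)   # we want an even band of points around zero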

60
Q

If all of our Gauss-Markov assumptions are met, what does this mean?

A
It means that OLS is Blue!
B - best
L - linear
U - unbiased
E - estimator

This means that the distribution of my parameters is unbiased and the mean is in the center of the distribution.

61
Q

What is the final assumption of normality?

Why do we need normality?

A
  1. At each value of the IVs, the errors are normally distributed (no skew, platykurtosis, or bi-modality).

This CAN be violated while still NOT violating the Gauss-Markov assumptions.

However, we need normality because we test our coefficients with a one-sample t-test, and one of the assumptions of a one-sample t-test is normality. Without normality the standard error will be biased, and therefore the t-value will be incorrect.

62
Q

What does ‘model’ in an equation get replaced by?

A

It gets replaced by the equation of the line of best fit, b0 + b1X, which defines the line.

63
Q

According to the book, a simple way to look at the residual term is:

A

The difference between the score predicted by the line and the score that the participant actually obtained.

64
Q

Residuals are synonymous with _______.

A

Residuals are synonymous with ‘deviations’.

Residuals are technically the deviations from the line.

65
Q

What is goodness of fit, conceptually?

A

How well a model that is generated fits the data - based on how well the data predicted by the model actually corresponds to the data that’s collected.

We still need to assess this model to make sure it is the best one for the data.

66
Q

Conceptually, what does the value of SSresidual represent?

Conceptually, what does the value of SStotal represent?

Together, what do they calculate conceptually?

A

SSresidual represents the degree of inaccuracy when the best model is fitted to the data.

SStotal represents the total squared difference between the observed values and the mean of Y.

Together they are used to calculate how much better the regression line (line of best fit) is than the mean - that is, R-squared.

67
Q

Conceptually, what is F-ratio?

A

It is the measure of how much the model has improved the prediction of the outcome compared to the level of inaccuracy of the model.

So MSreg (improvement) ÷ MSres (errors and inaccuracies).

68
Q

Conceptually, why is predicting an outcome to a mean a bad idea?

A

The line representing the mean is flat - so as predictor values change, the value of the outcome doesn't change.

*Thinking back to the author predicting album sales from a $1 versus $100,000 difference in advertising - using the mean would say that they would sell $200,000 either way…

69
Q

If R produces b0, or the intercept estimate as 134.1, what is it telling us when we are trying to predict album sales? (*in terms of dollars - round everything to thousands)

A

That when no money is spent on advertising (when X = 0), the model predicts that 134,100 albums will be sold.

70
Q

If R produces b1, or the slope estimate of 0.096 in predicting album sales, what does this mean? (*in terms of dollars - round everything to thousands)

A

That for an increase of $1,000, the model predicts 96 extra album sales.

71
Q

What does it mean when assumptions are met?

A

When all assumptions are met, we can apply the model we get from the sample to the population, because the coefficients and parameters of that regression are unbiased.

72
Q

What are the 3 assumptions related to Variables in the Gauss-Markov assumptions for regression?

A
  1. All predictors must be quantitative or dichotomous. The criterion must be quantitative, continuous, and unbounded. All measured without error.
  2. All predictors must have non-zero variance. A zero variance indicates that the predictor is constant: there won't be a line, and the slope would be undefined.
  3. No perfect multicollinearity - no (perfect) correlation between predictors.
73
Q

What are the 4 assumptions related to Error in the Gauss-Markov assumptions for regression?

A
  1. The expected value of the error term is zero at each value of the predictor. So if X = 1, the average residual across all points where X = 1 equals zero.
  2. Predictors are uncorrelated with the error term.
  3. Variance of the Error Term is constant (HOMOSCEDASTICITY) at each value of predictor.
  4. Error terms for different observations are Uncorrelated (Independence of observation).
74
Q

What is the criterion to be minimized in OLS?

A

We are MINIMIZING the sum of the squared residuals.

The properties of regression equations based on the OLS solution being true:
1. Sum of residuals = 0: Σ(y − ŷ) = 0
2. Sum of the squared residuals, Σ(y − ŷ)², is at a minimum
3. Sum of observed values equals sum of the fitted values: Σy = Σŷ - the sum of the observed values equals the sum of the fitted (predicted) values.
4. The regression line always goes through the point (X-bar, Y-bar).
5. Residuals are uncorrelated with the predictor - the correlation between the residuals and X is zero.
6. The fitted Y value is less extreme on Y than the associated X value is on X (this property is called regression towards the mean).

75
Q

What does it mean when the Gauss-Markov assumptions are met?

A

It signifies that the distribution of the parameters is UNBIASED and the mean is in the center of the distribution.

76
Q

What do we get when we add the assumption of normally distributed errors?

A

It gives us accurate standard errors, which allow us to do t-tests properly.

77
Q

What is the Variance of the Estimate and what other names do we know it by?

A

Variance of the Estimate, AKA MSresidual.

The Variance of the Estimate measures the residual variability that can't be explained by our regression. It appears as the denominator of the F-ratio: MSReg ÷ MSRes.

78
Q

How is the Variance of the estimate used to calculate other measures in regression analysis?

A

It is the denominator in our F-test.

79
Q

What two things can we do once we have standard errors for b’s?

Know how to do them.

A

We can run t-tests to test the significance of the intercept (b0) and of the slope coefficient (b1): t = b ÷ SE(b).

80
Q

What is the Coefficient of Determination? How do we calculate it from elements in the ANOVA table?

How do we test and interpret it?

A

It's R-squared, and we calculate it as SSReg ÷ SSTotal.

It gives the proportion of variability in the criterion that's attributable to the predictor (X). We test it with an F-ratio and interpret it as the percentage of variance explained.

81
Q

Given output from an SLR, interpret the following:

R-squared
b0
b1
Plotting the regression line
Calculate the predicted and residual values for given X scores
A

R-squared -> the proportion of variability in Y that's explained by X

b0 -> the value of Y when X is 0

b1 -> the change in Y when X increases by 1 unit

To plot the regression line, we plot the predicted (ŷ) values.

To calculate a residual, we do y − ŷ (observed minus predicted).
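A closing sketch of those last two steps, with made-up x and y vectors:

x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 6)
fit <- lm(y ~ x)
predict(fit, newdata = data.frame(x = c(2, 4)))  # y-hat for given X scores
resid(fit)                                       # residuals: y minus y-hat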