Chapter 14 Flashcards

Question

What is B0 in the simple linear regression model?

Answer 1

the y-intercept of the regression line or the value of y when x is 0

Answer 2

the slope of the regression line; the slope tells us two things: 1. whether the line is increasing or decreasing 2. how steep it is

Answer 3

the error term | - as good as our model might be, there is always a random error term that cannot be accounted for

Answer 4

as x increases, so does y - positive relationship | B1 - will be positive

Answer 5

as x increases, y decreases, negative relationship | B1 - would be negative

Answer 6

no relationship, as x increases, y remains the same | B1 is 0

Answer 7

the predicted value of y for a given x value

Answer 8

y hat = b0 + b1x
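A minimal sketch of how b0 and b1 in the estimated regression equation can be computed, using the least squares formulas b1 = sum((xi - x bar)(yi - y bar)) / sum((xi - x bar) squared) and b0 = y bar - b1 * x bar. The sample data and the helper name `fit_line` are assumptions made for illustration; they do not come from the cards.

```python
def fit_line(x, y):
    """Return (b0, b1) for y hat = b0 + b1 * x using least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # y-intercept
    return b0, b1

# Illustrative sample (assumed, not from the cards)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

b0, b1 = fit_line(x, y)
y_hat_at_10 = b0 + b1 * 10    # predicted y for x = 10
print(b0, b1, y_hat_at_10)
```

For this sample the slope is positive, so y hat rises as x increases.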

Answer 9

how well does the estimated regression equation fit the data

Answer 10

a measure of the goodness of fit

Answer 11

the predicted value of the dependent variable y hat i

Answer 12

yi - y hat i

Answer 13

r squared = SSR/SST

Answer 14

the coefficient of determination

Answer 15

sum of squares due to regression

Answer 16

sum of squares for the total deviation

Answer 17

sum (y hat i - y bar) squared

Answer 18

the difference b/w the predicted values and the average or | how much the y hat values on the estimated regression line deviate from y bar

Answer 19

sum of squares due to Error

Answer 20

sum (yi - ybar) squared

Answer 21

sum (yi - y hat i) squared

Answer 22

SST = SSR + SSE
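The relationship among the three sums of squares can be checked numerically. The sketch below fits a line to an assumed illustrative sample (not from the cards), computes SST, SSR and SSE from their definitions, and confirms that they add up; it also reports r squared = SSR/SST.

```python
# Illustrative sample (assumed, not from the cards)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                 # total deviation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)             # explained by regression
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))    # unexplained (error)

r_squared = ssr / sst
print(sst, ssr, sse, round(r_squared, 4))
```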

Answer 23

we should expect that SST, SSR and SSE are related

Answer 24

SSR = SST | SSR / SST = 1

Answer 25

large values for SSE | - poorest fit when SSR = 0 and SSE = SST

Answer 26

the percent of variability in y that can be explained by x

Answer 27

for instance, 95.5% of the variability in grades can be explained by the number of hours studied

Answer 28

it measures the strength of association b/w x and y

Answer 29

it measures the strength of association b/w x and y

Answer 30

between -1 and +1

Answer 31

means a perfect positive linear relationship b/w x and y: all the data points from the sample lie exactly on the regression line with no deviation, and the line slopes upward

Answer 32

means a perfect negative linear relationship b/w x and y: all the data points from the sample lie exactly on the regression line with no deviation, and the line slopes downward

Answer 33

no linear relationship b/w x and y

Answer 34

rxy = (sign of b1) x square root of the coefficient of determination, or rxy = (sign of b1) x square root of r squared
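A short sketch of this formula, cross-checked against the direct definition of the sample correlation coefficient. The sample data are an assumption for illustration only.

```python
import math

# Illustrative sample (assumed, not from the cards)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)   # this is SST

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
ssr = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
r_squared = ssr / syy

# rxy = (sign of b1) * sqrt(r squared)
r_xy = math.copysign(1.0, b1) * math.sqrt(r_squared)

# Direct definition of the correlation, used only as a cross-check
r_direct = sxy / math.sqrt(sxx * syy)
print(round(r_xy, 4), round(r_direct, 4))
```

The sign has to be taken from b1 because the square root alone is never negative.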

Answer 35

slope of the estimate

Answer 36

the slope, and then we use the sign of our slope; for example, if b1 is positive (4.74), then we use the positive sign: rxy = +.9505

Answer 37

a very strong positive linear relationship b/w x and y

Answer 38

B0 no matter what value x is - the value of y does not depend on x (no linear relationship b/w x and y)

Answer 39

``` H0: B1 = 0 | Ha: B1 ≠ 0 ```

Answer 40

t = b1 / sb1

Answer 41

standard error for slope

Answer 42

sb1 = s (standard error of the estimate) / square root of sum (xi - x bar) squared

Answer 43

s = square root of (SSE / (n - 2))
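The formulas on these cards for s, sb1 and the t statistic chain together. A minimal sketch on an assumed illustrative sample (not from the cards); the resulting t is compared against a t distribution with n - 2 degrees of freedom.

```python
import math

# Illustrative sample (assumed, not from the cards)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))   # standard error of the estimate
sb1 = s / math.sqrt(sxx)       # standard error of the slope
t = b1 / sb1                   # test statistic for H0: B1 = 0
print(round(s, 3), round(sb1, 4), round(t, 2))
```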

Answer 44

A measure of the goodness of fit of the estimated regression equation. It can be interpreted as the proportion of the variability in the dependent variable y that is explained by the estimated regression equation

Answer 45

The interval estimate of the mean value of y for a given value of x.

Answer 46

A measure of the strength of the linear relationship between two variables

Answer 47

The variable that is being predicted or explained. It is denoted by y.

Answer 48

The estimate of the regression equation developed from sample data by using the least squares method. For simple linear regression, the estimated regression equation is ŷ = b0 + b1x.

Answer 49

Observations with extreme values for the independent variables

Answer 50

The variable that is doing the predicting or explaining. It is denoted by x.

Answer 51

An observation that has a strong influence or effect on the regression results.

Answer 52

The difference between the observed value of the dependent variable and the value predicted using the estimated regression equation; for the ith observation the ith residual is yi − ŷi.

Answer 53

A procedure used to develop the estimated regression equation. The objective is to minimize Σ(yi − ŷi)².

Answer 54

The unbiased estimate of the variance of the error term, σ². It is denoted by MSE or s².

Answer 55

A graph of the standardized residuals plotted against values of the normal scores. This plot helps determine whether the assumption that the error term has a normal probability distribution appears to be valid

Answer 56

A data point or observation that does not fit the trend shown by the remaining data

Answer 57

The interval estimate of an individual value of y for a given value of x.
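A hedged sketch contrasting the interval for an individual value of y with the interval for the mean value of y at the same x. It uses the standard simple linear regression formulas: the standard error for the mean is s * sqrt(1/n + (x* - x bar)^2 / sum((xi - x bar)^2)), and the one for an individual value adds 1 inside the square root. The sample data, the chosen x* = 10, and the table value t = 2.306 (95% confidence, 8 degrees of freedom) are assumptions for illustration.

```python
import math

# Illustrative sample (assumed, not from the cards)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

x_star = 10                    # assumed value of x to predict at
y_star = b0 + b1 * x_star      # point estimate
t_crit = 2.306                 # t table value for alpha/2 = .025, df = 8 (assumed)

core = 1 / n + (x_star - x_bar) ** 2 / sxx
se_mean = s * math.sqrt(core)        # for the mean value of y
se_ind = s * math.sqrt(1 + core)     # for an individual value of y

conf_int = (y_star - t_crit * se_mean, y_star + t_crit * se_mean)
pred_int = (y_star - t_crit * se_ind, y_star + t_crit * se_ind)
print(conf_int, pred_int)
```

The interval for an individual value is always the wider of the two, because an individual observation also carries its own error term.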

Answer 58

The equation that describes how the mean or expected value of the dependent variable is related to the independent variable; in simple linear regression, E(y) = β0 + β1x.

Answer 59

The equation that describes how y is related to x and an error term; in simple linear regression, the regression model is y = β0 + β1x + ε.

Answer 60

The analysis of the residuals used to determine whether the assumptions made about the regression model appear to be valid. Residual analysis is also used to identify outliers and influential observations

Answer 61

Graphical representation of the residuals that can be used to determine whether the assumptions made about the regression model appear to be valid

Answer 62

A graph of bivariate data in which the independent variable is on the horizontal axis and the dependent variable is on the vertical axis

Answer 63

Regression analysis involving one independent variable and one dependent variable in which the relationship between the variables is approximated by a straight line.

Answer 64

The square root of the mean square error, denoted by s. It is the estimate of σ, the standard deviation of the error term ε

Answer 65

The value obtained by dividing a residual by its standard deviation
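A minimal sketch of standardized residuals for simple linear regression. It assumes the usual formula for the standard deviation of the ith residual, s * sqrt(1 - hi), with hi = 1/n + (xi - x bar)^2 / sum((xj - x bar)^2). The sample data are illustrative and not from the cards.

```python
import math

# Illustrative sample (assumed, not from the cards)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

std_residuals = []
for xi, e in zip(x, residuals):
    h = 1 / n + (xi - x_bar) ** 2 / sxx   # leverage of observation i
    sd_e = s * math.sqrt(1 - h)           # std. deviation of residual i
    std_residuals.append(e / sd_e)

print([round(z, 4) for z in std_residuals])
```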

Answer 66

relationships between two or more variables

Answer 67

the size and the direction of the relationship

Answer 68

the two variables increase together and decrease together

Answer 69

they move in different directions

Answer 70

prediction

Answer 71

a model relating the two variables. the model has to be estimated, using a sample from the bivariate distribution, before it can be used

Answer 72

plot the data as a scatter diagram, and see if the assumption of a linear relationship is plausible

Answer 73

clustered around the line and the assumption of a linear relationship is reasonable

Answer 74

such as a curve, or in no pattern at all; the assumption of a linear relationship is violated

Answer 75

height horizontal

Answer 76

error at the given xi

Answer 77

= y hat and the error is zero

Answer 78

positive error

Answer 79

negative error

Answer 80

when all these error terms are zero, which means all the data points are lined up along a straight line

Answer 81

is a line through the points that minimizes the errors in some sense

Answer 82

deals with the squares of the errors, instead of the errors themselves, so it treats only the size of the errors and not their signs

Answer 83

it minimizes the sum of squared errors
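One way to see the minimizing property is to nudge the fitted intercept and slope and watch the sum of squared errors grow. The sketch below does this on an assumed illustrative sample; the helper name `sse` and the nudge sizes are arbitrary choices, not part of the cards.

```python
def sse(b0, b1, x, y):
    """Sum of squared errors for the line y hat = b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Illustrative sample (assumed, not from the cards)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

best = sse(b0, b1, x, y)

# Every line with a nudged intercept and/or slope should fit worse.
nudged = [sse(b0 + d0, b1 + d1, x, y)
          for d0 in (-1.0, 0.0, 1.0)
          for d1 in (-0.5, 0.0, 0.5)
          if (d0, d1) != (0.0, 0.0)]
print(best, min(nudged))
```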

Answer 84

because it assesses the contribution of the regression as a source of variation compared to other sources of variation in the data

Answer 85

lumped under the general label "error" and are treated as one source. This is similar to the notion of "between" and "within" variations

Answer 86

one measure of the goodness of fit for a regression equation

Answer 87

only in a relative sense; a value of, say, 13.829 for MSE does not tell us whether the fit is good or bad, nor, if good, does it tell us how good the fit is. It is only useful when we compare it with the MSE for another model or fit

Answer 88

the one with the smaller MSE is better

Answer 89

constructing tests of significance and confidence intervals

Answer 90

to estimate the standard error of an estimate, which serves as a benchmark for decisions regarding the size of a difference between an estimate and its hypothesized value

Answer 91

a comparison can be made directly with the MSE, since MSE is itself a measure of variation. This results in the F test
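A small sketch of the F statistic for simple linear regression, F = MSR / MSE, where MSR = SSR / 1 (one independent variable) and MSE = SSE / (n - 2). The sample data are an assumption for illustration. With a single independent variable, F equals the square of the t statistic for the slope, which the code also checks.

```python
import math

# Illustrative sample (assumed, not from the cards)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

msr = ssr / 1          # regression degrees of freedom = 1
mse = sse / (n - 2)    # error degrees of freedom = n - 2
f_stat = msr / mse

# t statistic for the slope, for comparison
t_stat = b1 / (math.sqrt(mse) / math.sqrt(sxx))
print(round(f_stat, 2), round(t_stat ** 2, 2))
```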

Answer 92

an influential observation, it can be an outlier but not an influential observation, or it can be an influential observation but not an outlier

Answer 93

the y value (or equivalently, on the residual or standardized residual) of a point

Answer 94

the x values

Answer 95

influential observations

Answer 96

the dependent variable

Answer 97

The variable or variables being used to predict the value of the dependent variable are called the independent variables

Answer 98

1. one for the independent variable | 2. one for the dependent variable

Answer 99

no, it can only indicate how or to what extent variables are associated with each other; any conclusions about cause and effect must be based upon the judgment of those individuals most knowledgeable about the application

Answer 100

should be done with caution because outside that range we cannot be sure that the same relationship is valid

Answer 101

minimizes the sum of squared deviations between the observed values of the dependent variable yi and the predicted values of the dependent variable y hat i

Answer 102

to choose the equation that provides the best fit. It is the most widely used method

Answer 103

a measure of goodness of fit for the estimated regression equation

Answer 104

it is a measure of the error in using the estimated regression equation to predict values of the dependent variable in the sample

Answer 105

you would use the mean value; the estimated regression equation is a much better predictor than the mean value

Answer 106

the explained portion of SST

Answer 107

the unexplained portion of SST

Answer 108

yi - y hat i = 0; this means that every value of the dependent variable yi lies on the estimated regression line

Answer 109

SSE = 0 and SSR/SST = 1

Answer 110

poorer fits will have larger values of SSE

Answer 111

the largest value for SSE occurs when SSR = 0 and SSE=SST

Answer 112

poorest fit

Answer 113

take on the values between 0 and 1

Answer 114

coefficient of determination

Answer 115

good a fit

Answer 116

90.27% of the variability in yi can be explained by the estimated regression equation

Answer 117

a descriptive measure of the STRENGTH of linear association between two variables (x and y)

Answer 118

between -1 and +1

Answer 119

indicates that the two variables x and y are perfectly related in a positive linear sense. That all data points are on a straight line that has a positive slope

Answer 120

indicate that x and y are not linearly related

Answer 121

rxy = (sign of b1) x square root of the coefficient of determination, i.e. rxy = (sign of b1) x square root of r^2

Answer 122

Coefficient of Determination r^2 is between 0 and 1 | Correlation Coefficient rxy = (sign of b1) x square root of r^2 is between -1 and +1

Answer 123

A linear relationship between two variables

Answer 124

nonlinear relationships and for relationships that have two or more independent variables; thus, the coefficient of determination, r^2, provides a wider range of applicability

Answer 125

Coefficient of Determination

Answer 126

whether the relationship between x and y is statistically significant - such a conclusion must be based on considerations that involve the sample size and the properties of the appropriate sampling distributions of the least squares estimators

Answer 127

sum of squared residuals

Answer 128

of the variability of the actual observations about the estimated regression line

Answer 129

yes, if it is just for one I.V.

Answer 130

no, only the F test can be used to test for an overall significant relationship

Answer 131

a higher degree of precision

Answer 132

the mean value of y for a given value of x

Answer 133

used to predict an individual value of y for a new observation corresponding to a given value of x

Answer 134

prediction interval

Answer 135

t a/2 spread

Answer 136

1. erroneous data - error recording, s/b corrected 2. signal a violation of the model assumption - may need to consider another model 3. unusual values that occurred by chance - should stay

Answer 137

1. it could be an outlier 2. it can influence how the data is interpreted; if this data point were removed, it would change our slope from negative to positive, for example

Answer 138

1. can contribute to a better understanding of the appropriate model and lead to a better estimated regression equation 2. try to obtain data on intermediate values of x to better understand the relationship b/w x and y

Answer 139

the farther xi is from its mean (x bar), the higher the leverage of the observation - need computer software to help with this
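A sketch of how software might compute leverage in simple linear regression, assuming the standard formula hi = 1/n + (xi - x bar)^2 / sum((xj - x bar)^2). The x values are made up for illustration, and the 6/n cutoff for flagging a high-leverage observation is one rule of thumb assumed here; it is not stated on the cards.

```python
# Illustrative x values (assumed); the last one sits far from the rest
x = [10, 10, 15, 20, 20, 25, 70]

n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Leverage grows with the squared distance of xi from x bar
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

cutoff = 6 / n   # assumed rule of thumb for "high leverage"
high_leverage = [i for i, h in enumerate(leverage) if h > cutoff]

print([round(h, 2) for h in leverage], high_leverage)
```

Only the observation with x = 70 is flagged, since it is the one far from x bar.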

Answer 140

outside of the other data sets but won't change the line

Answer 141

outside of the other data sets by a lot

Answer 142

near to the line

Answer 143

need some work

Answer 144

if it is outside of the +2 or -2 from the mean line

Answer 145

by using the mean

Answer 146

r^2, the coefficient of determination

Answer 147

t-test: t = b1 / sb1

(186 cards)