Chpt 15 Flashcards

Question

In multiple regression, what does bi represent?

Answer 1

an estimate of the change in y corresponding to a one-unit change in xi when all other I.Vs are held constant

Answer 2

R squared = SSR/SST

Answer 3

R squared = SSR/SST

Answer 4

indicates we are measuring the goodness of fit for the estimated multiple regression equation - denoted R squared

Answer 5

the proportion of variability in the dependent variable that can be explained by the estimated multiple regression equation - when multiplied by 100, it can be interpreted as the % of the variability in y that can be explained by the estimated regression equation

Answer 6

90.4% of the variability in travel time y is explained by the estimated multiple regression equation

Answer 7

always increases

Answer 8

to avoid overestimating the impact of adding an i.v.

Answer 9

R squared a = 1 - (1 - R squared)[(n - 1)/(n - p - 1)]
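
A minimal Python sketch of the two goodness-of-fit formulas on these cards. The function names are illustrative, and the sample size n = 10 and p = 2 independent variables are assumed values that do not appear on the cards; only R squared = 0.904 comes from the travel-time card.

```python
def r_squared(ssr: float, sst: float) -> float:
    """Coefficient of determination: R squared = SSR / SST."""
    return ssr / sst


def adjusted_r_squared(r2: float, n: int, p: int) -> float:
    """Adjusted R squared = 1 - (1 - R squared)(n - 1)/(n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need more than p + 1 observations")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# R squared = 0.904 as on the travel-time card; n and p are assumed.
adj = adjusted_r_squared(0.904, n=10, p=2)
print(round(adj, 4))

# With a weak fit and many variables the adjusted value can go negative.
weak = adjusted_r_squared(0.10, n=5, p=3)
print(weak < 0)
```

The adjusted value is lower than R squared because the factor (n - 1)/(n - p - 1) penalizes each added independent variable.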

Answer 10

of independent variables

Answer 11

causes prediction errors to become smaller, thus reducing SSE | because SSR = SST - SSE, a smaller SSE means a larger SSR, causing R squared = SSR/SST to increase

Answer 12

becomes larger even if the variable added is not statistically significant

Answer 13

can take on a negative value | - in such cases, Minitab sets the adjusted coefficient of determination to zero

Answer 14

same for all values of the I.vs

Answer 15

1. The error term E is a random variable with an expected value of 0
2. The variance of E is the same for all values of the I.Vs
3. The values of E are independent
4. The error term E is a normally distributed random variable

Answer 16

plane in 3-d space graph

Answer 17

difference b/w actual y and the expected value of y, E(y), when x1 = x1* and x2 = x2*

Answer 18

we now use response variable | graph is called a response surface

Answer 19

t and F test | - both provided the same conclusion

Answer 20

F test and t test

Answer 21

used to determine whether a significant relationship exists b/w the d.v and the set of i.vs

Answer 22

the test for Overall significance

Answer 23

used to determine whether EACH of the I.V.s is Significant | - a separate t test is conducted for EACH of the I.V. s

Answer 24

a test for individual significance

Answer 25

H0: B1 = B2 = ... = Bp = 0 | Ha: one or more of the parameters is NOT equal to zero

Answer 26

gives us sufficient statistical evidence to conclude that one or more of the parameters is NOT equal to zero and that the overall relationship b/w y and the set of I.V s is significant

Answer 27

we do not have sufficient evidence to conclude that a significant relationship is present

Answer 28

sum of squares / df

Answer 29

p df | p = # of i.vs

Answer 30

SSE has n-p-1 df

Answer 31

MSR = SSR/p

Answer 32

MSE = SSE/(n-p-1)

Answer 33

provides an unbiased estimate of sigma squared (the variance of the error term E)

Answer 34

then MSR also provides an unbiased estimate of sigma squared | and the value of MSR/MSE should be close to 1

Answer 35

MSR overestimates sigma squared | MSR/MSE becomes larger

Answer 36

p-value approach: reject H0 if p-value < alpha | critical value approach: reject H0 if F > F alpha

Answer 37

based on the F distribution | df for numerator = p | df for denominator = n-p-1
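
The F test cards above can be tied together in a short Python sketch. The sums of squares, n, and p are hypothetical numbers chosen only for illustration, and the function name is an assumption; the critical value 9.55 is the upper-tail 0.01 value of F with 2 and 7 degrees of freedom as read from a standard F table.

```python
def f_statistic(ssr: float, sse: float, n: int, p: int):
    """Return (F, numerator df, denominator df) for the overall F test."""
    df_regression = p          # SSR has p degrees of freedom
    df_error = n - p - 1       # SSE has n - p - 1 degrees of freedom
    msr = ssr / df_regression  # mean square due to regression
    mse = sse / df_error       # mean square error
    return msr / mse, df_regression, df_error


# Hypothetical sums of squares, chosen only for illustration.
f_value, df1, df2 = f_statistic(ssr=21.6, sse=2.3, n=10, p=2)

# Upper-tail 0.01 critical value of F with 2 and 7 df (standard F table).
F_CRITICAL_01 = 9.55
reject_h0 = f_value > F_CRITICAL_01
print(round(f_value, 2), df1, df2, reject_h0)
```

A ratio far above 1, as here, suggests that MSR overestimates sigma squared, which is the evidence against H0.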

Answer 38

standard error

Answer 39

square root of MSE

Answer 40

t = bi / sbi

Answer 41

H0: B1 = B2 = 0 | Ha: B1 and/or B2 is not equal to zero

Answer 42

H0: B1 = B2 = 0 | Ha: B1 and/or B2 is not equal to zero

Answer 43

any variable being used to predict or explain the value of the D.V.

Answer 44

rule-of-thumb test: a sample correlation coefficient greater than +0.7 or less than -0.7 for 2 IVs is a warning of potential problems - try to avoid including i.vs that are highly correlated (in practice, this is rarely possible)

Answer 45

separating the effects of individual i.vs on the dependent variable is very difficult

Answer 46

a confidence interval and a prediction interval

Answer 47

mean travel time for all trucks that travel 100 miles and make 2 deliveries

Answer 48

of the travel time for one specific truck that travels 100 miles and makes 2 deliveries

Answer 49

similar to simple linear regression, we use Minitab, Excel, or other software packages

Answer 50

similar to simple linear regression, we use Minitab, Excel, or other software packages

Answer 51

```
E(y | mechanical) = B0 + B1x1 + B2(0) = B0 + B1x1
E(y | electrical) = B0 + B1x1 + B2(1) = B0 + B1x1 + B2 = (B0 + B2) + B1x1
```

Answer 52

```
E(y | mechanical) = B0 + B1x1 + B2(0) = B0 + B1x1
E(y | electrical) = B0 + B1x1 + B2(1) = B0 + B1x1 + B2 = (B0 + B2) + B1x1
```

Answer 53

Positive - the mean repair time for electrical will be greater than that for mechanical
Negative - the mean repair time for electrical will be less than that for mechanical
0 - no difference in the mean time b/w electrical and mechanical, and the type of repair is NOT related to the repair time

Answer 54

Positive - the mean repair time for electrical will be greater than that for mechanical
Negative - the mean repair time for electrical will be less than that for mechanical
0 - no difference in the mean time b/w electrical and mechanical, and the type of repair is NOT related to the repair time

Answer 55

the use of a dummy variable provides 2 estimated regression equations that can be used to predict the repair time, depending on whether the repair is mechanical or electrical
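
A small Python sketch of how one dummy variable yields two regression lines. The coefficient values are hypothetical and the function name is illustrative; x1 stands for the numerical independent variable and x2 is the dummy (0 = mechanical, 1 = electrical).

```python
def expected_repair_time(b0: float, b1: float, b2: float,
                         x1: float, electrical: bool) -> float:
    """E(y) = B0 + B1*x1 + B2*x2, with x2 = 1 for electrical, 0 for mechanical."""
    x2 = 1 if electrical else 0
    return b0 + b1 * x1 + b2 * x2


# Hypothetical coefficients, chosen only for illustration.
b0, b1, b2 = 1.0, 0.4, 1.25

mechanical = expected_repair_time(b0, b1, b2, x1=6.0, electrical=False)
electrical = expected_repair_time(b0, b1, b2, x1=6.0, electrical=True)

# The two lines share the slope B1; their intercepts differ by B2.
print(mechanical, electrical, electrical - mechanical)
```

Because b2 is positive in this sketch, the electrical line sits above the mechanical line by b2 at every value of x1.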

Answer 56

k-1 | - each dummy variable is coded as a 1 or a zero

Answer 57

standardized residuals are frequently used in residual plots and in the identification of outliers

Answer 58

(yi - ŷi) / s(yi - ŷi)

Answer 59

the standard deviation of the residual i

Answer 60

s x square root of (1 - hi) | s = standard error of the estimate | hi = leverage of observation i

Answer 61

by how far the values of the I.Vs are from their means

Answer 62

to determine whether the distribution of E appears to be normal - same procedure as in simple linear regression - use software to compute it

Answer 63

if the value of the standardized residual is less than -2 or greater than +2
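
The standardized residual cards can be sketched in Python as follows. This assumes the textbook form for the standard deviation of residual i, s times the square root of (1 - hi); the function names and input numbers are hypothetical.

```python
import math


def standardized_residual(y: float, y_hat: float,
                          s: float, leverage: float) -> float:
    """Residual divided by its standard deviation, s * sqrt(1 - h_i)."""
    residual_sd = s * math.sqrt(1 - leverage)
    return (y - y_hat) / residual_sd


def is_outlier(std_resid: float, cutoff: float = 2.0) -> bool:
    """Flag values less than -cutoff or greater than +cutoff."""
    return abs(std_resid) > cutoff


# Hypothetical observation: residual of 2, s = 1, leverage = 0.75.
z = standardized_residual(y=12.0, y_hat=10.0, s=1.0, leverage=0.75)
print(z, is_outlier(z))
```

In practice the leverage values come from statistical software, as the cards note.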

Answer 64

tends to increase the standard error of the estimate

Answer 65

S increases

Answer 66

use studentized deleted residuals

Answer 67

may detect outliers that standardized residuals do not detect

Answer 68

difference b/w an observed value and the value predicted by the model

Answer 69

residual / an estimate of its SD | - also called Pearson Residuals, M=0, SD =1

Answer 70

the deleted residual for a case divided by its standard error

Answer 71

how much difference eliminating a case makes on its own prediction

Answer 72

how much difference eliminating a case makes on its own prediction

Answer 73

obtained by regressing using all of the data EXCEPT for the point in question

Answer 74

the standard error of the estimate based on the data set with i th observation removed

Answer 75

the ith observation is an outlier | - the absolute value of the ith studentized deleted residual will be larger than the absolute value of the standardized residual

Answer 76

to determine whether the studentized deleted residuals indicate the presence of outliers

Answer 77

it's an outlier

Answer 78

how far the values of the I.V.s are from their mean values

Answer 79

use minitab

Answer 80

if hi > the critical value, we have an influential observation

Answer 81

observations can be identified as having a high leverage and not necessarily be influential - using leverage can lead to wrong conclusions

Answer 82

Cook's distance measure

Answer 83

both the leverage of observation i, hi, and the residual for observation i, (yi - ŷi)

Answer 84

Di = [(yi - ŷi) squared / ((p + 1) s squared)] x [hi / (1 - hi) squared]
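
A minimal Python sketch of Cook's distance measure as written on this card. The function name and the input values are hypothetical; the cutoff Di > 1 is the common rule of thumb and is stated here as an assumption, since the cards do not give a threshold.

```python
def cooks_distance(residual: float, s: float,
                   leverage: float, p: int) -> float:
    """D_i = [e_i^2 / ((p + 1) s^2)] * [h_i / (1 - h_i)^2]."""
    residual_part = residual ** 2 / ((p + 1) * s ** 2)
    leverage_part = leverage / (1 - leverage) ** 2
    return residual_part * leverage_part


# Hypothetical observation: residual 2, s = 1, leverage 0.5, one IV.
d_i = cooks_distance(residual=2.0, s=1.0, leverage=0.5, p=1)

# Assumed rule of thumb: D_i > 1 suggests an influential observation.
print(d_i, d_i > 1)
```

The measure grows with either a large residual or a high leverage, which is why it can catch cases that leverage alone misjudges.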

Answer 85

the ith observation is influential and should be studied further

Answer 86

estimate the prob that the bank will approve the request for a credit card given a particular set of values for the chosen I.Vs

Answer 87

dependent variable y and one or more i.v s

Answer 88

E(y) = e^(B0 + B1x1 + B2x2 + ... + Bpxp) / (1 + e^(B0 + B1x1 + B2x2 + ... + Bpxp))
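
The logistic regression equation can be evaluated directly, as in this Python sketch. The function name and the coefficient values are hypothetical and serve only to show that E(y) always falls between 0 and 1.

```python
import math


def logistic_mean(betas: list, xs: list) -> float:
    """E(y) = e^z / (1 + e^z), where z = B0 + B1*x1 + ... + Bp*xp.

    betas holds [B0, B1, ..., Bp]; xs holds [x1, ..., xp].
    """
    z = betas[0] + sum(b * x for b, x in zip(betas[1:], xs))
    return math.exp(z) / (1 + math.exp(z))


# Hypothetical coefficients: B0 = -2.0, B1 = 0.5.
for x in (0.0, 4.0, 8.0):
    print(x, round(logistic_mean([-2.0, 0.5], [x]), 3))
```

With a positive B1 the printed probabilities rise with x, tracing out the S-shaped curve.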

Answer 89

s shaped graph

Answer 90

fairly rapidly as x increases farther up

Answer 91

1. model the prob. of an event occurring depending on the values of the I.Vs, which can be categorical or numerical 2. estimate the prob. that an event occurs for a randomly selected observation vs the prob the event does not occur 3. predict the effect

Answer 92

1. Model the prob. of an event occurring depending on the values of the I.Vs, which can be categorical or numerical
2. Estimate the prob. that an event occurs for a randomly selected observation vs the prob the event does not occur
3. Predict the effect of a series of variables on a binary response variable (0 or 1)
4. Classify observations by estimating the prob that an observation is in a particular category (such as approved or not approved)
(Model, estimate, predict, and classify)

Answer 93

b/c simple linear regression is one quantitative variable predicting another

Answer 94

multiple linear regression is simple linear regression with more i.vs

Answer 95

still 2 quantitative variables but the data are curvilinear

Answer 96

1. Binary data (1 or 0) does not have a normal distribution, which is a condition needed for most other types of regression
2. Predicted values of the DV can fall outside 0 and 1, which violates the definition of probability
3. Probabilities are often not linear, such as "U" shapes where the prob is very low or very high at the extremes of the x-values

Answer 97

1. Binary data (1 or 0) does not have a normal distribution, which is a condition needed for most other types of regression
2. Predicted values of the DV can fall outside 0 and 1, which violates the definition of probability
3. Probabilities are often not linear, such as "U" shapes where the prob is very low or very high at the extremes of the x-values

Answer 98

odds = Prob (occurring)/ Prob (not occurring) = p / (1-p)

Answer 99

odds = Prob (occurring)/ Prob (not occurring) = p / (1-p)

Answer 100

in logistic regression

Answer 101

the impact on the odds of a one-unit increase in only one of the I.Vs | odds ratio = (the odds that y = 1 given that one of the IVs has been increased by 1 unit (odds1)) / (the odds that y = 1 given no change in the values (odds0))
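
A Python sketch of odds and the odds ratio for a logistic model with a single independent variable. The coefficients and function names are hypothetical. The final comparison relies on a standard property of logistic regression, stated here as background rather than taken from the cards: the odds ratio for a one-unit increase in xi equals e raised to Bi.

```python
import math


def odds(p: float) -> float:
    """Odds = prob(occurring) / prob(not occurring) = p / (1 - p)."""
    return p / (1 - p)


def logistic_prob(b0: float, b1: float, x: float) -> float:
    """P(y = 1) from the logistic regression equation with one IV."""
    z = b0 + b1 * x
    return math.exp(z) / (1 + math.exp(z))


# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -2.0, 0.5

odds0 = odds(logistic_prob(b0, b1, 3.0))  # no change in x
odds1 = odds(logistic_prob(b0, b1, 4.0))  # x increased by one unit
odds_ratio = odds1 / odds0

print(round(odds_ratio, 4), round(math.exp(b1), 4))
```

The two printed values agree, and the ratio does not depend on the starting value of x.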

Answer 102

an unknown p for any given linear combination of the I.Vs (so the prob of success is p and the prob of failure is q = 1-p)

Answer 103

H0: B1 = B2 = 0

Answer 104

follows a chi-square distribution | df = # of I.Vs

Answer 105

we reject HO and conclude that the overall model is significant

Answer 106

overall significance

Answer 107

overall significance

Answer 108

used to determine whether each of the I.V.s is making a significant contribution to the overall model

Answer 109

the estimated coefficient divided by its standard error follows a normal prob distribution

Answer 110

A measure of the goodness of fit of the estimated multiple regression equation that adjusts for the number of independent variables in the model and thus avoids overestimating the impact of adding more independent variables

Answer 111

An independent variable with categorical data

Answer 112

A measure of the influence of an observation based on both the leverage of observation i and the residual for observation i

Answer 113

A variable used to model the effect of categorical independent variables. A dummy variable may take only the value zero or one

Answer 114

The term used to describe the correlation among the independent variables

Answer 115

A measure of the goodness of fit of the estimated multiple regression equation. It can be interpreted as the proportion of the variability in the dependent variable that is explained by the estimated regression equation

Answer 116

Regression analysis involving two or more independent variables.

Answer 117

The mathematical equation relating the expected value or mean value of the dependent variable to the values of the independent variables; that is, E(y) = B0 + B1x1 + B2x2 + . . . + Bpxp.

Answer 118

The mathematical equation that describes how the dependent variable y is related to the independent variables x1, x2, . . . , xp and an error term e.

Answer 119

The probability the event will occur divided by the probability the event will not occur

Answer 120

The odds that y = 1 given that one of the independent variables increased by one unit (odds1) divided by the odds that y = 1 given no change in the values for the independent variables (odds0); that is, Odds ratio = odds1/odds0


(149 cards)