Exam 1 Flashcards
Why use models?
To understand the relationships between variables
To predict future outcomes
To quantify differences between groups or treatments
Response variable
the variable that you want to understand/model/predict. aka - y, dependent variable
explanatory variables
the variables you know and suspect are related to the response, which you use to find a pattern/model/relationship. aka - x, independent variables, predictor variables, covariates
model
a function that combines explanatory variables mathematically into estimates of the response variable
error
what’s left over; the variability in the response that your model doesn’t capture (error is somewhat of a misnomer; noise may be a better term)
Categorical Data
Values fall into categories (labels or groups) rather than numbers; there may be two or more possible outcomes
Quantitative variables
Numerical
Parameter
Describes entire population
Statistic
Describes sample
The four-step process
- Choose
- Fit
- Assess
- Use
Model Notation
Y = f(X) + e
ybar or xbar
averages
yhat
estimate
Y = ? (Simple Linear Regression)
Beta0 + Beta1*X + e
Yhat = ? (Simple Linear Regression)
Beta0 + Beta1*X
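A minimal Python sketch of fitting this model by least squares (the data values are made up for illustration):

```python
import numpy as np

# Hypothetical toy data: here y is exactly 2 + 3x, so the fit is exact
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 + 3.0 * x

# Closed-form least-squares estimates for Yhat = Beta0 + Beta1*X
xbar, ybar = x.mean(), y.mean()
beta1_hat = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
beta0_hat = ybar - beta1_hat * xbar

yhat = beta0_hat + beta1_hat * x  # fitted values (no error term)
print(beta0_hat, beta1_hat)  # 2.0 3.0
```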
Naive Model
Mean + Error
Age = Agebar + e
Residuals
The vertical distance from each point to the prediction line
e = y - yhat (observed minus predicted)
Least Squares
Technique that chooses the fitted line so that SSE is minimized
The sum of all squared residuals is at a minimum
SSE
SSE = ∑(yi − yhati)^2
Regression Standard Error
σhat = sqrt(SSE / (n - 2))
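A short Python sketch of computing SSE and the regression standard error (the data here are hypothetical):

```python
import numpy as np

# Hypothetical data with some scatter around a line
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit by least squares, then measure what the model misses
xbar, ybar = x.mean(), y.mean()
b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
b0 = ybar - b1 * xbar

resid = y - (b0 + b1 * x)                # residuals: observed - predicted
sse = np.sum(resid ** 2)                 # SSE = sum of squared residuals
sigma_hat = np.sqrt(sse / (len(x) - 2))  # regression standard error
```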
Linearity
The scatterplot of y vs. x looks like a straight line; the residual plot shows no curved pattern
Independence
Residuals do not depend on one another (e.g., on time or order); they don’t get bigger or smaller as the plot goes on
Normality of Residuals:
The residuals are distributed symmetrically around zero in a roughly bell-shaped (normal) pattern, with no strong skewness or heavy tails.
Equal Variance of Residuals (homoskedasticity):
Residuals have roughly constant spread across all fitted values (no fanning out).
Standardized Residual
ei / σhat = (yi - yhati) / σhat
If its absolute value is greater than 3, the point is considered an outlier
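One way to compute standardized residuals in Python (the data are invented; the "greater than 3" rule is the rule of thumb from the card above):

```python
import numpy as np

# Hypothetical data where the last point sits far from the line
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0, 40.0])

xbar, ybar = x.mean(), y.mean()
b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
b0 = ybar - b1 * xbar

resid = y - (b0 + b1 * x)
sigma_hat = np.sqrt(np.sum(resid ** 2) / (len(x) - 2))

# Standardized residuals; the rule of thumb flags |value| > 3 as an outlier
std_resid = resid / sigma_hat
```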
Leverage
Points that have extreme x values can have a disproportionate influence on the slope of the regression line
Hypothesis Testing
H0: B1 = 0
HA: B1 ≠ 0
Test Statistic
t = B1hat / SE(B1hat), compared to a t distribution with n - 2 degrees of freedom
Confidence Interval for Slope
B1hat +/- t* · SE(B1hat)
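A Python sketch of the slope test and confidence interval (the data are hypothetical, and the critical value t* = 2.447 is the 95% value for 6 df taken from a t table):

```python
import numpy as np

# Hypothetical data with a clearly positive trend
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.3, 2.9, 4.1, 4.8, 6.2, 6.8, 8.1, 8.7])

n = len(x)
xbar, ybar = x.mean(), y.mean()
sxx = np.sum((x - xbar) ** 2)
b1 = np.sum((x - xbar) * (y - ybar)) / sxx

resid = y - ((ybar - b1 * xbar) + b1 * x)
sigma_hat = np.sqrt(np.sum(resid ** 2) / (n - 2))
se_b1 = sigma_hat / np.sqrt(sxx)  # standard error of the slope

t_stat = b1 / se_b1               # test statistic for H0: Beta1 = 0
t_crit = 2.447                    # t* for 95% confidence, n - 2 = 6 df (t table)
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)  # 95% CI for the slope
```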
Coefficient of determination
R^2, How much of the variability is explained by the model
Partitioning variability
ANOVA
(yi - ybar) = (yhati - ybar) + (yi - yhati)
SST
∑(yi - ybar)^2
SSM
∑(yhati - ybar)^2
SST, SSM, SSE Relationship
SST = SSM + SSE
R^2 =
SSM/SST
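A Python sketch of partitioning variability and computing R^2 (hypothetical data):

```python
import numpy as np

# Hypothetical data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

xbar, ybar = x.mean(), y.mean()
b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
b0 = ybar - b1 * xbar
yhat = b0 + b1 * x

sst = np.sum((y - ybar) ** 2)      # total variability
ssm = np.sum((yhat - ybar) ** 2)   # variability explained by the model
sse = np.sum((y - yhat) ** 2)      # variability left over

r_squared = ssm / sst              # SST = SSM + SSE, so 0 <= R^2 <= 1
```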
Confidence Interval
yhat +/- t* · σhat · sqrt(1/n + (x* - xbar)^2 / ∑(xi - xbar)^2) (interval for the mean response at x*)
Prediction Interval
yhat +/- t* · σhat · sqrt(1 + 1/n + (x* - xbar)^2 / ∑(xi - xbar)^2) (interval for a single new response at x*; always wider than the CI)
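A small Python comparison of the two square-root factors (x data and the new value x* are made up); the extra "1 +" inside the prediction-interval factor is why it is always wider:

```python
import numpy as np

# Hypothetical x data and a hypothetical new value x*
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
n = len(x)
xbar = x.mean()
sxx = np.sum((x - xbar) ** 2)
x_star = 4.0

se_mean = np.sqrt(1 / n + (x_star - xbar) ** 2 / sxx)      # CI factor
se_pred = np.sqrt(1 + 1 / n + (x_star - xbar) ** 2 / sxx)  # PI factor
# The extra "1 +" makes the prediction interval always wider than the CI
```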
MLR
Y = B0 + B1*X1 + B2*X2 + … + Bp*Xp + e
MLR with categorical data
Parallel slopes model: code the category as a 0/1 indicator variable, so each category gets its own intercept but shares the same slope
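A Python sketch of a parallel slopes fit using a 0/1 indicator (the data and the group effect of 5 are invented for illustration):

```python
import numpy as np

# Hypothetical data: a quantitative x plus a 0/1 indicator for the category
x = np.array([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0])
group = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
y = 1.0 + 2.0 * x + 5.0 * group  # two parallel lines, 5 units apart

# Design matrix [1, x, indicator] gives each group its own intercept
# but forces a single common slope (the parallel slopes model)
X = np.column_stack([np.ones_like(x), x, group])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta = [baseline intercept, common slope, shift for the indicator group]
```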
When do we reject H0?
When the p-value < 0.05 (at the 5% significance level)