Regression Flashcards
Regression modelling steps
Define problem or question
Specify model
Collect data
Do descriptive data analysis
Estimate unknown parameters
Evaluate model
Use model for prediction
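The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed workflow: the data are simulated (the "collect data" step is faked with made-up values), the model is a simple straight line, and names such as x_new and r_squared are chosen here for illustration only.

```python
import numpy as np

# Collect data: simulated here, with made-up true values (intercept 2.0, slope 1.5).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 1.5 * x + rng.normal(0, 1.0, size=50)

# Descriptive data analysis: means and the correlation between x and y.
corr = np.corrcoef(x, y)[0, 1]
print("mean x:", x.mean(), "mean y:", y.mean(), "correlation:", corr)

# Estimate unknown parameters by least squares (polyfit returns slope first).
slope, intercept = np.polyfit(x, y, deg=1)

# Evaluate model: proportion of variance in y explained by the line (R squared).
fitted = intercept + slope * x
ss_res = np.sum((y - fitted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# Use model for prediction at a new, hypothetical x value.
x_new = 12.0
y_pred = intercept + slope * x_new
print("slope:", slope, "intercept:", intercept, "R^2:", r_squared, "prediction:", y_pred)
```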
Regression model is used to
Predict the value of a dependent variable based on the value of at least one
independent variable
Explain the impact of changes in an independent variable on the dependent
variable
A low correlation coefficient
gives a flatter slope (for a given spread of X and Y)
Large spread of Y, i.e. high standard deviation in y,
results in a
steeper slope (high value of a)
Large spread of X, i.e. high standard deviation in x,
results in a
flatter slope (low value of a)
The smaller the correlation,
the closer the
intercept is to the mean of y
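The four cards above all follow from the least-squares formulas slope = r * (sd of y / sd of x) and intercept = mean of y - slope * mean of x. The sketch below checks each statement with made-up summary values; the function name line_from_summary and the symbol b for the intercept are assumptions introduced here (the cards only name the slope, a).

```python
def line_from_summary(r, sd_x, sd_y, mean_x, mean_y):
    """Return (a, b) for the least-squares line y = a * x + b."""
    a = r * sd_y / sd_x          # slope
    b = mean_y - a * mean_x      # intercept
    return a, b

# Baseline with hypothetical summary statistics.
base = line_from_summary(r=0.5, sd_x=2.0, sd_y=4.0, mean_x=10.0, mean_y=20.0)

# Larger spread of Y: the slope gets steeper.
wide_y = line_from_summary(r=0.5, sd_x=2.0, sd_y=8.0, mean_x=10.0, mean_y=20.0)

# Larger spread of X: the slope gets flatter.
wide_x = line_from_summary(r=0.5, sd_x=4.0, sd_y=4.0, mean_x=10.0, mean_y=20.0)

# Zero correlation: the slope is 0 and the intercept equals the mean of y.
no_corr = line_from_summary(r=0.0, sd_x=2.0, sd_y=4.0, mean_x=10.0, mean_y=20.0)

print(base, wide_y, wide_x, no_corr)
```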
Assumptions of the regression line
Linearity
The relationship between X and Y is linear
Independence of Errors
Error values are statistically independent
Particularly important when data are collected over a period of time
Normality of Errors
Error values are normally distributed for any given value of X
Equal Variance (also called homoscedasticity)
The probability distribution of the errors has constant variance
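The error assumptions are usually checked on the residuals of a fitted line. The sketch below shows rough numeric checks only, not formal tests, on simulated data ordered in time. The Durbin-Watson statistic is one common check for independence (values near 2 suggest no lag-1 autocorrelation); the variance ratio and the skewness and kurtosis summaries are informal stand-ins chosen here for illustration.

```python
import numpy as np

# Simulated time-ordered data with made-up true values.
rng = np.random.default_rng(1)
t = np.arange(100, dtype=float)
y = 1.0 + 0.5 * t + rng.normal(0, 2.0, size=t.size)

# Fit the line and compute residuals (observed minus fitted).
slope, intercept = np.polyfit(t, y, deg=1)
resid = y - (intercept + slope * t)

# Independence of errors: Durbin-Watson statistic, always between 0 and 4.
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

# Equal variance: compare residual variance in the later half with the earlier half.
half = t.size // 2
var_ratio = resid[half:].var(ddof=1) / resid[:half].var(ddof=1)

# Normality of errors: skewness and excess kurtosis are both 0 for a normal distribution.
z = (resid - resid.mean()) / resid.std()
skew = np.mean(z ** 3)
excess_kurtosis = np.mean(z ** 4) - 3.0

print("Durbin-Watson:", dw)
print("variance ratio (late / early):", var_ratio)
print("skewness:", skew, "excess kurtosis:", excess_kurtosis)
```

Linearity is most easily judged by plotting the residuals against X and looking for curvature or other patterns.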
General linear model
any model that describes the outcome as a linear combination of
predictors plus an error term (a straight line when there is one predictor)
Akaike information criterion
based on the concept of selecting the
model that produces a probability distribution that is closest to the (yet unknown)
true distribution
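In practice the criterion is computed as AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood; the model with the lowest AIC is preferred. The sketch below assumes least-squares fits with normally distributed errors, for which AIC equals n * ln(RSS / n) + 2k up to an additive constant that is the same for every model fitted to the same data. The helper gaussian_aic and the simulated data are illustrative assumptions.

```python
import numpy as np

def gaussian_aic(rss, n, k):
    """AIC for a least-squares fit with normal errors, dropping additive constants."""
    return n * np.log(rss / n) + 2 * k

# Simulated data from a straight line with made-up true values.
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=60)
y = 3.0 + 0.8 * x + rng.normal(0, 1.0, size=x.size)

# Compare polynomial models of increasing complexity.
aic_by_degree = {}
for degree in (1, 2, 3):
    coeffs = np.polyfit(x, y, deg=degree)
    rss = np.sum((y - np.polyval(coeffs, x)) ** 2)
    k = degree + 2  # degree + 1 coefficients, plus the error variance
    aic_by_degree[degree] = gaussian_aic(rss, x.size, k)

# The candidate with the lowest AIC is preferred.
best_degree = min(aic_by_degree, key=aic_by_degree.get)
print(aic_by_degree, "preferred degree:", best_degree)
```

Extra parameters always reduce RSS, so the 2k term acts as the penalty that keeps needlessly complex models from winning.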