Regression Analysis Flashcards
What is regression analysis?
Regression analysis uses a mathematical model to predict a variable (y) based on values of other variables (x1, x2, … xk). It is the process of finding the mathematical model that relates y to a set of independent variables and best fits the data.
What is a dependent variable?
A dependent variable (y), aka response variable, is the variable to be modelled and/or predicted.
What is an independent variable?
Independent variables (x1, x2, … xk) are the variables used to predict the response variable.
What is the equation for a probabilistic model?
y = E(y) + ε
Where E(y) = mean of y (i.e. the expected value of y)
Where ε = some random error
A probabilistic model is based on the theory of probability or the fact that _______________ plays a role in predicting future events.
randomness
What is the difference between a probabilistic model and a deterministic model?
A probabilistic model is based on the fact that randomness plays a role in predicting future events.
A deterministic model has no random component: it assumes y can be predicted exactly from the values of x.
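The contrast can be seen in a minimal Python sketch. The straight-line mean E(y) = 2 + 3x, the normal error distribution, and the function names are assumptions chosen for illustration; none of them come from the cards.

```python
import random

def deterministic_y(x):
    # Hypothetical deterministic relation: y is fully determined by x.
    return 2.0 + 3.0 * x

def probabilistic_y(x, rng, sigma=1.0):
    # Same mean E(y) = 2 + 3x, plus a random error ε drawn from N(0, sigma^2).
    return deterministic_y(x) + rng.gauss(0.0, sigma)

rng = random.Random(0)  # seeded so the run is repeatable
same_x = 4.0
det = [deterministic_y(same_x) for _ in range(3)]
prob = [probabilistic_y(same_x, rng) for _ in range(3)]
print(det)   # the same value every time
print(prob)  # values scatter around E(y) = 14
```

Repeating the deterministic call at the same x always returns the same y, while the probabilistic version returns a different y on each draw.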
There are 7 major steps in regression analysis.
What are they?
(Hint: H, C, U, E, U, V, I)
- Hypothesize the form of the model for E(y), the expected value of y.
- Collect the sample data.
- Estimate the unknown parameters in the model using the sample data.
- Specify the probability distribution of ε (random error) and estimate any unknown parameters.
- Statistically check model adequacy.
- Check the validity of the assumptions on the random error; make modifications if necessary.
- Use the model for prediction and estimation.
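The steps above can be sketched end to end for the simplest case. This is an illustration under stated assumptions, not a prescribed procedure: the straight-line model, the simulated data, the least-squares estimator, and the normal error distribution are all choices made here, since the cards do not name a specific method or data set.

```python
import random
import statistics

# Step 1: hypothesize E(y) = b0 + b1*x (a straight-line model, assumed here).
# Step 2: collect sample data (simulated, with hypothetical true values 1 and 2).
rng = random.Random(42)
xs = [float(i) for i in range(1, 21)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 0.5) for x in xs]

# Step 3: estimate the unknown parameters by least squares.
x_bar = statistics.mean(xs)
y_bar = statistics.mean(ys)
ss_xx = sum((x - x_bar) ** 2 for x in xs)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b1_hat = ss_xy / ss_xx
b0_hat = y_bar - b1_hat * x_bar

# Step 4: assume ε ~ N(0, σ²) and estimate σ² with s² = SSE / (n - 2).
residuals = [y - (b0_hat + b1_hat * x) for x, y in zip(xs, ys)]
sse = sum(e ** 2 for e in residuals)
s2 = sse / (len(xs) - 2)

# Step 5: check model adequacy, here with the coefficient of determination R².
ss_yy = sum((y - y_bar) ** 2 for y in ys)
r_squared = 1.0 - sse / ss_yy

# Step 6: a basic look at the error assumptions. Standardized residuals far
# beyond about 3 would suggest outliers or non-normal errors; a fuller check
# would also inspect residual plots.
max_std_resid = max(abs(e) for e in residuals) / s2 ** 0.5

# Step 7: use the fitted model for prediction at a new x.
y_hat_at_25 = b0_hat + b1_hat * 25.0
print(b0_hat, b1_hat, s2, r_squared, max_std_resid, y_hat_at_25)
```

Because the data are simulated from known values, the estimates should land close to 1 and 2, which makes it easy to see what each step contributes.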
There are 6 steps in regression for a probabilistic model [y = E(y) + ε].
What are they?
(Hint: H, C, E, P, C, P)
- Hypothesize the form of the model for E(y), the expected value of y.
- Collect the sample data.
- Estimate the unknown parameters in the model.
- Specify the probability distribution of ε.
- Statistically check model adequacy.
- Use the model for prediction and estimation.
There are two types of regression data.
What are they?
Observational: where values of x are uncontrolled.
Experimental: where values of x are controlled via a designed experiment.
What is the difference between simple linear regression and multiple regression?
Simple Linear Regression involves a single independent variable.
Multiple Regression involves two or more independent variables.
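As a small sketch of the difference, the same first-order formula covers both cases; only the number of independent variables changes. The function name `predict_mean` and the coefficient values are hypothetical.

```python
def predict_mean(betas, xs):
    """E(y) = beta0 + beta1*x1 + ... + betak*xk for a first-order model."""
    if len(betas) != len(xs) + 1:
        raise ValueError("need one coefficient per variable plus an intercept")
    return betas[0] + sum(b * x for b, x in zip(betas[1:], xs))

# Simple linear regression: one independent variable (k = 1).
simple = predict_mean([1.0, 2.0], [3.0])                # 1 + 2*3 = 7.0
# Multiple regression: two or more independent variables (k = 2 here).
multiple = predict_mean([1.0, 2.0, -0.5], [3.0, 4.0])   # 1 + 2*3 - 0.5*4 = 5.0
print(simple, multiple)
```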
How does this equation need to be modified to be considered a prediction equation?
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$
![](https://s3.amazonaws.com/brainscape-prod/system/cm/348/583/927/q_image_thumb.png?1623547531)
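We would replace E(y) with ŷ and each β with its estimate β̂ computed from the sample data:
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{\beta}_3 x_1 x_2 + \hat{\beta}_4 x_1^2 + \hat{\beta}_5 x_2^2$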
where ŷ is the predicted value of y.
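Once the estimates are known, the prediction equation is plain arithmetic. The sketch below uses hypothetical estimates, chosen only to make the calculation easy to follow; they are not results from any data set.

```python
def y_hat(b_hat, x1, x2):
    """Prediction equation for the second-order model in x1 and x2."""
    b0, b1, b2, b3, b4, b5 = b_hat
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2 + b4 * x1 ** 2 + b5 * x2 ** 2

# Hypothetical estimates of β0 through β5.
b_hat = (1.0, 2.0, 3.0, 0.5, -1.0, 0.25)
print(y_hat(b_hat, x1=2.0, x2=4.0))  # 1 + 4 + 12 + 4 - 4 + 4 = 21.0
```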
What is missing from the equation if we want to use it as a probabilistic model?
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$
![](https://s3.amazonaws.com/brainscape-prod/system/cm/348/583/860/q_image_thumb.png?1623547368)
A probabilistic model must include the random error term ε. Because y = E(y) + ε, the left-hand side becomes y and the equation reads as follows:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2 + \varepsilon$
In the equation for the deterministic model pictured, what do β0, β1, β2, β3, β4, β5 represent?
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$
![](https://s3.amazonaws.com/brainscape-prod/system/cm/348/583/770/q_image_thumb.png?1623547202)
β0,β1,β2,β3,β4,β5 are constants with values that would have to be estimated from the sample data.
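One common way to estimate these constants is least squares, sketched below in plain Python. The cards do not say which estimator is intended, so the method, the helper names, the grid of sample points, and the "true" β values are all assumptions. The sample is generated without noise so that the fit recovers the values it started from.

```python
def design_row(x1, x2):
    # One row of the design matrix: [1, x1, x2, x1*x2, x1^2, x2^2].
    return [1.0, x1, x2, x1 * x2, x1 ** 2, x2 ** 2]

def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z

def least_squares(points, ys):
    """Estimate the betas by solving the normal equations (X'X) b = X'y."""
    rows = [design_row(x1, x2) for x1, x2 in points]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical "true" betas and a 5-by-5 grid of (x1, x2) sample points.
true_betas = [1.0, 2.0, 3.0, 0.5, -1.0, 0.25]
points = [(float(a), float(b)) for a in range(-2, 3) for b in range(-2, 3)]
ys = [sum(b * t for b, t in zip(true_betas, design_row(x1, x2)))
      for x1, x2 in points]

beta_hat = least_squares(points, ys)
print([round(b, 6) for b in beta_hat])
```

With real data the observations would include random error, so the estimates would be close to the underlying constants rather than equal to them.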
In the equation for the deterministic model pictured, what does E(y) represent?
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$
![](https://s3.amazonaws.com/brainscape-prod/system/cm/348/583/504/q_image_thumb.png?1623547111)
E(y) represents the mean percentage price increase for a given set of values of x1 and x2.
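To see why E(y) is read as a mean, the sketch below holds x1 and x2 fixed, draws many values of y = E(y) + ε, and averages them. The β values, the normal error with standard deviation 1, and the sample size are hypothetical.

```python
import random
import statistics

def mean_response(betas, x1, x2):
    # E(y) for the second-order model in x1 and x2.
    b0, b1, b2, b3, b4, b5 = betas
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2 + b4 * x1 ** 2 + b5 * x2 ** 2

betas = (1.0, 2.0, 3.0, 0.5, -1.0, 0.25)  # hypothetical values
rng = random.Random(1)                     # seeded so the run is repeatable
e_y = mean_response(betas, x1=1.0, x2=2.0)

# Many observations of y at the same (x1, x2): each is E(y) plus random error.
sample = [e_y + rng.gauss(0.0, 1.0) for _ in range(10000)]
sample_mean = statistics.mean(sample)
print(e_y, sample_mean)
```

Individual y values differ from E(y), but their average settles near it as the number of observations grows.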
What is the purpose of collecting sample data for regression analysis?
The purpose of collecting sample data is to estimate the unknown parameters of a regression model (the β's).