Regression Analysis Flashcards
regression analysis
the study of the dependence of one variable (the dependent variable) on one or more other variables (the explanatory variables)
correlation analysis
measures the strength or degree of linear association between two variables
ex. correlation between smoking and lung cancer
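As a rough illustration of how the strength of linear association can be measured, here is a minimal sketch of the sample correlation coefficient (Pearson's r). The function name `pearson_r` and the numbers are hypothetical, chosen only for the example.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up values, for illustration only
x = [0, 5, 10, 20, 30]
y = [1, 2, 4, 7, 9]
r = pearson_r(x, y)  # close to +1: strong positive linear association
```

r lies between -1 and +1; values near 0 indicate little linear association.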
explanatory variable
same as independent variable
simple regression
studies the dependence of a variable only on a single explanatory variable
multiple regression
one dependent variable but multiple independent variables
time series data
a set of observations on the values that a variable takes at different times
cross-sectional data
data of one or more variables collected at the same point in time
multiple regression formula
Y = B(0) + B(1)X(1) + … + B(n)X(n) + error
big B = population parameter (true value, usually unknown)
small b = estimate of B calculated from a sample
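The sample version of the formula can be sketched as a small function that turns estimated coefficients into a fitted value of Y. The name `predict` and the coefficient values are assumptions made up for this example, not estimates from real data.

```python
def predict(b0, bs, xs):
    """Fitted value: b(0) + b(1)X(1) + ... + b(n)X(n), with no error term."""
    return b0 + sum(b * x for b, x in zip(bs, xs))

# Hypothetical estimates b(0)=1.0, b(1)=2.0, b(2)=-0.5
# and one observation with X(1)=3.0, X(2)=4.0
y_hat = predict(1.0, [2.0, -0.5], [3.0, 4.0])  # 1 + 6 - 2 = 5.0
```

The difference between an observed Y and its fitted value is the residual, the sample counterpart of the error term.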
B(0)
intercept term
the expected value of Y when every X equals 0
can also be interpreted as the average effect on Y of all the variables excluded from the model
ordinary least squares
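method of fitting a straight line to a sample of data by minimizing the sum of squared residuals (the vertical distances between the observed Y values and the line)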
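For simple regression, the least squares line has a closed-form solution: the slope is the sum of cross-deviations divided by the sum of squared deviations of X, and the intercept makes the line pass through the point of means. A minimal sketch, with the hypothetical name `ols_simple` and made-up data that lie exactly on a line:

```python
def ols_simple(xs, ys):
    """Return (b0, b1) for the least squares line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept: line passes through the means
    return b0, b1

# Illustrative data generated from y = 1 + 2x with no error
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
b0, b1 = ols_simple(xs, ys)  # recovers intercept 1 and slope 2
```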
R squared
measures how well the estimated regression fits the data (how close the observations are to the line)
ranges from 0 to 1
the higher the value, the larger the share of the variation in Y that the model explains
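One common way to compute R squared is 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes that definition; the function name `r_squared` and the numbers are hypothetical.

```python
def r_squared(ys, fitted):
    """R squared = 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

ys = [2.0, 4.0, 6.0, 8.0]
perfect = r_squared(ys, ys)            # fitted values equal the data: 1.0
mean_only = r_squared(ys, [5.0] * 4)   # always predicting the mean: 0.0
```

The two extremes show the scale: a line through every point gives 1, and a model no better than the mean of Y gives 0.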
adjusted r squared
R squared adjusted for the number of independent variables, to avoid overestimating the impact of adding an independent variable
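The usual adjustment is 1 - (1 - R squared)(n - 1)/(n - k - 1), where n is the number of observations and k the number of independent variables. A minimal sketch under that assumption; the function name and the input values are made up for illustration.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R squared for n observations and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical model: R squared of 0.80 with 30 observations
adj_3 = adjusted_r_squared(0.80, 30, 3)  # about 0.777
adj_6 = adjusted_r_squared(0.80, 30, 6)  # lower: more variables, same fit
```

With the same R squared, the model that uses more variables gets the lower adjusted value, which is the penalty at work.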
F-test
overall significance
shows if there is a linear relationship between all of the x variables considered together and y
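For a regression with an intercept, the overall F statistic can be written in terms of R squared: F = (R squared / k) / ((1 - R squared) / (n - k - 1)). The sketch below assumes that form; the function name `f_statistic` and the inputs are hypothetical.

```python
def f_statistic(r2, n, k):
    """Overall F statistic for n observations and k independent variables."""
    explained = r2 / k                    # explained variation per variable
    unexplained = (1 - r2) / (n - k - 1)  # unexplained variation per degree of freedom
    return explained / unexplained

# Hypothetical model: R squared of 0.80, 30 observations, 3 x variables
f = f_statistic(0.80, 30, 3)  # about 34.67
```

The p-value comes from comparing this statistic with an F distribution with k and n - k - 1 degrees of freedom, which is normally done with statistical software or tables.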
f-test conclusions
p-value < 5%: reject the null hypothesis; the x variables, taken together, are significant in explaining y
p-value > 5%: cannot reject the null hypothesis; the x variables, taken together, are not significant
null hypothesis
the hypothesis that no statistically significant relationship exists
for a single variable, states that its coefficient is no different from 0; the test tries to reject this