lecture 5 summary Flashcards
Hypothesis test
checks whether an observed difference could have arisen by chance through sampling error, or whether it indicates a real difference between the populations from which the two samples were drawn
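Two-sided test
rejects the null hypothesis when the difference value falls within either of the two end intervals (tails) of the distribution
A one-sided test states that
the difference value must fall within one specific end interval (tail), chosen in advance, for the null hypothesis to be rejected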
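The contrast between the two kinds of test can be sketched numerically. The snippet below is a minimal illustration, assuming the test statistic is a z-score that follows a standard normal distribution under the null hypothesis; the value 1.8 and the function names are hypothetical and not taken from the lecture.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_values(z: float) -> dict:
    """Return one-sided and two-sided p-values for a z statistic."""
    upper = 1.0 - normal_cdf(z)           # one-sided: only the upper tail counts
    lower = normal_cdf(z)                 # one-sided: only the lower tail counts
    two_sided = 2.0 * min(upper, lower)   # two-sided: both tails count
    return {"upper": upper, "lower": lower, "two_sided": two_sided}


# Hypothetical z statistic, assumed for illustration only.
result = p_values(1.8)
print(round(result["upper"], 3))      # about 0.036
print(round(result["two_sided"], 3))  # about 0.072
```

With a 5% significance level, this assumed statistic would lead to rejection in the one-sided (upper tail) test but not in the two-sided test, because the two-sided test splits the rejection region across both tails.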
Type 1 error
a hypothesis test that has been declared positive (the null hypothesis is rejected) while in reality there is no effect: a false positive
Type 2 error
a hypothesis test that has been declared negative (the null hypothesis is not rejected) while in reality there is an effect: a false negative
Chi-square statistic
Is a function of the squared deviations of the observed counts n_i from their expected values E(n_i) (under some null hypothesis), weighted by the reciprocals of their expected values:
chi^2 = sum_{i=1}^{K} (n_i - E(n_i))^2 / E(n_i)
with K-1 degrees of freedom, where K is the number of categories
This test is mostly used by managers as a test for independence
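As a rough sketch of how the statistic is computed, the following example sums the squared deviations of observed counts from expected counts, each divided by the expected count. The counts are made up for illustration (a goodness-of-fit setting with K = 5 equally likely categories), and the function name is hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((n - e) ** 2 / e for n, e in zip(observed, expected))


# Hypothetical counts: 100 observations spread over K = 5 categories.
observed = [18, 22, 20, 25, 15]
# Expected counts under the null hypothesis that all categories are equally likely.
expected = [20.0, 20.0, 20.0, 20.0, 20.0]

chi2 = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # K - 1

print(chi2)                # 2.9
print(degrees_of_freedom)  # 4
```

The resulting value would then be compared with the critical value of the chi-square distribution for 4 degrees of freedom (9.488 at the 5% level), so in this made-up case the null hypothesis would not be rejected.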
Regression analysis
quantifies the slope of a regression line. It is designed to estimate the influence of one variable (X) on another variable (Y)
R^2 expresses the proportion of the variance in the dependent variable (Y) that is explained by the regression line (value range: 0 to 1)
We are interested in knowing how good our predictions are. For this
we use R squared. This is the measure of the regression model's ability to predict, also called the coefficient of determination of a model. It indicates how well a model explains the variance of the dependent variable
R^2 formula
R^2 = (regression coefficient)^2 * (Variance of X / Variance of Y)
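The formula can be checked on a small example. The sketch below assumes a simple linear regression with a single independent variable X (the case in which this formula applies), uses made-up data, and compares the result with the equivalent "1 minus unexplained share of variance" calculation; all names are illustrative.

```python
def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Sample variance (divides by n - 1)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


# Hypothetical data, assumed for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# Least-squares slope (the regression coefficient) and intercept.
slope = covariance(xs, ys) / variance(xs)
intercept = mean(ys) - slope * mean(xs)

# R^2 from the flashcard formula.
r2_formula = slope ** 2 * variance(xs) / variance(ys)

# R^2 as 1 - (residual sum of squares / total sum of squares).
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - mean(ys)) ** 2 for y in ys)
r2_residual = 1.0 - ss_res / ss_tot

print(r2_formula, r2_residual)  # the two values agree up to rounding error
```

Both routes give the same number here because, with one independent variable, the variance explained by the line equals the squared slope times the variance of X.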
Regression analysis can be used for various purposes
Explanation of relationships
Simulation of effects
Prediction
Identification of driving factors
Linear regression makes several key assumptions
Multiple linear regression requires at least two independent variables
There should be a linear relationship between the dependent and independent variables
The error term is normally distributed
No multicollinearity: multiple regression assumes that the independent variables are not highly correlated with each other
Homoscedasticity: this indicates that the variance of the error terms is similar across the values of the independent variables
Sample size (a rule of thumb): regression analysis requires at least 20 cases per independent variable
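Two of the assumptions above lend themselves to a quick check before fitting a model. The sketch below is only one possible approach: it applies the 20-cases-per-variable rule of thumb and flags pairs of independent variables whose Pearson correlation is high. The 0.8 threshold, the variable names, and the data are assumptions made for illustration, not values from the lecture.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def enough_cases(n_cases, n_predictors, cases_per_predictor=20):
    """Rule of thumb: at least 20 cases per independent variable."""
    return n_cases >= cases_per_predictor * n_predictors


def highly_correlated_pairs(predictors, threshold=0.8):
    """Return (name_a, name_b, r) for predictor pairs with |r| >= threshold.

    The threshold of 0.8 is an assumed cut-off, not a fixed standard.
    """
    names = list(predictors)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(predictors[names[i]], predictors[names[j]])
            if abs(r) >= threshold:
                flagged.append((names[i], names[j], r))
    return flagged


# Hypothetical independent variables, assumed for illustration only.
predictors = {
    "price": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "price_eur": [1.1, 2.0, 3.2, 3.9, 5.1, 6.0],
    "ads": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}

flagged = highly_correlated_pairs(predictors)
print([(a, b) for a, b, _ in flagged])                 # [('price', 'price_eur')]
print(enough_cases(n_cases=6, n_predictors=3))         # False
```

Pairwise correlations only catch the simplest form of multicollinearity, so a check like this is a first screen rather than a full diagnosis.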