General 4 Flashcards
We are 95 percent confident that the interval (0.0267, 0.02883), from the lower to the upper bound,
contains the true slope of the regression line. In other words, we are 95 percent confident that the actual population slope lies within this interval.
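The card above can be sketched numerically. This is a minimal pure-Python example with made-up data (the numbers are illustrative, not from the card); the 95% CI for the slope is slope ± t_crit × SE(slope), with n − 2 degrees of freedom.

```python
# Sketch: 95% confidence interval for a regression slope (made-up data).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 1.9, 3.1, 3.9, 5.2, 5.8, 7.1, 8.2]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = my - slope * mx
# residual standard error uses n - 2 degrees of freedom
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
se_slope = (sse / (n - 2) / sxx) ** 0.5
t_crit = 2.447  # t-table value for 6 df, 95 percent two-sided
lower, upper = slope - t_crit * se_slope, slope + t_crit * se_slope
print(f"95% CI for slope: ({lower:.4f}, {upper:.4f})")
```

The critical value 2.447 is the standard t-table entry for 6 degrees of freedom at a two-sided 95 percent level.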
The t table requires two major pieces of information:
first, the degrees of freedom, and second, the significance level (5 percent for a 95 percent confidence interval).
Degrees of freedom appear in three contexts:
First: the general intuition.
Second: in multiple regression estimation.
Third: in the chi-square test, when we want to know the difference between observed and expected values.
First: the general intuition.
It increases the estimate of the population variance and standard deviation. Because we estimate the population variance from the sample variance, and we do not know the population variance exactly, the sample-based estimate must be corrected upward.
Second: in multiple regression estimation.
The error degrees of freedom are n − k − 1, where k is the number of explanatory variables and the extra 1 accounts for the intercept. The more explanatory variables we have, the fewer error degrees of freedom remain.
Third: in the chi-square test.
When we want to know the difference between observed and expected values, the degrees of freedom reflect that once the first values are identified, the remaining value is determined.
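The "remaining value is determined" idea can be shown directly. This is a sketch for a chi-square goodness-of-fit setting with made-up counts (the data and the three-category setup are assumptions, not from the card): with k categories and a fixed total, only k − 1 counts are free, so df = k − 1.

```python
# Sketch: with k categories and a fixed total, df = k - 1, because once
# k - 1 observed counts are known, the last one is forced.
observed = [18, 22, 30]      # made-up counts, total = 70
total = sum(observed)
known = observed[:-1]        # the first k - 1 counts
last = total - sum(known)    # the remaining count is determined
df = len(observed) - 1
print(last, df)
```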
Why the 1 in the n − 1 formula when calculating degrees of freedom? What is the logic of the formula?
The logic is that when estimating the variance from a sample, exactly one quantity is not free to vary, and that quantity is the mean. Given the mean, n − 1 of the values can change freely, but the last deviation is then fixed.
What is the minimum number of observations required to run a regression? Why can't we use just two observations to draw the best-fit line?
Because regression does NOT just estimate the gradient and intercept; it also estimates the uncertainty we have about those values, and two points leave no information with which to measure that uncertainty.
The degrees of freedom in simple regression:
n − 2 (one lost for the gradient, one for the intercept).
The main use of degrees of freedom in regression:
standard error calculation.
Additional independent variables in multiple regression
decrease the degrees of freedom.
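The last three cards share one formula, sketched here as a tiny helper (the function name and the example numbers are illustrative): error df = n − k − 1, which reduces to n − 2 in simple regression.

```python
# Sketch: error degrees of freedom = n - k - 1
# (k explanatory variables, plus one df for the intercept).
def error_df(n_obs: int, n_predictors: int) -> int:
    """One df is lost per slope coefficient and one for the intercept."""
    return n_obs - n_predictors - 1

print(error_df(30, 1))  # simple regression: n - 2 = 28
print(error_df(30, 4))  # more predictors -> fewer degrees of freedom: 25
```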
t statistic
Coefficient / Standard error
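The card above is a one-line computation; this sketch (with made-up coefficient and standard-error values) just makes the ratio explicit.

```python
# Sketch: t statistic = coefficient / standard error.
# A larger |t| is stronger evidence the coefficient differs from zero.
def t_statistic(coefficient: float, standard_error: float) -> float:
    return coefficient / standard_error

print(t_statistic(0.028, 0.007))  # made-up values
```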
The number of dummy variables is the number
of categories minus one. For example, when a variable has four categories, the maximum number of dummy variables is three.
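A pure-Python sketch of this card (the category names and the helper `make_dummies` are illustrative, not from the card): a four-category variable becomes three dummy columns, with one category dropped as the baseline to avoid perfect collinearity.

```python
# Sketch: encoding a 4-category variable as 3 dummy variables
# (one category is dropped as the baseline).
def make_dummies(values, categories):
    baseline, *kept = categories  # drop the first category
    return [[1 if v == c else 0 for c in kept] for v in values]

seasons = ["winter", "spring", "summer", "fall"]
data = ["spring", "winter", "fall", "summer"]
rows = make_dummies(data, seasons)
print(rows)  # three columns per row; "winter" is encoded as all zeros
```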
For a good model, the R square should always amount to
more than 60 percent.
The F statistic (via its p-value) says that
there is an x percent probability that the explanatory improvement of our model is due to random chance alone.
Levels of significance:
1, 5, and 10 percent.
Standard error:
the average size of the error term.
The higher the t statistic,
the more significant the variable.
The relationship between linear regression and ANOVA:
both partition a total sum of squares into different components. We measure the ratios of these components to determine whether the model is statistically significant or not.
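The partition in this card can be verified numerically. This is a pure-Python sketch with made-up data: for an OLS fit with an intercept, the total sum of squares splits exactly into the regression and error components, SST = SSR + SSE.

```python
# Sketch: the ANOVA partition SST = SSR + SSE for an OLS fit.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
intercept = my - slope * mx
fitted = [intercept + slope * a for a in x]
sst = sum((b - my) ** 2 for b in y)                   # total
ssr = sum((f - my) ** 2 for f in fitted)              # explained
sse = sum((b - f) ** 2 for b, f in zip(y, fitted))    # error
print(sst, ssr, sse)  # sst equals ssr + sse (up to float error)
```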
Standardized values, useful when:
1. Interpreting variables that have very different scales and variances. 2. In multiple regression, to determine which variables account for most of the variance in the dependent variable.
Standardized values can be helpful when
the dependent and independent variables are presented in completely different units.
Standardization formula:
(X − Xmean) / standard deviation.
The mean of the standardized values
is equal to zero, and the standard deviation of the standardized values is equal to 1.
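The two cards above can be checked in a few lines of pure Python with made-up data: apply (X − Xmean) / sd to each value, then confirm the resulting z-scores have mean 0 and standard deviation 1.

```python
# Sketch: z-scores have mean 0 and standard deviation 1.
data = [10.0, 20.0, 30.0, 40.0, 50.0]
n = len(data)
mean = sum(data) / n
sd = (sum((x - mean) ** 2 for x in data) / (n - 1)) ** 0.5
z = [(x - mean) / sd for x in data]          # (X - Xmean) / sd
z_mean = sum(z) / n
z_sd = (sum((v - z_mean) ** 2 for v in z) / (n - 1)) ** 0.5
print(z_mean, z_sd)
```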
The standardized value is called the
Z score.
After standardization, we describe values as
so many standard deviations above or below the mean.
In the output of statistical software:
"Error" stands for the error sum of squares, or SSE; "Total" means the SST; and the mean square in the Error row is the MSE, or mean square error.
The model standard error, or root mean square error, measures
how well the data points (observations) fit around the regression line, i.e., how accurately the regression line makes predictions.
SSE:
how far the observations differ from the regression equation.
MSE:
the variance of the error; how spread out the data points are from the regression line.
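These quantities can be computed in a few lines. A pure-Python sketch with made-up observed and predicted values (assuming a simple regression, so the error df is n − 2): SSE sums the squared residuals, MSE divides by the error degrees of freedom, and its square root is the model standard error.

```python
# Sketch: SSE, MSE, and root MSE from residuals (made-up fit).
observed = [2.0, 4.1, 5.9, 8.2, 9.8]
predicted = [2.1, 4.0, 6.0, 8.0, 10.0]
residuals = [o - p for o, p in zip(observed, predicted)]
sse = sum(r * r for r in residuals)   # error sum of squares
df_error = len(observed) - 2          # simple regression: n - 2
mse = sse / df_error                  # the variance of the error
rmse = mse ** 0.5                     # model standard error
print(sse, mse, rmse)
```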
The backbone of the confidence interval and the t-test
is the SE (standard error).