Final Exam Flashcards
What is backward elimination?
An iterative variable selection procedure that starts with a model with all independent variables and considers removing an independent variable at each step.
What is the best subsets procedure?
A variable selection procedure that constructs and compares all possible models with up to a specified number of independent variables.
What is the coefficient of determination?
A measure of the goodness of fit of the estimated regression equation. It can be interpreted as the proportion of the variability in the dependent variable y that is explained by the estimated regression equation.
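As a rough illustration of the definition above, the coefficient of determination can be computed as 1 - SSE/SST. This is a minimal sketch: the function name `r_squared` and the sample values are made up for the example, and the fitted values are illustrative rather than the output of a real regression.

```python
def r_squared(y, y_hat):
    """Coefficient of determination computed as 1 - SSE/SST."""
    y_bar = sum(y) / len(y)
    # SSE: variation in y left unexplained by the fitted values
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    # SST: total variation of y around its mean
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst

# Hypothetical observed values and fitted values
y = [2.0, 4.0, 6.0, 8.0]
y_hat = [2.5, 3.5, 6.5, 7.5]
r2 = r_squared(y, y_hat)  # SSE = 1, SST = 20, so r2 is 0.95
```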
What is the confidence interval?
An estimate of a population parameter that provides an interval believed to contain the value of the parameter at some level of confidence.
What is cross-validation?
Assessment of the performance of a model on data other than the data that were used to generate the model.
What is the dependent variable?
The variable that is being predicted or explained. It is denoted by y and is often referred to as the response.
What is a dummy variable?
A variable used to model the effect of categorical independent variables in a regression model; generally takes only the value zero or one.
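A small sketch of how a categorical variable might be turned into dummy variables. It assumes the common convention of using k - 1 dummies for k categories, with one category treated as the baseline; the variable names and the `is_` prefix are hypothetical.

```python
# Hypothetical categorical independent variable
regions = ["north", "south", "east", "south"]

levels = sorted(set(regions))   # ['east', 'north', 'south']
baseline = levels[0]            # baseline category gets no dummy of its own

# One 0/1 dummy per non-baseline category
dummies = [
    {f"is_{level}": int(region == level) for level in levels[1:]}
    for region in regions
]
# An observation in the baseline category has every dummy equal to 0
```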
What is the estimated regression equation?
The estimate of the regression equation developed from sample data by using the least squares method.
What is the experimental region?
The range of values for the independent variables x1, x2, …, xq for the data that are used to estimate the regression model.
What is extrapolation?
Prediction of the mean value of the dependent variable y for values of the independent variables x1, x2, …, xq that are outside the experimental region.
What is forward selection?
An iterative variable selection procedure that starts with a model with no variables and considers adding an independent variable at each step.
What is the holdout method?
Method of cross-validation in which sample data are randomly divided into mutually exclusive and collectively exhaustive sets, then one set is used to build the candidate models and the other set is used to compare model performances and ultimately select a model.
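The random split described above can be sketched as follows. The function name `holdout_split`, the 70/30 split, and the fixed seed are assumptions for the example, not part of the definition.

```python
import random

def holdout_split(data, train_fraction=0.7, seed=0):
    """Randomly divide data into a training set and a validation set."""
    rng = random.Random(seed)       # fixed seed so the split is repeatable
    shuffled = data[:]              # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    # The two sets share no observations and together cover all of them
    return shuffled[:cut], shuffled[cut:]

observations = list(range(10))      # stand-in for 10 sample observations
train, validation = holdout_split(observations)
```

Candidate models would be fit on `train` only and then compared on `validation`.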
What is hypothesis testing?
The process of making a conjecture about the value of a population parameter, collecting sample data that can be used to assess this conjecture, measuring the strength of the evidence against the conjecture that is provided by the sample, and using these results to draw a conclusion about the conjecture.
What are independent variables?
The variable(s) used for predicting or explaining values of the dependent variable. They are denoted by x and are often referred to as predictor variables.
What is interaction?
Regression modeling technique used when the relationship between the dependent variable and one independent variable is different at different values of a second independent variable.
What is interval estimation?
The use of sample data to calculate a range of values that is believed to include the unknown value of a population parameter.
What is a knot?
The prespecified value of the independent variable at which its relationship with the dependent variable changes in a piecewise linear regression model; also called the breakpoint or the joint.
What is the least squares method?
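A procedure for using sample data to find the estimated regression equation. It chooses the coefficient estimates that minimize the sum of squared differences between the observed and predicted values of the dependent variable.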
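For a single independent variable, the least squares estimates have a closed form: slope b1 = Sxy/Sxx and intercept b0 = ȳ - b1·x̄. A minimal sketch, with a hypothetical function name and made-up data:

```python
def least_squares(x, y):
    """Return (b0, b1) for the estimated equation y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx              # slope
    b0 = y_bar - b1 * x_bar     # intercept
    return b0, b1

# Hypothetical sample data
x = [1, 2, 3, 4, 5]
y = [2.2, 4.1, 6.2, 7.9, 10.1]
b0, b1 = least_squares(x, y)    # approximately 0.22 and 1.96
```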
What is multicollinearity?
The degree of correlation among independent variables in a regression model.
What is linear regression?
Regression analysis in which relationships between the independent variables and the dependent variable are approximated by a straight line.
What is multiple linear regression?
Regression analysis involving one dependent variable and more than one independent variable.
What is overfitting?
Fitting a model too closely to sample data, resulting in a model that does not accurately reflect the population.
What is the p-value?
The probability that a random sample of the same size collected from the same population using the same procedure will yield evidence against a hypothesis that is at least as strong as the evidence in the sample data, given that the hypothesis is actually true.
What is a parameter?
A measurable factor that defines a characteristic of a population, process, or system.
What is the piecewise linear regression model?
Regression model in which one linear relationship between the independent and dependent variables is fit for values of the independent variable below a prespecified value (the knot), and a different linear relationship is fit for values above it. The two regressions are joined at the knot; that is, they have the same estimated value of the dependent variable there.
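One common way to write such a model is y_hat = b0 + b1·x + b2·max(0, x - knot), so the slope is b1 below the knot and b1 + b2 above it, and the two lines meet at the knot. The sketch below only evaluates the equation; the coefficient values are hypothetical, not estimated from data.

```python
def piecewise_predict(x, b0, b1, b2, knot):
    """Evaluate a piecewise linear regression equation with one knot."""
    # The extra term is zero below the knot, so the slope is b1 there;
    # above the knot the slope becomes b1 + b2.
    return b0 + b1 * x + b2 * max(0.0, x - knot)

# Hypothetical coefficients and knot
b0, b1, b2, knot = 1.0, 2.0, -1.5, 4.0

below = piecewise_predict(2.0, b0, b1, b2, knot)    # 1 + 2*2 = 5.0
at_knot = piecewise_predict(4.0, b0, b1, b2, knot)  # 1 + 2*4 = 9.0
above = piecewise_predict(6.0, b0, b1, b2, knot)    # 1 + 12 - 3 = 10.0
```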
What is the prediction interval?
An interval estimate of the prediction of an individual y value given values of the independent variables.
What is an autoregressive model?
A regression model in which a regression relationship based on past time series values is used to predict the future time series values.
What are causal models?
Forecasting methods that relate a time series to other variables that are believed to explain or cause its behavior.
What is a cyclical pattern?
The component of the time series that results in periodic above-trend and below-trend behavior of the time series lasting more than one year.
What is exponential smoothing?
A forecasting technique that uses a weighted average of past time series values as the forecast.
What is a forecast error?
The amount by which the forecasted value ŷ_t differs from the observed value y_t, denoted by e_t = y_t − ŷ_t.
What is a forecast?
A prediction of future values of a time series.
What is the mean absolute error?
A measure of forecasting accuracy; the average of the absolute values of the forecast errors. Also referred to as mean absolute deviation (MAD).
What is the mean absolute percentage error?
A measure of the accuracy of a forecasting method; the average of the absolute values of the forecast errors, each expressed as a percentage of the corresponding actual time series value.
What is the mean squared error?
A measure of the accuracy of a forecasting method; the average of the squared differences between the forecast values and the actual time series values.
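The three accuracy measures can be compared side by side on a small made-up series. This sketch assumes the forecast error is actual minus forecast and that MAPE divides each absolute error by the actual value, which is the usual formula; the numbers are hypothetical.

```python
# Hypothetical actual values and forecasts for four periods
actual = [100.0, 110.0, 120.0, 125.0]
forecast = [90.0, 115.0, 114.0, 130.0]

# Forecast error in each period: actual minus forecast
errors = [a - f for a, f in zip(actual, forecast)]   # [10, -5, 6, -5]
n = len(errors)

mae = sum(abs(e) for e in errors) / n                # 26 / 4 = 6.5
mse = sum(e ** 2 for e in errors) / n                # 186 / 4 = 46.5
# Absolute errors as a percentage of the actual values, then averaged
mape = 100 * sum(abs(e) / a for e, a in zip(errors, actual)) / n
```

Because MSE squares each error, it penalizes large errors more heavily than MAE does.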
What is the moving average method?
A method of forecasting or smoothing a time series that uses the average of the most recent n data values in the time series as the forecast for the next period.
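A minimal sketch of a moving average forecast of order n; the function name and the sales figures are made up for the example.

```python
def moving_average_forecast(series, n):
    """Forecast the next period as the mean of the most recent n values."""
    if n <= 0 or n > len(series):
        raise ValueError("n must be between 1 and the length of the series")
    return sum(series[-n:]) / n

# Hypothetical weekly sales
sales = [17, 21, 19, 23, 18, 16, 20]
next_forecast = moving_average_forecast(sales, 3)   # (18 + 16 + 20) / 3 = 18.0
```

With n = 1 this reduces to using only the most recent value as the forecast.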
What is the naïve forecasting method?
A forecasting technique that uses the value of the time series from the most recent period as the forecast for the current period.
What is the smoothing constant?
A parameter of the exponential smoothing model that provides the weight given to the most recent time series value in the calculation of the forecast value.
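A sketch of simple exponential smoothing with smoothing constant alpha, using the update forecast(t+1) = alpha·y(t) + (1 - alpha)·forecast(t). It assumes the common convention of starting with the first observed value as the forecast for period 2; the function name and data are hypothetical.

```python
def exponential_smoothing(series, alpha):
    """Return forecasts for periods 2, 3, ..., len(series) + 1."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    forecasts = [series[0]]     # assumed start: forecast for period 2 is y_1
    for y in series[1:]:
        # Weight alpha on the newest observation, 1 - alpha on the prior forecast
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts

# Hypothetical time series values for periods 1 to 3
series = [17, 21, 19]
forecasts = exponential_smoothing(series, alpha=0.2)
# forecasts for periods 2, 3, 4 are approximately 17, 17.8, 18.04
```

A larger alpha makes the forecast react faster to recent changes; a smaller alpha gives a smoother forecast.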
What is a stationary time series?
A time series whose statistical properties, such as its mean and variance, are independent of time.