Exam 2 Flashcards
Why we use exploratory modeling
to obtain the best-fit model using all of the observations
Why we use predictive modeling
to predict outcomes for new observations; the observations are split into a training set and a validation set
What are training sets used for
create the model
what are validation sets used for
evaluate accuracy of the model
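The split described in the cards above can be sketched in a few lines. This is only an illustration: the helper name `split_observations`, the 30% validation fraction, and the fixed seed are assumptions, not values from the course.

```python
import random


def split_observations(observations, validation_fraction=0.3, seed=0):
    """Shuffle a copy of the observations and split it into (training, validation)."""
    shuffled = list(observations)          # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_validation = round(len(shuffled) * validation_fraction)
    validation = shuffled[:n_validation]   # held out, used only to evaluate the model
    training = shuffled[n_validation:]     # used to create the model
    return training, validation


observations = list(range(10))
training, validation = split_observations(observations)
print(len(training), len(validation))
```

Every observation lands in exactly one of the two sets, so the model is never evaluated on data it was built from.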
predictors
the input variables used to estimate the target
target
the variable we are trying to estimate; the estimates are compared with its actual values to test the model
regression
determining the relationship between a variable and one or more other variables
linear regression
given a set of observations, determine the equation of a line that can be used to describe the dataset
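One common way to determine that line is ordinary least squares. The sketch below assumes a single predictor and uses made-up data; `fit_line` is a hypothetical helper name, and the cards do not say which fitting method the course uses.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up observations that lie exactly on y = 2x + 1
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```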
types of error
error = actual value - estimated value
mean error (ME) = average of the errors
mean squared error (MSE) = average of the squared errors
root mean squared error (RMSE) = square root of the MSE
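A small worked example of the error measures, using made-up numbers. It assumes the sign convention error = actual - estimated; some texts use the opposite sign, which flips the mean error but leaves MSE and RMSE unchanged.

```python
import math


def error_measures(actual, estimated):
    """Return (mean error, mean squared error, root mean squared error)."""
    errors = [a - e for a, e in zip(actual, estimated)]  # assumed: actual - estimated
    mean_error = sum(errors) / len(errors)
    mse = sum(err ** 2 for err in errors) / len(errors)  # square each error, then average
    rmse = math.sqrt(mse)
    return mean_error, mse, rmse


actual = [3.0, 5.0, 7.0, 9.0]
estimated = [2.0, 6.0, 7.0, 11.0]  # errors are 1, -1, 0, -2
mean_error, mse, rmse = error_measures(actual, estimated)
print(mean_error, mse, rmse)
```

Positive and negative errors cancel in the mean error, which is why the squared versions are usually preferred for judging accuracy.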
How do you know if the R-squared value indicates a good fit?
the closer it is to 1, the better the model fits the data
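For reference, R-squared is commonly computed as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses that standard definition with made-up numbers; `r_squared` is a hypothetical helper name.

```python
def r_squared(actual, estimated):
    """R-squared = 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - e) ** 2 for a, e in zip(actual, estimated))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


actual = [3.0, 5.0, 7.0, 9.0]
perfect = r_squared(actual, [3.0, 5.0, 7.0, 9.0])   # estimates match exactly
rough = r_squared(actual, [2.0, 6.0, 7.0, 11.0])    # estimates are somewhat off
baseline = r_squared(actual, [6.0, 6.0, 6.0, 6.0])  # always guessing the mean
print(perfect, rough, baseline)
```

A perfect fit gives 1, and a model that does no better than always guessing the mean of the target gives 0.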
validation set
used to test a model
why do we split validation and training sets
so the model can learn from one part of the data (training) and be tested on data it has not seen (validation)
class
category for data
when would we want to use a class?
to identify a label for the data points
what is the max of k
the size of the training set (the number of training observations)
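The cards do not say which method k belongs to. Assuming it is the number of neighbors in k-nearest neighbors classification, the sketch below shows why k cannot exceed the training set size: there are no more neighbors left to vote. The function `knn_predict`, the single numeric predictor, and the data are all made up for illustration.

```python
from collections import Counter


def knn_predict(training, query, k):
    """Predict a class by majority vote among the k nearest training points.

    training: list of (value, class_label) pairs with one numeric predictor.
    """
    if not 1 <= k <= len(training):
        raise ValueError("k must be between 1 and the training set size")
    # Sort by distance to the query point and keep the k closest
    nearest = sorted(training, key=lambda row: abs(row[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


training = [(1.0, "a"), (1.5, "a"), (2.0, "a"), (8.0, "b"), (9.0, "b")]
small_k = knn_predict(training, 8.5, k=1)  # only the closest point votes
max_k = knn_predict(training, 8.5, k=5)    # every training point votes
print(small_k, max_k)
```

At the maximum k every training observation votes, so the prediction is just the most common class in the training set regardless of the query point.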