Review Flashcards
we come up with a prediction in terms of probability, then use it to decide
whether it is 0 or 1 (a categorical value)
the transformation is nonlinear, so we use
odds, which is the ratio probability / (1 - probability)
in a logistic regression, the outcome we are predicting is the
log odds
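A minimal sketch of the probability, odds, and log-odds relationship from the cards above. The function names are made up for illustration.

```python
import math

def odds(p):
    """Odds = probability / (1 - probability); p must be strictly between 0 and 1."""
    return p / (1.0 - p)

def log_odds(p):
    """Log odds (logit): the quantity the linear part of the model predicts."""
    return math.log(odds(p))

def probability_from_log_odds(z):
    """Inverse of log_odds: maps any real number back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# A probability of 0.5 means even odds (1 to 1) and log odds of 0.
print(odds(0.5), log_odds(0.5))
```

Going from probability to log odds and back should return the starting probability, which is why the model can be fit on the log-odds scale and still be read as a probability.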
baseline goal is to predict
whether an observation will be 0 or 1
baseline predicts
the most frequent outcome
which data set to use to find the outcome of the baseline model
training
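A small sketch of the baseline idea from the cards above: find the most frequent outcome in the training set, then predict it for everything. The labels and function names below are invented for illustration.

```python
from collections import Counter

def baseline_prediction(train_labels):
    """Return the most frequent outcome in the training labels."""
    return Counter(train_labels).most_common(1)[0][0]

def baseline_accuracy(prediction, labels):
    """Share of observations whose label equals the constant prediction."""
    return sum(1 for y in labels if y == prediction) / len(labels)

# Hypothetical data: the outcome is chosen on training, then scored on test.
train = [0, 0, 0, 1, 1]
test = [0, 1, 0, 0]
guess = baseline_prediction(train)
print(guess, baseline_accuracy(guess, test))
```

Any fitted model should beat this accuracy to be worth using.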
how to build a regression tree
split on the independent variables (IVs) and predict the most frequent outcome in each split
how to come up with a prediction
count the number of each outcome per split
we choose how to define splits but use it
consistently throughout the model
how to decide where to split
First decide on the objective: minimize error (the number of misclassified points) or maximize accuracy. Then try different split points and select the one that minimizes error or maximizes accuracy.
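A rough sketch of choosing one split on a single variable by trying candidate points and keeping the one with the fewest misclassified points. Using midpoints between sorted values as candidates is an assumption made here, and the function names are hypothetical.

```python
from collections import Counter

def misclassified(labels):
    """Errors made when predicting the most frequent label in a group."""
    if not labels:
        return 0
    return len(labels) - Counter(labels).most_common(1)[0][1]

def best_split(xs, ys):
    """Try midpoints between distinct sorted x values.

    Returns (threshold, errors) for the split with the fewest errors,
    or None if there is only one distinct x value.
    """
    values = sorted(set(xs))
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        errors = misclassified(left) + misclassified(right)
        if best is None or errors < best[1]:
            best = (t, errors)
    return best

# Hypothetical data: small x values are class 0, large x values are class 1.
print(best_split([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1]))
```

Minimizing the error count and maximizing accuracy pick the same split here, since accuracy is one minus errors divided by the number of points.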
in most cases the algorithm isn't exact, so the result is the
best tree found, not the optimal one
ANOVA class is for predicting a variable that is
unbounded, whereas classification is for a probability between 0 and 1
a continuous outcome is predicted without a threshold; how?
most frequent value or averages
a classification problem deals with the probability of the outcome being 0 or 1, and whenever we use a probability we always have
a threshold, specificity, sensitivity -> ROC curve
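A minimal sketch of how a threshold turns probabilities into 0/1 predictions, and how sensitivity and specificity at several thresholds give points on an ROC curve. The data and function names are made up, and the code assumes both classes appear in the labels.

```python
def sensitivity_specificity(probs, labels, threshold):
    """Predict 1 when the probability is at or above the threshold.

    Returns (sensitivity, specificity); assumes labels contain both 0 and 1.
    """
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)

def roc_points(probs, labels, thresholds):
    """One (false positive rate, true positive rate) point per threshold."""
    points = []
    for t in thresholds:
        sens, spec = sensitivity_specificity(probs, labels, t)
        points.append((1 - spec, sens))
    return points

# Hypothetical predicted probabilities and true outcomes.
probs = [0.1, 0.4, 0.6, 0.9]
labels = [0, 0, 1, 1]
print(roc_points(probs, labels, [0.3, 0.5, 0.7]))
```

Lowering the threshold raises sensitivity at the cost of specificity, which is the trade-off the ROC curve displays.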
a single regression tree has high variance, so its predictions will have
high variability