Revision for recognition Flashcards
How can we measure class impurity in a decision tree?
We can use entropy or the Gini index.
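A minimal sketch of both impurity measures in plain Python. The function names and the toy label lists are made up for illustration. Entropy is taken in bits (log base 2), which is a common convention but an assumption here.

```python
import math
from collections import Counter

def class_proportions(labels):
    """Return the fraction of samples that fall in each class."""
    counts = Counter(labels)
    total = len(labels)
    return [c / total for c in counts.values()]

def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over the classes present."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))

def gini(labels):
    """Gini index: 1 - sum(p^2) over the classes present."""
    return 1.0 - sum(p * p for p in class_proportions(labels))

# Toy nodes (illustrative data only)
pure = ["yes", "yes", "yes", "yes"]   # one class only: zero impurity
mixed = ["yes", "yes", "no", "no"]    # 50/50 split: maximal impurity for 2 classes

print(entropy(pure), gini(pure))
print(entropy(mixed), gini(mixed))
```

Both measures are zero for a pure node and largest when the classes are evenly mixed, which is why a split that lowers them is preferred.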
Briefly explain the pruning process
It aims to delete parts of the tree to reduce its complexity and variance.
The parts removed are those that add complexity without improving prediction accuracy; the decision relies on the misclassification rate.
Another approach is a rule simplification technique,
where we delete part of a test or a whole rule.
What are the advantages and disadvantages of a decision tree?
Ability to mix qualitative and quantitative features
Feature scaling doesn't matter
No stability: a slight modification of the data can lead to a very different tree
Explain random forest
It uses several weak decision trees, each grown on a bootstrap sample (random sampling of the training data with replacement) and with a random selection of features; a majority vote over the trees is then used for prediction.
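The three ingredients of a random forest can be sketched in plain Python. This is only an outline: the tree training itself is left out, and the helper names and toy rows are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows at random WITH replacement (duplicates are expected)."""
    return [rng.choice(rows) for _ in rows]

def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices that a tree (or a split) may use."""
    return sorted(rng.sample(range(n_features), k))

def majority_vote(predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Toy rows: (feature 0, feature 1, class label), illustrative data only
rows = [(5.1, 3.5, "a"), (6.2, 2.9, "b"), (5.9, 3.0, "b"), (4.7, 3.2, "a")]

sample = bootstrap_sample(rows, rng)                       # training set for one tree
features = random_feature_subset(n_features=2, k=1, rng=rng)
vote = majority_vote(["a", "b", "a"])                      # outputs of three trees

print(sample, features, vote)
```

Each tree would be trained on its own `sample` using only its own `features`, which is what makes the trees differ from one another before their votes are combined.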
What are the advantages of random forests?
They are simple and accurate; pruning can be omitted by setting a maximum depth for the trees.
How does C affect the SVM?
The higher C is, the smaller the margin (margin violations are penalized more heavily); the smaller C is, the wider the margin (more violations are tolerated).
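The role of C is easiest to see in the usual soft-margin formulation, written here as a reminder. The symbols $w$, $b$ and the slack variables $\xi_i$ are standard notation and are not taken from these notes.

```latex
\min_{w,\,b,\,\xi}\ \ \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\qquad \text{subject to} \qquad
y_i\,(w^{\top} x_i + b) \ \ge\ 1 - \xi_i, \qquad \xi_i \ge 0 .
```

The margin width is $2 / \lVert w \rVert$. A large C makes slack expensive, so the optimizer accepts a larger $\lVert w \rVert$ (narrower margin) to avoid violations; a small C makes slack cheap, so it favors a smaller $\lVert w \rVert$ (wider margin).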
SVM
A supervised machine learning technique that can be used for both classification and regression.
How to deal with non-linearity in SVM?
We can use the kernel trick: a non-linear transformation is applied to the data points so that the problem becomes linear, in essence projecting the data into a higher-dimensional space. The kernel computes inner products in that space without building the transformed points explicitly.
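A small check of the idea, assuming a degree-2 homogeneous polynomial kernel on 2-D points (the kernel choice, the feature map `phi` and the sample points are picked for illustration only). The kernel value equals the inner product in the higher-dimensional space, but is computed without ever building that space.

```python
import math

def dot(a, b):
    """Ordinary inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def phi(x):
    """Explicit feature map into 3-D for the kernel (x . z)^2 on 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(x, z):
    """Degree-2 homogeneous polynomial kernel, computed in the original 2-D space."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, 4.0)

explicit = dot(phi(x), phi(z))   # map to 3-D first, then take the inner product
implicit = poly_kernel(x, z)     # same value, no mapping needed

print(explicit, implicit)
```

For kernels such as the RBF kernel the implied space is infinite-dimensional, so the implicit computation is the only practical option.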
What is the role of hidden units in an MLP?
They allow a non-linear transformation to be applied to the input.
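XOR is the classic case that no single linear unit can solve but one hidden layer can. The sketch below uses hand-picked weights and a step activation purely for illustration; a trained network would learn its own weights and would normally use a differentiable activation.

```python
def step(z):
    """Threshold activation: 1 if the weighted input is positive, else 0."""
    return 1 if z > 0 else 0

def xor_mlp(x1, x2):
    """Two hidden units followed by one output unit, with hand-set weights."""
    h_or = step(x1 + x2 - 0.5)       # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)      # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)  # "OR but not AND" is XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))
```

The hidden units re-express the input as (OR, AND), and in that new representation the two classes become linearly separable for the output unit.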
What are some difficulties with NNs?
There is no systematic rule for determining the network architecture, and local optima exist that give suboptimal solutions.
What is an NN?
A network of interconnected units composed of an input layer, an output layer and optionally hidden layers. Each node is connected to nodes in the next layer, and each connection is characterized by a direction and a weight.
What is the difference between classification and regression?
In classification we seek to predict something discrete, while in regression we seek to predict something continuous.
What is the difference between logistic regression and an NN?
Logistic regression is a technique used for binary classification, while a neural network is more complex. Logistic regression is a special case of a neural network: a single unit with a sigmoid activation and no hidden layer.
What is the difference between logistic regression and a perceptron?
A perceptron predicts a hard class label such as yes or no, while logistic regression outputs probabilities.
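The difference shows up when both models share the same linear score and only the final function changes. The weights, bias and input below are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

def linear_score(weights, bias, x):
    """Weighted sum of the inputs plus bias, shared by both models."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def perceptron_predict(weights, bias, x):
    """Hard decision: class 1 if the score is non-negative, else class 0."""
    return 1 if linear_score(weights, bias, x) >= 0 else 0

def logistic_predict_proba(weights, bias, x):
    """Soft decision: sigmoid of the score, read as P(class = 1)."""
    return 1.0 / (1.0 + math.exp(-linear_score(weights, bias, x)))

weights, bias = [2.0, -1.0], 0.0   # illustrative parameters
x = [1.0, 1.0]                     # score = 2*1 - 1*1 + 0 = 1

label = perceptron_predict(weights, bias, x)
proba = logistic_predict_proba(weights, bias, x)
print(label, proba)
```

Thresholding the logistic output at 0.5 gives the same label as the perceptron rule here, but the probability also says how confident the model is.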
What is the Gini index?
It describes the probability that a new data point would be misclassified if it were labeled at random according to the class distribution of the node.
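The misclassification reading follows directly from the definition. In the short derivation below, $p_k$ is the proportion of class $k$ in the node; the notation is standard and is not taken from these notes. A point of class $k$ is drawn with probability $p_k$ and is given a wrong label with probability $1 - p_k$.

```latex
G \;=\; \sum_{k=1}^{K} p_k\,(1 - p_k)
  \;=\; \sum_{k=1}^{K} p_k \;-\; \sum_{k=1}^{K} p_k^{2}
  \;=\; 1 - \sum_{k=1}^{K} p_k^{2}
```

For example, a node with proportions $(0.7,\ 0.3)$ has $G = 1 - (0.49 + 0.09) = 0.42$.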