Alternative Classification Techniques Flashcards
6 ALTERNATIVE CLASSIFICATION TECHNIQUES
- Rule-Based Classifier
- Nearest Neighbor Classifier
- Naïve Bayes Classifier
- Artificial Neural Network
- Support Vector Machines
- Ensemble Method
classifies records using a collection of “if…then…” rules.
Rule-Based Classifier
These rules are ranked according to their priority.
Ordered Rule Set
an ordered rule set is known as this.
Decision List
2 RULE ORDERING SCHEMES
- Rule-based Ordering
- Class-based Ordering
an ordering scheme where individual rules are ranked based on their quality.
Rule-based Ordering
an ordering scheme where rules that belong to the same class appear together.
Class-based Ordering
fraction of records that satisfy the antecedent of a rule.
Coverage of a Rule
fraction of records that satisfy both the antecedent and consequent of a rule.
Accuracy of a Rule
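The two rule measures above can be sketched in a few lines of Python; the toy records, attribute names, and the rule itself are illustrative assumptions, not from the source.

```python
# Toy dataset (hypothetical attributes and labels).
records = [
    {"outlook": "sunny", "windy": False, "play": "no"},
    {"outlook": "sunny", "windy": True,  "play": "no"},
    {"outlook": "rainy", "windy": False, "play": "yes"},
    {"outlook": "sunny", "windy": False, "play": "no"},
]

# Rule: IF outlook = sunny THEN play = no
antecedent = lambda r: r["outlook"] == "sunny"
consequent = lambda r: r["play"] == "no"

covered = [r for r in records if antecedent(r)]
# Coverage: fraction of all records that satisfy the antecedent.
coverage = len(covered) / len(records)
# Accuracy: fraction of covered records that also satisfy the consequent.
accuracy = sum(consequent(r) for r in covered) / len(covered)

print(coverage, accuracy)  # 0.75 1.0
```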
2 Characteristics of a Rule-Based Classifier
- Mutually Exclusive Rules
- Exhaustive Rules
2 Effects of Rule Simplification
- Rules are no longer mutually exclusive
- Rules are no longer exhaustive
5 Advantages of Rule-Based Classifier
- As highly expressive as a decision tree
- Easy to interpret
- Easy to generate
- Can classify new instances rapidly
- Performance comparable to decision tree
are lazy learners: they do not build a model explicitly, must store all the training data, and classifying unknown records is expensive.
Nearest Neighbor Classifier
3 Requirements for Nearest Neighbor Classifier
- Set of stored records
- Distance Metric to compute distance between records
- The value of k, the number of nearest neighbors to retrieve.
3 Steps to Classify an Unknown Record in Nearest Neighbor Classifier
- Compute distance to the other records
- Identify the k nearest neighbors
- Use the class labels of the nearest neighbors (majority vote)
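The steps above can be sketched with a small k-nearest-neighbor function; the training points and the query are illustrative assumptions.

```python
import math
from collections import Counter

def knn_classify(train, query, k):
    """Classify `query` by majority vote among its k nearest training records."""
    # 1. Compute distance to every stored record (Euclidean metric).
    dists = [(math.dist(x, query), label) for x, label in train]
    # 2. Identify the k nearest neighbors.
    nearest = sorted(dists)[:k]
    # 3. Majority vote over the neighbors' class labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"),
         ((5.0, 5.0), "B"), ((5.2, 4.9), "B")]
print(knn_classify(train, (1.1, 1.0), k=3))  # A
```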
is robust to isolated noise points, handles missing values by ignoring the instance during probability estimate calculation, and is robust to irrelevant attributes.
Naïve Bayes Classifier
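A minimal categorical Naive Bayes sketch (the toy records and attribute names are assumptions) showing how a missing attribute is handled: it is simply absent from the query, so its factor is skipped in the probability product.

```python
from collections import Counter, defaultdict

def nb_train(records, target):
    """Count class frequencies and per-class attribute-value frequencies."""
    priors = Counter(r[target] for r in records)
    conds = defaultdict(Counter)
    for r in records:
        for attr, val in r.items():
            if attr != target:
                conds[(attr, r[target])][val] += 1
    return priors, conds

def nb_predict(priors, conds, query):
    """Pick the class maximizing P(class) * prod P(attr=val | class).
    A missing attribute never appears in `query`, so it is ignored."""
    n = sum(priors.values())
    scores = {}
    for c, count in priors.items():
        p = count / n
        for attr, val in query.items():
            p *= conds[(attr, c)][val] / count
        scores[c] = p
    return max(scores, key=scores.get)

records = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "no"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
]
priors, conds = nb_train(records, "play")
print(nb_predict(priors, conds, {"outlook": "sunny"}))  # no
```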
involves learning the weights of the neurons.
Artificial Neural Networks
Algorithm for learning Artificial Neural Networks (3):
- Initialize the weights
- Adjust the weights so that the output of the ANN is consistent with the class labels of the training examples (an objective usually expressed as an error function).
- Find the weights that minimize that error function.
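The three steps above can be sketched with a single perceptron, the simplest ANN; the learning rate, epoch count, and the AND task are illustrative assumptions.

```python
def train_perceptron(data, epochs=20, lr=0.1):
    w = [0.0, 0.0]           # step 1: initialize the weights
    b = 0.0
    for _ in range(epochs):  # steps 2-3: reduce disagreement with the labels
        for (x1, x2), y in data:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = y - out    # the error drives the weight adjustment
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

# Learn logical AND, a linearly separable toy task.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(data)
predict = lambda x: 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
print([predict(x) for x, _ in data])  # [0, 0, 0, 1]
```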
uses a hyperplane (decision boundary) to separate the data.
Support Vector Machines
In SVM, this means the classifier is more robust and less prone to generalization error.
Larger Margins
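The margin idea can be illustrated by comparing two candidate hyperplanes on the same data: the margin is the smallest distance from any training point to the boundary, and SVM prefers the hyperplane that maximizes it. The toy points and the two hand-picked hyperplanes are assumptions for illustration.

```python
import math

def margin(w, b, points):
    """Smallest signed distance from any point to the hyperplane w.x + b = 0,
    for points labeled +1 / -1."""
    return min(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / math.hypot(*w)
               for x, y in points)

# Two linearly separable classes (labels +1 and -1).
points = [((0.0, 0.0), -1), ((0.0, 1.0), -1), ((3.0, 3.0), 1), ((3.0, 4.0), 1)]

# Two candidate decision boundaries; SVM would choose the larger-margin one.
narrow = margin((1.0, 0.0), -0.5, points)  # plane x = 0.5, close to one class
wide   = margin((1.0, 0.0), -1.5, points)  # plane x = 1.5, midway between classes
print(narrow, wide)  # 0.5 1.5
```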
2 Steps of the Ensemble Method
- Construct a set of (possibly weak) classifiers from the training data.
- Predict class label of previously unseen records by aggregating predictions made by multiple classifiers.
3 Advantages of Ensemble Method
- Improve stability and often also accuracy of classifiers
- Reduce variance in prediction
- Reduces overfitting
What is the general idea of the Ensemble Method?
- Create multiple data sets
- Build multiple classifiers
- Combine classifiers
3 Examples of Ensemble Method
- Bagging
- Boosting
- Random Forests
3 Steps of Bagging
- Sampling with replacement (bootstrap sampling)
- Build Classifier
- Aggregate the classifiers’ results by averaging or voting.
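The three bagging steps above can be sketched as follows; the base learner (a 1-nearest-neighbor rule), the toy data, and the seed are illustrative assumptions.

```python
import math
import random
from collections import Counter

def bagging_predict(train, query, n_models=7, seed=0):
    """Bagging sketch: bootstrap-sample the data, build a base classifier
    on each sample, and aggregate the predictions by voting."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_models):
        # 1. Sampling with replacement (bootstrap sample of the same size).
        boot = [rng.choice(train) for _ in train]
        # 2. Build a classifier on the sample (here: a 1-nearest-neighbor rule).
        _, label = min((math.dist(x, query), y) for x, y in boot)
        votes.append(label)
    # 3. Aggregate the classifiers' results by voting.
    return Counter(votes).most_common(1)[0][0]

train = [((0.0,), "A"), ((0.3,), "A"), ((0.6,), "A"), ((5.0,), "B")]
print(bagging_predict(train, (0.1,)))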
an example of Ensemble Method where records that are incorrectly classified in one round will have their weights increased in the next.
Boosting
a popular boosting algorithm that typically uses a decision tree as the weak learner.
AdaBoost
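The weight-increase idea can be sketched with AdaBoost's standard reweighting rule; the four-record example (one misclassification, uniform starting weights) is an illustrative assumption.

```python
import math

def reweight(weights, correct, error):
    """AdaBoost-style update: misclassified records gain weight, correctly
    classified ones lose weight, then weights are renormalized to sum to 1."""
    alpha = 0.5 * math.log((1 - error) / error)  # importance of this round's classifier
    new = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(new)
    return [w / total for w in new]

# Four records with uniform weights; the last is misclassified this round.
weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]
error = sum(w for w, ok in zip(weights, correct) if not ok)  # weighted error = 0.25
updated = reweight(weights, correct, error)
print(updated)  # misclassified weight rises to 0.5; each correct one falls to ~0.167
```

After the update the misclassified record carries half the total weight, so the next round's classifier is forced to focus on it.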
an example of Ensemble Method that introduces two sources of randomness: Bagging and Random Input Vector.
Random Forests
each tree is grown using a bootstrap sample of the training data.
Bagging Method
at each node, the best split is chosen only from a random sample of the attributes.
Random Vector Method
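The random vector method above can be sketched as follows; the attribute names and their gain scores are hypothetical stand-ins for whatever split criterion the tree uses.

```python
import random

def choose_split(attributes, gain, m, rng):
    """Random-input-vector sketch: at a node, consider only a random sample
    of m attributes and pick the best-scoring one from that sample."""
    candidates = rng.sample(attributes, m)
    return max(candidates, key=gain)

# Hypothetical split-quality scores for four attributes.
gains = {"age": 0.3, "income": 0.5, "student": 0.1, "credit": 0.4}
attrs = list(gains)
rng = random.Random(1)

# With m equal to the attribute count this degenerates to the usual best split:
print(choose_split(attrs, gains.get, m=4, rng=rng))  # income

# A random forest uses a smaller m (often around the square root of the count):
pick = choose_split(attrs, gains.get, m=2, rng=rng)
```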