B06 K-NN Flashcards
Some examples of Parametric models are:
- Linear Regression
- Logistic Regression
- Naive Bayes
- Simple Neural Networks
Some examples of Non-Parametric models are:
- k-Nearest Neighbor
- Support Vector Machines
- Decision Trees
A learning model that summarizes data with a set of
parameters of fixed size (independent of the number of
training examples) is called a __________. No
matter how much data you throw at a ___________, it won't change its mind about how many
parameters it needs.
parametric model
parametric model
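The fixed-parameter idea can be sketched with simple linear regression, a parametric model: however many training examples it sees, it always summarizes them with exactly two numbers. This is a minimal illustrative sketch in plain Python (the closed-form least-squares fit), not any particular library's implementation.

```python
def fit_line(xs, ys):
    # Simple linear regression y = a*x + b via closed-form least squares.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    # Fixed-size parameter set: always exactly (a, b), regardless of n.
    return a, b

small = fit_line([1, 2, 3], [2, 4, 6])              # 3 examples
large = fit_line(list(range(1, 100)),
                 [2 * x for x in range(1, 100)])    # 99 examples
# Both fits are described by the same two parameters (slope, intercept).
```

Throwing 99 examples at the model instead of 3 changes the parameter *values* it estimates, never the *number* of parameters it keeps.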
Strengths of Parametric Models?
- Simpler: These methods are easier to understand, and the results are easy to interpret.
- Speed: Parametric models are usually very fast to train.
- Less Data: They do not require as much training data and can work well even if the fit to the data is not perfect.
Weaknesses of Parametric Models?
- Constrained: By choosing a functional form, these methods are highly constrained to the specified form.
- Limited Complexity: The methods are better suited to simpler problems.
- Poor Fit: In practice, the methods may not always match the underlying mapping function.
A learning model that does not make strong
assumptions about the form of the mapping function is
called a ___________. By not making
assumptions, ____________ are free to learn
any functional form from the training data.
non-parametric model
Weaknesses of Non-Parametric Models?
- More Data: Require a lot more training data to estimate the mapping function.
- Slower: A lot slower to train, as they often have far more parameters to train.
- Overfitting: Have a higher risk of overfitting against the training data.
Strengths of Non-Parametric Models?
- Flexibility: Capable of fitting a large number of functional forms.
- Power: No assumptions (or weak assumptions) about the underlying function.
- Performance: Can result in higher-performance models for prediction.
A class of non-parametric learning methods that do not generate a model but instead make use of verbatim training data for classification?
Lazy learners, instance-based learners, or
rote learners
The _________________algorithm gets its name
from the fact that it classifies an unlabeled observation
based on information about the _______ labeled
________ of the observation.
k-Nearest Neighbor (k-NN)
k-nearest
neighbors
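The classification rule described above (find the k nearest labeled neighbors, then take a majority vote) can be sketched in a few lines of plain Python. The training points, labels, the choice of k, and the Euclidean distance metric are all illustrative choices, not fixed by the algorithm.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    # train: list of (feature_tuple, label) pairs; query: feature tuple.
    # Sort the training data by Euclidean distance to the query point.
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    # Majority vote among the labels of the k nearest neighbors.
    top_k_labels = [label for _, label in ranked[:k]]
    return Counter(top_k_labels).most_common(1)[0][0]

train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"),
         ((5.0, 5.0), "B"), ((5.2, 4.8), "B"), ((4.9, 5.1), "B")]
print(knn_classify(train, (1.1, 0.9), k=3))  # → A (2 of 3 neighbors are A)
```

Note that "training" here stores the data verbatim and does no work at all, which is exactly why k-NN is called a lazy or instance-based learner.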
Choosing the right K
A ____ reduces the impact of
noisy data but increases the risk of
ignoring important patterns
large K
Choosing the right K
A _______ makes the model
susceptible to noise and/or outliers.
small K
Note that the ______ the dataset, the ____
important the difference between two choices
for k becomes.
larger
less
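The trade-off between a small and a large K can be sketched with a tiny example: a cluster of "A" points containing one mislabeled "B" outlier. The data and labels below are made up purely for illustration.

```python
import math
from collections import Counter

def knn(train, q, k):
    # k-NN by Euclidean distance with majority vote.
    near = sorted(train, key=lambda t: math.dist(t[0], q))[:k]
    return Counter(label for _, label in near).most_common(1)[0][0]

# A cluster of "A" points with one noisy, mislabeled "B" point.
train = [((0.0, 0.0), "A"), ((0.2, 0.1), "A"), ((0.1, 0.3), "A"),
         ((0.3, 0.2), "A"), ((0.05, 0.05), "B")]

q = (0.06, 0.06)              # query lands right next to the noisy point
print(knn(train, q, k=1))     # small K: the single noisy neighbor decides → B
print(knn(train, q, k=5))     # large K: the vote is 4 A vs. 1 B → A
```

With k=1 the lone outlier flips the prediction; with k=5 the vote smooths it out, at the cost that a genuinely small local pattern would be outvoted the same way.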
Strengths of K-NN?
-Simple and effective.
-Makes no assumptions about the
underlying data distribution.
-Training phase is very fast
Weaknesses of K-NN?
-Does not produce a model.
-The selection of an appropriate
k is often arbitrary.
-Rather slow classification
phase.
-Does not handle missing, outlier
and nominal data well without
pre-processing.
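The pre-processing weakness can be sketched with feature scaling: without it, a large-range feature dominates the Euclidean distance and effectively silences the others. The feature names below are illustrative, and min-max rescaling is just one common choice.

```python
import math

def minmax_scale(rows):
    # Rescale each feature column to the [0, 1] range.
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [tuple((v - l) / (h - l) for v, l, h in zip(row, lo, hi))
            for row in rows]

# Two points with features (rating on a 0-1 scale, income in dollars).
raw = [(0.2, 30000.0), (0.9, 31000.0)]
print(math.dist(*raw))                 # ≈ 1000: income swamps the rating
print(math.dist(*minmax_scale(raw)))   # ≈ 1.41: both features count equally
```

This is why k-NN pipelines typically rescale numeric features (and encode nominal ones) before computing any distances.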