Ch. 10 Flashcards
K-nearest neighbors algorithm
You get a new data point
Look at its K nearest neighbors
Classify the new data point based on its nearest neighbors
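The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a tuned implementation: the function name knn_classify, the fruit data, and the choice of Euclidean distance are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_classify(points, new_point, k=3):
    """points: list of (features, label) pairs; new_point: a feature tuple."""
    # Sort the known points by Euclidean distance to the new point.
    by_distance = sorted(points, key=lambda p: math.dist(p[0], new_point))
    # Collect the labels of the k closest points and return the most common one.
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]

# Hypothetical data: (size, redness) features for two kinds of fruit.
fruit = [
    ((1.0, 1.0), "orange"),
    ((1.2, 0.9), "orange"),
    ((3.0, 3.1), "grapefruit"),
    ((3.2, 2.9), "grapefruit"),
    ((2.9, 3.3), "grapefruit"),
]

print(knn_classify(fruit, (3.0, 3.0), k=3))  # grapefruit
```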
Normalization example
Look at a user's average rating (e.g., 3.3) and raise or lower all of their ratings a little until the average reaches a common value (e.g., 3.5), so that ratings from different users can be compared on the same scale
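One simple way to do this, assuming "raise or lower a little" means adding the same constant to every rating, is sketched below. The function name normalize_ratings and the sample ratings are made up for illustration; other normalization schemes (such as rescaling) are also possible.

```python
def normalize_ratings(ratings, target_mean=3.5):
    """Shift every rating by the same amount so the average equals target_mean."""
    user_mean = sum(ratings) / len(ratings)
    shift = target_mean - user_mean  # positive raises ratings, negative lowers them
    return [r + shift for r in ratings]

# Hypothetical user whose ratings average 3.3.
ratings = [3.0, 4.0, 3.0, 3.2]
normalized = normalize_ratings(ratings)
print(round(sum(normalized) / len(normalized), 2))  # 3.5
```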
Classification
Categorization into a group
Regression
Predicting a response (like a number)
Cosine similarity
An alternative to the distance formula that compares the angle between two vectors instead of the distance between them
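As a rough sketch, cosine similarity is the dot product of two vectors divided by the product of their lengths. The function name and the sample vectors below are illustrative only, and the code assumes neither vector is all zeros (that would divide by zero).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (assumes neither is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Vectors pointing the same way score close to 1, even if their lengths differ.
same_direction = cosine_similarity((1, 2, 3), (2, 4, 6))
# Vectors at right angles score 0.
right_angle = cosine_similarity((1, 0), (0, 1))
print(round(same_direction, 6), right_angle)
```

Because only the angle matters, a user who rates everything one point lower than another user, but with the same relative preferences, can still come out as very similar.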
OCR
Optical character recognition
Having a computer recognize characters on a page, e.g., when scanning a book
What do OCR algorithms typically measure?
Lines
Points
Curves
Training
The first phase of OCR, where you go through images of numbers and extract features
Naive Bayes Classifier
Used in spam filters
Break a sentence into words and estimate, for each word, the probability that it appears in a spam email
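A toy version of this idea is sketched below. It is an assumption-laden illustration rather than a real spam filter: the function names train and classify, the four training emails, and the use of add-one smoothing with log probabilities are all choices made for this example. It also assumes the training data contains at least one email of each label.

```python
import math
from collections import Counter

def train(emails):
    """emails: list of (text, label) pairs, where label is 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in emails:
        label_counts[label] += 1
        # Break the sentence into words and count them per label.
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_emails = sum(label_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Start from the log prior: how common this label is overall.
        score = math.log(label_counts[label] / total_emails)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing out the estimate.
            p = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(p)
        scores[label] = score
    # Pick the label with the higher total log probability.
    return max(scores, key=scores.get)

# Hypothetical training data.
training = [
    ("win a free million dollars", "spam"),
    ("free prize claim your million now", "spam"),
    ("lunch meeting at noon", "ham"),
    ("see you at the meeting tomorrow", "ham"),
]
word_counts, label_counts = train(training)
print(classify("claim your free million", word_counts, label_counts))  # spam
print(classify("meeting at noon", word_counts, label_counts))          # ham
```

The "naive" part is the assumption that words occur independently of one another, which is why the per-word probabilities can simply be multiplied (here, added as logs).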
Feature
An aspect of the data that you compare, e.g., size, color, weight, or a numeric rating