Class 6 Flashcards
computational learning theory
lies at the intersection of AI, stats, and theoretical CS
sample complexity
number of training examples required for a learning algorithm to be probably approximately correct
approximately correct
a hypothesis that is consistent with a sufficiently large set of training examples and is therefore unlikely to have a high error rate on new examples
linear functions
“fitting a straight line”
linear regression
task of finding the best-fitting line for the data
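A minimal sketch of what linear regression computes, using the closed-form least-squares solution on made-up numbers (the data values are illustrative assumptions):

```python
import numpy as np

# Toy data, assumed for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.9, 4.1, 6.2, 7.8, 10.1])

# Closed-form least-squares fit of the line y = w1*x + w0.
w1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
w0 = y.mean() - w1 * x.mean()
print(f"best-fitting line: y = {w1:.3f}x + {w0:.3f}")
```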
weight space
all of the possible settings for the weights
alpha
step size, also called learning rate
epoch
one pass (step) through all of the training examples
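A minimal gradient-descent sketch tying the last few cards together: each update moves through weight space by a step of size alpha, and each epoch covers all the training examples (data and hyperparameter values are assumptions):

```python
import numpy as np

# Toy data, assumed for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.9, 4.1, 6.2, 7.8, 10.1])

w0, w1 = 0.0, 0.0  # one point in weight space
alpha = 0.01       # step size / learning rate
for epoch in range(1000):  # each epoch covers all training examples
    err = (w1 * x + w0) - y
    # Gradient of the mean squared loss with respect to each weight.
    w0 -= alpha * 2 * err.mean()
    w1 -= alpha * 2 * (err * x).mean()
print(f"y = {w1:.3f}x + {w0:.3f}")
```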
decision boundary
line that separates two classes
linear separator
linear decision boundary
logistic regression
process of fitting the weights of a model (a linear function passed through the logistic, or sigmoid, function) to minimize loss
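A minimal logistic-regression sketch on an assumed 1-D, two-class dataset: a linear function is passed through the sigmoid, and the weights are fit by gradient descent on the log loss:

```python
import numpy as np

# Toy two-class data, assumed for illustration.
x = np.array([0.5, 1.0, 1.5, 3.0, 3.5, 4.0])
y = np.array([0, 0, 0, 1, 1, 1])

w0, w1 = 0.0, 0.0
alpha = 0.5
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w1 * x + w0)))  # sigmoid of the linear model
    # Gradient of the log loss (cross-entropy).
    w0 -= alpha * (p - y).mean()
    w1 -= alpha * ((p - y) * x).mean()

# The decision boundary is where w1*x + w0 = 0.
print("decision boundary at x =", -w0 / w1)
```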
parametric model
learning model that summarizes data with a set of parameters of fixed size (independent of the number of training examples)
nonparametric model
learning model that cannot be characterized by a bounded set of parameters – this method retains all data points as part of the model
table lookup
simplest instance-based learning model – all training examples are put into a table; it doesn’t generalize well to unseen inputs
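A minimal table-lookup sketch with assumed toy entries, showing both the exact-match hit and the failure to generalize:

```python
# Store every training example verbatim in a table.
table = {(1, 1): "A", (2, 3): "B"}   # toy examples, assumed
print(table.get((1, 1)))  # "A"  -- exact match found
print(table.get((1, 2)))  # None -- doesn't generalize to unseen inputs
```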
curse of dimensionality
nearest neighbors works well in low dimensions with plenty of data, but in high-dimensional spaces nearly all points are far apart, so it doesn’t work well
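A minimal k-nearest-neighbors sketch (a nonparametric, instance-based model whose "model" is just the stored examples); the data and the value of k are assumptions:

```python
import numpy as np

# Stored training examples, assumed for illustration.
train_X = np.array([[1.0, 1.0], [1.5, 2.0], [5.0, 5.0], [6.0, 5.5]])
train_y = np.array([0, 0, 1, 1])

def knn_predict(query, k=3):
    # Rank stored examples by Euclidean distance to the query.
    dists = np.linalg.norm(train_X - query, axis=1)
    nearest = train_y[np.argsort(dists)[:k]]
    return np.bincount(nearest).argmax()  # majority vote

print(knn_predict(np.array([1.2, 1.5])))  # -> 0
```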
locality sensitive hash
hashing scheme that maps nearby points into the same bucket, using randomization to get around the exact-matching limitation of ordinary hash tables
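A minimal random-hyperplane LSH sketch: nearby points usually hash to the same bucket, so approximate nearest-neighbor lookup needs no exact match (the number of hyperplanes and the points are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
planes = rng.normal(size=(8, 2))  # 8 random hyperplanes in 2-D

def lsh_key(point):
    # One bit per hyperplane: which side the point falls on.
    return tuple((planes @ point > 0).astype(int))

buckets = {}
points = [np.array([1.0, 1.0]), np.array([1.1, 0.9]), np.array([-5.0, 4.0])]
for i, p in enumerate(points):
    buckets.setdefault(lsh_key(p), []).append(i)

print(buckets)  # the two nearby points usually share a bucket
```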
maximum margin separator
a decision boundary with the largest possible distance to example points – helps them generalize well
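A minimal sketch using scikit-learn's SVC to find a maximum margin separator on assumed toy data:

```python
import numpy as np
from sklearn.svm import SVC

# Toy two-class data, assumed for illustration.
X = np.array([[1.0, 1.0], [2.0, 1.5], [5.0, 5.0], [6.0, 5.5]])
y = np.array([0, 0, 1, 1])

svm = SVC(kernel="linear").fit(X, y)
# The separator w.x + b = 0 sits as far as possible from the examples;
# the support vectors are the points that pin down the margin.
print(svm.coef_, svm.intercept_)
print(svm.support_vectors_)
```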
kernel trick
SVMs can embed data into a higher dimensional space using this trick
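A minimal numeric check of the kernel trick: the quadratic kernel K(x, z) = (x . z)^2 equals an ordinary dot product in a higher-dimensional feature space, without ever constructing that space explicitly (the vectors are assumptions):

```python
import numpy as np

def phi(v):
    # Explicit quadratic feature map for 2-D input.
    x1, x2 = v
    return np.array([x1 * x1, x2 * x2, np.sqrt(2) * x1 * x2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

kernel = np.dot(x, z) ** 2          # computed in the original space
explicit = np.dot(phi(x), phi(z))   # computed in feature space
print(kernel, explicit)             # both print 16.0
```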
ensemble learning
selects a collection of hypotheses and combines their predictions by averaging, voting, or another level of machine learning
bagging
short for bootstrap aggregating
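A minimal bagging sketch: each ensemble member is trained on a bootstrap resample (sampling with replacement), and predictions are combined by majority vote; the data and ensemble size are assumptions:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # toy labels, assumed

trees = []
for _ in range(25):
    idx = rng.integers(0, len(X), size=len(X))  # bootstrap resample
    trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

def bagged_predict(x):
    votes = [t.predict([x])[0] for t in trees]
    return max(set(votes), key=votes.count)  # majority vote

print(bagged_predict([0.5, 0.5]))  # -> 1
```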
random forest model
form of decision tree bagging that takes extra steps (such as considering only a random subset of attributes at each split) to make the ensemble of K trees more diverse and reduce variance
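A minimal random forest sketch using scikit-learn, which layers the extra diversity steps (bootstrap resampling plus a random subset of attributes at each split) on top of plain tree bagging; the data and settings are assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] - X[:, 3] > 0).astype(int)  # toy labels, assumed

# max_features limits how many attributes each split may consider,
# which is what makes the K trees more diverse than plain bagging.
forest = RandomForestClassifier(n_estimators=50, max_features="sqrt",
                                random_state=0).fit(X, y)
print(forest.predict(X[:5]))
```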
boosting
most popular ensemble method; uses a weighted training set in which all examples start with equal weight, and an example’s weight is increased whenever the current hypothesis misclassifies it; with enough ensemble members it can overcome any amount of bias in the base model, and it approximates Bayesian learning
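A minimal AdaBoost-style sketch of boosting with decision stumps: every example starts at equal weight, and a misclassified example's weight is increased before the next round (data, round count, and the 1e-12 guard are assumptions):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 0] > 0, 1, -1)   # toy labels in {-1, +1}

w = np.full(len(X), 1.0 / len(X))  # all examples start with equal weight
stumps, alphas = [], []
for _ in range(10):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    pred = stump.predict(X)
    err = w[pred != y].sum()
    a = 0.5 * np.log((1 - err) / (err + 1e-12))  # hypothesis weight
    w *= np.exp(-a * y * pred)                   # up-weight mistakes
    w /= w.sum()
    stumps.append(stump)
    alphas.append(a)

def boosted_predict(x):
    score = sum(a * s.predict([x])[0] for a, s in zip(alphas, stumps))
    return int(np.sign(score))

print(boosted_predict([0.3, -1.0]))  # -> 1
```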
gradient boosting
form of boosting that uses gradient descent (fitting each new hypothesis to the residual errors of the ensemble so far) rather than reweighted examples
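A minimal gradient-boosting sketch for regression with squared loss: each new tree is fit to the current residuals (the negative gradient of the loss) instead of to reweighted examples; data and hyperparameters are assumptions:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)  # toy data

pred = np.zeros_like(y)
trees, lr = [], 0.1
for _ in range(100):
    residual = y - pred  # negative gradient of the squared loss
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    pred += lr * tree.predict(X)
    trees.append(tree)

print(np.mean((y - pred) ** 2))  # training error shrinks each round
```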
semisupervised learning
type of learning that uses a few labeled examples to mine more information from a large collection of unlabeled examples
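A minimal self-training sketch, one simple form of semisupervised learning: train on the few labeled examples, then pull in unlabeled examples the model labels confidently (data, the 0.95 threshold, and round count are assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(int)   # toy ground truth, assumed
labeled = np.arange(10)         # only 10 examples are labeled
unlabeled = np.arange(10, 200)

model = LogisticRegression().fit(X[labeled], y[labeled])
for _ in range(5):
    probs = model.predict_proba(X[unlabeled]).max(axis=1)
    confident = unlabeled[probs > 0.95]
    if len(confident) == 0:
        break
    # Adopt the model's own labels for the confident unlabeled points.
    pseudo_y = model.predict(X[confident])
    model = LogisticRegression().fit(
        np.vstack([X[labeled], X[confident]]),
        np.concatenate([y[labeled], pseudo_y]))
print(model.score(X, y))
```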
crowdsourcing
outsourcing work, such as labeling data, to paid workers or unpaid volunteers operating over the internet