Chapter 4: Machine Learning and Optimisation Flashcards
what datatypes do we have in machine learning
objects and pairwise relations
what data containers do we have and what do they do
scalar, vector, matrix, tensor
they hold the data types
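A minimal sketch of the four containers as numpy arrays (the values are illustrative):

```python
import numpy as np

scalar = np.array(3.0)                        # 0-D: a single value
vector = np.array([1.0, 2.0, 3.0])            # 1-D: an ordered list of values
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])   # 2-D: rows and columns
tensor = np.zeros((2, 3, 4))                  # 3-D and higher
```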
what data structures do we use, and what do they do
set, tree, graph
they let us organise collections of containers
what is a parameter
values that control model behaviour; they are learned during training.
a non-parametric model has none
give an example of a non-parametric model
KNN
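A minimal sketch of KNN classification, assuming Euclidean distance and numpy arrays; the function and variable names are illustrative:

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    """Classify x_query by majority vote among its k nearest training samples.
    No parameters are learned: the model simply stores the training data."""
    dists = np.linalg.norm(X_train - x_query, axis=1)   # distance to every stored sample
    nearest = np.argsort(dists)[:k]                     # indices of the k closest samples
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                    # majority vote
```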
what is an objective function
a function that is either maximised or minimised during training; also called the loss/cost
how it is driven depends on the learning type:
by example - supervised
by reward - reinforcement
by exploration - unsupervised
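A minimal sketch of an objective being minimised, assuming a mean-squared-error loss for a linear model; the names and the learning rate are illustrative:

```python
import numpy as np

def mse_loss(w, X, y):
    """Mean squared error: the objective (loss/cost) minimised during training."""
    residuals = X @ w - y
    return np.mean(residuals ** 2)

def gradient_step(w, X, y, lr=0.01):
    """One gradient-descent step that moves w towards a lower loss."""
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad
```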
describe the three approaches for decision making
1- direct: map straight from input to output
2 and 3- inference: give the probability of each class
2- calculate the probability directly
3- calculate the probability indirectly, i.e. using Bayes' theorem
give the three approaches for decision making
1 discriminant function
2 direct model
3 Bayes' theorem
within approach 1, give the different approaches
linear model
linear basis function
kernel method
within approach 2, give the different approaches
for classification- logistic regression
for regression- Bayesian linear regression
which method is used for approach 3
naïve Bayes
describe a linear model
the output is the weighted sum of the inputs
how is the linear model used for classification
apply a threshold function to the output. This creates a hyperplane, which acts as the separating (decision) boundary between the classes
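A minimal sketch of a linear discriminant, assuming a threshold of 0; w, b and the shapes are illustrative:

```python
import numpy as np

def linear_model(X, w, b):
    """Output is the weighted sum of the inputs plus a bias."""
    return X @ w + b

def classify(X, w, b, threshold=0.0):
    """Threshold the linear output. The points where X @ w + b equals the
    threshold form a hyperplane: the separating boundary between the classes."""
    return (linear_model(X, w, b) > threshold).astype(int)
```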
how can we create a ROC curve for a binary classifier
vary the threshold value, recording the true positive rate and false positive rate at each setting
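A minimal sketch of building ROC points by sweeping the threshold, assuming real-valued scores and 0/1 labels stored in numpy arrays:

```python
import numpy as np

def roc_points(scores, labels):
    """Collect (false positive rate, true positive rate) pairs, one per threshold."""
    points = []
    for threshold in np.unique(scores):
        predictions = scores >= threshold
        tpr = np.sum(predictions & (labels == 1)) / np.sum(labels == 1)
        fpr = np.sum(predictions & (labels == 0)) / np.sum(labels == 0)
        points.append((fpr, tpr))
    return points
```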
how do we handle non-linear data patterns
map the data into a new feature space where the pattern becomes linear. We then distinguish the input space from the feature space
what are the two basis functions described in this course
Gaussian basis function and polynomial basis function
what does the Gaussian basis function use as parameters
mean and variance
what does a polynomial basis function use as parameters
integer powers (the polynomial degrees)
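A minimal sketch of the two basis functions mapping a 1-D input space into a new feature space; the means, variances and degrees are illustrative:

```python
import numpy as np

def gaussian_basis(x, mean, variance):
    """Gaussian basis function, parameterised by a mean and a variance."""
    return np.exp(-(x - mean) ** 2 / (2 * variance))

def polynomial_basis(x, degree):
    """Polynomial basis function, parameterised by an integer power."""
    return x ** degree

x = np.linspace(-1.0, 1.0, 5)                         # points in the input space
features = np.column_stack([gaussian_basis(x, 0.0, 0.25),
                            polynomial_basis(x, 2),
                            polynomial_basis(x, 3)])  # the same points in the feature space
```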
describe the kernel method
like the basis function approach, it creates a non-linear separating boundary, but instead of the user choosing explicit basis functions it defines a kernel function directly
the kernel gives the inner products in the feature space
give example kernel functions
linear, polynomial, Gaussian, hyperbolic tangent
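A minimal sketch of the four kernels, each returning an inner product between two samples in some feature space; the hyperparameter values are illustrative:

```python
import numpy as np

def linear_kernel(x, z):
    return np.dot(x, z)

def polynomial_kernel(x, z, degree=3, c=1.0):
    return (np.dot(x, z) + c) ** degree

def gaussian_kernel(x, z, variance=1.0):
    return np.exp(-np.sum((x - z) ** 2) / (2 * variance))

def tanh_kernel(x, z, a=1.0, b=0.0):
    """Hyperbolic tangent (sigmoid) kernel."""
    return np.tanh(a * np.dot(x, z) + b)
```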
describe logistic regression
gives the probability of a sample belonging to a class.
apply the softmax (or sigmoid, for two classes) to a linear model to obtain the class posteriors.
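A minimal sketch of logistic regression's forward pass, assuming a weight matrix W with one row per class; the names are illustrative:

```python
import numpy as np

def softmax(a):
    """Turn linear scores into class posterior probabilities that sum to 1."""
    a = a - np.max(a)          # subtract the max for numerical stability
    e = np.exp(a)
    return e / np.sum(e)

def class_posteriors(x, W, b):
    """Logistic regression: softmax applied to a linear model of the input."""
    return softmax(W @ x + b)
```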
describe Gaussian Bayesian linear regression
for regression
assumes the output follows a Gaussian distribution
treat w and σ^2 as random variables
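A minimal sketch of the posterior over w, assuming a zero-mean Gaussian prior on w and, for simplicity, a known noise variance rather than also treating σ^2 as random:

```python
import numpy as np

def posterior_over_weights(X, y, noise_var=1.0, prior_var=1.0):
    """Posterior p(w | X, y) for Bayesian linear regression with a Gaussian
    likelihood on the outputs and a zero-mean Gaussian prior on w."""
    d = X.shape[1]
    precision = np.eye(d) / prior_var + X.T @ X / noise_var   # posterior precision
    cov = np.linalg.inv(precision)                            # posterior covariance
    mean = cov @ X.T @ y / noise_var                          # posterior mean
    return mean, cov
```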
how do we infer p(c | x)
Bayes' theorem
what is approach 3
Instead of calculating the probability directly, we apply Bayes' theorem.
give p(c | x) (Bayes' theorem)
p(c | x) = p(x | c) p(c) / p(x)
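A minimal sketch of Bayes' theorem as used by naïve Bayes, assuming discrete features and the naïve conditional-independence assumption; the probability tables are illustrative:

```python
import numpy as np

def naive_bayes_posterior(x, priors, likelihoods):
    """p(c | x) = p(x | c) p(c) / p(x), with p(x | c) factorised over the features."""
    unnormalised = np.array([
        priors[c] * np.prod([likelihoods[c][i][value] for i, value in enumerate(x)])
        for c in range(len(priors))
    ])
    return unnormalised / unnormalised.sum()   # dividing by p(x)

# illustrative numbers: 2 classes, 2 binary features
priors = [0.6, 0.4]
likelihoods = [[[0.8, 0.2], [0.7, 0.3]],   # likelihoods[c][i][v] = p(feature i = v | class c)
               [[0.3, 0.7], [0.4, 0.6]]]
print(naive_bayes_posterior([1, 0], priors, likelihoods))
```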