Lecture 7: Machine Learning, the Bigger Picture (Flashcards)
How many types of machine learning methods are there?
Two types: supervised learning and unsupervised learning.
What is supervised learning?
Learning rules that describe the input/output relationship.
Two forms: regression and classification.
What is unsupervised learning?
Learning rules that describe the input only.
Two forms: dimension reduction and clustering.
Is it true that regression belongs to supervised learning?
T
Is it true that classification belongs to supervised learning?
T
Does dimension reduction belong to supervised learning?
No
Does clustering belong to supervised learning?
No
What is regression?
The task of fitting a model to training data. The learned model can then be used to make predictions of a continuous-valued output.
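As a rough illustration of regression, the sketch below fits a straight line to made-up training data and uses it to predict a continuous value. The data, the noise level, and the `predict` helper are assumptions chosen for the example, not part of the lecture.

```python
import numpy as np

# Hypothetical training data: y is roughly 2*x + 1 plus a little noise.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 10.0, 50)
y_train = 2.0 * x_train + 1.0 + rng.normal(0.0, 0.1, size=x_train.shape)

# Fit a straight line (degree-1 polynomial) by least squares.
slope, intercept = np.polyfit(x_train, y_train, deg=1)

def predict(x):
    """Return the continuous-valued prediction of the fitted line."""
    return slope * x + intercept

# Prediction for an input that was not in the training data.
y_new = predict(12.0)
print(slope, intercept, y_new)
```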
What is classification?
Instead of a continuous-valued output, classification outputs a prediction from a discrete set of values, e.g. collapsed or standing.
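A minimal sketch of classification, assuming a nearest-centroid rule (one simple classifier among many, not necessarily the one used in the lecture). The feature values are invented; only the two labels come from the card above.

```python
import numpy as np

# Hypothetical 2-feature training samples with labels from a discrete set.
X_train = np.array([[1.0, 1.2], [0.8, 1.0], [1.1, 0.9],
                    [4.0, 4.2], [3.9, 3.8], [4.2, 4.1]])
y_train = np.array(["standing", "standing", "standing",
                    "collapsed", "collapsed", "collapsed"])

# One centroid (mean feature vector) per class.
classes = np.unique(y_train)
centroids = np.array([X_train[y_train == c].mean(axis=0) for c in classes])

def classify(x):
    """Return the label of the class whose centroid is nearest to x."""
    distances = np.linalg.norm(centroids - np.asarray(x), axis=1)
    return classes[np.argmin(distances)]

print(classify([1.0, 1.0]), classify([4.0, 4.0]))
```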
What is clustering?
It identifies structure in data by grouping samples that share the same characteristics.
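One common way to group samples is k-means; the sketch below is a bare-bones version under the assumption of two well-separated groups of made-up points. The function name `kmeans` and its parameters are illustrative only.

```python
import numpy as np

def kmeans(X, k, n_iters=20, seed=0):
    """Very small k-means: returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct samples chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign each sample to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of the samples assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Hypothetical unlabeled data: two blobs around (0, 0) and (5, 5).
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=0.0, scale=0.3, size=(20, 2))
blob_b = rng.normal(loc=5.0, scale=0.3, size=(20, 2))
X = np.vstack([blob_a, blob_b])

labels, centers = kmeans(X, k=2)
print(labels)
```

Note that no output labels are supplied; the grouping comes from the inputs alone.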
What is dimension reduction?
It is used when a data set has a lot of features and you want to reduce the number of features while keeping most of the data's original variability. Dimension reduction using principal component analysis (PCA) reduces the dimensionality of the data set by transforming the data into fewer dimensions that still capture most of the variance.
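A small sketch of PCA via the singular value decomposition, assuming made-up data in which the third feature is almost a combination of the first two. The helper name `pca_reduce` is hypothetical.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its top principal components.

    Returns the reduced data and the fraction of variance
    captured by each kept component.
    """
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by variance.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return X_centered @ components.T, explained[:n_components]

# Hypothetical data: 3 features, but the third is nearly a + b.
rng = np.random.default_rng(0)
a = rng.normal(size=100)
b = rng.normal(size=100)
X = np.column_stack([a, b, a + b + 0.01 * rng.normal(size=100)])

Z, ratio = pca_reduce(X, n_components=2)
print(Z.shape, ratio.sum())
```

Here two dimensions should retain nearly all of the variance because the third feature carries almost no independent information.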
What are the typical steps in machine learning?
Machine learning is typically done through:
- Data collection (available sources, your own collection)
- Feature design (manual versus automated)
- Model training
- Model validation
What is the training data set?
The set of data on which the actual training takes place. It is the biggest set, and the one the model learns from. An analogy would be example/homework problems.
What is the validation set?
The set of data used during the training phase to provide an unbiased evaluation of the model's performance and to fine-tune the model. It helps select the model. An analogy would be a sample/mock exam.
What is the test data set?
After the model has been fully trained, the test set is used to assess and report the model's performance. An analogy would be the actual exam.
How do we divide the data into training, validation and test sets?
We split the data randomly.
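A random split can be sketched by shuffling the sample indices and slicing them. The 60/20/20 proportions and the helper name `split_indices` are assumptions for illustration; the lecture does not state specific fractions.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Return shuffled (train, validation, test) index arrays."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(train_frac * n_samples)
    n_val = int(val_frac * n_samples)
    train_idx = idx[:n_train]
    val_idx = idx[n_train:n_train + n_val]
    test_idx = idx[n_train + n_val:]  # whatever remains
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = split_indices(100)
print(len(train_idx), len(val_idx), len(test_idx))
```

Each sample lands in exactly one of the three sets, and the training set is the largest.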
What is training error?
Training error measures how well a model fits the training data.
What is validation error?
Validation error measures how well the model performs on the validation set. It helps in model selection and tuning during development.
What is test error?
Test error measures how well a model generalizes to new data. In machine learning, test error is the one we ultimately care about.
Training error is high for a low-complexity / less flexible model, i.e. underfitting (T/F)
T
Test error initially goes down as model complexity increases, but eventually goes up (overfitting). (T/F)
T
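The relationship between model complexity and error can be explored with polynomial fits of increasing degree. This is only a sketch on made-up noisy sine data; the degrees, sample sizes, and noise level are assumptions. Training error cannot go up as the degree grows here, while the test error may or may not rise for the highest degree, depending on the random draw.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_data(n):
    """Hypothetical data: a sine curve plus Gaussian noise."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.2, size=n)
    return x, y

x_train, y_train = make_data(20)
x_test, y_test = make_data(200)

def mse(model, x, y):
    """Mean squared error of a fitted model on (x, y)."""
    return float(np.mean((model(x) - y) ** 2))

# Degree plays the role of model complexity.
errors = {}
for degree in (1, 3, 9):
    model = Polynomial.fit(x_train, y_train, degree)
    errors[degree] = (mse(model, x_train, y_train),
                      mse(model, x_test, y_test))

for degree, (train_err, test_err) in errors.items():
    print(degree, train_err, test_err)
```

The degree-1 model is too simple for the sine shape, so both its errors should come out high, which is the underfitting case.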
In supervised learning, one can have only one input feature. T/F (prof)
False
Pandas is imported in your program
– To read a .csv file
– To see some rows of the imported data on your screen
– To get some basic info about the data on your screen
– All of the above
All of the above
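The three uses of pandas listed above might look like this. The CSV content is invented, and it is read from an in-memory string so the sketch runs without a file on disk; with a real file you would pass its path to `pd.read_csv` instead.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real .csv file.
csv_text = "x,y\n1,2.0\n2,4.1\n3,5.9\n"

# Read the CSV data into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Show the first rows of the imported data.
print(df.head())

# Print basic info: column names, non-null counts, dtypes.
df.info()

# Summary statistics for the numeric columns.
print(df.describe())
```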
Clustering is an example of supervised learning
– True or False
False
Regression is an example of supervised learning
– True or False
True
Classification is an unsupervised learning method
– True or False
False
Test error does not depend on the model complexity at all
– False or True
False
Training error always increases as the model complexity increases
– False or True
False
Our objective is to have the model
– Overfit
– Underfit
– High complexity
– None of the above