Chapter 4 Components of Learning Flashcards
How is data usually represented?
As a matrix: each row is one example and each column is one attribute
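A minimal sketch of the matrix view, using plain Python lists. The numbers and the names `data`, `n_rows`, `n_cols`, and `first_column` are made up for illustration and are not from the chapter.

```python
# Hypothetical toy dataset: each inner list is one example (a row),
# and each position inside it is one attribute (a column).
data = [
    [5.1, 3.5, 1.4],
    [4.9, 3.0, 1.4],
    [6.2, 3.4, 5.4],
    [5.9, 3.0, 5.1],
]

n_rows = len(data)      # number of examples
n_cols = len(data[0])   # number of attributes per example

# A column holds the same attribute across all examples.
first_column = [row[0] for row in data]

print(n_rows, n_cols)
print(first_column)
```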
What do data scientists spend the most time on?
Cleaning data
Target function
An unknown function that maps X to Y. We never see this function directly; the goal of learning is to approximate it
Learning job
Create a hypothesis function that also maps X to Y and agrees with the target function as closely as possible
Learning steps (5)
- There is an unknown target function f that maps X to Y
- We have training examples, which are fed to a learning algorithm
- The learning algorithm has a set of candidate hypotheses to choose from
- From the training examples and the candidate hypotheses, the algorithm produces a final hypothesis
- We hope this final hypothesis is very close to the target function
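The five steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the threshold target `f`, the candidate set of threshold rules, and the "fewest training errors" selection rule are all invented here, and a real learner never gets to look at `f`.

```python
# Step 1: the unknown target function. It is written out only so the sketch
# can generate labels; the learning algorithm below never calls it directly.
def f(x):
    return 1 if x >= 5 else 0

# Step 2: training examples, pairs of (input, output of the target).
training_examples = [(x, f(x)) for x in [0, 2, 4, 6, 8]]

# Step 3: a set of candidate hypotheses (assumed: threshold rules 0..9).
def make_hypothesis(threshold):
    return lambda x: 1 if x >= threshold else 0

hypothesis_set = {t: make_hypothesis(t) for t in range(10)}

# Step 4: the learning algorithm picks the candidate with the fewest
# mistakes on the training examples; that candidate is the final hypothesis.
def training_errors(h):
    return sum(1 for x, y in training_examples if h(x) != y)

best_t = min(hypothesis_set, key=lambda t: training_errors(hypothesis_set[t]))
g = hypothesis_set[best_t]

# Step 5: check how closely the final hypothesis matches the target
# on inputs that were not in the training examples.
unseen = [1, 3, 5, 7, 9]
agreement = sum(1 for x in unseen if g(x) == f(x)) / len(unseen)
print(best_t, training_errors(g), agreement)
```

Note that more than one threshold fits the training examples perfectly, so a perfect fit on seen data does not by itself guarantee that the final hypothesis equals the target.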
Learner: input and output
Input is seen (training) data; output is a classifier
Classifier: input and output
Input is unseen data; output is a response (prediction) for that data
A model is an
artifact. The learner builds the model, and the classifier uses that model to predict
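A rough sketch of how the learner, model, and classifier cards fit together. The nearest-mean rule and the names `build_model` and `learner` are assumptions chosen for brevity, not the chapter's method.

```python
def build_model(seen_data):
    """Build the model (the artifact): one mean value per label."""
    sums, counts = {}, {}
    for value, label in seen_data:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def learner(seen_data):
    """Take seen data as input and return a classifier as output."""
    model = build_model(seen_data)

    def classifier(unseen_value):
        # The classifier uses the stored model to respond to unseen data:
        # it returns the label whose mean is closest to the new value.
        return min(model, key=lambda label: abs(model[label] - unseen_value))

    return classifier

seen_data = [(1.0, "small"), (2.0, "small"), (9.0, "large"), (11.0, "large")]
classifier = learner(seen_data)
print(classifier(8.0))
```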
Curse of dimensionality
The challenges and complications that come from data with very many dimensions: the number of possible cases grows so quickly with the number of dimensions that the data cannot cover every single case
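A small arithmetic sketch of why every case cannot be covered. It assumes, purely for illustration, that each attribute takes 10 distinct values and that the dataset has one million examples; `count_cases` and `best_case_coverage` are hypothetical helper names.

```python
def count_cases(values_per_attribute, n_dimensions):
    """Number of distinct attribute combinations: k ** d."""
    return values_per_attribute ** n_dimensions

def best_case_coverage(n_examples, values_per_attribute, n_dimensions):
    """Upper bound on the fraction of cases the data could cover,
    reached only if every example were a different case."""
    return min(1.0, n_examples / count_cases(values_per_attribute, n_dimensions))

n_examples = 1_000_000  # assumed dataset size
for d in (1, 3, 6, 10, 20):
    cases = count_cases(10, d)
    coverage = best_case_coverage(n_examples, 10, d)
    print(f"dimensions={d:2d}  cases={cases:.0e}  coverage<={coverage:.0e}")
```

Under these assumptions the data can cover every case only up to 6 dimensions; beyond that the covered fraction shrinks by a factor of 10 with each added dimension.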
Generalizing
Being able to perform well on data that the model has not seen before
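One way to see the difference between memorizing and generalizing is to score predictions on data held out from training. In this sketch the parity task, the `memorizer`, and the hand-written `rule` (standing in for a model that captured the underlying pattern) are all hypothetical.

```python
train = [(1, "odd"), (2, "even"), (3, "odd"), (4, "even")]
unseen = [(5, "odd"), (6, "even"), (7, "odd")]

# A memorizer stores the training pairs and knows nothing else.
lookup = dict(train)

def memorizer(x):
    return lookup.get(x)  # None for any input not seen during training

# Stand-in for a model that captured the pattern behind the labels.
def rule(x):
    return "even" if x % 2 == 0 else "odd"

def accuracy(predict, examples):
    """Fraction of examples for which the prediction matches the label."""
    return sum(1 for x, y in examples if predict(x) == y) / len(examples)

print(accuracy(memorizer, train), accuracy(memorizer, unseen))
print(accuracy(rule, train), accuracy(rule, unseen))
```

Both predictors are perfect on the seen data; only the second one keeps its accuracy on the unseen data.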
Selection
Selecting the data you need
Preprocessing
Clean the data and work out what needs to be removed
Transformation
Convert the data into the shape you want, for example by adding or removing attributes
Data Mining
Extract patterns from the data
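The four cards above (selection, preprocessing, transformation, data mining) can be read as stages of one pipeline. The sketch below runs them on a made-up list of records; the field names, the age cutoff, and the simple counting step used as "mining" are all assumptions for illustration.

```python
from collections import Counter

# Hypothetical raw records; None marks a missing value.
raw_records = [
    {"id": 1, "age": 34, "city": "Oslo", "bought": "tea", "note": "n/a"},
    {"id": 2, "age": None, "city": "Oslo", "bought": "tea", "note": ""},
    {"id": 3, "age": 51, "city": "Rome", "bought": "coffee", "note": ""},
    {"id": 4, "age": 29, "city": "Rome", "bought": "coffee", "note": ""},
    {"id": 5, "age": 45, "city": "Oslo", "bought": "tea", "note": ""},
]

# Selection: keep only the attributes that are needed.
selected = [{k: r[k] for k in ("age", "city", "bought")} for r in raw_records]

# Preprocessing: clean the data by dropping records with missing values.
cleaned = [r for r in selected if all(v is not None for v in r.values())]

# Transformation: add a derived attribute (age_group) and remove the raw one.
transformed = [
    {
        "age_group": "under_40" if r["age"] < 40 else "40_plus",
        "city": r["city"],
        "bought": r["bought"],
    }
    for r in cleaned
]

# Data mining: look for a pattern, here how often each
# (city, product) combination occurs.
pattern_counts = Counter((r["city"], r["bought"]) for r in transformed)
print(pattern_counts.most_common())
```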
Supervised learning
The model is trained on labeled data, that is, input-output pairs. The algorithm learns to map inputs to outputs