Machine learning Flashcards
What is machine learning?
A computer system or piece of software learns by itself: it builds models and trains them on data so that they can predict future outputs
What is supervised learning?
Infers a function from labelled training data
What is unsupervised learning?
Infers a function from unlabelled data
What is reinforcement learning?
Learns over time via trial and error using feedback
- Reward from actions
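A minimal sketch of trial-and-error learning, assuming a two-armed bandit problem and an epsilon-greedy strategy (neither is named in the cards; the reward probabilities, step count and `epsilon` value are made up for illustration). The agent tries actions, receives a reward as feedback, and keeps a running estimate of how good each action is.

```python
import random

def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit whose arms pay 1 with a fixed probability."""
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms        # how often each action has been tried
    estimates = [0.0] * n_arms   # running mean reward per action

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random action
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit best estimate
        # Feedback from the environment: reward 1 or 0
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the mean reward for the chosen action
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

# Hypothetical environment: action 1 pays off more often than action 0.
estimates, counts = run_bandit([0.2, 0.8])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated rewards:", estimates, "best action:", best_arm)
```

Over time the estimate for the better action should approach its true reward rate, so the agent ends up preferring it.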
Give some examples of supervised learning
- Linear regression
- Decision tree
- Artificial neural networks
Give some examples of unsupervised learning
- Clustering
- Association rules
What is top-down machine learning?
Model different functions and wire them together
- Deduction
What is bottom-up machine learning?
Give the system lots of data so it can discover the concepts by itself
- Induction
How does supervised learning work?
- Data pre-processing
- Partition data into training and testing
- Train model
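The steps above can be sketched in plain Python. This is an illustrative sketch, not a definitive pipeline: the data (hours studied against pass/fail), the min-max scaling, the hold-out split and the one-threshold "decision stump" model are all assumptions. The final accuracy check is an added step that the testing partition implies.

```python
def min_max_scale(values):
    """Rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def train_stump(xs, ys):
    """Pick the threshold t with the fewest training errors for the rule 'x > t means 1'."""
    best_t, best_errors = None, len(xs) + 1
    for t in sorted(set(xs)):
        errors = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t

def predict(t, x):
    return 1 if x > t else 0

# Hypothetical labelled data: hours studied and whether the student passed (1) or not (0).
hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
passed = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

# Step 1: data pre-processing
xs = min_max_scale(hours)

# Step 2: partition into training and testing (every third example is held out)
test_idx = set(range(1, len(xs), 3))
train = [(x, y) for i, (x, y) in enumerate(zip(xs, passed)) if i not in test_idx]
test = [(x, y) for i, (x, y) in enumerate(zip(xs, passed)) if i in test_idx]

# Step 3: train the model on the training partition only
threshold = train_stump([x for x, _ in train], [y for _, y in train])

# Added step: measure accuracy on the held-out testing partition
accuracy = sum(predict(threshold, x) == y for x, y in test) / len(test)
print("threshold:", threshold, "test accuracy:", accuracy)
```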
How does unsupervised learning work?
- Data pre-processing
- Clustering or association technique
How does clustering work?
1. Choose number of clusters K
2. Initialise K cluster centroids randomly
3. Assign each data point to the nearest cluster centroid
4. Update each centroid by computing the mean of all the data points assigned to it
5. Repeat steps 3 and 4 until the assignments stop changing
6. Output final cluster assignments and centroids
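The k-means procedure described above can be sketched as follows. The assumptions here are mine: initial centroids are drawn at random from the data points (one common choice), distance is squared Euclidean, the loop stops when assignments no longer change, and the four sample points are made up.

```python
import random

def k_means(points, k, seed=0, max_iters=100):
    """Cluster a list of coordinate tuples into k groups; returns (assignments, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise K centroids randomly from the data
    assignments = None
    for _ in range(max_iters):
        # Assign each data point to the nearest centroid
        new_assignments = [
            min(range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])))
            for point in points
        ]
        if new_assignments == assignments:
            break  # nothing changed, so the clustering has converged
        assignments = new_assignments
        # Update each centroid to the mean of the points assigned to it
        for c in range(k):
            members = [pt for pt, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids

# Hypothetical data: two small groups lying far apart.
points = [(0.0, 0.0), (1.0, 1.0), (10.0, 10.0), (11.0, 11.0)]
assignments, centroids = k_means(points, k=2)
print("assignments:", assignments)
print("centroids:", centroids)
```

On less tidy data the result can depend on the random initialisation, so in practice the algorithm is often run several times with different seeds.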
How does association work?
- Discover correlation between two or more variables
- Produce dependency rules to predict occurrence of x with y
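A small sketch of how a dependency rule "x implies y" might be scored. The cards do not name any measures; support and confidence are the usual ones and are assumed here, and the shopping-basket transactions are invented for illustration.

```python
# Hypothetical transactions: each set is one shopping basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions)

def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)

def confidence(x, y, transactions):
    """Of the transactions containing x, the fraction that also contain y."""
    return count(x | y, transactions) / count(x, transactions)

print("support(bread and milk):", support({"bread", "milk"}, transactions))
print("confidence(bread -> milk):", confidence({"bread"}, {"milk"}, transactions))
```

A rule is usually kept only if both values clear chosen thresholds, so that it is neither rare nor unreliable.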
What are the three pillars that machine learning is built from?
- Models and algorithms
- Powerful and cheap computation
- Massive data warehouses
What is data mining?
The exploration and analysis of large quantities of data to discover valid, novel, useful and understandable patterns in data
What is the difference between machine learning and data mining?
- Machine learning predicts using models
- Data mining explains patterns
What is regression?
Models the relationship between a dependent variable Y and an independent variable X
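As a sketch, the simplest such relationship is a straight line Y = slope * X + intercept, fitted by ordinary least squares. The least-squares method and the sample numbers below are assumptions for illustration; the cards do not say how the line is found.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of X and Y divided by the variance of X
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data that lies exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print("slope:", slope, "intercept:", intercept)
print("prediction for x = 6:", slope * 6 + intercept)
```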
How do we describe a linear regression model?
An underfitted model
A good model
An overfitted model
What is meant by underfitting a model?
A model which doesn’t capture the underlying logic of the data
- High loss
- Low accuracy
What is meant by a good model?
Captures the underlying logic of the dataset
- Low loss
- High accuracy
What is meant by an overfitted model?
Captures all the noise, so “misses the point”. Overly complex, with lots of parameters
- Low loss on the training data
- Low accuracy on new data
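The three cases can be compared on a toy problem. This is a sketch under stated assumptions: loss is taken to be mean squared error, the underfitted model is a constant (one parameter), the good model is a least-squares line (two parameters), and the overfitted model is a polynomial forced through every training point (one parameter per point). The data is invented: a true line y = 2x + 1 with +1/-1 noise added to the training labels only.

```python
def mse(model, xs, ys):
    """Mean squared error of a model (a function of x) on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_constant(xs, ys):
    """Underfit candidate: always predict the mean of the training labels."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Least-squares straight line."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return lambda x: mean_y + slope * (x - mean_x)

def fit_interpolant(xs, ys):
    """Overfit candidate: Lagrange polynomial passing through every training point."""
    def model(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return model

# Hypothetical data: noisy training labels, noise-free test labels.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [2 * x + 1 + (1 if x % 2 == 0 else -1) for x in train_x]
test_x = [0.5, 1.5, 2.5, 3.5, 4.5]
test_y = [2 * x + 1 for x in test_x]

results = {}
for name, fit in [("constant", fit_constant), ("line", fit_line),
                  ("interpolating polynomial", fit_interpolant)]:
    model = fit(train_x, train_y)
    results[name] = (mse(model, train_x, train_y), mse(model, test_x, test_y))
    print(f"{name}: train loss {results[name][0]:.3f}, test loss {results[name][1]:.3f}")
```

The constant model has high loss on both sets. The interpolating polynomial reaches zero training loss by fitting the noise, yet does worse than the line on the test points.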
How may overfitting occur?
Training data size is too small
-> take more samples (could use deep learning GANs to generate them)
How may underfitting occur?
Model is too simple, with too few parameters
-> more training time or input features
What are the advantages of regression?
- Short training time
- Easy to interpret
- Easy to implement
What are the disadvantages of regression?
- Sensitive to noise and outliers (overfitting)
- Cannot handle complicated relationships (linear only)