ML Overview Flashcards
What are the 6 stages of the ML pipeline?
- Define the problem
- Data collection
- Data preprocessing
- Data modelling / machine learning
- Model evaluation
- Model application (on new/unseen data)
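The six stages above can be sketched end to end in a few lines. This is a minimal illustration, not a reference implementation: the fruit data is invented, the helper names (fit_scaler, scale, fit_centroids, predict) are hypothetical, and a nearest-centroid classifier stands in for "the model" only because it is short.

```python
# 1. Define the problem: tell apples from melons using weight (g) and diameter (cm).

# 2. Data collection (toy values, invented for illustration).
raw = [
    ((150.0, 7.0), "apple"), ((170.0, 7.5), "apple"),
    ((140.0, 6.5), "apple"), ((160.0, 7.2), "apple"),
    ((1200.0, 20.0), "melon"), ((1500.0, 22.0), "melon"),
    ((1100.0, 19.0), "melon"), ((1300.0, 21.0), "melon"),
]
train = raw[:3] + raw[4:7]   # used to fit the scaler and the model
test = [raw[3], raw[7]]      # held back for evaluation

# 3. Data preprocessing: min-max scale each attribute using training data only.
def fit_scaler(rows):
    columns = zip(*(x for x, _ in rows))
    return [(min(c), max(c)) for c in columns]

def scale(x, ranges):
    return tuple((v - lo) / (hi - lo) for v, (lo, hi) in zip(x, ranges))

ranges = fit_scaler(train)

# 4. Modelling: store one centroid (mean point) per class.
def fit_centroids(rows, ranges):
    groups = {}
    for x, label in rows:
        groups.setdefault(label, []).append(scale(x, ranges))
    return {
        label: tuple(sum(col) / len(col) for col in zip(*points))
        for label, points in groups.items()
    }

def predict(x, centroids, ranges):
    z = scale(x, ranges)
    # Pick the class whose centroid is closest (squared Euclidean distance).
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(z, centroids[label])),
    )

centroids = fit_centroids(train, ranges)

# 5. Model evaluation: accuracy on the held-back examples.
accuracy = sum(predict(x, centroids, ranges) == y for x, y in test) / len(test)

# 6. Model application: classify a new, unseen fruit.
new_prediction = predict((155.0, 7.1), centroids, ranges)
print(accuracy, new_prediction)
```

Note that the scaler is fitted on the training rows only, so the evaluation step sees data the earlier stages never touched.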
Define ML
Given input data and the corresponding answers, the computer learns rules that it can then apply to new situations.
What are the 4 types of data sets?
- Record
- Graph and network
- Ordered
- Spatial, image, multimedia
What are data objects?
- Make up datasets
- Represent the entity being measured
- Are a row in a database
- Aka entities, instances, points, samples, tuples, patterns, vectors, examples
What are data attributes?
- Describe the data objects
- Are the columns in the database
- Aka features, variables, dimensions, predictors
What are the 5 types of data attributes?
- Nominal - categories
- Binary - 0,1
- Ordinal - meaningful order but magnitude between successive values is not necessarily meaningful
- Interval scaled - equal sized units, ordered, no true zero
- Ratio scaled - as per interval but with a true zero
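One way to remember the five attribute types is by which summary is meaningful for each. The sketch below uses invented example values; the variable names are illustrative only.

```python
import statistics

# Nominal: categories with no order, so only counting (the mode) makes sense.
colours = ["red", "green", "red", "blue"]
mode_colour = statistics.mode(colours)

# Binary: two states coded 0/1, so the mean is the share of 1s.
is_member = [1, 0, 1, 1]
member_share = sum(is_member) / len(is_member)

# Ordinal: order is meaningful, so a median is fine, but the gaps between
# ranks are not necessarily equal, so a mean of ranks is not.
size_rank = {"small": 0, "medium": 1, "large": 2}
sizes = ["small", "large", "medium"]
median_size = sorted(sizes, key=size_rank.get)[len(sizes) // 2]

# Interval scaled: differences are meaningful, ratios are not (no true zero).
celsius = (10.0, 20.0)
celsius_diff = celsius[1] - celsius[0]        # 10 degrees warmer: meaningful
misleading_ratio = celsius[1] / celsius[0]    # 2.0, but 20 C is not "twice as hot"

# Ratio scaled: a true zero exists, so ratios are meaningful.
kelvin = tuple(c + 273.15 for c in celsius)
kelvin_ratio = kelvin[1] / kelvin[0]          # roughly 1.035
masses_kg = (2.0, 4.0)
mass_ratio = masses_kg[1] / masses_kg[0]      # 4 kg really is twice 2 kg

print(mode_colour, member_share, median_size, celsius_diff, round(kelvin_ratio, 3), mass_ratio)
```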
3 major categories of machine learning
Supervised
Unsupervised
Reinforcement
Define supervised ML
Learning with a labelled training dataset
Define unsupervised ML
Learning patterns in unlabelled data
Define reinforcement ML
Learning based on feedback and reward
2 types of supervised ML
Classification
Regression
2 types of unsupervised ML
Clustering
Anomaly detection
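Both unsupervised tasks can be sketched on unlabelled one-dimensional data. This is an illustration under assumptions: k-means is used as one example of clustering, a z-score rule as one example of anomaly detection, and the starting centroids and the threshold of 2.0 are arbitrary choices made for this toy data.

```python
import statistics

def kmeans_1d(values, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters

def zscore_anomalies(values, threshold=2.0):
    """Flag values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) / sd > threshold]

# Clustering: no labels are given, the two groups emerge from the data.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 6.0])

# Anomaly detection: one reading is far from the rest.
readings = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 25.0]
flagged = zscore_anomalies(readings)
print(centroids, clusters, flagged)
```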
2 types of reinforcement learning
Game play
Control
3 regression methods
Linear regression, polynomial regression
Ridge regression, LASSO, elastic net
Artificial neural networks
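For a single predictor, linear regression and ridge regression both have closed-form solutions, which makes the effect of the ridge penalty easy to see. The sketch below assumes one input variable and an unpenalised intercept; the function names and the penalty value are illustrative.

```python
def fit_linear(xs, ys, ridge_lambda=0.0):
    """Least-squares fit of y = intercept + slope * x.

    With ridge_lambda = 0 this is ordinary linear regression.
    With ridge_lambda > 0 the slope is shrunk towards zero (ridge),
    while the intercept is left unpenalised.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / (s_xx + ridge_lambda)
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Toy data generated from y = 1 + 2x (invented for illustration).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

ols_intercept, ols_slope = fit_linear(xs, ys)
ridge_intercept, ridge_slope = fit_linear(xs, ys, ridge_lambda=10.0)
print(ols_intercept, ols_slope, ridge_intercept, ridge_slope)
```

LASSO and elastic net also penalise the size of the coefficients but use an absolute-value (L1) term, alone or combined with the squared term, and in general are fitted iteratively rather than with one formula like this.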
4 classification methods
Logistic regression
K nearest neighbours
Support vector machines
Decision trees
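Of the four classification methods, k nearest neighbours is the simplest to sketch: a new point takes the majority label of its k closest training points. The example below is a minimal version assuming Euclidean distance and k = 3; the data and the function name knn_predict are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    neighbours = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Labelled training data: (attributes, class label).
train = [
    ((1.0, 1.0), "a"), ((1.0, 2.0), "a"), ((2.0, 1.0), "a"),
    ((6.0, 6.0), "b"), ((7.0, 6.0), "b"), ((6.0, 7.0), "b"),
]

near_a = knn_predict(train, (1.5, 1.5))
near_b = knn_predict(train, (6.5, 6.5))
print(near_a, near_b)
```

There is no training step here beyond storing the data, which is why k nearest neighbours is often called a lazy learner.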