Lecture 1 - Introduction Flashcards
What is machine learning about
Machine learning is about using the right features to build the right models that achieve the right tasks
What are Tasks
Problems that can be solved with machine learning
What are Predictive Tasks
Predicting a target variable from a number of features
What are descriptive tasks
exploiting the underlying structure of the data
What are 3 Predictive Tasks
- Classification
- Regression
- Predictive clustering
What are 3 Descriptive Tasks
- Descriptive clustering
- Association rule mining
- Subgroup Discovery
Descriptive tasks
What is the difference between predictive and descriptive tasks
The output of a predictive model involves a target variable, while the output of a descriptive model does not.
Predictive Classification
What is Classification
Also give 2 examples
Classification tasks predict a categorical target variable from a set of features.
* image classification
* weather type prediction
Predictive regression
What is regression
Give 2 examples
Regression tasks predict a numerical target variable from a set of features
* stock price forecasting
* weather temperature forecast
Predictive clustering
What is predictive clustering
give 1 example
Predictive clustering learns clusters with the intention of using them to assign labels to instances (predicting a target)
* fraud detection
Descriptive Clustering
What is descriptive clustering
give 2 examples
The clusters represent different groups found in the data, without the intention of predicting a target.
* grouping plant data
* pattern mining
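A minimal sketch of descriptive clustering, using k-means as one possible technique (the card does not name an algorithm, so this choice is an assumption). The points and the starting centroids are made up for illustration. No target variable appears anywhere: the groups come only from the structure of the data.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))

def assign(points, centroids):
    """Put every point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        clusters = assign(points, centroids)
        # An empty cluster keeps its previous centroid.
        centroids = [mean_point(cluster) if cluster else old
                     for cluster, old in zip(clusters, centroids)]
    return centroids, assign(points, centroids)

# Made-up data: two visibly separate groups, no labels.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, [(0.0, 0.0), (10.0, 10.0)])
```

Each entry of `clusters` is one discovered group. Interpreting what a group means is left to the person reading the result.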
Association rule mining
What is association rule mining
Give 2 examples
A rule-based task for discovering interesting relations between variables
* market basket analysis
* online shopping
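A small sketch of how an association rule such as "bread -> butter" might be scored in market basket analysis. Support and confidence are the usual measures for this, but they are not mentioned on the card, and the transactions below are invented for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction
    that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Made-up shopping baskets.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
]

rule_support = support(transactions, {"bread", "butter"})
rule_confidence = confidence(transactions, {"bread"}, {"butter"})
```

Here two of the four baskets contain both items, and two of the three baskets with bread also contain butter.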
Sub-group discovery
What is subgroup discovery
give 2 examples
Technique that discovers interesting associations among different variables, with respect to a property of interest
* detection of risk groups with disease
* finding patterns in traffic accidents
Supervised vs Unsupervised
What is supervised learning
In supervised learning tasks, we provide a training set of examples: instances labelled with the true target value.
Supervised vs Unsupervised
What is Unsupervised learning
In unsupervised learning, the data is unlabelled
Name two Supervised and Predictive models
- Classification
- Regression
Name one Supervised learning descriptive model
- Subgroup discovery
Name one unsupervised learning predictive model
- Predictive clustering
Name two unsupervised learning descriptive models
- Descriptive clustering
- association rule discovery
What are Models?
Models are what is being learned from the data, in order to solve a given task
What does a regression model look like?
Equation with Yi, Xi, ei
Yi = f(Xi, B) + ei
where Yi, Xi, and ei are the target, features, and noise of a specific instance i, B is the model parameters, and f is the model function
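A minimal sketch of this regression model, assuming a single feature and a linear model function whose parameters B are an intercept and a slope (both assumptions, the card leaves f unspecified). It simulates targets as model output plus Gaussian noise, then recovers the parameters by ordinary least squares.

```python
import random

def f(x, B):
    """Hypothetical linear model function with B = (intercept, slope)."""
    intercept, slope = B
    return intercept + slope * x

def simulate(B, n, noise_sd, seed=0):
    """Draw n instances where each target is f(x, B) plus noise e_i."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [f(x, B) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

def fit_least_squares(xs, ys):
    """Estimate (intercept, slope) by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

true_B = (2.0, 0.5)  # made-up parameters used to generate the data
xs, ys = simulate(true_B, n=200, noise_sd=0.1)
estimated_B = fit_least_squares(xs, ys)
```

With little noise, `estimated_B` should land close to `true_B`. The gap between them is the effect of the noise term.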
What are the two ways machine learning models can be distinguished
- Main intuition
- Modus operandi (mode of operations)
Main intuition
What are geometric models
Models built using geometric concepts such as linear transformations, distance metrics, and separating hyperplanes
Main intuition
what are Probabilistic models
Models that aim to reduce uncertainty using probability distributions
Main intuition
what are Logical Models
defined in terms of easily interpretable logical expressions
Second Categorization (modus operandi)
What are grouping models
dividing the instance space into segments
in each segment a very simple model is learnt
Modus operandi
What are Grading models
learning a single, global model over the instance space
Geometric models
Instance Space?
the set of all possible instances, whether they are present in the data set or not
Geometric models
Distances?
distance between two points
* Euclidean distance between two points (works in any number of dimensions)
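A short sketch of Euclidean distance for points with any number of dimensions. The function name is illustrative.

```python
import math

def euclidean_distance(p, q):
    """Square root of the summed squared coordinate differences."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

d2 = euclidean_distance((0.0, 0.0), (3.0, 4.0))            # two dimensions
d3 = euclidean_distance((0.0, 0.0, 0.0), (1.0, 2.0, 2.0))  # three dimensions
```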
Geometric models
What are Hyper-Planes
A flat (linear) decision boundary that divides the input space into two regions, each corresponding to a different class or output label
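A minimal sketch of a hyperplane used as a decision boundary, assuming it is written as w . x + b = 0 with a weight vector w and an offset b (this notation and the values below are illustrative, not from the card). The sign of w . x + b tells which side of the boundary an instance falls on.

```python
def side_of_hyperplane(w, b, x):
    """Return +1 or -1 depending on the side of w . x + b = 0.

    Points exactly on the hyperplane are assigned +1 by convention here.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical boundary in two dimensions: the line x1 = x2.
w = (1.0, -1.0)
b = 0.0

label_a = side_of_hyperplane(w, b, (2.0, 1.0))  # below the line
label_b = side_of_hyperplane(w, b, (1.0, 2.0))  # above the line
```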
Probabilistic models
What is Bayes' rule
P(Y|X) = P(X|Y) P(Y) / P(X)
Probabilistic models
What is posterior
posterior = (likelihood x prior)/evidence
Probabilistic models
P(Y|X)
P(X|Y)
P(Y)
P(X)
P(Y|X) = posterior dist.
P(X|Y) = likelihood prob.
P(Y) = prior
P(X) = evidence
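A worked sketch of posterior = (likelihood x prior) / evidence for a binary target. The evidence is expanded with the law of total probability, which the cards do not spell out. The spam-filter numbers are invented for illustration.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """P(Y|X) for a binary target Y.

    prior                = P(Y)
    likelihood           = P(X|Y)
    likelihood_given_not = P(X|not Y)
    """
    # Evidence P(X), summed over both values of Y.
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence

# Made-up example: Y = "email is spam", X = "email contains the word 'prize'".
p_spam_given_word = posterior(prior=0.2,
                              likelihood=0.6,
                              likelihood_given_not=0.05)
```

The evidence is 0.6 x 0.2 + 0.05 x 0.8 = 0.16, so the posterior is 0.12 / 0.16 = 0.75. Seeing the word raises the probability of spam from the prior of 0.2 to 0.75.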
Logical models
What is meant by declarative?
Declarative: models of this type can be easily translated into rules that are understandable by humans
* Rules can be organised into a feature tree
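A small sketch of a logical model written as a feature tree. The features (`cloudy`, `humidity`), the threshold, and the class names are hypothetical. Each path from the root to a leaf reads as one human-understandable rule.

```python
def classify_weather(instance):
    """Hypothetical feature tree; every branch tests one feature."""
    if instance["cloudy"]:
        if instance["humidity"] > 80:
            return "rain"      # rule: cloudy AND humidity > 80 -> rain
        return "overcast"      # rule: cloudy AND humidity <= 80 -> overcast
    return "sunny"             # rule: NOT cloudy -> sunny

prediction = classify_weather({"cloudy": True, "humidity": 90})
```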
Grading vs Grouping
Difference between grouping models and grading models.
Give 3 points for each
Grouping models
* break up the instance space into groups or segments
* grouping models have a fixed and finite resolution
* cannot distinguish between individual instances beyond this resolution
Grading models
* Do not work based on the notion of segments
* they form one **global model** over the instance space
* infinite resolution
Give 1 example of a grouping model
- Decision trees
Give two examples of a grading model
- Linear regression
- Linear classifiers
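A toy sketch of the resolution difference, using two made-up one-feature models: a grouping model with two segments (like a one-split decision tree) and a grading model that is a single linear function. The threshold, outputs, and slope are all illustrative.

```python
def grouping_model(x):
    """Two segments, each with a constant output (finite resolution)."""
    if x < 5.0:
        return 0.2
    return 0.8

def grading_model(x):
    """One global linear function over the whole instance space."""
    return 0.1 * x

# Two nearly identical instances that fall in the same segment.
x1, x2 = 1.0, 1.001

same_in_grouping = grouping_model(x1) == grouping_model(x2)
same_in_grading = grading_model(x1) == grading_model(x2)
```

The grouping model gives both instances the same output because they share a segment, while the grading model still tells them apart.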
Training vs inference
What is the Training phase
Training is the process of creating a machine learning model that has learned to perform a task using a training set of numerous data points.
What is inference phase
Inference is the process of using a machine learning model to perform the task on a new data point.
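A minimal sketch of the two phases, using a nearest-centroid classifier as a stand-in model (an assumption, the cards do not name one). `train` builds the model from labelled instances and `infer` applies it to a new data point. The data is made up.

```python
def train(training_set):
    """Training phase: learn one centroid per class from labelled data."""
    sums, counts = {}, {}
    for features, label in training_set:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]]
            for label in sums}

def infer(model, features):
    """Inference phase: pick the class whose centroid is nearest."""
    def squared_distance(centroid):
        return sum((c - v) ** 2 for c, v in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))

# Made-up training set: (features, true target value).
training_set = [
    ((1.0, 1.0), "a"),
    ((2.0, 1.0), "a"),
    ((8.0, 9.0), "b"),
    ((9.0, 8.0), "b"),
]

model = train(training_set)            # done once, on many data points
prediction = infer(model, (2.0, 2.0))  # done per new data point
```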
what are Features
A kind of measurement that can be easily performed on any instance
what if the data is not in the form you want?
- Feature construction
- Discretisation (numerical into categorical)
- Feature transformation (project data into a new space)
- Feature selection (removing redundant features)
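A short sketch of two of these steps. `discretise` turns a numerical feature into a categorical one using thresholds. `drop_duplicate_features` is one simple form of feature selection that removes columns identical to an earlier column. The function names, thresholds, and data are illustrative assumptions.

```python
def discretise(value, thresholds, labels):
    """Map a number to a category.

    thresholds must be sorted ascending, and labels must have exactly
    one more entry than thresholds.
    """
    for threshold, label in zip(thresholds, labels):
        if value < threshold:
            return label
    return labels[-1]

def drop_duplicate_features(rows):
    """Keep only the first of any set of identical feature columns."""
    columns = list(zip(*rows))
    seen = set()
    keep = []
    for index, column in enumerate(columns):
        if column not in seen:
            seen.add(column)
            keep.append(index)
    return [[row[i] for i in keep] for row in rows]

# Discretisation: temperature in degrees becomes a category.
category = discretise(15.0, [10.0, 20.0], ["cold", "mild", "warm"])

# Feature selection: the second column repeats the first, so it is dropped.
reduced = drop_duplicate_features([[1, 1, 5], [2, 2, 6], [3, 3, 7]])
```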