Intro To Linear Regression + General Flashcards
Give a definition of ML
A computer program is said to learn from experience E with respect to some class of tasks T and a performance measure P if its performance at tasks in T, as measured by P, improves with experience E.
Give a definition of supervised learning
Supervised learning is a type of machine learning where an algorithm learns from a labeled dataset, which means that each input data point is associated with a corresponding target output.
What is the difference between classification and regression?
Classification assigns each input to one of a predefined set of classes (binary classification when there are only two classes), whereas regression predicts a continuous numerical value.
Give two examples, one of classification and one of regression
Two classic examples are classifying emails as spam or not spam (classification) and predicting the price of a house (regression).
Give a definition of unsupervised learning
Unsupervised learning is a type of machine learning where the algorithm is trained on unlabeled data, meaning that the input data does not have corresponding target outputs or class labels. The main objective of unsupervised learning is to discover patterns, structures, or relationships within the data without explicit guidance.
What is a discriminative model?
A discriminative model is trained on a training set to estimate the conditional probability p(y|x). Given an input x, it returns the most probable output y.
What is a recommender system?
A recommender system, also known as a recommendation system or recommendation engine, is a type of machine learning system that provides personalized suggestions or recommendations to users.
What is a generative model?
A generative model is a probabilistic model that estimates the joint probability p(x, y), that is, it describes how inputs and outputs occur together. Once the model is trained, the conditional probability can be inferred from the joint: p(y|x) = p(x, y) / p(x).
What is the difference between joint probability and conditional probability?
The joint probability of two or more events is the probability that all of those events occur together, while the conditional probability is the probability of one event occurring given that another event has already occurred.
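As a small illustration of the difference, the sketch below computes a joint and a conditional probability from a table of counts. The events (weather and traffic) and the counts are made up for the example; the helper names `joint`, `marginal_x`, and `conditional` are hypothetical.

```python
from fractions import Fraction

# Hypothetical counts of (weather, traffic) observations, made up for illustration.
counts = {
    ("rain", "jam"): 3,
    ("rain", "clear"): 1,
    ("sun", "jam"): 2,
    ("sun", "clear"): 4,
}
total = sum(counts.values())  # 10 observations in total


def joint(x, y):
    """P(X = x, Y = y): both events occur together."""
    return Fraction(counts[(x, y)], total)


def marginal_x(x):
    """P(X = x): sum the joint probability over every value of y."""
    return sum(joint(xi, yi) for (xi, yi) in counts if xi == x)


def conditional(y, x):
    """P(Y = y | X = x) = P(X = x, Y = y) / P(X = x)."""
    return joint(x, y) / marginal_x(x)


print(joint("rain", "jam"))        # 3/10
print(conditional("jam", "rain"))  # 3/4
```

The joint probability is measured against all 10 observations, while the conditional probability is measured only against the 4 observations where it rained.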
What is linear regression?
Linear regression is a supervised machine learning technique used for modeling the relationship between a dependent variable (or target) and one or more independent variables (or predictors) by fitting a linear equation to the observed data.
How can we choose θ in linear regression models?
In linear regression, the goal is to choose the values of the model parameters (θ, also called coefficients) that best fit the observed data.
First we define a cost function, then we use a method to find its minimum: gradient descent.
What is the cost function?
The cost function, also known as the objective function or loss function, is a mathematical function that quantifies the error or discrepancy between the values predicted by a model and the actual target values in a supervised learning problem.
FORMULA
$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$
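where m is the number of samples and the term in parentheses is the prediction error on sample i. Averaging the squared errors over the m samples gives the **mean squared error (MSE)**, so J(θ) is half the MSE.
The extra factor of 1/2 is a convenience: it cancels the 2 that appears when the square is differentiated, and it does not change which θ minimizes the cost.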
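A minimal sketch of this cost function in plain Python, assuming a one-feature hypothesis h(x) = θ0 + θ1 * x. The function names and the toy data are illustrative choices, not part of the original cards.

```python
def hypothesis(theta0, theta1, x):
    """Prediction of the one-feature linear model for input x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """J(theta) = 1/(2m) * sum of squared prediction errors."""
    m = len(xs)
    squared_errors = (
        (hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)
    )
    return sum(squared_errors) / (2 * m)


# Toy data generated by y = 1 + 2x (an assumption made for this example).
xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]

print(cost(1.0, 2.0, xs, ys))  # 0.0, the line fits every point exactly
print(cost(0.0, 2.0, xs, ys))  # 0.5, every prediction is off by 1
```

In the second call each error is -1, so the sum of squared errors is 3 and J = 3 / (2 * 3) = 0.5.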
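What is gradient descent?
Gradient descent is an iterative optimization algorithm used to minimize a cost function: at each step, every parameter is moved a small amount in the direction that decreases the cost.
$$\theta_j := \theta_j - \alpha \, \frac{\partial J(\theta)}{\partial \theta_j}$$
where α is the learning rate, which affects convergence, and j = 0…n is the index of the parameter. All θj are updated simultaneously.
For example, a simple linear regression with one feature uses θ0 and θ1, so j = 0, 1.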
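The update rule can be sketched for the two-parameter case as follows. This is a minimal batch gradient descent, assuming the squared-error cost with the 1/(2m) factor, whose partial derivatives are (1/m) Σ (h(x) − y) for θ0 and (1/m) Σ (h(x) − y) * x for θ1. The default values for `alpha` and `iterations` and the toy data are arbitrary choices for this example.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0  # arbitrary starting point
    for _ in range(iterations):
        # Prediction error on every sample with the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to theta0 and theta1.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients use the old parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Toy data generated by y = 1 + 2x (an assumption made for this example).
xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]

theta0, theta1 = gradient_descent(xs, ys)
print(round(theta0, 3), round(theta1, 3))  # close to 1.0 and 2.0
```

Computing both gradients before assigning either parameter is what makes the update simultaneous.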
What is the learning rate α?
α is a hyperparameter, which means it must be set before training and is not learned by the algorithm.
If α is small, convergence is reliable but slow; if α is too large, each step is bigger, but the updates can overshoot the minimum and fail to converge.
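The effect of α can be seen with a small experiment. The sketch below runs a fixed number of gradient descent steps on toy data with three learning rates and reports the final cost. The data, the three α values, and the step count are assumptions chosen so that the contrast is visible on this particular dataset; other data would need other values.

```python
def cost(theta0, theta1, xs, ys):
    """J(theta) = 1/(2m) * sum of squared prediction errors."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def final_cost(alpha, xs, ys, iterations=50):
    """Run batch gradient descent from (0, 0) and return the cost reached."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return cost(theta0, theta1, xs, ys)


# Toy data generated by y = 1 + 2x (an assumption made for this example).
xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]

print("initial cost:", cost(0.0, 0.0, xs, ys))
for alpha in (0.01, 0.1, 0.5):
    print("alpha =", alpha, "-> cost after 50 steps:", final_cost(alpha, xs, ys))
```

On this data, α = 0.01 lowers the cost only slowly, α = 0.1 gets much closer to the minimum in the same number of steps, and α = 0.5 makes the cost grow instead of shrink.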
What is the idea behind linear regression? Intuition of LR
The idea is to search for a function that, given an input, predicts an output.
In this case the function is a straight line: h(x) = θ0 + θ1 * x.
h(x) approximates the behaviour of the unknown function f(x); the aim is to find the parameters θ* that make h(x) match the data as closely as possible.
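To make the intuition concrete, the sketch below compares two candidate lines against a made-up target function. Here `f` is a hypothetical stand-in for the true relationship, which is unknown in practice and deliberately slightly nonlinear, and both parameter pairs are picked by hand rather than learned.

```python
def f(x):
    """Hypothetical 'true' relationship; unknown in a real problem."""
    return 1.0 + 2.0 * x + 0.1 * x * x


def h(x, theta0, theta1):
    """Straight-line hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def max_gap(theta0, theta1, xs):
    """Largest absolute difference between h and f over the inputs xs."""
    return max(abs(h(x, theta0, theta1) - f(x)) for x in xs)


xs = [0.0, 1.0, 2.0, 3.0]

# Two hand-picked candidates: a poor line and a much closer one.
print("theta = (0.0, 1.0):", max_gap(0.0, 1.0, xs))
print("theta = (1.0, 2.3):", max_gap(1.0, 2.3, xs))
```

Neither line reproduces f exactly, because f is not a straight line, but the second choice of θ stays much closer to it on these inputs. Finding the best such θ is what the cost function and gradient descent are for.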