Statistical concepts Flashcards
Statistics - What is the goal of Maximum Likelihood Estimation (MLE)?
The goal of MLE is to find the parameters of a probability distribution that make the observed data most probable. It ‘fits’ a distribution to the data by maximizing the likelihood function.
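In symbols, the MLE is the parameter value that maximizes the likelihood; a compact statement of this goal:

```latex
\hat{\theta}_{\text{MLE}} = \arg\max_{\theta} L(\theta) = \arg\max_{\theta} P(X \mid \theta)
```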
Statistics - What is the likelihood function?
The likelihood function measures how probable the observed data X is under a given set of model parameters θ, viewed as a function of θ with the data held fixed. It is denoted as L(θ) = P(X | θ).
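For n independent, identically distributed observations x_1, …, x_n, the likelihood factorizes into a product (a standard identity, assuming i.i.d. data):

```latex
L(\theta) = P(X \mid \theta) = \prod_{i=1}^{n} p(x_i \mid \theta)
```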
Statistics - Why do we use the log-likelihood function instead of the likelihood function?
Because the log turns the product of individual probabilities (the joint likelihood of independent observations) into a sum, which simplifies differentiation and avoids numerical underflow. Since the log is monotonic, maximizing the log-likelihood yields the same parameters as maximizing the likelihood.
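A minimal Python sketch of the underflow point, using 500 hypothetical samples with probability 0.01 each:

```python
import numpy as np

probs = np.full(500, 0.01)              # 500 i.i.d. samples, each with probability 0.01

likelihood = np.prod(probs)             # product of 500 tiny numbers
log_likelihood = np.sum(np.log(probs))  # the same quantity as a sum of logs

print(likelihood)       # 0.0 -- underflows in float64 (true value is 1e-1000)
print(log_likelihood)   # -2302.585... = 500 * log(0.01), computed without issue
```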
Statistics - How do we find the MLE of a model with parameters theta?
To find the MLE, we take the derivative of the log-likelihood with respect to the parameters, set it to zero, and solve. When no closed-form solution exists, the log-likelihood is maximized numerically (e.g. with gradient-based optimization).
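A worked example, assuming a Bernoulli(θ) model in which k out of n observations equal 1:

```latex
L(\theta) = \theta^{k}(1-\theta)^{n-k}, \qquad
\ell(\theta) = k \log\theta + (n-k)\log(1-\theta)

\ell'(\theta) = \frac{k}{\theta} - \frac{n-k}{1-\theta} = 0
\;\Rightarrow\; \hat{\theta} = \frac{k}{n}
```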
Statistics - What does it mean to fit a distribution to data using MLE?
It means estimating the best parameters for a probability distribution that most likely generated the observed data.
Statistics - What is an example of a distribution where MLE is used?
MLE is commonly used to estimate the parameters of the Normal distribution, but it applies equally to the Poisson, Exponential, and many other probability distributions.
Statistics - What are the MLE estimates for a given observed data assuming normal distribution?
For a normal distribution, the MLE of the mean is the sample average, and the MLE of the variance is the average squared deviation from that mean (dividing by n, not the unbiased n − 1 estimator).
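A minimal numpy sketch (the synthetic data and true parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=10_000)  # synthetic sample

mu_hat = x.mean()                    # MLE of the mean: the sample average
var_hat = np.mean((x - mu_hat)**2)   # MLE of the variance: divides by n,
                                     # not the unbiased n - 1 estimator

print(mu_hat, var_hat)  # close to the true values 5.0 and 4.0
```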
Statistics - What are some applications of MLE?
MLE is used in machine learning, statistical inference, financial modeling, and natural language processing.
Statistics - What assumption does MLE rely on?
MLE assumes that the observed data follows a known probability distribution (typically with independent, identically distributed samples) whose parameters are unknown.
Statistics - How is MLE related to Bayesian estimation?
MLE finds the most likely parameters without prior knowledge, while Bayesian estimation incorporates prior beliefs through Bayes’ theorem.
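Side by side in symbols, MAP (the Bayesian point estimate) differs from MLE only by the prior p(θ):

```latex
\hat{\theta}_{\text{MLE}} = \arg\max_{\theta} P(X \mid \theta), \qquad
\hat{\theta}_{\text{MAP}} = \arg\max_{\theta} P(X \mid \theta)\, p(\theta)
```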
Statistics - What is the goal of Maximum Likelihood Estimation (MLE) in classification?
MLE aims to maximize the likelihood of the correct class label given the input data.
Statistics - What is the loss function derived from the negative log-likelihood in a classification problem?
Cross-entropy loss.
Statistics - What does the cross-entropy loss measure?
It measures how different the predicted probability distribution is from the true one.
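In symbols, for a one-hot true distribution y over C classes and predicted probabilities p̂:

```latex
\mathcal{L}_{\text{CE}} = -\sum_{c=1}^{C} y_c \log \hat{p}_c
= -\log \hat{p}_{\text{true class}}
```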
Statistics - Why is softmax used in classification neural networks?
Softmax converts raw scores (logits) into a valid probability distribution, making them interpretable for categorical classification.
Statistics - How is softmax mathematically defined?
Softmax for class c: p_c = exp(z_c) / Σ_j exp(z_j), ensuring all outputs sum to 1.
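A small implementation sketch; the max-subtraction trick is a standard stabilization that leaves the result unchanged:

```python
import numpy as np

def softmax(z):
    """Softmax over the last axis: p_c = exp(z_c) / sum_j exp(z_j)."""
    z = z - np.max(z, axis=-1, keepdims=True)  # shift logits for numerical stability
    exp_z = np.exp(z)
    return exp_z / np.sum(exp_z, axis=-1, keepdims=True)

logits = np.array([2.0, 1.0, 0.1])
print(softmax(logits))        # [0.659 0.242 0.099]
print(softmax(logits).sum())  # 1.0
```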
Statistics - What is the relationship between softmax and cross-entropy?
Softmax outputs probabilities, and cross-entropy measures the difference between these and the true labels, forming the standard classification loss.
Statistics - What assumption about data leads to using softmax + cross-entropy?
We assume a categorical distribution over classes, making softmax + cross-entropy the natural choice under MLE.
Statistics - Why does softmax ensure a valid probability distribution?
Exponentiation makes every output strictly positive, and dividing by the sum of the exponentials normalizes them, so the outputs form a valid probability distribution (non-negative and summing to 1).
Statistics - Break down the steps for MLE in a classification problem.
1 Assume a categorical distribution: given data x, the true label follows a one-hot probability distribution (a point mass, sometimes loosely called a Dirac delta), meaning it is 1 for the correct class and 0 for all others.
2 Using MLE, we want to maximise the joint probability of the correct labels across all data points.
3 We build a model p(y∣x,θ) (e.g. a neural network) that outputs a raw score (logit) for each class.
4 We apply softmax so the outputs can be interpreted as a probability distribution over the classes.
5 Maximising the log-likelihood is equivalent to minimising the cross-entropy loss, which pushes the predicted distribution as close as possible to the one-hot target (see the sketch after these steps).
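A minimal end-to-end sketch of steps 1–5 in plain numpy; the logits and labels here are made up for illustration, where a trained model would normally supply the logits:

```python
import numpy as np

def softmax(z):
    z = z - np.max(z, axis=-1, keepdims=True)   # stable softmax
    exp_z = np.exp(z)
    return exp_z / exp_z.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    """Mean negative log-likelihood of the correct classes (steps 2 and 5)."""
    probs = softmax(logits)                     # step 4: logits -> probabilities
    n = len(labels)
    return -np.mean(np.log(probs[np.arange(n), labels]))

# Step 3: pretend these logits came from a model p(y | x, theta)
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 1.5,  0.3]])
labels = np.array([0, 1])                       # step 1: one-hot targets, stored as class indices

print(cross_entropy(logits, labels))            # minimising this maximises the log-likelihood
```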
Statistics - What does 'stochastic' mean?
It is the property of being well-described by a random probability distribution.