Section 2 Logistic Regression Flashcards
What is w0 in linear regression
Parameter w0 is the intercept allowing for any fixed offset in the data. It is often called bias.
How are the regression coefficients found in linear regression?
Maximisation of the log-likelihood is equivalent to minimization of the squared loss function will lead to the same optimisation.
What is the motivation for logistic regression?
Logistic regression models a binary target categorical variables given a collection of predictors. The motivation for logistic regression is to map R to the 0,1 set to model Pr(Y=1 given the predictor X).Logistic regression provides an interpretable model where can understand how the traits of individual observations contribute to their classification.
In linear regression what is the error assumption?
We assume errors in our linear regression are normal with zero mean and variance sigma
Explain classification
Predicting the value of Y using inputs of X can be called classification since we assign the observation to a category, or class.
What are linear regression models used for
Linear regression models relationships between numerical response and multiple predictors using a linear model (real set).This cannot be used for a binary/categorical response.
What is logistic regression?
Logistic regression is used to model a binary categorical variable given a collection predictors. Maps the real numbers to just 0,1. The logistic regression model defines the probability p as a function of parameter and predictors in terms of the logistic function.
Interpret w0
The intercept w0 is the value of the logit corresponding to xi = 0. The intercept w0 is sometimes called bias.
Interpret W weights
The coefficient w measures the effect of the variable on the logit.
R function to fit a logistic regression
glm() to fit logistic regression models. This is because it allows variables forming the regression to be of all different categories.
What does R return using glm function?
estimates for weights parameters, std errors of these estimates, z value and p value
Define a hypothesis test to determine if a variable has a significant effect on the target variable
To determine if a variable Xj has a significant effect on the target variable Y , we may wish to perform the hypothesis test:
H0:wj=0 vs HA wj not equal to 0
Name three estimation options for parameter estimates
Maximum likelihood estimation, minimization of least squares loss function or minimising the mean squared error loss function.
What is the response variable modelled as under logistic regression?
Response variable is modelled by a Bernoulli distribution given the values of the input variables.
How is wj generally estimated?
Maximising the log-likelihood. or minimising loss functions. No closed form solution for wj is available, and optimization is performed numerically and software is used. In the GLM literature, maximisation of the logistic regression log-likelihood is performed using the Newton-Raphson algorithm.