How is multiclass classification connected to Logistic Regression?
In multiclass classification, a similar method, the softmax function, is used to convert predicted outputs into probabilities. Each class has its own predicted output, or “score”, which is the dot product between the input vector x and that class's parameter vector theta. The softmax function then converts these scores into probabilities, ensuring that they sum to one.
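The score-to-probability step can be sketched as follows. This is a minimal illustration and not a full classifier: the input x and the three parameter vectors in thetas are made-up values, and the helper names class_scores and softmax are chosen here for illustration.

```python
import math

def class_scores(x, thetas):
    """One score per class: dot product of x with that class's parameter vector."""
    return [sum(xi * ti for xi, ti in zip(x, theta)) for theta in thetas]

def softmax(scores):
    """Convert real-valued class scores into probabilities that sum to one."""
    # Subtracting the max score avoids overflow and does not change the result.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical input and per-class parameters (assumed values, 3 classes).
x = [1.0, 2.0]
thetas = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.0]]

probs = softmax(class_scores(x, thetas))
predicted_class = probs.index(max(probs))  # class with the highest probability
```

The predicted class is the one with the largest score, since softmax preserves the ordering of the scores.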
How do you train a multiclass classifier?
How do you estimate parameters for likelihood functions using Bayes' rule? What parameters need to be calculated to obtain P(Y|x)?
The parameters we need to estimate are the class priors P(Y) (the probability of each class in the training data) and the likelihoods P(X|Y) (under the Naive Bayes assumption, the product of the per-feature probabilities given the class).
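For discrete features, both sets of parameters can be estimated by counting. The sketch below assumes categorical features, applies no smoothing, and uses a made-up toy dataset; fit_naive_bayes and posterior are illustrative names, not a library API.

```python
from collections import Counter, defaultdict

def fit_naive_bayes(X, y):
    """Estimate class priors P(Y) and per-feature likelihoods P(X_j | Y) by counting."""
    n = len(y)
    class_counts = Counter(y)
    priors = {c: class_counts[c] / n for c in class_counts}
    # counts[c][j][v] = number of class-c examples whose feature j equals v
    counts = defaultdict(lambda: defaultdict(Counter))
    for xi, c in zip(X, y):
        for j, v in enumerate(xi):
            counts[c][j][v] += 1
    likelihoods = {
        c: {j: {v: k / class_counts[c] for v, k in counts[c][j].items()}
            for j in counts[c]}
        for c in class_counts
    }
    return priors, likelihoods

def posterior(x, priors, likelihoods):
    """P(Y | x) via Bayes' rule with the naive (conditional independence) assumption."""
    joint = {}
    for c in priors:
        p = priors[c]
        for j, v in enumerate(x):
            # A value never seen with class c gets probability 0 (no smoothing here).
            p *= likelihoods[c][j].get(v, 0.0)
        joint[c] = p
    z = sum(joint.values())  # normalizer P(x)
    return {c: p / z for c, p in joint.items()} if z > 0 else joint

# Made-up toy data: (weather, temperature) -> label
X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"),
     ("rainy", "mild"), ("rainy", "hot"), ("sunny", "mild")]
y = ["no", "no", "no", "yes", "yes", "yes"]

priors, likelihoods = fit_naive_bayes(X, y)
post = posterior(("sunny", "hot"), priors, likelihoods)
```

In practice a smoothing term is usually added to the counts so that an unseen feature value does not force the whole product to zero.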
How do you handle continuous and discrete X in NB?
How many independent parameters do we need to estimate for calculation of joint probabilities? How does NB assumption improve it?
What are the subtleties of Naive Bayes?
How is model complexity connected to bias, variance, and test error?
How do L1 and L2 regularization affect classifiers?
L1 and L2 regularization simplify classifiers by penalizing the magnitude of the parameters.
- L1 regularization (LASSO) can drive parameters exactly to zero.
- L2 regularization (Ridge) shrinks parameters but does not drive them exactly to zero.
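The difference between the two penalties can be seen in a simplified one-parameter setting. The sketch below assumes a quadratic loss 0.5*(w - w0)^2 around an unregularized weight w0, for which both penalized minimizers have closed forms; the weights and the regularization strength lam are made-up values.

```python
def l1_shrink(w0, lam):
    """Minimizer of 0.5*(w - w0)**2 + lam*abs(w): soft-thresholding."""
    if w0 > lam:
        return w0 - lam
    if w0 < -lam:
        return w0 + lam
    return 0.0  # small weights are set exactly to zero

def l2_shrink(w0, lam):
    """Minimizer of 0.5*(w - w0)**2 + 0.5*lam*w**2: proportional shrinkage."""
    return w0 / (1.0 + lam)

unregularized = [3.0, 0.4, -0.2]  # assumed unregularized weights
lam = 0.5                         # assumed regularization strength

l1_weights = [l1_shrink(w, lam) for w in unregularized]
l2_weights = [l2_shrink(w, lam) for w in unregularized]
```

With these values the L1 solution zeroes out the two small weights, while the L2 solution only scales every weight toward zero.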
How does cross validation help us to better generalize?
Instead of relying on a single train/test split, cross validation divides the data into multiple folds. The model is trained on all but one of the folds and validated on the remaining fold, and this is repeated with a different held-out fold each time, ensuring that every data point is used for both training and validation.
What is the cost function and what is the log-likelihood of Logistic Regression?
How do you obtain the gradient descent update rule from the cost function?
How do you get to the log-likelihood from h(x)?
Why do we need 0-1 and perceptron loss?
What is cross validation?
Cross validation is a method to estimate how well a model performs on an independent dataset. It involves dividing the data into multiple folds, or subsets, using one as the validation set and the remaining folds to train the model. This process is repeated multiple times, using a different fold as the validation set each time. Finally, the results from each validation set are averaged to produce a robust estimate of the model's performance.
How can cross validation be used to choose the value of a model parameter P?
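The procedure described above can be sketched in plain Python. This is a minimal illustration: fit and score are hypothetical callables supplied by the caller, the folds are contiguous (in practice the data is usually shuffled first), and the majority-class "model" and the labels are made up for the example.

```python
from collections import Counter

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(X, y, k, fit, score):
    """Average validation score over k folds.

    fit(X_train, y_train) -> model and score(model, X_val, y_val) -> float
    are hypothetical callables provided by the caller.
    """
    folds = k_fold_indices(len(X), k)
    scores = []
    for i, val_idx in enumerate(folds):
        # Train on every fold except fold i, validate on fold i.
        train_idx = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        scores.append(score(model, [X[j] for j in val_idx], [y[j] for j in val_idx]))
    return sum(scores) / len(scores)

# Toy "model": always predict the most common training label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]

def accuracy(model, X_val, y_val):
    return sum(1 for label in y_val if label == model) / len(y_val)

X = list(range(10))   # features are unused by the majority-class model
y = [0] * 7 + [1] * 3  # made-up labels
mean_accuracy = cross_validate(X, y, k=5, fit=fit_majority, score=accuracy)
```

The same loop can be run once per candidate setting of a model to compare their averaged validation scores.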
True or False: When evaluating ML algorithms, steeper learning curves are better?
True
How does the perceptron algorithm work?
What is a Support Vector Machine?
An SVM is an ML algorithm used for classification and regression. For classification, it finds the line (or hyperplane) that best separates the data into groups by maximizing the margin, that is, the distance between the hyperplane and the closest points (the support vectors).
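The quantity being maximized can be made concrete by computing the margin of a given hyperplane w·x + b = 0. The sketch below does not train an SVM; it only compares two hand-picked candidate hyperplanes on a made-up, linearly separable dataset with labels +1 and -1.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin(w, b, X, y):
    """Distance from the hyperplane to the closest point (labels are +1 or -1).

    The result is negative if some point is on the wrong side.
    """
    return min(label * signed_distance(w, b, x) for x, label in zip(X, y))

# Made-up separable data.
X = [(2, 2), (3, 3), (0, 0), (-1, 0)]
y = [1, 1, -1, -1]

margin_a = margin((1, 1), -2, X, y)  # candidate hyperplane x1 + x2 = 2
margin_b = margin((1, 0), -1, X, y)  # candidate hyperplane x1 = 1
```

Both candidates separate the data, but the first leaves a wider gap to its closest points, so it is the one a maximum-margin classifier would prefer of the two.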
What is Bayes rule for classification?
How to train Naive Bayes Classifiers?