Lecture 6 Flashcards
What is the purpose of neural networks in machine learning?
To learn nonlinear decision boundaries by automatically extracting features.
What are the two approaches to making linear models more powerful?
Expanding features manually and using neural networks.
What is a perceptron?
A simple artificial neuron that outputs a decision based on a weighted sum of its inputs plus a bias.
What are the components of a perceptron?
Inputs, weights, bias, and an activation function.
What activation function is commonly used in perceptrons?
Step function (sign function).
What is a limitation of perceptrons?
They can only learn linearly separable functions.
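The components above can be sketched in a few lines of Python. This is a minimal illustration, not a training algorithm; the function name `predict` and the hand-picked AND weights are just for the example:

```python
def predict(inputs, weights, bias):
    """Perceptron forward pass: weighted sum plus bias, then a step activation."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= 0 else 0

# A perceptron computing logical AND (weights chosen by hand):
and_weights, and_bias = [1.0, 1.0], -1.5
print(predict([1, 1], and_weights, and_bias))  # 1
print(predict([1, 0], and_weights, and_bias))  # 0
```

AND is linearly separable, so a single perceptron handles it; XOR is not, which is exactly the limitation noted above.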
What is the difference between a perceptron and a neuron in a neural network?
Neurons in a neural network use nonlinear activation functions, making them more expressive.
What activation functions are commonly used in neural networks?
Sigmoid, ReLU, and Softmax.
What is a feedforward neural network?
A type of neural network where information moves in one direction, from input to output.
What is a multilayer perceptron (MLP)?
A feedforward neural network with at least one hidden layer.
What is the purpose of a hidden layer in an MLP?
To transform inputs into new representations that can model complex functions.
Why do neural networks need nonlinear activation functions?
Without them, a multi-layer network would collapse into a linear model.
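The collapse can be verified numerically: stacking two weight matrices with no nonlinearity between them gives the same output as a single matrix equal to their product. A small sketch with hand-written matrix helpers (the matrices here are arbitrary examples):

```python
def matvec(M, v):
    """Matrix-vector product."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix-matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

W1 = [[1.0, 2.0], [0.0, -1.0]]   # "layer 1" weights
W2 = [[3.0, 1.0], [2.0, 0.5]]    # "layer 2" weights
x = [4.0, -2.0]

# Two linear layers vs. one layer with the combined weights:
two_layers = matvec(W2, matvec(W1, x))
one_layer = matvec(matmul(W2, W1), x)
print(two_layers == one_layer)  # True
```

Inserting a nonlinearity such as ReLU between the two products breaks this equivalence, which is what gives depth its extra expressive power.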
What is the sigmoid activation function?
A function that maps any real number into the range (0,1).
What is the ReLU activation function?
ReLU (Rectified Linear Unit) sets negative values to zero while keeping positive values unchanged.
What is the benefit of using ReLU over sigmoid?
ReLU mitigates the vanishing gradient problem and accelerates training.
What is softmax activation used for?
For multi-class classification, converting scores into probabilities.
What is backpropagation?
An algorithm for training neural networks by adjusting weights based on error gradients.
What is the loss function in neural networks?
A function that measures how far the predictions are from the true values.
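Backpropagation and the loss come together even in the smallest possible case: one sigmoid neuron with a squared-error loss. This sketch applies the chain rule by hand (the values of `x`, `y`, `w`, `b` are arbitrary illustrations):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One sigmoid neuron, squared-error loss L = (a - y)^2.
x, y = 2.0, 1.0      # input and target
w, b = 0.5, -0.5     # initial weight and bias

# Forward pass
z = w * x + b
a = sigmoid(z)
loss = (a - y) ** 2

# Backward pass (chain rule): dL/dw = dL/da * da/dz * dz/dw
dL_da = 2 * (a - y)
da_dz = a * (1 - a)          # derivative of the sigmoid
dL_dw = dL_da * da_dz * x
dL_db = dL_da * da_dz

# One gradient step with learning rate 0.1 should reduce the loss:
w2, b2 = w - 0.1 * dL_dw, b - 0.1 * dL_db
print((sigmoid(w2 * x + b2) - y) ** 2 < loss)  # True
```

In a multi-layer network, backpropagation repeats exactly this chain-rule step layer by layer, reusing each layer's gradient to compute the one below it.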
What is stochastic gradient descent (SGD)?
An optimization method that updates the weights after each training example (or small mini-batch) rather than after a full pass over the data, making learning more efficient.
What is the role of learning rate in gradient descent?
It controls how much the weights are adjusted at each step.
What happens if the learning rate is too high?
The model may diverge and fail to converge to a solution.
What happens if the learning rate is too low?
Training converges very slowly, and the model may stall in flat regions or poor local minima.
What is the difference between batch gradient descent and stochastic gradient descent?
Batch GD uses the entire dataset for each update, while SGD updates weights using one example at a time.
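The ideas from the last few cards (SGD, learning rate, epochs) fit in a short training loop. A sketch fitting a one-parameter model y = w * x to noise-free data; the data and learning rate are illustrative choices:

```python
import random

random.seed(0)
# Labels generated from y = 3x; we fit w so that w * x ≈ y.
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]

w, lr = 0.0, 0.1  # learning rate lr controls the step size of each update
for epoch in range(50):            # one epoch = one full pass over the data
    random.shuffle(data)
    for x, y in data:              # SGD: update after every single example
        grad = 2 * (w * x - y) * x # gradient of the squared error (w*x - y)^2
        w -= lr * grad
print(round(w, 3))  # ≈ 3.0
```

Batch gradient descent would instead average `grad` over all four examples before taking a single step per epoch. Raising `lr` well above 1 here makes the updates overshoot and diverge, matching the learning-rate cards above.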
What is overfitting in neural networks?
When the model learns the training data too well, including noise, and performs poorly on unseen data.
What is regularization in neural networks?
Techniques like dropout and weight decay to prevent overfitting.
What is dropout?
A regularization technique that randomly disables some neurons during training to improve generalization.
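A sketch of "inverted" dropout, the common formulation in which surviving activations are scaled up during training so that nothing needs rescaling at test time (the function name and interface are illustrative):

```python
import random

def dropout(activations, p, training=True):
    """Inverted dropout: during training, zero each activation with
    probability p and scale survivors by 1 / (1 - p) so the expected
    value is unchanged; at test time, pass activations through."""
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if random.random() < keep else 0.0 for a in activations]

random.seed(0)
print(dropout([1.0, 2.0, 3.0, 4.0], p=0.5))   # some entries zeroed, rest doubled
print(dropout([1.0, 2.0], p=0.5, training=False))  # unchanged at test time
```

Because each forward pass sees a different random subnetwork, no single neuron can be relied on exclusively, which is what improves generalization.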
What is an epoch in neural network training?
One complete pass through the entire training dataset.
What is the universal approximation theorem?
A feedforward network with at least one hidden layer and a nonlinear activation can approximate any continuous function on a compact domain to arbitrary accuracy, given enough hidden neurons.
How do neural networks compare to SVMs?
SVMs use kernel tricks to expand features, while neural networks learn their own feature representations.