Neural Networks Flashcards
What is logistic regression used for?
Binary classification tasks
What is the sigmoid function in logistic regression?
σ(x) = 1 / (1 + e^-x); it maps the input's linear score against the separating hyperplane (w·x + b) to a probability between 0 and 1.
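A minimal NumPy sketch of this (the weights, bias, and input below are made-up values for illustration):

```python
import numpy as np

def sigmoid(z):
    # Logistic function sigma(z) = 1 / (1 + e^-z)
    return 1.0 / (1.0 + np.exp(-z))

# Linear score of an input x against a separating hyperplane (w, b),
# squashed into a probability of the positive class.
w, b = np.array([0.8, -0.4]), 0.1
x = np.array([1.0, 2.0])
p = sigmoid(w @ x + b)
print(p)  # ~0.52
```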
How are the weights estimated in logistic regression?
Using maximum likelihood estimation
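The likelihood has no closed-form maximizer, so the weights are found iteratively; a rough sketch of gradient descent on the negative log-likelihood, with toy data and an illustrative learning rate:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy binary dataset: X has shape (n_samples, n_features), y in {0, 1}
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)
lr = 0.1
for _ in range(500):
    p = sigmoid(X @ w)
    # Gradient of the negative log-likelihood: X^T (p - y) / n
    grad = X.T @ (p - y) / len(y)
    w -= lr * grad
print(w)  # weights that approximately maximize the likelihood
```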
What is a key advantage of neural networks over logistic regression?
Neural networks can learn complex non-linear relationships in data
What is the vanishing gradient problem in deep neural networks?
The gradient becomes very small for earlier layers, making it difficult to train deep networks
What is an embedding in machine learning?
A learned representation of categorical data in a continuous vector space
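A bare-bones illustration: an embedding is a lookup table mapping category IDs to dense vectors. The table here is random rather than learned; in a real model it is a trainable parameter updated by backpropagation.

```python
import numpy as np

vocab_size, embed_dim = 10, 4
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(vocab_size, embed_dim))  # would be learned in practice

token_ids = np.array([3, 7, 3])          # categorical inputs
vectors = embedding_table[token_ids]     # shape (3, 4): continuous representations
print(vectors.shape)
```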
What is the purpose of an autoencoder?
To learn compact representations (encodings) of data
What is a convolutional neural network (CNN) typically used for?
Image processing and computer vision tasks
What property do convolutions provide to CNNs?
Translation equivariance
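A small NumPy check of this property, using circular convolution so the shift wraps cleanly (the signal and kernel are made up): shifting the input and then convolving gives the same result as convolving and then shifting.

```python
import numpy as np

def circular_conv(x, k):
    # Circular convolution (periodic padding) via the FFT, so a circular shift
    # of the input corresponds exactly to a circular shift of the output.
    k_full = np.zeros_like(x)
    k_full[:len(k)] = k
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k_full)))

x = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 0.0])
k = np.array([1.0, -1.0, 0.5])

shift_then_conv = circular_conv(np.roll(x, 2), k)
conv_then_shift = np.roll(circular_conv(x, k), 2)
print(np.allclose(shift_then_conv, conv_then_shift))  # True: convolution is translation-equivariant
```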
What is transfer learning?
Adapting a pre-trained model to a new but related task
What activation function helps mitigate the vanishing gradient problem?
ReLU (Rectified Linear Unit)
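A rough numeric illustration of why (toy pre-activation value, not a real network): the sigmoid's derivative is at most 0.25, so gradients shrink multiplicatively with depth, while ReLU's derivative is 1 wherever the unit is active.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = 0.5  # an illustrative pre-activation value
sigmoid_grad = sigmoid(z) * (1.0 - sigmoid(z))   # <= 0.25
relu_grad = 1.0 if z > 0 else 0.0                # 1 for active units

depth = 20
print(sigmoid_grad ** depth)  # ~1e-13: gradient has all but vanished
print(relu_grad ** depth)     # 1.0: gradient magnitude preserved
```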
What is the purpose of pooling layers in CNNs?
To add some translation invariance and reduce spatial dimensions
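A minimal sketch of 2x2 max pooling on a single-channel feature map (pure NumPy, stride equal to the window size):

```python
import numpy as np

def max_pool_2x2(feature_map):
    # Split the map into non-overlapping 2x2 blocks and keep the maximum of each,
    # halving both spatial dimensions.
    h, w = feature_map.shape
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

fm = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool_2x2(fm))  # [[ 5.  7.] [13. 15.]]
```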
What is the softmax function used for in neural networks?
To convert raw scores into probabilities for multi-class classification
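A standard numerically stable implementation (subtracting the max before exponentiating; the scores below are made up):

```python
import numpy as np

def softmax(scores):
    # Shift by the max for numerical stability; the result is unchanged
    # because softmax is invariant to adding a constant to all scores.
    shifted = scores - np.max(scores)
    exps = np.exp(shifted)
    return exps / exps.sum()

logits = np.array([2.0, 1.0, 0.1])   # raw scores for 3 classes
probs = softmax(logits)
print(probs, probs.sum())            # probabilities summing to 1
```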
What is the difference between batch gradient descent and stochastic gradient descent?
Batch uses all training samples per update, while stochastic uses one sample at a time
What is mini-batch gradient descent?
A compromise between batch and stochastic, using a small subset of samples per update
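A sketch covering all three variants in one loop: the batch size decides whether an update uses all samples, one sample, or a small subset. The quadratic loss, data, and learning rate are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=1000)

def gradient_descent(batch_size, lr=0.05, epochs=50):
    w = np.zeros(3)
    n = len(y)
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            grad = 2 * Xb.T @ (Xb @ w - yb) / len(batch)  # gradient of mean squared error
            w -= lr * grad
    return w

print(gradient_descent(batch_size=len(y)))  # batch gradient descent
print(gradient_descent(batch_size=1))       # stochastic gradient descent
print(gradient_descent(batch_size=32))      # mini-batch gradient descent
```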
What is the purpose of momentum in gradient descent optimization?
To accelerate convergence by accumulating past gradients, which damps oscillations and helps escape shallow local optima
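A sketch of the classical momentum update on an illustrative quadratic loss (beta = 0.9 is a common default):

```python
import numpy as np

def grad(w):
    # Gradient of an illustrative quadratic loss 0.5 * ||w - target||^2
    target = np.array([3.0, -1.0])
    return w - target

w = np.zeros(2)
velocity = np.zeros(2)
lr, beta = 0.1, 0.9

for _ in range(100):
    # Momentum keeps an exponentially decaying sum of past gradients,
    # building speed along consistent directions.
    velocity = beta * velocity + grad(w)
    w -= lr * velocity

print(w)  # approaches the target [3, -1]
```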
What is a skip connection in neural networks?
A connection that bypasses one or more layers, helping to mitigate the vanishing gradient problem
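A minimal residual block in NumPy: the input is added back to the transformed output, so gradients can flow through the identity path. The weights here are random placeholders.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, W1, W2):
    # Two small transformations plus a skip connection: output = x + F(x).
    # Even if F's gradients shrink, the identity path passes gradients straight through.
    h = relu(x @ W1)
    return x + h @ W2

rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=(1, d))
W1, W2 = 0.1 * rng.normal(size=(d, d)), 0.1 * rng.normal(size=(d, d))
print(residual_block(x, W1, W2).shape)  # (1, 8), same shape as the input
```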
What is batch normalization?
A technique to normalize the inputs of each layer, improving training stability and speed
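A sketch of the batch-norm forward pass at training time (per-feature normalization over the batch; gamma and beta are the learnable scale and shift, eps guards against division by zero). At inference, running averages of the batch statistics are used instead.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature to zero mean and unit variance over the batch,
    # then apply a learnable scale (gamma) and shift (beta).
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(32, 4))   # a batch of 32 samples, 4 features
out = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0).round(3), out.std(axis=0).round(3))  # ~0 and ~1 per feature
```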
What is the difference between a dense layer and a convolutional layer?
Dense layers connect all inputs to all outputs, while convolutional layers use local receptive fields
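A quick parameter-count comparison for intuition (the sizes are illustrative): a dense layer needs one weight per input-output pair, while a convolutional layer reuses a small kernel across all spatial positions.

```python
# Dense layer: every pixel of a 28x28 input connects to every one of 128 output units.
dense_params = 28 * 28 * 128          # 100,352 weights (plus biases)

# Conv layer: 128 output channels, each a shared 3x3 kernel over 1 input channel.
conv_params = 3 * 3 * 1 * 128         # 1,152 weights (plus biases)

print(dense_params, conv_params)
```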
What is the purpose of dropout in neural networks?
To prevent overfitting by randomly setting a fraction of unit activations to zero during training
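A sketch of inverted dropout at training time (the keep probability is illustrative): surviving activations are rescaled so their expected value is unchanged, and dropout is simply disabled at test time.

```python
import numpy as np

def dropout(activations, keep_prob=0.8, training=True, rng=np.random.default_rng(0)):
    if not training:
        return activations  # dropout is only applied during training
    # Zero out each unit with probability (1 - keep_prob), then rescale
    # so the expected activation stays the same ("inverted" dropout).
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

a = np.ones((2, 5))
print(dropout(a))                  # roughly 20% of entries zeroed, the rest scaled to 1.25
print(dropout(a, training=False))  # unchanged at test time
```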
What is a variational autoencoder (VAE)?
A probabilistic version of autoencoders that learns a continuous latent space representation
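The key difference from a plain autoencoder in one step: the encoder outputs a mean and log-variance, and a latent vector is sampled with the reparameterization trick. A sketch of just that sampling step, with made-up encoder outputs:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend encoder outputs for one input: mean and log-variance of the latent distribution.
mu = np.array([0.2, -1.0, 0.5])
log_var = np.array([-0.5, 0.1, -1.2])

# Reparameterization trick: z = mu + sigma * epsilon, with epsilon ~ N(0, I).
# This keeps sampling differentiable with respect to mu and log_var.
epsilon = rng.standard_normal(mu.shape)
z = mu + np.exp(0.5 * log_var) * epsilon
print(z)  # a sample from the learned continuous latent space
```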
What is the difference between an encoder and a decoder in an autoencoder?
The encoder compresses input data into a latent representation, while the decoder reconstructs the input from this representation
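A toy linear autoencoder to make the two roles concrete (random, untrained weights; in practice both parts are trained jointly to minimize the reconstruction error):

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, latent_dim = 16, 4

W_enc = 0.1 * rng.normal(size=(input_dim, latent_dim))
W_dec = 0.1 * rng.normal(size=(latent_dim, input_dim))

x = rng.normal(size=(1, input_dim))
z = x @ W_enc              # encoder: compress to a 4-dimensional latent code
x_hat = z @ W_dec          # decoder: reconstruct the 16-dimensional input
reconstruction_error = np.mean((x - x_hat) ** 2)
print(z.shape, x_hat.shape, reconstruction_error)
```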