Neural Network Basics Flashcards
What is the purpose of an activation function in a neural network?
To introduce non-linearity so the network can learn complex patterns, not just straight-line relationships.
What is backpropagation used for in training a neural network?
To help the network learn by adjusting weights to reduce errors.
Why is it important to normalize input data before training a neural network?
To keep data on a similar scale so the network learns better.
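As a concrete sketch, here is z-score normalization, one common scheme (the feature values below are invented for illustration):

```python
# Illustrative sketch: z-score normalization (mean 0, std 1).
def normalize(values):
    """Rescale a list so it has mean 0 and (population) std 1."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / std for v in values]

ages = [20, 30, 40, 50]                  # small numbers
incomes = [20000, 30000, 40000, 50000]   # large numbers

# After normalization both features live on the same scale,
# so neither one dominates the weight updates.
print(normalize(ages))
print(normalize(incomes))
```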
What does an epoch represent in machine learning?
One complete pass through the entire training dataset during training.
What problem does the ReLU activation function help solve?
It avoids the vanishing gradient problem and speeds up training by allowing gradients to flow through the network more effectively.
Imagine you’re shouting instructions down a long hallway, but with each person passing the message, it gets quieter. By the time it reaches the last person, the message is so faint they can’t hear it. That’s the vanishing gradient—the learning signal fades as it moves backward through the network.
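The hallway analogy can be made concrete by comparing gradients. This is a minimal sketch (not a full network): ReLU's gradient is 1 for positive inputs, while sigmoid's gradient shrinks toward zero for large inputs:

```python
import math

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # Gradient is 1 for any positive input, so the learning signal
    # passes backward undiminished (the "message" stays loud).
    return 1.0 if x > 0 else 0.0

def sigmoid_grad(x):
    # Sigmoid's gradient s*(1-s) shrinks for large |x| — the fading message.
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

print(relu_grad(5.0))     # 1.0 — full-strength signal
print(sigmoid_grad(5.0))  # tiny — nearly vanished
```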
What is overfitting, and how can it be reduced?
Overfitting is when a model memorizes data instead of learning patterns. It can be reduced with regularization, dropout, or early stopping.
- Regularization
  What it is: A technique to prevent overfitting by adding a penalty to the model for being too complex.
  Simple version: It keeps the model simple so it doesn’t memorize too much.
- Dropout
  What it is: A method where random neurons are temporarily turned off during training, forcing the network to learn more general patterns.
  Simple version: It randomly turns off parts of the network to help it learn better.
- Early Stopping
  What it is: A technique where training stops as soon as the model’s performance starts getting worse on new data, even if it’s still improving on training data.
  Simple version: It stops training before the model starts memorizing the data.
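Dropout is the easiest of these to sketch in code. This is a toy version of "inverted dropout" on one layer's activations; the keep probability of 0.8 is an arbitrary example value:

```python
import random

def dropout(activations, keep_prob=0.8, training=True):
    if not training:
        return activations  # dropout is disabled at inference time
    out = []
    for a in activations:
        if random.random() < keep_prob:
            out.append(a / keep_prob)  # scale up so the expected value is unchanged
        else:
            out.append(0.0)            # neuron temporarily "turned off"
    return out

random.seed(0)
print(dropout([1.0, 1.0, 1.0, 1.0]))
```

Note the scaling by `1 / keep_prob`: surviving neurons are boosted so the layer's average output stays the same whether dropout is on or off.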
What is the role of a loss function in neural network training?
It measures the error between predicted and actual outputs, guiding the network’s learning process.
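One common loss is mean squared error (MSE); here is a minimal sketch with made-up predictions:

```python
# Mean squared error: average of squared differences.
def mse(predicted, actual):
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Lower loss = predictions closer to the targets; training tries to
# push this number down.
close = mse([2.9, 4.1], [3.0, 4.0])  # small error
far = mse([1.0, 7.0], [3.0, 4.0])    # large error
print(close, far)
```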
Why is the XOR problem significant in neural network history?
It demonstrated that single-layer perceptrons are incapable of solving non-linearly separable problems, leading to the development of multi-layer networks.
What’s the XOR Problem?
XOR (short for “exclusive OR”) is a simple logic rule:
It outputs 1 if the inputs are different (like 0 and 1 or 1 and 0).
It outputs 0 if the inputs are the same (like 0 and 0 or 1 and 1).
Why Is It a Problem?
A single-layer neural network (also called a perceptron) can only solve problems where you can draw a straight line to separate the outputs (called linearly separable problems).
But XOR doesn’t work that way. You can’t draw a straight line to separate the 1’s from the 0’s on a graph. You’d need something more flexible, like curves or more complex boundaries.
The Big Discovery:
Back in the 1960s, scientists realized this and thought, “Neural networks are useless if they can’t solve XOR.” This almost killed AI research for a while.
Then, in the 1980s, researchers figured out that by adding hidden layers (making multi-layer networks), the problem could be solved. This discovery sparked a huge comeback for AI and neural networks!
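To see the fix in action, here is a tiny two-layer network that computes XOR. The weights are hand-picked for illustration (a real network would learn them): one hidden unit acts like OR, another like AND, and the output fires for "OR but not AND."

```python
def step(x):
    return 1 if x > 0 else 0

def perceptron(inputs, weights, bias):
    return step(sum(w * i for w, i in zip(weights, inputs)) + bias)

def xor(a, b):
    h_or = perceptron([a, b], [1, 1], -0.5)    # fires if at least one input is 1
    h_and = perceptron([a, b], [1, 1], -1.5)   # fires only if both inputs are 1
    return perceptron([h_or, h_and], [1, -2], -0.5)  # OR but not AND

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor(a, b))
```

No single straight line separates XOR's outputs, but the hidden layer bends the space so the final perceptron can.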
What does the learning rate in gradient descent control?
The size of the steps taken to adjust weights during training.
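A quick sketch of why step size matters, minimizing the toy function f(x) = x² (whose gradient is 2x); the learning rates below are arbitrary example values:

```python
# Gradient descent on f(x) = x^2, starting from x = 5.
def descend(lr, steps=20, x=5.0):
    for _ in range(steps):
        x = x - lr * 2 * x  # step = learning rate * gradient
    return x

print(descend(0.1))   # converges toward the minimum at 0
print(descend(0.01))  # too small: barely moved after 20 steps
print(descend(1.1))   # too large: overshoots and diverges
```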
What is the difference between supervised and unsupervised learning?
Supervised learning uses labeled data to train the model, while unsupervised learning finds patterns in data without labels.
What’s the difference between overfitting and underfitting?
Overfitting memorizes training data too well; underfitting doesn’t learn enough patterns.
What makes CNNs special for image recognition?
They use filters to detect patterns (edges, shapes) in images.
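Here is a toy version of such a filter: a classic 3×3 vertical-edge kernel sliding over a tiny made-up grayscale "image" (this is a sketch of one convolution, not a full CNN):

```python
# Slide a kernel over an image and sum the element-wise products.
def convolve(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            total = sum(image[i + di][j + dj] * kernel[di][dj]
                        for di in range(kh) for dj in range(kw))
            row.append(total)
        out.append(row)
    return out

# Image: dark left half (0), bright right half (1) -> a vertical edge.
image = [[0, 0, 0, 1, 1, 1]] * 4
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]  # responds where brightness changes left-to-right

print(convolve(image, kernel))  # peaks mark the edge, zeros mark flat areas
```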
What’s an RNN used for?
It’s designed for sequences, like predicting the next word in a sentence.
How is an RNN different from a regular neural network?
It has memory—previous outputs influence future ones.
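That "memory" can be sketched as a hidden state that each step both reads and updates. The weights below are arbitrary example values, not trained:

```python
import math

W_IN, W_HIDDEN = 0.5, 0.9  # example weights for input and memory

def rnn_step(x, h_prev):
    # New hidden state depends on the current input AND the previous state.
    return math.tanh(W_IN * x + W_HIDDEN * h_prev)

def run(sequence):
    h = 0.0  # memory starts empty
    for x in sequence:
        h = rnn_step(x, h)  # earlier inputs keep influencing h
    return h

# Same final inputs, different histories -> different outputs:
print(run([1, 0, 0]))  # the early 1 still echoes in the hidden state
print(run([0, 0, 0]))  # nothing to remember
```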
Q: What is a feature in machine learning?
A: An input variable used to make predictions (like size, age, or color).
Q: What is a decision tree?
A: A simple ML model that splits data based on questions to make predictions.
Q: What is K-Nearest Neighbors (KNN)?
A: A model that predicts based on the closest examples in the data. EX: a new patient with a fever but no cough is labeled "sick" because the most similar known patients were sick.
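A minimal one-dimensional KNN sketch (k = 3); the training points are invented for illustration:

```python
# Predict a label by majority vote among the k closest training points.
def knn_predict(train, query, k=3):
    # train: list of (feature, label) pairs
    nearest = sorted(train, key=lambda pair: abs(pair[0] - query))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)  # majority vote

train = [(1.0, "healthy"), (1.2, "healthy"), (3.8, "sick"),
         (4.0, "sick"), (4.3, "sick")]
print(knn_predict(train, 3.9))  # neighbors 3.8, 4.0, 4.3 are all "sick"
```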
Q: What is gradient descent used for in training a neural network?
A: To find the best weights by minimizing the error step by step. EX: each step is an educated adjustment to the current guess, based on how wrong it is.
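Putting the pieces together, here is gradient descent finding the single weight w in a toy model y ≈ w·x, using mean squared error. The data is invented (the true weight is 2):

```python
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated by y = 2 * x

w = 0.0   # initial guess
lr = 0.05 # learning rate (example value)
for _ in range(100):
    # Gradient of MSE with respect to w: mean of 2 * (w*x - y) * x
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad  # step in the direction that reduces the error

print(round(w, 3))  # approaches 2.0, the weight that fits the data
```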