Feedforward Neural Networks 2 Flashcards
1
Q
Node
A
- Processes input data and contributes to the network’s output
- Can hold a tensor, matrix, vector, or scalar value
- Its input comes either from the features of the data or from the output value of another node
2
Q
Edge
A
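- Represents a function argument (a data dependency between two nodes)
- Acts as a pointer to a node
- In the network view, each edge carries a weight that can be adjusted during optimization to make better predictions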
3
Q
Epoch
A
- One full cycle through the entire training dataset
- During each epoch, the parameters (weights and biases) are adjusted based on the gradients of the loss function
4
Q
Training a Model Steps
A
- Define a computation graph (FFNN class)
- For each epoch:
a. For each batch of data: compute the loss, run autograd to compute the gradients, and take a step with the optimizer
b. Evaluate on the validation set to detect overfitting
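
The steps above can be sketched in plain Python. This is a minimal illustration, not the FFNN class the card refers to: it assumes a one-variable linear model `y = w*x + b`, and the gradients are written out by hand where an autograd library would compute them. The names `mse` and `train` and all hyperparameter values are hypothetical.

```python
import random

def mse(w, b, data):
    """Mean squared error of the model y = w*x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(train_data, val_data, epochs=200, batch_size=4, lr=0.1, seed=0):
    """Mini-batch SGD; returns (w, b, validation loss after each epoch)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(train_data)
    val_losses = []
    for _ in range(epochs):
        rng.shuffle(data)  # new example order for this epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradients of the batch MSE, derived by hand here;
            # an autograd library would produce these automatically.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            # Optimizer step (plain SGD, an assumption of this sketch).
            w -= lr * grad_w
            b -= lr * grad_b
        # Step b: evaluate on held-out data once per epoch.
        val_losses.append(mse(w, b, val_data))
    return w, b, val_losses
```

Tracking the validation loss per epoch is what makes overfitting visible: training loss keeps falling while validation loss starts to rise.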
5
Q
Shuffling the Training Data
A
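- Problem: what if “I love you” appears 1000 times in a row at the end of the training set?
- The final updates would all be driven by that one example, so the parameters would be skewed toward it
- Fix: randomly shuffle the order of the training examples before each training epoch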
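
A minimal sketch of per-epoch shuffling, using only the standard library. The function name `epoch_orders` and the `seed` parameter are hypothetical and chosen for illustration.

```python
import random

def epoch_orders(examples, num_epochs, seed=0):
    """Yield a freshly shuffled copy of the examples for each epoch."""
    rng = random.Random(seed)
    for _ in range(num_epochs):
        order = list(examples)  # copy so the caller's list is untouched
        rng.shuffle(order)      # in-place shuffle of the copy
        yield order
```

Each epoch sees every example exactly once, but in a different order, so a long run of identical sentences no longer dominates the last updates.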
6
Q
Early Stopping
A
- Prevents overfitting
- Stop training when performance on the validation set starts to decline
- Return the parameters that performed best on the validation set
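
One possible way to implement this is sketched below. It assumes each epoch yields a `(validation_loss, params)` pair, and it adds a `patience` setting (how many non-improving epochs to tolerate), which the card does not mention. The function name is hypothetical.

```python
def train_with_early_stopping(epoch_results, patience=2):
    """Consume (val_loss, params) pairs, one per epoch, and stop once
    the loss has failed to improve for `patience` epochs in a row."""
    best_loss = float("inf")
    best_params = None
    bad_epochs = 0
    epochs_run = 0
    for val_loss, params in epoch_results:
        epochs_run += 1
        if val_loss < best_loss:
            # New best: remember these parameters and reset the counter.
            best_loss, best_params, bad_epochs = val_loss, params, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    return best_params, best_loss, epochs_run
```

With `patience=1` training stops at the first decline, which matches the card most literally; larger values tolerate noisy validation curves.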
7
Q
Training Tricks
A
- Shuffling the training data
- Early stopping
- Parameter dropout
8
Q
Parameter Dropout
A
- Prevents overfitting
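- Randomly sets a portion of the model’s parameters (weights) to zero
- Note: standard dropout zeroes hidden-unit activations; zeroing individual weights is the DropConnect variant
- Applied only at training time; nothing is dropped at test time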
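
A minimal sketch of the masking step on a plain list of values, which could stand for weights or activations. It assumes the common "inverted" scaling, where surviving values are divided by the keep probability so that nothing needs rescaling at test time; that scaling is not stated on the card. The function name and parameters are hypothetical.

```python
import random

def dropout(values, p=0.5, training=True, rng=random):
    """Zero each value with probability p during training only."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(values)  # test time: keep everything unchanged
    keep = 1.0 - p
    # Survivors are scaled by 1/keep (inverted dropout, an assumption).
    return [v / keep if rng.random() < keep else 0.0 for v in values]
```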
9
Q
Mini-batching
A
- Splits the training set into smaller, manageable subsets (mini-batches)
- Combines many small operations into one large one, which is more efficient to compute
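
The splitting step can be sketched as a small generator. The name `minibatches` is hypothetical; the last batch is simply allowed to be smaller than the rest.

```python
def minibatches(examples, batch_size):
    """Yield consecutive slices of at most batch_size examples."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]
```

For example, ten examples with `batch_size=4` give batches of sizes 4, 4, and 2.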
10
Q
Batch Data Loading
A
- Instead of processing a single sentence at a time, process a mini-batch of sentences
- Sentences in a mini-batch differ in length, so sentence padding must be applied to each mini-batch
11
Q
Sentence Padding
A
- Done during batch data loading
- Pads the shorter sentences in a batch so that every sentence matches the length of the longest one
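
A minimal sketch of padding a batch of tokenized sentences. The `"<pad>"` token, the function name, and the returned mask (1 for real tokens, 0 for padding) are assumptions for illustration; the card only specifies that shorter sentences are padded.

```python
PAD = "<pad>"  # hypothetical padding token

def pad_batch(sentences, pad_token=PAD):
    """Pad token lists to the longest length; also return a 0/1 mask."""
    max_len = max((len(s) for s in sentences), default=0)
    padded = [s + [pad_token] * (max_len - len(s)) for s in sentences]
    # The mask lets later steps ignore padding positions.
    mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in sentences]
    return padded, mask
```

Usage: `pad_batch([["I", "love", "you"], ["hi"]])` pads the second sentence with two `"<pad>"` tokens so both rows have length 3.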