lesson_1_flashcards
What is hierarchical compositionality in deep learning?
A principle where complex features are built by composing simpler ones layer by layer (e.g., edges combine into shapes, shapes into objects), mirroring the compositional structure of real-world data.
What is end-to-end learning in deep learning?
A learning method where models optimize directly from raw inputs to final outputs, automating feature extraction and classification in one process.
What is distributed representation in deep learning?
A feature representation where information is distributed across multiple neurons, enabling rich and generalizable representations.
What is gradient descent used for in deep learning?
An optimization algorithm that iteratively updates model parameters to minimize a loss function by moving in the direction of steepest descent.
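The update rule can be sketched on a toy one-dimensional loss; the learning rate (0.1), starting point, and quadratic loss are illustrative choices, not part of the definition:

```python
# Gradient descent on L(w) = (w - 3)**2, whose minimum is at w = 3.
def grad(w):
    return 2 * (w - 3)  # dL/dw

w = 0.0                  # arbitrary starting point
for _ in range(100):
    w -= 0.1 * grad(w)   # step opposite the gradient (steepest descent)
# w converges toward the minimizer w = 3
```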
What are mini-batches in gradient descent?
Mini-batches are small subsets of training data used to compute gradients and update weights, balancing computational efficiency and convergence stability.
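A minimal sketch of mini-batch iteration, assuming a toy dataset and a batch size of 20 (both illustrative); gradients would be computed per batch where the comment indicates:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))   # toy dataset: 100 samples, 5 features
y = rng.normal(size=100)

batch_size = 20
perm = rng.permutation(len(X))  # shuffle once per epoch
for start in range(0, len(X), batch_size):
    idx = perm[start:start + batch_size]
    X_batch, y_batch = X[idx], y[idx]
    # compute gradients on (X_batch, y_batch) and update weights here
```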
What is the role of softmax in classification tasks?
Softmax converts raw class scores into normalized probabilities, making them interpretable for classification.
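As a sketch, softmax exponentiates each score and normalizes by the sum; subtracting the max first is a standard numerical-stability trick that leaves the result unchanged:

```python
import numpy as np

def softmax(scores):
    # Shifting all scores by a constant does not change softmax,
    # so subtracting the max avoids overflow in np.exp.
    shifted = scores - np.max(scores)
    exps = np.exp(shifted)
    return exps / exps.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
# probs sums to 1 and preserves the ordering of the raw scores
```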
What is cross-entropy loss?
A loss function used for classification tasks that penalizes incorrect predictions by comparing predicted probabilities with true labels.
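For a single example with one-hot labels, cross-entropy reduces to the negative log-probability of the true class; the probability vectors below are illustrative:

```python
import numpy as np

def cross_entropy(probs, true_class):
    # Penalty grows sharply as the probability of the true class shrinks.
    return -np.log(probs[true_class])

confident = cross_entropy(np.array([0.9, 0.05, 0.05]), 0)  # mostly right
unsure    = cross_entropy(np.array([0.4, 0.3, 0.3]), 0)    # hedged
# confident < unsure: the loss rewards probability mass on the true label
```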
How are computation graphs utilized in deep learning?
Computation graphs represent models as differentiable operations, enabling efficient backpropagation for optimization.
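The idea can be sketched by hand for a tiny graph computing L = (w*x - y)**2: the forward pass stores intermediates, and the backward pass applies the chain rule node by node (what autodiff frameworks automate):

```python
x, y, w = 2.0, 5.0, 1.5

# forward pass: each intermediate value is a node in the graph
p = w * x        # multiply node
d = p - y        # subtract node
L = d ** 2       # square node (the loss)

# backward pass: propagate dL/d(node) in reverse through the graph
dL_dd = 2 * d        # from L = d**2
dL_dp = dL_dd * 1.0  # from d = p - y
dL_dw = dL_dp * x    # from p = w * x  -> chain rule gives 2*(w*x - y)*x
```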
What are parametric models in machine learning?
Models that explicitly represent a function f(x, W) with parameters W, such as linear models or neural networks, whose parameters are optimized during training.
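The simplest parametric model is a linear classifier; the shapes below (3 classes, 4 features) and the bias term are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))  # parameters adjusted during training
b = np.zeros(3)

def f(x, W, b):
    # Linear model: one raw score per class.
    return W @ x + b

scores = f(rng.normal(size=4), W, b)
```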
What is the purpose of feature learning in deep learning?
Feature learning automates the process of extracting meaningful representations from raw data, reducing dependency on manual feature engineering.
What is supervised learning?
A learning paradigm where models are trained on labeled datasets to learn mappings from inputs (X) to outputs (Y).
What are regularization techniques used for in deep learning?
Regularization techniques like L1 or L2 prevent overfitting by penalizing large weight values, encouraging simpler models.
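Both penalties can be sketched in a few lines; the strength `lam` is an illustrative hyperparameter added to the data loss during training:

```python
import numpy as np

def l2_penalty(W, lam=0.01):
    # Penalizes squared magnitude: shrinks all weights smoothly.
    return lam * np.sum(W ** 2)

def l1_penalty(W, lam=0.01):
    # Penalizes absolute magnitude: tends to drive weights to exactly zero.
    return lam * np.sum(np.abs(W))

W = np.array([[0.5, -2.0], [1.0, 0.1]])
# larger weights incur larger penalties, nudging training toward simpler models
```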
What makes deep learning unique compared to traditional machine learning?
Deep learning uses hierarchical, end-to-end learning and distributed representations to generalize across tasks without manual feature engineering.
What are loss functions, and why are they important?
Loss functions measure the error between predicted outputs and ground truth, guiding optimization during training.
How does hierarchical compositionality mirror real-world data?
It reflects natural structures, such as edges forming shapes in images or phonemes forming words in speech.