lesson_2_flashcards
What is a computation graph?
A directed acyclic graph representing a function as interconnected modules, allowing gradient computations for optimization.
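A minimal sketch (function and values are hypothetical): a small function decomposed into graph nodes, each performing one operation, wired input to output with no cycles.

```python
# f(x, y) = (x + y) * x as a tiny computation graph:
# node 1 (add) feeds node 2 (multiply); edges carry intermediate values.
def f(x, y):
    a = x + y      # node 1: add
    out = a * x    # node 2: multiply
    return out

print(f(2.0, 3.0))  # (2 + 3) * 2 = 10.0
```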
Why must layers in computation graphs be differentiable?
Differentiability ensures gradients can be computed, which are essential for optimization using gradient descent.
What is backpropagation?
An algorithm to compute gradients for all parameters in a computation graph by recursively applying the chain rule from outputs to inputs.
What are the steps of backpropagation?
1. Forward pass to compute activations; 2. Backward pass to compute gradients of the loss with respect to the parameters using the chain rule.
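A hand-rolled sketch of both passes for a one-parameter-pair model, treating the squared output as the loss (all values are hypothetical):

```python
# Forward pass: compute and cache intermediate activations.
w, b, x = 3.0, 1.0, 2.0
z = w * x + b            # linear node: z = 7.0
loss = z ** 2            # loss node: 49.0

# Backward pass: apply the chain rule from the loss back to parameters.
dloss_dz = 2 * z         # d(z^2)/dz = 2z = 14.0
dloss_dw = dloss_dz * x  # dz/dw = x  -> 28.0
dloss_db = dloss_dz * 1  # dz/db = 1  -> 14.0
print(dloss_dw, dloss_db)
```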
What is the difference between forward and reverse mode automatic differentiation?
Forward mode propagates derivatives alongside the computation from inputs to outputs, while reverse mode propagates them from outputs back to inputs; reverse mode is more efficient when one scalar output (the loss) depends on many inputs (the parameters), which is the typical deep learning setting.
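A minimal sketch of forward mode using dual numbers (the `Dual` class is a hypothetical toy, not a library API); the comment at the end contrasts it with reverse mode:

```python
class Dual:
    """Forward-mode AD: carry (value, derivative) through each op."""
    def __init__(self, val, dot):
        self.val, self.dot = val, dot
    def __add__(self, other):
        return Dual(self.val + other.val, self.dot + other.dot)
    def __mul__(self, other):
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

# Forward mode: seed dx/dx = 1 at the input, read df/dx at the output.
x = Dual(2.0, 1.0)
f = x * x + x            # f = x^2 + x
print(f.val, f.dot)      # 6.0, df/dx = 2x + 1 = 5.0

# Reverse mode (what backpropagation does) instead seeds df/df = 1 at the
# output and sweeps backward once, yielding df/dx for *every* input at once,
# which is why it wins when inputs vastly outnumber outputs.
```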
What is the significance of automatic differentiation in deep learning?
It automates gradient computation for arbitrary computation graphs, simplifying implementation and enabling differentiable programming.
How does the chain rule apply in backpropagation?
Gradients of the loss with respect to parameters are computed as products of intermediate gradients along the computation graph.
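A worked example (values hypothetical): the gradient of a nested function is the product of the local derivatives along the graph, checked here against a finite difference.

```python
import math

# L(w) = sigmoid(w * x) ** 2; the chain rule multiplies local derivatives.
x, w = 1.5, 0.8

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

z = w * x
s = sigmoid(z)

# dL/dw = dL/ds * ds/dz * dz/dw  (product of intermediate gradients)
grad = (2 * s) * (s * (1 - s)) * x

# Finite-difference check of the same derivative.
eps = 1e-6
num = (sigmoid((w + eps) * x) ** 2 - sigmoid((w - eps) * x) ** 2) / (2 * eps)
print(grad, num)  # should agree to ~6 decimal places
```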
What is a fully connected (linear) layer in neural networks?
A layer where each input is connected to every output through a weighted sum, followed by an optional activation function.
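A minimal numpy sketch (shapes and initialization are illustrative only):

```python
import numpy as np

def linear(x, W, b):
    """Fully connected layer: every input feeds every output unit."""
    return W @ x + b          # one weighted sum per output unit

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))   # 4 inputs -> 3 outputs
b = np.zeros(3)
x = rng.normal(size=4)
y = np.maximum(0.0, linear(x, W, b))  # optional activation (ReLU here)
print(y.shape)  # (3,)
```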
What are hidden layers in a neural network?
Layers between input and output that learn intermediate features, increasing the representational power of the model.
What is a rectified linear unit (ReLU)?
A non-linear activation function defined as max(0, x); its gradient does not saturate for positive inputs, giving better gradient flow than sigmoid functions.
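A short sketch of ReLU and its derivative (the subgradient at x = 0 is taken as 0 here, a common convention):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # Derivative is 1 for x > 0, 0 otherwise.
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))       # [0.  0.  0.  1.5]
print(relu_grad(x))  # [0.  0.  0.  1.]
```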
What is the role of the Jacobian in neural networks?
The matrix of partial derivatives of a vector-valued function with respect to its inputs; backpropagation chains Jacobian-vector products to compute gradients efficiently.
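A concrete case (values hypothetical): for a linear map y = Wx, the Jacobian dy/dx is exactly W, verified here column by column with finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 3))
x = rng.normal(size=3)

# For y = W @ x, the Jacobian dy/dx is exactly W.
eps = 1e-6
J = np.zeros((2, 3))
for j in range(3):
    e = np.zeros(3); e[j] = eps
    J[:, j] = (W @ (x + e) - W @ (x - e)) / (2 * eps)

print(np.allclose(J, W))  # True
```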
What is logistic regression in the context of computation graphs?
A binary classifier using the sigmoid function on a weighted sum of inputs, represented as a simple computation graph.
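A minimal sketch of that graph (weights and data are hypothetical), including the classic gradient of the binary cross-entropy loss, which collapses to (p - y) * x:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Computation graph: x -> (w . x + b) -> sigmoid -> probability.
w = np.array([0.5, -1.2])
b = 0.1
x = np.array([2.0, 0.5])
y = 1.0                          # true binary label

p = sigmoid(w @ x + b)           # predicted P(y = 1 | x)
grad_w = (p - y) * x             # gradient of binary cross-entropy wrt w
print(p, grad_w)
```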
What is gradient flow in deep learning?
The propagation of gradient information through a network during backpropagation, critical for effective learning.
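A small demonstration of why gradient flow matters (depth and activation are illustrative): each sigmoid layer multiplies the gradient by s(1 - s) <= 0.25, so the signal can shrink geometrically with depth.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

grad = 1.0
z = 0.0  # pre-activations at 0 give the *best* case, s'(0) = 0.25
for layer in range(10):
    s = sigmoid(z)
    grad *= s * (1 - s)
print(grad)  # ~0.25**10, about 1e-6: the gradient has nearly vanished
```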
What is differentiable programming?
A paradigm where entire programs, including control flows, are made differentiable to enable optimization through backpropagation.
What are mini-batches, and why are they used?
Small subsets of training data used in gradient descent to balance computational efficiency and stable optimization.
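A minimal mini-batch SGD sketch on a toy linear regression (dataset, learning rate, and batch size are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))            # toy dataset
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=1000)

w = np.zeros(5)
lr, batch_size = 0.1, 32
for epoch in range(20):
    perm = rng.permutation(len(X))        # reshuffle each epoch
    for i in range(0, len(X), batch_size):
        idx = perm[i:i + batch_size]
        Xb, yb = X[idx], y[idx]
        # Gradient of mean squared error on this mini-batch only.
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(idx)
        w -= lr * grad
print(w.round(2))  # roughly [1., -2., 0.5, 0., 3.]
```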