lesson_2_flashcards

1
Q

What is a computation graph?

A

A directed acyclic graph representing a function as interconnected modules, allowing gradient computations for optimization.
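
For example, f(x, y, z) = (x + y) * z decomposes into two nodes: a = x + y feeding f = a * z. Each node only needs its local derivative (∂a/∂x = 1, ∂f/∂a = z), and the graph's edges determine how those local derivatives combine into the full gradient.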

2
Q

Why must layers in computation graphs be differentiable?

A

Differentiability ensures gradients can be computed, which are essential for optimization using gradient descent.
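
Concretely, gradient descent updates each parameter as θ ← θ − η · ∂L/∂θ, where η is the learning rate; if any layer between the parameter and the loss is not differentiable, that partial derivative is undefined and the update cannot be computed.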

3
Q

What is backpropagation?

A

An algorithm to compute gradients for all parameters in a computation graph by recursively applying the chain rule from outputs to inputs.
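
Continuing the example f = (x + y) * z with a = x + y: backpropagation starts from ∂f/∂f = 1 and moves backward, giving ∂f/∂a = z and ∂f/∂z = a, then ∂f/∂x = ∂f/∂a · ∂a/∂x = z (and likewise ∂f/∂y = z). Each node reuses the gradient already computed for its output, so one backward sweep yields every partial derivative.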

4
Q

What are the steps of backpropagation?

A
1. Forward pass: compute and cache all activations. 2. Backward pass: apply the chain rule from the loss back through the graph to obtain the gradient of the loss with respect to each parameter.
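
A minimal numpy sketch of both passes for a toy model y = relu(Wx) with squared-error loss; the sizes and names are illustrative, not from the lesson:

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(size=(3, 4))        # parameters
    x = rng.normal(size=4)             # input
    t = rng.normal(size=3)             # target

    # 1. Forward pass: compute and cache activations.
    z = W @ x                          # pre-activation
    y = np.maximum(0, z)               # ReLU activation
    loss = 0.5 * np.sum((y - t) ** 2)  # squared-error loss

    # 2. Backward pass: chain rule from the loss back to W.
    dy = y - t                         # dL/dy
    dz = dy * (z > 0)                  # dL/dz, via the ReLU mask
    dW = np.outer(dz, x)               # dL/dW
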
5
Q

What is the difference between forward and reverse mode automatic differentiation?

A

Forward mode propagates derivatives from inputs toward outputs alongside the forward computation; reverse mode propagates them from outputs back to inputs. Reverse mode is more efficient for deep learning because the loss is a single scalar output of very many parameters.
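
For a function with n inputs and m outputs, forward mode needs one pass per input to build the full Jacobian, while reverse mode needs one pass per output. A training loss has millions of inputs (the parameters) but one scalar output, so a single reverse pass recovers every gradient.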

6
Q

What is the significance of automatic differentiation in deep learning?

A

It automates gradient computation for arbitrary computation graphs, simplifying implementation and enabling differentiable programming.

7
Q

How does the chain rule apply in backpropagation?

A

Gradients of the loss with respect to parameters are computed as products of intermediate gradients along the computation graph.
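
Written out for a chain x → h1 → h2 → L: ∂L/∂x = (∂L/∂h2) · (∂h2/∂h1) · (∂h1/∂x). Backpropagation accumulates this product starting from ∂L/∂h2, so every step is a cheap vector-Jacobian product rather than a matrix-matrix product.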

8
Q

What is a fully connected (linear) layer in neural networks?

A

A layer computing y = Wx + b: every output is a weighted sum of all inputs plus a bias, optionally followed by an activation function.
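
A minimal numpy sketch, with illustrative sizes:

    import numpy as np

    def linear(x, W, b):
        # Each of the 3 outputs is a weighted sum of all 4 inputs plus a bias.
        return W @ x + b

    W = np.ones((3, 4))
    b = np.zeros(3)
    y = linear(np.arange(4.0), W, b)   # array([6., 6., 6.])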

9
Q

What are hidden layers in a neural network?

A

Layers between input and output that learn intermediate features, increasing the representational power of the model.
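
For instance, a one-hidden-layer network computes h = relu(W1·x + b1) and y = W2·h + b2; the hidden vector h is a learned intermediate feature. Without the nonlinearity between them, the two layers would collapse into a single linear map, gaining no representational power.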

10
Q

What is a rectified linear unit (ReLU)?

A

A non-linear activation function defined as max(0, x), providing better gradient flow than sigmoid functions.
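
A two-line numpy sketch of ReLU and the derivative used during backpropagation:

    import numpy as np

    def relu(x):
        return np.maximum(0.0, x)       # passes positives, zeroes negatives

    def relu_grad(x):
        return (x > 0).astype(x.dtype)  # 1 where x > 0, else 0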

11
Q

What is the role of the Jacobian in neural networks?

A

The Jacobian is the matrix of all partial derivatives of a vector-valued function with respect to its inputs, J[i][j] = ∂f_i/∂x_j; exploiting its structure aids efficient gradient computation.
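
For an elementwise operation such as ReLU, the Jacobian is diagonal (entries 1 where x_i > 0, else 0), so backpropagation multiplies the incoming gradient by a mask instead of forming a full n × n matrix.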

12
Q

What is logistic regression in the context of computation graphs?

A

A binary classifier using the sigmoid function on a weighted sum of inputs, represented as a simple computation graph.
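
A minimal numpy sketch of that graph; the weights are illustrative:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def predict(x, w, b):
        # Graph: weighted-sum node -> sigmoid node -> probability of class 1.
        return sigmoid(w @ x + b)

    p = predict(np.array([2.0, 1.0]), np.array([0.5, -0.5]), 0.0)  # ~0.62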

13
Q

What is gradient flow in deep learning?

A

The propagation of gradient information through a network during backpropagation, critical for effective learning.
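
The gradient reaching an early layer is a product of one local derivative per layer, so magnitudes compound: with sigmoid activations, whose slope never exceeds 0.25, the signal through k layers can shrink like 0.25^k (vanishing gradients), whereas an active ReLU contributes a factor of exactly 1.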

14
Q

What is differentiable programming?

A

A paradigm where entire programs, including control flows, are made differentiable to enable optimization through backpropagation.

15
Q

What are mini-batches, and why are they used?

A

Small subsets of training data used in gradient descent to balance computational efficiency and stable optimization.
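
A minimal sketch of a mini-batch loop, assuming numpy arrays and a hypothetical grad_fn that returns the mean gradient over a batch:

    import numpy as np

    def sgd(X, Y, theta, grad_fn, lr=0.1, batch_size=32, epochs=10):
        rng = np.random.default_rng(0)
        for _ in range(epochs):
            idx = rng.permutation(len(X))              # reshuffle each epoch
            for start in range(0, len(X), batch_size):
                batch = idx[start:start + batch_size]
                # Gradient over a small subset, not the whole dataset.
                theta = theta - lr * grad_fn(theta, X[batch], Y[batch])
        return theta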
