2. Mathematical Building Blocks Flashcards

1
Q

What are Tensors?

A

At its core, a tensor is a container for data—almost always numerical data. So, it’s a container for numbers. You may already be familiar with matrices, which are 2D tensors: tensors are a generalization of matrices to an arbitrary number of dimensions (note that in the context of tensors, a dimension is often called an axis).

2
Q

Scalars

A

A tensor that contains only one number is called a scalar (or scalar tensor, or 0D tensor). A scalar tensor has 0 axes (its ndim is 0); in NumPy, a float32 or float64 number is a scalar tensor.
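
For instance, a minimal NumPy sketch (the value is arbitrary):

import numpy as np
x = np.array(12)
print(x.ndim)   # 0: a scalar tensor has zero axes
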
3
Q

Vectors

A

An array of numbers is called a vector, or 1D tensor. A 1D tensor has exactly one axis.
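
For instance, a minimal NumPy sketch (the values are arbitrary):

import numpy as np
x = np.array([12, 3, 6, 14, 7])
print(x.ndim)   # 1: this vector has a single axis

This vector has five entries, so it is called a 5-dimensional vector; don’t confuse a 5D vector (one axis, five values along it) with a 5D tensor (five axes).
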
4
Q

Matrices (2D tensors)

A

An array of vectors is a matrix, or 2D tensor. A matrix has two axes (often referred to as rows and columns). You can visually interpret a matrix as a rectangular grid of numbers. This is a Numpy matrix:
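
For instance, a 3 × 5 matrix built from arbitrary values:

import numpy as np
x = np.array([[5, 78, 2, 34, 0],
              [6, 79, 3, 35, 1],
              [7, 80, 4, 36, 2]])
print(x.ndim)   # 2: the first axis indexes the rows, the second axis the columns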

5
Q

3D tensors and higher-dimensional tensors

A

If you pack matrices into a new array, you obtain a 3D tensor, which you can visually interpret as a cube of numbers. By packing 3D tensors into an array you create a 4D tensor, and so on. In deep learning you’ll generally manipulate tensors that are 0D to 4D, although you may go up to 5D if you process video data.
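
For instance, a minimal NumPy sketch (the shape is chosen arbitrarily):

import numpy as np
x = np.zeros((3, 3, 5))   # three 3 × 5 matrices packed together
print(x.ndim)             # 3
print(x.shape)            # (3, 3, 5)
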
6
Q

A tensor is defined by three key attributes:

A

Number of axes (rank): for instance, a matrix has two axes and a 3D tensor has three axes (this is ndim in NumPy).

Shape: a tuple of integers that describes how many entries the tensor has along each axis.

Data type (dtype): the type of the data contained in the tensor; for instance, float32, uint8, or float64.
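
A minimal NumPy sketch of the three attributes (shape and dtype chosen to mimic the MNIST training images):

import numpy as np
x = np.zeros((60000, 28, 28), dtype="uint8")
print(x.ndim)    # 3: number of axes (rank)
print(x.shape)   # (60000, 28, 28)
print(x.dtype)   # uint8
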
7
Q

Displaying MNIST digit images

A
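
A minimal sketch, assuming the MNIST data is loaded through the Keras datasets module and using Matplotlib to render one digit (the sample index is arbitrary):

from tensorflow.keras.datasets import mnist
import matplotlib.pyplot as plt

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

digit = train_images[4]                 # one 28 × 28 grayscale image
plt.imshow(digit, cmap=plt.cm.binary)   # display it in black and white
plt.show()
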
8
Q

Data Batches

A

Deep-learning models don’t process an entire dataset at once; they break the data into small batches. In all data tensors, the first axis (axis 0) is the samples axis; when working with batch tensors, it is also called the batch axis or batch dimension.
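
A minimal sketch of slicing batches of 128 samples along the first axis (train_images here is just a stand-in array with an MNIST-like shape):

import numpy as np

train_images = np.zeros((60000, 28, 28))      # stand-in for the MNIST training images
batch = train_images[:128]                    # first batch
batch = train_images[128:256]                 # second batch
n = 3
batch = train_images[128 * n:128 * (n + 1)]   # the nth batch
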
9
Q

Real World Tensor Examples

A

The data you’ll manipulate will almost always fall into one of the following categories:

Vector data: 2D tensors of shape (samples, features)

Timeseries or sequence data: 3D tensors of shape (samples, timesteps, features)

Images: 4D tensors of shape (samples, height, width, channels)

Video: 5D tensors of shape (samples, frames, height, width, channels)

10
Q

Vector for: “An actuarial dataset of people, where we consider each person’s age, ZIP code, and income.”

A

Each person can be characterized as a vector of 3 values, and thus an entire dataset of 100,000 people can be stored in a 2D tensor of shape (100000, 3).

11
Q

Vector for: “A dataset of text documents, where we represent each document by the counts of how many times each word appears in it (out of a dictionary of 20,000 common words).”

A

Each document can be encoded as a vector of 20,000 values (one count per word in the dictionary), and thus an entire dataset of 500 documents can be stored in a tensor of shape (500, 20000).

12
Q

Vector for: “A dataset of stock prices. Every minute, we store the current price of the stock, the highest price in the past minute, and the lowest price in the past minute.”

A

Every minute is encoded as a 3D vector (a vector of 3 values), an entire day of trading is encoded as a 2D tensor of shape (390, 3) (there are 390 minutes in a trading day), and 250 days’ worth of data can be stored in a 3D tensor of shape (250, 390, 3). Here, each sample would be one day’s worth of data.

13
Q

Vector for: “A dataset of tweets, where we encode each tweet as a sequence of 280 characters out of an alphabet of 128 unique characters.”

A

In this setting, each character can be encoded as a binary vector of size 128 (an all-zeros vector except for a 1 entry at the index corresponding to the character). Then each tweet can be encoded as a 2D tensor of shape (280, 128), and a dataset of 1 million tweets can be stored in a tensor of shape (1000000, 280, 128).

14
Q

Vector for: “A batch of 128 grayscale images of size 256 × 256”

A

A batch of 128 grayscale images of size 256 × 256 could thus be stored in a tensor of shape (128, 256, 256, 1), and a batch of 128 color images could be stored in a tensor of shape (128, 256, 256, 3).

15
Q

Describe as a function: “keras.layers.Dense(512, activation='relu')”

A

output = relu(dot(W, input) + b)
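
A rough NumPy sketch of that computation, with shapes chosen for illustration (512 output units and an assumed input size of 784):

import numpy as np

W = np.random.random((512, 784))            # layer weights (assumed shape)
b = np.random.random((512,))                # layer biases
x = np.random.random((784,))                # one input sample

output = np.maximum(np.dot(W, x) + b, 0.)   # relu(dot(W, input) + b)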

16
Q

relu pseudocode

A
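
A naive element-wise sketch for a 2D NumPy tensor (explicit loops standing in for the optimized np.maximum):

def naive_relu(x):
    assert len(x.shape) == 2            # x is a 2D NumPy tensor
    x = x.copy()                        # avoid overwriting the input tensor
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] = max(x[i, j], 0)
    return x
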
17
Q

naive add pseudocode

A
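
A naive sketch of element-wise addition for two 2D NumPy tensors of the same shape:

def naive_add(x, y):
    assert len(x.shape) == 2            # x and y are 2D NumPy tensors
    assert x.shape == y.shape
    x = x.copy()                        # avoid overwriting the input tensor
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] += y[i, j]
    return x
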
18
Q

Broadcasting

A

When the shapes of two tensors differ, the smaller tensor is broadcast to match the shape of the larger tensor, in two steps: (1) axes (called broadcast axes) are added to the smaller tensor to match the ndim of the larger tensor, and (2) the smaller tensor is repeated alongside these new axes to match the full shape of the larger tensor.
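
For instance, an element-wise maximum between tensors of different shapes (shapes chosen arbitrarily):

import numpy as np

x = np.random.random((64, 3, 32, 10))   # random tensor of shape (64, 3, 32, 10)
y = np.random.random((32, 10))          # random tensor of shape (32, 10)

z = np.maximum(x, y)                    # y is broadcast to shape (64, 3, 32, 10)
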
19
Q

naive add matrix and vector pseudocode

A
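
A naive sketch that adds a vector to every row of a matrix (the vector is broadcast along the first axis):

def naive_add_matrix_and_vector(x, y):
    assert len(x.shape) == 2            # x is a 2D NumPy tensor (matrix)
    assert len(y.shape) == 1            # y is a 1D NumPy tensor (vector)
    assert x.shape[1] == y.shape[0]
    x = x.copy()                        # avoid overwriting the input tensor
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] += y[j]
    return x
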
20
Q

naive vector dot pseudocode

A
def naive_vector_dot(x, y):
    assert len(x.shape) == 1            # x is a 1D NumPy tensor (vector)
    assert len(y.shape) == 1            # y is a 1D NumPy tensor (vector)
    assert x.shape[0] == y.shape[0]     # x and y must have the same number of entries
    z = 0.
    for i in range(x.shape[0]):
        z += x[i] * y[i]
    return z

21
Q

Training Loop

A

1 Draw a batch of training samples x and corresponding targets y.

2 Run the network on x (a step called the forward pass) to obtain predictions y_pred.

3 Compute the loss of the network on the batch, a measure of the mismatch between y_pred and y.

4 Update all weights of the network in a way that slightly reduces the loss on this batch.
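
A minimal runnable sketch of this loop, assuming a toy linear model trained with naive mini-batch gradient descent on synthetic data (not Keras code):

import numpy as np

rng = np.random.default_rng(0)
x_train = rng.random((256, 4))                        # 256 samples, 4 features
y_train = x_train @ np.array([1.0, -2.0, 3.0, 0.5])   # synthetic regression targets

W = np.zeros(4)                                       # weights of a linear model
learning_rate = 0.1
for epoch in range(20):
    for start in range(0, len(x_train), 32):
        x = x_train[start:start + 32]                 # 1) draw a batch of samples
        y = y_train[start:start + 32]                 #    and corresponding targets
        y_pred = x @ W                                # 2) forward pass
        loss = np.mean((y_pred - y) ** 2)             # 3) mean squared error loss
        grad = 2 * x.T @ (y_pred - y) / len(x)        #    gradient of the loss w.r.t. W
        W -= learning_rate * grad                     # 4) update weights to slightly reduce the loss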
