Deep Learning Using SAS® Software Flashcards

1
Q

What are the three Deep Learning model variants?

A
  1. Deep fully connected neural networks (DNN)
  2. Convolutional neural networks (CNN)
  3. Recurrent neural networks (RNN)
2
Q

What other languages can run CASL?

A

Python, R, Java, and Lua

3
Q

What is Curriculum Learning?

A

slowly building up learning concepts, i.e., training on simpler examples before progressively harder ones; the shuffle action can be used to randomize the data

4
Q

What method of weight initialization is used in deep learning?

A

a normalized initialization (commonly known as Xavier or Glorot initialization) in which the variance of the hidden weights is a function of the amount of incoming information (fan-in) and outgoing information (fan-out)

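The normalized initialization on this card can be sketched in NumPy; this is an illustrative implementation, not SAS code, and the function name is my own:

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng=np.random.default_rng(0)):
    """Xavier/Glorot uniform initialization: the weight variance is
    2 / (fan_in + fan_out), a function of incoming and outgoing units."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Weights for a layer with 256 inputs and 128 outputs
W = xavier_uniform(256, 128)
print(W.shape, round(W.var(), 5))  # variance close to 2 / (256 + 128)
```

Because Uniform(-a, a) has variance a²/3, choosing a = sqrt(6 / (fan_in + fan_out)) yields the desired variance.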
5
Q

What does RMSE stand for?

A

root mean square error

6
Q

What does MAPE stand for?

A

mean absolute percentage error

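The two error measures above can be computed directly; a minimal NumPy sketch (function names are my own):

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared residual."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return np.sqrt(np.mean((actual - predicted) ** 2))

def mape(actual, predicted):
    """Mean absolute percentage error (actual values must be nonzero)."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return 100.0 * np.mean(np.abs((actual - predicted) / actual))

y_true = [100, 200, 300]
y_pred = [110, 190, 330]
print(round(rmse(y_true, y_pred), 3))  # 19.148
print(round(mape(y_true, y_pred), 3))  # 8.333
```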
7
Q

What is regularization?

A

the process of introducing or removing information to stabilize an algorithm’s understanding of the data

8
Q

What does the dropout regularization method do?

A

Dropout adds noise to the learning process so that the model is more generalizable

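One common formulation of dropout ("inverted" dropout) can be sketched in NumPy; this is an illustrative version, not the SAS implementation:

```python
import numpy as np

def dropout(activations, rate, rng=np.random.default_rng(0)):
    """Inverted dropout: zero out units with probability `rate` and scale
    the survivors by 1/(1 - rate) so the expected activation is unchanged.
    The randomly zeroed units are the injected noise that makes the
    trained model more generalizable."""
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)

h = np.ones((4, 8))            # a batch of hidden-layer activations
h_thin = dropout(h, rate=0.5)  # surviving units form a "thinned network"
print(h_thin)
```

Each call draws a new mask, so every training pass trains a different thinned network (see the next card).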
9
Q

How can you improve model generalization?

A

Dropout can improve model generalization

10
Q

What is a thinned network?

A

Each time that units are removed (using dropout), the resulting network is referred to as a thinned network

11
Q

What does the batch normalization regularization method do?

A

The batch normalization operation normalizes the data passed between layers in a neural network, preventing large input values to the combination function that can lead to overfitting of the data.
It normalizes the information back to the linear region of the sigmoid, which is a safe output region, and brings the weight values back to a familiar range.

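The core normalize-scale-shift step of batch normalization can be sketched in NumPy (omitting the running statistics used at inference time); names and shapes here are illustrative:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the mini-batch to zero mean and unit
    variance, then apply the learned scale (gamma) and shift (beta).
    This keeps the values passed to the next layer in a familiar range."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# One feature on a small scale, one on a large scale
batch = np.array([[1.0, 200.0], [3.0, 400.0], [5.0, 600.0]])
out = batch_norm(batch)
print(out.mean(axis=0), out.std(axis=0))  # ~0 and ~1 per feature
```

Note that both features end up on the same scale regardless of their raw magnitudes, which is why weight initialization matters less when batch normalization is used (next card).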
12
Q

Why do weight initializations have less impact on model performance if batch normalization is used?

A

batch normalization standardizes information that is passed between hidden layers

13
Q

When should you use GPUs instead of CPUs in Deep Learning?

A

The use of GPUs should be reserved for larger neural networks. The difference in performance between CPUs and GPUs is negligible in neural networks with a small number of parameters.

14
Q

Why are GPUs effective when modeling neural networks?

A

GPUs are designed to perform many operations in parallel

15
Q

What does Loss refer to in Neural Network output?

A

loss specifies the training error function value

16
Q

What does Validation Loss refer to in Neural Network output?

A

specifies the validation error function value

17
Q

What does Validation Error refer to in Neural Network output?

A

the validation misclassification rate

18
Q

What does Fit Error refer to in Neural Network output?

A

the misclassification rate for the training data

19
Q

What does a sudden increase in the L1 or L2 norm values indicate?

A

Overfitting (because the weights in the model must be getting really large to push those values up)
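The norms on this card are simple to monitor; a minimal NumPy sketch (the helper name is my own):

```python
import numpy as np

def weight_norms(weights):
    """L1 norm (sum of absolute values) and L2 norm (Euclidean length)
    of a flattened weight array; a sudden jump in either value during
    training suggests the weights are blowing up (overfitting)."""
    w = np.ravel(weights)
    return np.sum(np.abs(w)), np.sqrt(np.sum(w ** 2))

l1, l2 = weight_norms([[3.0, -4.0], [0.0, 0.0]])
print(l1, l2)  # 7.0 5.0
```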

20
Q

What does Adam optimization do if the signal-to-noise ratio is high?

A

adjusts the step size in order to take larger steps towards error minima

21
Q

What does Adam optimization do if the signal-to-noise ratio is low?

A

adjusts the step size to move more slowly
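The behavior described on the last two cards can be illustrated with a minimal Adam update; this is a generic sketch of the algorithm, not SAS-specific code:

```python
import numpy as np

def adam_step(grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update. m and v are running estimates of the gradient's
    mean (signal) and uncentered variance (noise). When the signal-to-noise
    ratio m_hat / sqrt(v_hat) is high, the effective step approaches the
    full learning rate; when it is low, the update moves more slowly."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)  # bias correction for the running mean
    v_hat = v / (1 - b2 ** t)  # bias correction for the running variance
    step = lr * m_hat / (np.sqrt(v_hat) + eps)
    return step, m, v

# Consistent gradients (high signal-to-noise): near-full-size steps
m = v = 0.0
for t in range(1, 101):
    step_hi, m, v = adam_step(1.0, m, v, t)

# Sign-flipping gradients (low signal-to-noise): much smaller steps
m = v = 0.0
for t in range(1, 101):
    step_lo, m, v = adam_step((-1.0) ** t, m, v, t)

print(step_hi, abs(step_lo))
```

With a constant gradient the step converges to the learning rate (0.001), while the alternating gradient keeps the running mean near zero, so the step stays far smaller despite identical gradient magnitudes.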