Neural Networks Flashcards

1
Q

An AI Neural Network is an ___ paradigm

It is inspired by the way biological nervous systems, such as the brain, ___ information

A

information processing

process

2
Q

AI NN learn by ___, like people

Learning in biological systems involves adjustments to the synaptic ___ that exist between the ___

A

example
connections
neurons

3
Q

NN derive ___ from ___ and ___ data

A

meaning
complicated
imprecise

4
Q
NN characteristics
1- ___
2- ___
3- ___
4- ___
A

A. Adaptive Learning
S. Self-Organization
R. Real Time Operation
F. Fault Tolerance

A Sopa aRreFeceu (Portuguese mnemonic, "the soup went cold": A, S, R, F)

5
Q

A Perceptron is a simple model that consists of a single trainable ___
It receives several ___ and their ___, and has a ___ T (a real value)

A

neuron
inputs
weights
Threshold

6
Q

To train a Perceptron we give it ___ and the ___, then we give it ___ and tell it if it got them right or wrong

A

inputs
desired outputs
examples

7
Q

What if the Perceptron has the wrong output?

A

If the Desired Output is 0 (but the output was 1), we should decrease the weights

If the Desired Output is 1 (but the output was 0), we should increase the weights

8
Q

The decrease in weights of an edge should be ___ to the input through that edge
Meaning that if an input is really high then it should be accountable for ___ of the error of the output

A

directly proportional

most
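
A minimal sketch of this update rule in Python (the learning rate `lr` and the NumPy layout are assumptions, not from the cards):

```python
import numpy as np

def perceptron_train_step(w, T, x, desired, lr=0.1):
    """One hedged sketch of the perceptron update from cards 6-8."""
    output = 1 if np.dot(w, x) > T else 0  # forward pass (see card 23)
    error = desired - output               # +1: increase weights, -1: decrease
    # the change to each weight is proportional to the input on that edge
    return w + lr * error * x
```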

9
Q

Can a Perceptron solve problems that are not linearly separable?

A

No

10
Q

What algorithm do we use to train an MLP?

A

Backpropagation Algorithm

11
Q

In the Backpropagation Algorithm we follow these steps:
1- ___
2- For each training example do a ___
3- Obtain ___ by comparing the result with the ___
4- Do a ___
5- if loss ___ ε, or if loss is still ___ at a reasonable rate, go to 2

A
1- Initialization
2- forward propagation
3- loss (error) / desired output 
4- backward propagation
5- >= / decreasing
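
These five steps map onto a loop like the sketch below; the `network` object and its `initialize`/`forward`/`loss`/`backward` methods are hypothetical placeholders for whatever implementation is in use:

```python
def train(network, data, epsilon, max_epochs=100):
    network.initialize()                           # 1- Initialization
    for epoch in range(max_epochs):
        total_loss = 0.0
        for x, desired in data:
            y = network.forward(x)                 # 2- forward propagation
            total_loss += network.loss(y, desired) # 3- loss vs. desired output
            network.backward(y, desired)           # 4- backward propagation
        if total_loss < epsilon:                   # 5- keep looping while loss >= ε
            break                                  #    (or while still decreasing)
    return network
```
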
12
Q

True or false
The gradient descent method involves calculating the derivative of the loss error function with respect to the weights of the network

A

True
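
In symbols (a standard way to write the update; the learning-rate symbol $\eta$ is not in the card):

$w \leftarrow w - \eta \, \dfrac{\partial L}{\partial w}$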

13
Q

Can we solve any problem with a single hidden layer?

A

Yes

14
Q

In Hidden Layers:
1- Too few neurons can lead to ___ as there are not enough to capture the problem ___
2- Too many neurons can lead to ___ as the information in the training set is not enough to ___ all neurons in the hidden layers
Also there is an ___ increase in training time

A

1- Underfitting / intricate

2- Overfitting / train / Exponential

15
Q

The purpose of the Activation Function is to introduce ___ to artificial NN

A

nonlinear real-world properties

16
Q

What are the most common Activation Functions on MLPs?

A

S. Sigmoid
T. Tanh
R. ReLU

SToR
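
A sketch of the three functions with NumPy:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))  # squashes to (0, 1)

def tanh(x):
    return np.tanh(x)                # squashes to (-1, 1)

def relu(x):
    return np.maximum(0.0, x)        # zero for negatives, identity otherwise
```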

17
Q

1- Large learning rates result in ___ training and ___ results
2- Tiny learning rates ___ the training process and might result in a ___ to train

A

1- unstable / non-optimal

2- lengthen / failure

18
Q

What Hyperparameters exist in MLPs?

A
Inputs
Outputs
Hidden Layers
Activation Function
Learning Rate
19
Q

What are the typical values for the Learning Rate?

A

0.01 to 0.1

20
Q

What are Epochs?

A

One Epoch is one run over the whole training set

21
Q

NN extract ___ and detect ___ that are too ___ for humans or other techs

A

patterns
trends
complex

22
Q

NN can perform tasks that are ___ to humans but ___ for other techs (like ___)

A

trivial
difficult
handwriting recognition

23
Q

In a Perceptron, if the ___ of the ___ multiplied by the respective ___ is greater than ___, then the Output is ___, and ___ otherwise

A
sum
inputs 
weight
T 
1
0
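
In symbols (with inputs $x_i$, weights $w_i$ and the threshold $T$ from card 5):

$y = \begin{cases} 1 & \text{if } \sum_i w_i x_i > T \\ 0 & \text{otherwise} \end{cases}$
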
24
Q

The Backpropagation algorithm works by doing the following:

Given a set of input/output training data, find a set of weights that ___ the loss, using the ___ method

A

minimize

gradient descent

25
Q

We can solve any problem with a single hidden layer, but for ___ problems it might be tricky and highly dependent on the quality of the ___. But if we have too many ___ we may need better ___ and the processing time is much ___

A
complex
training set
layers 
learning algorithms
slower
26
Q

The gradient descent method involves calculating the ___ with respect to the ___ of the network

A

derivative of the loss error function

weights

27
Q

Some rules of thumb for hidden layers are:
1- Size of Input layer ___ Size of Hidden layer ___ Size of Output layer
2- Size of Hidden layer = ___ Size of Input layer + Size of Output layer
3- Size of Hidden layer < ___ Size of Input layer

A

1- > / >
2- 2/3
3- 2x
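
A quick sketch of the heuristics with made-up layer sizes (the numbers are illustrative only):

```python
n_in, n_out = 30, 2                     # hypothetical layer sizes
n_hidden = round(2 / 3 * n_in + n_out)  # rule 2 -> 22
assert n_in > n_hidden > n_out          # rule 1 holds
assert n_hidden < 2 * n_in              # rule 3 holds
```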

28
Q

A Hopfield Net is a ___ NN where neurons are ___ units

A

Recurrent

binary threshold
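
A minimal sketch of one update, assuming ±1 states and a synchronous update (an assumption for brevity; Hopfield nets are often updated one neuron at a time):

```python
import numpy as np

def hopfield_step(W, s):
    """One synchronous update: each neuron thresholds its weighted input."""
    return np.where(W @ s >= 0, 1, -1)  # W is the symmetric weight matrix
```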

29
Q

An Elman network is a ___ NN with ___ inputs, where the output from the previous step is ___ as the input to the current step

A

Recurrent
non-stationary
fed
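
A sketch of the recurrence; the weight names `W_x`, `W_h` and the tanh activation are illustrative assumptions:

```python
import numpy as np

def elman_step(x_t, h_prev, W_x, W_h, b):
    """Hidden state mixes the current input with the previous step's state."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)
```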

30
Q

Some Elman problems are:
1- Training an RNN is a very ___ task
2- It has ___ and ___ problems

A

1- difficult

2- vanishing and exploding

31
Q

Hopfield Net applications are:
1- recalling or reconstructing ___ patterns
2- Image ___

A

1- corrupted

2- detection

32
Q
Elman applications are:
1- ___ the next word in a sentence
2- ___ in time-series
3- ___ in computer networks
4- Human Action ___
A

1- Predicting
2- Anomaly detection
3- Intrusion Detection
4- recognition

33
Q

LSTMs were developed to deal with the RNN ___ problem, allowing them to learn ___ dependencies by ___ longer sequences

A

vanishing gradient
long-term
remembering

34
Q

The Hopfield Network has a capacity of ___ patterns for every ___ nodes

A

138

1000

35
Q

LSTM Cell state allows easy flow of ___ through the subsequent ___ thereby helping preserve ___

A

unchanged information
LSTM cell
context

36
Q

LSTM Forget Gate tells us what information can ___

A

be thrown away

37
Q

LSTM Input Gate tells us what new information ___

A

should be stored

38
Q

LSTM Update Current State ___ the things we decided to forget earlier and adds the ___ values

A

forgets

new candidate

39
Q

LSTM Output Gate ___ and ___ the values to be updated in the ___

A

decides and computes

hidden state
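
Putting cards 35-39 together, a minimal sketch of one LSTM step (the gate weight names, dict layout, and shapes are assumptions, not from the cards):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step; W, U, b are dicts of per-gate parameters keyed 'f','i','g','o'."""
    def z(k):
        return W[k] @ x_t + U[k] @ h_prev + b[k]
    f = sigmoid(z('f'))      # forget gate: what can be thrown away (card 36)
    i = sigmoid(z('i'))      # input gate: what new information to store (card 37)
    g = np.tanh(z('g'))      # new candidate values
    c = f * c_prev + i * g   # update current state: forget, then add candidates (card 38)
    o = sigmoid(z('o'))      # output gate: values that go into the hidden state (card 39)
    h = o * np.tanh(c)       # cell state c flows on to the next cell, preserving context (card 35)
    return h, c
```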

40
Q

LSTM Regularization represents the techniques that ___ the learning algorithm to ___ better

A

modify

generalize

41
Q

Some regularization techniques are:
1- ___
2- ___
3- ___

A

1- Weight Regularization
2- Dropout
3- Early Stopping

42
Q

Weight Regularization technique:

Slows down ___ and ___ of the network by ___ the weights of the nodes

A

learning and overfitting

penalizing
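
For instance, L2 weight regularization adds a penalty on weight size to the loss; the penalty strength `lam` is an assumed hyperparameter:

```python
import numpy as np

def l2_regularized_loss(base_loss, weights, lam=0.01):
    """Penalizing large weights slows learning and discourages overfitting."""
    return base_loss + lam * np.sum(weights ** 2)
```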

43
Q

Dropout regularization technique:
___ of some input and recurrent connections
in order to prevent some ___ and ___ during the
learning process (slowing the learning process)

A

Probabilistic removal

activations and weight updates
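
A sketch of (inverted) dropout on a layer's activations; the drop probability `p` is an assumed hyperparameter:

```python
import numpy as np

def dropout(activations, p=0.5):
    """Randomly zero activations; scale survivors so the expected value is unchanged."""
    mask = np.random.rand(*activations.shape) >= p  # keep with probability 1 - p
    return activations * mask / (1.0 - p)
```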

44
Q

Early Stopping regularization technique:
___ strategy which stops training
when performance on the validation set is not ___ after a certain iteration

A

Cross-validation

improving
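
A sketch of the stopping rule; the `patience` parameter (how many epochs to wait without improvement) is an assumption standing in for "after a certain iteration", and the two callables are placeholders:

```python
def early_stopping_train(train_epoch, validate, patience=5, max_epochs=200):
    best, waited = float('inf'), 0
    for epoch in range(max_epochs):
        train_epoch()
        val_loss = validate()       # performance on the validation set
        if val_loss < best:
            best, waited = val_loss, 0
        else:
            waited += 1
            if waited >= patience:  # no improvement for `patience` epochs
                break               # stop training
    return best
```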

45
Q

LSTM is mostly used for ___ recognition and ___ recognition

A

handwriting and speech

46
Q

GRU is a simplified ___ with fewer ___ and no ___

A

LSTM
parameters
output gate

47
Q

CNN have become the standard for computer ___ tasks

A

vision

48
Q

NN are ___ as it is impossible to understand why they produce their results

A

black boxes

49
Q

NN should be used :
1- When a large amount of ___ is available and one cannot formulate an ___ solution
2- When continuous learning from ___ is important
3- When there is no need to extract ___ from the results

A

1- examples / algorithmic
2- previous results
3- knowledge

50
Q

CNN is a way of incorporating invariance arising from ___, ___ and ___ into the NN model

A

scaling
translation
rotation

51
Q

CNN exploits the strong correlation between ___ pixels

A

neighboring

52
Q

CNN has local ___ fields, ___ sharing and ___

A

receptive
weight
subsampling

53
Q

CNN is composed of ___ and ___ layers, followed by an ___ layer that is task ___

A

convolutional and subsampling
output
dependent

54
Q

The CNN is organized into planes, each known as ___

A

feature map

55
Q

Each feature map in CNN is composed of ___

A

units

56
Q

Units in CNN receive inputs from a small subregion of the input image, known as its ___

A

receptive field

57
Q

In a CNN all units of a feature map share the same weight matrix (___)

A

kernel

58
Q

Convolution is the operation where the ___ slides along the input image (stride), computing the value of a ___ in the feature map for each ___

A

kernel
unit
movement
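
A naive sketch of the operation for a single channel, stride 1 and no padding (all assumptions for brevity):

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image; each position yields one feature-map unit."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i:i + kh, j:j + kw]   # the unit's receptive field
            out[i, j] = np.sum(patch * kernel)  # shared weights (the kernel)
    return out
```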

59
Q

In CNN, subsampling is the process of reducing a LxL patch of the feature map into a ___

A

single number
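
For example, max pooling (one common choice; averaging is another) reduces each non-overlapping L×L patch to its maximum:

```python
import numpy as np

def max_pool(feature_map, L=2):
    """Reduce each non-overlapping LxL patch to a single number."""
    h, w = feature_map.shape[0] // L, feature_map.shape[1] // L
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = feature_map[i*L:(i+1)*L, j*L:(j+1)*L].max()
    return out
```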