Neural Networks Flashcards
An AI Neural Network is an ___ paradigm
It is inspired by the way biological nervous systems, such as the brain, ___ information
information processing
process
AI NN learn by ___, like people
Learning in biological systems involves adjustments to the synaptic ___ that exist between the ___
example
connections
neurons
NN derive ___ from ___ and ___ data
meaning
complicated
imprecise
NN characteristics 1- ___ 2- ___ 3- ___ 4- ___
A. Adaptive Learning
S. Self-Organization
R. Real Time Operation
F. Fault Tolerance
A Sopa RreFeceu
A Perceptron is a simple model that consists of a single trainable ___
It receives several ___ and their ___, and has a ___ T (a real value)
neuron
inputs
weights
Threshold
To train a Perceptron we give it ___ and the ___; then we present ___ and tell it whether it got them right or wrong
inputs
desired outputs
examples
What if the Perceptron produces the wrong output?
If the Desired Output is 0, we should decrease the weights
If the Desired Output is 1, we should increase the weights
The decrease in the weight of an edge should be ___ to the input through that edge
Meaning that if an input is very high, it should be accountable for ___ of the output error
directly proportional
most
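A minimal Python sketch of this update rule (the AND-gate data, learning rate and threshold value are illustrative assumptions, not from the cards):

```python
# Perceptron sketch: threshold output plus the proportional update rule.
# The AND-gate data, learning rate and threshold T are illustrative assumptions.

def perceptron_output(inputs, weights, T):
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1 if s > T else 0          # fire only above the threshold

def train_step(inputs, weights, T, desired, lr=0.1):
    out = perceptron_output(inputs, weights, T)
    if out != desired:
        error = desired - out         # +1: should increase; -1: should decrease
        # change each weight in direct proportion to the input on that edge:
        # high inputs are accountable for most of the error, so they move most
        weights = [w + lr * error * x for x, w in zip(inputs, weights)]
    return weights

weights = [0.0, 0.0]
for _ in range(20):                   # learns the AND function
    for inputs, desired in [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]:
        weights = train_step(inputs, weights, T=0.5, desired=desired)
```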
Can a Perceptron solve problems that are not linearly separable?
No
What algorithm do we use to train a MLP?
Backpropagation Algorithm
In the Backpropagation Algorithm we follow these steps:
1- ___
2- For each training example do a ___
3- Obtain ___ by comparing the result with the ___
4- Do a ___
5- If loss ___ ℇ, or if loss is still ___ at a reasonable rate, go to 2
1- Initialization 2- forward propagation 3- loss (error) / desired output 4- backward propagation 5- >= / decreasing
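A compact sketch of the five steps on a tiny one-hidden-layer MLP (numpy; the XOR data, layer sizes, learning rate and epsilon are all assumptions):

```python
import numpy as np

# Sketch of the five steps on a one-hidden-layer MLP learning XOR.
# The data, sizes, learning rate and epsilon are illustrative assumptions.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(2, 4)), rng.normal(size=(4, 1))  # 1- initialization
lr, eps = 0.5, 1e-3

for epoch in range(20000):
    H = sigmoid(X @ W1)                  # 2- forward propagation
    out = sigmoid(H @ W2)
    loss = np.mean((Y - out) ** 2)       # 3- loss vs. the desired output
    if loss < eps:                       # 5- stop when loss drops below epsilon
        break
    d_out = (out - Y) * out * (1 - out)  # 4- backward propagation: error signals
    d_H = (d_out @ W2.T) * H * (1 - H)   #    (constant factors folded into lr)
    W2 -= lr * H.T @ d_out               #    gradient descent weight updates
    W1 -= lr * X.T @ d_H                 # ...then back to step 2
```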
True or false
The gradient descent method involves calculating the derivative of the loss error function with respect to the weights of the network
True
Can we solve any problem with a single hidden layer?
Yes
In Hidden Layers:
1- Too few neurons can lead to ___ as there are not enough to capture the problem's ___
2- Too many neurons can lead to ___ as the information in the training set is not enough to ___ all the neurons in the hidden layers
Also, there is an ___ increase in training time
1- Underfitting / intricacies
2- Overfitting / train / exponential
The purpose of the Activation Function is to introduce ___ to artificial NN
nonlinear real-world properties
What are the most common Activation Functions on MLPs?
S. Sigmoid
T. Tanh
R. ReLU
SToR
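For reference, the three activations written out in Python (standard definitions, shown as a sketch):

```python
import numpy as np

# The three common MLP activations (standard definitions, shown as a sketch).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # squashes values into (0, 1)

def tanh(z):
    return np.tanh(z)                 # squashes into (-1, 1), zero-centered

def relu(z):
    return np.maximum(0.0, z)         # zero for negatives, identity otherwise
```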
1- Large learning rates result in ___ training and ___ results
2- Tiny learning rates ___ the training process and might result in a ___ to train
1- unstable / non-optimal
2- lengthen / failure
What Hyperparameters exist in MLPs?
Inputs, Outputs, Hidden Layers, Activation Function, Learning Rate
What are the typical values for the Learning Rate?
0.01 to 0.1
What are Epochs?
One Epoch is one run over the whole training set
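A minimal sketch of the loop this describes (the toy data and epoch count are illustrative):

```python
# One epoch = one full pass over the whole training set.
# The toy data and epoch count are illustrative assumptions.
training_set = [([0.0, 1.0], 1), ([1.0, 0.0], 0)]

for epoch in range(10):                  # 10 epochs = 10 passes over the data
    for inputs, target in training_set:  # every example is seen once per epoch
        pass                             # forward pass, loss and update go here
```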
NN extract ___ and detect ___ that are too ___ for humans or other techniques
patterns
trends
complex
NN can perform tasks that are ___ to humans but ___ for other techniques (like ___)
trivial
difficult
handwriting recognition
In a Perceptron, if the ___ of the ___ multiplied by the respective ___ is greater than ___, then the Output is ___, and ___ otherwise
sum / inputs / weights / T / 1 / 0
The Backpropagation algorithm works by doing the following:
Given a set of input/output training data, find a set of weights that ___ the error, using the ___ method
minimize
gradient descent
We can solve any problem with a single hidden layer, but for ___ problems it might be tricky and highly dependent on the quality of the ___. But if we have too many ___ we may need better ___ and the processing time is much ___
complex / training set / layers / learning algorithms / slower
The gradient descent method involves calculating the ___ with respect to the ___ of the network
derivative of the loss error function
weights
Some rules of thumb for hidden layers are:
1- Size of Input layer ___ Size of Hidden layer ___ Size of Output layer
2- Size of Hidden layer = ___ × Size of Input layer + Size of Output layer
3- Size of Hidden layer < ___ Size of Input layer
1- > / >
2- 2/3
3- 2x
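Plugging an assumed 30-input, 3-output network into these rules (a sketch):

```python
# Applying the three rules of thumb to an assumed network: 30 inputs, 3 outputs.
n_in, n_out = 30, 3

rule1 = f"{n_out} < hidden < {n_in}"   # 1- input size > hidden size > output size
rule2 = (2 / 3) * n_in + n_out         # 2- hidden = 2/3 * input + output -> 23
rule3 = 2 * n_in                       # 3- hidden < 2x input size        -> 60

print(rule1, f"~{rule2:.0f} neurons", f"< {rule3} neurons")
```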
A Hopfield Net is a ___ NN where neurons are ___ units
Recurrent
binary threshold
Elman is a ___ NN with ___ inputs where the output from the previous step is ___ as the input to the current step
Recurrent
non-stationary
fed
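A one-step sketch of that feedback loop in numpy (the sizes and the tanh nonlinearity are assumptions):

```python
import numpy as np

# One Elman-style recurrence: the previous hidden state is fed back as input
# to the current step. Sizes and the tanh nonlinearity are assumptions.
rng = np.random.default_rng(0)
W_x = rng.normal(size=(3, 5))       # input -> hidden weights
W_h = rng.normal(size=(5, 5))       # previous hidden -> hidden (the feedback)

h = np.zeros(5)
for x in rng.normal(size=(4, 3)):   # a short sequence of 4 input vectors
    h = np.tanh(x @ W_x + h @ W_h)  # current state depends on the previous one
```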
Some Elman problems are:
1- Training an RNN is a very ___ task
2- It has ___ and ___ gradient problems
1- difficult
2- vanishing and exploding
Hopfield Nets applications are:
1- recalling or reconstructing ___ patterns
2- Image ___
1- corrupted
2- detection
Elman applications are: 1- ___ the next word in a sentence 2- ___ in time-series 3- ___ in computer networks 4- Human Action ___
1- Predicting
2- Anomaly detection
3- Intrusion Detection
4- recognition
LSTMs were developed to deal with the RNN ___ problem, allowing them to learn ___ dependencies by ___ longer sequences
vanishing gradient
long-term
remembering
The Hopfield Network has a capacity of ___ patterns for every ___ nodes
138
1000
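A sketch of storage and recall with binary threshold updates (one Hebbian pattern; all details beyond the cards are assumptions):

```python
import numpy as np

# Store one +/-1 pattern with Hebbian weights, then recall it from a corrupted
# copy via binary threshold updates. Capacity is ~138 patterns per 1000 nodes
# (about 0.138 N). Pattern and update details here are illustrative assumptions.
pattern = np.array([1, -1, 1, 1, -1, 1, -1, -1])
W = np.outer(pattern, pattern).astype(float)
np.fill_diagonal(W, 0)                        # no self-connections

state = pattern.copy()
state[:2] *= -1                               # corrupt two bits
for _ in range(5):
    state = np.where(W @ state >= 0, 1, -1)   # each neuron thresholds its input

print(np.array_equal(state, pattern))         # True: the pattern is recalled
```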
LSTM Cell state allows easy flow of ___ through the subsequent ___, thereby helping preserve ___
unchanged information
LSTM cell
context
LSTM Forget Gate tells us what information can ___
be thrown away
LSTM Input Gate tells us what new information ___
should be stored
LSTM Update Current State ___ the things decided to forget earlier and adds the ___ values
forgets
new candidate
LSTM Output Gate ___ and ___ values to be updated in the ___
decides and computes
hidden state
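The four gate cards above map onto the standard LSTM cell equations; a numpy sketch of one step (sizes and weight names are assumptions, biases omitted):

```python
import numpy as np

# One LSTM cell step following the gate cards above. Sizes, weight shapes and
# the concatenated [h, x] input are assumptions; biases are omitted.
rng = np.random.default_rng(0)
n = 4                                   # hidden/cell size
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
Wf, Wi, Wc, Wo = (rng.normal(size=(2 * n, n)) for _ in range(4))

h, c = np.zeros(n), np.zeros(n)         # hidden state and cell state
x = rng.normal(size=n)                  # current input
z = np.concatenate([h, x])              # gates see [previous hidden, input]

f = sigmoid(z @ Wf)                     # forget gate: what can be thrown away
i = sigmoid(z @ Wi)                     # input gate: what new info to store
c_tilde = np.tanh(z @ Wc)               # new candidate values
c = f * c + i * c_tilde                 # update: forget, then add candidates
o = sigmoid(z @ Wo)                     # output gate: decides what to expose
h = o * np.tanh(c)                      # new hidden state (preserves context)
```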
LSTM Regularization represents the techniques that ___ the learning algorithm to ___ better
modify
generalize
Some regularization techniques are:
1- ___
2- ___
3- ___
1- Weight Regularization
2- Dropout
3- Early Stopping
Weight Regularization technique:
Slows down ___ and ___ of the network by ___ the weights of the nodes
learning and overfitting
penalizing
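A sketch of how the penalty enters a weight update, assuming an L2 penalty (the cards do not fix the penalty form or coefficient):

```python
import numpy as np

# L2 weight regularization: add lam * sum(W**2) to the loss, so every update
# also shrinks the weights toward zero. lam, lr and the gradient stand-in are
# illustrative assumptions.
lam, lr = 0.01, 0.1
W = np.random.default_rng(0).normal(size=(4, 2))
grad = np.ones_like(W)              # stand-in for the gradient of the data loss

W -= lr * (grad + 2 * lam * W)      # the 2*lam*W term penalizes large weights
```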
Dropout regularization technique:
___ of some input and recurrent connections, in order to prevent some ___ and ___ during the learning process (slowing it down)
Probabilistic removal
activations and weight updates
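A sketch of the probabilistic removal on a vector of activations (the keep probability and inverted scaling are assumptions):

```python
import numpy as np

# Dropout: probabilistically remove activations during training so the
# corresponding weight updates are skipped. The keep probability p and the
# inverted scaling by 1/p are assumptions.
rng = np.random.default_rng(0)
p = 0.5                                  # probability of keeping each unit
activations = rng.normal(size=8)

mask = rng.random(8) < p                 # randomly drop roughly half the units
dropped = activations * mask / p         # scale survivors to keep expectations
```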
Early Stopping regularization technique:
___ strategy which stops training when performance on the validation set is not ___ after a certain iteration
Cross-validation
improving
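A sketch of the stopping check with a patience counter (the counter and toy losses are illustrative assumptions):

```python
# Early stopping: halt training when the validation loss stops improving.
# The patience counter and the toy loss curve are illustrative assumptions.
val_losses = [0.9, 0.7, 0.6, 0.55, 0.56, 0.57, 0.58]
best, patience, waited = float("inf"), 2, 0

for it, loss in enumerate(val_losses):
    if loss < best:
        best, waited = loss, 0            # validation still improving
    else:
        waited += 1
        if waited >= patience:            # no improvement for `patience` steps
            print(f"stop at iteration {it}, best validation loss {best}")
            break
```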
LSTM is mostly used for ___ recognition and ___ recognition
handwriting and speech
GRU is a simplified ___ with fewer ___ and no ___
LSTM
parameters
output gate
CNN have become the standard for computer ___ tasks
vision
NN are ___ as it is impossible to understand why they produce their results
black boxes
NN should be used:
1- When a large amount of ___ is available and one cannot formulate an ___ solution
2- When continuous learning from ___ is important
3- When there is no need to extract ___ from the results
1- examples / algorithmic
2- previous results
3- knowledge
CNN is a way of incorporating invariance to ___, ___ and ___ into the NN model
scaling
translation
rotation
CNN exploits the strong correlation between ___ pixels
neighboring
CNN has local ___ fields, ___ sharing and ___
receptive
weight
subsampling
CNN is composed of ___ and ___ layers, followed by an ___ layer that is task-___
convolutional and subsampling
output
dependent
The CNN is organized into planes, each known as a ___
feature map
Each feature map in CNN is composed of ___
units
Units in a CNN receive inputs from a small subregion of the input image, known as its ___
receptive field
In a CNN all units of a feature map share the same weight matrix (___)
kernel
Convolution is the operation where the ___ slides along the input image (in steps given by the stride), computing the value of a ___ in the feature map for each ___
kernel
unit
movement
In CNN, subsampling is the process of reducing a LxL patch of the feature map into a ___
single number
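A sketch of both operations on a toy image (the sizes, kernel values and average pooling choice are assumptions):

```python
import numpy as np

# A 2x2 kernel slides over a 6x6 image (stride 1), producing a 5x5 feature map;
# each 2x2 patch of that map is then averaged down to a single number.
# Sizes, kernel values and the use of average pooling are assumptions.
image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])  # shared weights of one feature map

fmap = np.zeros((5, 5))
for i in range(5):
    for j in range(5):
        # each unit sees only a small patch of the image: its receptive field
        fmap[i, j] = np.sum(image[i:i + 2, j:j + 2] * kernel)

L = 2                                         # subsampling: LxL patch -> 1 number
sub = fmap[:4, :4].reshape(2, L, 2, L).mean(axis=(1, 3))
```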