Neural Networks Flashcards

1
Q

sequential environment

A

agent’s current action affects future actions

2
Q

episodic environment

A

agent’s current action does not affect future actions
output due to percept at time step n is independent of percept and output at time step n+1

3
Q

feedforward neural network

A

an FNN is a directed acyclic graph whose internal state depends only on its current input (the model processes each input independently)
information moves in one direction and there are no cycles or loops
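
A minimal sketch of one forward pass through such a network, assuming NumPy and illustrative layer sizes (the names fnn_forward, W1, W2 are not from the card):

```python
import numpy as np

def fnn_forward(x, W1, b1, W2, b2):
    """One pass through a two-layer feedforward network:
    input -> hidden -> output, with no cycles or loops."""
    h = np.tanh(W1 @ x + b1)   # hidden layer activation
    return W2 @ h + b2         # output layer; no recurrence

# each input is processed independently of every other input
rng = np.random.default_rng(0)
x = rng.normal(size=3)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)
print(fnn_forward(x, W1, b1, W2, b2))
```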

4
Q

FNN example

A

MLP (multilayer perceptron), i.e. a fully connected network

5
Q

RNN

A

type of neural network specifically designed for processing sequential data

directed graph with cycles
internal state of network depends on its previous state
model exhibits dynamic temporal behaviour
uses output of hidden layer from previous timestep as additional input
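
in the usual notation (the symbols below are a common convention, not from the card):
h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)
y_t = W_hy h_t + b_y
the W_hh term is what feeds the previous hidden state back in as additional input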

6
Q

RNN example

A

single hidden layer network with weights connecting the output of the hidden layer back to itself with a one-time-unit delay
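
A minimal NumPy sketch of that recurrence (function and weight names are illustrative assumptions):

```python
import numpy as np

def elman_forward(xs, W_xh, W_hh, b_h, h0):
    """Single-hidden-layer RNN: the hidden layer's previous output
    re-enters through W_hh with a one-timestep delay."""
    h, hs = h0, []
    for x in xs:                                # one timestep at a time
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)  # depends on x_t AND h_{t-1}
        hs.append(h)
    return hs

rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 3))                    # 5 timesteps, 3 features each
hs = elman_forward(xs, rng.normal(size=(4, 3)),
                   0.5 * rng.normal(size=(4, 4)), np.zeros(4), np.zeros(4))
```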

7
Q

temporal patterns

A

data that exhibit sequences/patterns over time
for sequential data problems, temporal patterns are critical for learning dependencies between data points across time

8
Q

RNN and temporal dependencies

A

RNN captures temporal dependencies by maintaining a hidden state, allowing the model to store important past info

9
Q

RNN architecture and how it’s trained

A

processes sequential data by maintaining information about previous inputs through a recurrent hidden state. This gives RNNs a “memory” that captures dependencies over time

input layer: each element in the sequence is fed to the network one timestep at a time

hidden layer: has recurrent connections that allow it to use the previous hidden state along with the current input to capture dependencies over time

output layer: output depends on the task (e.g. one prediction per timestep or a single final prediction)

training calculates the difference between the predicted output and the actual output (the loss) and backpropagates it using BPTT
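
A sketch of that pipeline for a per-timestep classification task (softmax output and cross-entropy loss are assumed task choices, not stated on the card):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def rnn_sequence_loss(xs, targets, params):
    """Forward pass over a whole sequence plus summed cross-entropy
    loss; BPTT would then push this error back through every step."""
    W_xh, W_hh, W_hy, b_h, b_y = params
    h = np.zeros(W_hh.shape[0])
    loss = 0.0
    for x, t in zip(xs, targets):
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)  # recurrent hidden layer
        p = softmax(W_hy @ h + b_y)             # output layer
        loss -= np.log(p[t])                    # cross-entropy for target t
    return loss
```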

10
Q

BPTT and how it works

A

backpropagation through time
same as BP, but the RNN is “unrolled” across all timesteps in the sequence

the predicted output for the next word is compared to the actual next word using a loss function
gradients of the loss function with respect to the (shared) weights are computed for each time step
the error is propagated back through all of the time steps of the network
using the accumulated gradients from BPTT, the optimizer updates the weights
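
A compact sketch of that backward pass for the Elman-style network above, assuming the forward pass cached the inputs xs, hidden states hs, and softmax outputs ps (all names are illustrative):

```python
import numpy as np

def bptt(xs, targets, params, hs, ps):
    """Accumulate gradients over all timesteps for one sequence;
    every step's gradient is summed into the same shared weights."""
    W_xh, W_hh, W_hy, b_h, b_y = params
    dW_xh, dW_hh, dW_hy, db_h, db_y = [np.zeros_like(p) for p in params]
    dh_next = np.zeros_like(hs[0])
    for t in reversed(range(len(xs))):          # walk the unrolled graph backwards
        dy = ps[t].copy()
        dy[targets[t]] -= 1.0                   # softmax + cross-entropy gradient
        dW_hy += np.outer(dy, hs[t]); db_y += dy
        dh = W_hy.T @ dy + dh_next              # error from output AND the future
        dz = (1.0 - hs[t] ** 2) * dh            # back through tanh
        dW_xh += np.outer(dz, xs[t]); db_h += dz
        dW_hh += np.outer(dz, hs[t - 1] if t > 0 else np.zeros_like(hs[0]))
        dh_next = W_hh.T @ dz                   # pass error one timestep further back
    return dW_xh, dW_hh, dW_hy, db_h, db_y
```

The repeated multiplication by W_hh in the last line of the loop is exactly where vanishing and exploding gradients come from (see the next cards).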

11
Q

long distance dependencies and BPTT

A

BPTT can face vanishing gradients (the gradient becomes very small and learning stalls) and exploding gradients (the gradient becomes excessively large and destabilises training) when sequences are long
it is hard to train earlier layers with error from later layers
for example: Mary grew up in China………… She speaks Mandarin

12
Q

eigen values and BPTT

A

eigenvalues of the recurrent weight matrix indicate whether the signal (and hence the backpropagated gradient) will explode (grow very large) or vanish (shrink towards zero) over time

|λ| > 1: gradient will explode (commonly mitigated by gradient clipping)
|λ| = 1: gradient will propagate nicely
|λ| < 1: gradient will vanish
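
A toy NumPy demo of why: error flowing back through T timesteps is multiplied by (roughly) the recurrent matrix T times, so the largest eigenvalue magnitude decides its fate (the diagonal matrix is an illustrative stand-in):

```python
import numpy as np

for lam, label in [(1.2, "explodes"), (1.0, "propagates"), (0.8, "vanishes")]:
    W = np.diag([lam, lam])              # eigenvalues are exactly lam
    v = np.ones(2)                       # stand-in for a gradient signal
    for _ in range(50):                  # 50 timesteps of backpropagation
        v = W @ v
    print(f"|eigenvalue| = {lam}: norm after 50 steps = "
          f"{np.linalg.norm(v):.2e} ({label})")
```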

13
Q

RNN and sequential environments

A

RNN is ideal for tasks where the order of the data is important
allows the network to consider context and dependencies between inputs that occur over time
RNNs enable learning in environments with temporal sequences by capturing relationships between inputs over time
allowing the model to adapt its predictions based on past experience

14
Q

perplexity

A

evaluation metric for a language model
intuitively: how surprised is the model by the words it actually sees?
we want models that compute high probability for the corpus (the entire set of language data to be analysed) and therefore low perplexity (they have an inverse relationship)
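
As a formula, perplexity is the inverse corpus probability normalised by length, PP = P(w_1 … w_N)^(−1/N), i.e. the exponential of the average negative log-likelihood per token; a minimal sketch (the toy probabilities are made up):

```python
import numpy as np

def perplexity(log_probs):
    """Perplexity from the per-token log-probabilities a language
    model assigns: exp of the mean negative log-likelihood."""
    return float(np.exp(-np.mean(log_probs)))

# a model that gives every token probability 0.25 has perplexity 4
print(perplexity(np.log([0.25, 0.25, 0.25, 0.25])))  # -> 4.0
```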

15
Q

RNN Usage

A

1:1 image classification
1:m text generation
m:1 emotion classification
m:m translation
