class 14 15 Flashcards
What are sequential domains?
Learning in sequential domains is different from learning in static domains. In static domains each sample is independent and identically distributed. In sequential domains there is a dependency among the points of a sequence: an instance at time t depends on the instance at time t-1, so overall the instances at different time steps are dependent on one another.
Explain what static data and sequential data are in detail.
Static data: learn the probability distribution of the output given the input.
P(o|x): we learn a probability distribution because it is robust against noise.
Type of x: a fixed-size tuple.
o: a classification or regression target.
Sequential data: P(o|x)
Type of x: a sequence x(1), x(2), ..., x(t), where each x(t) has a static type.
o: can be either static or a sequence.
What is a sequence, what is sequential transduction?
A sequence is either empty or an ordered pair (h, t), where the head h is a vertex (element) and the tail t is a sequence.
Sequential transduction:
Let X and O be the input and output label spaces. A transduction transforms an input sequence into an output sequence.
A general transduction T is a subset of X* x O*.
Restricting it to a function: T: X* -> O*.
A transduction T(.) is algebraic if it has a limited memory.
If the transduction has a finite (limited) memory k, an instance at time t can only depend on times t, t-1, t-2, ..., t-k.
But sequences can have variable length, therefore we need a fixed-size window: every time you make a prediction, you make it from that window.
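A minimal sketch of such a fixed-window (algebraic) predictor in Python; the window size k and the predict function are illustrative placeholders, not something from the lecture:

```python
import numpy as np

def windowed_predictions(x, predict, k=3):
    """Algebraic (finite-memory) transduction: each output o(t) is computed
    only from the window x(t-k) ... x(t).
    `predict` is any hypothetical function mapping a fixed-size window to one output."""
    outputs = []
    for t in range(len(x)):
        # pad the start of the sequence so the window always has k+1 entries
        window = [x[max(i, 0)] for i in range(t - k, t + 1)]
        outputs.append(predict(np.array(window)))
    return outputs

# usage: a toy "predictor" that just averages the window
seq = [1.0, 2.0, 3.0, 4.0, 5.0]
print(windowed_predictions(seq, predict=np.mean, k=2))
```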
What is causality?
What is recursive state representation?
A transduction T(.) is causal if the output at time t does not depend on future inputs at times t+1, t+2, ...
Recursive State Representation:
A recursive state representation exists only if the transduction T is causal.
The output depends on hidden state variables (state space H):
h(t) = f(h(t-1), x(t), t)
o(t) = g(h(t), x(t), t)
f: H x X -> H
g: H x X -> O
The transduction T is stationary if f(.) and g(.) do not depend on t.
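A small Python sketch of a stationary, causal recursive state representation; the concrete f, g, and initial state h0 below are toy assumptions chosen just to show the update pattern:

```python
def run_transduction(xs, f, g, h0):
    """Stationary, causal recursive state representation:
    h(t) = f(h(t-1), x(t)),  o(t) = g(h(t), x(t)).
    f and g are arbitrary user-supplied functions (assumptions here)."""
    h = h0
    outputs = []
    for x in xs:
        h = f(h, x)              # state update: depends only on the past
        outputs.append(g(h, x))  # output from current state and input
    return outputs

# usage: running sum as the state, output = current state
xs = [1, 2, 3, 4]
print(run_transduction(xs, f=lambda h, x: h + x, g=lambda h, x: h, h0=0))
# -> [1, 3, 6, 10]
```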
Time Shift Operator and Graphical Description
q^-1 is the time-shift operator: applying q^-1 to h(t) gives h(t-1), i.e. a dependency one time step backward.
Graphical description: x(t) feeds into the hidden state h(t), which has a q^-1 self-loop, and h(t) feeds into the output o(t).
Time Unfolding
The unfolded network has a feed-forward structure. Weights are shared (replicated), meaning that the same weights are used at different time steps.
Examples of Sequential Transductions
Sequence classification (n->1), I-O transduction (n->n), sequence generation (1->n), sequence transduction (n->m)
(mappings between input and output objects)
RECURRENT NEURAL NETWORKS
1)Shallow Recurrent Neural Networks
Shallow recurrent neural networks are non-linear.
h(t) = f(U x(t) + W h(t-1) + b)
o(t) = g(V h(t) + c)
With a tanh hidden layer and a softmax output:
h(t) = tanh(U x(t) + W h(t-1) + b)
o(t) = V h(t) + c
y(t) = softmax(o(t))
Loss = Σ_t -log p_model(y(t) | {x(1), ..., x(t)})
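A hedged numpy sketch of this shallow RNN forward pass and loss; the dimensions, random initialization, and toy data are assumptions made only for illustration (note that the same U, W, V are reused at every time step, i.e. the shared weights from the unfolding view):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy dimensions (assumptions, not from the slides)
n_in, n_hid, n_out, T = 4, 8, 3, 5

U = rng.normal(scale=0.1, size=(n_hid, n_in))   # input -> hidden
W = rng.normal(scale=0.1, size=(n_hid, n_hid))  # hidden -> hidden (shared over time)
V = rng.normal(scale=0.1, size=(n_out, n_hid))  # hidden -> output
b = np.zeros(n_hid)
c = np.zeros(n_out)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

xs = rng.normal(size=(T, n_in))        # input sequence x(1)..x(T)
targets = rng.integers(0, n_out, T)    # one class label per time step

h = np.zeros(n_hid)
loss = 0.0
for t in range(T):
    h = np.tanh(U @ xs[t] + W @ h + b)   # h(t) = tanh(U x(t) + W h(t-1) + b)
    o = V @ h + c                        # o(t) = V h(t) + c
    y = softmax(o)                       # y(t) = softmax(o(t))
    loss += -np.log(y[targets[t]])       # Loss = Σ_t -log p_model(y(t) | x(1..t))
print(loss)
```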
Some Architectural features for RNN
1) Shortcut connections:
The input is also connected directly to the output (the output depends not only on the hidden state but also on the input):
o(t) = V h(t) + V' x(t) + b, with separate weight matrices for the hidden state and the input.
2) Higher order:
The hidden representation has connections through q^-1 and q^-2 (i.e. to h(t-1) and h(t-2)):
h(t) = W(1) h(t-1) + W(2) h(t-2) + V x(t) + c
3) feedback from output:
The output of the previous time step is fed into the next time step's hidden state:
h(t) = U x(t) + W h(t-1) + Z o(t-1) + c
4) Teacher forcing:
The previous time step's target value (instead of the model's own output) is fed into the next time step's hidden layer.
All of these are causal transductions, which means the output depends only on the past. But:
5) Bidirectional RNN: the possibility to look at the future.
What is a bidirectional RNN and how does it differ from the other RNN variants?
In a bidirectional RNN the hidden representation takes both past and future inputs into account, therefore the transduction is no longer causal, since the output depends on future inputs.
Forward (past) state: h_p(t) = U x(t) + W_p h_p(t-1)
Backward (future) state: h_f(t) = U x(t) + W_f h_f(t+1)
The output at time t combines both states. Example application: DNA sequences, where the context on both sides of a position matters.
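A rough numpy sketch of the two hidden chains of a bidirectional RNN; the matrix names (Up, Wp, Uf, Wf), the tanh nonlinearity, and the concatenation of the two states are assumptions for illustration:

```python
import numpy as np

def bidirectional_hidden(xs, Up, Wp, Uf, Wf):
    """Two hidden chains of a bidirectional RNN:
    forward:  hp(t) = tanh(Up x(t) + Wp hp(t-1))   # looks at the past
    backward: hf(t) = tanh(Uf x(t) + Wf hf(t+1))   # looks at the future
    The output at time t can then combine hp(t) and hf(t)."""
    T, n_hid = len(xs), Wp.shape[0]
    hp = np.zeros((T, n_hid))
    hf = np.zeros((T, n_hid))
    prev = np.zeros(n_hid)
    for t in range(T):                     # left-to-right pass over the past
        prev = np.tanh(Up @ xs[t] + Wp @ prev)
        hp[t] = prev
    nxt = np.zeros(n_hid)
    for t in reversed(range(T)):           # right-to-left pass over the future
        nxt = np.tanh(Uf @ xs[t] + Wf @ nxt)
        hf[t] = nxt
    return np.concatenate([hp, hf], axis=1)

# usage with toy shapes
rng = np.random.default_rng(0)
xs = rng.normal(size=(6, 4))
Up, Uf = rng.normal(size=(5, 4)), rng.normal(size=(5, 4))
Wp, Wf = rng.normal(size=(5, 5)), rng.normal(size=(5, 5))
print(bidirectional_hidden(xs, Up, Wp, Uf, Wf).shape)  # (6, 10)
```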
What are backpropagation through time and real-time recurrent learning?
We use these to train RNNs. In backpropagation through time we process the full sequence: if the sequence has length 1000, we have to sum the contributions of every time step to compute the full gradient.
In real-time recurrent learning we compute the partial derivatives incrementally as we move forward through the sequence.
In the end both compute the same gradient, just in different ways. For real-time recurrent learning, the memory requirement and the time complexity are larger than for backpropagation through time.
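A compact Python sketch of how backpropagation through time stores all hidden states in the forward pass and then sums gradient contributions backwards over the whole sequence; the tanh/softmax setup, the shapes, and the choice to return only the gradient of W are simplifying assumptions:

```python
import numpy as np

def bptt_grad_W(xs, targets, U, W, V, b, c):
    """Minimal BPTT sketch (assumed tanh hidden units, softmax output,
    cross-entropy loss; only dL/dW is returned to keep it short)."""
    T, n_hid = len(xs), W.shape[0]
    hs = np.zeros((T + 1, n_hid))                 # hs[t] = h(t), hs[0] = h(0) = 0
    ys = []
    for t in range(T):                            # forward pass, storing every h(t)
        hs[t + 1] = np.tanh(U @ xs[t] + W @ hs[t] + b)
        o = V @ hs[t + 1] + c
        e = np.exp(o - o.max())
        ys.append(e / e.sum())
    dW = np.zeros_like(W)
    dh_next = np.zeros(n_hid)                     # gradient flowing from t+1 back to t
    for t in reversed(range(T)):                  # backward pass over the full sequence
        do = ys[t].copy(); do[targets[t]] -= 1.0  # dL/do for softmax + cross-entropy
        dh = V.T @ do + dh_next
        dz = (1.0 - hs[t + 1] ** 2) * dh          # tanh derivative
        dW += np.outer(dz, hs[t])                 # contribution of time step t, summed up
        dh_next = W.T @ dz
    return dW

# usage with toy shapes
rng = np.random.default_rng(0)
n_in, n_hid, n_out, T = 4, 8, 3, 5
U = rng.normal(scale=0.1, size=(n_hid, n_in)); W = rng.normal(scale=0.1, size=(n_hid, n_hid))
V = rng.normal(scale=0.1, size=(n_out, n_hid)); b = np.zeros(n_hid); c = np.zeros(n_out)
xs = rng.normal(size=(T, n_in)); targets = rng.integers(0, n_out, T)
print(bptt_grad_W(xs, targets, U, W, V, b, c).shape)  # (8, 8)
```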