7 - Reinforcement Learning Flashcards

1
Q

Reinforcement Learning

A

Learning to take actions in an environment so as to maximise the total reward

2
Q

State

A

The information the agent receives from the environment describing its current situation

3
Q

Policy (pi symbol)

A

A mapping from the state space to the action space: for each state (s) it specifies which action (a) to take

4
Q

Reward Function (R) meaning

A

Maps each state (or state-action pair) to a numerical reward

5
Q

Value Function (Q pi) meaning

A

Value of a state/state-action pair.

The total expected future reward from that state (or state-action pair) when following the policy pi

6
Q

Q learning method

A

For each episode:
- Select a random initial state

While the goal state has not been reached:
- Select one of the possible actions for the current state
- Update Q(current state, action) using the Bellman equation
- Set the next state as the current state
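
The loop above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the 6-room layout, the use of -1 to mark impossible moves, the discount factor of 0.8, the episode count and the names `train`, `GOAL` and `GAMMA` are all assumptions made for the example. It also assumes that taking action a means moving into room a.

```python
import numpy as np

# Hypothetical layout: R[s, a] is the reward for moving from room s to room a.
# -1 marks an impossible move (an assumption of this sketch), 100 marks reaching the goal.
R = np.array([
    [-1, -1, -1, -1,  0,  -1],
    [-1, -1, -1,  0, -1, 100],
    [-1, -1, -1,  0, -1,  -1],
    [-1,  0,  0, -1,  0,  -1],
    [ 0, -1, -1,  0, -1, 100],
    [-1,  0, -1, -1,  0, 100],
])
GOAL = 5      # assumed goal room
GAMMA = 0.8   # assumed discount factor

def train(R, goal, gamma, episodes=500, seed=0):
    rng = np.random.default_rng(seed)
    n_states = R.shape[0]
    Q = np.zeros_like(R, dtype=float)           # all Q values start at 0
    for _ in range(episodes):
        state = rng.integers(n_states)          # select a random initial state
        while state != goal:
            valid = np.flatnonzero(R[state] >= 0)
            action = rng.choice(valid)          # select one possible action
            next_state = action                 # moving into room `action`
            # Bellman update: immediate reward plus discounted best next value
            Q[state, action] = R[state, action] + gamma * Q[next_state].max()
            state = next_state                  # next state becomes current state
    return Q

Q = train(R, GOAL, GAMMA)
```

After training, following the largest Q value in each row (for example with `np.argmax(Q[state])`) gives a route to the goal room.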

7
Q

Equation involved in Q Learning

A

Q(s, a) = R(s,a) + Gamma * Max[Q(next state, all actions)]

s - current state
a - action taken in that state
R - reward function
Gamma - discount factor, a value between 0 and 1 that sets how much future rewards count
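
As a worked example of a single update, the sketch below plugs numbers into the equation. The helper name `q_update` and all the numbers are made up for illustration, with Gamma assumed to be 0.8.

```python
def q_update(reward, gamma, next_q_values):
    """Return R(s, a) + gamma * max over actions of Q(next state, action)."""
    return reward + gamma * max(next_q_values)

# Illustrative numbers only: the move itself earns 0, and the best
# Q value available from the next state is 100.
new_q = q_update(reward=0, gamma=0.8, next_q_values=[0, 64, 100])
print(new_q)  # 80.0
```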

8
Q

Reward Matrix

A

Holds the reward value for each state and action.

E.g. if room 5 is the goal, every move into room 5 might be given a reward of 100

The matrix is State (rows) by Action (columns)
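
One possible way to build such a matrix is sketched below. The room layout, the list of doors, the helper name `build_reward_matrix` and the convention of using -1 for impossible moves and 0 for ordinary moves are assumptions for illustration, as is treating "action a" as "move into room a".

```python
import numpy as np

def build_reward_matrix(n_rooms, doors, goal, goal_reward=100):
    # Start with every move marked impossible (-1, an assumed convention).
    R = np.full((n_rooms, n_rooms), -1)
    for a, b in doors:
        # A door lets the agent move in both directions, with reward 0.
        R[a, b] = 0
        R[b, a] = 0
    # Every possible move into the goal room earns the goal reward.
    for s in range(n_rooms):
        if R[s, goal] == 0:
            R[s, goal] = goal_reward
    # Assumption: staying in the goal room is also allowed and rewarded.
    R[goal, goal] = goal_reward
    return R

# Hypothetical doors between 6 rooms, with room 5 as the goal.
doors = [(0, 4), (1, 3), (1, 5), (2, 3), (3, 4), (4, 5)]
R = build_reward_matrix(6, doors, goal=5)
```

Here row s lists the reward for each move available from room s, so `R[1, 5]` and `R[4, 5]` hold 100 while `R[0, 5]` stays at -1 because room 0 has no door to room 5.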

9
Q

Q Matrix - rows are ? columns are ? start values are ?

A

Rows are states, columns are actions

All values start at 0
