Algorithms Lecture 2 Flashcards

1
Q

Reinforcement Learning

A

Learning from experience, through rewards and punishments

2
Q

environment

A

The world the agent acts in; it gives the agent its current state

3
Q

agent

A

Has control:
receives a state and rewards from the environment
chooses actions

4
Q

Markov decision process

Transition process

A
A model for decision-making, consisting of:
set of states S
set of actions Actions(s)
transition model P(s' | s, a)
reward function R(s, a, s')
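
The four parts of the model can be written down as plain data and functions. This is a minimal sketch of a made-up two-state problem: the state names, action names, probabilities, and the helper expected_reward are all assumptions for illustration, not from the lecture.

```python
# Hypothetical two-state MDP; every name and number here is made up.
states = ["start", "goal"]  # set of states S


def actions(s):
    """Actions(s): the actions available in state s."""
    return ["stay", "move"] if s == "start" else []


# Transition model P(s' | s, a), stored as {(s, a): {s': probability}}.
P = {
    ("start", "stay"): {"start": 1.0},
    ("start", "move"): {"goal": 0.8, "start": 0.2},
}


def reward(s, a, s_next):
    """R(s, a, s'): reward for ending up in s_next after doing a in s."""
    return 1.0 if s_next == "goal" else 0.0


def expected_reward(s, a):
    """Average reward of doing a in s, weighted by P(s' | s, a)."""
    return sum(p * reward(s, a, s_next) for s_next, p in P[(s, a)].items())
```

Here expected_reward("start", "move") weighs the reward of each possible next state by its transition probability.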
5
Q

reward

A
r = positive (reward) OR
r = negative (punishment)
6
Q

Q-learning

Q = function
s = state
a = action
A

Method for learning a function Q(s, a):

an estimate of the value of performing action a in state s

7
Q

Overview of Q-learning

A

Start with Q(s, a) = 0 for all s, a
When we take an action and receive a reward:
estimate the value of Q(s, a) based on the current reward and expected future rewards
update Q(s, a) to take into account the old estimate as well as the new one

8
Q

Q-learning formula

A

Start with Q(s, a) = 0 for all s, a
Every time we take an action a in state s and observe a reward r, we update Q(s, a)
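
The card does not show the update itself. A minimal sketch, assuming the standard Q-learning rule Q(s, a) <- Q(s, a) + alpha * ((r + gamma * max over a' of Q(s', a')) - Q(s, a)). The learning rate ALPHA, the discount GAMMA, their values, and the function names are assumptions, not taken from the lecture.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate: how much weight the new estimate gets (assumed value)
GAMMA = 1.0  # discount applied to future rewards (assumed value)

# Q-table: every (state, action) pair starts at 0.
q = defaultdict(float)


def best_future_reward(q, next_state, next_actions):
    """Highest Q-value available from next_state, or 0 if no actions remain."""
    return max((q[(next_state, a)] for a in next_actions), default=0.0)


def update(q, state, action, reward, next_state, next_actions):
    """Move Q(state, action) part of the way from the old to the new estimate."""
    old = q[(state, action)]
    new_estimate = reward + GAMMA * best_future_reward(q, next_state, next_actions)
    q[(state, action)] = old + ALPHA * (new_estimate - old)
```

With ALPHA = 0.5, each call moves Q(s, a) halfway from the old estimate toward the new one, so repeated experience of the same reward pulls the value steadily closer to it.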

9
Q

Greedy Decision-Making

A

When in state s, choose the action a with the highest Q(s, a)
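
A minimal sketch of this rule, assuming the Q-values are kept in a dict keyed by (state, action) and that pairs not yet seen count as 0. The function name and the storage layout are assumptions for illustration.

```python
def greedy_action(q, state, available_actions):
    """Return the action with the highest Q(state, action).

    Pairs missing from q are treated as having value 0.
    """
    return max(available_actions, key=lambda a: q.get((state, a), 0.0))
```

For example, with q = {("s", "left"): 0.2, ("s", "right"): 0.7}, calling greedy_action(q, "s", ["left", "right"]) picks "right".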

10
Q

Explore vs. exploit

A

Exploit: the AI uses the way to the reward that it already knows

Explore: the AI tries other actions, because there may be more possibilities to get to the reward

11
Q

epsilon

A

ɛ

12
Q

ɛ-greedy

A

Set ɛ equal to how often we want to move randomly
With probability 1 - ɛ, choose the estimated best move
With probability ɛ, choose a random move
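
A minimal sketch of ɛ-greedy selection, assuming the Q-values are kept in a dict keyed by (state, action), that unseen pairs count as 0, and that available_actions is a non-empty list. The function name and the rng parameter are assumptions for illustration.

```python
import random


def epsilon_greedy_action(q, state, available_actions, epsilon, rng=random):
    """With probability epsilon explore, otherwise exploit."""
    if rng.random() < epsilon:
        # Explore: pick any available action at random.
        return rng.choice(available_actions)
    # Exploit: pick the action with the highest estimated Q-value.
    return max(available_actions, key=lambda a: q.get((state, a), 0.0))
```

Setting epsilon to 0 gives purely greedy play, and setting it to 1 gives purely random play. A small value in between is a common compromise.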

13
Q

Code NIM

A

import random

from nim import train, play
ai = train(0)  # replace 0 with the number of games to train on
play(ai)
