Intelligent Agents Flashcards

1
Q

Can MDPs have known states?

A

Yes

2
Q

When is it best to use Q-Learning?

A

When the optimal action depends on the current state and we do not know the reward of each state beforehand.
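
As an illustration only, here is a minimal sketch of tabular Q-learning. The 4-state chain environment, the `step` function, and all hyperparameters are hypothetical and chosen just for this example. The agent never sees the reward function; it learns action values purely from sampled transitions.

```python
import random

def step(state, action):
    """Hypothetical 4-state chain: action 1 moves right, action 0 moves left.
    Reaching state 3 gives reward 1 and ends the episode."""
    next_state = min(state + 1, 3) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == 3 else 0.0
    return next_state, reward, next_state == 3

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(4)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore sometimes, otherwise take the best
            # known action (ties are broken at random).
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = max((0, 1), key=lambda a: (q[state][a], rng.random()))
            next_state, reward, done = step(state, action)
            # The update uses only the sampled transition, not a model
            # of the environment.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = q_learning()
# Greedy action in each non-terminal state after learning.
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a]) for s in range(3)]
```

The greedy action differs per state only through the learned table, which is the state dependence the answer refers to.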

3
Q

Is Q-learning model-free?

A

Yes

4
Q

What are the advantages of Thompson sampling over UCB?

A

It is extensible to contextual bandits.
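
For reference, a minimal sketch of Thompson sampling on a plain (non-contextual) Bernoulli bandit, assuming a Beta(1, 1) prior per arm. The arm probabilities and the function name `thompson_sampling` are made up for this example. Roughly speaking, a contextual version would keep the same sample-then-act loop but hold a posterior over the parameters of a reward model instead of one Beta per arm.

```python
import random

def thompson_sampling(true_probs, rounds=2000, seed=0):
    rng = random.Random(seed)
    n_arms = len(true_probs)
    wins = [1] * n_arms    # Beta posterior alpha (starts at prior 1)
    losses = [1] * n_arms  # Beta posterior beta (starts at prior 1)
    pulls = [0] * n_arms
    for _ in range(rounds):
        # Draw one plausible success rate per arm from its posterior,
        # then play the arm whose draw is highest.
        samples = [rng.betavariate(wins[i], losses[i]) for i in range(n_arms)]
        arm = samples.index(max(samples))
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Bayesian update of the chosen arm's posterior.
        wins[arm] += reward
        losses[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

# Hypothetical arm success probabilities, unknown to the agent.
pulls = thompson_sampling([0.2, 0.5, 0.8])
```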

5
Q

Why is state factorization important?

A

It allows us to handle the combinatorial explosion of states.
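
A small worked example of the explosion, using made-up state variables: a flat state space needs one entry per combination of values, while a factored description only lists each variable's domain.

```python
from itertools import product

# Hypothetical state variables and their domains.
variables = {
    "door_open": [False, True],
    "has_key": [False, True],
    "room": ["A", "B", "C"],
}

# Flat representation: every combination is a separate state.
flat_states = list(product(*variables.values()))
flat_size = len(flat_states)  # 2 * 2 * 3 combinations

# Factored representation: only the per-variable domains are stored.
factored_size = sum(len(domain) for domain in variables.values())

# The gap grows exponentially: 20 binary variables give 2**20 flat states
# but only 20 variables to describe.
flat_states_20_binary = 2 ** 20
```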

6
Q

What does a mixed policy mean?

A

That we assign probabilities to policies rather than committing to a single policy entirely.

7
Q

What is needed for a policy to be differentiable?

A

That it is mixed (probabilistic rather than deterministic).
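
One common way to get such a policy is a softmax over per-action preferences. The sketch below assumes that parameterization (it is not stated on the card), and the names `softmax_policy` and `grad_log_prob` are illustrative. Because the probabilities change smoothly with the parameters `theta`, the gradient of the log-probability exists and has a simple closed form.

```python
import math

def softmax_policy(theta):
    """Map real-valued preferences (one per action) to action probabilities."""
    m = max(theta)  # subtract the max for numerical stability
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]

def grad_log_prob(theta, action):
    """Gradient of log pi(action) with respect to theta for a softmax policy:
    one-hot(action) minus the probability vector."""
    probs = softmax_policy(theta)
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]

theta = [0.0, 1.0, -1.0]  # hypothetical preferences for three actions
probs = softmax_policy(theta)
grad = grad_log_prob(theta, action=1)
```

A deterministic argmax policy, by contrast, jumps between actions as `theta` changes, so it has no useful gradient.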

8
Q

How do policy gradients work?

A