Exam Questions Flashcards
So don't forget that you have two different initialisations here. That is also why you have to add the number of correctly answered questions.
What should you do here?
Just try a possible answer here and iterate further if needed.
How to solve for which α one type of strategy is better than another?
Solve for x1, x2 and y1, y2 (as functions of α), then require x1 ≤ y1 and x2 ≤ y2.
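A minimal numeric sketch of such a comparison, with made-up costs and transition matrices for the two strategies (on an exam you would solve the x's and y's symbolically in α and work out the inequalities instead of sweeping):

import numpy as np

# Hypothetical 2-state example: x = c_A + α·P_A·x and y = c_B + α·P_B·y (all numbers made up)
c_A, P_A = np.array([1.0, 2.0]), np.array([[0.5, 0.5], [0.3, 0.7]])
c_B, P_B = np.array([1.5, 1.0]), np.array([[0.2, 0.8], [0.6, 0.4]])

def values(c, P, alpha):
    # total discounted cost of a fixed strategy: solve (I - alpha*P) v = c
    return np.linalg.solve(np.eye(len(c)) - alpha * P, c)

for alpha in np.linspace(0.05, 0.95, 19):
    x, y = values(c_A, P_A, alpha), values(c_B, P_B, alpha)
    print(f"alpha={alpha:.2f}  x={x.round(2)}  y={y.round(2)}  A better in both states: {bool(np.all(x <= y))}")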
How to solve using iteration?
Solve the current system, then fill in the resulting values in the minimization formulas. Those are the values you want to minimize/maximize.
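A tiny worked example of one such step, with entirely made-up costs and probabilities: the x's come from solving the current system, and they are then filled into the minimization formula for one state.

# x's obtained from solving the current system (hypothetical values)
x = {1: 10.0, 2: 14.0}

# minimization formula for state 1 with two candidate actions (made-up data):
# action a: cost 3, to state 1 w.p. 0.4 and state 2 w.p. 0.6
# action b: cost 4, to state 1 w.p. 0.9 and state 2 w.p. 0.1
val_a = 3 + 0.4 * x[1] + 0.6 * x[2]   # 15.4
val_b = 4 + 0.9 * x[1] + 0.1 * x[2]   # 14.4
print(min(val_a, val_b))              # 14.4, so action b would be chosen in state 1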
How to show that a Markov chain is unichain?
Claim/show that, regardless of the starting state, the Markov chain eventually ends up in a single cycle (one recurrent class).
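One way to verify this mechanically (the transition structure below is made up, and networkx is assumed to be available): a finite Markov chain is unichain exactly when it has a single closed (recurrent) communicating class.

import networkx as nx

# hypothetical transition structure: state -> states reachable with positive probability
succ = {1: [2], 2: [3], 3: [1, 4], 4: [4]}

G = nx.DiGraph([(i, j) for i, js in succ.items() for j in js])
sccs = list(nx.strongly_connected_components(G))
# a communicating class is recurrent (closed) if no edge leaves it
closed = [C for C in sccs if all(j in C for i in C for j in G.successors(i))]
print("unichain" if len(closed) == 1 else "multichain", closed)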
How to write down the transition functions in a DP model?
Just write down t(state, action) = new state, and do this for every possible action. In the case of probabilities, make sure you write the probability after the new state.
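A possible way to write this out in code form, with a made-up inventory-style example (state and action names are hypothetical):

# deterministic transitions: t(state, action) = new state
t = {
    (0, "wait"): 0,
    (0, "order"): 2,
    (1, "wait"): 0,
}

# with probabilities: list the new states and write each probability after the new state
t_prob = {
    (2, "order"): [(3, 0.8), (1, 0.2)],   # to state 3 w.p. 0.8, to state 1 w.p. 0.2
}
print(t[(0, "order")], t_prob[(2, "order")])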
How to do value iteration?
- Start with the values V0 = (0, 0, ...) (usually; sometimes different starting values are defined).
- Calculate Vn(i) = min a in A(i) {c(i, a) + α ∑j pij(a) Vn−1(j)} for all i in I.
- Write down Rn(i), which is the minimum value
Obviously, if you want to maximize, change min into max (a small numeric sketch follows below).
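A minimal value-iteration sketch for a 2-state, 2-action problem; the costs, transition probabilities and discount factor α are all made up.

import numpy as np

alpha = 0.9
c = np.array([[3.0, 5.0],   # c[i][a]: state 0, actions 0 and 1
              [1.0, 4.0]])  # state 1
P = np.array([[[0.4, 0.6], [0.9, 0.1]],   # P[i][a][j]: state 0
              [[0.2, 0.8], [0.7, 0.3]]])  # state 1

V = np.zeros(2)                         # V0 = (0, 0)
for n in range(1, 101):
    # Vn(i) = min over a in A(i) of { c(i, a) + alpha * sum_j pij(a) * Vn-1(j) }
    Q = c + alpha * np.einsum('iaj,j->ia', P, V)
    R = Q.argmin(axis=1)                # action attaining the minimum in each state
    V_new = Q.min(axis=1)               # Vn(i), the minimum value
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new
print(V, R)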
How to do policy iteration?
- Solve the current system (i.e. with the current actions): xi = c(i, ai) + ∑j pij(ai) xj, where ai is the action currently chosen in state i.
- Fill these new x's into the minimization min a in A(i) {c(i, a) + ∑j pij(a) xj} and pick the best action per state (you can put this step in a table so that you can easily see which action is best, though writing it as a min {·} statement is also possible).
- Stop when the previous x1 is the same as the current x1, etc., i.e. when the actions no longer change (a small sketch follows below).
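A minimal policy-iteration sketch, assuming the discounted setting (discount factor α) and the same made-up data as the value-iteration sketch above; once the actions are fixed, step 1 solves a plain linear system.

import numpy as np

alpha = 0.9
c = np.array([[3.0, 5.0], [1.0, 4.0]])
P = np.array([[[0.4, 0.6], [0.9, 0.1]],
              [[0.2, 0.8], [0.7, 0.3]]])

policy = np.array([0, 0])                            # some initial current actions
while True:
    # 1) solve x_i = c(i, a_i) + alpha * sum_j pij(a_i) x_j for the current actions a_i
    c_R = c[np.arange(2), policy]
    P_R = P[np.arange(2), policy]
    x = np.linalg.solve(np.eye(2) - alpha * P_R, c_R)
    # 2) fill these x's into the minimization and pick the best action per state
    Q = c + alpha * np.einsum('iaj,j->ia', P, x)
    new_policy = Q.argmin(axis=1)
    # 3) stop when the actions (and hence the x's) no longer change
    if np.array_equal(new_policy, policy):
        break
    policy = new_policy
print(policy, x)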
What do we know about the value of g, after value iteration?