Last Quiz Flashcards

Question 1

Q

Why are continuous value functions harder?

Answer

A

You lose the ability to track the outcome of each action individually, since there are too many to tabulate

Question 2

Q

Name the three big nuisance factors

Answer

A

Random seed
Hyperparameters
Network architecture

Question 3

Q

Explain why random seed is a nuisance factor

Answer

A

Initial policy weights and subsequent exploration actions are randomized: some agents might “get lucky”

Question 4

Q

Explain why hyperparameters are a nuisance factor

Answer

A

There’s no systematic way to find them yet (e.g. learning rate, reward scaling), but they have a big effect on an algorithm’s success

Question 5

Q

Explain why network architecture is a nuisance factor

Answer

A

People running the same code on different setups get different results.

Question 6

Q

What are three ML “cheats”?

Answer

A

1) Reporting the max of many trials without the mean and std dev
2) Cherry-picking a random seed
3) Using a small sample size
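
A minimal sketch of the honest alternative to these cheats: run several seeds and report the mean and standard deviation alongside the max. The `run_trial` function and its score distribution are hypothetical stand-ins for a real training run, used only for illustration.

```python
import random
import statistics


def run_trial(seed: int) -> float:
    """Hypothetical stand-in for one training run; returns a final score."""
    rng = random.Random(seed)
    # Assumed score distribution, for illustration only.
    return rng.gauss(100.0, 15.0)


def summarize(scores):
    """Report sample size, mean, sample std dev, and max together."""
    return {
        "n": len(scores),
        "mean": statistics.mean(scores),
        "std": statistics.stdev(scores),
        "max": max(scores),
    }


# Use every seed in a fixed range instead of picking a favorable one.
scores = [run_trial(seed) for seed in range(10)]
summary = summarize(scores)
print(summary)
```

The max alone always looks at least as good as the mean, which is why reporting it without the spread is misleading.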

Question 7

Q

What did Henderson do?

Answer

A

Created a “reproducibility checklist” that ML algorithms have to pass for their results to count as statistically significant

Question 8

Q

What’s the problem with his checklist?

Answer

A

People claim to follow it, but they often don’t, and it’s hard to check

Question 9

Q

Why does Q-learning over-estimate?

Answer

A

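Scott’s story: continuously training on the same set creates overconfidence. Taking the max over noisy value estimates is biased upward, and bootstrapping feeds that bias back into the targets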
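
A small simulation can illustrate the upward bias of the max operator. This sketch assumes every action has a true value of 0 and that each estimate is corrupted by independent Gaussian noise; the function name and parameters are made up for the example.

```python
import random


def max_of_noisy_estimates(n_actions, noise_std, rng):
    """Return the max over noisy estimates of actions whose true value is 0."""
    estimates = [rng.gauss(0.0, noise_std) for _ in range(n_actions)]
    return max(estimates)


rng = random.Random(0)
n_runs = 10_000
avg_max = sum(max_of_noisy_estimates(5, 1.0, rng) for _ in range(n_runs)) / n_runs

# The true max is 0, yet the average of the estimated max is well above 0.
print(avg_max)
```

Each individual estimate is unbiased, but the max of several unbiased estimates is not, which is the effect Q-learning's update target inherits.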

Question 10

Q

How does ORB-SLAM work?

Answer

A

Recognizes features across frames and uses parallax to locate itself in the world

Question 11

Q

What is feature mapping?

Answer

A

Creating feature descriptors and comparing feature vectors to assess motion
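
As a rough illustration of comparing descriptors, the sketch below does brute-force matching of binary descriptors by Hamming distance. The 8-bit integers are toy descriptors (ORB descriptors are much longer binary strings), and `match` and `max_dist` are names assumed for this example.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match(desc_a, desc_b, max_dist):
    """For each descriptor in desc_a, find its nearest neighbor in desc_b.

    Returns (index_a, index_b, distance) tuples, keeping only matches
    whose distance is at most max_dist. Assumes desc_b is non-empty.
    """
    matches = []
    for i, da in enumerate(desc_a):
        j, d = min(
            ((j, hamming(da, db)) for j, db in enumerate(desc_b)),
            key=lambda pair: pair[1],
        )
        if d <= max_dist:
            matches.append((i, j, d))
    return matches


frame1 = [0b10110010, 0b00001111]  # toy descriptors from one frame
frame2 = [0b00001110, 0b10110011]  # toy descriptors from the next frame
matches = match(frame1, frame2, max_dist=2)
print(matches)  # [(0, 1, 1), (1, 0, 1)]
```

Once features are matched across frames, the change in their image positions is what gets used to assess motion.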

Question 12

Q

What is the pinhole camera model?

Answer

A

That all rays of light pass through a single point (the pinhole) before landing on the image plane.
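
A minimal sketch of the projection this model implies, assuming a 3D point given in the camera frame with Z pointing forward. The focal lengths `fx`, `fy` and principal point `cx`, `cy` are the usual intrinsic parameters; the values below are made up for illustration.

```python
def project_pinhole(point, fx, fy, cx, cy):
    """Project a 3D camera-frame point (X, Y, Z) to pixel coordinates (u, v)."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    # Similar triangles through the pinhole: image offset scales with 1/Z.
    u = fx * X / Z + cx
    v = fy * Y / Z + cy
    return u, v


u, v = project_pinhole((1.0, 0.5, 2.0), fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(u, v)  # 570.0 365.0
```

Dividing by Z is what makes distant objects appear smaller and what produces parallax when the camera moves.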

Question 13

Q

What is Goodhart’s law?

Answer

A

When a measure becomes a target, it ceases to be a good measure

Question 14

Q

What is double Q-learning?

Answer

A

Like an Actor-Critic where each critic criticizes the other actor: two Q-functions are kept, one selects the action and the other evaluates it
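
A minimal tabular sketch of that idea, assuming two Q-tables stored as dictionaries keyed by (state, action). The function name, state labels, and step sizes are hypothetical and chosen only for the example.

```python
from collections import defaultdict


def double_q_update(q_sel, q_eval, state, action, reward, next_state,
                    actions, alpha, gamma):
    """Update q_sel in place using q_eval to evaluate q_sel's greedy action."""
    # q_sel picks the action it thinks is best in the next state...
    best_next = max(actions, key=lambda a: q_sel[(next_state, a)])
    # ...but the value of that action comes from the other table.
    target = reward + gamma * q_eval[(next_state, best_next)]
    q_sel[(state, action)] += alpha * (target - q_sel[(state, action)])


q_a = defaultdict(float)
q_b = defaultdict(float)
actions = ["left", "right"]
q_a[("s1", "left")] = 1.0   # q_a prefers "left" in s1
q_b[("s1", "left")] = 0.2   # q_b values that choice much lower
q_b[("s1", "right")] = 5.0  # q_b's own favorite is ignored for this update

double_q_update(q_a, q_b, "s0", "left", 1.0, "s1", actions, alpha=0.5, gamma=0.9)
print(q_a[("s0", "left")])
```

In a training loop, a coin flip on each step would typically decide which table plays the `q_sel` role, so both tables get updated over time.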