A.I. Flashcards

1
Q

Which classifier is most easily understandable to humans?

A

decision tree

2
Q

You have about 1 million training examples in a 6-dimensional space. You only
expect to be asked to classify 100 test examples. Which classifier is a good choice?

A

Nearest Neighbours is a good choice. The dimensionality is low and so
appropriate for NN. For NN, training is fast (just store the examples) but classification is slow; since there are only a few classifications to make, the fact that this is slow does not matter.

3
Q

You are trying to predict the average rainfall in Bristol as a function of the
measured currents and tides in the Atlantic Ocean in the previous six months.

A

This is a regression problem; least-squares regression, regression trees, or
locally weighted nearest neighbours are all appropriate choices.

4
Q

MDP or reinforcement learning?

You have a mental model of how the world works.

A

MDP

5
Q

MDP or reinforcement learning?

Perform actions in the world to find out and collect rewards.

A

RL

6
Q

MDP or reinforcement learning?

The goal is to find policy to collect maximum rewards.

A

MDP

7
Q

MDP or reinforcement learning?

It's an offline process.

A

MDP

8
Q

Value iteration vs Policy iteration

A

Value iteration:

  • Cheaper
  • Good for large state spaces and a few actions

Policy iteration:

  • Good for small state spaces

9
Q

When to use Policy iteration or Value iteration?

A

For small state spaces: policy iteration, with policy evaluation using exact solution methods, is often the most efficient approach; it is typically very fast and converges quickly.

For large state spaces, where exact policy evaluation might be prohibitively expensive, value iteration is preferred.
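
A minimal value-iteration sketch; the two-state MDP below (transition model, rewards, discount factor, state and action names) is entirely made up for illustration:

```python
# P[s][a] = list of (probability, next_state, reward) triples for a toy MDP
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 5.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 1.0)], "go": [(1.0, 0, 0.0)]},
}
gamma = 0.9                      # discount factor
V = {s: 0.0 for s in P}          # initialise all state values to zero

for _ in range(1000):
    new_V = {}
    for s, actions in P.items():
        # Bellman optimality update:
        # V(s) = max_a sum_s' P(s' | s, a) * [r(s, a, s') + gamma * V(s')]
        new_V[s] = max(
            sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
            for outcomes in actions.values()
        )
    if max(abs(new_V[s] - V[s]) for s in P) < 1e-8:   # stop when values stop changing
        V = new_V
        break
    V = new_V

print(V)                         # converged state values
```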

10
Q

Reinforcement learning

A

Reinforcement learning is an online methodology that we use when the model of the world is unknown and/or rewards are delayed.

11
Q

Describe the main ideas behind the EM algorithm

A

Expectation Maximization (EM) tries to maximize the marginal likelihood in the presence of hidden variables.

EM consists of alternating between two steps, the E-step and the M-step.

In the E-step, we don't know what the hidden variables are, so we compute the posterior distribution over them given our current parameters.

In the M-step, we take in our set of full assignments with weights, and we just do maximum likelihood estimation of the parameters.

If we repeat the E-step and the M-step over and over again, we are guaranteed to converge to a local optimum.
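
A minimal EM sketch for a one-dimensional mixture of two Gaussians; the toy data, initial parameter guesses, and iteration count are assumptions for illustration. The E-step computes responsibilities, the M-step does weighted maximum-likelihood re-estimation:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

mu = np.array([-1.0, 1.0])       # initial means (guess)
sigma = np.array([1.0, 1.0])     # initial standard deviations (guess)
pi = np.array([0.5, 0.5])        # initial mixing weights (guess)

def gaussian(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

for _ in range(100):
    # E-step: posterior responsibility of each component for each data point
    resp = pi * gaussian(x[:, None], mu, sigma)
    resp /= resp.sum(axis=1, keepdims=True)

    # M-step: weighted maximum-likelihood re-estimation of the parameters
    Nk = resp.sum(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / Nk
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)
    pi = Nk / len(x)

print(mu, sigma, pi)             # estimated means, std devs and mixing weights
```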

12
Q

Define what is meant by an admissible heuristic

A

A heuristic is an estimate of the cost of reaching the goal state, used in an informed search algorithm.

For a heuristic to be admissible for the search problem, the estimated cost must always be less than or equal to the actual cost of reaching the goal state (it never overestimates).

13
Q

What is the main idea of Principal Component Analysis (PCA)?

A

Maximize the variance of the projection along each component or, equivalently, minimize the reconstruction error.

14
Q

reconstruction error

A

The squared distance between the original data and its estimate.
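
A minimal PCA sketch tying the two views together, using the SVD of the centred data matrix; the toy data and the choice of 2 retained components are assumptions for illustration. Projecting onto the leading components keeps the most variance, and the reconstruction error is the squared distance between the data and its estimate:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))          # toy data: 100 samples, 5 features

Xc = X - X.mean(axis=0)                # centre the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

components = Vt[:2]                    # first two principal directions
Z = Xc @ components.T                  # 2-D projection (captures the most variance)

# Reconstruction from the 2 components, and the reconstruction error:
# the squared distance between the original data and its estimate.
X_hat = Z @ components + X.mean(axis=0)
error = np.sum((X - X_hat) ** 2)
print(Z.shape, error)
```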

15
Q

Why is PCA useful for visualization tasks

A

Dimensionality reduction: the first and second principal components
can be plotted against each other to obtain a two-dimensional representation
of the data that captures most of the variance, which is useful for analysing and interpreting
the structure of a dataset.

16
Q

Indicate how the method (K-NN) could overfit training data

A

With K = 1, every point in the dataset (including noise) defines its
own decision boundary. The distance function can also be chosen to do well on the training set but less well on new data.

How can you reduce overfitting? Use a larger K for K-NN, and use cross-validation to choose K and the metric (as in the sketch below).
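
A minimal sketch of choosing K by leave-one-out cross-validation; the toy data, the candidate K values, and the Euclidean metric are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=60) > 0).astype(int)

def knn_predict(X_train, y_train, x, k):
    d = np.linalg.norm(X_train - x, axis=1)        # Euclidean distances
    nearest = np.argsort(d)[:k]                    # indices of the k closest points
    return np.bincount(y_train[nearest]).argmax()  # majority vote

for k in (1, 3, 5, 9):
    errors = sum(
        knn_predict(np.delete(X, i, 0), np.delete(y, i), X[i], k) != y[i]
        for i in range(len(X))
    )
    print(k, errors / len(X))                      # leave-one-out error rate per K
```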

17
Q

Markov Decision process

A

Agents take actions; the successor states of those actions are
controlled by chance (transition probabilities).

18
Q

mutual information

A

I(edibility; colour) = H(edibility) - H(edibility|colour)
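
A minimal sketch of this computation on a made-up joint distribution P(edibility, colour); the probabilities are purely illustrative:

```python
import numpy as np

# Rows: edibility (edible, poisonous); columns: colour (red, brown) -- toy numbers.
joint = np.array([[0.30, 0.20],
                  [0.10, 0.40]])

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

p_edibility = joint.sum(axis=1)          # marginal over colour
p_colour = joint.sum(axis=0)             # marginal over edibility

H_edibility = entropy(p_edibility)

# Conditional entropy H(edibility | colour) = sum_c P(c) * H(edibility | colour = c)
H_cond = sum(p_c * entropy(joint[:, c] / p_c) for c, p_c in enumerate(p_colour))

mutual_information = H_edibility - H_cond   # I = H(edibility) - H(edibility | colour)
print(mutual_information)
```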

19
Q

In a problem where each example has n binary attributes, to select the best
attribute for a decision tree node at depth k, where the root node is at depth
0, how many attributes are candidates for selection at this node?

A

n - k (each of the k ancestors on the path from the root has already used one of the n attributes, and an attribute is not reused along a path).

20
Q

Under what conditions is breadth-first search better than depth-first search?

A
  1. A shallow solution (path from initial state to goal state) is preferred.
  2. The search tree may contain large or possibly infinite branches (NOT RIGHT)
  3. Very large memory space to store the search tree (or the queue) is available.
21
Q

Iterative deepening or depth-first search, if:

  1. A shallow solution is preferred.
  2. The search tree may contain large or infinite branches.
A

ID

22
Q

Breadth-first search is a special case of uniform-cost search. (T or F?)

A

True - when the costs of all actions are equal.

23
Q

Depth-first search is a special case of best-first search. (T or F?)

A

True - when f(n) = - depth(n).

24
Q

Uniform-cost search is a special case of A* search. (T or F?)

A

True - when h(n) = 0.

25
Q

MDP instances with small discount factors tend to emphasise short-term rewards (T or F?)

A

True

26
Q

If the only difference between two MDPs is the value of the discount factor then they must have the same optimal policy (T or F?)

A

False

27
Q

For an infinite horizon MDP with a finite number of states and actions and with a discount factor between 0 and 1, value iteration is guaranteed to converge. (T or F?)

A

True

28
Q

When rewards are infrequent or unknown, reinforcement learning is preferred (T or F?)

A

True

29
Q

What are the main factors in deciding whether to apply Value Iteration or Policy Iteration?

A

State space size and the number of possible actions.

30
Q

If a player has a dominant strategy in a simultaneous-move game, then she
is sure to get her best possible outcome in any Nash equilibrium of the game. (T or F?)

A

False - e.g. the prisoner's dilemma.

31
Q

Alpha-beta pruning:

A

Alpha-beta will always find the optimal strategy for players
playing optimally. If a heuristic is available, we can expand nodes in an order that maximises pruning. Alpha-beta will never require more run-time than minimax.
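
A minimal alpha-beta sketch over a hypothetical game tree given as nested lists (the tree and its leaf values are made up for illustration; max and min levels alternate):

```python
def alphabeta(node, alpha, beta, maximising):
    if not isinstance(node, list):        # leaf: return its static evaluation
        return node
    if maximising:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:             # prune the remaining children
                break
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:             # prune the remaining children
                break
        return value

tree = [[3, 5], [2, [9, 1]], [0, 7]]
print(alphabeta(tree, float("-inf"), float("inf"), True))   # minimax value of the root
```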

32
Q

Dominant strategy

A

A strategy whose outcome is better for the player regardless of the strategy chosen by the other player.

33
Q

Nash equilibrium

A

A pair of strategies such that no player can achieve a better payoff by switching strategies, provided the other player keeps the same strategy.

34
Q

Zero sum game

A

A game in which the total payoff to all players is the same for every instance of the game.

35
Q

As the training set size increases, what do you expect will happen with the mean training and mean testing errors?

A

The training error tends to increase; the test error tends to decrease.

36
Q

Marginalisation of a leaf node yields a Bayesian network without the node.

A

True

37
Q

A Bayesian network is a directed acyclic graph

A

True

38
Q

The likelihood is the distribution over the hidden variable given the evidence

A

False

39
Q

Bayesian networks can be used to capture common reasoning patterns under uncertainty

A

True

40
Q

Must Value Iteration and Policy Iteration always converge to the same policy?

A

In general this is true: both algorithms are guaranteed to converge to an optimal policy. However, if there are multiple optimal policies (policies that yield the same maximal value for every state), then the two algorithms might return different ones. Both Value Iteration and Policy Iteration will
always lead to the same optimal values.

41
Q

Propose a general, practical method for ordering children of nodes which will tend to increase the opportunities for pruning.

A

In general, we would want to use an evaluation function to estimate the values of the children, and then order them so that at max nodes the children with larger estimated values come first and at min nodes
the children with smaller estimated values come first.