A.I. Flashcards
Which classifier is most easily understandable to humans?
Decision tree.
You have about 1 million training examples in a 6-dimensional space. You only
expect to be asked to classify 100 test examples.
Nearest Neighbours is a good choice. The dimensionality is low, which suits NN. Training is essentially free (the examples are simply stored), but each classification is slow because it must search the large training set; since only about 100 classifications are needed, this slowness does not matter.
You are trying to predict the average rainfall in Bristol as a function of the
measured currents and tides in the Atlantic ocean in the previous six months.
This is a regression problem; least squares regression, regression trees or
locally weighted nearest neighbours are all appropriate choices
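A minimal sketch of the least-squares option, using made-up synthetic data in place of real current/tide measurements and rainfall figures:

```python
import numpy as np

# Hypothetical setup: each row of X holds six months of ocean measurements
# (currents, tides) and y holds the observed average rainfall in Bristol.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))                     # 200 past months, 12 ocean features
true_w = rng.normal(size=12)
y = X @ true_w + rng.normal(scale=0.1, size=200)   # rainfall with noise

# Least-squares regression: add a bias column and solve for the weights
# that minimise ||Xw - y||^2.
X_b = np.hstack([X, np.ones((X.shape[0], 1))])
w, *_ = np.linalg.lstsq(X_b, y, rcond=None)

# Predict rainfall for a new month of measurements.
x_new = np.hstack([rng.normal(size=12), 1.0])
print("predicted rainfall:", x_new @ w)
```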
MDP or reinforcement learning?
A model of how the world works is available.
MDP
MDP or reinforcement learning?
Perform actions in the world to find out about it and collect rewards.
RL
MDP or reinforcement learning?
The goal is to find a policy that collects maximum reward.
MDP
MDP or reinforcement learning?
It is an offline process.
MDP
Value iteration vs Policy iteration
Value Iteration:
- Cheaper per iteration.
- Good for large state spaces and a few actions.
Policy Iteration:
- Good for small state spaces.
When to use Policy iteration or Value iteration?
For small state spaces: policy iteration, with policy evaluation done by exact solution methods, is often the most efficient approach; it is typically very fast and converges quickly.
For large state spaces: exact policy evaluation may take prohibitive time, so value iteration is preferred.
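A minimal value-iteration sketch on a made-up two-state MDP (the transition probabilities, rewards and discount factor below are illustrative only):

```python
import numpy as np

# T[s][a] = list of (probability, next_state, reward) triples.
T = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 5.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 1, 1.0)], 1: [(1.0, 0, 0.0)]},
}
gamma = 0.9

V = np.zeros(len(T))
for _ in range(1000):
    # Bellman backup: best expected one-step return plus discounted future value.
    V_new = np.array([
        max(sum(p * (r + gamma * V[s2]) for p, s2, r in T[s][a]) for a in T[s])
        for s in T
    ])
    if np.max(np.abs(V_new - V)) < 1e-8:   # converged
        break
    V = V_new

# Greedy policy extracted from the converged values.
policy = {s: max(T[s], key=lambda a: sum(p * (r + gamma * V[s2])
                                         for p, s2, r in T[s][a]))
          for s in T}
print("values:", V, "policy:", policy)
```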
Reinforcement learning
Reinforcement learning is an online methodology used when the model of the world is unknown and/or rewards are delayed.
Describe the main ideas behind the EM algorithm
Expectation Maximization (EM) tries to maximize the marginal likelihood in the presence of hidden variables. EM consists of alternating between two steps, the E-step and the M-step. In the E-step, we don't know what the hidden variables are, so we compute the posterior distribution over them given our current parameters. In the M-step, we take the resulting set of full assignments with weights and simply do maximum likelihood estimation of the parameters. If we repeat the E-step and the M-step over and over again, we are guaranteed to converge to a local optimum.
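A minimal EM sketch for a two-component 1-D Gaussian mixture with fixed unit variances (the synthetic data and starting values are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 300)])

mu = np.array([-1.0, 1.0])     # initial guesses for the two means
pi = np.array([0.5, 0.5])      # initial mixing weights

for _ in range(50):
    # E-step: posterior responsibility of each component for each point,
    # given the current parameters.
    dens = np.exp(-0.5 * (x[:, None] - mu) ** 2) / np.sqrt(2 * np.pi)
    resp = pi * dens
    resp /= resp.sum(axis=1, keepdims=True)

    # M-step: weighted maximum-likelihood estimates of the parameters.
    Nk = resp.sum(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / Nk
    pi = Nk / len(x)

print("estimated means:", mu, "mixing weights:", pi)
```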
Define what is meant by an admissible heuristic
An admissible heuristic is an estimate of the cost of reaching the goal state, used in an informed search algorithm. For a heuristic to be admissible for the search problem, the estimated cost must always be lower than or equal to the actual cost of reaching the goal state.
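A small illustrative example, assuming a grid world with unit-cost moves up/down/left/right: the Manhattan distance never overestimates the true path cost, so it is admissible for that problem.

```python
# Admissible heuristic example: Manhattan ("city-block") distance on a grid
# where the agent moves one cell up/down/left/right per step (cost 1 per move).
def manhattan(state, goal):
    (x1, y1), (x2, y2) = state, goal
    return abs(x1 - x2) + abs(y1 - y2)

print(manhattan((0, 0), (3, 4)))   # 7 <= cost of any real path to the goal
```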
What is the main idea of Principal Component Analysis (PCA)?
Maximize the variance of the projection along each component, or (equivalently) minimize the reconstruction error.
reconstruction error
the squared distance between the original data and
its estimate
Why is PCA useful for visualization tasks?
Dimensionality reduction: the first and second principal components can be plotted against each other to obtain a two-dimensional representation of the data that captures most of the variance, which is useful for analysing and interpreting the structure of a dataset.
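A minimal PCA sketch via the SVD, projecting made-up data onto its first two principal components (the random dataset here only stands in for real data):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 10))

Xc = X - X.mean(axis=0)                   # centre the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T                         # coordinates along PC1 and PC2

explained = (S ** 2) / (S ** 2).sum()
print("variance captured by first two components:", explained[:2].sum())
# Z[:, 0] vs Z[:, 1] can now be scatter-plotted for a 2-D visualisation.
```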
Indicate how the method (K-NN) could overfit training data
Every point in the dataset (including noise) defines its own decision boundary. The distance function can be chosen to do well on the training set but less well on new data.
How can you reduce overfitting?
Use a larger K for K-NN. Use cross-validation to choose K and the metric.
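A minimal sketch of choosing K by cross-validation, assuming a plain Euclidean distance and made-up toy data:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)        # toy labels

def knn_predict(X_train, y_train, x, k):
    d = np.linalg.norm(X_train - x, axis=1)    # Euclidean distances
    nearest = y_train[np.argsort(d)[:k]]
    return np.bincount(nearest).argmax()       # majority vote

def cv_error(k, folds=5):
    idx = np.arange(len(X)) % folds
    errs = []
    for f in range(folds):
        tr, te = idx != f, idx == f
        preds = np.array([knn_predict(X[tr], y[tr], x, k) for x in X[te]])
        errs.append(np.mean(preds != y[te]))
    return np.mean(errs)

best_k = min([1, 3, 5, 9, 15], key=cv_error)
print("K chosen by cross-validation:", best_k)
```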
Markov Decision Process
Agents take actions; the successor states of those actions are determined by chance.
mutual information
I(edibility; colour) = H(edibility) - H(edibility|colour)
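A small numerical illustration, using a made-up joint count table for edibility and colour:

```python
import numpy as np

# Rows: edible / poisonous; columns: three colour values (made-up counts).
counts = np.array([[30, 10, 10],
                   [ 5, 20, 25]])
P = counts / counts.sum()

def H(p):
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

p_edible = P.sum(axis=1)                         # marginal over edibility
p_colour = P.sum(axis=0)                         # marginal over colour
H_cond = sum(pc * H(P[:, j] / pc) for j, pc in enumerate(p_colour))

# I(edibility; colour) = H(edibility) - H(edibility | colour)
print("I(edibility; colour) =", H(p_edible) - H_cond, "bits")
```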
In a problem where each example has n binary attributes, to select the best
attribute for a decision tree node at depth k, where the root node is at depth
0, how many attributes are candidates for selection at this node?
n - k (the k attributes already tested on the path from the root are no longer candidates).
Under what conditions is breadth-first search better than depth-first search?
- A shallow solution (path from initial state to goal state) is preferred.
- The search tree may contain large or possibly infinite branches. (NOT RIGHT)
- Very large memory space to store the search tree (or the queue) is available.
Iterative deepening or depth-first search, if:
- A shallow solution is preferred.
- The search tree may contain large or infinite branches.
Iterative deepening (ID)
Breadth-first search is a special case of uniform-cost search. (T or F?)
True - when the costs of all actions are equal.
Depth-first search is a special case of best-first search. (T or F?)
True - when f(n) = - depth(n).
Uniform-cost search is a special case of A* search. (T or F?)
True - when h(n) = 0.
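A minimal sketch showing how breadth-first, uniform-cost and A* search are all best-first search with different priority functions f(n) (the toy graph, step costs and heuristic values below are made up):

```python
import heapq

graph = {'S': [('A', 1), ('B', 4)], 'A': [('G', 5)], 'B': [('G', 1)], 'G': []}
h = {'S': 3, 'A': 4, 'B': 1, 'G': 0}   # heuristic estimates of cost to goal

def best_first(start, goal, f):
    # Frontier entries: (priority, depth, path cost, node, path so far).
    frontier = [(f(0, 0, start), 0, 0, start, [start])]
    seen = set()
    while frontier:
        _, depth, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, cost
        if node in seen:
            continue
        seen.add(node)
        for nxt, step in graph[node]:
            heapq.heappush(frontier, (f(depth + 1, cost + step, nxt),
                                      depth + 1, cost + step, nxt, path + [nxt]))
    return None

print(best_first('S', 'G', lambda depth, g, n: depth))     # breadth-first: f(n) = depth(n)
print(best_first('S', 'G', lambda depth, g, n: g))         # uniform-cost:  f(n) = g(n)
print(best_first('S', 'G', lambda depth, g, n: g + h[n]))  # A*:            f(n) = g(n) + h(n)
```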
MDP instances with small discount factors tend to emphasise short-term rewards (T or F?)
True
If the only difference between two MDPs is the value of the discount factor then they must have the same optimal policy (T or F?)
False
For an infinite horizon MDP with a finite number of states and actions and with a discount factor between 0 and 1, value iteration is guaranteed to converge. (T or F?)
True
When rewards are infrequent or unknown, reinforcement learning is preferred (T or F?)
True
What are the main factors in order to decide whether to apply Value Iteration or Policy Iteration
State space size, Number of possible actions
If a player has a dominant strategy in a simultaneous-move game, then she
is sure to get her best possible outcome in any Nash equilibrium of the game. (T or F?)
False - e.g. the Prisoner's Dilemma: both players have a dominant strategy (defect), but the resulting Nash equilibrium is worse for both than mutual cooperation.
Alpha-beta pruning:
- Alpha-beta will always find the optimal strategy for players playing optimally (it returns the same value as minimax).
- If a heuristic is available, we can expand nodes in an order that maximises pruning.
- Alpha-beta requires no more run-time than minimax, and usually much less.
Dominant strategy
A strategy whose outcome is better for the player regardless of the strategy chosen by the other player.
Nash equilibrium
a pair of strategies by which no player can achieve a better payoff by switching strategies, if the other player keeps the same strategy
Zero-sum game
the total payoff to all players is the same for every instance of the game
As the training set size increases, what do you expect will happen with the mean training and mean testing errors?
The training error tends to increase (more examples are harder to fit exactly); the test error tends to decrease (more data improves generalisation).
Marginalisation of a leaf node yields a Bayesian network without the node.
True
A Bayesian network is a directed acyclic graph
True
The likelihood is the distribution over the hidden variable given the evidence
False - that is the posterior; the likelihood is the probability of the evidence given the hidden variable.
Bayesian networks can be used to capture common reasoning patterns under uncertainty
True
Must Value Iteration and Policy Iteration always converge to the same policy?
In general this is true: both algorithms are guaranteed to converge to an optimal policy. However, if there are multiple optimal policies (several policies yielding the same maximal value for every state), the two algorithms might return different ones. Both Value Iteration and Policy Iteration will always lead to the same optimal values.
Propose a general, practical method for ordering
children of nodes which will tend to increase the opportunities for pruning.
In general we would use an evaluation function to estimate the values of the children and then sort them, so that at MAX nodes the children with larger estimated values come first and at MIN nodes the children with smaller estimated values come first.
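A minimal alpha-beta sketch with such move ordering on a made-up game tree; the child-value estimate used here (the leftmost leaf of each subtree) is just a stand-in for a real evaluation function:

```python
def alphabeta(node, maximizing, alpha=float('-inf'), beta=float('inf')):
    if isinstance(node, (int, float)):          # leaf: node is its value
        return node

    # Heuristic ordering: estimate each child cheaply (here, its leftmost leaf),
    # larger estimates first at MAX nodes, smaller first at MIN nodes.
    def estimate(child):
        while not isinstance(child, (int, float)):
            child = child[0]
        return child
    children = sorted(node, key=estimate, reverse=maximizing)

    best = float('-inf') if maximizing else float('inf')
    for child in children:
        val = alphabeta(child, not maximizing, alpha, beta)
        if maximizing:
            best = max(best, val)
            alpha = max(alpha, val)
        else:
            best = min(best, val)
            beta = min(beta, val)
        if alpha >= beta:                        # prune remaining children
            break
    return best

# Nested lists encode the game tree; leaves are terminal utilities.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, maximizing=True))          # minimax value: 3
```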