Frogs Flashcards

1
Q

What is the definition of AI?

A

AI is the study and creation of machines that perform tasks normally associated with intelligence.

2
Q

What are two ways AI is used?

A

To find out how people work through machine simulations, and to build machines which do useful things that require intelligence

3
Q

What is an agent in AI?

A

In AI, an agent refers to a machine or entity that can perceive its environment through sensors and act upon it using actuators. Agents can range from humans, robots, and software bots (softbots) to simpler systems like thermostats.

4
Q

What is an agent function?

A

An agent function is a mapping from the agent’s percepts (inputs from the environment) to its actions. Formally, it can be written as f : P → A, where P represents the set of percepts and A represents the set of possible actions.
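
As a rough illustration (not from the original cards), here is a minimal Python sketch of an agent function for a hypothetical two-square vacuum world; the percept format and action names are assumptions.

```python
# Minimal sketch of an agent function f : P -> A for a hypothetical
# two-square vacuum world (the percepts and actions are assumed here).
# The percept is (location, status); the action is a string.

def vacuum_agent_function(percept):
    location, status = percept
    if status == "Dirty":
        return "Suck"                                   # clean the current square
    return "Right" if location == "A" else "Left"       # otherwise move to the other square

print(vacuum_agent_function(("A", "Dirty")))  # -> Suck
print(vacuum_agent_function(("A", "Clean")))  # -> Right
```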

5
Q

How does an agent program differ from an agent function?

A

The agent function is a formal description that maps percepts to actions; the agent program is the actual implementation running on the agent’s physical architecture, executing this function.

6
Q

What is a performance measure, and why is it important for evaluating an agent?

A

A performance measure is a function that evaluates how well an agent is performing based on its actions in the environment over time. It provides a quantitative way to judge the agent’s effectiveness. Examples include points awarded for cleaning squares or penalties for too many dirty squares.

7
Q

What are the main components of an agent in AI?

A

Sensors: To perceive the environment (e.g., cameras, light sensors). Actuators: To interact with the environment (e.g., motors, arms). Agent function: The mapping from percepts to actions. Agent program: The actual implementation of the agent function.

8
Q

Describe the difference between fully observable and partially observable environments.

A

In a fully observable environment, the agent has access to complete information about the state of the environment at any given time through its sensors. In a partially observable environment, the agent only receives partial or incomplete information, requiring it to make decisions with some level of uncertainty.

9
Q

What is the difference between a deterministic and a stochastic environment?

A

In a deterministic environment, the outcome of any action is predictable and certain. In contrast, a stochastic environment involves randomness or uncertainty, meaning that the outcome of an action might not always be the same even if the action is repeated in the same situation.

10
Q

How does an episodic environment differ from a sequential environment?

A

In an episodic environment, the agent’s interactions are broken into separate, independent episodes with clear beginnings and ends (e.g. playing a game level). Each episode resets the environment, and the agent’s goal is to maximise reward within that single episode.

In a sequential environment, the outcome of the agent’s current action influences future decisions, rewards and states, e.g. an autonomous car driving.

11
Q

What is the difference between reflex agents and model-based agents?

A

Reflex agents make decisions solely based on current percepts, without consideration of history or internal states. Model-based agents maintain an internal model of the world that helps them keep track of past percepts and update their knowledge, allowing for more informed decision-making.
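
A hedged Python sketch of the contrast; the percepts, actions and stopping rule are made up for illustration only.

```python
# Sketch contrasting a reflex agent with a model-based agent
# (hypothetical percepts/actions, purely for illustration).

def reflex_agent(percept):
    # Decision depends only on the current percept, no memory.
    return "Suck" if percept == "Dirty" else "Move"

class ModelBasedAgent:
    def __init__(self):
        self.history = []  # internal model: remembered percepts

    def act(self, percept):
        self.history.append(percept)          # update the internal model
        if percept == "Dirty":
            return "Suck"
        if all(p == "Clean" for p in self.history[-2:]):
            return "Stop"                     # use remembered history to decide
        return "Move"
```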

12
Q

What are goal-based and utility-based agents?

A

Goal-based agents act with a specific goal in mind and search for a sequence of actions that will achieve that goal. Utility-based agents not only aim for a goal but also try to maximize a utility function that quantifies the desirability of different outcomes.

13
Q

Is ChatGPT a model-based agent? Why or why not?

A

Yes, ChatGPT can be considered a model-based agent because it maintains an internal state (the conversation history) and uses this context to generate coherent responses based on past interactions.

14
Q

What is embodied AI, and why is it important?

A

Embodied AI emphasizes creating AI systems that physically interact with the real world, just like humans. This approach is important because simulating real-world conditions is complex, and physical interaction often presents challenges that are hard to model accurately in a simulation.

15
Q

What is an AI agent?

A

A machine that perceives and acts in an environment

16
Q

What are sensors/percepts?

A

Inputs from the environment

17
Q

What are actuators/actions?

A

Outputs or responses to the environment

18
Q

What is an agent function?

A

a formal mapping from percepts to actions - f : P -> A

19
Q

What is an agent program

A

The software that implements the agent function

20
Q

What is a performance measure?

A

a metric to evaluate an agent’s success

21
Q

What is an environment?

A

The world in which the agent operates, which can vary (deterministic vs stochastic, fully observable vs partially observable, etc.)

22
Q

What is a reflex vs. a model-based agent

A

Reflex agents act on immediate inputs, while model-based agents use internal states and history.

23
Q

Describe the agent type and environment of a self-driving car.

A

Environment: Partially observable, stochastic, sequential, multi-agent. Agent Type: Goal-based agent (utility-based if optimizing for efficiency).

24
Q

Describe the agent type and environment of a chess playing AI.

A

Environment: Fully observable, deterministic, sequential, multi-agent. Agent Type: Goal-based agent (can also be utility-based).

25
Q

Describe the agent type and environment of a weather forecasting system.

A

Environment: Partially observable, stochastic, sequential, single-agent. Agent Type: Model-based agent.

26
Q

What is a state-search problem?

A

A state-search problem involves finding an unknown path from an initial state to a goal state in a defined space of possible states. Each state can transition to others through specific actions, and the goal is to identify the correct sequence of actions to reach the goal state.

27
Q

How does a look-ahead tree search work?

A

The search starts from the initial state, creating a root node. It systematically expands nodes by selecting valid actions, transitioning between states using a transition model. The process continues until the goal state is reached, then the search traces back through the nodes to identify the action sequence leading to the goal.

28
Q

What are the key components of a look ahead state based search problem?

A

Initial state s0
Set of possible actions for that state {a1, a2, …}
Transition model that returns the state s’ from taking action a in state s
Goal test that tells the agent whether the current state is the goal state
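
A minimal sketch of these four components as a Python problem definition, using an assumed toy one-dimensional world (not from the cards):

```python
# Sketch of the four components as a small problem definition
# (a toy 1-D world where the agent moves from 0 to 3; assumed example).

class ToyProblem:
    initial_state = 0                        # s0

    def actions(self, s):                    # {a1, a2, ...} available in state s
        return ["left", "right"]

    def result(self, s, a):                  # transition model: s' = result(s, a)
        return s - 1 if a == "left" else s + 1

    def goal_test(self, s):                  # is s the goal state?
        return s == 3
```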

29
Q

What is the transition model in a state-based search problem?

A

Essentially the rules of the game that define how each action changes the current state: given the current state and an action, it returns the resulting state after the action is taken. It tells the agent about the next state, predicting where the agent would end up if it took a specific action from the current state; it doesn’t actually move the agent.

30
Q

What are the different types of search methods?

A

Uninformed Search: No additional information beyond the problem definition is available (e.g., BFS, DFS).
Informed Search: Uses heuristics to guide the search toward the goal more efficiently (e.g., A*).
Adversarial Search: Involves competing agents with opposing goals, like in games (e.g., Minimax).
Stochastic Search: Handles uncertainty in transitions between states by considering probabilistic outcomes

31
Q

What are the two algorithms/frameworks to implement state based searches and what types of searches do they work best for?

A

Tree search and graph search.
Tree search - uninformed search (and some informed searches) where you don’t need to check for previously visited states.

Graph search - complex informed search problems, and stochastic or adversarial problems where you can loop back to previous states; it avoids repeated exploration by keeping track of explored nodes.

32
Q

What is the difference between BFS and DFS in tree search algorithms?

A

Breadth-first search (BFS) explores nodes level by level, using a queue to expand the nodes in order of adding them to the fringe. It’s complete and optimal for uniform costs but can be memory-intensive.

Depth-first search (DFS) expands the most recent node added to the fringe, exploring as deep as possible along each branch before backtracking, using a stack. It’s memory-efficient but can miss solutions in infinite or looping paths and isn’t guaranteed to be optimal.
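
A sketch of how the two differ in code, assuming a problem object with the interface sketched on card 28; swapping a FIFO queue for a LIFO stack is the only change.

```python
# Sketch: the only difference between BFS and DFS tree search here is whether
# the fringe is used as a FIFO queue or a LIFO stack (problem object assumed).
from collections import deque

def tree_search(problem, use_bfs=True):
    fringe = deque([(problem.initial_state, [])])     # (state, path of actions)
    while fringe:
        state, path = fringe.popleft() if use_bfs else fringe.pop()
        if problem.goal_test(state):
            return path
        for a in problem.actions(state):
            fringe.append((problem.result(state, a), path + [a]))
    return None  # fringe exhausted: no solution found
```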

33
Q

What are completeness and optimality?

A

Completeness - does the search strategy always find a solution if one exists in the search space?
Optimality - does it always find the least-cost or best solution?

34
Q

Whats a good compromise between DFS and BFS and how does it work?

A

Iterative deepening search.

Combines the space efficiency of DFS and the completeness of BFS by repeatedly applying depth-limited search with increasing depth limits. It explores all nodes at a given depth before increasing the limit, ensuring a solution is found if one exists while keeping memory usage low.
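
A sketch of the idea, again assuming the same problem interface as in card 28; the maximum depth value is arbitrary.

```python
# Sketch of iterative deepening: repeated depth-limited DFS with an
# increasing depth limit (problem interface and max_depth are assumptions).

def depth_limited_search(problem, state, limit, path=()):
    if problem.goal_test(state):
        return list(path)
    if limit == 0:
        return None                                    # cutoff reached
    for a in problem.actions(state):
        found = depth_limited_search(problem, problem.result(state, a),
                                     limit - 1, path + (a,))
        if found is not None:
            return found
    return None

def iterative_deepening_search(problem, max_depth=50):
    for limit in range(max_depth + 1):                 # 0, 1, 2, ... increasing limits
        result = depth_limited_search(problem, problem.initial_state, limit)
        if result is not None:
            return result
    return None
```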

35
Q

What does admissible and consistent describe and what do they each mean?

A

They describe a heuristic for A* search.

Admissible (it never overestimates the true cost to reach the goal) applies to tree search; consistent (the estimated cost is always less than or equal to the step cost plus the estimated cost from any neighbouring state) applies to graph search.

36
Q

In general, does it make sense to talk about optimality in terms of the path through the state space when it comes to a genetic algorithm?

A

NO. Genetic algorithms explore multiple solutions simultaneously and improve them through selection, crossover, and mutation, rather than following a specific path like traditional search algorithms. They don’t guarantee finding the optimal solution but aim for good solutions quickly.

37
Q

For a game with multiple players where you can play against an AI player, what would be a good algorithm to implement, and how does it work?

A

Minimax algorithm with alpha-beta pruning. It evaluates possible moves to maximize the player’s chances of winning while minimizing the opponent’s chances.

38
Q

For a game with multiple players where you can play against an AI player, how would you adjust difficulty levels?

A

Difficulty can be adjusted by altering the depth of the search tree. For easier levels, the AI could evaluate fewer moves, while for harder levels, it could search deeper for optimal play. You’d need a depth parameter

39
Q

What game mechanics should a game where you’re playing against an AI agent have to ensure the AI can play optimally?

A

To facilitate optimal play, game mechanics should include well-defined rules, a finite state space, and clear win/loss conditions. This ensures the AI can effectively evaluate outcomes and make strategic decisions.

40
Q

A* search is used to solve the 8-puzzle, where you slide numbered squares into the right order. Given that both heuristics are admissible and consistent, does it matter which is used to get the most optimal and fastest solution? Heuristics: misplaced tile count, and total Manhattan distance of each misplaced tile from its goal location.

A

Yes. Both will give optimal solutions since both are admissible.

But Manhattan distance is usually faster/more efficient because it provides a more precise estimate of tile movement, as it accounts for the actual distance of the tiles from their goal positions, leading to fewer node expansions in the search.
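
A sketch of the two heuristics, assuming states are 9-tuples with 0 for the blank and an assumed goal ordering:

```python
# Sketch of the two 8-puzzle heuristics; the state encoding and goal
# ordering (1..8 with the blank last) are assumptions for illustration.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def misplaced_tiles(state):
    # Count tiles (not the blank) that are not in their goal position.
    return sum(1 for i, tile in enumerate(state)
               if tile != 0 and tile != GOAL[i])

def manhattan_distance(state):
    # Sum of |row difference| + |column difference| for every tile.
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        goal_i = GOAL.index(tile)
        total += abs(i // 3 - goal_i // 3) + abs(i % 3 - goal_i % 3)
    return total
```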

41
Q

Is the minimax algorithm with alpha beta pruning a DFS or BFS?

A

Minimax with alpha-beta pruning is a depth-first search (DFS) algorithm. It explores the game tree by going deep into the branches first, making it efficient in evaluating potential moves and pruning branches that won’t affect the final decision.

42
Q

What’s the purpose of alpha-beta pruning in the minimax search, and under what conditions does it work best?

A

To reduce the number of nodes evaluated in the minimax algorithm by eliminating branches that won’t influence the final decision.

It works best when the game tree is well-ordered (ie. considers best moves first), allowing the algorithm to prune large sections of the tree early. This results in faster search times while still ensuring optimality in the chosen moves.
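
A sketch of minimax with alpha-beta pruning; the game interface (is_terminal, utility, moves, play) is assumed, and the depth parameter is the difficulty knob mentioned in card 38.

```python
# Sketch of minimax with alpha-beta pruning over an assumed game interface.

def alphabeta(game, state, depth, alpha=float("-inf"), beta=float("inf"),
              maximizing=True):
    if depth == 0 or game.is_terminal(state):
        return game.utility(state)            # static evaluation of the position
    if maximizing:
        value = float("-inf")
        for move in game.moves(state):
            value = max(value, alphabeta(game, game.play(state, move),
                                         depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                         # prune: MIN will never allow this line
        return value
    value = float("inf")
    for move in game.moves(state):
        value = min(value, alphabeta(game, game.play(state, move),
                                     depth - 1, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break                             # prune: MAX will never allow this line
    return value
```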

43
Q

How do you calculate entropy for decision tree classifiers/Bayes classifier?

A
H = -a log2(a) - b log2(b), where a and b are the proportions of examples in the two classes (a + b = 1).
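
A small sketch that evaluates this formula numerically (the example proportions are arbitrary):

```python
# Sketch: entropy of a two-class split, matching H = -a*log2(a) - b*log2(b).
import math

def entropy(a, b):
    # a and b are the proportions of the two classes (a + b = 1); example values below are arbitrary.
    terms = [p * math.log2(p) for p in (a, b) if p > 0]   # treat 0*log2(0) as 0
    return -sum(terms)

print(entropy(0.5, 0.5))  # 1.0 bit: maximally uncertain split
print(entropy(0.9, 0.1))  # about 0.47 bits: a fairly pure split
```
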
44
Q

Why might a fully consistent tree not be the best predictor of the game result for a decision tree classifier? What
could be done to alleviate the problem?

A

Because it can overfit the training data, meaning it won’t generalise, i.e. it will perform poorly on new data.

Pruning: Simplifying the tree by removing less important nodes.

45
Q

What are the principles for building an optimal splits decision tree.

A

Choose attributes that provide the most information gain or reduce uncertainty about the outcome, split the data at each node accordingly, and continue this process until a stopping criterion is met. This approach helps ensure the tree generalizes well to unseen data while accurately predicting outcomes.

46
Q

What is a consistent hypothesis?

A

One that agrees with all the training examples

47
Q

If multiple hypotheses are consistent with a training set, how do you choose between them?

A

Occam’s Razor - prefer the simplest hypothesis, as simpler hypotheses generalise better to new examples.

48
Q

Is a consistent hypothesis or testing a better indicator of hypothesis performance?

A

Testing is the better indicator (it verifies the performance of the model on held-out data). Good performance (consistency) on the training data isn’t necessarily a good indicator of generalisation; your hypothesis could be overtrained.

49
Q

What is reinforcement learning?

A

Reinforcement learning trains an agent to make decisions in an unknown environment to maximise cumulative reward. Over time the agent builds a model of the environment based on its interactions, which helps improve decision-making.

The agent takes an action, gets feedback (rewards/penalties), and learns by balancing exploration (trying new actions) and exploitation (using known rewarding actions). Through repeated interaction the agent refines its strategy to achieve its goals. Algorithms like Q-learning help the agent estimate the value of state-action pairs, while policy-based methods adjust the agent’s strategy directly.
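
A sketch of the tabular Q-learning update described above; the learning rate, discount factor and epsilon values are arbitrary assumptions.

```python
# Sketch of the tabular Q-learning update rule
# (alpha = learning rate, gamma = discount, epsilon = exploration rate;
# the numeric values are assumptions, not from the cards).
import random
from collections import defaultdict

Q = defaultdict(float)              # Q[(state, action)] -> estimated value
alpha, gamma, epsilon = 0.1, 0.9, 0.1

def choose_action(state, actions):
    # Exploration vs exploitation: occasionally try a random action.
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state, next_actions):
    best_next = max(Q[(next_state, a)] for a in next_actions)
    # Move the estimate toward reward + discounted best future value.
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```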

50
Q

What is a sequential decision problem?

A

A special case of search and planning. The environment is a set of distinct states and is non-deterministic (stochastic), which means there is no guarantee of the same outcome from taking the same action in the same state.

51
Q

How do the components of a sequential decision problem differ from the base search-problem components?

A

Transition model - returns a probability distribution over the possible states that result from taking action a in state s.
A reward associated with each state (zero or negative).

52
Q

What is reinforcement learning and how are its components different?

A

RL is a set of methods that allow an agent to learn what to do in an unknown stochastic environment: it deduces the optimal policy (i.e. what to do in a given state) from the rewards it receives at the end of a sequence of steps (not necessarily knowing which states in the sequence relate to this outcome).

There is no transition model; there is a reward associated with each state.

53
Q

Explain how a lookahead search works.

A

Lookahead search allows an agent to evaluate future states by “looking ahead” at each possible route and calculating the cost associated with moving from one town to another. By systematically considering paths and their distances, the agent can generate a sequence of actions that leads from the starting town (e.g., Otepoti) to the destination (e.g., Areketanara) by selecting the path with the least cost (e.g., the shortest or most optimal route).

54
Q

Why does steepest gradient descent optimisation work for linear regressions, binary classifiers, perceptrons, and polynomial regressions?

A

Steepest gradient descent works for all the models because it optimizes the loss function by iteratively adjusting the model parameters in the direction of the steepest decrease of the loss.

55
Q

Explain steepest gradient optimisation of the parameters of a hypothesis in terms
of a state-space search.

A

State: Represents the current values of the hypothesis parameters.
Actions: Adjust the parameters in the direction of the steepest descent (the negative gradient).
Evaluation Function: Measures the loss or cost associated with the current parameters.
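
A sketch of this search over parameter space for a simple linear hypothesis; the data, learning rate and epoch count are assumptions for illustration.

```python
# Sketch of steepest gradient descent for a 1-D linear hypothesis y = w*x + b,
# minimising mean squared error (the data and hyperparameters are assumed).

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # generated by y = 2x + 1
w, b, lr = 0.0, 0.0, 0.05          # "state": current parameter values

for epoch in range(500):           # each step is an "action" along the negative gradient
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad_w               # move against the gradient of the loss
    b -= lr * grad_b

print(round(w, 2), round(b, 2))    # approaches w = 2, b = 1
```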

56
Q

What is the role of the gradient in terms of the search (steepest gradient)?

A

To identify the direction of steepest increase in the error/loss function. By taking the negative of the gradient, the algorithm determines how to adjust the model parameters to minimise the error, guiding the search towards optimal solutions.

57
Q

Assuming the goal of the steepest gradient look-ahead search is to find a minimum, what can be said about the completeness and optimality of this search?

A

The search is not guaranteed to be complete or optimal, as it may get stuck in local minima or saddle points, depending on the shape of the loss landscape.

58
Q

Reinforcement learning is just supervised learning with the additional complication
that the agent is responsible for generating its own training data. Do you agree?
Explain why or why not.

A

No. SL aims to minimise error on labeled data, while RL focuses on maximising cumulative reward over time, which depends on delayed rewards and the exploration-vs-exploitation trade-off, neither of which are present in SL.

In RL the agent learns from the consequences of its actions, receiving feedback from the environment, whereas SL relies on static labeled data without interactive feedback. Their learning objectives are distinct.

59
Q

What does the term epoch refer to in machine learning?

A

one complete pass through the entire training dataset during the training process.

60
Q

In a neural network (aka multi-layer perceptron), what are the different activation functions?

A

ReLU: Or(v) = max(0, v)

Sigmoid: Os(v) = 1 / (1 + e^(-v))
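
The same two functions as a small Python sketch:

```python
# Sketch of the two activation functions (standard definitions, for illustration).
import math

def relu(v):
    return max(0.0, v)                 # Or(v) = max(0, v)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))  # Os(v) = 1 / (1 + e^(-v))
```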

61
Q

What are the signs a neural network is not trained by the perceptron learning rule?

A

It is not trained by the perceptron learning rule if there are non-linear activation functions (Os or Or), if there are multiple layers (i.e. two or more weight matrices), or if the output structure is non-binary (e.g. a vector or matrix of outputs); the perceptron rule is designed for a single binary output.

62
Q

Are completeness and optimality guaranteed for A*? If so, why?

A

Yes, if the heuristic is admissible (it never overestimates the cost to reach the goal from any node) for tree search, and consistent for graph search.
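
A sketch of A* with a frontier ordered by f(n) = g(n) + h(n); the unit step cost and the problem interface (as in card 28) are assumptions.

```python
# Sketch of A* graph search; assumes a problem object with initial_state,
# actions, result and goal_test, a unit step cost, and a heuristic h(state).
import heapq
import itertools

def a_star(problem, h):
    counter = itertools.count()                        # tie-breaker for the heap
    start = problem.initial_state
    frontier = [(h(start), next(counter), 0, start, [])]
    best_g = {start: 0}                                # cheapest known cost to each state
    while frontier:
        f, _, g, state, path = heapq.heappop(frontier)
        if problem.goal_test(state):
            return path
        for a in problem.actions(state):
            s2, g2 = problem.result(state, a), g + 1   # unit step cost assumed
            if g2 < best_g.get(s2, float("inf")):      # only keep improvements
                best_g[s2] = g2
                heapq.heappush(frontier,
                               (g2 + h(s2), next(counter), g2, s2, path + [a]))
    return None
```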

63
Q

Explain how a decision tree is built and how it works.

A

Choose a splitting criterion (e.g. entropy / information gain).
Recursively evaluate each feature to determine the best split that maximises the criterion.
The feature that provides the best split becomes the root node of the tree; create child nodes based on this split and repeat the process for each node until stopping criteria are met.
Assign class labels to the leaf nodes based on the label that appears most frequently in each node.
Optionally prune the tree to enhance generalisation.
Traverse the tree for new instances to predict their class.
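
A sketch of the split-selection step (the first three steps above); the example representation and the "label" key are assumptions for illustration.

```python
# Sketch of choosing the best split by information gain; examples are assumed
# to be dicts of feature -> value plus a "label" key (an illustrative choice).
import math
from collections import Counter

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, feature, label_key="label"):
    labels = [e[label_key] for e in examples]
    before = entropy(labels)                      # uncertainty before splitting
    after = 0.0
    for value in {e[feature] for e in examples}:  # weighted entropy of each branch
        subset = [e[label_key] for e in examples if e[feature] == value]
        after += len(subset) / len(examples) * entropy(subset)
    return before - after

def best_split(examples, features):
    # The feature with the highest information gain becomes the next node.
    return max(features, key=lambda f: information_gain(examples, f))
```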
