AI Flashcards
What is an “environment”?
An environment is where an AI performs its tasks, for example a maze.
What is the Bellman Equation?
It writes the value of a decision problem at a certain point in time in terms of the payoff from some initial choices and the value of the remaining decision problem that results from those initial choices.
The Bellman equation assigns points to each state: the best outcome gets the highest points and the worst outcome gets the lowest.
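As a rough illustration, the sketch below applies the deterministic form of the Bellman equation, V(s) = max over actions of [R(s, a) + gamma * V(s')], to a hypothetical five-cell corridor. The corridor, the reward of 1.0 for stepping into the goal, and the discount factor gamma = 0.9 are all assumptions made for this example.

```python
# Hypothetical corridor: cells 0..4, cell 4 is the goal (terminal).
# Assumed values: reward 1.0 for stepping into the goal, discount GAMMA = 0.9.
N_CELLS = 5
GOAL = 4
GAMMA = 0.9


def step(state, action):
    """Deterministic move: action is -1 (left) or +1 (right); walls block."""
    next_state = min(max(state + action, 0), N_CELLS - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def value_iteration(sweeps=50):
    """Repeatedly apply V(s) = max_a [R(s, a) + GAMMA * V(s')]."""
    values = [0.0] * N_CELLS
    for _ in range(sweeps):
        new_values = values[:]
        for state in range(N_CELLS):
            if state == GOAL:
                continue  # the terminal state keeps value 0
            candidates = []
            for action in (-1, 1):
                next_state, reward = step(state, action)
                candidates.append(reward + GAMMA * values[next_state])
            new_values[state] = max(candidates)
        values = new_values
    return values


if __name__ == "__main__":
    print([round(v, 3) for v in value_iteration()])
```

Cells closer to the goal end up with more points: the cell next to the goal gets 1.0, and each cell further away gets 0.9 times the value of its better neighbour.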
What is a Markov Decision Process (MDP)?
It is a process in which outcomes are partly random and partly under the control of the decision maker. It provides a mathematical framework for decision making.
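A minimal sketch of that split between control and chance: the decision maker picks the action, and probabilities decide which outcome follows. The outcome table, its probabilities, and the discount factor below are made-up numbers for illustration only.

```python
GAMMA = 0.9  # assumed discount factor

# Hypothetical one-state example: for each action, a list of
# (probability, reward, value_of_next_state) outcomes. All numbers are made up.
OUTCOMES = {
    "up": [(0.8, 0.0, 1.0), (0.1, 0.0, 0.5), (0.1, -1.0, 0.0)],
    "left": [(0.8, 0.0, 0.5), (0.1, 0.0, 1.0), (0.1, 0.0, 0.5)],
}


def expected_value(outcomes):
    """Probability-weighted sum of reward + GAMMA * next-state value."""
    return sum(p * (r + GAMMA * v) for p, r, v in outcomes)


def best_action(all_outcomes):
    """The decision maker controls the action; chance controls the outcome."""
    return max(all_outcomes, key=lambda a: expected_value(all_outcomes[a]))


if __name__ == "__main__":
    for name, outcomes in OUTCOMES.items():
        print(name, round(expected_value(outcomes), 3))
    print("best:", best_action(OUTCOMES))
```

With these assumed numbers, "up" has the higher expected value even though it carries a small risk of a negative reward.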
What is an “agent”?
An agent is the artificial intelligence that performs actions inside the environment. It learns from the feedback the environment gives it.
What is a plan?
A plan is like a treasure map for the AI. It indicates which direction the agent should move in.
What is deterministic search?
In a deterministic search, the outcome of each action is fixed in advance.
This means that if the agent chooses an action, it will perform that action 100% of the time.
What is non-deterministic search?
This is when the environment mimics a real-world application. If the agent chooses an action, random factors are in play, so it is not guaranteed to perform exactly that action.
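One way to picture the difference is a "slip" model, sketched below. The 0.8 probability of performing the chosen action and the sideways alternatives are assumptions for illustration; setting `p_intended=1.0` gives the deterministic case.

```python
import random

# Assumed slip model for illustration: the chosen move happens with
# probability p_intended; otherwise the agent slips to a sideways move.
SIDEWAYS = {
    "up": ("left", "right"),
    "down": ("left", "right"),
    "left": ("up", "down"),
    "right": ("up", "down"),
}


def performed_action(chosen, rng, p_intended=0.8):
    """Return the action actually performed for a chosen action."""
    if rng.random() < p_intended:
        return chosen
    return rng.choice(SIDEWAYS[chosen])


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so runs are repeatable
    moves = [performed_action("up", rng) for _ in range(1000)]
    print({m: moves.count(m) for m in ("up", "left", "right")})
```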
What is the Markov property?
It is when the future state depends only on the state you are in now, not on the states that came before it.
What does stochastic mean?
It means that there is some randomness.
What is a “living penalty”?
A living penalty is a small reward the agent receives for each action it takes on the way to the goal. It is called a “living penalty” because the reward is given as a negative number rather than a positive one. The incentive is then to complete the goal as fast as possible.
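A small worked example of that incentive, assuming a goal reward of +1.0 and a living penalty of -0.04 per step (both numbers are made up for illustration):

```python
# Assumed numbers for illustration: +1.0 for reaching the goal and a
# living penalty of -0.04 for every step taken before the final one.
GOAL_REWARD = 1.0
LIVING_PENALTY = -0.04


def total_reward(steps_to_goal):
    """Undiscounted sum of rewards for a path that reaches the goal."""
    # Every step before the final one costs the living penalty;
    # the final step earns the goal reward.
    return (steps_to_goal - 1) * LIVING_PENALTY + GOAL_REWARD


if __name__ == "__main__":
    print(round(total_reward(3), 2))   # short path
    print(round(total_reward(10), 2))  # long path
```

Under these assumptions the 3-step path totals 0.92 and the 10-step path totals 0.64, so the shorter path is preferred.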
What is the intuition behind Q-learning?
Q-learning is a reinforcement learning technique used in machine learning. The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances. It does not require a model of the environment and can handle problems with stochastic transitions and rewards without requiring adaptations.
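The standard tabular update can be sketched as below. It uses only an observed transition (state, action, reward, next state), which is why no model of the environment is needed. The learning rate, the discount factor, the two-action set, and the state names are assumptions for this example.

```python
from collections import defaultdict

ALPHA = 0.5  # assumed learning rate
GAMMA = 0.9  # assumed discount factor
ACTIONS = ("left", "right")  # hypothetical action set


def q_update(q_table, state, action, reward, next_state, terminal=False):
    """One tabular Q-learning step from an observed transition."""
    if terminal:
        best_next = 0.0  # nothing more to earn after a terminal state
    else:
        best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (target - q_table[(state, action)])
    return q_table[(state, action)]


def greedy_action(q_table, state):
    """The learned policy: pick the action with the highest Q-value."""
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


if __name__ == "__main__":
    q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    q_update(q, "s1", "right", 1.0, "goal", terminal=True)
    q_update(q, "s0", "right", 0.0, "s1")
    print(greedy_action(q, "s0"))
```

After these two updates, Q(s1, right) is 0.5 and Q(s0, right) is 0.225, so the greedy policy in s0 picks "right".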
How is Q different from V?
With Q we look for the value of each action taken in a state, rather than the value of each state.
This means it compares which action is more lucrative.
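A short sketch of how the two relate, assuming the agent always takes the best available action, so that V(s) is the largest Q(s, a). The Q-values below are made-up numbers for a single hypothetical state.

```python
# Hypothetical Q-values for a single state; the numbers are made up.
Q_VALUES = {"up": 0.66, "down": 0.10, "left": 0.49, "right": 0.30}


def state_value(q_values):
    """V(s): the value of the best action available in the state."""
    return max(q_values.values())


def most_lucrative_action(q_values):
    """Q also tells us which action achieves that value."""
    return max(q_values, key=q_values.get)


if __name__ == "__main__":
    print(state_value(Q_VALUES), most_lucrative_action(Q_VALUES))
```

V gives one number per state (0.66 here), while Q gives one number per action, which is what lets the agent pick "up".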
Why do we use the letter Q?
Probably because it stands for the word “quality”.