Lecture 3 - Optimisation and Hypothesis Space Flashcards
Map Example of Optimisation
REFER TO SLIDES
But essentially:
- Represent the map as nodes, then find the shortest (minimum-cost) path between them
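A minimal sketch of the map idea, assuming the map is given as a weighted graph and using Dijkstra's algorithm (the slides may use a different method). The `road_map` data and the name `shortest_path` are made up for illustration.

```python
import heapq

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm: return (total_cost, path), or (inf, []) if unreachable."""
    # Each queue entry is (cost so far, current node, path taken to get here).
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []

# Hypothetical road map: node -> {neighbour: distance}
road_map = {
    "A": {"B": 4, "C": 2},
    "B": {"D": 5},
    "C": {"B": 1, "D": 8},
    "D": {},
}
best_cost, best_path = shortest_path(road_map, "A", "D")
print(best_cost, best_path)
```

Here every route from A to D is a candidate solution, and total distance is the metric being minimised.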
Define Hypothesis Space
The set of all models or hypotheses that can be represented using the selected language or representation.
What are the key characteristics of Hypothesis Space?
Key characteristics:
* It is determined by your choice of language (language is defined later).
* It defines what’s possible to describe in your AI system.
* It includes all theoretical candidate solutions, regardless of whether they are “good” or “bad”.
REFER TO SLIDES FOR EXAMPLE
Define Candidate Solution
Definition: A single model within the hypothesis space; a potential solution to the problem.
Think of it as:
* An individual point inside the hypothesis space that is being tested or evaluated.
Define Solution Space
Definition: This is often used synonymously with hypothesis space, but in some contexts it refers to the space of all possible outputs or behaviours of the system based on the hypothesis space.
What are the three ingredients of optimisation?
Optimisation needs three ingredients to be in place before it can work properly:
- Language (Solution Space/Hypothesis Space)
- Model (Candidate Solution)
- Metric (How good is the model?)
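A small sketch of how the three ingredients fit together, under made-up assumptions: the language is "straight lines y = a*x + b with small integer a and b", a model is one (a, b) pair, and the metric is total absolute error on some invented observations.

```python
from itertools import product

# Language (hypothesis space): every line y = a*x + b with a, b in -3..3.
hypothesis_space = list(product(range(-3, 4), repeat=2))

def predict(model, x):
    # Model (candidate solution): one (a, b) pair picked from the space.
    a, b = model
    return a * x + b

def metric(model, data):
    # Metric: total absolute error on the data; lower is better.
    return sum(abs(predict(model, x) - y) for x, y in data)

# Hypothetical observations, generated from y = 2x - 1.
observations = [(0, -1), (1, 1), (2, 3)]

best_model = min(hypothesis_space, key=lambda m: metric(m, observations))
print(best_model, metric(best_model, observations))
```

Changing the language (for example, allowing only a = 0) changes which models exist at all, which is why the language is listed first.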
Define Language
Language: The formal system or structure used to describe possible solutions or hypotheses.
- Examples include:
○ Mathematical equations
○ Natural languages
○ Grammars
○ Logics
○ Finite automata/finite-state machines
○ Computer programs
○ Logic programs
○ Gantt charts
○ PERT charts
○ Simulation languages
○ Popsticks and glue
Why is Language considered important?
If you can't describe it, you can't model it. A language is what allows you to describe it.
Generating vs Parsing (Generate vs Test)
○ Parsing (Testing): Determines whether a given solution is valid within a language (efficient), such as the example above.
○ Generating: Enumerates all valid solutions (inefficient; the set is often infinite).
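To make the contrast concrete, here is a sketch using a toy language chosen for illustration (strings of balanced parentheses, not necessarily the slides' example). Testing one string takes a single pass; generating has to enumerate every valid string, and the count grows quickly with length.

```python
def is_balanced(s):
    """Parsing/testing: is this one string in the language?"""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:      # closed more than we opened
                return False
        else:
            return False       # symbol outside the alphabet
    return depth == 0

def generate_balanced(pairs):
    """Generating: yield every balanced string with exactly `pairs` pairs."""
    def extend(prefix, opened, closed):
        if opened == pairs and closed == pairs:
            yield prefix
            return
        if opened < pairs:
            yield from extend(prefix + "(", opened + 1, closed)
        if closed < opened:
            yield from extend(prefix + ")", opened, closed + 1)
    yield from extend("", 0, 0)

print(is_balanced("(()())"))
print(list(generate_balanced(3)))
```

Without the limit on `pairs`, the generator would never finish, because the language contains infinitely many strings.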
What is the Expressiveness of a Language?
Expressiveness: Some languages can express more complex ideas or solutions than others.
- When everything that can be described in one language (B) can also be described in another language (A), we say A subsumes B
- If something can be described in one language (A) but not in another language (B), we say B does not subsume A
What is the Chomsky Hierarchy?
A layered ordering of language classes by expressiveness (Type 3 < Type 2 < Type 1 < Type 0).
What does the Chomsky Hierarchy look like?
It is built on four types, Type 3 < Type 2 < Type 1 < Type 0, with Type 3 being the least expressive and Type 0 the most expressive.
The levels are:
Type 3 are Regular Languages, such as anything a regular expression (regex) can match
Type 2 are Context-Free Languages, such as matching parentheses
Type 1 are Context-Sensitive Languages, such as matching symbol counts across three groups (e.g. aⁿbⁿcⁿ)
Type 0 are Recursively Enumerable Languages, which are any languages that can be recognised by a computer program (a Turing machine)
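A sketch of one textbook language per level, written as simple recognisers. The specific languages are standard examples picked for illustration: a*b* is regular, aⁿbⁿ is context-free but not regular, and aⁿbⁿcⁿ is context-sensitive but not context-free. The function names are made up.

```python
import re

def in_a_star_b_star(s):
    # Type 3 (regular): any number of a's followed by any number of b's.
    return re.fullmatch(r"a*b*", s) is not None

def in_an_bn(s):
    # Type 2 (context-free): the number of a's must equal the number of b's,
    # which requires counting that a finite automaton cannot do.
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n

def in_an_bn_cn(s):
    # Type 1 (context-sensitive): three counts must agree at once,
    # which a context-free grammar cannot enforce.
    n = len(s) // 3
    return len(s) % 3 == 0 and s == "a" * n + "b" * n + "c" * n

for text in ["aab", "aabb", "aabbcc"]:
    print(text, in_a_star_b_star(text), in_an_bn(text), in_an_bn_cn(text))
```

All three recognisers are ordinary programs, so all three languages also sit inside Type 0; the hierarchy is about the weakest formalism that can describe each one.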
Why is the Chomsky Hierarchy important (why does it matter)?
It gives us an idea of:
○ What kinds of models can be described
○ How complex the models can be
○ What computational resources are needed to test or parse them
Define Model
Model: A specific instance of a hypothesis described in the chosen language.
- Essentially this is an abstraction or approximation of the real world
- An instance of all the possible things that can be described in the language
- In the context of AI, it is a candidate solution to a problem.
Why are Models important - what do they represent?
The model is the subject of evaluation. It represents our best attempt at mimicking or understanding the target system or data.
What are some key characteristics of Models (Key Ideas)
- Models may not be perfect representations; they approximate.
- All real-world systems can be represented as functions.
- The model space is sometimes too large to search exhaustively.
Define Metric (Evaluation)
Definition: A function used to assess how good a model (hypothesis) is in comparison to the target (or real-world phenomenon).
Also called: Error function, Cost function, Fitness function, Objective function, Penalty function, Utility function
Breakdown of Metric and Types of Metrics
Formally:
* A function from hypothesis space to real numbers: f : H -> ℝ
○ f is a function.
○ It takes an input from the set H — the hypothesis space.
○ It outputs a real number (ℝ), which represents a metric or evaluation score (e.g., error, cost, fitness).
Types of Metrics:
- Numerical: e.g., Mean Squared Error (MSE)
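A minimal sketch of a numerical metric, assuming a hypothesis is a Python function from x to a predicted y and the target is a small list of invented (x, y) pairs. `mean_squared_error` plays the role of f : H -> ℝ.

```python
def mean_squared_error(hypothesis, data):
    """Average squared gap between predictions and observed values (lower is better)."""
    return sum((hypothesis(x) - y) ** 2 for x, y in data) / len(data)

# Hypothetical observations that follow y = 2x + 1 exactly.
data = [(0, 1), (1, 3), (2, 5)]

def h_good(x):
    return 2 * x + 1

def h_poor(x):
    return x

print(mean_squared_error(h_good, data))
print(mean_squared_error(h_poor, data))
```

Only the ordering of the two scores matters for choosing between hypotheses; the raw numbers have no meaning on their own.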
Why are Metrics important?
We need a way to measure which solution is better. This is critical in optimisation.
What are the issues with Metrics, and what can be done instead?
- Usually we are happy if we can determine relative closeness
- Doesn’t need to be meaningful in an absolute sense, only relative to another
- Sometimes we don’t know much about the “real” thing
○ Can just assume it's infinitely ‘good’
○ Seek the best hypothesis
What is the Ideal 1 Definition of Optimisation?
Find a model within the hypothesis space that is indistinguishable from the target (zero error).
- This is the theoretical best-case scenario.
- You’re aiming to find a model that perfectly replicates the real-world target.
- That means: when evaluated by the metric, the error is exactly zero.
Why is the Ideal 1 Definition of Optimisation important?
It gives us a goalpost—a target to aim for in optimisation.
Useful for evaluating how expressive your language is: if your representation can’t describe the perfect model, ideal optimisation is impossible.
What are the limitations of the Ideal 1 Definition of Optimisation?
- In real-world problems:
○ The hypothesis space might not contain the exact real-world model.
○ Real-world data is often noisy or incomplete.
○ Models are approximations, not exact replicas.
○ Ideal optimisation becomes intractable when the space is too large or the model is too complex.
What is the Ideal 2 Definition of Optimisation?
Find a model in the hypothesis space that is closest (minimal error) to the target.
- You’re no longer aiming for zero error, but the smallest possible error that can be achieved given your hypothesis space.
- Still assumes perfect knowledge of the metric and ability to search the space effectively.
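The two ideal definitions can be written side by side using the metric f : H -> ℝ. This assumes f is an error-style metric where lower is better; the symbol h* for the best hypothesis is introduced here for convenience.

```latex
% Ideal 1: some hypothesis in H has exactly zero error
\[
  \exists\, h \in H \ \text{such that}\ f(h) = 0
\]

% Ideal 2: the hypothesis in H with the smallest achievable error
\[
  h^{*} = \arg\min_{h \in H} f(h)
\]
```

Ideal 1 is the special case of Ideal 2 in which the minimum value of f happens to be 0.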
Why is the Ideal 2 Definition of Optimisation important?
- This is the standard definition of optimisation in most of machine learning.
- You’re finding the best approximation that your system can express.
- This model is often referred to as the best hypothesis in your hypothesis space.
What are the limitations of the Ideal 2 Definition of Optimisation?
- Finding the best model in complex or infinite hypothesis spaces is often computationally infeasible (limited by search complexity).
- The metric sometimes has many local minima, so a search can settle in one that is not the global best.
What is the Practical Definition of Optimisation?
Find a model within the hypothesis space (determined by the language) that is as close as possible (good enough) to the target within a specified amount of compute (time and space)
- This is what actually happens in real-world AI systems.
- You aim for a model that gives acceptable performance within real-world constraints:
○ Time (e.g., real-time decisions)
○ Memory/space (e.g., embedded devices)
○ Energy (e.g., mobile computing)
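One way to picture the practical definition is a search that stops when it either runs out of budget or finds something good enough. The sketch below uses plain random search and counts metric evaluations as a stand-in for compute; the function name, parameters and toy problem are all assumptions for illustration.

```python
import random

def budgeted_random_search(metric, sample, budget, good_enough, seed=0):
    """Return (best candidate, its score, evaluations used); lower score is better."""
    rng = random.Random(seed)          # seeded so runs are repeatable
    best, best_score = None, float("inf")
    evaluations = 0
    # Stop when the compute budget is spent OR the result is good enough.
    while evaluations < budget and best_score > good_enough:
        candidate = sample(rng)
        score = metric(candidate)
        evaluations += 1
        if score < best_score:
            best, best_score = candidate, score
    return best, best_score, evaluations

# Hypothetical problem: find x close to 3 in the range [-10, 10].
def error(x):
    return (x - 3) ** 2

def sample_candidate(rng):
    return rng.uniform(-10, 10)

best, score, used = budgeted_random_search(error, sample_candidate,
                                           budget=200, good_enough=0.01)
print(best, score, used)
```

Raising `budget` or lowering `good_enough` trades more compute for a closer answer, which is the accuracy vs speed trade-off in miniature.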
Why is the Practical Definition of Optimisation important?
- Most practical AI systems are resource-limited.
- There’s always a trade-off:
○ Accuracy vs speed
○ Accuracy vs memory
○ Accuracy vs interpretability
What are the limitations of the Practical Definition of Optimisation?
- “Good Enough” is Subjective
- Local Optima
- Limited Search Scope
- Adaptability Challenges
What is the “Good Enough” is Subjective limitation?
- “Good Enough” is Subjective
- How do you know what level of performance is “acceptable”?
- It might depend on:
○ Application risk (life-critical vs entertainment)
○ User expectations
○ Regulatory requirements
What is the Local Optima limitation?
- Local Optima
- Often, due to time/resource limits, the system converges to a local optimum, not the global best.
- Especially true in non-convex problems like neural networks.
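A small sketch of getting stuck, assuming plain gradient descent on a made-up one-dimensional error curve with two valleys. The curve f(x) = x^4 - 3x^2 + x and the step settings are chosen only for illustration.

```python
def f(x):
    # Hypothetical non-convex error curve with two valleys.
    return x**4 - 3 * x**2 + x

def gradient(x):
    # Derivative of f.
    return 4 * x**3 - 6 * x + 1

def gradient_descent(start, learning_rate=0.01, steps=1000):
    """Follow the downhill slope from `start` for a fixed number of steps."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x

x_from_right = gradient_descent(2.0)
x_from_left = gradient_descent(-2.0)
print(x_from_right, f(x_from_right))
print(x_from_left, f(x_from_left))
```

Starting on the right, the search settles in the shallower valley near x ≈ 1.13; starting on the left, it reaches the deeper valley near x ≈ -1.30. Both runs have converged, but only one found the global minimum.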
What is the Limited Search Scope limitation?
- Limited Search Scope
- If your compute budget is small, you might not explore enough of the hypothesis space.
- You could miss a much better model that’s just out of reach.
What is the Adaptability Challenges limitation?
- Adaptability Challenges
- In real-time environments, conditions change.
- The model might perform well initially, but degrade over time unless updated (requiring online learning or continual optimisation).
What is Online Optimisation?
- Online optimisation:
○ Needs to run in real-time (e.g., autonomous vehicles, robot control).
○ Must respond quickly, even if it’s not the best possible model.
○ Design Priorities:
* Prioritise speed over perfection.
* Use lightweight models or approximate solutions.
* May use heuristics or simplified models to ensure fast responses.
What is Offline Optimisation?
- Offline optimisation:
○ Occurs before deployment.
○ Can take longer (e.g., training a model overnight).
○ Focus is on achieving better accuracy, even if it takes more time.
○ Design Priorities:
* Prioritise accuracy and generalisation.
* Use heavyweight models and explore larger hypothesis spaces.
* Perform hyperparameter tuning, ensemble methods, or neural architecture search.