AI Revision Flashcards

1
Q

Cons of Hill Climbing:

A
  • No guarantee to be COMPLETE or OPTIMAL
  • Can get stuck on local maxima & plateaus
    (CAN RUN FOREVER IF NOT PROPERLY FORMULATED)
2
Q

Pros of Hill Climbing:

A
  • Rapidly finds good solutions by improving over a bad initial state (GREEDY)
  • Lower time & space complexity compared to search algorithms
  • No requirement for problem-specific heuristics
    (UNINFORMED)
  • Starts from a candidate solution, instead of building up step-by-step
    (UNLIKELY BUT POSSIBLE: the randomly picked initial solution may already be optimal)

Run algorithm for a Maximum number of iterations m

Variants of Hill Climbing can mitigate getting stuck on local maxima / plateaus
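The greedy loop the cards above describe can be sketched in a few lines. A minimal sketch for a maximisation problem over integers; `objective` and `neighbours` are hypothetical placeholders for the problem-specific design choices:

```python
def hill_climb(initial, objective, neighbours, max_iters=1000):
    """Greedy hill climbing: repeatedly move to the best neighbour.

    Stops after max_iters iterations or when no neighbour improves
    on the current solution (a local maximum or plateau)."""
    current = initial
    for _ in range(max_iters):
        best = max(neighbours(current), key=objective, default=current)
        if objective(best) <= objective(current):
            return current  # local maximum reached
        current = best
    return current

# Toy example: maximise f(x) = -(x - 3)^2 over the integers.
f = lambda x: -(x - 3) ** 2
step = lambda x: [x - 1, x + 1]
print(hill_climb(0, f, step))  # climbs 0 -> 1 -> 2 -> 3
```

The `max_iters` cap is the "maximum number of iterations m" mentioned above, which stops the search from running forever.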

3
Q

Variants of Hill Climbing:

A
  • Stochastic Hill Climbing
  • First - Choice Hill Climbing
  • Random - Restart Hill Climbing
4
Q

Stochastic Hill Climbing :

A
  • Variant that randomly selects a neighbour that involves an uphill move
  • Probability of picking a specific move can depend on the steepness
  • Converges more slowly than steepest ascent but can find better solutions
5
Q

First - Choice Hill Climbing

A

- Randomly generates a single SUCCESSOR (neighbour) solution & moves to it if it’s better than the current solution

  • If NO UPHILL move, it keeps randomly generating successors until there is an uphill move
  • If, after a MAX number of tries OR after generating all neighbours, it hasn’t found an UPHILL move, it gives up & assumes it is now at the optimal solution
  • Time complexity is lower, as not all neighbours need to be generated before one is picked
    (GOOD WHEN THERE ARE LOADS OF NEIGHBOURS FOR EACH SOLUTION)
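A sketch of a single first-choice step, assuming a maximisation problem; `random_neighbour` is a hypothetical operator that samples one successor at a time:

```python
import random

def first_choice_step(current, objective, random_neighbour, max_tries=100):
    """First-choice hill climbing step: sample random successors one at a
    time and move to the first one that improves on the current solution.
    Gives up after max_tries, assuming a (local) optimum has been reached."""
    for _ in range(max_tries):
        candidate = random_neighbour(current)
        if objective(candidate) > objective(current):
            return candidate  # first uphill move found
    return current  # no uphill move found within max_tries

# Toy example: maximise f(x) = -(x - 3)^2 over the integers.
f = lambda x: -(x - 3) ** 2
rand_step = lambda x: x + random.choice([-1, 1])
print(first_choice_step(0, f, rand_step))  # moves to 1 once +1 is sampled
```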
6
Q

Random - Restart Hill Climbing:

A
  • Conducts a series of hill climbing searches of the same problem from randomly generated initial states
  • Stops when a goal is found
  • Can be parallelised / threaded easily, so does not take much time on modern computers
  • RARE to have to wait long for this to happen
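A toy sketch of random restarts on a small hand-made landscape with one local and one global peak (the `values` list and climb routine are illustrative assumptions, not from the cards):

```python
import random

values = [0, 1, 2, 1, 0, 1, 2, 3, 4, 5]  # peaks at x=2 (local) and x=9 (global)
f = lambda x: values[x]

def climb(x):
    # Steepest-ascent hill climbing on the values landscape.
    while True:
        nbrs = [n for n in (x - 1, x + 1) if 0 <= n < len(values)]
        best = max(nbrs, key=f)
        if f(best) <= f(x):
            return x  # stuck on a peak
        x = best

def random_restart(max_restarts=100):
    # Restart from random initial states until the global max is reached.
    for _ in range(max_restarts):
        result = climb(random.randrange(len(values)))
        if f(result) == max(values):
            return result
    return None

print(random_restart())  # eventually restarts into the basin of x=9
```

Each restart is independent, which is why the searches can be run in parallel threads.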
7
Q

What is the only Complete variant of Hill Climbing?

A

Random - Restart Hill Climbing
It finds a solution if one exists, as eventually a randomly generated initial state will lead to the OPTIMAL SOLUTION

8
Q

Pros of Simulated Annealing:

A
  • Finds near-optimal solutions in reasonable time
    (High chance of reaching the GLOBAL MAX, but only as
    good as the formulation of the optimisation problem)
  • Avoids getting stuck in poor LOCAL MAXIMA & PLATEAUS by combining EXPLORATION & EXPLOITATION
    (exploiting the solutions we have converged on by the end of EXPLORATION)
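The exploration-to-exploitation shift can be sketched with a temperature schedule. A minimal sketch assuming maximisation and geometric cooling; the temperature parameters are illustrative, not prescribed by the cards:

```python
import math
import random

def simulated_annealing(initial, objective, random_neighbour,
                        t0=10.0, cooling=0.95, min_t=1e-3):
    """Simulated annealing for a maximisation problem: always accept
    uphill moves, accept downhill moves with probability exp(delta / T),
    and cool T so the search shifts from exploration to exploitation."""
    current, t = initial, t0
    while t > min_t:
        candidate = random_neighbour(current)
        delta = objective(candidate) - objective(current)
        if delta > 0 or random.random() < math.exp(delta / t):
            current = candidate  # downhill moves accepted less as T falls
        t *= cooling
    return current

# Toy landscape: maximise f(x) = -(x - 3)^2 over the integers.
f = lambda x: -(x - 3) ** 2
result = simulated_annealing(0, f, lambda x: x + random.choice([-1, 1]))
```

At high temperature almost any move is accepted (exploration); as T approaches zero the behaviour reduces to plain hill climbing (exploitation).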
9
Q

Cons of Simulated Annealing:

A
  • Not guaranteed to be COMPLETE or OPTIMAL
    (SENSITIVE TO FORMULATION)
  • NOT reliable = can’t guarantee completeness
  • Time & space complexity is problem- & representation-dependent
10
Q

Formulating Optimisation Problems:

A

min / max f(x) => objective function(s) to minimise/maximise
s.t. g_i(x) <= 0, i = 1,…,m
h_j(x) = 0, j = 1,…,n => feasibility constraints

==> x is the DESIGN VARIABLE (can be anything)

SEARCH SPACE is the space of all possible x values

11
Q

What is Search Space of a Problem:

A

The space of all possible x values ALLOWED by the constraints, containing the CANDIDATE SOLUTIONS

12
Q

Define Explicit Constraint:

A

Explicitly mentioned
CANNOT BE ASSUMED

13
Q

Define Implicit Constraint:

A

Rules that must be in place by the problem definition in order for solutions to be CONSIDERED FEASIBLE

e.g: x,y > 0

14
Q

Define A*:

A

Heuristic pathfinding algorithm & the most used example of Informed Search

15
Q

What is the Evaluation Function:

A

f(n) = g(n) + h(n)
g(n) => cost to reach node n
h(n) => heuristic estimate of the cost from node n to the goal

It determines which node to expand next

16
Q

Steps of A*:

BASIC

A

1) Expand the node in the frontier with the SMALLEST EVALUATION FUNCTION f(n)

2) If a node is in the list of visited nodes, DON’T ADD it TO the FRONTIER

3) Stop when a GOAL node is visited
(KEEP EXPANDING OTHERWISE)
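These steps can be sketched with a priority queue keyed on f(n) = g(n) + h(n). The toy graph and heuristic values below are illustrative assumptions:

```python
import heapq

def a_star(start, goal, neighbours, h):
    """A* search: expand the frontier node with the smallest
    f(n) = g(n) + h(n); returns the total path cost g, or None."""
    frontier = [(h(start), 0, start)]           # (f, g, node)
    best_g = {start: 0}
    while frontier:
        f, g, node = heapq.heappop(frontier)
        if node == goal:
            return g                            # final cost excludes h
        for nxt, step in neighbours(node):
            g2 = g + step
            if g2 < best_g.get(nxt, float('inf')):
                best_g[nxt] = g2                # keep the smaller g(n)
                heapq.heappush(frontier, (g2 + h(nxt), g2, nxt))
    return None

# Toy graph: edges as (neighbour, step cost); h is a consistent estimate.
graph = {'A': [('B', 1), ('C', 4)], 'B': [('C', 1), ('D', 5)],
         'C': [('D', 1)], 'D': []}
h = {'A': 2, 'B': 2, 'C': 1, 'D': 0}
print(a_star('A', 'D', lambda n: graph[n], lambda n: h[n]))  # -> 3
```

Note the returned cost is the sum of step costs g only; the heuristic is used solely to order expansions.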

17
Q

What is the Final Cost of A*?

A

Sum of the g(n) step costs along the path

  • DON’T INCLUDE HEURISTICS
18
Q

Dijkstra’s:

A
  • UNINFORMED
  • Goes Through all the nodes of the graph
  • No Heuristics / ADDITIONAL INFO
  • Basic Attempt to find shortest distance from the nodes on the graph to the destination
19
Q

what is h(n)?

A

e.g. the Euclidean (straight-line) distance from each node to the destination

A heuristic measure of cost: if a node is physically closer to the destination,

it is a good assumption that the cost to the destination will be lower

20
Q

Pros & Cons of A* :

A

Pros:
- Doesn’t go through all nodes

  • It is safer to expand the node with the lower straight-line distance first (LEAVING OUT OTHER NODES)

Cons:
- Can lead to non-optimal paths (UNLIKELY)

21
Q

Uninformed Search ?

A

Defines the set of strategies having no additional information about the State Space beyond what is given during Problem formulation

22
Q

What does Uninformed Search do?

A

-Only generates successors
-Distinguish a goal state from a non goal-state

23
Q

What is Breadth First Search ?

A

-An Uninformed Searching Strategy

FRONTIER NODES ARE EXPANDED LAYER BY LAYER

Expands the shallowest unexpanded node in the frontier

The frontier behaves like a QUEUE (FIFO)

  • Unless stated otherwise, never add children that are already in the frontier
24
Q

Steps for BFS?

A

- Root node is expanded first
- Then the successors of the root
- Repeatedly expand all the successors of each node, until the goal node is reached
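The steps above can be sketched with a FIFO queue; the toy tree is an illustrative assumption:

```python
from collections import deque

def bfs(start, goal, neighbours):
    """Breadth-first search: expand the shallowest frontier node first
    (FIFO queue); returns the path from start to goal, or None."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()      # oldest (shallowest) path first
        node = path[-1]
        if node == goal:
            return path
        for nxt in neighbours(node):
            if nxt not in visited:     # never re-add states already seen
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

tree = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F'],
        'D': [], 'E': [], 'F': []}
print(bfs('A', 'F', lambda n: tree[n]))  # -> ['A', 'C', 'F']
```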

25
Q

What are the 3 Quantities used to measure the Performance?

A
  • Branching Factor (b):
    Max No. of successor of any Node
  • Depth (d):
    Depth of Shallowest goal Node
  • Max Length (m):
    MAX length of any path in state space
26
Q

4 quantities that Determine the Performance?

A

Completeness => Is it guaranteed to find a solution, if one exists?
Optimality => Is it capable of finding the optimal solution?
Time Complexity => How long does it take to find a solution?
Space Complexity => How much memory is required?

27
Q

Performance of BFS

A

Complete:
Yes, if the goal node is at some finite depth d, BFS will find it. b must also be finite

Optimal:
Yes, if the path cost is a non-decreasing function of the depth of the node
(ALL ACTIONS HAVE THE SAME COST)

Time: O(b^d)
Assumes a uniform tree where each node has b successors

Space: O(b^d)
Stores all expanded nodes
The frontier is O(b^d) AND the explored set is O(b^(d-1))

28
Q

What does
I
Always
Take
Goal
Path
stand for?

A

I = Initial State (where the agent starts its search)

A = Action Set
(Actions that can be executed in any state)

T = Transition Model
(Mapping between states and actions)

G = Goal Test (Determines if a state is a goal state)
P = Path Cost Function (Assigns a cost to each path)

29
Q

What is the Solution?

A

Sequence of Actions from initial state to Goal

30
Q

What is Cost of Solution?

A

Sum of the cost of actions from initial to goal

31
Q

What is the Path?

A

Sequence of states connected by a sequence of actions

32
Q

Does goal node appear in Order of nodes visited in BFS?

A

NO

33
Q

Does goal node appear in Order of nodes visited in DFS?

A

YES

34
Q

What is Depth First search?

A

- An Uninformed Searching Strategy

==> Expands the deepest unexpanded node in the frontier

A Stack (LIFO) is used for expansion

The most recently generated node is expanded
- Usually just the left-most node

35
Q

Steps for DFS?

A

- Expand the root first
- Then expand the first successor of the root node
(CAN BE PICKED RANDOMLY)
- Repeat expanding the deepest node until the goal is found, otherwise go back to try alternative paths
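The same sketch as BFS with the queue swapped for a stack; the toy tree is an illustrative assumption:

```python
def dfs(start, goal, neighbours):
    """Depth-first search: expand the deepest frontier node first
    (LIFO stack); returns a path from start to goal, or None."""
    frontier = [[start]]
    visited = set()
    while frontier:
        path = frontier.pop()          # most recently generated path
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        # Push children in reverse so the left-most child is expanded first.
        for nxt in reversed(neighbours(node)):
            frontier.append(path + [nxt])
    return None

tree = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F'],
        'D': [], 'E': [], 'F': []}
print(dfs('A', 'F', lambda n: tree[n]))  # explores A-B-D, A-B-E, then A-C-F
```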

36
Q

Performance of DFS?

A

Completeness:
Not complete if the search space is infinite or we
don’t avoid infinite loops.
Yes, only if the search space is finite

Optimality:
Not optimal, because it will expand the entire left
subtree, even if the goal is at the first level of another subtree

Time: O(b^m)
Depends on m, the max length of any path in the
search space
Space: O(b^m)
Stores all the nodes on each path from root
to leaf

37
Q

Variants of DFS:
b) what performance attribute stays the same for all of these variants?

A

-Less Memory Usage (LMU)
- Depth LIMITED Search (DLS)
- DLS with LMU

b) Optimality is ALWAYS NO

38
Q

What is LMU?

A

If you reach the leaf node of the left subtree and there is no GOAL node, you can remove the entire subtree from memory

Space complexity becomes O(bm)

  • Store a single path with the siblings of each node on the path
  • COMPLETE IF the search space is finite
39
Q

What is DLS?

A
  • Has a DEPTH LIMIT L
    once the limit is reached we go and find an alternative path

If L < d it may be incomplete, as the goal node may be at depth L+1 or deeper

If L > d it is not optimal

Time complexity reduces to O(b^L), when L < d

40
Q

What is DLS with LMU?

A

Once we reach our DEPTH LIMIT L, if we have found NO goal node we remove the subtree from memory

Space Complexity = O(bL)
COMPLETE if L >= d

41
Q

What are Informed Search Strategies?

A

Use problem-specific knowledge beyond the problem definition

Can find solutions more efficiently compared to uninformed search

42
Q

General Approach for Informed?

A

Best-First search:
-Determines which node to expand based on an
evaluation function f(n).

f(n) - acts as cost estimate =>
Lowest cost expanded first

43
Q

What is Best First Search?

A

- Determines which node to expand based on an
evaluation function f(n).

Most variants include a heuristic h(n): the estimated cost of the cheapest path from the current node to the goal

For a goal node, h(n) = 0
Known as GREEDY as it will always pick the cheapest-looking path first

44
Q

What is the f(n) for A*?

A

f(n) = g(n)+h(n)
g(n)=> cost to reach node n
h(n) => heuristic from n to goal

45
Q

Steps for A*?

COMPLEX

A

- Expand the node in the frontier with the smallest f(n)
- Handle repeated states & loopy paths:
  - If a child’s state is already in the frontier, don’t add it again
  - EXCEPT if the frontier node has a larger g(n): then replace it with the child
    & remove the node with the larger g(n)
- Stop when the goal is visited

46
Q

Performance of A*?

A

A* is Complete & Optimal if h(n) is consistent

  • A* is exponential in the length of the solution
    CONSTANT STEP COSTS: O(b^(εd))

ε is the RELATIVE ERROR of the heuristic: ε = (h* - h)/h*,
where h* is the actual cost from the root to the goal

Space: O(b^d)
This is the main issue with A*:
- Keeps all generated nodes in memory
- Keeps ALL expanded nodes & ALL nodes in the frontier
- NOT suitable for LARGE-SCALE PROBLEMS

47
Q

What does Consistent mean in terms of h(n)?

A

h(n) is consistent if the estimate is no greater than the cost of reaching a neighbouring node n’ plus the estimated distance from n’ to the goal:

h(n) <= cost(n, n’) + h(n’)

cost(n, n’) is the step cost of going from n to n’

48
Q

What do design variables represent?

A

-Candidate solutions of a Problem
-Are variables belonging to pre-defined domains

-There may be one or more design variable in
a given optimisation problem.

  • These can be possible decision needed to be made in the problem
49
Q

What is the objective function?

A

- Takes the design variables as input

  • Outputs a NUMERICAL value that the problem aims to MINIMISE OR MAXIMISE

There CAN be multiple objective functions in a formulation

Defines the cost or quality of a solution

50
Q

What are Constraints in Formulating Optimisation Problems?

A

- Conditions that the design variables must satisfy for the solution to be FEASIBLE

-Usually depicted by functions that take
the design variables as input and output a numeric value.

-They specify the values that these functions are allowed to take for the solution to be feasible.

-There may be zero or more constraints in a problem

DEFINES THE FEASIBILITY OF THE SOLUTION

51
Q

Benefits of Local Search?

A

DO NOT keep track of the paths or states that have been visited

NOT systematic, but the PROS are:
- Uses very little memory
- Finds reasonable solutions in large or infinite
state spaces

52
Q

What are Local Search Algorithms?

A

Optimisation Algorithms that operate by searching from initial state to neighbouring states

53
Q

What is The Aim Of a MAXIMISING problem?

A

Reaching the HIGHEST PEAK/ Global Maximum

54
Q

What is The Aim Of a MINIMISING problem?

A

Reaching the LOWEST TROUGH/ Global Minimum

55
Q

Purpose Of Hill Climbing?

A

TO find & Reach Global Maximum

56
Q

Purpose of Gradient Descent

A

TO find & Reach Global Minimum

57
Q

Why is Hill Climbing Greedy?

A

Does not look beyond the immediate neighbours of the current state

58
Q

What are the 3 Components of Hill Climbing we must design?

A

-Representation
-Initialisation Procedure
-Neighbourhood Operator

59
Q

What is Representation?

A

How to store Design variables in the problem(s)

Should facilitate the Application of the Initialisation Procedure

60
Q

What is Initialisation Procedure?

A

How to pick initial solution.
USUALLY RANDOM, Can Be Heuristic

61
Q

What is Neighbourhood Operator?

A

How to generate Neighbourhood Solutions
(INCREMENT/STEP SIZE)

62
Q

Performance of Hill Climbing:

A

Completeness:
No; depends on the problem formulation & the design of the algorithm (CAN GET STUCK ON LOCAL OPTIMA)

OPTIMALITY:
Not optimal (CAN GET STUCK ON LOCAL OPTIMA)

Time: O(mnp)
m = MAX no. of iterations,
n = MAX no. of neighbours,
EACH neighbour takes O(p) to generate

Space: O(nq + r) ==> r is a constant so ==> O(nq)
n = MAX no. of neighbours,
each variable takes O(q) to store, and
r represents the space to generate the neighbours sequentially (NEGLIGIBLE COMPARED TO n & q)

63
Q

What are the 3 Types of Machine Learning?

A

-Supervised Learning,
-Unsupervised Learning
-Reinforcement Learning

64
Q

What is Supervised Learning?

A
  • Most Prevalent Form
  • Learning with a teacher
  • Teacher: expected output, label, class, etc
  • Solve 2 Types of problems: Classification & Regression
65
Q

How can AI be used to Solve Machine Learning Problem?

A
  • Automatically create models from data to perform certain tasks through machine learning
  • Not guaranteed to find a perfect model, but will find a good one depending on the difficulty of the problem
  • Good for problems where it is difficult to create good models manually
  • Good for problems that don’t require perfect answers
66
Q

AI in Optimisation Problems:

A
  • Solve them in a reasonable amount of time through optimisation techniques
  • No guarantee to find optimal solution in reasonable amount of time but a good solution
  • Good for problems where no specific technique exists that guarantees that optimal solution can be found
67
Q

Think Humanly?

A

-Can machine think humanly?
- Can we consider machine as human?
- How to define think humanly?
==> With AI we need to define things with mathematical forms

68
Q

Act Humanly?

A
  • What can humans do?
  • What if human’s action is wrong?
    ==> Doesn’t mean we shouldn’t copy the wrong actions
69
Q

Act Rationally?

A

==> Rationality: Doing the right thing can be mathematically defined & General enough, linked to human behaviour

70
Q

Think Rationally?

A
  • Think Logically?
  • Logical AI?
  • Too Narrow?
71
Q

Define AI using ‘Rational Agents’:

A

Rational Agents are computer programs that perceive their environments & take actions that maximize their chances of achieving best EXPECTED outcome

72
Q

What is Machine Learning?

A

An agent is learning if it improves its performance after making observations about the world.

Machine Learning when the agent is a Computer

73
Q

What is Machine Learning Problem?

A

Problems that require a model to be built automatically from data e.g => to make classification

74
Q

What is Supervised Learning?

A
  • Most popular form in Real World
  • Learning with a Teacher
  • Teacher: expected output, labels, classes, etc.
  • Solve 2 types of Problems : Classification & Regression Problem
75
Q

What is Classification Problem?
{Give example}

A
  • Predict categorical class Labels

==> Spam Detection

76
Q

What is Regression Problem?
{Give example}

A
  • Prediction of a Real Value

==> Student Grades, Stock Price Prediction

77
Q

Define Supervised Learning:

A

The agent observes input-output pairs & learns a function that maps from input to output

78
Q

Define Unsupervised Learning:

A

Agent Learns patterns in the input without any explicit feedback

79
Q

What is Unsupervised Learning?

A
  • Learning without a teacher
  • Find Hidden Structures
    Clustering ==> Group inputs based on similar properties
80
Q

What is Reinforcement Learning?

A
  • COMBO of Supervised & Unsupervised
  • Learning with (delayed) feedback / reward –> Don’t have instant labels
  • Learn series of actions => Sequence of decision Making

Agents learn from a series of reinforcements: rewards & punishments. The agent decides which of the actions prior to the reinforcement were most responsible for it and alters its actions towards more reward in future.

81
Q

Why is Overfitting a Problem?

A
  • Fitting the training data too well is not helpful, as the data you want to classify or predict is not the same as the training data
  • Learning every irrelevant detail in the training data does not generalise
  • Occurs when the model is more complex than required
82
Q

When does Underfitting occur?

A

When the model is simpler than required
This IS BAD because it will not classify all points into the correct classes

83
Q

What is a Parametric Model?

A

Parametric models are learning models that summarise data with a set of parameters.

e.g Logistic & Linear Regression

84
Q

What is a NON-Parametric Model?

A

Non-parametric models are learning models that do not assume a fixed set of parameters

e.g KNN

85
Q

Limitations of K means?

A
  • Outliers can influence the clusters that are found & increase the WCSS
  • Problems when clusters differ (e.g. in size or density)
86
Q

Applications of K means?

A

A key tool in image and signal compression
(ESPECIALLY VECTOR QUANTIZATION)

87
Q

K means Steps?

A
  • Pick k random elements as our initial centroids
  • Assign each element to its closest centroid
    (EUCLIDEAN DISTANCE IF NOT SPECIFIED)
  • Recalculate each centroid as the centre of its cluster & repeat until the centroids don’t change and you get the MIN WCSS
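The steps above can be sketched on 1-D data (a minimal sketch; the toy data is an illustrative assumption, and squared Euclidean distance is used for the assignment step):

```python
import random

def k_means(points, k, iters=100):
    """Basic k-means on 1-D data: assign points to the nearest centroid,
    recompute each centroid as its cluster mean, repeat until stable."""
    centroids = random.sample(points, k)        # step 1: random elements
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                        # step 2: closest centroid
            i = min(range(k), key=lambda c: (p - centroids[c]) ** 2)
            clusters[i].append(p)
        new = [sum(c) / len(c) if c else centroids[i]
               for i, c in enumerate(clusters)] # step 3: recompute centres
        if new == centroids:                    # centroids stopped changing
            break
        centroids = new
    return sorted(centroids)

data = [1.0, 1.1, 0.9, 10.0, 10.2, 9.8]
print(k_means(data, 2))  # two centroids, near 1.0 and 10.0
```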
88
Q

Selecting K in K means?

A

Use prior info, like the number of groups we want to cluster into

89
Q

How to do Elbow Method?

A

Run k-means with different K values
See where the WCSS has an inflection point
(KINK IN THE GRAPH BEFORE IT LEVELS OFF)

90
Q

What is Elbow Method?

A

Finds the optimal K value

91
Q

What is Agglomerative Clustering?

A

- Bottom UP
- Each item starts as its own cluster
- Merge the 2 most similar (SMALLEST INTER-CLUSTER DISSIMILARITY) clusters at each step, until there is one cluster

92
Q

What is Divisive Clustering?

A

- TOP down

  • All items start in ONE cluster
  • Split the cluster with the largest inter-cluster dissimilarity into 2 new clusters, until each item has its own cluster
93
Q

Shared disadvantage of Single Linkage & Complete Linkage?

A
  • Sensitive to outliers, which heavily affect which clusters are merged but are not representative of the whole cluster
94
Q

Cons of SL?

A
  • Causes a ‘chaining effect’ where clusters are combined via close intermediate examples
  • Clusters may not be as compact as required
95
Q

Cons of CL?

A
  • Provides clusters with a small diameter
  • Clusters may end up being crowded, as items can be very close to items in other clusters
96
Q

Pro of group average?

A

Attempts to produce relatively compact clusters which are relatively far apart

97
Q

What is Constraint Handling?

A

Methods that ensure Optimisation Algorithms can effectively search the feasible regions.

Avoiding or Penalising INFEASIBLE solutions by Modifying the Algorithm Operator OR Objective Function

98
Q

What is Modifying the Algorithm Operator?

A
  • Modifying how solutions are generated so that ONLY feasible solutions are produced
  • An algorithm operator is a function that generates a new candidate solution based on the current solution
  • AVOIDS generating invalid solutions, where the constraints dictate whether a solution is feasible
99
Q

Pros of Modifying the Algorithm Operator

A
  • Will not generate Infeasible Solutions, allowing search for optimal solutions
  • Makes Hill Climbing & Simulated Annealing Complete
100
Q

Cons of Modifying the Algorithm Operator

A
  • May be difficult to design; problem-dependent
  • May restrict the search space too much, making it harder to find optimal solutions
    (THE GLOBAL OPTIMUM MAY LIE BETWEEN FEASIBLE & INFEASIBLE SOLUTIONS)
101
Q

What is Modifying the Objective Function?

A
  • Incorporating Constraints into Objective Function, often adding a penalty term for constraint violation

OBJECTIVE FUNCTION WOULD BE INCREASED/DECREASED IF CONSTRAINT VIOLATED

102
Q

What is the Death Penalty Method?

A

A method of modifying the objective function (MIN PROBLEM)
Let x be a solution
The objective function becomes f(x) + Q(x), where Q(x) is the penalty term
- If x is FEASIBLE, Q(x) = 0
- Else Q(x) is a LARGE +VE constant C
// so that feasible solutions always score smaller than
infeasible ones
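A minimal sketch of the death penalty for a MIN problem; the toy objective and constraint (`x >= 1`, written as `g(x) = 1 - x <= 0`) are illustrative assumptions:

```python
def death_penalty(f, constraints, C=1e9):
    """Death-penalty objective for a MIN problem: feasible solutions keep
    f(x); any infeasible solution gets the same large constant penalty C."""
    def penalised(x):
        feasible = all(g(x) <= 0 for g in constraints)
        return f(x) if feasible else f(x) + C
    return penalised

# Toy: minimise f(x) = x^2 subject to g(x) = 1 - x <= 0 (i.e. x >= 1).
f = lambda x: x ** 2
g = lambda x: 1 - x
obj = death_penalty(f, [g])
print(obj(2), obj(0))  # feasible keeps f(x); infeasible gets f(x) + C
```

Hill climbing or simulated annealing can then minimise `obj` directly, without any special constraint logic.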

103
Q

What’s a Limitation/Problem with Death Penalty Method

A

C is always the same constant, so EVERY INFEASIBLE SOLUTION gets the same penalty

  • Hard to find solutions in a region dominated by infeasible solutions
104
Q

What is Levels of Infeasibility Approach?

A

A method of modifying the objective function
- Distinguishes between the objective values of infeasible solutions to help the algorithm find feasible ones.

The objective function still gets a penalty added/subtracted

105
Q

Give an Example of Q(x) for Levels of Infeasibility Approach?

A

THE PROBLEM IS: MINIMISE f(x) + Q(x)
Q(x) =
vg_1·Cg_1(x) + vg_2·Cg_2(x) + … + vg_m·Cg_m(x) +
vh_1·Ch_1(x) + vh_2·Ch_2(x) + … + vh_n·Ch_n(x)

For each constraint i:
if g_i(x) / h_i(x) is VIOLATED, then vg_i / vh_i is 1
OTHERWISE vg_i / vh_i is 0

More violations of the g_i & h_i mean a higher output of Q(x)

Only adds penalties corresponding to the VIOLATED constraints
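A minimal sketch of such a Q(x), assuming inequality constraints in `g(x) <= 0` form; the toy constraints and weights are illustrative assumptions:

```python
def infeasibility_penalty(constraints, weights):
    """Levels-of-infeasibility penalty: each violated constraint g_i(x) > 0
    contributes its own weighted term, so solutions closer to feasibility
    receive smaller penalties."""
    def Q(x):
        total = 0.0
        for g, c in zip(constraints, weights):
            v = g(x)
            if v > 0:                    # violated: indicator v_i = 1
                total += c * v ** 2      # squared to widen the spread
            # satisfied: indicator v_i = 0, no penalty added
        return total
    return Q

g1 = lambda x: 1 - x        # requires x >= 1
g2 = lambda x: x - 10       # requires x <= 10
Q = infeasibility_penalty([g1, g2], [5.0, 1.0])
print(Q(0.5), Q(12), Q(5))  # small violation, larger violation, feasible
```

The weights play the role of the constants C, scaling each constraint by how important it is.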

106
Q

What do g_i & h_i represent in the Levels of Infeasibility Approach?

A

The Constraints

107
Q

What does C represent in the Levels of Infeasibility Approach?

A

A constant that represents how important a specific constraint is.

e.g. =>
say g_1 is more important than g_2: then Cg_1 and Cg_2 will have different values
(SCALING EACH CONSTRAINT BY HOW IMPORTANT IT IS)

108
Q

Why is Levels of Infeasibility better than Death Penalty?

A

It solves the issue with the Death Penalty by giving different penalties to different infeasible solutions:

- Smaller for solutions that are close to satisfying the constraints

- Larger for solutions that are further from satisfying the constraints

109
Q

What does squaring the values of the constraints in the penalty term do?

A
  • Makes the distinction between objective values even larger
  • Distinguishes different infeasible solutions more effectively
  • Removes -VEs
  • Gives a bigger span of penalty values
110
Q

Pros of Levels of Infeasibility?

A

Easy to Design

111
Q

Cons of Levels of Infeasibility?

A

- The algorithm still has to search for feasible solutions, rather than having a search space containing only feasible solutions

- (INFEASIBLE SOLUTIONS ARE STILL GENERATED, BUT GET IGNORED DUE TO THEIR LARGE/SMALL OBJECTIVE VALUE WHEN WE ARE MINIMISING/MAXIMISING)

- (THEY WILL JUST BE NEIGHBOURS, NOT A SOLUTION WE MOVE TO)

112
Q

How do Hill Climbing & Simulated Annealing become Complete?

A

If the Strategy never enables infeasible solution to be generated at all

113
Q

Which of the CONSTRAINT HANDLING methods makes Hill Climbing & Simulated Annealing COMPLETE?

A

Modifying the Algorithm Operator, as it does not generate infeasible solutions

114
Q

What are the Hyperparameters?

A

HIGH-LEVEL free parameters (set before training, not learned from the data)

115
Q

How is a Predictor Obtained?

A

By training the free parameters of the considered model using available data.

116
Q

What is the Training set used for?

A

Estimate the free parameters

117
Q

What is the Test set used for?

A

Evaluate the performance of the trained predictor before deploying it

Reserved for the evaluation of the predictor, so it can’t be used for training (the model must learn nothing from it)

118
Q

Holdout Validation Steps?

A

1) Randomly choose 30% of the data to form a validation set
2) The remaining data forms the training set
3) Train your model on the training set
4) Estimate the test performance on the validation set
5) Choose the model with the lowest validation error
6) Retrain the chosen model on the joined training & validation sets to obtain the predictor
7) Estimate future performance of the obtained predictor on the TEST SET
8) Ready to deploy the predictor

119
Q

What is the Mean Square Error (L2)?

A

(f(x)-y)^2

120
Q

What are the 3 Models?

A
  • Linear Model
  • Quadratic Model
  • Line Model (OVERFITTING)
121
Q

For Holdout Validation how do we
ESTIMATE THE TEST PERFORMANCE ON THE VALIDATION SET?

A

REGRESSION: compute the cost function (L2) on the examples of the validation set (INSTEAD OF THE TRAINING SET)

CLASSIFICATION: don’t compute the cross-entropy cost on the validation set;
COMPUTE the 0-1 error metric instead

0-1 error metric =
(num of wrong predictions)/(num of predictions) =
1 - accuracy
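The 0-1 error metric is a one-liner; the toy spam/ham predictions are an illustrative assumption:

```python
def zero_one_error(predictions, labels):
    """0-1 error metric: fraction of wrong predictions (= 1 - accuracy)."""
    wrong = sum(p != y for p, y in zip(predictions, labels))
    return wrong / len(labels)

preds  = ['spam', 'ham', 'spam', 'ham']
labels = ['spam', 'spam', 'spam', 'ham']
print(zero_one_error(preds, labels))  # 1 wrong out of 4 -> 0.25
```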

122
Q

Steps for k-Fold Cross-Validation?

A

1) Split the training set randomly into k (equal-sized) disjoint sets
2) Use k-1 of them together for training
3) Use the remaining one for validation
4) Permute the k sets & repeat k times
(CHANGE WHICH PARTITION IS THE VALIDATION SET & REPEAT k TIMES)
EACH PARTITION WILL BE USED AS A VALIDATION SET ONCE
5) Average the performances on the k validation sets
6) Choose the model with the smallest AVG k-fold cross-validation error
7) Re-train the chosen model on the joined training & validation sets to obtain the predictor
8) Estimate future performance of the obtained predictor on the TEST SET
9) Ready to deploy the predictor
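The fold rotation in steps 1-4 can be sketched as follows (a minimal sketch: in practice the data should be shuffled first, since step 1 says to split randomly):

```python
def k_fold_splits(data, k):
    """Yield (train, validation) pairs where each of the k disjoint
    folds serves as the validation set exactly once."""
    folds = [data[i::k] for i in range(k)]      # k disjoint partitions
    for i in range(k):
        validation = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, validation

splits = list(k_fold_splits(list(range(6)), 3))
for train, val in splits:
    print(val, train)  # each element appears in exactly one validation fold
```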

123
Q

Steps for Leave-one-out Validation

A
  • Leave out a single example & train on all the rest of the annotated data
  • For a total of N examples, we repeat this N times, leaving out a single example each time
  • Take the avg. of the validation errors as measured on the left-out points
  • Same as N-fold cross-validation where N is the number of labelled points

(EVERY POINT WILL BE USED AS A VALIDATION POINT)

124
Q

What do we have to do with the models for k-fold Cross Validation?

A

For each of the partitions, fit a separate line/model & then compute the validation error using the points in that partition

  • Then take the mean of these ERRORS across the partitions
125
Q

Pros of Holdout Validation?

A

Computationally Cheapest
(IDEAL FOR LARGE SAMPLES)

126
Q

Cons of Holdout Validation?

A

Not reliable if sample size is not large enough

127
Q

Pros of 3 fold?

A

Slightly more reliable than Hold out

128
Q

Pros of 10-fold?

A

Only waste 10%
- Fairly reliable

129
Q

Cons of 3-fold?

A
  • Wastes 1/3-rd annotated data
  • Computationally 3-times as expensive as holdout
130
Q

Cons of 10-fold?

A
  • Wastes annotated data
  • Computationally 10-times as expensive as holdout
131
Q

Pros of Leave-one-out?

A

Doesn’t waste data
(IDEAL FOR SMALL SAMPLES)

132
Q

Cons of Leave-one-out?

A

Computationally most Expensive

133
Q

What is Storage Complexity of Dendrogram?

A

O(n^2)
- Storing the distance matrix requires storing n^2/2 entries (don’t need to store the distance from b to a if a to b is already there)

134
Q

What is the Time Complexity of Dendrogram?

A

O(n^3)
- n iterations
- In every iteration, the n^2-sized distance matrix has to be updated and searched
(GOING THROUGH IT ONCE IS n^2 & DOING SO n TIMES IS n^3)
- Complexity can be reduced to O(n^2 log(n)) using different algorithms

135
Q

Downside of Dendrogram, regarding Complexity ?

A

-Limits the size of the dataset that can be processed
- Anything more than n^2/n^3 is hard to work on
NOT IDEAL FOR LARGE DATABASES

136
Q

Pros of KNN?

A

- VERSATILE: classification & regression, & non-parametric

  • Easy to implement & interpret
  • Can approximate complex functions, so it has very good accuracy
  • Instance-based, so it defers all calculations until prediction time
137
Q

Cons of KNN?

A
  • Performance decreases as dimensionality increases
  • Sensitive to noise / INACCURATE DATA, especially when the value of k is small
  • Must specify the distance function & pre-define the k value
  • Needs to store ALL the training data, as it calculates the value of new points based on the training data only
  • Computationally expensive as the dataset grows
138
Q

Pros of Linear Regression?

A

Relatively low computational cost

  • Don’t need to worry about a K value
  • Less space required, as it does not have to store the training set
139
Q

Logistic Regression ?

A
  • Supervised Learning Technique
  • Takes Inputs and plots them like Linear Regression
  • Classifies the outputs into discrete distinct category using sigmoid function
140
Q

What is sigmoid function?

A

1/(1 + e^-u)

where u is the line or function w0 + w1*x1
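The sigmoid and its use as a classifier can be sketched directly (the weights and threshold below are illustrative assumptions):

```python
import math

def sigmoid(u):
    """Logistic (sigmoid) function: squashes any real u into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))

def predict(x, w0, w1, threshold=0.5):
    """Logistic-regression classifier: linear score w0 + w1*x passed
    through the sigmoid, then thresholded into a discrete class."""
    return 1 if sigmoid(w0 + w1 * x) >= threshold else 0

print(sigmoid(0))          # 0.5: the decision boundary
print(predict(2, -1, 1))   # sigmoid(1) > 0.5 -> class 1
```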

141
Q

What is f00

A

Pairs don’t have same cluster & class

142
Q

what is f10

A

Pairs have same cluster, but not class

143
Q

what is f01

A

Pairs don’t have same cluster , but same class

144
Q

What is f11

A

Pairs have same cluster & class

145
Q

BFS is not Optimal when?

A

The cost function is NOT non-decreasing

146
Q

Explain hypothesis function y = w_0 + w_1*x

A
  • Creates a line in 2D space relating x values to predictions
  • w_0 is the y-intercept/offset
  • w_1 is the gradient
147
Q

Explain Cost Function :
L(x,y) = Σ (y_i - h(x_i))^2

A

Compares the difference between the actual label data & the model’s prediction

  • Squares to eliminate -VEs
  • Done for every element in the training set
  • SUMS the values to get the cost function value
148
Q

Explain the Hypothesis Function
y = w_0 + w_1x + w_2x^2

A

Takes in ONE x value
- Uses it raised to different powers to get a polynomial function
- THUS NON LINEAR

149
Q

What is an ideal Cluster Similarity matrix?

A

For n examples/data points it creates an n×n matrix
Then compares whether each pair of points is in the same cluster, giving 1 if they are and 0 otherwise

150
Q

What is an ideal Class Similarity matrix?

A

For n examples/data points it creates an n×n matrix
Then compares whether each pair of points is in the same class, giving 1 if they are and 0 otherwise

151
Q

How do we Classify in KNN for Regression Problems ?

A

Take the average y_i over all K neighbours
(THE VALUE YOU ARE PREDICTING FOR THE NEW DATA)

The average value becomes the y_i value for the new data

e.g.:
we want to find the height of a new data point p:
- Find the closest k neighbours
- Find the average height of those neighbours
- That average is the height of p
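The regression rule is just "average the k nearest labels". A minimal 1-D sketch; the (x, height) training pairs are illustrative assumptions:

```python
def knn_regress(train, query, k):
    """k-NN regression: predict the average y of the k training points
    closest to the query (squared Euclidean distance in 1-D here)."""
    nearest = sorted(train, key=lambda xy: (xy[0] - query) ** 2)[:k]
    return sum(y for _, y in nearest) / k

# Hypothetical (x, height) training pairs.
train = [(1, 150), (2, 160), (3, 170), (10, 190)]
print(knn_regress(train, 2.5, k=2))  # average of 160 and 170 -> 165.0
```

For classification the only change is replacing the average with a majority vote over the k neighbours’ labels.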

152
Q

How do we Classify in KNN for Classification Problems ?

A

Look at the y_i values of all k neighbours; whichever is most prevalent becomes the value of our new data point’s y_i

e.g.
we want to find the class of a new data point p:
- Find the closest k neighbours
- Look at the class of each neighbour
- Pick the one that shows up the most among the neighbours

153
Q

What is k means ++

A

First pick a random initial centroid

Then pick the next centroid from the remaining points with probability proportional to (the Euclidean distance to the nearest existing centroid)^2, favouring points far away

Repeat until you have k centroids

Then run k-means to find the clusters

154
Q

How is k means non deterministic?

A

Different initialisation lead to different local optima

155
Q

If optimisation problem in k means is not convex ?

A

Algorithm may not converge at global minimum, rather to a local minima

156
Q

2 things that make k mean converge

A
  • The WCSS monotonically decreases
  • There are only a finite number of partitions of the data