AIML Flashcards

1
Q

What is supervised learning, and why is it used?

A

Learning from examples that have labels attached. The model is trained to map features to labels so that it generalizes to new, unseen data.

2
Q

What is unsupervised learning, and why is it used?

A

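Learning from features with no labels attached. Used to discover structure in the data, such as groups of similar examples. By contrast, predicting future values with a hypothesis such as y = p1(x) + p0 is linear regression, which is a supervised method.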

3
Q

What is cost?

A

A measure of how poorly the fitted line (the model) matches the data. Lower cost means a better fit.
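As a minimal sketch, one common way to put a number on this is mean squared error (MSE). The card does not name a specific cost function, so MSE, the function name, and the sample data below are all assumptions chosen for illustration.

```python
def mean_squared_error(xs, ys, slope, intercept):
    """Average squared gap between the line's predictions and the data."""
    errors = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)

# Illustrative data generated by y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

good_fit = mean_squared_error(xs, ys, slope=2.0, intercept=1.0)  # line passes through every point
poor_fit = mean_squared_error(xs, ys, slope=0.0, intercept=0.0)  # flat line at zero
print(good_fit, poor_fit)
```

The line that passes through every point has zero cost, while the flat line has a much higher one.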

4
Q

What is the K-NN algorithm?

A

A simple, low-complexity supervised learning algorithm that classifies new incoming data according to the labels of its K nearest stored examples.

5
Q

What are three advantages of the K-NN algorithm?

A

Simple to implement.
Flexible: works with many feature types and distance functions.
Easily handles multi-class data.

6
Q

What are three disadvantages of the K-NN algorithm?

A

Finding the nearest neighbors is a large search problem and can be computationally intensive.
Requires storing a large amount of data, especially with many classes.
Only works well if the distance function is meaningful for the data.

7
Q

How does K-NN operate?

A

Choose some K as the number of neighbors to consult. K should not be a multiple of the number of classes, to reduce tied votes. Optimize K by observation.
Locate the K nearest neighbors of the unclassified example.
The class with the most votes among those neighbors wins.
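The steps above can be sketched as follows. This is a minimal illustration, assuming Euclidean distance and a plain majority vote; the function name `knn_classify` and the sample points are hypothetical.

```python
from collections import Counter
import math

def knn_classify(train_points, train_labels, query, k):
    """Label `query` by majority vote among its k nearest training points."""
    # Distance from the query to every stored example (the expensive search step).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # Most votes wins.
    return Counter(nearest_labels).most_common(1)[0][0]

train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 6.5), (6.5, 7.0)]
train_labels = ["red", "red", "red", "blue", "blue", "blue"]

prediction = knn_classify(train_points, train_labels, query=(2.0, 2.0), k=3)
print(prediction)
```

With two classes, an odd K such as 3 avoids a tied vote.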

8
Q

What is the K-means Clustering algorithm?

A

Unsupervised learner for clustering unlabeled data.
Finds K groups in the set, each defined by a centroid.
Guaranteed to converge on a result, though it may be a local optimum rather than the global optimum.

9
Q

What are three advantages of the K-means clustering algorithm?

A

Can be applied to many kinds of grouping problems.
Thanks to the simple layout of the data, new data can easily be assigned to a cluster (the nearest centroid).
Clustering finds groups that have formed organically, without predefined labels.

10
Q

What are three disadvantages of the K-means clustering algorithm?

A

Sensitive to outliers.
Cannot handle complicated (for example, non-spherical) cluster shapes.
Cluster assignments can change from run to run because of the random initialization.

11
Q

How does K-means clustering work?

A

Randomly generates K centroid locations in the feature space.
Each point is assigned to its nearest centroid.
Each centroid is relocated to the mean of its assigned points.
Iterates until a stopping condition is met: no change in assignments, the sum of distances stops decreasing, an iteration cap is reached, etc.
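A minimal sketch of this loop is shown below. It assumes Euclidean distance, picks the initial centroids by sampling K of the data points (one common way to place them randomly), and stops when assignments no longer change or an iteration cap is hit. The function name `k_means` and the sample points are hypothetical.

```python
import math
import random

def k_means(points, k, max_iterations=100, seed=0):
    """Return (centroids, assignments) for a list of coordinate tuples."""
    rng = random.Random(seed)
    # Assumption: initial centroids are K randomly chosen data points.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Assign each point to the index of its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(point, centroids[c]))
            for point in points
        ]
        if new_assignments == assignments:
            break  # no change, so the algorithm has converged
        assignments = new_assignments
        # Relocate each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave a centroid in place if it has no points
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = k_means(points, k=2)
print(centroids, labels)
```

Which cluster gets index 0 depends on the random start, which is why assignments can differ between runs even when the grouping is the same.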

12
Q

What is the search space?

A

The set of all candidate solutions. It is often pictured as a graph (landscape) showing how good each solution is; the search attempts to find the global optimum rather than settling on a local optimum.

13
Q

What are the components of a neuron?

A

Synapses receive numerical inputs.
The summation sub-unit sums the weighted inputs to a single value.
The activation sub-unit maps that value to a new output.
The output travels through the axon to all connected neurons.
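The components above can be sketched as a single function. This is an illustration only: the sigmoid activation and the bias term are assumptions (the card names neither), and `neuron_output` is a hypothetical name.

```python
import math

def sigmoid(value):
    """One possible activation function, squashing any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))

def neuron_output(inputs, weights, bias=0.0, activation=sigmoid):
    # Summation sub-unit: weighted inputs collapse to a single value.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Activation sub-unit: map that value to the neuron's output.
    return activation(weighted_sum)

output = neuron_output(inputs=[1.0, 1.0], weights=[0.5, -0.5])
print(output)
```

Here the two weighted inputs cancel out, so the sigmoid receives 0 and returns its midpoint, 0.5.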

14
Q

What is a hidden layer?

A

The layers between the input layer and the output layer.

15
Q

What is feed-forward?

A

The input is fed forward through all connections, layer by layer: each value is weighted, summed, and passed on until an output is produced.
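A minimal sketch of a feed-forward pass is shown below, assuming a step activation and hand-picked weights (no training is shown). The example network, with one hidden layer of two neurons, computes XOR; the names `layer_forward`, `feed_forward`, and `xor_layers` are hypothetical.

```python
def step(value):
    """Step activation: fire (1) only when the weighted sum is positive."""
    return 1 if value > 0 else 0

def layer_forward(inputs, weights, biases):
    """Compute the output of every neuron in one layer."""
    return [
        step(sum(x * w for x, w in zip(inputs, neuron_weights)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

def feed_forward(inputs, layers):
    """Pass the input through each layer in turn and return the final output."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hand-picked weights (an assumption for illustration):
# hidden layer = [OR neuron, NAND neuron], output layer = [AND neuron].
xor_layers = [
    ([[1, 1], [-1, -1]], [-0.5, 1.5]),
    ([[1, 1]], [-1.5]),
]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, feed_forward([a, b], xor_layers)[0])
```

The hidden layer is what makes XOR possible here, since no single neuron of this kind can compute it.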

16
Q

What is linear separability?

A

Whether a single straight line (a hyperplane in higher dimensions) can be drawn that separates all positive examples from all negative examples. XOR is not linearly separable.

17
Q

Define bias vs variance.

A

Bias is error from a model that is too simple and fails to capture relationships within the data (underfitting).
Variance is error from a model that is too complex and has overfitted to the data, fitting noise so that its results change with the training set.

18
Q

What is regularization?

A

A group of techniques that constrain a model toward simpler solutions, reducing overfitting and aiding the bias-variance trade-off.

19
Q

Give three examples of regularization with brief descriptions.

A

L1 - Prevents overfitting by adding a penalty on the absolute values of the weights to the loss, driving unimportant weights to zero. Produces smaller (sparser) models.
L2 - Adds a penalty on the squared weights to the loss. Drives weights towards smaller values.
Dropout - Randomly shuts off neurons during training. Forces robust learning.
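As a rough illustration of the first two techniques, the L1 and L2 penalty terms can be computed and added to a loss as below. The `strength` parameter, the function names, and the sample weights are assumptions for the sketch; dropout is not shown.

```python
def l1_penalty(weights, strength):
    """L1 term: strength times the sum of absolute weight values."""
    return strength * sum(abs(w) for w in weights)

def l2_penalty(weights, strength):
    """L2 term: strength times the sum of squared weight values."""
    return strength * sum(w * w for w in weights)

weights = [0.5, -2.0, 0.0, 1.5]
data_loss = 1.0  # hypothetical loss measured on the training data alone

loss_with_l1 = data_loss + l1_penalty(weights, strength=0.1)
loss_with_l2 = data_loss + l2_penalty(weights, strength=0.1)
print(loss_with_l1, loss_with_l2)
```

Because the L2 term squares each weight, the large weight (-2.0) dominates it, while the L1 term charges every unit of weight equally.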

20
Q

What is a genetic algorithm?

A

Probabilistic search method adopting natural selection approach

21
Q

List four benefits of a genetic algorithm.

A

Supports multi-objective optimization
Good for noisy data
Always provides an answer
Easily parallelizable

22
Q

How does a genetic algorithm represent solutions?

A

Encoded, often using binary bit strings (chromosomes)

23
Q

Describe the broad steps of a genetic algorithm.

A

Initialize a population of solutions
Evaluate each using an objective function
Create new solutions via selection, crossover, mutation and elitism
Replace old population and repeat
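The steps above can be sketched as one loop. This is a minimal illustration on a toy problem (maximize the number of 1-bits, often called OneMax); the parameter values, the single-point crossover, and the fitness-weighted draw via `random.choices` are all assumptions, not the only way to build a genetic algorithm.

```python
import random

def fitness(chromosome):
    """Objective function for the toy problem: count the 1-bits."""
    return sum(chromosome)

def evolve(population_size=20, length=16, generations=40, mutation_rate=0.02, seed=1):
    rng = random.Random(seed)
    # Initialize a population of random bit-string solutions.
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(population_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        # Evaluate each solution with the objective function.
        scores = [fitness(c) for c in population]
        history.append(max(scores))
        # Elitism: the best solution passes through unchanged.
        next_population = [max(population, key=fitness)[:]]
        while len(next_population) < population_size:
            # Selection: fitness-weighted draw (+1 keeps every weight positive).
            mother, father = rng.choices(population, weights=[s + 1 for s in scores], k=2)
            # Crossover: head of one parent joined to the tail of the other.
            cut = rng.randint(1, length - 1)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            next_population.append(child)
        # Replace the old population and repeat.
        population = next_population
    return max(population, key=fitness), history

best, history = evolve()
print(fitness(best), history[0], history[-1])
```

Because of elitism, the best fitness recorded in `history` never decreases from one generation to the next.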

24
Q

Define selection in a genetic algorithm.

A

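Roulette-wheel selection:
Solutions are arranged by fitness value.
The total fitness is calculated.
Each solution's share (%) of the total fitness is calculated.
A biased roulette wheel is built from these shares, so fitter solutions are more likely to be picked as parents.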
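A minimal sketch of the roulette wheel, assuming all fitness values are positive. The function names and the four sample solutions are hypothetical.

```python
import random

def selection_probabilities(fitness_values):
    """Each solution's share of the total fitness."""
    total = sum(fitness_values)
    return [value / total for value in fitness_values]

def roulette_select(population, fitness_values, rng):
    """Spin the biased wheel once and return the solution it lands on."""
    probabilities = selection_probabilities(fitness_values)
    spin = rng.random()
    cumulative = 0.0
    for individual, probability in zip(population, probabilities):
        cumulative += probability
        if spin < cumulative:
            return individual
    return population[-1]  # guard against floating-point rounding

population = ["A", "B", "C", "D"]
fitness_values = [1.0, 2.0, 3.0, 4.0]

probabilities = selection_probabilities(fitness_values)
parent = roulette_select(population, fitness_values, random.Random(0))
print(probabilities, parent)
```

Solution "D" holds 40% of the wheel and "A" only 10%, so "D" is four times as likely to be chosen on any spin.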

25
Q

Define crossover in a genetic algorithm.

A

Using the selection roulette wheel, choose two parents and swap one or more corresponding bits (or bit segments) between them to produce offspring.
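One common variant is single-point crossover, sketched below on bit strings. The choice of a single cut point, the function name, and the sample parents are assumptions for illustration; parent selection is taken as already done.

```python
def single_point_crossover(parent_a, parent_b, cut):
    """Swap the tails of two equal-length bit strings at position `cut`."""
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b

children = single_point_crossover("11110000", "00001111", cut=3)
print(children)
```

Each child keeps the first three bits of one parent and takes the remaining five from the other.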

26
Q

Define elitism in a genetic algorithm.

A

Select some of the best candidate solutions and pass them into the next generation without modification.

27
Q

Define mutation in a genetic algorithm.

A

Random alteration or flip of a bit in a solution.
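A minimal sketch of bit-flip mutation on a bit string, assuming each bit is flipped independently with probability `mutation_rate`. The function name and the rate values are hypothetical.

```python
import random

def mutate(chromosome, mutation_rate, rng):
    """Flip each bit of a bit string independently with the given probability."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < mutation_rate else bit
        for bit in chromosome
    )

rng = random.Random(0)
unchanged = mutate("1010", mutation_rate=0.0, rng=rng)  # rate 0 never flips
inverted = mutate("1010", mutation_rate=1.0, rng=rng)   # rate 1 flips every bit
print(unchanged, inverted)
```

In practice the rate is kept small so that mutation introduces occasional variety without destroying good solutions.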