1 Flashcards

Question 1

Q

Define Machine Learning

Answer

A

A system that learns from a set of data to perform a given task

Question 2

Q

What are 4 problems suited to machine learning

Answer

A

When the problem is complex and has no known algorithmic solution
When long lists of hand-tuned rules would otherwise be required
When the environment fluctuates over time
To help humans learn (e.g. gaining insight from large amounts of data)

Question 3

Q

Name 2 common supervised tasks

Answer

A

Classification – Putting instances into different classes

Regression – Predicting a target numeric value

Question 4

Q

Name 4 common unsupervised tasks

Answer

A

Clustering – Categorising instances into approximate groups
Visualization – Producing a 2D or 3D representation of data
Dimensionality Reduction – Simplifying data without losing too much information e.g. by feature extraction (combining correlated features)
Association Rule Learning – Discovering relations between attributes

Question 5

Q

What type of ML algorithm would you use to allow a robot to walk in unknown locations

Answer

A

Reinforcement Learning

Question 6

Q

What type of algorithm would you use to segment customers into multiple groups

Answer

A

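A clustering algorithm (unsupervised learning) if you don’t know how to define the groups; a classification algorithm (supervised learning) if you already know the groups and have labelled examples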

Question 7

Q

Is spam detection supervised or unsupervised

Answer

A

Supervised, as the training data is labelled (spam or not spam)

Question 8

Q

Main challenges of machine learning

Answer

A

Lack of Data
Poor data quality
Non-representative data
Uninformative features
Excessively simple models underfitting data
Excessively complex models overfitting the data

Question 9

Q

What's a test set and why would you use it

Answer

A

Data is often split into 2 sets:
One is for training the model (usually 80%)
The other (the test set) is for testing the model and estimating the generalisation error on new instances before the model is deployed
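
The split described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library API: the function name `train_test_split`, the `test_ratio` and `seed` parameters, and the toy data are all assumptions made for the example.

```python
import random

def train_test_split(instances, test_ratio=0.2, seed=42):
    """Shuffle a copy of the data and hold out test_ratio of it for testing."""
    shuffled = list(instances)
    # A fixed seed keeps the split reproducible between runs (assumed choice).
    random.Random(seed).shuffle(shuffled)
    test_size = round(len(shuffled) * test_ratio)
    return shuffled[test_size:], shuffled[:test_size]

data = list(range(100))                 # toy dataset of 100 instances
train_set, test_set = train_test_split(data)
print(len(train_set), len(test_set))    # 80 20
```

The test set is only used at the end, so the error measured on it approximates the error on unseen data.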

Question 10

Q

What algorithm relies on similarity

Answer

A

Instance-based learning systems memorise the training data, then use a similarity measure to make predictions on new instances
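
One well-known instance-based method is k-nearest neighbours. The sketch below assumes Euclidean distance as the similarity measure and uses made-up points and labels; `knn_predict` is a hypothetical helper name, not a library function.

```python
import math
from collections import Counter

def knn_predict(training_data, new_point, k=3):
    """Predict a label by majority vote among the k closest stored instances.

    training_data is a list of (features, label) pairs kept as-is:
    there is no training step beyond memorising the data.
    """
    by_distance = sorted(
        training_data,
        key=lambda pair: math.dist(pair[0], new_point),  # Euclidean distance
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy data (assumed for illustration): two well-separated groups.
training_data = [
    ((1.0, 1.0), "small"), ((1.5, 2.0), "small"), ((2.0, 1.0), "small"),
    ((8.0, 8.0), "large"), ((9.0, 7.5), "large"), ((8.5, 9.0), "large"),
]
print(knn_predict(training_data, (2.0, 2.0)))  # small
print(knn_predict(training_data, (7.0, 8.0)))  # large
```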

Question 11

Q

What do model based algorithms search for and what is the most common strategy they use to succeed and how do they make predictions

Answer

A

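They search for the optimal values of the model parameters so the model generalises well

They are trained by minimising a cost function that measures how badly the model fits the training data

They make predictions by feeding a new instance's features into the model's prediction function with the learned parameters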
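
As a rough illustration of training by minimising a cost function, the sketch below fits a straight line with gradient descent on the mean squared error. The model form, the learning rate, the step count and the toy data are assumptions chosen for the example.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Find theta0, theta1 minimising the MSE of y = theta0 + theta1 * x."""
    theta0, theta1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to each parameter.
        grad0 = 2 * sum(errors) / n
        grad1 = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1

# Toy data generated from y = 1 + 2x (assumed for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

theta0, theta1 = fit_line(xs, ys)
# Prediction: plug a new instance into the model with the learned parameters.
prediction = theta0 + theta1 * 5.0
print(round(theta0, 3), round(theta1, 3), round(prediction, 3))
```

The learned parameters should end up close to 1 and 2, so the prediction for x = 5 is close to 11.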

Question 12

Q

If your model performs well on training data but poorly on test data what is happening and name 3 solutions

Answer

A

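The model is overfitting the training data. Possible solutions:
Getting more training data
Simplifying the model (or regularising it)
Reducing noise in the training data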

Question 13

Q

What's a validation set

Answer

A

A held-out part of the training data used to compare models and tune hyperparameters

Question 14

Q

What can go wrong if you use the test set for hyperparameter tuning

Answer

A

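You risk overfitting the test set, so the measured generalisation error is too optimistic and the model generalises worse than expected on new data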

Question 15

Q

What is cross-validation and why is it better than using a validation set

Answer

A

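The training set is split into several folds; each model is trained on all but one fold and validated on the remaining one, rotating through the folds. This allows comparing models without the need for a separate validation set, which saves training data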
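
A minimal sketch of k-fold cross-validation in plain Python follows. The helper names (`k_fold_splits`, `cross_val_error`), the two candidate models and the toy data are assumptions for illustration; the folds are contiguous and unshuffled to keep the example short.

```python
def k_fold_splits(n_instances, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_instances))
    fold_size = n_instances // k
    for fold in range(k):
        start = fold * fold_size
        # The last fold absorbs any remainder.
        end = n_instances if fold == k - 1 else start + fold_size
        yield indices[:start] + indices[end:], indices[start:end]

def cross_val_error(fit, xs, ys, k=5):
    """Average validation MSE over k folds; fit returns a predict function."""
    fold_errors = []
    for train_idx, val_idx in k_fold_splits(len(xs), k):
        predict = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        errors = [(predict(xs[i]) - ys[i]) ** 2 for i in val_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / len(fold_errors)

def fit_mean(xs, ys):
    """Candidate model 1: always predict the mean target value."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Candidate model 2: least-squares straight line (closed form)."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

# Toy data from y = 3x + 2 (assumed for illustration).
xs = [float(i) for i in range(20)]
ys = [3.0 * x + 2.0 for x in xs]

mean_error = cross_val_error(fit_mean, xs, ys)
line_error = cross_val_error(fit_line, xs, ys)
print(mean_error > line_error)  # True: the line model is the better choice
```

Every instance is used for validation exactly once, so no data has to be permanently set aside for model comparison.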

Question 16

Q

What is reinforcement learning

Answer

A

A learning agent observes an environment, selects and performs actions, and gets rewards or penalties in return, in order to learn the best strategy (policy)
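
The observe, act, reward loop can be sketched with tabular Q-learning, one common reinforcement learning algorithm. Everything here is assumed for illustration: a five-cell corridor environment, a reward of 1.0 for reaching the last cell, purely random exploration, and the chosen `alpha` and `gamma` values.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # step left or step right

def step(state, action):
    """Environment: move within the corridor; reward 1.0 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from rewards gathered while exploring at random."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)                 # select an action
            next_state, reward = step(state, action)     # act, observe, get reward
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate toward reward plus discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

q = q_learning()
# The policy picks the highest-valued action in each non-goal cell.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # [1, 1, 1, 1], i.e. always step right toward the goal
```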

Question 17

Q

What is semi supervised learning

Answer

A

When some data is labelled but the majority isn’t

An example is a photo service that recognises that the same person appears in several photos (unsupervised), then needs only one label per person to name them in every photo

(17 cards)