ML Flashcards

Question 1

Q

What are features and labels?

Answer

A

Features: Values that usefully characterize the things we wish to classify.
Labels: The outcome value attached to each example in the training data, i.e. the answer the model has to learn to predict.
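
As an illustration, the split between features and labels can be sketched in a few lines of Python. The fruit dataset and its column meanings are made up for this example and are not part of the original card.

```python
# Hypothetical toy dataset: each row describes one piece of fruit.
# Features: (weight in grams, surface smoothness on a 0-10 scale).
# Label: the fruit type we want a model to predict.
samples = [
    ((150, 8), "apple"),
    ((170, 9), "apple"),
    ((120, 3), "orange"),
    ((130, 2), "orange"),
]

# Separate every example into its feature vector and its label.
features = [x for x, _ in samples]
labels = [y for _, y in samples]

print(features)
print(labels)
```

A supervised algorithm would receive `features` as input and `labels` as the correct answers.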

Question 2

Q

Name the three types of Machine Learning (plus a very short explanation).

Answer

A

-> Supervised machine learning algorithms
° the possible outcomes are already known and the training data is labeled with the correct answers
-> Unsupervised machine learning algorithms
° the training data has no correct answers, specific outcomes, or labels. The algorithms help to discover interesting patterns in the data.
-> Reinforcement machine learning algorithms
° the machine is exposed to an environment where it trains itself continually by trial and error to make accurate decisions

Question 3

Q

What are the seven steps of Machine Learning (plus a very short explanation)?

Answer

A

Data collection
collecting training data
Data preparation
the data needs to be de-duplicated and errors need to be removed
Choosing a model
the model needs to meet the business goal
Training
use your training data and incrementally improve the predictions of the model
Evaluation
testing the trained model against an unused control dataset
Parameter tuning
adjust the originally set parameters and test whether the model improves
Prediction
answer questions using predictions
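
The seven steps above can be walked through on a toy problem. This is only a sketch under stated assumptions: the study-hours data, the threshold model, and the `step` setting are hypothetical, and tuning on the control set is done for brevity (a real workflow would usually keep a separate validation set).

```python
# 1. Data collection: hypothetical (hours_studied, passed) records.
raw = [(1, 0), (2, 0), (2, 0), (3, 0), (4, 1), (5, 1), (6, 1), (6, 1), (7, 1), (None, 1)]

# 2. Data preparation: remove erroneous rows, then de-duplicate.
clean = sorted(set(r for r in raw if r[0] is not None))
train, control = clean[::2], clean[1::2]  # hold back an unused control set

# 3. Choosing a model: predict "passed" when hours >= threshold.
def predict(threshold, hours):
    return 1 if hours >= threshold else 0

def accuracy(threshold, data):
    return sum(predict(threshold, x) == y for x, y in data) / len(data)

# 4. Training: raise the threshold step by step, keeping the best one seen.
def fit(data, step):
    best, candidate = 0.0, 0.0
    while candidate <= max(x for x, _ in data):
        if accuracy(candidate, data) > accuracy(best, data):
            best = candidate
        candidate += step
    return best

# 5. Evaluation on the control set, and 6. parameter tuning of the step size.
results = {step: accuracy(fit(train, step), control) for step in (6.0, 1.0)}
best_step = max(results, key=results.get)
model = fit(train, best_step)

# 7. Prediction: answer a new question with the tuned model.
print(results, model, predict(model, 4.5))
```

The coarse step size misses the best threshold, so the finer setting wins the tuning round.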

Question 4

Q

What are the two kinds of problems that can be solved using supervised learning (plus a very short explanation)?

Answer

A

-> Classification
we have categorical output such as “black”, “teaching”, “non-teaching”
-> Regression
we have real-valued output such as a distance or a weight in kilograms

Question 5

Q

Give all ML techniques and the kinds of problems they solve.

Answer

A

-> supervised learning
- classification
- regression

-> unsupervised learning
- Clustering (grouping similar data)
- Anomaly detection (detecting unusual data)
- Association (interesting relations)

Question 6

Q

Using linear regression, the system estimates a regression function with the equation f(x) = b + mx. Can you explain the values of b and m?

Answer

A

b
the point where the estimated regression line crosses the y-axis (the intercept)

m
determines the slope of the estimated regression line
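
A minimal sketch of how b and m can be estimated with ordinary least squares. The function name `fit_line` and the sample points are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Return (b, m) of the least-squares line f(x) = b + m*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    # Intercept: the line passes through the point of means.
    b = mean_y - m * mean_x
    return b, m

xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]  # points that lie exactly on y = 1 + 2x
b, m = fit_line(xs, ys)
print(b, m)
```

Here the line crosses the y-axis at b and rises by m for every unit increase in x.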

Question 7

Q

Can you explain the terms bias and variance? Are bias and variance low or high for the straight line and for the squiggly line?

Answer

A

bias: the error a model makes because it is too simple to capture the true relationship; it shows in how poorly the model matches the training data set
variance: how much the model changes when it is trained on different portions of the training data set

straight line: bias high and variance low
squiggly line: bias low and variance high
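
The contrast can be made concrete with a small experiment. This is a sketch on hand-made data: the straight line is a least-squares fit, and a nearest-point lookup stands in for the squiggly line (an assumption chosen to keep the code short).

```python
def fit_line(points):
    """Least-squares straight line; returns a prediction function."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    m = (sum((x - mean_x) * (y - mean_y) for x, y in points)
         / sum((x - mean_x) ** 2 for x, _ in points))
    b = mean_y - m * mean_x
    return lambda x: b + m * x

def fit_squiggle(points):
    """Stand-in for the squiggly line: copy the y of the nearest training x."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

def mse(model, points):
    """Mean squared error of a model on a set of points."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

def gap(model_a, model_b, queries):
    """Average disagreement between two models on the query points."""
    return sum(abs(model_a(x) - model_b(x)) for x in queries) / len(queries)

# Hypothetical noisy data around y = x, split into two portions.
data = [(0, 0.5), (1, 0.5), (2, 1.5), (3, 3.5), (4, 3.5), (5, 5.5), (6, 6.5), (7, 6.5)]
part_a, part_b = data[::2], data[1::2]
queries = [0.5, 2.5, 4.5, 6.5]

for name, fit in (("straight", fit_line), ("squiggly", fit_squiggle)):
    model_a, model_b = fit(part_a), fit(part_b)
    print(name,
          "training error:", mse(model_a, part_a),
          "change between portions:", gap(model_a, model_b, queries))
```

The training error relates to bias, and the change between the two portions relates to variance.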

Question 8

Q

Name the three types of recommender systems (plus a short explanation).

Answer

A

Popularity based
only takes overall popularity into account when making a recommendation
Content based
it analyses the content and finds similar content
Collaborative filtering
it finds similar users and recommends something those users liked/watched
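
A minimal sketch of user-based collaborative filtering. The ratings, user names, and the choice of cosine similarity are assumptions for illustration; production systems use far larger data and more refined similarity measures.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating}.
ratings = {
    "ann": {"Alien": 5, "Heat": 4, "Up": 1},
    "bob": {"Alien": 5, "Heat": 5, "Jaws": 4},
    "cem": {"Up": 5, "Coco": 4, "Alien": 1},
}

def cosine(u, v):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user):
    """Recommend the most similar user's best-rated item that `user` has not rated."""
    others = [o for o in ratings if o != user]
    nearest = max(others, key=lambda o: cosine(ratings[user], ratings[o]))
    unseen = {i: r for i, r in ratings[nearest].items() if i not in ratings[user]}
    return max(unseen, key=unseen.get) if unseen else None

print(recommend("ann"))
```

Ann and Bob rate the same films highly, so Ann is offered the film Bob liked that she has not rated yet.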

Question 9

Q

Name three companies that use recommender systems intensively. What are the items they recommend?

Answer

A

Netflix -> movies
Amazon -> products to buy, e.g. books
Facebook -> people you might know: friends

Question 10

Q

Name the three challenges of collaborative filtering. Can you explain them?

Answer

A

Data sparsity
-> Users in general rate only a limited number of items

Cold start
-> Difficulty making recommendations for new users or new items

Scalability
-> The computational cost grows as the number of users or items increases

Question 11

Q

Explain the following term:
entropy

Answer

A

a measure of the uncertainty (impurity) of a particular distribution; it is highest when all outcomes are equally likely
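
For a set of class labels, entropy is commonly computed as H = -Σ p·log2(p), where p is the share of each class. A short sketch (the function name and sample labels are assumptions for this example):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy, in bits, of the class distribution in `labels`."""
    total = len(labels)
    return -sum((count / total) * log2(count / total)
                for count in Counter(labels).values())

mixed = ["yes", "yes", "no", "no"]   # evenly mixed classes
pure = ["yes", "yes", "yes", "yes"]  # a single class

print(entropy(mixed))
print(entropy(pure))
```

An evenly mixed set has the maximum entropy for two classes, while a set with a single class has none.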

Question 12

Q

Explain the following term:
information gain

Answer

A

the amount of information gained about a random variable; in a decision tree, the reduction in entropy achieved by splitting the data on a feature
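
In decision trees this is usually computed as the entropy of the parent set minus the size-weighted entropy of the subsets after a split. A sketch with made-up labels (the function names are assumptions for this example):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy, in bits, of the class distribution in `labels`."""
    total = len(labels)
    return -sum((count / total) * log2(count / total)
                for count in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the weighted entropy of the child subsets."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]
useless_split = [["yes", "no"], ["yes", "no"]]

print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

A split that separates the classes completely gains the full entropy of the parent, while a split that leaves both subsets as mixed as before gains nothing.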

Question 13

Q

Explain the following term:
leaf node

Question 14

Q

Explain the following term:
decision node

Answer

A

a node in which a decision is made on how to split the data

Question 15

Q

Explain the following term:
root node

Answer

A

The first node of the tree

Question 16

Q

Name one advantage and one disadvantage of decision trees.

Answer

A

+ easy to understand
- overfitting is quite common

Question 17

Q

What are the four steps in the Random Forest algorithm?

Answer

A

Select random samples from a given dataset (= bootstrapped datasets).
Construct a decision tree for each sample (using only a random subset of variables) and get a prediction result from each decision tree.
Perform a vote for each predicted result.
Select the prediction result with the most votes as the final prediction.
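
The four steps can be sketched with the standard library only. To keep it short, each "tree" here is a one-feature stump that predicts the class whose mean value is closest; that stand-in, the data, and all function names are assumptions for illustration, not a full Random Forest implementation.

```python
import random
from collections import Counter

def bootstrap(rows, rng):
    """Step 1: draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]

def train_stump(sample, feature):
    """Step 2 (simplified): a one-feature 'tree' that stores each class's mean value."""
    by_class = {}
    for features, label in sample:
        by_class.setdefault(label, []).append(features[feature])
    means = {label: sum(values) / len(values) for label, values in by_class.items()}
    return lambda features: min(means, key=lambda c: abs(means[c] - features[feature]))

def train_forest(rows, n_trees, rng):
    """Train one stump per bootstrapped dataset, each on a randomly chosen variable."""
    n_features = len(rows[0][0])
    return [train_stump(bootstrap(rows, rng), rng.randrange(n_features))
            for _ in range(n_trees)]

def predict(forest, features):
    votes = Counter(tree(features) for tree in forest)  # step 3: collect the votes
    return votes.most_common(1)[0][0]                   # step 4: majority wins

# Hypothetical two-feature dataset with two well-separated classes.
rows = [((1.0, 1.2), 0), ((1.5, 0.8), 0), ((2.0, 1.1), 0), ((1.2, 1.9), 0),
        ((5.0, 4.6), 1), ((4.5, 5.2), 1), ((5.5, 5.0), 1), ((4.8, 5.4), 1)]

forest = train_forest(rows, n_trees=25, rng=random.Random(0))
print(predict(forest, (5.0, 5.0)))
```

A real implementation grows full decision trees and picks a random subset of variables at every split rather than once per tree.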

Question 18

Q

Explain, using the dog analogy, how reinforcement learning works.

Answer

A

A dog is an agent in an environment. The environment can be your house.
A situation the dog encounters is a state, e.g. the dog is standing in the living room and a certain command is given in a certain tone.
The agent reacts by performing an action that moves it from one state to another; for example, the dog goes from standing to sitting.
After the transition, the agent can receive a reward or a penalty. The dog gets a treat or hears "no".
The policy is the strategy for choosing an action given a state, in the expectation of better outcomes.
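
The loop of state, action, and reward can be sketched as a tiny trial-and-error learner. Everything here is a simplifying assumption: a single state, one-step episodes, made-up reward values, and an epsilon-greedy choice between exploring and following the current policy.

```python
import random

ACTIONS = ["bark", "sit", "roll over"]
START = "standing, hears 'sit'"

def environment(state, action):
    """The owner's reaction to the dog: returns (next_state, reward)."""
    if state == START and action == "sit":
        return "sitting", 1.0   # treat
    return state, -1.0          # "no"

def train(episodes=50, alpha=0.5, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {action: 0.0 for action in ACTIONS}   # value estimate per action in START
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)       # trial and error (explore)
        else:
            action = max(ACTIONS, key=q.get)   # follow the current policy (exploit)
        _, reward = environment(START, action)
        # Move the estimate for this action towards the reward just received.
        q[action] += alpha * (reward - q[action])
    return q

q = train()
policy = max(ACTIONS, key=q.get)
print(policy)
```

After enough treats and "no"s, the learned policy for the command is the action with the highest estimated value.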

Question 19

Q

Why are there 500 different states in the taxi problem?

Answer

A

5 × 5 × 5 × 4 = 500
- a 5×5 grid, so 25 possible taxi positions
- 5 possible locations for our passenger (the four designated locations plus inside the taxi)
- 4 possible locations where we can drop off our passenger
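
The count can be checked by packing the four components into a single state number. This is a sketch of one possible encoding; the function names and the ordering of the components are assumptions for this example.

```python
def encode(taxi_row, taxi_col, passenger_loc, destination):
    """Pack (row 0-4, column 0-4, passenger 0-4, destination 0-3) into one integer."""
    return ((taxi_row * 5 + taxi_col) * 5 + passenger_loc) * 4 + destination

def decode(state):
    """Inverse of encode: recover the four components from the state number."""
    state, destination = divmod(state, 4)
    state, passenger_loc = divmod(state, 5)
    taxi_row, taxi_col = divmod(state, 5)
    return taxi_row, taxi_col, passenger_loc, destination

all_states = {
    encode(r, c, p, d)
    for r in range(5) for c in range(5) for p in range(5) for d in range(4)
}
print(len(all_states))       # 500
print(encode(3, 1, 2, 0))    # 328
```

Every combination maps to a distinct number from 0 to 499, which is why a lookup table with 500 rows is enough.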

Question 20

Q

What rewards and/or penalties are involved in the taxi problem?

Answer

A

high positive reward for a successful dropoff
a penalty if it tries to drop off the passenger at a wrong location
a slight negative reward for every time-step in which it has not yet reached the destination
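
As a sketch, the reward scheme can be written as one function. The concrete numbers (+20, -10, -1) are an assumption taken from the commonly used Gym Taxi environment, since the card itself gives no values; the function signature is hypothetical.

```python
def reward(action, taxi_pos, passenger_in_taxi, destination):
    """Simplified taxi reward; the values are assumed, not given in the card."""
    if action == "dropoff":
        if passenger_in_taxi and taxi_pos == destination:
            return 20    # high positive reward for a successful dropoff
        return -10       # penalty for a dropoff at a wrong location
    return -1            # slight negative reward for every other time-step

print(reward("dropoff", (0, 4), True, (0, 4)))
print(reward("dropoff", (2, 2), True, (0, 4)))
print(reward("north", (2, 2), True, (0, 4)))
```

The small per-step penalty pushes the agent towards the shortest route.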

Question 21

Q

What are the six different actions that can be taken in the taxi problem?

Answer

A

south
north
east
west
pickup
dropoff

(21 cards)