Lecture 3-Intro to ML Flashcards

1
Q

Why use machine learning (4 reasons)?

A

-Increase in Data Generation
-Improve Decision Making
-Uncover patterns and trends in data
-Solve complex problems

2
Q

When was the term "machine learning" coined, and by whom?

A

1959, by Arthur Samuel

3
Q

What is the ML process when working with training data (8 steps)?

A

1.Dataset
2.Data cleaning
3.Feature Engineering
4.Training data
5.Learning algorithm
6.Train model
7.Score model
8.Evaluate model
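
The eight steps above can be sketched end to end as follows. This is a toy illustration only: the records, the BMI feature, and the threshold "learning algorithm" are all hypothetical choices made for this sketch and are not from the lecture.

```python
# 1. Dataset: hypothetical (height_cm, weight_kg, label) records; None = missing value
dataset = [
    (170, 65, "normal"), (160, 90, "high"), (180, None, "normal"),
    (175, 95, "high"), (165, 55, "normal"), (150, 80, "high"),
]

# 2. Data cleaning: drop records that contain a missing value
clean = [record for record in dataset if None not in record]

# 3. Feature engineering: derive one feature (body-mass index) from the raw columns
def bmi(height_cm, weight_kg):
    return weight_kg / (height_cm / 100) ** 2

# 4. Training data: (feature, label) pairs
training_data = [(bmi(h, w), label) for h, w, label in clean]

# 5. Learning algorithm: place a threshold midway between the two class means
def learn_threshold(data):
    high = [x for x, y in data if y == "high"]
    normal = [x for x, y in data if y == "normal"]
    return (sum(high) / len(high) + sum(normal) / len(normal)) / 2

# 6. Train model: here the "model" is just the learned threshold
threshold = learn_threshold(training_data)

# 7. Score model: apply the model to examples it was not trained on
def score(x):
    return "high" if x > threshold else "normal"

held_out = [(bmi(172, 70), "normal"), (bmi(158, 85), "high")]
predictions = [score(x) for x, _ in held_out]

# 8. Evaluate model: fraction of held-out examples classified correctly
accuracy = sum(p == y for p, (_, y) in zip(predictions, held_out)) / len(held_out)
```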

4
Q

What is the ML process when working with new data (6 steps)?

A

1.Dataset
2.Data cleaning
3.Feature Engineering
4.New data
5.Score model
6.Evaluate model

5
Q

Which task takes the most time in the ML process?

A

Data cleaning (commonly estimated to take 80-90% of the time)

6
Q

What are the 3 types of Machine Learning?

A

-Supervised learning
-Unsupervised learning
-Reinforcement learning

7
Q

What is supervised learning?

A

The machine learns by using labelled data

8
Q

What is unsupervised learning?

A

The machine is trained on unlabeled data without any guidance

9
Q

What is reinforcement learning?

A

An agent interacts with its environment by taking actions and learns from the resulting errors and rewards

10
Q

What does EDA stand for in supervised and unsupervised learning?

A

Exploratory Data Analysis

11
Q

What is ML widely used in?

A

-In data mining, aka Knowledge Discovery in Databases (KDD)
Examples: clustering, anomaly detection, association rule mining

12
Q

What is prior (or unconditional) probability?

A

Probability of an event before any evidence is obtained

13
Q

What is posterior (or conditional) probability?

A

Probability of an event given that you know that some evidence is true
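
A small worked example of how a prior becomes a posterior through Bayes' theorem. The spam scenario, the numbers, and the function and parameter names are hypothetical and chosen only for illustration.

```python
def posterior(prior, likelihood, false_alarm_rate):
    """P(H | E) for a binary hypothesis H, using Bayes' theorem.

    prior            = P(H), before any evidence is seen
    likelihood       = P(E | H)
    false_alarm_rate = P(E | not H)
    """
    # Total probability of the evidence: P(E) = P(E|H)P(H) + P(E|not H)P(not H)
    evidence = likelihood * prior + false_alarm_rate * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: 1% of emails are spam (prior); a keyword shows up
# in 90% of spam emails and in 5% of non-spam emails.
p = posterior(prior=0.01, likelihood=0.90, false_alarm_rate=0.05)
print(round(p, 3))  # 0.154
```

Seeing the keyword raises the probability of spam from the prior of 1% to a posterior of about 15%.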

14
Q

What is a Naive Bayes classifier?

A

A simple probabilistic classifier based on Bayes’ theorem where:
-there’s a strong independence assumption (which often does not hold in practice)
-the features/attributes are assumed to be conditionally independent given the class
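
A minimal sketch of a Naive Bayes classifier for categorical features, to show where the independence assumption enters. The toy data and function names are hypothetical, and the add-one (Laplace) smoothing is an assumption of this sketch rather than something stated on the card.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab

def predict(model, row):
    """Return (most probable class, posterior probability of each class)."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for c, n_c in class_counts.items():
        # log P(c) + sum_i log P(x_i | c): features treated as independent given c
        score = math.log(n_c / total)
        for i, value in enumerate(row):
            count = value_counts[(i, c)][value]
            # Add-one smoothing so an unseen value does not zero out the product
            score += math.log((count + 1) / (n_c + len(vocab[i])))
        log_scores[c] = score
    # Normalise the scores so they sum to 1 (the classifier's confidence)
    top = max(log_scores.values())
    unnormalised = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnormalised.values())
    probs = {c: v / z for c, v in unnormalised.items()}
    return max(probs, key=probs.get), probs

# Hypothetical toy data: (outlook, windy) -> play?
rows = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"),
        ("rainy", "no"), ("sunny", "no")]
labels = ["yes", "yes", "no", "no", "yes"]
model = train_naive_bayes(rows, labels)
label, probs = predict(model, ("sunny", "no"))
```

Here `label` is the predicted class and `probs` holds the normalised class probabilities, which is the "confidence" a Naive Bayes classifier reports.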

15
Q

What are 4 pros of Naive Bayes Classification?

A

-Very effective on real-world tasks
-Used as baseline algo before trying other methods
-Fast, simple
-Gives confidence in its class predictions

16
Q

What is the main con of Naive Bayes Classification?

A

-Makes a strong assumption of conditional independence that is often INCORRECT

17
Q

How do we evaluate a learning model (i.e. check that what was learned is correct)?

A

You run your classifier on a data set of unseen examples (that you did not use for training) for which you know the correct classification

18
Q

What are the 3 sub-sets we can divide the data set into?

A

1.Actual training set (~80% of the training data)
2.Validation set (~20% of the training data)
3.Test set (~20% of the whole data set)
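
A minimal sketch of one way to produce the three subsets, assuming the test set is held out first and the validation set is then taken from what remains. The function name, the 20% fractions, and the fixed seed are illustrative assumptions.

```python
import random

def three_way_split(examples, test_fraction=0.2, validation_fraction=0.2, seed=0):
    """Shuffle, hold out a test set, then take a validation set from the rest."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = int(len(shuffled) * test_fraction)
    test, remainder = shuffled[:n_test], shuffled[n_test:]
    n_val = int(len(remainder) * validation_fraction)
    validation, train = remainder[:n_val], remainder[n_val:]
    return train, validation, test

train, validation, test = three_way_split(range(100))
print(len(train), len(validation), len(test))  # 64 16 20
```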

19
Q

What are the metrics used when evaluating a learning model?

A

-Accuracy
-Recall
-Precision
-F-measure

20
Q

What is the def of accuracy?

A

-% of instances of the test set the algo correctly classifies
-How many % were correct overall?

21
Q

What is the definition of recall?

A

What % of the actual instances of class C were correctly found?

22
Q

What is the def of precision?

A

Of the instances detected as class C, what % were correct?
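
The four metrics can be computed from true and predicted labels as in the sketch below. The function name and the example labels are hypothetical, and the F-measure here is assumed to be the balanced F1 (the harmonic mean of precision and recall).

```python
def evaluate(y_true, y_pred, positive):
    """Accuracy over all classes; precision, recall and F1 for class `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # found correctly
    fp = sum(t != positive and p == positive for t, p in pairs)  # wrongly detected
    fn = sum(t == positive and p != positive for t, p in pairs)  # missed
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical test set: 3 instances of class "C", 5 of class "other"
y_true = ["C", "C", "C", "other", "other", "other", "other", "other"]
y_pred = ["C", "C", "other", "C", "other", "other", "other", "other"]
metrics = evaluate(y_true, y_pred, positive="C")
```

In this example 6 of 8 predictions are correct (accuracy 0.75), while 2 of the 3 detected instances of C are correct and 2 of the 3 actual instances of C are found (precision and recall both 2/3).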

23
Q

When to use accuracy?

A

When all classes are equally important and represented

24
Q

When to use recall, precision & f-measure?

A

When one class is more important than the others