Machine Learning Flashcards

Learn the basics of ML

1
Q

What does ML allow us to do?

A

By using data from past observations, we can predict future outcomes or values.
e.g. ice cream sales based on historical sales and weather records

2
Q

What is a function in ML terms?

A

An ML model encapsulates a function that calculates an output value based on one or more input values.

The process of defining that function is known as training.
Using the function to predict new values is known as inferencing.

3
Q

ML Steps for Training and Inferencing

A
  1. Obtain training data of past observations. Each observation has the attributes, or features (x), of the thing being observed and the known output, or label (y): [x1, x2, x3], y. You typically need lots of these observations.

ex. for the ice cream sales case, the features could be temperature, rainfall, and wind speed, and the number sold would be the label

  2. An algorithm is applied to the data to determine the relationship between the features and the label. The specific algorithm depends on the problem, but the main goal is to fit a function to the data.
  3. The result of the algorithm is a model: a function f such that y = f(x). We can now use it to make predictions.
  4. The training phase is complete. Now we can perform inferencing: use the function to get a predicted value, represented by y hat (ŷ).
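
The steps above can be sketched in Python. This is only a minimal illustration: it assumes a single feature (temperature), made-up sales figures, and ordinary least squares standing in for whichever algorithm suits the problem. The name `train` and the data are hypothetical.

```python
# Hypothetical training data: feature x = temperature, label y = ice creams sold.
temperatures = [20.0, 22.0, 25.0, 28.0, 30.0]
sales = [40.0, 44.0, 50.0, 56.0, 60.0]


def train(xs, ys):
    """Fit y = w * x + b to the data with ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    w = covariance / variance
    b = mean_y - w * mean_x
    # The returned function is the trained model f(x).
    return lambda x: w * x + b


# Training: derive f from past observations.
f = train(temperatures, sales)

# Inferencing: apply f to a new feature value to get the prediction y hat.
y_hat = f(26.0)
print(round(y_hat, 2))
```

Because the made-up labels are exactly twice the temperature, the fitted function here is y = 2x, so the prediction for 26 degrees is 52.
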
4
Q

What are the different types of Supervised ML?

A

Regression and Classification (Binary and Multiclass)

5
Q

What is an example of Unsupervised ML?

A

Clustering

6
Q

Supervised ML

A

Supervised ML is when the training data includes both feature values and labels

7
Q

Regression

A

The label predicted by the model is a numeric value.

e.g. ice cream sales, fuel efficiency, property price

8
Q

Binary Classification

A

The label indicates whether the observed item is or isn't something, i.e. it takes one of two values.

e.g. is a patient diabetic, based on weight, age, and blood measurements
e.g. will a customer default on a loan, based on income, age, and credit history

9
Q

Multiclass Classification

A

An extension of binary classification in which the label can be one of more than two classes.

e.g. which of 3 penguin species an individual belongs to, based on physical measurements
e.g. the genre of a movie, based on cast, director, and budget

10
Q

Unsupervised ML

A

Involves training models using data that consists only of feature values, without any known labels. The ML algorithm determines relationships between the features of the observations in order to group them.

11
Q

Clustering

A

The most common form of unsupervised ML.
A clustering algorithm identifies similarities between observations based on their features and groups them into discrete clusters.

e.g. group similar flowers based on size, number of leaves, and number of petals
It can be thought of as similar to multiclass classification, except that with clustering we do not have predefined labels.
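
As a rough illustration of grouping without labels, here is a sketch of k-means, one common clustering algorithm (chosen here as an assumption; the card does not name a specific one). It works on a single made-up feature, and the function name `k_means_1d`, the values, and the starting centroids are all hypothetical.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group 1-D values around the given starting centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical feature: petal length of six flowers, with no labels attached.
petal_lengths = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centroids, clusters = k_means_1d(petal_lengths, centroids=[0.0, 6.0])
print(clusters)
```

The algorithm never sees a species name; it only finds that three flowers sit near 1.0 and three near 5.0.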

12
Q

Regression Model process

A
  1. Split the training data into a subset used to train the model and a held-back subset used to validate it.
  2. Use a regression algorithm, such as linear regression, to fit a model to the training subset.
  3. Use the held-back validation data to test the model by predicting labels for its features.
  4. Compare the actual labels of the validation data against the predicted ones, then aggregate the differences to see how well the model performs.

This process can be repeated with different algorithms and parameters.
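
One pass through these four steps might look like the sketch below. It assumes made-up data, a fixed split, and one-feature least squares as the regression algorithm. All names (`fit_line`, `training`, `validation`) are illustrative.

```python
# Hypothetical observations: (temperature, ice creams sold).
data = [(18.0, 35.0), (21.0, 43.0), (24.0, 47.0),
        (27.0, 55.0), (30.0, 59.0), (33.0, 67.0)]

# Step 1: hold back every third row for validation; train on the rest.
# (A real split is usually random; this one is fixed to stay reproducible.)
validation = data[2::3]
training = [row for i, row in enumerate(data) if i % 3 != 2]


def fit_line(rows):
    """Step 2: fit y = w * x + b with ordinary least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    w = (sum((x - mean_x) * (y - mean_y) for x, y in rows)
         / sum((x - mean_x) ** 2 for x, _ in rows))
    b = mean_y - w * mean_x
    return w, b


w, b = fit_line(training)

# Step 3: predict labels for the held-back features.
predicted = [w * x + b for x, _ in validation]

# Step 4: aggregate the differences (mean absolute error here).
actual = [y for _, y in validation]
mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
print(mae)
```

Repeating the process with a different algorithm or different parameters would mean swapping out `fit_line` and comparing the resulting error.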

13
Q

What are some evaluation metrics for regression?

A

Mean Absolute Error (MAE)
Mean Squared Error (MSE)
Root Mean Squared Error (RMSE)
Coefficient of Determination (R^2)
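
For reference, the four metrics can be computed directly from their standard definitions. The sketch below uses made-up actual and predicted values, and the function name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and R^2 for paired actual/predicted labels."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n        # mean of absolute errors
    mse = sum(e ** 2 for e in errors) / n        # mean of squared errors
    rmse = math.sqrt(mse)                        # back in the label's units
    mean_actual = sum(actual) / n
    ss_res = sum(e ** 2 for e in errors)         # variation left unexplained
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total variation
    r2 = 1 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical validation labels and model predictions.
actual = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 73.0, 79.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

Lower MAE, MSE, and RMSE indicate a better fit, while R^2 closer to 1 means the model explains more of the variation in the labels.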

14
Q

What is calculated in a classification model?

A

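Probability values for class membership, instead of the numeric label values predicted in regression.
For binary classification, the probability is between 0 and 1, and the predicted class is 1 or 0 (yes/no, true/false), usually depending on whether the probability passes a threshold.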

15
Q

What kind of shape does the function for binary classification take?

A

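It takes a sigmoid shape, like an S, with values between 0 and 1. One algorithm that produces such a function is logistic regression, which despite its name is used for classification. (The name comes from the logistic function; the flat top and bottom of the S may resemble a logarithmic curve, but the function is not a logarithm.)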
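
A small sketch of that S-shaped function and how it turns into a yes/no prediction. The weight `w`, bias `b`, and the 0.5 threshold are hypothetical values picked for illustration, not the output of any real training run.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b, threshold=0.5):
    """Return (probability of class 1, predicted class) for one feature x."""
    probability = sigmoid(w * x + b)
    label = 1 if probability >= threshold else 0
    return probability, label


# Hypothetical parameters: the curve crosses 0.5 at x = 5.
w, b = 1.0, -5.0
low = predict(0.0, w, b)     # far left of the S: probability near 0
middle = predict(5.0, w, b)  # midpoint of the S: probability 0.5
high = predict(10.0, w, b)   # far right of the S: probability near 1
print(low, middle, high)
```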

16
Q

How is binary classification evaluated?

A

Using a matrix that shows how many predictions landed in the correct and incorrect positions: true negatives (TN), false positives (FP), false negatives (FN), and true positives (TP). This is called a confusion matrix.

From the confusion matrix we can calculate:
Accuracy: (TN+TP)/(TN+FN+FP+TP)
Recall: TP/(TP+FN)
Precision: TP/(TP+FP)
F1-Score: (2 × Precision × Recall)/(Precision + Recall)
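
The same calculations, sketched in Python on made-up labels. The function name `confusion_counts` and the two label lists are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count TN, FP, FN, TP for binary labels encoded as 0 and 1."""
    pairs = list(zip(actual, predicted))
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    return tn, fp, fn, tp


# Hypothetical validation labels and model predictions.
actual = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

tn, fp, fn, tp = confusion_counts(actual, predicted)
accuracy = (tn + tp) / (tn + fn + fp + tp)
recall = tp / (tp + fn)
precision = tp / (tp + fp)
f1 = 2 * precision * recall / (precision + recall)
print(tn, fp, fn, tp, accuracy, recall, precision, round(f1, 3))
```

With these labels there are 4 true negatives, 2 false positives, 1 false negative, and 3 true positives, giving an accuracy of 0.7, recall of 0.75, and precision of 0.6.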

17
Q

Is the process different for multiclass classification?

A

No, it follows the same iterative train, validate, and evaluate process as binary classification and regression.

18
Q

How do multiclass algorithms work?

A

They calculate a probability for each of the possible class labels.

19
Q

What types of multiclass algorithms can you use?

A

one-vs-rest (OvR) algorithm
Multinomial algorithm

20
Q

How does the OvR algorithm work?

A

We train a binary classification function for each class, which calculates the probability that an observation belongs to that class versus any other class. Each function produces a sigmoid-shaped output with values between 0 and 1. A model using this algorithm selects the class whose function produces the highest probability.
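
A sketch of the selection step, assuming the three binary functions have already been trained. The class names, weights, and biases below are hand-picked hypothetical values, not learned ones.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# One hypothetical binary classifier per class: (weights, bias).
classifiers = {
    "A": ([1.0, 0.0], 0.0),
    "B": ([0.0, 1.0], 0.0),
    "C": ([-1.0, -1.0], 0.0),
}


def ovr_predict(features):
    """Score the features with every class-vs-rest function, pick the highest."""
    probabilities = {}
    for label, (weights, bias) in classifiers.items():
        z = sum(w * x for w, x in zip(weights, features)) + bias
        probabilities[label] = sigmoid(z)
    best = max(probabilities, key=probabilities.get)
    return best, probabilities


first_class, first_probs = ovr_predict([2.0, 0.5])
second_class, second_probs = ovr_predict([-1.0, -2.0])
print(first_class, second_class)
```

Note that the per-class probabilities come from independent functions, so they need not add up to 1.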

21
Q

How does multinomial algorithm work?

A

It creates a single function that returns a multi-valued output. The output is a vector (an array of numbers) of class probabilities, e.g. [0.2, 0.3, 0.5], that add up to 1. An example of such a function is softmax.

Evaluation also uses a confusion matrix, for example a 3x3 matrix for 3 classes.
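
A minimal softmax sketch showing how raw scores become a probability vector that sums to 1. The input scores are made up, and subtracting the largest score first is a common numerical-stability step rather than part of the definition.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    largest = max(scores)  # shift scores so exp() cannot overflow
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for three classes.
probabilities = softmax([1.0, 2.0, 3.0])
predicted_class = probabilities.index(max(probabilities))
print([round(p, 3) for p in probabilities], predicted_class)
```

The predicted class is the index of the largest probability, here the third class (index 2).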

22
Q

Deep Learning

A

An advanced form of ML that tries to emulate the way the human brain learns. It does so by creating an artificial neural network.
