Machine Learning Interview Prep Flashcards

1
Q

What is the difference between gradient boosting and random forest? What are the advantages and disadvantages of each when compared to the other?

A

Random forest is less prone to overfitting than gradient boosting, and it trains faster because its trees are built in parallel, independently of one another.

Gradient boosting can be more accurate than a random forest because each new tree is trained to correct the errors of the trees before it, which also lets it capture complex patterns in the data.
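To make the residual-fitting idea concrete, here is a toy sketch in plain NumPy (my own illustration, not a library implementation): each regression stump is fit to the current residuals of the ensemble, then added in with a shrinkage factor.

```python
import numpy as np

def fit_stump(x, r):
    """Least-squares regression stump: find the threshold split of x that
    best fits the residuals r, predicting the residual mean in each leaf."""
    best = None
    for t in np.unique(x):
        left, right = r[x <= t], r[x > t]
        if len(right) == 0:
            continue
        lm, rm = left.mean(), right.mean()
        sse = ((left - lm) ** 2).sum() + ((right - rm) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda z, t=t, lm=lm, rm=rm: np.where(z <= t, lm, rm)

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 200)

pred, lr = np.full_like(y, y.mean()), 0.5
for _ in range(50):
    stump = fit_stump(x, y - pred)   # each stump targets the current error
    pred += lr * stump(x)            # shrink its contribution and add it in

print("train MSE:", round(((y - pred) ** 2).mean(), 4))
```

In practice you would use a library implementation (e.g. scikit-learn's GradientBoostingRegressor) with deeper trees rather than stumps; the sequential structure above is why boosting cannot be parallelized across trees the way a random forest can.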

2
Q

Briefly explain K-means clustering. How can we find the best value of K?

A

In K-means, each piece of data is represented as a vector, so every data point sits somewhere in the feature space. K-means assigns each point to a cluster based on its Euclidean distance to the cluster centroids.

A method used to find the best value of K is the elbow method: plot the within-cluster variation against K, and pick the point past which increasing K leads to very little additional reduction.
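A toy sketch of both the assignment step and the elbow method (illustrative NumPy only; the deterministic farthest-point initialization is my own choice — real libraries typically use k-means++):

```python
import numpy as np

def kmeans(X, k, iters=100):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(k - 1):  # seed each new center at the point farthest from the rest
        dist = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[dist.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        # assign every point to its nearest center (Euclidean distance)
        labels = np.linalg.norm(X[:, None] - centers[None, :], axis=2).argmin(axis=1)
        # move each center to the mean of its assigned points
        new = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return ((X - centers[labels]) ** 2).sum()   # within-cluster sum of squares

rng = np.random.default_rng(1)
# three well-separated blobs, so the "right" K is 3
X = np.vstack([rng.normal(c, 0.3, (50, 2)) for c in ([0, 0], [5, 5], [0, 5])])

inertias = [kmeans(X, k) for k in range(1, 7)]
for k, inertia in zip(range(1, 7), inertias):
    print(k, round(inertia, 1))   # the drop flattens sharply after K=3: the elbow
```

Plotting these inertia values against K makes the bend at K=3 visible, which is where the "elbow" name comes from.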

3
Q

What is dimensionality reduction? Can you discuss one method?

A

Dimensionality reduction is the process of reducing the number of features in your data set in order to lessen the computational load.

One method is to look for highly correlated features and represent them with fewer features; principal component analysis (PCA) does exactly this by projecting the data onto the directions of greatest variance.
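As an illustrative sketch (assuming NumPy; this computes PCA via the SVD, which is one standard route): two strongly correlated features are compressed into a single component with almost no loss of variance.

```python
import numpy as np

rng = np.random.default_rng(0)
# two highly correlated features: the data is effectively one-dimensional
t = rng.normal(0, 1, 300)
X = np.column_stack([t, 3 * t + rng.normal(0, 0.1, 300)])

Xc = X - X.mean(axis=0)                  # center each feature
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S ** 2 / (S ** 2).sum()      # fraction of variance per component
Z = Xc @ Vt[0]                           # 1-D representation of the 2-D data

print("variance explained by the 1st component:", round(explained[0], 4))
```

Keeping only the components that explain most of the variance is how PCA trades a small amount of information for a much smaller feature set.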

4
Q

What are L1 and L2 regularizations?

A

L1 is known as lasso; it adds the sum of the absolute values of the weights to the loss.

L2 is known as ridge; it adds the sum of the squared weights to the loss.
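The two penalty terms written out directly (toy weight values, purely illustrative):

```python
import numpy as np

w = np.array([0.5, -2.0, 0.0, 3.0])   # model weights

l1_penalty = np.abs(w).sum()          # lasso term: sum of absolute values
l2_penalty = (w ** 2).sum()           # ridge term: sum of squares

print(l1_penalty)   # 5.5
print(l2_penalty)   # 13.25
```

Either penalty is multiplied by a strength hyperparameter and added to the training loss.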

5
Q

What are the differences between L1 and L2 regularization?

A

L1 (lasso) shrinks the weights of irrelevant features all the way down to zero, effectively removing them from the model.

L2 (ridge) penalizes large weights, shrinking every weight toward zero without eliminating any of them, which prevents a single feature from over-impacting the model.
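A toy sketch of that contrast (illustrative NumPy, not a library implementation; the soft-threshold/proximal step is one standard way to optimize an L1 penalty): a linear model with one relevant and one irrelevant feature is fit under each penalty.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(0, 1, n)                     # relevant feature
x2 = rng.normal(0, 1, n)                     # irrelevant feature
X = np.column_stack([x1, x2])
y = 2.0 * x1 + rng.normal(0, 0.5, n)         # y depends on x1 only

def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrinks, and snaps small values to 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

lam, lr = 0.1, 0.1
w_l1 = np.zeros(2)
w_l2 = np.zeros(2)
for _ in range(1000):
    g1 = X.T @ (X @ w_l1 - y) / n            # squared-error gradient
    g2 = X.T @ (X @ w_l2 - y) / n
    w_l1 = soft_threshold(w_l1 - lr * g1, lr * lam)   # L1: proximal step
    w_l2 = w_l2 - lr * (g2 + 2 * lam * w_l2)          # L2: weight-decay step

print("lasso weights:", w_l1)   # irrelevant weight snapped to zero
print("ridge weights:", w_l2)   # both weights shrunk, neither exactly zero
```

This is why lasso doubles as a feature-selection tool, while ridge keeps every feature but limits its influence.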

6
Q

What is the difference between overfitting and underfitting, and how can you avoid them?

A

Underfitting is when your model fails to capture the underlying structure of the data and in turn makes inaccurate predictions, even on the training set.

Overfitting is when your model fits the training data too closely, noise included, and so performs poorly on data it has not seen.
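A toy illustration with polynomial fits (assuming NumPy; the degrees and noise level are my own choices): a degree-1 fit underfits a sine curve, while raising the degree keeps lowering training error even as the fit starts chasing noise.

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 15)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, 15)   # noisy samples
x_test = np.linspace(0.03, 0.97, 50)
y_test = np.sin(2 * np.pi * x_test)                              # the true curve

def errors(deg):
    """Train and test MSE of a degree-`deg` polynomial fit."""
    coefs = np.polyfit(x_train, y_train, deg)
    train = ((np.polyval(coefs, x_train) - y_train) ** 2).mean()
    test = ((np.polyval(coefs, x_test) - y_test) ** 2).mean()
    return train, test

for deg in (1, 3, 9):
    tr, te = errors(deg)
    print(f"degree {deg}: train MSE {tr:.3f}, test MSE {te:.3f}")
```

Training error always falls as the degree grows, but past the sweet spot the extra capacity typically goes into fitting the noise, so test error stops improving — that gap between train and test error is the signature of overfitting.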

7
Q

How can you avoid overfitting and underfitting?

A

You can avoid underfitting by giving the model better features to work with; you can also reduce the regularization parameters so the model is free to capture the intricacies of the data.

You can avoid overfitting by doing the opposite: increase regularization, train on more data, use a simpler model, or stop training early once validation error stops improving.
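One standard safeguard, sketched with a simple holdout split (illustrative NumPy; a real workflow would more likely use k-fold cross-validation): pick the model capacity that minimizes error on data the model was not trained on.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 60)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 60)

# hold out part of the data; capacity is chosen on the held-out split
x_tr, y_tr = x[:40], y[:40]
x_val, y_val = x[40:], y[40:]

val_err = {}
for deg in range(1, 9):
    coefs = np.polyfit(x_tr, y_tr, deg)
    val_err[deg] = ((np.polyval(coefs, x_val) - y_val) ** 2).mean()

best = min(val_err, key=val_err.get)   # lowest validation error wins
print("chosen degree:", best)
```

Validation error is high for degrees that underfit and for degrees that overfit, so minimizing it steers the model between the two failure modes.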
