Machine Learning Interview Prep Flashcards

1
Q

What is the difference between gradient boosting and random forest? What are the advantages and disadvantages of each when compared to the other?

A

Random forest is less prone to overfitting than gradient boosting, and it trains faster because its trees are built in parallel, independently of one another.

Gradient boosting can be more accurate than a random forest because each new tree is trained to correct the errors of the trees before it, which also lets it capture complex patterns in the data.
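To make the residual-fitting idea concrete, here is a toy sketch in plain NumPy (my own illustration, not a library implementation): each regression stump is fit to the current residuals of the ensemble, then added in with a shrinkage factor.

```python
import numpy as np

def fit_stump(x, r):
    """Least-squares regression stump: find the threshold split of x that
    best fits the residuals r, predicting the residual mean in each leaf."""
    best = None
    for t in np.unique(x):
        left, right = r[x <= t], r[x > t]
        if len(right) == 0:
            continue
        lm, rm = left.mean(), right.mean()
        sse = ((left - lm) ** 2).sum() + ((right - rm) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda z, t=t, lm=lm, rm=rm: np.where(z <= t, lm, rm)

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 200)

pred, lr = np.full_like(y, y.mean()), 0.5
for _ in range(50):
    stump = fit_stump(x, y - pred)   # each stump targets the current error
    pred += lr * stump(x)            # shrink its contribution and add it in

print("train MSE:", round(((y - pred) ** 2).mean(), 4))
```

In practice you would use a library implementation (e.g. scikit-learn's GradientBoostingRegressor) with deeper trees rather than stumps; the sequential structure above is why boosting cannot be parallelized across trees the way a random forest can.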

2
Q

Briefly explain K-means clustering. How can we find the best value of K?

A

In K-means, each piece of data is represented as a vector, so every data point sits somewhere in the feature space. K-means assigns each point to a cluster based on its Euclidean distance to the cluster centroids.

A method used to find the best value of K is the elbow method: plot the within-cluster variation against K, and pick the point past which increasing K leads to very little additional reduction.
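A toy sketch of both the assignment step and the elbow method (illustrative NumPy only; the deterministic farthest-point initialization is my own choice — real libraries typically use k-means++):

```python
import numpy as np

def kmeans(X, k, iters=100):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(k - 1):  # seed each new center at the point farthest from the rest
        dist = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[dist.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        # assign every point to its nearest center (Euclidean distance)
        labels = np.linalg.norm(X[:, None] - centers[None, :], axis=2).argmin(axis=1)
        # move each center to the mean of its assigned points
        new = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return ((X - centers[labels]) ** 2).sum()   # within-cluster sum of squares

rng = np.random.default_rng(1)
# three well-separated blobs, so the "right" K is 3
X = np.vstack([rng.normal(c, 0.3, (50, 2)) for c in ([0, 0], [5, 5], [0, 5])])

inertias = [kmeans(X, k) for k in range(1, 7)]
for k, inertia in zip(range(1, 7), inertias):
    print(k, round(inertia, 1))   # the drop flattens sharply after K=3: the elbow
```

Plotting these inertia values against K makes the bend at K=3 visible, which is where the "elbow" name comes from.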

3
Q

What is dimensionality reduction? Can you discuss one method?

A

Dimensionality reduction is the process of reducing the number of features in your data set in order to lessen the computational load.

One method is to look for highly correlated features and represent them with fewer features; principal component analysis (PCA) does exactly this by projecting the data onto the directions of greatest variance.
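As an illustrative sketch (assuming NumPy; this computes PCA via the SVD, which is one standard route): two strongly correlated features are compressed into a single component with almost no loss of variance.

```python
import numpy as np

rng = np.random.default_rng(0)
# two highly correlated features: the data is effectively one-dimensional
t = rng.normal(0, 1, 300)
X = np.column_stack([t, 3 * t + rng.normal(0, 0.1, 300)])

Xc = X - X.mean(axis=0)                  # center each feature
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S ** 2 / (S ** 2).sum()      # fraction of variance per component
Z = Xc @ Vt[0]                           # 1-D representation of the 2-D data

print("variance explained by the 1st component:", round(explained[0], 4))
```

Keeping only the components that explain most of the variance is how PCA trades a small amount of information for a much smaller feature set.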

4
Q

What are L1 and L2 regularizations?

A

L1 is known as lasso; it adds the sum of the absolute values of the weights to the loss.

L2 is known as ridge; it adds the sum of the squared weights to the loss.
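The two penalty terms written out directly (toy weight values, purely illustrative):

```python
import numpy as np

w = np.array([0.5, -2.0, 0.0, 3.0])   # model weights

l1_penalty = np.abs(w).sum()          # lasso term: sum of absolute values
l2_penalty = (w ** 2).sum()           # ridge term: sum of squares

print(l1_penalty)   # 5.5
print(l2_penalty)   # 13.25
```

Either penalty is multiplied by a strength hyperparameter and added to the training loss.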

5
Q

What are the differences between L1 and L2 regularization?

A

L1 (lasso) shrinks the weights of irrelevant features all the way down to zero, effectively removing them from the model.

L2 (ridge) penalizes large weights, shrinking every weight toward zero without eliminating any of them, which prevents a single feature from over-impacting the model.
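A toy sketch of that contrast (illustrative NumPy, not a library implementation; the soft-threshold/proximal step is one standard way to optimize an L1 penalty): a linear model with one relevant and one irrelevant feature is fit under each penalty.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(0, 1, n)                     # relevant feature
x2 = rng.normal(0, 1, n)                     # irrelevant feature
X = np.column_stack([x1, x2])
y = 2.0 * x1 + rng.normal(0, 0.5, n)         # y depends on x1 only

def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrinks, and snaps small values to 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

lam, lr = 0.1, 0.1
w_l1 = np.zeros(2)
w_l2 = np.zeros(2)
for _ in range(1000):
    g1 = X.T @ (X @ w_l1 - y) / n            # squared-error gradient
    g2 = X.T @ (X @ w_l2 - y) / n
    w_l1 = soft_threshold(w_l1 - lr * g1, lr * lam)   # L1: proximal step
    w_l2 = w_l2 - lr * (g2 + 2 * lam * w_l2)          # L2: weight-decay step

print("lasso weights:", w_l1)   # irrelevant weight snapped to zero
print("ridge weights:", w_l2)   # both weights shrunk, neither exactly zero
```

This is why lasso doubles as a feature-selection tool, while ridge keeps every feature but limits its influence.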

6
Q

What is the difference between overfitting and underfitting, and how can you avoid them?

A

Underfitting is when your model fails to capture the underlying structure of the data and in turn makes inaccurate predictions, even on the training set.

Overfitting is when your model fits the training data too closely, noise included, and so performs poorly on data it has not seen.
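A toy illustration with polynomial fits (assuming NumPy; the degrees and noise level are my own choices): a degree-1 fit underfits a sine curve, while raising the degree keeps lowering training error even as the fit starts chasing noise.

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 15)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, 15)   # noisy samples
x_test = np.linspace(0.03, 0.97, 50)
y_test = np.sin(2 * np.pi * x_test)                              # the true curve

def errors(deg):
    """Train and test MSE of a degree-`deg` polynomial fit."""
    coefs = np.polyfit(x_train, y_train, deg)
    train = ((np.polyval(coefs, x_train) - y_train) ** 2).mean()
    test = ((np.polyval(coefs, x_test) - y_test) ** 2).mean()
    return train, test

for deg in (1, 3, 9):
    tr, te = errors(deg)
    print(f"degree {deg}: train MSE {tr:.3f}, test MSE {te:.3f}")
```

Training error always falls as the degree grows, but past the sweet spot the extra capacity typically goes into fitting the noise, so test error stops improving — that gap between train and test error is the signature of overfitting.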

7
Q

How can you avoid overfitting and underfitting?

A

You can avoid underfitting by giving the model better features to work with; you can also reduce the regularization parameters so the model is free to capture the intricacies of the data.

You can avoid overfitting by doing the opposite: increase regularization, train on more data, use a simpler model, or stop training early once validation error stops improving.
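One standard safeguard, sketched with a simple holdout split (illustrative NumPy; a real workflow would more likely use k-fold cross-validation): pick the model capacity that minimizes error on data the model was not trained on.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 60)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 60)

# hold out part of the data; capacity is chosen on the held-out split
x_tr, y_tr = x[:40], y[:40]
x_val, y_val = x[40:], y[40:]

val_err = {}
for deg in range(1, 9):
    coefs = np.polyfit(x_tr, y_tr, deg)
    val_err[deg] = ((np.polyval(coefs, x_val) - y_val) ** 2).mean()

best = min(val_err, key=val_err.get)   # lowest validation error wins
print("chosen degree:", best)
```

Validation error is high for degrees that underfit and for degrees that overfit, so minimizing it steers the model between the two failure modes.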
