ML Flashcards

1
Q

Reinforcement learning is what type of learning?

A

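Often described as a combination of unsupervised and supervised learning; more precisely, it is a separate paradigm in which an agent learns from reward signals by interacting with an environment.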

2
Q

Which of the following is true about supervised learning: is the expected outcome defined or not?

A

The expected outcome is defined (the training data is labelled).

3
Q

Which type of supervised learning problem is categorized as a regression problem?

A

Predicting the cost of a car from given parameters (the target is a continuous value).

4
Q

Which of the following types of machine learning algorithms forms a significant part of human learning?

A

Unsupervised Learning

5
Q

Clustering algorithms fall under which of the following categories of machine learning models?

A

Unsupervised Learning

6
Q

Which of the following is the equation for linear regression?

A

y = β0 + β1x1
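
A minimal sketch of fitting this equation, assuming toy data that was generated from y = 2 + 3*x1 (the data and variable names are illustrative, not from the deck):

```python
import numpy as np

# Hypothetical toy data generated from y = 2 + 3*x1 (no noise)
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x1

# np.polyfit with degree 1 returns [slope, intercept], i.e. [β1, β0]
beta1, beta0 = np.polyfit(x1, y, 1)
print(f"beta0 = {beta0:.2f}, beta1 = {beta1:.2f}")
```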

7
Q

Which of the below-mentioned machine learning algorithms is/are used to predict continuously valued quantities?

A

Linear Regression

8
Q

Name a type of unsupervised machine learning

A

K-means clustering

9
Q

How does K-means differ from other clustering methods?

A

K-means requires the number of clusters (K) to be chosen in advance, unlike methods such as DBSCAN that infer it from the data.
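
A short sketch of the point above, assuming scikit-learn is available and using made-up 2-D points. Note that n_clusters has to be supplied before fitting:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data: two well-separated groups of 2-D points
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])

# n_clusters (the K in K-means) must be chosen before fitting
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print(labels)                   # cluster index assigned to each point
print(kmeans.cluster_centers_)  # one centroid per cluster
```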

10
Q

How do you import SVC?

A

from sklearn.svm import SVC

11
Q

How do you create a classifier model?

A
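from sklearn.svm import SVC

# Instantiate SVC() with its default hyperparameters
svc = SVC()
# Fit on the training data; fit() returns the fitted estimator itself
clf = svc.fit(X_train, y_train)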
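
The snippet above assumes X_train and y_train already exist. A self-contained sketch, using synthetic data from make_classification as a stand-in for real features and labels (an assumption, not part of the original card), might look like this:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data stands in for real features and labels (an assumption)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

svc = SVC()                       # instantiate the classifier
clf = svc.fit(X_train, y_train)   # train it on the training split
y_pred = clf.predict(X_test)      # predict labels for unseen data
print("Test accuracy:", clf.score(X_test, y_test))
```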
12
Q

What is the process to activate live trading?

A
Initialise function
Schedule function
Optional function
Data fetching
Order placement
13
Q

What is a persistent namespace in Blueshift?

A

The context object: a persistent namespace in which you store variables you need to access from one call of the algorithm's functions to the next.

14
Q

What arguments does schedule_function() take?

A
schedule_function(
func = <>,
date_rule = <>,
time_rule = <>
)
15
Q

What is the difference between linear regression and logistic regression?

A

Both are supervised models; however, linear regression is used to solve regression problems, whereas logistic regression is used to solve classification problems.
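
A small sketch of the contrast, assuming scikit-learn and made-up one-feature data: the linear model returns a real number, while the logistic model returns a class label and class probabilities.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.arange(10, dtype=float).reshape(-1, 1)  # hypothetical single feature

# Regression: the target is a continuous value
y_continuous = 1.5 * X.ravel() + 4.0
lin = LinearRegression().fit(X, y_continuous)
print(lin.predict([[12.0]]))        # a real number

# Classification: the target is a class label (0 or 1)
y_class = (X.ravel() >= 5).astype(int)
log = LogisticRegression().fit(X, y_class)
print(log.predict([[12.0]]))        # a class label
print(log.predict_proba([[12.0]]))  # probability of each class
```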

16
Q

What are SVMs?

A

Support vector machines (SVMs) are a set of supervised learning methods used for classification, regression and outlier detection.

17
Q

What is KNN?

A

K-Nearest Neighbors (KNN) is one of the simplest algorithms used in machine learning for regression and classification problems. KNN stores the training data and classifies each new data point based on a similarity measure (e.g. a distance function). Classification is done by a majority vote of the point's K nearest neighbors.
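
A minimal sketch of the majority vote, assuming scikit-learn and six made-up points in two classes. Each query point is labelled by its three nearest neighbors:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training points: class 0 near (1, 1), class 1 near (4, 4)
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [4.0, 4.0], [4.2, 3.9], [3.8, 4.1]])
y = np.array([0, 0, 0, 1, 1, 1])

# Each new point takes the majority label of its 3 nearest neighbors
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)
pred = knn.predict([[1.1, 1.0], [4.1, 4.0]])
print(pred)
```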

18
Q

Which of the following is incorrect about the random forest algorithm?

A

Random forest is a supervised ensemble technique that can be used to solve a regression or a classification problem. A random forest builds multiple decision trees and combines their predictions to get a more accurate and stable overall prediction.

19
Q

What is the purpose of performing cross-validation?

A

To estimate how well a model will perform on independent (unseen) data. It is one of the methods for assessing a model and choosing the best parameters in a prediction or machine learning task. The process involves setting aside a sample of the dataset, training the model on the remaining data and, finally, using the held-out sample to test whether the model performs well.

20
Q

What does fully connected mean in neural networks?

A

Every neuron in one layer is connected to every neuron in the next layer. With P inputs and H hidden-layer neurons, a feed-forward network has P*H connections between those two layers.
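
The P*H count can be checked with a weight matrix. This sketch assumes P is the number of inputs and H the number of hidden-layer neurons (the card does not define them), with arbitrary example sizes:

```python
import numpy as np

P = 4  # number of inputs (assumed meaning of P)
H = 3  # number of hidden-layer neurons (assumed meaning of H)

rng = np.random.default_rng(0)
W = rng.normal(size=(P, H))  # one weight per input-to-hidden connection
x = rng.normal(size=P)       # a single input vector

# Fully connected: every hidden neuron receives every input
hidden_pre_activation = x @ W
print("connections:", W.size)
```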

21
Q

What does the activation function do?

A

The activation function introduces non-linearity into the neural network, which lets it model non-linear relationships.

22
Q

Name two activation functions.

A

Sigmoid and tanh. Sigmoid outputs values between 0 and 1; tanh outputs values between -1 and 1.

23
Q

What is the difference between high bias and high variance?

A

High bias describes a model that has not learned the training data well (underfitting).
High variance describes a model that has over-learned the training data and does badly on test data (overfitting).
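
One way to see the two failure modes is to fit polynomials of different degree to noisy data. This is only a sketch on made-up sine data: a low degree tends to underfit (high bias), while a high degree fits the training points closely and tends to do worse on held-out points (high variance).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noisy samples of a sine curve
x_train = np.linspace(0.0, 1.0, 12)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=x_train.size)
x_test = np.linspace(0.02, 0.98, 50)
y_test = np.sin(2 * np.pi * x_test) + rng.normal(scale=0.2, size=x_test.size)

def mse(degree):
    """Return (train MSE, test MSE) for a polynomial fit of the given degree."""
    coefs = np.polyfit(x_train, y_train, degree)
    train = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    return train, test

for degree in (1, 3, 9):
    train_err, test_err = mse(degree)
    print(f"degree {degree}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```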

24
Q

What are CNNs?

A

Convolutional neural networks: networks that use convolutional layers to learn useful patterns from the input data, which can help find the right indicator.

25
Q

If your model requires a labelled target dataset, what is it called?

A

A supervised model. If the labelled targets are categories, it is more specifically a classification model.

26
Q

Which of the following is not a supervised algorithm?

A

K-means. It is a clustering technique and an unsupervised learning method. K-means uses the distance from centroids to cluster the data.

27
Q

A Decision Tree divides the input data into

A

A decision tree uses greater-than and less-than comparisons on individual features to split the data into different parts. When plotted, these splits appear at right angles to each other.

28
Q

K-Fold cross-validation involves

A

Splitting the data into K parts (folds). K is always an integer. Each fold is used once as the test set while the model is trained on the remaining K-1 folds.
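
A brief sketch of the split, assuming scikit-learn and ten made-up samples. With K = 5, each sample lands in the test fold exactly once:

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)  # 10 hypothetical samples, 2 features each

# K = 5: the data is split into 5 parts, each used once as the test fold
kf = KFold(n_splits=5, shuffle=True, random_state=0)
folds = list(kf.split(X))
for i, (train_idx, test_idx) in enumerate(folds):
    print(f"fold {i}: train={train_idx}, test={test_idx}")
```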

29
Q

Which of the following metrics is not used in measuring the performance of a classification model?

A

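Accuracy, Precision and Specificity are metrics used when solving a classification problem; error measures for continuous targets, such as mean squared error, belong to regression problems instead.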

30
Q

Which of the following is not a wrapper method?

A

Principal Component Analysis (PCA). It is an unsupervised dimensionality-reduction method, not a wrapper method. Recursive Feature Elimination (RFE), Forward Selection and Backward Elimination are wrapper methods for feature selection.

31
Q

Logistic Regression is a

A

A classification algorithm. Although the name suggests regression, it is used to classify the data into different labels.

32
Q

Which of the following is not a technique used in training a neuron?

A

A dropout layer randomly switches off neurons during training, so their weights are not updated on that step.

33
Q

Which of the following is an activation function used in training a neural network?

A

Tanh
ReLU
Sigmoid
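
For reference, the three functions can be written in a few lines of NumPy. This is a plain sketch of the standard definitions, with an arbitrary sample input:

```python
import numpy as np

def sigmoid(z):
    """Squashes any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Squashes any real input into the open interval (-1, 1)."""
    return np.tanh(z)

def relu(z):
    """Passes positive inputs through and sets negative inputs to 0."""
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))
print(tanh(z))
print(relu(z))
```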

34
Q

Which of the following is not a part of a neuron?

A

Variance. A neuron consists of kernel weights, an activation function and a bias term.

35
Q

Back-propagation is used in supervised learning?

A

Yes. Backpropagation is used to improve the prediction quality by adjusting the weights based on the prediction error, so it is used when a target dataset is available.

36
Q

Gradient Descent is used to

A

Gradient descent is used to reduce the loss or error in prediction.
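
A minimal sketch of the idea, assuming a mean-squared-error loss on toy data generated from y = 2 + 3*x. The learning rate and iteration count are illustrative choices, not recommendations:

```python
import numpy as np

# Hypothetical toy data generated from y = 2 + 3*x
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x

b0, b1 = 0.0, 0.0      # start from arbitrary parameter values
learning_rate = 0.05   # illustrative step size
losses = []

for _ in range(2000):
    error = (b0 + b1 * x) - y
    losses.append(np.mean(error ** 2))  # mean squared error
    # Step each parameter against the gradient of the loss
    b0 -= learning_rate * 2.0 * np.mean(error)
    b1 -= learning_rate * 2.0 * np.mean(error * x)

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, final loss = {losses[-1]:.6f}")
```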

37
Q

An Artificial Neural Network does not contain

A

An ANN is not built with decision trees, but it can contain multiple neurons.

38
Q

The Bias of a Neuron is a

A

A constant term. It is added to the weighted sum of inputs (the kernel output) at every layer, and this is known as the bias.

39
Q

Can Neural Networks solve nonlinear problems?

A

Yes

40
Q

Neural Networks can be used in

A

Supervised Learning
Unsupervised Learning
Reinforcement Learning

41
Q

A decision tree has

A

A decision tree ends with leaves and contains branches and a root.

42
Q

Precision is defined as

A

The fraction of predicted positive outcomes that were actually positive: TP / (TP + FP).
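
A worked example on made-up labels, computing precision by hand from true positives (TP) and false positives (FP) and, assuming scikit-learn is installed, with precision_score:

```python
from sklearn.metrics import precision_score

# Hypothetical true labels and model predictions
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

# True positives: predicted 1 and actually 1
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
# False positives: predicted 1 but actually 0
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

manual_precision = tp / (tp + fp)
print(manual_precision, precision_score(y_true, y_pred))
```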

43
Q

The Predictor and Category are independent

A

A T-test is used to check the relation between the dependent and independent variables.

44
Q

Which of the following is used in hyperparameter tuning

A

Grid search and random search. Sklearn provides both (GridSearchCV and RandomizedSearchCV), and each evaluates candidate hyperparameters using cross-validation.
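
A short sketch of both search methods, assuming an SVC model on synthetic data. The parameter grid and the 3-fold setting are illustrative choices only:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

# Synthetic data stands in for a real dataset (an assumption)
X, y = make_classification(n_samples=120, n_features=4, random_state=0)
param_grid = {"C": [0.1, 1.0, 10.0], "kernel": ["linear", "rbf"]}

# Grid search tries every combination (3 x 2 = 6 candidates)
grid = GridSearchCV(SVC(), param_grid, cv=3).fit(X, y)

# Random search samples a fixed number of combinations (here 3)
rand = RandomizedSearchCV(
    SVC(), param_grid, n_iter=3, cv=3, random_state=0
).fit(X, y)

print(grid.best_params_, rand.best_params_)
```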

45
Q

After solving the Maximum Likelihood Estimate, we get

A

The coefficients, along with their z-scores and p-values

46
Q

The Decision Tree Classifier model in sklearn gives the following outputs

A

Classes
Feature Importance
Probability
Accuracy is computed separately, for example with the accuracy_score function.
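
A sketch showing where each of these outputs lives on a fitted tree, using the iris dataset bundled with scikit-learn as a stand-in for real data:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

print(tree.classes_)                   # class labels seen during fitting
print(tree.feature_importances_)       # importance of each feature
print(tree.predict_proba(X_test[:2]))  # class probabilities per sample

# Accuracy is computed separately from the predictions
acc = accuracy_score(y_test, tree.predict(X_test))
print(acc)
```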

47
Q

In Quadratic Discriminant Analysis

A

While performing QDA, a separate covariance matrix is estimated for each of the classes in the target dataset.

48
Q

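Which of the following is true about the Ridge regression equation?

A

Ridge uses an L2 penalty: it minimizes the sum of squared errors plus a penalty on the sum of the squares of the coefficients. An L1 penalty (as in Lasso) uses the sum of the absolute values of the coefficients instead.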
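
The effect of the L2 penalty can be seen by comparing coefficient sizes. This sketch assumes scikit-learn, made-up data and an arbitrary alpha = 10; the Ridge coefficients come out smaller in norm than the unpenalized ones:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)

# Hypothetical data: 3 features with known coefficients plus a little noise
X = rng.normal(size=(50, 3))
y = X @ np.array([3.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)

ols = LinearRegression().fit(X, y)  # no penalty
ridge = Ridge(alpha=10.0).fit(X, y) # L2 penalty of strength alpha

print("OLS coefficients:  ", ols.coef_)
print("Ridge coefficients:", ridge.coef_)
```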

49
Q

Regularisation is used to address which problem

A

Regularization is very effective in solving a model overfitting problem. It penalizes the coefficients to solve this problem.

50
Q

To improve prediction an ensemble model uses

A

An Ensemble typically contains multiple models whose predictions are weighted.

51
Q

Which of the following is not an ensemble technique

A

Taking samples from the population with replacement is called bootstrapping.

52
Q

Important parameter/s of Boosting

A

Number of trees
Learning Rate
Depth of the tree