Machine Learning Flashcards

1
Q

Linear Regression

A

Linear regression models the relationship between independent variables (the x values) and a numeric outcome (the y values) by fitting the equation of a line to the data. The fitted line can then be used to predict the outcome for new x values.
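
As a rough illustration of fitting a line to data, here is a minimal sketch of ordinary least squares for a single feature. The function name `fit_line` and the data points are made up for this example; least squares is one common way to fit the line, not the only one.

```python
def fit_line(xs, ys):
    """Least-squares fit for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that lies exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
# Use the fitted line to predict y for an unseen x
prediction = slope * 5.0 + intercept
print(slope, intercept, prediction)
```

With real, noisy data the points will not sit exactly on the line, and the fit is the line that minimizes the squared vertical distances to them.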

2
Q

Logistic Regression

A

Logistic regression passes a linear combination of the inputs through the logistic function, which maps the intermediate outcome values into an outcome variable Y with values ranging from 0 to 1. These values can then be interpreted as the probability of occurrence of Y. The properties of the S-shaped logistic function make logistic regression well suited to classification tasks.
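
The mapping into the 0 to 1 range can be sketched as follows. The weight, bias, and the 0.5 decision threshold are assumptions chosen for illustration; in practice the weight and bias are learned from data.

```python
import math


def sigmoid(z):
    """S-shaped logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weight, bias):
    """Probability of the positive class for a single feature x."""
    return sigmoid(weight * x + bias)


# Hypothetical parameters, not learned from any data
weight, bias = 1.5, 0.0

p_low = predict_proba(-4.0, weight, bias)   # close to 0
p_mid = predict_proba(0.0, weight, bias)    # exactly on the boundary
p_high = predict_proba(4.0, weight, bias)   # close to 1

# Turn a probability into a class label with an assumed 0.5 threshold
label = 1 if p_high >= 0.5 else 0
print(p_low, p_mid, p_high, label)
```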

3
Q

Decision Trees

A

Decision Trees can be used for both regression and classification tasks.

In this algorithm, the model learns to predict the value of the target variable by learning decision rules organized as a tree. A tree is made up of nodes, each of which tests an attribute of the data.

At each node we ask a question about the data based on the available features. The left and right branches represent the possible answers. The final nodes, leaf nodes, correspond to a predicted value.
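
To make the question-and-branch structure concrete, here is a small hand-written tree and a function that walks it. The features, thresholds, and labels are invented for this sketch; a real decision tree algorithm would learn them from training data.

```python
# Each internal node asks "is sample[feature] <= threshold?".
# The left branch is the "yes" answer, the right branch the "no" answer.
# Leaf nodes carry the predicted value.
tree = {
    "feature": "temperature",
    "threshold": 20.0,
    "left": {"leaf": "stay in"},
    "right": {
        "feature": "humidity",
        "threshold": 70.0,
        "left": {"leaf": "go out"},
        "right": {"leaf": "stay in"},
    },
}


def predict(node, sample):
    """Follow the branches from the root until a leaf is reached."""
    while "leaf" not in node:
        if sample[node["feature"]] <= node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["leaf"]


print(predict(tree, {"temperature": 25.0, "humidity": 50.0}))
```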

4
Q

Naive Bayes

A

Naive Bayes is based on Bayes' theorem. It estimates the probability of each class and the conditional probability of each class given the values of x, treating the features as independent of one another (the "naive" assumption). It is used for classification problems, often to reach a binary yes/no outcome. Bayes' theorem: P(class | x) = P(x | class) * P(class) / P(x).

Naive Bayes classifiers are a popular statistical technique for filtering spam emails.
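
A spam filter in this style can be sketched in a few lines. The training messages are made up, and the add-one smoothing and the use of log probabilities are assumptions added here to keep the arithmetic well behaved; they are not taken from the card.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (words, label). Collects the counts needed later."""
    class_counts = Counter()
    word_counts = {}
    vocabulary = set()
    for words, label in messages:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def predict(words, class_counts, word_counts, vocabulary):
    """Pick the class with the highest log P(class) + sum of log P(word | class)."""
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        n_words = sum(word_counts[label].values())
        for w in words:
            # Add-one smoothing (an assumption) avoids log(0) for unseen words
            score += math.log((word_counts[label][w] + 1) / (n_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Tiny made-up training set
training = [
    (["win", "money", "now"], "spam"),
    (["free", "money"], "spam"),
    (["meeting", "at", "noon"], "ham"),
    (["lunch", "at", "noon"], "ham"),
]
model = train(training)
print(predict(["free", "money"], *model))
```

The denominator P(x) is the same for every class, so it can be dropped when only the most likely class is needed.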

5
Q

Support Vector Machines (SVM)

A

SVM is a supervised algorithm used for classification problems. It tries to separate the classes with a boundary that leaves the largest possible margin between them. To do this, we plot data items as points in n-dimensional space, where n is the number of input features. Based on this, SVM finds an optimal boundary, called a hyperplane, which best separates the possible outputs by their class label.

The distance between the hyperplane and the closest data point of either class is called the margin. The optimal hyperplane is the one with the largest margin, that is, the one that maximizes the distance to the closest data points of both classes.
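
The margin idea can be checked numerically. The sketch below does not train an SVM; it only compares two hand-picked candidate hyperplanes of the form w·x + b = 0 on made-up points and shows which one has the larger margin.

```python
import math


def signed_distance(point, w, b):
    """Signed distance from a point to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, point))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (dot + b) / norm


def margin(points, w, b):
    """Distance from the hyperplane to the closest point."""
    return min(abs(signed_distance(p, w, b)) for p in points)


# Two made-up classes: the first two points vs. the last two
points = [(1.0, 1.0), (2.0, 0.0), (4.0, 4.0), (5.0, 3.0)]

# Both candidates separate the classes, but at different positions
margin_a = margin(points, (1.0, 1.0), -5.0)  # the line x + y = 5
margin_b = margin(points, (1.0, 1.0), -3.0)  # the line x + y = 3

print(margin_a, margin_b)
```

Candidate A sits midway between the two groups, so its closest points are farther away than those of candidate B; a trained SVM searches for the hyperplane that maximizes this quantity.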

6
Q

K-Nearest Neighbors (KNN)

A

The KNN algorithm is very simple. It classifies an object by searching the entire training set for the k most similar instances (the k neighbors) and assigning the object the most common output value among those k instances.

The selection of k is critical: a small value makes the prediction sensitive to noise, while a large value increases computation and can smooth over real differences between classes. KNN is most commonly used for classification, but it is also useful for regression problems.
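
A minimal sketch of KNN classification, assuming squared Euclidean distance as the similarity measure and a simple majority vote among the neighbors. The points and labels are made up.

```python
from collections import Counter


def squared_distance(p, q):
    """Squared Euclidean distance; ordering by it matches ordering by distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def knn_predict(training, query, k):
    """training: list of (features, label). Majority label of the k nearest points."""
    by_distance = sorted(training, key=lambda item: squared_distance(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up training set with two well-separated groups
training = [
    ((1.0, 1.0), "red"),
    ((1.5, 2.0), "red"),
    ((2.0, 1.0), "red"),
    ((6.0, 6.0), "blue"),
    ((7.0, 5.5), "blue"),
    ((6.5, 7.0), "blue"),
]

print(knn_predict(training, (1.2, 1.4), k=3))
```

For regression, the same neighbor search applies, with the majority vote replaced by an average of the neighbors' values.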

7
Q

Random Forest

A

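Random Forest is a very popular ensemble ML algorithm. The underlying idea is that the opinion of many is more accurate than that of an individual. In Random Forest, we use an ensemble of decision trees, each trained on a random sample of the data, and combine their predictions by majority vote for classification or by averaging for regression.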
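
The "opinion of many" idea can be sketched as a vote. The three "trees" below are hand-written stand-in rules with invented features, not trained decision trees, and majority voting is assumed as the way to combine them.

```python
from collections import Counter


# Hypothetical stand-ins for trained trees: each looks at the sample differently
def tree_a(sample):
    return "spam" if sample["links"] > 2 else "ham"


def tree_b(sample):
    return "spam" if sample["exclamations"] > 3 else "ham"


def tree_c(sample):
    return "spam" if sample["links"] > 0 and sample["exclamations"] > 1 else "ham"


def forest_predict(trees, sample):
    """Each tree votes; the most common answer wins."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


forest = [tree_a, tree_b, tree_c]
sample = {"links": 1, "exclamations": 5}

# tree_a alone says "ham", but it is outvoted by the other two
print(tree_a(sample), forest_predict(forest, sample))
```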

8
Q

Artificial neural networks (ANNs)

A

ANNs can handle large, complex ML tasks. A neural network is essentially a set of interconnected layers made up of nodes, called neurons, joined by weighted edges. Between the input and output layers we can insert one or more hidden layers; the number of hidden layers is a design choice rather than a fixed property of the network.
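
A forward pass through such a network can be sketched as below. The layer sizes (2 inputs, 3 hidden neurons, 1 output), the weights, and the choice of a sigmoid activation are all assumptions for illustration; real networks learn their weights during training.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then an activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


# Hypothetical weights: one row per neuron, one entry per incoming edge
hidden_w = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]
hidden_b = [0.0, 0.1, -0.1]
output_w = [[0.7, -0.5, 0.2]]
output_b = [0.05]


def forward(inputs):
    """Input layer -> one hidden layer -> output layer."""
    hidden = dense(inputs, hidden_w, hidden_b)
    return dense(hidden, output_w, output_b)


output = forward([1.0, 2.0])
print(output)
```

Adding another hidden layer amounts to one more `dense` call between the input and the output.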

9
Q

K-Means

A

K-Means is a distance-based unsupervised machine learning algorithm that accomplishes clustering tasks. It partitions the data points into K clusters so that the points within one cluster are similar to each other (homogeneous) and the points in different clusters are dissimilar (heterogeneous).
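
The usual K-Means loop alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. Here is a minimal sketch with made-up points, hand-picked starting centroids, and a fixed iteration count; practical implementations usually choose starting centroids randomly and stop on convergence.

```python
def kmeans(points, centroids, iterations=10):
    """Returns (centroids, clusters) after a fixed number of iterations."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Made-up data with two obvious groups
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```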

10
Q

Recurrent neural networks (RNNs)

A

Recurrent neural networks are a specific type of ANN that processes sequential data. Here, the result of the previous step acts as an input to the current step. This is facilitated by the hidden state, which acts as a memory that maintains information about what was previously calculated in the sequence. Because the same weights are reused at every step, an RNN needs fewer parameters than a network with separate weights for each position in the sequence.
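
The role of the hidden state can be sketched with a scalar toy version of the recurrence. The weights, the tanh activation, and the zero initial state are assumptions for illustration; real RNNs use vectors and learned weight matrices.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """New hidden state from the current input and the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Processes the sequence one step at a time, reusing the same weights."""
    h = 0.0  # initial hidden state: nothing remembered yet
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# The two sequences differ only in their first element
states_a = run_rnn([1.0, 0.0, 0.0])
states_b = run_rnn([0.0, 0.0, 0.0])
print(states_a)
print(states_b)
```

Even though the last two inputs are identical, the final hidden states differ, because the first input is still carried forward in the hidden state.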

11
Q

Large Language Model (LLM)

A

A large language model is a type of neural network. Such a model may have millions of neurons with many hundreds of billions of connections between them, each connection having its own weight. LLMs use a particular neural network architecture called a transformer, which is designed to process and generate sequential data such as text.
