Machine Learning Flashcards

Question 1

Q

Cosine Similarity

Answer

A

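Measures the cosine of the angle between two vectors to quantify how similar two items are. Values range from -1 (opposite directions) to 1 (same direction), and the result depends only on the direction of the vectors, not their magnitude.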
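
A minimal sketch of the calculation in plain Python. The function name `cosine_similarity` and the choice to raise an error for zero vectors are illustrative assumptions, not part of the card.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: treat the zero vector as an error, since the angle is undefined.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Parallel vectors such as `[1, 2, 3]` and `[2, 4, 6]` score approximately 1 even though their lengths differ.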

Question 2

Q

Manhattan Distance

Answer

A

Calculates the distance between points in a grid-based layout as the sum of the absolute differences of their Cartesian coordinates.
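
As a small illustration, the sum of absolute coordinate differences can be sketched like this (the function name `manhattan_distance` is an assumption chosen for the example):

```python
def manhattan_distance(p, q):
    """Sum of absolute coordinate differences between two points."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return sum(abs(a - b) for a, b in zip(p, q))

# From (1, 2) to (4, 6): |1 - 4| + |2 - 6| = 3 + 4
print(manhattan_distance((1, 2), (4, 6)))  # prints 7
```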

Question 3

Q

Jaccard Similarity

Answer

A

Measures the similarity of two sample sets as the size of their intersection divided by the size of their union.
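
A short sketch of the intersection-over-union calculation. The function name and the convention of returning 1.0 for two empty sets are assumptions made for this example.

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumption: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)

# {2, 3} is shared out of {1, 2, 3, 4}
print(jaccard_similarity({1, 2, 3}, {2, 3, 4}))  # prints 0.5
```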

Question 4

Q

Spearman’s Rank Correlation

Answer

A

A measure of rank correlation that assesses how well the relationship between two variables can be described using a monotonic function.
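
One common way to compute it, sketched here under the assumption that ties receive the average of their rank positions, is to rank both variables and take the Pearson correlation of the ranks. The helper names `average_ranks` and `spearman_correlation` are illustrative.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_correlation(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)
```

Because only ranks matter, a nonlinear but strictly increasing relationship such as y = x^3 still scores approximately 1.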

Question 5

Q

K-Nearest Neighbors (KNN)

Answer

A

An instance-based algorithm that stores all training cases and classifies a new case by a majority vote of its k nearest neighbors. For regression, it averages the neighbors’ values instead.
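
A minimal sketch of the majority vote, assuming Euclidean distance and Python 3.8+ (for `math.dist`). The function name `knn_classify` and the toy data are hypothetical.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training label with its distance to the query, nearest first.
    neighbors = sorted(
        ((math.dist(p, query), label) for p, label in zip(train_points, train_labels)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]

points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_classify(points, labels, (0.5, 0.5)))  # prints a
```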

Question 6

Q

Matrix Factorization

Answer

A

A collaborative filtering technique using decompositions like SVD to predict missing entries in a user-item interaction matrix.

Question 7

Q

Content-Based Filtering

Answer

A

Recommends items based on their similarity to items previously liked by the user, using the features of the items themselves.

Question 8

Q

Cold Start Problem

Answer

A

A challenge in recommendation systems where there is insufficient data on new users or items to make accurate recommendations.

Question 9

Q

Item-to-Item Collaborative Filtering

Answer

A

A form of collaborative filtering based on calculating the similarity between items using ratings given by users.

Question 10

Q

Hamming Distance

Answer

A

Measures the distance between two strings of equal length by counting the number of positions at which the corresponding symbols differ.
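
For illustration, counting the differing positions can be sketched as follows (the function name `hamming_distance` is an assumption):

```python
def hamming_distance(s, t):
    """Number of positions at which two equal-length sequences differ."""
    if len(s) != len(t):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(s, t))

# "karolin" vs "kathrin" differ at r/t, o/h, and l/r
print(hamming_distance("karolin", "kathrin"))  # prints 3
```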

Question 11

Q

Supervised Learning

Answer

A

A type of machine learning where the model is trained on a labeled dataset, learning to predict the output from the input data.

Question 12

Q

Unsupervised Learning

Answer

A

Learning from data that has not been labeled, categorized, or classified, aiming to identify significant patterns.

Question 13

Q

Regression

Answer

A

A family of statistical methods used in machine learning to predict a continuous numeric outcome from input features, using a model fitted to previously observed data.

Question 14

Q

Classification

Answer

A

A process in machine learning for categorizing data into predefined classes or categories.

Question 15

Q

Decision Trees

Answer

A

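A decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes. In machine learning, each internal node tests a feature, each branch is an outcome of that test, and each leaf gives a predicted class or value.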

Question 16

Q

Random Forest

Answer

A

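An ensemble learning method for classification, regression, and other tasks that constructs multiple decision trees at training time and combines their outputs: the class chosen by most trees for classification, or the mean of the trees’ predictions for regression.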

Question 17

Q

Neural Networks

Answer

A

Computing systems vaguely inspired by the biological neural networks that constitute animal brains, capable of pattern recognition and data classification.

Question 18

Q

Gradient Descent

Answer

A

An optimization algorithm used to minimize some function by iteratively moving in the direction of steepest descent as defined by the negative of the gradient.
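
A one-dimensional sketch of the update rule x ← x − learning_rate · f′(x). The function name, the fixed learning rate of 0.1, and the example function f(x) = (x − 3)² are assumptions chosen for illustration.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x

# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(round(minimum, 6))  # prints 3.0
```

With this learning rate the distance to the minimum shrinks by a factor of 0.8 per step; a rate that is too large can make the iterates diverge instead.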

Question 19

Q

Overfitting

Answer

A

A modeling error in machine learning where a function is too closely fitted to a limited set of data points and fails to generalize to new data.

Question 20

Q

Cross-Validation

Answer

A

A technique for assessing how the results of a statistical analysis will generalize to an independent data set, commonly used in settings where the goal is prediction and one wants to estimate how accurately a predictive model will perform in practice.
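
One widely used variant is k-fold cross-validation, where each fold serves once as the held-out test set. The sketch below only generates the index splits; the function name `k_fold_indices` is hypothetical, and it assumes contiguous folds with no shuffling.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

for train, test in k_fold_indices(6, 3):
    print(train, test)
```

A model would be fitted on each `train` split and scored on the matching `test` split, with the scores averaged to estimate performance on unseen data.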

Question 21

Q

Collaborative Filtering

Answer

A

Collaborative filtering is a technique used in recommendation systems to predict the preferences of a user by collecting preferences or taste information from many users. The underlying assumption of the collaborative filtering approach is that if a person A has the same opinion as a person B on an issue, A is more likely to have B’s opinion on a different issue than that of a randomly chosen person.

Question 22

Q

Pearson Correlation

Answer

A

Pearson correlation measures the linear relationship between two variables, providing a value between -1 and 1. A score of 1 means a perfect positive linear correlation, -1 indicates a perfect negative linear correlation, and 0 means no linear correlation (a nonlinear relationship may still exist). It’s commonly used in statistics to assess the strength and direction of the relationship between two continuous variables.
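
A minimal sketch of the coefficient as covariance divided by the product of the standard deviations. The function name `pearson_correlation` and the error handling for constant inputs are assumptions for this example.

```python
import math

def pearson_correlation(x, y):
    """Linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)
```

For `x = [-2, -1, 0, 1, 2]` and `y = [4, 1, 0, 1, 4]` the coefficient is 0 even though y is fully determined by x, because the relationship (y = x²) is not linear.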

Question 23

Q

Euclidean Distance

Answer

A

Euclidean distance is the “straight-line” distance between two points in Euclidean space. In terms of data points, it represents the geometric distance in multidimensional space, calculated using the Pythagorean theorem. It is often used in clustering and classification to determine how similar or dissimilar data points are to each other.
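
The Pythagorean calculation generalizes to any number of dimensions, as in this short sketch (the function name `euclidean_distance` is an assumption):

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# The classic 3-4-5 right triangle
print(euclidean_distance((0, 0), (3, 4)))  # prints 5.0
```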

(23 cards)