MACHINE LEARNING Flashcards

Question

Feature Extraction

Answer 1

Deriving new features from raw data, often for dimensionality reduction or improved model performance

Answer 2

Choosing the most relevant features for a model to improve accuracy and reduce complexity

Answer 3

Ability of a model to perform well on unseen data. Bad generalization: can do 2+2=4 but can't do 3+3=6, or can only ride one specific bicycle

Answer 4

An optimization algorithm that minimizes a model's loss function by iteratively adjusting parameters. Intuition: keep adjusting the model until the loss stops improving
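
The iterative adjustment can be sketched in a few lines. This is a minimal illustration, not a production optimizer; the function name `gradient_descent`, the default `learning_rate` and `steps`, and the example loss are all assumptions made for the card.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(steps):
        # Move opposite to the gradient; the learning rate sets the step size.
        x = x - learning_rate * grad(x)
    return x

# Hypothetical loss f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# Its minimum is at x = 3, so the result should land very close to 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
```

A learning rate that is too large can overshoot the minimum, and one that is too small converges slowly.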

Answer 5

A clustering method that creates a tree of clusters by iteratively merging or splitting clusters

Answer 6

A dataset where some classes are more frequent than others, often requiring techniques like resampling

Answer 7

Filling in missing data, commonly done using statistical or model-based techniques
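
One of the simplest statistical techniques is mean imputation. The sketch below assumes missing values are represented as `None`; the helper name `impute_mean` is hypothetical.

```python
from statistics import mean

def impute_mean(values):
    """Replace each None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Assumed toy column with two missing entries.
filled = impute_mean([1.0, None, 3.0, None, 5.0])
```

Mean imputation keeps the column mean unchanged but shrinks its variance, which is one reason model-based techniques are sometimes preferred.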

Answer 8

Assumptions a learning algorithm makes in order to generalize beyond the training data

Answer 9

Metric used in decision trees to measure the reduction in entropy after a split
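
As a rough sketch, the metric is the parent node's entropy minus the size-weighted entropy of the child nodes. The function names and the toy labels are assumptions for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the weighted average entropy of the children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Assumed toy split: a perfectly mixed parent separated into two pure children.
parent = ["yes", "yes", "no", "no"]
gain = information_gain(parent, [["yes", "yes"], ["no", "no"]])
```

A split that leaves the children as mixed as the parent yields a gain of zero.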

Answer 10

A function that implicitly maps data into a higher-dimensional space. Pros: used in SVMs to handle data that is not linearly separable

Answer 11

A clustering algorithm that partitions data into k clusters by minimizing the distance of points from their cluster centroid. The centroids are moved toward the center of their clusters by repeating the same assign-and-update steps. Sensitive to: starting positions, outliers, feature scale

Answer 12

A classification algorithm that assigns labels based on the majority label among the k closest points
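
A minimal sketch of the majority vote, assuming Euclidean distance and training data given as `(point, label)` pairs; the name `knn_predict` and the toy data are hypothetical.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points closest to query."""
    def squared_distance(a, b):
        # Squared Euclidean distance preserves the ordering of true distances.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(train, key=lambda pair: squared_distance(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Assumed toy training set with two well-separated groups.
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((5, 5), "b"), ((6, 5), "b")]
label = knn_predict(train, (0.5, 0.5), k=3)
```

Because the vote depends on raw distances, features are usually scaled before use.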

Answer 13

Steps a model takes during gradient descent

Answer 14

A model that finds a linear relationship between a dependent variable and one or more independent variables
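
For a single independent variable, the line can be fitted with ordinary least squares. This is a sketch under that assumption; `fit_line` and the sample points are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Assumed toy data lying exactly on y = 2x + 1.
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```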

Answer 15

A common method for k-means clustering, iterating between assigning points to clusters and updating centroids
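
The two alternating steps can be sketched for one-dimensional data as follows. The function name, the fixed iteration count, and the starting centroids are assumptions; real implementations usually stop when assignments no longer change.

```python
def k_means_1d(points, centroids, iterations=10):
    """Alternate between assigning points and updating centroids (1-D data)."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its assigned points.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

# Assumed toy data with two obvious groups and arbitrary starting centroids.
centers = k_means_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], centroids=[0.0, 5.0])
```

Different starting centroids can lead to different final clusters, which is why the procedure is often run several times.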

Answer 16

A model used for binary classification by modeling the probability of an outcome with a sigmoid function
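
A minimal sketch of the prediction step: a weighted sum of the features is passed through the sigmoid to give a probability between 0 and 1. The weights, bias, and function names are hypothetical; training the weights is not shown.

```python
from math import exp

def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + exp(-z))

def predict_proba(weights, bias, features):
    """Probability of the positive class for one example."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

# Assumed weights chosen so the weighted sum is exactly zero.
p = predict_proba([2.0, -1.0], 0.0, [1.0, 2.0])
```

A weighted sum of zero sits on the decision boundary, where the predicted probability is 0.5.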

Answer 17

Function that measures the error of a model's predictions; used to guide training

Answer 18

Average of the absolute differences between predicted and actual values
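
The definition translates directly into code. The function name and sample values below are assumptions for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all examples."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Assumed toy values: absolute errors are 1, 0 and 2.
mae = mean_absolute_error([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])
```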

Answer 19

Types of missing data. MCAR: missing completely at random. MAR: missingness related to observed data. MNAR: missingness related to the missing values themselves

Answer 20

Process of assessing a model's performance using metrics like accuracy, precision, recall and RMSE

Answer 21

The process of choosing the best model among various options based on performance

Answer 22

Probabilistic classifier based on Bayes' theorem, assuming features are conditionally independent given the class

Answer 23

Model inspired by the human brain, consisting of layers of neurons that process data for tasks like classification and regression

Answer 24

No model works best for all problems

Answer 25

When a model learns noise in the training data and therefore performs poorly on new data

Answer 26

A technique to balance classes by replicating or generating new samples of the minority class
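
The replication variant can be sketched as random oversampling: minority samples are duplicated until every class matches the largest one. The function name, the fixed seed, and the toy data are assumptions; generating synthetic samples is not shown.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, lab in zip(samples, labels) if lab == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels

# Assumed toy dataset: four negatives and a single positive.
X, y = random_oversample([1, 2, 3, 4, 9], ["neg", "neg", "neg", "neg", "pos"])
```

Oversampling is normally applied to the training split only, so that duplicated samples do not leak into evaluation data.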

Answer 27

A dimensionality reduction technique that projects data into a lower-dimensional space while retaining as much variance as possible

Answer 28

Process of removing parts from a decision tree to prevent overfitting

Answer 29

A function used in machine learning, often in neural networks and SVMs, to handle non-linear patterns

Answer 30

An ensemble of decision trees that improves accuracy and reduces overfitting by aggregating predictions

Answer 31

Ratio of true positive predictions to the total actual positives
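
In code, the ratio is TP / (TP + FN). The sketch assumes labels are encoded as 1 (positive) and 0 (negative); the function name and sample labels are hypothetical.

```python
def recall(actual, predicted, positive=1):
    """True positives divided by all actual positives."""
    pairs = list(zip(actual, predicted))
    true_positives = sum(1 for a, p in pairs if a == positive and p == positive)
    false_negatives = sum(1 for a, p in pairs if a == positive and p != positive)
    return true_positives / (true_positives + false_negatives)

# Assumed toy labels: three actual positives, two of them found by the model.
r = recall([1, 1, 1, 0, 0], [1, 0, 1, 1, 0])
```

False positives do not affect this metric; they are captured by precision instead.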

Answer 32

An activation function that outputs zero for negative inputs and the input itself for positive values
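
The behavior described above fits in one line. The function name `relu` and the sample inputs are chosen for illustration.

```python
def relu(x):
    """Return 0 for negative inputs and the input itself otherwise."""
    return max(0.0, x)

# Assumed sample inputs covering negative, zero and positive values.
outputs = [relu(v) for v in [-2.0, 0.0, 3.5]]
```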

terms (57 cards)