ML Concepts Flashcards
Feature crossing
- Create new features by combining existing features, e.g. taking the Cartesian product of the values of two categorical features
- A feature cross is a synthetic feature that encodes nonlinearity in the feature space by multiplying two or more input features together.
Example: crossing two features, country and language
f1: [USA, China, England]
f2: [English, Chinese]
Generates 6 new features
- USA-English
- USA-Chinese
- China-English
- China-Chinese
- England-English
- England-Chinese
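A minimal sketch in pandas (the data and column names are illustrative):

```python
import pandas as pd

# Toy rows with the two categorical features from the example above.
df = pd.DataFrame({
    "country": ["USA", "China", "England"],
    "language": ["English", "Chinese", "English"],
})

# Feature cross: concatenate the two values, then one-hot encode the
# result so each (country, language) pair gets its own column.
df["country_x_language"] = df["country"] + "-" + df["language"]
crossed = pd.get_dummies(df["country_x_language"])
print(crossed)
```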
Feature selection
- Goal is to reduce the number of features to only those most useful
- Can use a GBDT (gradient-boosted decision tree) to select a subset of features based on their importance scores, as in the sketch below
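A minimal sketch with scikit-learn (synthetic data; the median threshold is an arbitrary choice):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel

X, y = make_classification(n_samples=500, n_features=20,
                           n_informative=5, random_state=0)

# Fit a GBDT, then keep only the features whose importance score
# is at or above the median importance.
gbdt = GradientBoostingClassifier(random_state=0).fit(X, y)
selector = SelectFromModel(gbdt, threshold="median", prefit=True)
X_selected = selector.transform(X)
print(X.shape, "->", X_selected.shape)  # e.g. (500, 20) -> (500, 10)
```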
Feature extraction
Reduce the number of features by creating new features from existing ones. The new features concentrate most of the predictive information of the originals into fewer dimensions.
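For instance, PCA extracts new features as linear combinations of the originals. A minimal sketch with scikit-learn on random data:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))  # 200 samples, 50 original features

# Extract 10 new features, each a linear combination of the originals,
# chosen to preserve as much variance as possible.
pca = PCA(n_components=10)
X_new = pca.fit_transform(X)
print(X_new.shape)  # (200, 10)
```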
Unsupervised learning
Training data has no labels; the model must find structure (e.g. clusters) on its own
Dimensionality Reduction
Techniques for reducing the number of input variables in training data.
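PCA (above) is one such technique; for sparse inputs like one-hot-encoded data, TruncatedSVD is a common alternative. A minimal sketch:

```python
from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD

# A sparse 1000 x 500 matrix, similar in shape to heavily
# one-hot-encoded data (99% of entries are zero).
X = sparse_random(1000, 500, density=0.01, format="csr", random_state=0)

# Unlike PCA, TruncatedSVD accepts sparse input directly.
svd = TruncatedSVD(n_components=20, random_state=0)
X_reduced = svd.fit_transform(X)
print(X_reduced.shape)  # (1000, 20)
```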
Over-fitting
The model gives accurate predictions for training data but not new data
Under-fitting
- Model is unable to capture the relationship between inputs and output variables accurately
- High error rate for both training data and unseen data
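A quick sketch that makes both failure modes visible: fit polynomials of increasing degree to noisy data and compare train vs. test R² (degree 1 under-fits, degree 15 over-fits):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=60)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):  # under-fit, reasonable, over-fit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(degree,
          round(model.score(X_train, y_train), 2),  # train R^2
          round(model.score(X_test, y_test), 2))    # test R^2
```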
Pros of Feature Crossing
- Captures pair-wise, second-order feature interactions
Cons of Feature Crossing
- Requires a human with domain knowledge to choose which features to cross
- Won’t capture all complex interactions
- If the original features are sparse, the cardinality of the crossed features can be much larger, leading to even more sparsity
One-hot encoding
- Technique for representing a categorical variable as a binary vector with one column per category, so it can be used as numerical input to a model
Advantages of one-hot encoding
- It allows the use of categorical variables in models that require numerical input.
- It avoids imposing a spurious ordering on categories that have none: integer-encoding a nominal variable would falsely imply one. (For genuinely ordered categories such as “small”, “medium”, “large”, ordinal encoding may be the better fit.)
What would a categorical feature for fruit (apple, mango, banana) and an associated price look like using one-hot encoding?
A four-column feature vector: apple, mango, banana, price. Exactly one fruit column is 1 and the others are 0; the price column holds the numerical price. For example, a mango costing 2.50 becomes [0, 1, 0, 2.50].
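A minimal sketch with pandas (the prices are made up):

```python
import pandas as pd

# Toy data matching the flashcard: three fruits with made-up prices.
df = pd.DataFrame({
    "fruit": ["apple", "mango", "banana"],
    "price": [1.00, 2.50, 0.75],
})

# One-hot encode the fruit column; price stays numerical.
encoded = pd.get_dummies(df, columns=["fruit"])
print(encoded)  # price plus one 0/1 indicator column per fruit
```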
Disadvantages of one-hot encoding
- Increased dimensionality: one extra column per category
- Sparsity: for any given row, most of the encoded columns are zero
- Can contribute to overfitting, especially with high-cardinality features
- Use it cautiously and consider alternatives such as ordinal encoding or binary encoding
Embeddings
- Way to encode a categorical feature
- Alternative to one-hot encoding which can generate very sparse vectors
- Maps high-dimensional (often sparse) vectors to low-dimensional dense vectors
- Embeddings make it easier to do machine learning on large inputs like sparse vectors representing words.
- Ideally, an embedding captures some of the semantics of the input by placing semantically similar inputs close together in the embedding space.
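A minimal sketch with PyTorch (the vocabulary size and dimension are arbitrary):

```python
import torch
import torch.nn as nn

# 10,000 categories (e.g. words) embedded in 16 dimensions. One-hot
# encoding would need 10,000 columns; the embedding needs only 16.
embedding = nn.Embedding(num_embeddings=10_000, embedding_dim=16)

# Look up dense vectors for a batch of category indices.
ids = torch.tensor([3, 42, 9_999])
vectors = embedding(ids)
print(vectors.shape)  # torch.Size([3, 16])
```

The embedding weights are learned during training, which is what lets semantically similar inputs end up close together in the embedding space.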