Week 10: Feature Engineering & Dimensionality Reduction Flashcards

1
Q

Standardise Numeric Values

A

Subtract the mean \mu from each value and divide each centred value by the standard deviation \sigma, so the transformed variable has zero mean and unit variance: z = (x - \mu) / \sigma.
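A minimal NumPy sketch of this transformation (the sample values are illustrative only):

```python
import numpy as np

def standardise(x):
    """Return z-scores: subtract the mean, divide by the standard deviation."""
    mu = x.mean()
    sigma = x.std()
    return (x - mu) / sigma

values = np.array([150.0, 160.0, 170.0, 180.0, 190.0])
z = standardise(values)  # the result has mean ~0 and standard deviation ~1
```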

2
Q

Convert Numeric Values into Percentiles

A

The x-th percentile means that x percent of the samples are less than the current sample.
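A rough sketch of this conversion, assuming the strictly-less-than convention described above (other percentile definitions exist):

```python
import numpy as np

def to_percentiles(x):
    """Replace each value with the percentage of samples strictly below it."""
    x = np.asarray(x)
    return np.array([100.0 * np.mean(x < v) for v in x])

print(to_percentiles([3, 1, 4, 1, 5]))  # the largest value maps to 80.0
```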

3
Q

Convert Counts into Rates

A

This is useful when tracking events over time: it measures how often an event occurs in a specific time unit.
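A tiny illustration with made-up numbers: dividing each count by the length of its observation window turns raw counts into comparable rates.

```python
# Hypothetical data: raw event counts and the observation window for each user.
counts = [12, 45, 7]                # events observed per user
hours_observed = [3.0, 10.0, 0.5]   # length of each user's observation window, in hours

rates = [c / t for c, t in zip(counts, hours_observed)]  # events per hour
print(rates)  # [4.0, 4.5, 14.0]
```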

4
Q

Replace Categorical Variables with Numeric Variables

A

Replacements include numeric descriptor variables and binary indicator variables.

For example, cities can be described by a series of numeric descriptor variables (e.g. population, median income, annual rainfall).

One-hot encoding can be used for a variable with only a few categories.
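A short pandas sketch of one-hot encoding (the city values are made up):

```python
import pandas as pd

df = pd.DataFrame({"city": ["Sydney", "Melbourne", "Sydney", "Perth"]})

# One binary indicator column per category.
one_hot = pd.get_dummies(df["city"], prefix="city")
print(one_hot)
```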

5
Q

Replace Numerical Variables with Categorical Variables

A

Binning is a common technique; variants include equal-width, equal-weight (equal-frequency), and supervised binning.
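A brief pandas sketch of equal-width and equal-weight binning (supervised binning, which uses the target variable, is not shown; the ages are illustrative):

```python
import pandas as pd

ages = pd.Series([22, 25, 31, 38, 45, 52, 61, 70])

# Equal-width binning: each bin spans an equal range of the variable.
equal_width = pd.cut(ages, bins=4)

# Equal-weight (equal-frequency) binning: each bin holds roughly the same number of samples.
equal_weight = pd.qcut(ages, q=4)
```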

6
Q

Combining Variables

A

Common examples include BMI (weight divided by height squared) and the price-to-earnings ratio.
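A minimal example of combining two variables into one feature, using BMI with made-up measurements:

```python
# BMI combines weight and height into a single feature: weight / height^2.
weight_kg = [70.0, 85.0]
height_m = [1.75, 1.80]

bmi = [w / h ** 2 for w, h in zip(weight_kg, height_m)]  # kg / m^2
```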

7
Q

Problems with High Dimensional Data

A

Issues include the risk of correlation (redundancy) between input variables and an increased risk of overfitting.

8
Q

Sparse Data Problem

A

Sparse data may result in isolated points without many neighbours, which makes pattern recognition more difficult.

9
Q

Variable Selection

A

This is key for reducing the number of predictors in high-dimensional problems. The relevance of variables depends on independence, correlation, and average mutual information.

It’s important to remove attributes with low mutual information with the target attribute, attributes correlated with other attributes, and attributes independent of the target attribute.

10
Q

Correlation

A

\rho(\boldsymbol{x}_j, \boldsymbol{y}) = \frac{\sum_{i=1}^n (x_{i,j} - \overline{x}_j)(y_i - \overline{y})}{\sqrt{\sum_{i=1}^n (x_{i,j} - \overline{x}_j)^2 \sum_{i=1}^n (y_i - \overline{y})^2}}
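A direct NumPy translation of this formula, checked against NumPy's built-in correlation (the data are arbitrary):

```python
import numpy as np

def correlation(x_j, y):
    """Pearson correlation between one input column x_j and the target y."""
    x_j, y = np.asarray(x_j, dtype=float), np.asarray(y, dtype=float)
    xc, yc = x_j - x_j.mean(), y - y.mean()
    return (xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum())

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.0])
print(correlation(x, y))        # close to 1
print(np.corrcoef(x, y)[0, 1])  # NumPy's built-in Pearson correlation, should agree
```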

11
Q

Average Mutual Information

A

I(\boldsymbol{y}; \boldsymbol{x}_j) = H(\boldsymbol{y}) - H(\boldsymbol{y} \mid \boldsymbol{x}_j)

Note that H(\boldsymbol{y} \mid \boldsymbol{x}_j) is the conditional entropy:

H(\boldsymbol{y} \mid \boldsymbol{x}_j) = - \sum_{y \in Dom(\boldsymbol{y})} \sum_{x \in Dom(\boldsymbol{x}_j)} P(x,y) \log [P(y \mid x)]
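A small sketch estimating I(\boldsymbol{y}; \boldsymbol{x}_j) from joint counts of two discrete variables; it uses the equivalent form I(y; x) = \sum_{x,y} P(x,y) \log [P(x,y) / (P(x)P(y))], and the toy data are made up:

```python
import numpy as np
from collections import Counter

def average_mutual_information(x, y):
    """Estimate I(y; x) = H(y) - H(y | x) from joint counts of two discrete variables."""
    n = len(x)
    p_xy = {pair: c / n for pair, c in Counter(zip(x, y)).items()}
    p_x = {v: c / n for v, c in Counter(x).items()}
    p_y = {v: c / n for v, c in Counter(y).items()}
    return sum(p * np.log(p / (p_x[a] * p_y[b])) for (a, b), p in p_xy.items())

x = ["a", "a", "b", "b"]
y = [0, 0, 1, 1]
print(average_mutual_information(x, y))  # log(2): x determines y completely
```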

12
Q

Exhaustive Feature Selection

A

Exhaustively try all combinations of a set of variables. This approach is impractical for high-dimensional data.
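A quick illustration of why exhaustive search blows up: it must enumerate every non-empty subset of the features (2^n - 1 of them). The feature names are placeholders and the scoring step is omitted.

```python
from itertools import combinations

features = ["x1", "x2", "x3"]  # hypothetical feature names

# Every non-empty subset of the features: 2^n - 1 candidate models to evaluate.
subsets = [c for r in range(1, len(features) + 1) for c in combinations(features, r)]
print(len(subsets))  # 7 subsets for 3 features; over a million for just 20 features
```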

13
Q

Forward Selection

A

A sequential feature selection method. Start without any variables in the model. Build a family of models, each with one input variable. Pick the best input variable. Repeat by adding one variable at a time. Terminate once a predefined maximum number of variables is reached, or when adding a new variable no longer improves the model.
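A greedy sketch of this procedure; `score_model` is an assumed helper (not defined here) that fits a model on the given variables and returns a score where higher is better:

```python
def forward_selection(candidates, score_model, max_vars):
    """Greedy forward selection over a list of candidate variable names."""
    selected, best_score = [], float("-inf")
    while len(selected) < max_vars:
        # Build one candidate model per unused variable and keep the best.
        scored = [(score_model(selected + [v]), v) for v in candidates if v not in selected]
        if not scored:
            break
        score, best_var = max(scored)
        if score <= best_score:  # adding another variable no longer helps
            break
        selected.append(best_var)
        best_score = score
    return selected
```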

14
Q

Backward Selection

A

Start with all variables initially included in the model. Each variable is removed in turn to test its importance to the model, and the least important variable is removed. Variables are removed until a minimum number of variables is reached or until the remaining variables are all above a certain level of importance.

This is typically a time-consuming approach.
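A matching sketch of backward elimination under the same assumptions (`score_model` is a hypothetical scoring helper, higher is better):

```python
def backward_selection(variables, score_model, min_vars):
    """Greedy backward elimination over a list of variable names."""
    selected = list(variables)
    while len(selected) > min_vars:
        current_score = score_model(selected)
        # Try dropping each variable in turn; keep the removal that hurts the score least.
        scored = [(score_model([v for v in selected if v != drop]), drop) for drop in selected]
        best_score, least_important = max(scored)
        if best_score < current_score:  # every removal makes the model worse
            break
        selected.remove(least_important)
    return selected
```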

15
Q

Projections

A

Transform points from the \mathbb{R}^n space to the \mathbb{R}^k space, with k < n. For example, projecting a 3-D ball onto a 2-D plane results in a circle.
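A concrete toy example: projecting points from \mathbb{R}^3 onto the x-y plane with a 2 \times 3 projection matrix.

```python
import numpy as np

# Drop the z coordinate: a linear projection from R^3 to R^2.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])           # 2 x 3 projection matrix

points_3d = np.array([[1.0, 2.0, 3.0],
                      [4.0, 5.0, 6.0]])   # one point per row

points_2d = points_3d @ P.T               # each row is now a point in R^2
print(points_2d)                          # [[1. 2.] [4. 5.]]
```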

16
Q

Eigenvectors and Eigenvalues

A

Given a square matrix \boldsymbol{A}, eigenvector \boldsymbol{u}, and eigenvalue \lambda,

\boldsymbol{Au} = \lambda \boldsymbol{u}
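A quick numerical check of the definition using NumPy (the matrix is arbitrary):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)  # columns of `eigenvectors` are the u_i

u, lam = eigenvectors[:, 0], eigenvalues[0]
print(np.allclose(A @ u, lam * u))  # True: A u = lambda u
```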

17
Q

Eigendecomposition

A

The eigendecomposition of an n \times n square matrix \boldsymbol{A} is
\boldsymbol{A} = \boldsymbol{U \Lambda U}^{-1}

\boldsymbol{U} is an n \times n square matrix whose i-th column is the i-th eigenvector \boldsymbol{u}_i of \boldsymbol{A}.

\boldsymbol{\Lambda} is an n \times n diagonal matrix whose i-th diagonal element \lambda_i is the eigenvalue corresponding to the eigenvector \boldsymbol{u}_i.
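A short NumPy check that the factors returned by `np.linalg.eig` reconstruct \boldsymbol{A} (the matrix is an arbitrary diagonalisable example):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigenvalues, U = np.linalg.eig(A)  # columns of U are the eigenvectors u_i
Lambda = np.diag(eigenvalues)      # diagonal matrix of eigenvalues

print(np.allclose(A, U @ Lambda @ np.linalg.inv(U)))  # True: A = U Lambda U^{-1}
```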

18
Q

Singular Vectors and Singular Values

A

A non-negative number \sigma is a singular value of an m \times n matrix \boldsymbol{A} if there exist unit-length vectors \boldsymbol{u} (left-singular vector) and \boldsymbol{v} (right-singular vector) such that:

\boldsymbol{Av} = \sigma \boldsymbol{u} and
\boldsymbol{A}^T\boldsymbol{u} = \sigma \boldsymbol{v}

19
Q

Singular Value Decomposition

A

A factorisation \boldsymbol{A} = \boldsymbol{U \Sigma V}^T of an m \times n matrix \boldsymbol{A}.

\boldsymbol{U} = m \times m orthogonal matrix whose columns are the left singular vectors.

\boldsymbol{\Sigma} = m \times n diagonal matrix whose diagonal elements \sigma_{i,i} are the singular values.

\boldsymbol{V} = n \times n orthogonal matrix whose columns are the right singular vectors.

In the context of data mining, each row \boldsymbol{u}_i of \boldsymbol{U} corresponds to a document and each column \boldsymbol{v}_j of \boldsymbol{V} to a term.
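A NumPy sketch of the factorisation; note that `np.linalg.svd` returns the singular values as a 1-D array, so \boldsymbol{\Sigma} has to be expanded into an m \times n diagonal matrix (the random matrix stands in for a small document-term matrix):

```python
import numpy as np

m, n = 5, 3                      # e.g. 5 documents and 3 terms
A = np.random.rand(m, n)

U, s, Vt = np.linalg.svd(A, full_matrices=True)  # U: m x m, Vt: n x n
Sigma = np.zeros((m, n))
np.fill_diagonal(Sigma, s)       # place the singular values on the diagonal

print(np.allclose(A, U @ Sigma @ Vt))  # True: A = U Sigma V^T
```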

20
Q

Latent Semantic Indexing

A

This method is used in NLP and applies SVD to reduce the number of columns (term dimensions) while preserving the similarity structure among the rows (documents). It reduces the m \times n matrix to an m \times k matrix. Terms with similar meaning are expected to be merged into the same dimension after reducing dimensionality.
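A minimal sketch of the reduction step on a made-up document-term matrix, keeping k = 2 latent dimensions (the standard U_k \Sigma_k document representation is assumed):

```python
import numpy as np

# Toy document-term matrix: m = 4 documents (rows) x n = 5 terms (columns).
A = np.array([[2, 1, 0, 0, 0],
              [1, 2, 0, 0, 0],
              [0, 0, 1, 2, 1],
              [0, 0, 2, 1, 1]], dtype=float)

k = 2  # number of latent dimensions to keep
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-k representation of the documents: an m x k matrix in the latent space.
docs_k = U[:, :k] * s[:k]
print(docs_k.shape)  # (4, 2)
```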

21
Q

Projection Pursuit Regression

A

Nonlinear transformation of linear combinations of variables by finding the most interesting projections of the data.

\hat{y} = \sum_{j=1}^k w_j h_j (\boldsymbol{\alpha}_j^T \boldsymbol{x})

k = number of new variables, usually much smaller than n, the original number of variables

\boldsymbol{\alpha}_j^T \boldsymbol{x} = projection of vector \boldsymbol{x} onto the j-th weight vector \boldsymbol{\alpha}_j
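A tiny sketch that just evaluates this formula for fixed, hand-picked components; the directions \boldsymbol{\alpha}_j, weights w_j, and nonlinear functions h_j are assumptions for illustration, and fitting them from data is not shown:

```python
import numpy as np

alphas = [np.array([1.0, 0.5]), np.array([-0.5, 1.0])]  # projection directions alpha_j
weights = [0.7, 0.3]                                     # weights w_j
ridge_functions = [np.tanh, np.sin]                      # nonlinear functions h_j

def ppr_predict(x):
    """y_hat = sum_j w_j * h_j(alpha_j^T x) for a single input vector x."""
    return sum(w * h(a @ x) for w, h, a in zip(weights, ridge_functions, alphas))

print(ppr_predict(np.array([0.2, -0.1])))
```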