[4] Feature Manipulation Flashcards
What is feature selection?
The process of selecting a small subset of relevant features from the larger original set
What are the motivations for feature selection?
- Increase classification accuracy
- Speed up processing time
- Improve interpretability
What are the possible overall purposes of feature selection?
- Classical - select m features from the original n while retaining classification accuracy and other performance measures
- Idealised - find the minimal number of features that can fully describe the target
What are the key decisions to make when doing feature selection?
- the overall purpose
- the evaluation approach
- the iterative approach
What are the iterative approaches for feature selection?
Sequential forward starts with an empty set and adds one feature at each step
Sequential backward starts with the full set and removes one feature at each step (see the sketch below)
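A minimal sketch of sequential forward selection, assuming a wrapper-style `score_subset` function (a hypothetical stand-in for whatever evaluation measure is chosen):

```python
# Minimal sketch of sequential forward selection (wrapper-style).
# `score_subset` is a hypothetical stand-in for the chosen evaluation
# measure, e.g. cross-validated accuracy of a classifier.
def sequential_forward_selection(all_features, m, score_subset):
    selected = []
    remaining = list(all_features)
    while remaining and len(selected) < m:
        # Try adding each remaining feature; keep the one that scores best.
        best = max(remaining, key=lambda f: score_subset(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Sequential backward selection is the mirror image: start from the full set and repeatedly drop the feature whose removal hurts the score least.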
What are the evaluation approaches that can be used for feature selection?
- Wrapper - include a learning algorithm in the selection process, i.e. a model is trained each time a candidate feature subset is evaluated
- Filter - no learning algorithm is used; instead, measures based on distance (separability), information (e.g. entropy), dependency (correlation) or consistency are used
  - Single-feature ranking scores each feature independently, sorts them, and picks the top m (see the sketch after this list)
- Embedded - a model is trained once and analysed to see which variables are most important
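A minimal sketch of filter-style single-feature ranking, using absolute Pearson correlation with the target as the (assumed) scoring measure:

```python
import numpy as np

# Filter-style single-feature ranking: score each feature independently
# by |Pearson correlation| with the target, then keep the top m.
# X is an (n_samples, n_features) array, y the target vector.
def rank_features(X, y, m):
    scores = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    return np.argsort(scores)[::-1][:m]  # indices of the m best features
```

Any other filter measure (mutual information, a distance-based separability score, etc.) could be substituted for the correlation here.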
What are the tradeoffs of the evaluation approaches for feature selection?
Wrapper leads to better results, but it is computationally expensive and doesn't generalise well if a different classification algorithm is later used
Single-feature ranking ignores feature interaction, and there is a risk that the top features may be redundant
What is entropy?
The expected number of bits needed to encode the value of a random variable: H(X) = -Σ p(x) log2 p(x)
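A small numeric illustration of the definition:

```python
import numpy as np

# Entropy of a discrete distribution, in bits. A fair coin
# (p = [0.5, 0.5]) gives 1.0 bit; a certain outcome gives 0.
def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                     # treat 0 * log(0) as 0
    return -np.sum(p * np.log2(p))
```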
What is Pearson’s correlation?
A measure of linear correlation between two variables, ranging from -1 (perfect negative correlation) through 0 (no linear correlation) to 1 (perfect positive correlation)
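A direct translation of the definition r = cov(x, y) / (sigma_x * sigma_y), equivalent to what `np.corrcoef` computes:

```python
import numpy as np

# Pearson's correlation: covariance of x and y divided by the
# product of their standard deviations.
def pearson(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return np.sum(xc * yc) / np.sqrt(np.sum(xc**2) * np.sum(yc**2))
```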
What is feature construction?
New features are created from existing features
What is PCA?
It applies a linear transformation so that the first component captures the most variance, the second the second-most, and so on
It assumes the bigger the variance, the better the feature
It doesn’t take into account classes, so doesn’t ensure good separability
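A minimal PCA sketch via the eigendecomposition of the covariance matrix (one of several equivalent formulations; SVD is more common in practice):

```python
import numpy as np

# PCA: centre the data, eigendecompose the covariance matrix, and
# project onto the k components with the largest variance.
def pca(X, k):
    Xc = X - X.mean(axis=0)              # centre the data
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]   # top-k by explained variance
    return Xc @ vecs[:, order]           # transformed features
```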
What is key to evaluating features during feature manipulation?
Cross-validation, to avoid optimistically biased estimates from a single train/test split
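A hypothetical sketch of scoring a candidate feature subset with cross-validation (the choice of k-nearest-neighbours as the classifier is an assumption, not from the source):

```python
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Score a feature subset by 5-fold cross-validated accuracy, so the
# evaluation is not biased by a single lucky train/test split.
def score_subset(X, y, feature_idx):
    clf = KNeighborsClassifier()
    return cross_val_score(clf, X[:, feature_idx], y, cv=5).mean()
```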
How is GP used for feature construction?
Intervals for each class are defined as [mu_c - 3 sigma_c, mu_c + 3 sigma_c], where mu_c and sigma_c are the mean and standard deviation of the constructed feature over that class's examples
The GP is optimised to avoid overlap between the class intervals, i.e. minimising the conditional entropy of the class given the constructed feature
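A sketch of computing the class intervals for a GP-constructed feature, assuming `feature_values` holds the constructed feature evaluated on the training data:

```python
import numpy as np

# For each class c, the interval is [mu_c - 3*sigma_c, mu_c + 3*sigma_c],
# where mu_c and sigma_c are the mean and standard deviation of the
# constructed feature over that class's examples. A GP fitness function
# would then penalise overlap between these intervals.
def class_intervals(feature_values, labels):
    intervals = {}
    for c in np.unique(labels):
        v = feature_values[labels == c]
        mu, sigma = v.mean(), v.std()
        intervals[c] = (mu - 3 * sigma, mu + 3 * sigma)
    return intervals
```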
What are the two general settings of transfer learning?
Domain adaptation is when the feature space remains the same but the probability distributions differ
If the feature space also changes, it is cross-domain adaptation
What are the approaches to transfer learning?
Instance based - re-weight labelled data from the source domain (see the sketch after this list)
Feature based - find a good feature representation which is similar across the domains
Model parameter based - find shared parameters between models
Relational knowledge - map knowledge from the source domain to the target domain
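As one concrete (assumed, not from the source) way to do instance-based transfer, source examples can be re-weighted by an estimated density ratio p_target(x) / p_source(x), obtained from a classifier trained to tell the two domains apart:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Sketch of instance re-weighting: train a domain classifier
# (source = 0, target = 1) and convert its probabilities into
# density-ratio weights for the source examples. Assumes the two
# samples are of comparable size.
def source_weights(X_source, X_target):
    X = np.vstack([X_source, X_target])
    d = np.concatenate([np.zeros(len(X_source)), np.ones(len(X_target))])
    clf = LogisticRegression(max_iter=1000).fit(X, d)
    p = clf.predict_proba(X_source)[:, 1]   # P(target domain | x)
    return p / (1 - p)                      # estimated density ratio
```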