[4] Feature Manipulation Flashcards

1
Q

What is feature selection?

A

The process of selecting a small subset of relevant features from the larger original set

2
Q

What are the motivations for feature selection?

A
  • Increase classification accuracy
  • Speed up processing time
  • Improve interpretability
3
Q

What are the possible overall purposes of feature selection?

A
  • Classical - select m features from the original n while retaining (or improving) classification accuracy
  • Idealised - find the minimal set of features that can fully describe the target concept
4
Q

What are the key decisions to make when doing feature selection?

A
  • the overall purpose
  • the evaluation approach
  • the iterative approach
5
Q

What are the iterative approaches for feature selection?

A

Sequential forward starts with an empty set and adds the best remaining feature at each step

Sequential backward starts with the full set and iteratively removes the least useful feature
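
A minimal sketch of sequential forward selection in Python, assuming a hypothetical score(features) function (e.g. cross-validated accuracy) that evaluates a candidate feature subset:

    # Sequential forward selection: greedily grow the feature set, adding
    # whichever remaining feature improves the score the most.
    # Sequential backward selection is the mirror image: start full, remove.
    def sequential_forward(n_features, score, m):
        selected = []
        remaining = list(range(n_features))
        while len(selected) < m:
            best = max(remaining, key=lambda f: score(selected + [f]))
            selected.append(best)
            remaining.remove(best)
        return selected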

6
Q

What are the evaluation approaches that can be used for feature selection?

A
  • Wrapper - include a learning algorithm in the evaluation process, i.e. a model is trained each time a feature subset is considered
  • Filter - no learning algorithm; instead, measures based on distance (separability), information (e.g. entropy), dependency (correlation) or consistency are used
      – Single-feature ranking scores each feature individually by such a measure, lists the features in order and picks the top m (see the sketch below)
  • Embedded - a model is trained once and analysed to see which variables are most important
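
As an illustration of filter-style single-feature ranking, a sketch using scikit-learn's mutual_info_classif (an information-based measure; the choice of measure is an assumption) to score each feature and keep the top m:

    import numpy as np
    from sklearn.feature_selection import mutual_info_classif

    def top_m_features(X, y, m):
        # Score each feature individually by the mutual information it
        # shares with the class labels, then keep the m best-ranked.
        scores = mutual_info_classif(X, y)
        return np.argsort(scores)[::-1][:m]
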
7
Q

What are the tradeoffs of the evaluation approaches for feature selection?

A

Wrapper methods tend to give better results, but they are computationally expensive and may not generalise well if a different classification algorithm is later used

Single-feature ranking ignores feature interactions, and there is a risk that the top-ranked features are redundant with each other

8
Q

What is entropy?

A

The average number of bits needed to encode the value of a variable: for a discrete variable X, H(X) = −Σ p(x) log₂ p(x)
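
A quick sketch of Shannon entropy for a discrete variable, with two sanity checks:

    import math
    from collections import Counter

    def entropy(values):
        # Shannon entropy in bits: H(X) = -sum p(x) * log2 p(x)
        counts = Counter(values)
        n = len(values)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    assert entropy(["H", "T"]) == 1.0  # a fair coin needs 1 bit
    assert entropy(["H", "H"]) == 0.0  # a constant variable needs 0 bits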

9
Q

What is Pearson’s correlation?

A

A measure of the linear correlation between two variables, ranging from −1 (strong negative correlation) to +1 (strong positive correlation), with 0 meaning no linear correlation: r = cov(X, Y) / (σ_X σ_Y)
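
A from-scratch sketch of Pearson's r, following the formula above:

    import math

    def pearson(x, y):
        # r = cov(X, Y) / (sigma_X * sigma_Y)
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sx = math.sqrt(sum((a - mx) ** 2 for a in x))
        sy = math.sqrt(sum((b - my) ** 2 for b in y))
        return cov / (sx * sy)

    print(pearson([1, 2, 3], [2, 4, 6]))  # 1.0, perfect positive
    print(pearson([1, 2, 3], [6, 4, 2]))  # -1.0, perfect negative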

10
Q

What is feature construction?

A

New features are created from existing features

11
Q

What is PCA?

A

It creates a linear transformation of the original features so that the first component has the highest variance, the second the second highest, and so on

It assumes that the bigger the variance, the better the feature

It doesn’t take class labels into account, so it doesn’t ensure good class separability
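
A minimal PCA sketch with NumPy via eigendecomposition of the covariance matrix; note the class labels are never used:

    import numpy as np

    def pca(X, k):
        # Centre the data, then eigendecompose its covariance matrix.
        Xc = X - X.mean(axis=0)
        eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
        order = np.argsort(eigvals)[::-1][:k]  # components by descending variance
        return Xc @ eigvecs[:, order]          # project onto top-k components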

12
Q

What is key to evaluating features during feature manipulation?

A

Cross-validation, to avoid biased (overly optimistic) estimates of feature quality
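
A sketch of scoring a candidate feature subset with k-fold cross-validation (scikit-learn and a k-NN classifier are shown purely for concreteness):

    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    def subset_score(X, y, features):
        # Average accuracy over 5 folds gives a less biased estimate of
        # the subset's quality than evaluating on the training data.
        return cross_val_score(KNeighborsClassifier(), X[:, features], y, cv=5).mean()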

13
Q

How is GP used for feature construction?

A

Intervals for each class c are defined as [μ_c − 3σ_c, μ_c + 3σ_c], using the mean μ_c and standard deviation σ_c of the constructed feature’s values for that class

The GP program (the constructed feature) is optimised to avoid overlap between the class intervals, via a conditional-entropy-based fitness measure
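
A sketch of the interval idea: compute each class's [μ − 3σ, μ + 3σ] interval over the constructed feature's values and test for overlap (the data layout here is an assumption):

    import statistics

    def class_intervals(values_by_class):
        # values_by_class: {class_label: [constructed-feature values]}
        intervals = {}
        for c, vals in values_by_class.items():
            mu, sigma = statistics.mean(vals), statistics.pstdev(vals)
            intervals[c] = (mu - 3 * sigma, mu + 3 * sigma)
        return intervals

    def overlap(a, b):
        # Two intervals overlap unless one ends before the other begins.
        return a[0] <= b[1] and b[0] <= a[1]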

14
Q

What are the two general uses of transfer learning?

A

Domain adaptation is when the feature space remains the same but the probability distributions differ

If the feature space also changes, it is cross-domain adaptation

15
Q

What are the approaches to transfer learning?

A

Instance-based - re-weight labelled data from the source domain

Feature-based - find a good feature representation that is similar across the domains

Model-parameter-based - find shared parameters between models

Relational-knowledge-based - map knowledge from the source domain to the target domain

16
Q

When are multi-objective solutions unambiguously better than others?

A

A solution dominates another if it is at least as good in every objective and strictly better in at least one
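
A minimal dominance check, assuming all objectives are to be minimised:

    def dominates(a, b):
        # a dominates b: no worse in every objective, strictly better in one.
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    print(dominates((1, 2), (2, 3)))  # True
    print(dominates((1, 3), (2, 2)))  # False: a trade-off, neither dominates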

17
Q

What are the ways of approaching multi-objective optimisation?

A

Aggregation-based - weight each objective and combine them into a single objective which can be optimised

Ideal multi-objective optimisation - output high-level information, i.e. a range of feasible trade-off solutions

18
Q

How are solutions to ideal multi-objective problems represented?

A

With a Pareto front: the set of solutions that are not dominated by any other solution
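
A brute-force sketch of extracting the Pareto front (objectives minimised, reusing the dominance rule from card 16):

    def dominates(a, b):
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    def pareto_front(solutions):
        # Keep only the solutions not dominated by any other solution.
        return [s for s in solutions
                if not any(dominates(t, s) for t in solutions)]

    print(pareto_front([(1, 4), (2, 2), (4, 1), (3, 3)]))
    # [(1, 4), (2, 2), (4, 1)] since (3, 3) is dominated by (2, 2)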

19
Q

How is EC used to solve multi-objective problems?

A

Evolutionary multi-objective optimisation (EMO) obtains multiple Pareto-optimal solutions in a single run

It can handle discontinuities and concavities along the Pareto front

It has three main considerations: fitness assignment, diversity preservation and elitism (see the next card)

20
Q

What are the considerations of EMO?

A

Fitness assignment - a scalar fitness is needed:

  • Aggregating functions such as a weighted sum can be used, but they miss concave parts of the Pareto front
  • Dominance-based methods assign ranks based on dominance relations; within a rank, individuals are sorted by crowding distance (see the sketch below)

Diversity preservation - ensures good coverage of the whole Pareto front

Elitism - ensures non-dominated solutions aren’t lost between generations
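
A sketch of crowding distance in the style of NSGA-II (the exact scheme may differ from the lectures): within one rank, solutions in sparse regions of objective space get larger distances and are preferred:

    def crowding_distances(front):
        # front: list of objective tuples within one dominance rank.
        n, k = len(front), len(front[0])
        dist = [0.0] * n
        for obj in range(k):
            order = sorted(range(n), key=lambda i: front[i][obj])
            dist[order[0]] = dist[order[-1]] = float("inf")  # keep boundary points
            span = front[order[-1]][obj] - front[order[0]][obj] or 1.0
            for j in range(1, n - 1):
                dist[order[j]] += (front[order[j + 1]][obj]
                                   - front[order[j - 1]][obj]) / span
        return dist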