Feature Crosses Flashcards

1
Q

What is a feature cross?

A

A feature cross is a synthetic feature that encodes nonlinearity in the feature space by multiplying two or more input features together. (The term cross comes from cross product.)
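As a minimal sketch of why multiplying features encodes nonlinearity, assume four hypothetical points (one per quadrant) whose label is 1 when x1 and x2 share a sign. No straight line in (x1, x2) separates these classes, but a single threshold on the crossed feature does. The points, labels, and the `cross` helper are illustrative assumptions, not part of the original card.

```python
# Hypothetical data: one point per quadrant; label is 1 when x1 and x2 share a sign.
points = [(1.0, 2.0), (-1.5, -0.5), (-2.0, 1.0), (0.5, -3.0)]
labels = [1, 1, 0, 0]


def cross(x1, x2):
    """Feature cross [x1 X x2]: the product of the two input features."""
    return x1 * x2


# The synthetic feature x3 = x1 * x2.
crossed = [cross(x1, x2) for x1, x2 in points]

# A linear rule on the crossed feature alone (threshold at 0) separates the classes.
predictions = [1 if x3 > 0 else 0 for x3 in crossed]

print(crossed)                # [2.0, 0.75, -2.0, -1.5]
print(predictions == labels)  # True
```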

2
Q

What do you call a synthetic feature that combines other features?

A

Feature cross (feature cross product).

3
Q

Explain three possible examples of a feature cross.

A

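We can create many different kinds of feature crosses. For example:

[A X B]: a feature cross formed by multiplying the values of two features.
[A X B X C X D X E]: a feature cross formed by multiplying the values of five features.
[A X A]: a feature cross formed by squaring a single feature.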

4
Q

Why is it efficient to use feature crossing?

A

Thanks to stochastic gradient descent, linear models can be trained efficiently. Consequently, supplementing scaled linear models with feature crosses has traditionally been an efficient way to train on massive-scale data sets.

5
Q

Explain example 1 of a feature cross of two one-hot encoded vectors (features).

A

Suppose you bin latitude and longitude, producing separate one-hot five-element feature vectors. For instance, a given latitude and longitude could be represented as follows:

binned_latitude = [0, 0, 0, 1, 0]
binned_longitude = [0, 1, 0, 0, 0]

Suppose you create a feature cross of these two feature vectors:

binned_latitude X binned_longitude

This feature cross is a 25-element one-hot vector (24 zeroes and 1 one). The single 1 in the cross identifies a particular conjunction of latitude and longitude. Your model can then learn particular associations about that conjunction.
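The cross of the two one-hot vectors above can be sketched as a flattened outer product. The helper name `cross_one_hot` and the row-major ordering (latitude bin first) are assumptions made for illustration.

```python
def cross_one_hot(a, b):
    """Cross two one-hot vectors: the outer product, flattened row by row."""
    return [ai * bj for ai in a for bj in b]


binned_latitude = [0, 0, 0, 1, 0]
binned_longitude = [0, 1, 0, 0, 0]

crossed = cross_one_hot(binned_latitude, binned_longitude)

# 25 elements, exactly one of them set; its position encodes the
# (latitude bin 3, longitude bin 1) conjunction: 3 * 5 + 1 = 16.
print(len(crossed), sum(crossed), crossed.index(1))  # 25 1 16
```

Because each conjunction gets its own position, a linear model can assign it its own weight.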

6
Q

Explain example 2 of a feature cross.

A

Suppose our model needs to predict how satisfied dog owners will be with dogs based on two features:

Behavior type (barking, crying, snuggling, etc.)
Time of day

If we build a feature cross from both these features:

[behavior type X time of day]

then we’ll end up with vastly more predictive ability than either feature on its own. For example, a dog crying (happily) at 5:00 pm when the owner returns from work will likely be a great positive predictor of owner satisfaction. Crying (miserably, perhaps) at 3:00 am when the owner was sleeping soundly will likely be a strong negative predictor of owner satisfaction.
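A rough sketch of this cross, assuming a hypothetical three-item behavior vocabulary and four hypothetical time-of-day buckets (neither is specified on the card). It shows that crying at 5:00 pm and crying at 3:00 am land on different crossed features, so a model can give them different weights.

```python
# Hypothetical vocabularies, chosen only for illustration.
BEHAVIORS = ["barking", "crying", "snuggling"]
TIME_BUCKETS = ["night", "morning", "afternoon", "evening"]


def time_bucket(hour):
    """Map an hour (0-23) to an assumed time-of-day bucket."""
    if hour < 6:
        return "night"
    if hour < 12:
        return "morning"
    if hour < 17:
        return "afternoon"
    return "evening"


def crossed_index(behavior, hour):
    """Index of the [behavior type X time of day] feature in a one-hot vector."""
    b = BEHAVIORS.index(behavior)
    t = TIME_BUCKETS.index(time_bucket(hour))
    return b * len(TIME_BUCKETS) + t


crying_at_5pm = crossed_index("crying", 17)  # crying X evening
crying_at_3am = crossed_index("crying", 3)   # crying X night

print(crying_at_5pm, crying_at_3am)  # 7 4
```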

7
Q

Different cities in California have markedly different housing prices. Suppose you must create a model to predict housing prices. Which sets of features or feature crosses could learn city-specific relationships between roomsPerPerson and housing price?

A

One feature cross: [binned latitude X binned longitude X binned roomsPerPerson]

Crossing binned latitude with binned longitude enables the model to learn city-specific effects of roomsPerPerson.

Binning prevents a change in latitude from producing the same result as a change in longitude. Depending on the granularity of the bins, this feature cross could learn city-specific, neighborhood-specific, or even block-specific effects.
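One way to sketch the three-way cross is to bucketize each raw value and combine the bucket numbers into a single index. The bin edges below are hypothetical and coarse; the card does not specify any, and finer edges would give neighborhood-level or block-level cells.

```python
from bisect import bisect_right

# Hypothetical bin edges (assumptions for illustration only).
LAT_EDGES = [34.0, 36.0, 38.0, 40.0]          # 5 latitude bins
LON_EDGES = [-122.0, -120.0, -118.0, -116.0]  # 5 longitude bins
ROOMS_EDGES = [0.5, 1.0, 2.0]                 # 4 roomsPerPerson bins


def bucketize(value, edges):
    """Return the bin number of value given sorted bin edges."""
    return bisect_right(edges, value)


def crossed_index(lat, lon, rooms_per_person):
    """Index of [binned latitude X binned longitude X binned roomsPerPerson]."""
    i = bucketize(lat, LAT_EDGES)
    j = bucketize(lon, LON_EDGES)
    k = bucketize(rooms_per_person, ROOMS_EDGES)
    n_lon = len(LON_EDGES) + 1
    n_rooms = len(ROOMS_EDGES) + 1
    return (i * n_lon + j) * n_rooms + k


# Same roomsPerPerson, two different locations: two different crossed features,
# so the model can learn a separate weight for each.
near_san_francisco = crossed_index(37.77, -122.42, 1.5)
near_los_angeles = crossed_index(34.05, -118.24, 1.5)

print(near_san_francisco, near_los_angeles)  # 42 30
```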

8
Q

What does FTRL stand for?

A

FTRL stands for Follow The Regularised Leader. It is an online optimization algorithm, often used to train logistic regression models on sparse, large-scale data.

Here is an implementation:
https://www.kaggle.com/jiweiliu/ftrl-starter-code
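For orientation, here is a minimal sketch of the per-coordinate FTRL-Proximal update for logistic regression, as commonly described in the literature. It is not the code at the link above; the class name, the default hyperparameters, and the sparse-dictionary input format are all assumptions made for illustration.

```python
import math


class FTRLProximal:
    """Sketch of per-coordinate FTRL-Proximal for logistic regression."""

    def __init__(self, alpha=0.1, beta=1.0, l1=0.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = {}  # per-feature accumulated (adjusted) gradients
        self.n = {}  # per-feature accumulated squared gradients

    def _weight(self, i):
        """Closed-form weight for feature i; L1 keeps small |z| at exactly 0."""
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        n = self.n.get(i, 0.0)
        return -(z - sign * self.l1) / ((self.beta + math.sqrt(n)) / self.alpha + self.l2)

    def predict(self, x):
        """x is a sparse example: {feature_name: value}."""
        s = sum(self._weight(i) * v for i, v in x.items())
        s = max(min(s, 35.0), -35.0)  # clip to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        """One online step on example x with label y in {0, 1}."""
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of log loss w.r.t. weight i
            n = self.n.get(i, 0.0)
            w = self._weight(i)
            sigma = (math.sqrt(n + g * g) - math.sqrt(n)) / self.alpha
            self.z[i] = self.z.get(i, 0.0) + g - sigma * w
            self.n[i] = n + g * g


# Hypothetical crossed features as sparse inputs.
model = FTRLProximal()
data = [({"crying_x_evening": 1.0}, 1), ({"crying_x_night": 1.0}, 0)]
for _ in range(20):
    for x, y in data:
        model.update(x, y)

p_evening = model.predict({"crying_x_evening": 1.0})
p_night = model.predict({"crying_x_night": 1.0})
```

Storing `z` and `n` per feature instead of the weights themselves is what lets the L1 term produce exact zeros, which suits the very sparse vectors that feature crosses create.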

9
Q

Explain a logistic function.

A

A logistic function or logistic curve is a common “S” shape (sigmoid curve), with equation:

f(x) = L / (1 + e^(-k(x - x0)))

where
e = the natural logarithm base (also known as Euler’s number),
x0 = the x-value of the sigmoid’s midpoint,
L = the curve’s maximum value, and
k = the steepness of the curve.[1]
For values of x in the domain of real numbers from −∞ to +∞, an S-curve is obtained, with the graph of f approaching L as x approaches +∞ and approaching zero as x approaches −∞.
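The definition above can be sketched in a few lines. The function name `logistic` and the default parameter values are assumptions; the two-branch form is only there to avoid floating-point overflow for very negative inputs.

```python
import math


def logistic(x, L=1.0, k=1.0, x0=0.0):
    """f(x) = L / (1 + e^(-k(x - x0))), evaluated in an overflow-safe way."""
    z = k * (x - x0)
    if z >= 0:
        return L / (1.0 + math.exp(-z))
    # For negative z, rewrite with e^z so the exponent never gets large.
    ez = math.exp(z)
    return L * ez / (1.0 + ez)


print(logistic(0.0))                       # 0.5, half of L at the midpoint
print(logistic(2.0, L=4.0, k=3.0, x0=2.0)) # 2.0, half of L at x = x0
print(logistic(1000.0), logistic(-1000.0)) # 1.0 0.0, the two asymptotes
```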

The function was named in 1844 (published 1845) by Pierre François Verhulst, who studied it in relation to population growth.[2] The initial stage of growth is approximately exponential (geometric); then, as saturation begins, the growth slows to linear (arithmetic), and at maturity, growth stops.

Verhulst did not explain the choice of the term “logistic” (French: logistique), but it follows his discussion of arithmetic growth and geometric growth (whose curve he calls a logarithmic curve, instead of the modern term exponential curve), and thus “logistic growth” is presumably named by analogy with arithmetic and geometric, logistic being from Ancient Greek: λογῐστῐκός, translit. logistikós, a traditional division of Greek mathematics, and in contrast to the logarithmic curve. The term is unrelated to the military and management term logistics, which is instead from French: logis “lodgings”, though some believe the Greek term also influenced logistics; see Logistics § Origin for details.
