Raw Data to Features Flashcards
1
Q
List types of features.
A
Features can be:
- numerical
- categorical
- bucketized
- crossed
- hashed
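As a rough illustration, the five feature types above can be sketched in plain Python. The record, vocabulary, bucket boundaries, and helper names (`one_hot`, `bucketize`, `cross`, `hash_bucket`) are hypothetical, chosen only for this example; a real pipeline would normally rely on a library's own feature utilities.

```python
import bisect
import zlib

def one_hot(value, vocabulary):
    """Categorical feature: one slot per known vocabulary entry."""
    return [1.0 if value == v else 0.0 for v in vocabulary]

def bucketize(value, boundaries):
    """Bucketized feature: index of the range a numeric value falls into."""
    return bisect.bisect_right(boundaries, value)

def cross(a, b):
    """Crossed feature: combine two categorical values into a single one."""
    return f"{a}_x_{b}"

def hash_bucket(value, num_buckets):
    """Hashed feature: map a string to one of num_buckets slots."""
    # zlib.crc32 is stable across runs, unlike the built-in hash() for strings.
    return zlib.crc32(value.encode("utf-8")) % num_buckets

# Hypothetical raw record, invented for this sketch.
record = {"sqft": 1250.0, "city": "Austin", "age_years": 37, "zip": "78701"}

age_bucket = bucketize(record["age_years"], [18, 30, 50, 65])

features = {
    "sqft": float(record["sqft"]),                                    # numerical
    "city": one_hot(record["city"], ["Austin", "Boston", "Denver"]),  # categorical
    "age_bucket": age_bucket,                                         # bucketized
    "city_x_age": cross(record["city"], age_bucket),                  # crossed
    "zip_hash": hash_bucket(record["zip"], 100),                      # hashed
}
```

Hashing trades accuracy for convenience: no vocabulary is needed, but two different values can collide in the same bucket.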
2
Q
What makes a good feature?
A
- It is related to the objective.
- It is known at prediction time.
- It is numeric.
- It has enough examples.
- It reflects human insight into the problem.
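The "enough examples" criterion is the easiest one to check mechanically. The sketch below flags categorical values that appear too rarely in the training data. The helper name `rare_values`, the sample data, and the threshold of 5 are assumptions made for illustration, not a rule taken from this card.

```python
from collections import Counter

def rare_values(values, min_examples):
    """Return the feature values seen fewer than min_examples times."""
    counts = Counter(values)
    return sorted(v for v, c in counts.items() if c < min_examples)

# Hypothetical column of a categorical feature.
cities = ["Austin"] * 6 + ["Boston"] * 5 + ["Denver"] * 2

# Threshold of 5 is an assumed rule of thumb for this sketch.
rare = rare_values(cities, min_examples=5)
print(rare)
```

Values flagged this way are candidates for grouping into an "other" bucket or for collecting more data.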
3
Q
What is data dredging?
A
Data dredging means searching a large data set for whatever correlations happen to exist and treating them as meaningful. The larger the data set, and especially the more variables it contains, the more likely it is to contain spurious correlations that arise purely by chance.
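A small simulation can make this concrete. The sketch below generates a random target and many random features that are unrelated to it by construction, then reports the strongest correlation found among the first few features and among all of them. The sizes (30 rows, 2000 features), the seed, and the helper name `pearson` are arbitrary choices for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
num_rows, num_features = 30, 2000

# The target and every feature are independent noise: any correlation is spurious.
target = [rng.gauss(0, 1) for _ in range(num_rows)]
correlations = []
for _ in range(num_features):
    feature = [rng.gauss(0, 1) for _ in range(num_rows)]
    correlations.append(abs(pearson(feature, target)))

few = max(correlations[:5])   # best match after looking at 5 features
many = max(correlations)      # best match after looking at all 2000

print(f"strongest |r| among 5 features:    {few:.2f}")
print(f"strongest |r| among 2000 features: {many:.2f}")
```

Because the second search includes the first, its best correlation can only be equal or larger. Looking at more candidates makes a strong-looking match more likely, even though none of the features carries any real signal.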