How to Select Categorical Input Features: Encoding and K-best Flashcards
DOES PANDAS TRY TO MAP SOME STR INPUTS TO NUMERICAL VALUES IN THE DATASET? IF YES, WHAT SHOULD WE DO ABOUT IT? P137
Yes. Pandas infers column types when it reads a file, so a categorical column whose values are written as digits can be loaded as numbers. After reading the dataset file, it's better to convert such columns back to str.
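The conversion can be sketched as follows. The CSV text and the column names zip_code and color are made up for illustration; only read_csv and astype come from pandas itself.

```python
import io

import pandas as pd

# Hypothetical CSV: "zip_code" is a category that happens to be written with digits.
csv_text = "zip_code,color\n10001,red\n94105,blue\n10001,green\n"

df = pd.read_csv(io.StringIO(csv_text))
inferred_dtype = df["zip_code"].dtype  # pandas infers an integer dtype here

# Convert the column back to strings so it is treated as categorical.
df["zip_code"] = df["zip_code"].astype(str)
first_value = df["zip_code"].iloc[0]
```

Passing dtype=str to read_csv is another option when every column should stay a string; it skips the inference step, so values with leading zeros are also preserved.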
DOES THE ORDINALENCODER IN SCIKIT-LEARN ALLOW SPECIFYING THE ORDER OF CATEGORIES? P139
Yes, it does, through its categories argument.
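A minimal sketch of specifying the order, assuming a made-up feature whose levels are low, medium and high. The categories argument takes one list per input column, in the order the integers should follow.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical feature with a natural order: low < medium < high.
X = np.array([["medium"], ["low"], ["high"], ["low"]], dtype=object)

# One list of categories per input column, in the desired order.
encoder = OrdinalEncoder(categories=[["low", "medium", "high"]])
X_encoded = encoder.fit_transform(X)  # low -> 0, medium -> 1, high -> 2
```

Without the categories argument, string categories are encoded in sorted order instead, which here would put high before low.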
WHAT IS THE DIFFERENCE BETWEEN ORDINALENCODER AND LABELENCODER? P139
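LabelEncoder encodes a single variable (a 1-D array, typically the target), so you can't pass a DataFrame to it. OrdinalEncoder encodes a 2-D array of input features, so it accepts a DataFrame.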
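The difference in input shape can be seen in a small sketch. The toy columns color and size and the yes/no target are assumptions made for illustration.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

# Hypothetical toy data: two categorical input features and one target.
X = pd.DataFrame({"color": ["red", "blue", "red"], "size": ["S", "L", "M"]})
y = pd.Series(["yes", "no", "yes"])

# OrdinalEncoder accepts the whole 2-D feature table and encodes each column.
X_encoded = OrdinalEncoder().fit_transform(X)

# LabelEncoder accepts a single 1-D variable, typically the target.
y_encoded = LabelEncoder().fit_transform(y)
```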
WHEN USING K-BEST FOR CHI2 TEST, WHAT DO HIGHER SCORES INDICATE? P141
A stronger dependence between the feature and the target. The score is the chi-squared statistic itself, not a p-value.
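A minimal sketch of reading the scores, assuming made-up ordinal-encoded data in which the first column follows the target and the second does not. chi2 requires non-negative feature values.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical encoded features: column 0 matches the target,
# column 1 alternates independently of it.
X = np.array([[0, 1], [0, 0], [0, 1], [0, 0], [1, 1], [1, 0], [1, 1], [1, 0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

selector = SelectKBest(score_func=chi2, k=1)
X_selected = selector.fit_transform(X, y)

scores = selector.scores_      # chi-squared statistics, higher = stronger dependence
p_values = selector.pvalues_   # p-values are stored separately from the scores
selected_mask = selector.get_support()
```

Here the first column receives the higher score and is the one kept when k=1.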
HOW CAN WE USE MUTUAL INFO IN K-BEST FOR CLASSIFICATION PROBLEMS? P152
By setting the “score_func” parameter of SelectKBest to mutual_info_classif.
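A minimal sketch, again on made-up encoded data. Wrapping mutual_info_classif in functools.partial with discrete_features=True is a choice made for this example, so that the encoded columns are treated as categorical; passing mutual_info_classif directly also works.

```python
from functools import partial

import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Hypothetical encoded features: column 0 matches the target,
# column 1 alternates independently of it.
X = np.array([[0, 1], [0, 0], [0, 1], [0, 0], [1, 1], [1, 0], [1, 1], [1, 0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Assumption for this sketch: treat every column as a discrete feature.
score_func = partial(mutual_info_classif, discrete_features=True)

selector = SelectKBest(score_func=score_func, k=1)
X_selected = selector.fit_transform(X, y)

scores = selector.scores_  # estimated mutual information per feature
selected_mask = selector.get_support()
```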