Data Science MODULE 5 Flashcards
In machine learning we describe the learning of the target function from training data as
Inductive learning
What is feature selection?
Methods employed to reduce the number of input variables to those believed to be most useful to a model
Unsupervised feature selection?
Ignores the target variable
Supervised feature selection
Uses the target variable in the selection process
Wrapper feature selection
Basically tries different subsets of inputs from your data to determine which gives the best model fit
Filter feature selection
As I see it, basically correlations between individual features and the response. The best ones are used to train the model
Third type of feature selection?
They call it intrinsic; a tree-based model is a very good example of this
Is dimensionality reduction a feature selection method?
Not really, because new features are actually created from the original inputs
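A minimal sketch of the distinction, using scikit-learn's PCA on the iris data (an illustrative choice): the output columns are new linear combinations of the inputs, not a subset of them.

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)
pca = PCA(n_components=2)    # derive 2 new features from the 4 original inputs
X_new = pca.fit_transform(X)
print(X_new.shape)           # (150, 2): new columns, not a subset of the originals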
Three types of categorical data
Nominal (r,g,b)
Ordinal (1st, 2nd, 3rd)
Boolean (true and false)
Filter feature selection
Numerical input - numerical output
Pearson
Spearman
Filter feature selection
Numerical input - categorical output
ANOVA
Kendall
Filter feature selection
Categorical input - categorical output
Chi-squared
Mutual info
Difference between Pearson and Spearman?
Pearson for linear relationships
Spearman for non-linear (monotonic, rank-based) relationships
Difference between ANOVA and Kendall?
ANOVA for linear
Kendall for non-linear
scikit-learn functions for each statistic:
Pearson: f_regression()
ANOVA: f_classif()
Chi-squared: chi2()
Mutual info: mutual_info_classif() and mutual_info_regression()
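A short sketch (synthetic data, purely illustrative) of what these scoring functions return: one score and one p-value per feature, which a selection method can then rank.

from sklearn.datasets import make_regression
from sklearn.feature_selection import f_regression

X, y = make_regression(n_samples=100, n_features=5, n_informative=2, random_state=0)
scores, p_values = f_regression(X, y)   # one F-statistic and p-value per input feature
print(scores)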
Functions in SciPy (scipy.stats)
kendalltau()
spearmanr()
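A quick sketch (made-up numbers) showing why these rank statistics suit non-linear but monotonic data:

from scipy.stats import kendalltau, spearmanr

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]            # non-linear but perfectly monotonic
rho, p_rho = spearmanr(x, y)     # rank correlation: 1.0 despite the non-linearity
tau, p_tau = kendalltau(x, y)    # Kendall's rank correlation: also 1.0
print(rho, tau)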
Selection methods in Python (sklearn.feature_selection)
SelectKBest
SelectPercentile
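A minimal sketch pairing a selection method with one of the scoring functions above (iris data as an illustrative choice):

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)
selector = SelectKBest(score_func=f_classif, k=2)   # keep the 2 highest-scoring features
X_selected = selector.fit_transform(X, y)
print(X_selected.shape)                             # (150, 2)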
Using regularisation to constrain a neural network amounts to modifying the...?
Objective function, by adding a penalty term
So in very simple terms, what is regularisation?
When the weights become too large, they are penalised
How is the degree of regularisation controlled?
With a scaling factor, lambda
L1 regularisation?
Tends to drive weights to zero, breaking connections
L2 regularisation is also known as?
Weight decay
Large weights are penalised more than smaller weights
The larger the factor, the closer the weights move to zero
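A plain NumPy sketch (toy numbers, not the module's code) of how each penalty term modifies the objective function:

import numpy as np

w = np.array([0.5, -2.0, 3.0])   # example weight vector
lam = 0.01                       # lambda, the regularisation scaling factor
data_loss = 1.0                  # placeholder for the unregularised loss

l1 = data_loss + lam * np.sum(np.abs(w))   # L1: drives weights to exactly zero
l2 = data_loss + lam * np.sum(w ** 2)      # L2 (weight decay): large weights cost more
print(l1, l2)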
Between L1 and L2, which one is computationally more efficient?
L2
L1 is not supported by MLPRegressor
Correct; use sknn.mlp if you need it
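In scikit-learn, MLPRegressor exposes its (L2-only) penalty through the alpha parameter; a minimal sketch with synthetic data:

from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=200, n_features=5, random_state=0)
model = MLPRegressor(alpha=0.001, max_iter=2000, random_state=0)   # alpha sets the L2 penalty strength
model.fit(X, y)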
So with cross-validation and neural networks, which hyperparameter is tuned?
Lambda, also known as the scaling factor
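A sketch of tuning it with cross-validation (scikit-learn names the scaling factor alpha rather than lambda; the candidate values are illustrative):

from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=200, n_features=5, random_state=0)
grid = {"alpha": [0.0001, 0.001, 0.01, 0.1]}   # candidate scaling factors
search = GridSearchCV(MLPRegressor(max_iter=2000, random_state=0), grid, cv=5)
search.fit(X, y)
print(search.best_params_)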