Preparing Data for Feature Engineering and Machine Learning in Microsoft Azure Flashcards
What issue are you most likely to face with a credit card fraud detection dataset?
Problem of outliers
Problem of high-dimensionality
Problem of imbalanced data
Multicollinearity problem
Problem of imbalanced data
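A minimal sketch of why imbalance matters, using made-up labels (the 990-to-10 split is an assumption for illustration, not a figure from any real dataset). A baseline that never predicts fraud still scores very high accuracy while catching no fraud at all:

```python
from collections import Counter

# Hypothetical labels: 1 = fraud, 0 = legitimate (the ratio is assumed).
labels = [0] * 990 + [1] * 10

counts = Counter(labels)
fraud_rate = counts[1] / len(labels)

# A baseline that always predicts "legitimate" ignores fraud entirely.
predictions = [0] * len(labels)

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
recall = sum(p == 1 and y == 1 for p, y in zip(predictions, labels)) / counts[1]

print(f"fraud rate={fraud_rate:.2%} accuracy={accuracy:.2%} recall={recall:.2%}")
```

This is why accuracy alone is a poor metric for such a dataset; recall or precision on the minority class says more.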
What happens when we increase the amount of data for a machine learning problem?
A. The training accuracy increases, test accuracy decreases
B. The training accuracy increases, test accuracy increases
C. The training accuracy decreases, test accuracy decreases
D. The training accuracy decreases, test accuracy increases
D. The training accuracy decreases, test accuracy increases
You can safely delete records with missing values under which missingness assumption?
Missing at Random
Missing Completely at Random
Either of MCAR or MAR
Missing not at Random
Missing Completely at Random
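Deleting every record that has a missing value is listwise deletion. Under MCAR the missingness is unrelated to the data, so dropping those rows shrinks the sample without biasing it. A rough sketch with hypothetical records (the field names and values are invented):

```python
# Hypothetical records; None marks a missing value.
rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 29, "income": None},
    {"age": 41, "income": 48000},
]

# Listwise deletion: keep only rows where every field is present.
complete = [r for r in rows if all(v is not None for v in r.values())]

dropped = len(rows) - len(complete)
print(f"kept {len(complete)} rows, dropped {dropped}")
```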
Which is the best method to use to handle missing data if the feature has outliers?
Mode imputation
Mean imputation
Listwise deletion
Median imputation
Median imputation
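To see why the median is preferred here, compare the two fill values on a small hypothetical feature that contains one missing entry and one outlier (all numbers are invented for illustration):

```python
from statistics import mean, median

# Hypothetical feature: None is the missing value, 10_000 is the outlier.
values = [42, 38, None, 45, 40, 10_000]

observed = [v for v in values if v is not None]
mean_fill = mean(observed)      # pulled far upward by the outlier
median_fill = median(observed)  # stays close to the typical values

# Replace the missing entry with the median of the observed values.
imputed = [median_fill if v is None else v for v in values]

print(f"mean fill={mean_fill}, median fill={median_fill}")
```

The single outlier drags the mean far away from every ordinary value, while the median barely moves.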
Which of the following machine learning models does not use a target value?
Clustering
Anomaly detection
Regression
Classification
Clustering
Which of the following machine learning models predicts a continuous target value?
Regression
Classification
Anomaly detection
Clustering
Regression
Which of the following is the BEST way to create features for high-cardinality categorical data?
One-hot encoding
Learning with counts
Dummy coding
Binning
Learning with counts
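The idea behind count-based features is to replace each category value with a few statistics about how often it appears with each class, so the feature count stays fixed no matter how many distinct values exist. The sketch below illustrates that idea in plain Python; it is not the Azure module itself, and the names `rows` and `count_features` are hypothetical:

```python
from collections import defaultdict

# Hypothetical rows: (category value, binary label). Values are illustrative.
rows = [("m1", 1), ("m1", 0), ("m1", 0), ("m2", 0), ("m2", 0), ("m3", 1)]

# Count how often each category value co-occurs with each class.
counts = defaultdict(lambda: [0, 0])  # value -> [negatives, positives]
for value, label in rows:
    counts[value][label] += 1

def count_features(value):
    """Return (negatives, positives, positive rate) for one category value."""
    neg, pos = counts[value] if value in counts else (0, 0)
    total = neg + pos
    rate = pos / total if total else 0.0
    return neg, pos, rate

# Three numeric columns replace one column with arbitrarily many categories.
features = [count_features(value) for value, _ in rows]
```

One-hot encoding the same column would add one feature per distinct value, which is what makes it impractical at high cardinality.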
Which of the following is a disadvantage of linear models?
They run slower
They are not scalable
They may not give accurate predictions
They are harder to train
They may not give accurate predictions
What is TRUE about leave-one-out cross-validation?
It produces low bias and high variance models
It produces low bias and low variance models
It produces high bias and low variance models
It produces high bias and high variance models
It produces low bias and high variance models
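The usual argument for this answer is that each model trains on all but one sample (hence low bias), while the training sets overlap almost completely, so the per-fold results are highly correlated and their average is a high-variance estimate. A small sketch of the splits, with a hypothetical helper `leave_one_out_splits`:

```python
def leave_one_out_splits(n_samples):
    """Yield (train_indices, test_index) pairs, one per sample."""
    for test_index in range(n_samples):
        train = [i for i in range(n_samples) if i != test_index]
        yield train, test_index

splits = list(leave_one_out_splits(5))

# Every training set holds n - 1 samples.
train_sizes = [len(train) for train, _ in splits]

# Any two training sets differ in only one sample each.
overlap = set(splits[0][0]) & set(splits[1][0])
```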
Suppose you need to create 7 folds for k-fold cross-validation. How would you do it?
Use Partition and Sample module with ‘Assign to folds’ mode
Use Partition and Sample module with ‘Pick folds’ mode
Use Split data module to assign folds
Use the Cross-validate model module
Use Partition and Sample module with ‘Assign to folds’ mode
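Conceptually, assigning rows to folds means giving each row a fold number so that the folds are roughly equal in size. The sketch below shows one common way to do that (shuffle, then deal rows out round-robin). It is an illustration in plain Python under that assumption, not a description of how the Azure module is implemented, and `assign_to_folds` is a hypothetical name:

```python
import random

def assign_to_folds(n_rows, n_folds=7, seed=0):
    """Return a fold number (0..n_folds-1) for each row, in near-equal sizes."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # randomize row order reproducibly
    fold_of = [None] * n_rows
    for position, row in enumerate(indices):
        fold_of[row] = position % n_folds  # deal rows out round-robin
    return fold_of

folds = assign_to_folds(100, n_folds=7)
sizes = [folds.count(k) for k in range(7)]
print(sizes)
```

Each fold then serves once as the validation set while the other six are used for training.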