Feature Engineering Flashcards
What are the 3 main goals of the analyze stage in machine learning?
- Understand the response variable and how it's structured: is it continuous or categorical?
- Explore the predictor variables.
- Feature engineering
What are the 3 general categories of Feature Engineering?
Feature selection, feature extraction, and feature transformation.
Explain the concept of feature engineering and its importance in machine learning.
Feature engineering involves selecting, transforming, and creating relevant features from raw data to improve model performance. It plays a crucial role in enhancing model accuracy.
What is feature engineering in machine learning?
Feature engineering is the process of using practical, statistical, and data science knowledge to select, transform, or extract characteristics, properties, and attributes from raw data for the construction of machine learning models.
How does feature engineering improve a model's performance?
- By resolving issues in the data's structure.
- Well-structured data helps the model detect predictive signals more effectively.
Is feature engineering dependent on the type of data used?
Yes, the process of feature engineering is highly dependent on the type of data you’re working with.
What is the difference between feature engineering and Exploratory Data Analysis (EDA)?
- Feature engineering goes a step beyond EDA.
- EDA involves exploring the data
- feature engineering involves selecting, extracting, or transforming variables or features from datasets for the construction of machine learning models.
What is the goal of feature selection in machine learning?
To select the features in the data that contribute the most to predicting your response variable. This usually involves dropping features that do not help in making a prediction.
What does feature transformation involve in machine learning?
- modifying the existing features
- to improve accuracy when training the model.
- This could involve changing the data from numerical to categorical or creating new categories based on the data.
Can you give an example of feature transformation?
For instance, suppose your data includes exact temperatures, but you only need a feature that indicates whether it's hot, cold, or temperate. To make that transformation, you could define cutoff points for the data, such as classifying anything above 80°F as hot, anything below 70°F as cold, and anything in between as temperate.
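The cutoff-based transformation above can be sketched with pandas; the DataFrame, column names, and sample values here are hypothetical:

```python
import pandas as pd

# Hypothetical temperature readings in °F (illustrative data only)
df = pd.DataFrame({"temp_f": [65.0, 72.5, 85.1, 90.0, 68.9, 77.3]})

# Transform the continuous temperature into three categories using the
# cutoffs from the card: below 70 -> cold, 70-80 -> temperate, above 80 -> hot
df["temp_category"] = pd.cut(
    df["temp_f"],
    bins=[float("-inf"), 70, 80, float("inf")],
    labels=["cold", "temperate", "hot"],
)
print(list(df["temp_category"]))
```

`pd.cut` handles the binning in one step, which avoids hand-written chains of comparisons and keeps the cutoffs easy to change later.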
What is feature extraction in machine learning?
taking multiple features to create a new one to improve the accuracy of the algorithm.
For example, creating a new variable that becomes true if the temperature is warm and the humidity is high, and false otherwise.
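A minimal sketch of that extraction, assuming a hypothetical DataFrame with temperature and humidity columns and illustrative thresholds for "warm" and "humid":

```python
import pandas as pd

# Hypothetical weather data; column names and thresholds are assumptions
df = pd.DataFrame({
    "temp_f": [85, 62, 78, 91],
    "humidity_pct": [80, 30, 75, 40],
})

# Extract a new boolean feature from two existing ones, per the card:
# True when it is warm AND humid, False otherwise
df["warm_and_humid"] = (df["temp_f"] > 75) & (df["humidity_pct"] > 70)
print(df["warm_and_humid"].tolist())
```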
How does feature extraction benefit machine learning models?
by creating new features that capture important information in a format that’s more understandable for the model, potentially leading to improved accuracy.
What is Feature Selection in machine learning?
Feature Selection is the process of picking variables from a dataset that will be used as predictor variables for a model.
The goal is to keep the predictive and interactive features and exclude the redundant and irrelevant ones, in order to improve model performance.
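One simple, filter-style way to drop irrelevant features is to rank each predictor by its correlation with the response. This sketch uses synthetic data and a hypothetical 0.3 threshold; note it catches the irrelevant feature but not the redundant one, which would need a second, pairwise check:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200

# Synthetic dataset: one predictive feature, one irrelevant, one redundant
x1 = rng.normal(size=n)                        # predictive
noise = rng.normal(size=n)                     # irrelevant: pure noise
x1_copy = 2 * x1 + 0.01 * rng.normal(size=n)   # redundant: near-duplicate of x1
y = 3 * x1 + rng.normal(scale=0.5, size=n)     # response driven by x1

df = pd.DataFrame({"x1": x1, "noise": noise, "x1_copy": x1_copy, "y": y})

# Rank features by absolute correlation with the response and keep
# those above the (hypothetical) threshold
corr = df.drop(columns="y").corrwith(df["y"]).abs()
selected = corr[corr > 0.3].index.tolist()
print(selected)  # 'noise' is filtered out; x1 and its near-duplicate both survive
```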
What are the three types of features in Feature Selection?
- Predictive
- Interactive
- Irrelevant
How does Feature Selection fit into the PACE workflow?
Feature Selection occurs at multiple stages of the PACE workflow.
- Plan phase: define your problem and decide on a target variable to predict.
- Analyze phase: after exploratory data analysis, it may become clear that some features are not suitable for modeling.
- Construct phase: the goal is to find the smallest set of predictive features that still results in good overall model performance.