L7: Supervised machine learning Flashcards
What is machine learning?
A branch of AI
What does a machine learn from?
* Learning from data
* Discovering hidden patterns
* Essential for data-driven decisions
What questions can we ask?
Predictive
Examples of applied ML
Examples:
* Credit card fraud detection in financial institutions
* Recommendation systems on websites for personalization
* Customer segmentation for marketing strategies
* Customer churn prediction to foresee service cancellations
* Predictive maintenance in manufacturing companies
* Sentiment analysis of social media data
* Health diagnosis to aid doctors
ML Pipeline
Acquire -> Prepare -> Analyze -> Report -> Act
ML Pipeline step 1: Acquire data
- Identify data sources: Check the question that needs to be addressed
- Collect data: Record the necessary data
- Integrate data (data wrangling): Merge/join data, if needed
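The "integrate data" step can be sketched as a join on a shared key. This is a minimal illustration in plain Python, not a prescribed method: the records, the `customer_id` key, and the `inner_join` helper are all hypothetical, and the helper assumes each key appears at most once in the right-hand source.

```python
# Hypothetical records from two data sources, linked by customer_id.
customers = [
    {"customer_id": 1, "region": "North"},
    {"customer_id": 2, "region": "South"},
    {"customer_id": 3, "region": "East"},
]
purchases = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 3, "amount": 75.5},
]


def inner_join(left, right, key):
    """Merge two lists of dicts, keeping only rows whose key is in both.

    Assumes each key value occurs at most once in `right`.
    """
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


merged = inner_join(customers, purchases, "customer_id")
print(merged)
```

Customer 2 has no purchase record, so an inner join drops that row. Whether to drop or keep unmatched rows depends on the question being addressed.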
ML Pipeline step 2: Prepare data
- Explore: Understand your data, e.g.,
  - Check the structure and variable types
  - Check for outliers, missing values, etc.
- Pre-process: Prepare your data for analysis, e.g.,
  - Clean (missing values, mistakes, etc.)
  - Feature selection (e.g., combine, remove, add)
  - Feature transformation (e.g., scaling, dimensionality reduction, filtering)
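Two of the pre-processing ideas above, cleaning missing values and scaling, can be sketched in a few lines. This is one possible approach under stated assumptions: the `ages` values are made up, `None` is assumed to mark a missing value, and mean imputation and min-max scaling are just common choices, not the only ones.

```python
from statistics import mean

# Hypothetical feature column; None marks a missing value.
ages = [22, None, 35, 48, None, 61]


def impute_mean(values):
    """Replace missing entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


cleaned = impute_mean(ages)
scaled = min_max_scale(cleaned)
print(cleaned)
print(scaled)
```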
ML Pipeline step 3: Analyze data
- Select analytical techniques
- Build models
- Assess results
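A rough sketch of "build models" and "assess results": hold back part of the data, fit on the rest, and measure the error on the held-back part. Everything here is an illustrative assumption, including the made-up `targets`, the 25% test fraction, and the trivial "predict the training mean" baseline model.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test) parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical outcome values.
targets = [10.0, 12.0, 11.0, 13.0, 9.0, 14.0, 10.5, 12.5]

train, test = train_test_split(targets)

# Baseline "model": always predict the mean of the training targets.
baseline = sum(train) / len(train)

mae = mean_absolute_error(test, [baseline] * len(test))
print(f"baseline prediction: {baseline:.2f}, test MAE: {mae:.2f}")
```

A real model should beat this baseline on the held-out data; if it does not, the analytical technique or the features probably need revisiting.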
ML Pipeline step 4: Report results
- Communicate results
- Recommend actions
ML Pipeline step 5: Act
- Apply the results
- Implement, maintain, and assess the impact
Supervised ML
The goal is to estimate a model from a selection of input variables that gives
the best estimate of the target (i.e., the outcome variable). It predicts something
we have seen before (i.e., data labels guide the learning process).
Requires:
* A range of input variables
* An outcome variable
Data labels
The process of adding informative labels or tags to our data
* Think of it as the “ground truth” for the target variable/outcome variable
* Necessary for a supervised ML algorithm
Types of supervised ML
Regression and classification
Regression
Given input variables, predict a numeric (continuous) value.
Examples:
* Estimate average house price for a region
* Determine demand for a new product
* Predict power usage
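As a small illustration of regression, the sketch below fits a straight line by ordinary least squares and uses it to predict a numeric value. The data (floor area against price) is invented for the example, and `fit_line` is a hypothetical helper, not part of any library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical labeled data: floor area (input) and house price (numeric target).
areas = [50, 70, 90, 110]
prices = [150, 210, 270, 330]

slope, intercept = fit_line(areas, prices)

# Predict the price for an unseen floor area.
predicted = slope * 100 + intercept
print(slope, intercept, predicted)
```

The target is a continuous number, which is what makes this a regression problem.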
Classification
Given input variables, predict a categorical variable.
Examples:
* Predict if it will rain tomorrow
* Determine if a loan application is high-, medium-, or low-risk
* Identify sentiment as positive, negative, or neutral
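As a small illustration of classification, the sketch below predicts a category by copying the label of the closest training example (a 1-nearest-neighbour rule). The weather readings and the `predict_1nn` helper are hypothetical, and the features are left unscaled for brevity, which would usually be addressed during data preparation.

```python
import math


def predict_1nn(train_X, train_y, x):
    """Return the label of the training point closest to x (Euclidean distance)."""
    distances = [math.dist(row, x) for row in train_X]
    nearest = distances.index(min(distances))
    return train_y[nearest]


# Hypothetical labeled data: (humidity %, pressure hPa) and a categorical target.
train_X = [(85, 1002), (90, 998), (40, 1020), (35, 1025)]
train_y = ["rain", "rain", "no rain", "no rain"]

humid_day = predict_1nn(train_X, train_y, (80, 1005))
dry_day = predict_1nn(train_X, train_y, (38, 1022))
print(humid_day, dry_day)
```

The target is one of a fixed set of categories, which is what makes this a classification problem.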