8.1 Intro to Machine Learning Flashcards
Descriptive Analysis: Recap of Univariate and Bivariate Analysis
- Univariate Analysis examines one variable at a time (e.g., distribution, skewness, subgroups).
- Bivariate Analysis looks at relationships between two variables (e.g., correlation strength and direction).
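A minimal sketch of the two kinds of analysis, using only the Python standard library. The data (hours studied and exam scores) and the helper names `mean` and `pearson` are made up for illustration.

```python
from math import sqrt

# Hypothetical toy data: hours studied and exam score for five students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 70.0, 72.0]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Univariate: summarise one variable on its own.
mean_score = mean(scores)

# Bivariate: measure the strength and direction of a relationship.
r = pearson(hours, scores)

print(f"mean score = {mean_score}, correlation = {r:.3f}")
```

A value of `r` close to +1 indicates a strong positive relationship; close to -1, a strong negative one; near 0, little linear relationship.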
Why is predictive analysis important?
- Patterns in data might indicate future trends.
- Predictive analytics takes this a step further by using past data to train models that predict unknown values for new observations.
- This process is a fundamental part of Machine Learning, where algorithms learn from data to make informed predictions.
Put simply, what does predictive analytics help us understand?
- Predictive analytics helps us move beyond understanding “what has happened” to anticipating “what is likely to happen next” using machine learning models.
What are the input columns in a dataset called in machine learning?
- Variables (Independent Variables) – Inputs that might influence the target.
- Factors – Often used in categorical data (e.g., gender, yes/no responses).
- Features – A common term in machine learning for inputs used to make predictions.
What is the column that we want to predict called?
- Target Variable – What the model aims to predict.
- Predicted Variable – The value we want the model to generate.
- Dependent Variable – The output that depends on the independent variables.
What are rows in a dataset called in machine learning?
- Samples – Often used in statistics and machine learning.
- Observations – Common term in research and analysis.
- Records – Used in databases and spreadsheets.
- Instances – Common in machine learning, referring to each real-world example.
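The row and column terms can be made concrete with a small sketch. The dataset and the column names (`age`, `parent_height_cm`, `height_cm`) are hypothetical; `X` and `y` are the conventional names for the feature matrix and the target values.

```python
# Hypothetical dataset: each dict is one row
# (a sample / observation / record / instance).
rows = [
    {"age": 8, "parent_height_cm": 170, "height_cm": 128},
    {"age": 9, "parent_height_cm": 182, "height_cm": 139},
    {"age": 10, "parent_height_cm": 165, "height_cm": 136},
]

# Input columns: features (independent variables).
feature_names = ["age", "parent_height_cm"]

# The column we want to predict: target (dependent variable).
target_name = "height_cm"

# Split every row into its feature values and its target value.
X = [[row[name] for name in feature_names] for row in rows]
y = [row[target_name] for row in rows]

print("features of first sample:", X[0])
print("target of first sample:", y[0])
```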
What is Supervised Learning?
- We already have examples where we know the correct answer.
- The model learns from past data where inputs (features) are linked to known outputs (target).
- E.g., what height will my child be by age 10?
What is Regression in Machine Learning?
- A type of Supervised Learning where we predict a number (numeric value) based on labeled data.
Examples:
- Predicting how far a car can drive on a full tank.
- Estimating a child’s height at age 10.
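As a rough illustration of regression, the sketch below fits a straight line by ordinary least squares to the "how far on a full tank" example. The numbers are invented, and `fit_line` and `predict_distance` are hypothetical helper names; a real model would use more data and usually more features.

```python
# Hypothetical labeled data: tank size (liters) -> distance driven (km).
liters = [30.0, 40.0, 50.0, 60.0]
distance_km = [410.0, 570.0, 690.0, 850.0]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": learn the line from examples with known answers.
slope, intercept = fit_line(liters, distance_km)


def predict_distance(tank_liters):
    """Predict a number (km) for a new, unseen tank size."""
    return slope * tank_liters + intercept


print(f"predicted range for a 55 L tank: {predict_distance(55.0):.0f} km")
```

The output is a number on a continuous scale, which is what makes this regression and not classification.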
What is Unsupervised Learning?
- We do not have the correct answers (no labels).
- The model finds patterns, clusters, or groupings in the data by itself.
- E.g., how many objects are in an image?
What is Classification in Machine Learning?
- A type of Supervised Learning where we predict a category (label) based on labeled data.
Examples:
- Deciding what jacket to wear based on the weather.
- Determining if a person is infected (Yes/No).
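A minimal sketch of classification, using a 1-nearest-neighbour rule on the jacket example. The weather readings, the jacket labels, and the function names are all hypothetical, and nearest-neighbour is just one simple classifier chosen for illustration.

```python
# Hypothetical labeled data: (temperature in °C, rainfall in mm) -> jacket.
labeled_examples = [
    ((2.0, 0.0), "winter coat"),
    ((5.0, 1.0), "winter coat"),
    ((15.0, 0.0), "light jacket"),
    ((18.0, 6.0), "rain jacket"),
    ((12.0, 8.0), "rain jacket"),
    ((25.0, 0.0), "no jacket"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_jacket(features):
    """Return the label of the closest labeled example (1-nearest neighbour)."""
    _, nearest_label = min(
        labeled_examples,
        key=lambda example: squared_distance(example[0], features),
    )
    return nearest_label


print(predict_jacket((4.0, 0.0)))
print(predict_jacket((16.0, 7.0)))
```

The prediction is always one of the known categories, never a number in between.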
What is Clustering in Machine Learning?
- Clustering (Unsupervised Learning) means the machine groups things that are similar without being told what those groups are.
Examples:
- Finding groups of customers who respond well to marketing.
- Identifying different species of bacteria that are resistant to antibiotics.
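One way to picture clustering is a tiny k-means on a single feature. This is only a sketch: the customer spend values are invented, `kmeans_1d` is a hypothetical helper, and the evenly spaced starting centroids are a simplification (it assumes `k` is at least 2).

```python
# Hypothetical unlabeled data: monthly spend per customer (no group labels).
monthly_spend = [10.0, 12.0, 11.0, 50.0, 52.0, 49.0]


def kmeans_1d(values, k=2, iterations=10):
    """Very small k-means for one feature; returns (centroids, clusters)."""
    # Simple deterministic start: spread centroids evenly between min and max.
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


centroids, clusters = kmeans_1d(monthly_spend, k=2)
print("centroids:", centroids)
print("clusters:", clusters)
```

No labels are supplied anywhere; the two groups emerge from the similarity of the values alone.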
How do I remember the difference between Regression & Classification?
✅ Regression = Predicts numbers (How much? How far? How tall?)
✅ Classification = Predicts categories (Yes/No, Red/Blue, Disease/No Disease)
How do I remember the difference between Supervised & Unsupervised Learning?
✅ Supervised = We already have labeled data (we know the answers).
✅ Unsupervised = No labels, we find patterns & groupings.