General Machine Learning & Ethics Flashcards
What is the difference between supervised and unsupervised learning?
Supervised learning involves training a model on labeled data, while unsupervised learning deals with unlabeled data and seeks to find patterns or structures within it.
What is Machine Learning and what does it do?
Machine Learning is the development of algorithms and statistical models that enable computers to learn from data and make predictions or decisions without being explicitly programmed.
Supervised ML
Uses labeled datasets to train algorithms to classify or predict outcomes.
What is required for Supervised ML?
You need labeled data.
Unsupervised ML
Uses algorithms to analyze and cluster unlabeled datasets.
Once an algorithm is deployed, ____ learning will manage data as it comes in and classify or analyze it.
Unsupervised
When is Linear Regression used?
Linear regression models are used when the result must be a continuous variable.
Ex. predicting rainfall amounts in inches.
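A minimal sketch of the rainfall example, assuming a single predictor (humidity) and a handful of made-up training points. The data, the choice of humidity as the predictor, and the names `fit_line` and `predict_rainfall` are hypothetical and only illustrate that the output is a continuous number.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled data: humidity (%) and observed rainfall (inches).
humidity = [50, 60, 70, 80, 90]
rainfall = [0.5, 1.0, 1.5, 2.0, 2.5]

slope, intercept = fit_line(humidity, rainfall)


def predict_rainfall(h):
    """Predict rainfall in inches (a continuous value) for humidity h."""
    return slope * h + intercept


print(predict_rainfall(75))
```

The prediction can be any real number, which is what distinguishes regression from classification.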
Supervised ML - Classification
Classification models deliver results as a categorical variable, which can take only a finite set of values.
Ex. two possible results: Will Rain or Won’t Rain.
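A minimal sketch of the rain example as a classifier, assuming one made-up feature (humidity) and a nearest-centroid rule. The data and the names `fit_centroids` and `classify` are hypothetical. Nearest-centroid is just one simple classification technique chosen for illustration.

```python
def fit_centroids(values, labels):
    """Learn the mean feature value (centroid) for each class label."""
    totals = {}
    counts = {}
    for value, label in zip(values, labels):
        totals[label] = totals.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}


def classify(value, centroids):
    """Return the label whose centroid is closest to the given value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


# Hypothetical labeled data: humidity (%) and the observed outcome.
humidity = [40, 45, 50, 80, 85, 90]
outcome = ["Won't Rain", "Won't Rain", "Won't Rain",
           "Will Rain", "Will Rain", "Will Rain"]

centroids = fit_centroids(humidity, outcome)

print(classify(88, centroids))
print(classify(42, centroids))
```

Whatever the input, the result is always one of the finite set of labels seen in training.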
What are the 3 main goals of the analyze stage in machine learning?
- Understand the response variables and how they are structured: continuous or categorical?
- Explore the predictor variables.
- Perform feature engineering.
Will my machine learning model change over time?
For a model to predict accurately, the data it makes predictions on must have a distribution similar to that of the data on which the model was trained.
Because data distributions can be expected to drift over time, deploying a model is not a one-time exercise but rather a continuous process.
Continuously monitoring incoming data helps you decide when to retrain your model on newer data: retrain if the data distribution has deviated significantly from the original training data distribution.
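A rough sketch of what such monitoring might look like for a single numeric feature. It assumes drift can be approximated by how far the mean of an incoming batch has moved from the training mean, measured in training standard deviations. The function names, the sample data, and the threshold of 2.0 are hypothetical. Production systems often compare whole distributions rather than only the mean.

```python
from statistics import mean, stdev


def drift_score(training, incoming):
    """Shift of the incoming mean, in units of training standard deviation."""
    return abs(mean(incoming) - mean(training)) / stdev(training)


def needs_retraining(training, incoming, threshold=2.0):
    """Flag a batch whose mean has moved more than `threshold` deviations.

    The threshold is an assumed value for illustration, not a standard.
    """
    return drift_score(training, incoming) > threshold


# Hypothetical feature values seen at training time and after deployment.
training_values = [10, 12, 11, 13, 9, 10, 12, 11]
similar_batch = [11, 10, 12, 11]
drifted_batch = [18, 20, 19, 21]

print(needs_retraining(training_values, similar_batch))
print(needs_retraining(training_values, drifted_batch))
```

Running a check like this on every new batch turns deployment into the continuous process the card describes.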
How can I determine that machine learning is the right solution?
- Requires complex logic
- Requires scalability
- Requires personalization
- Requires responsiveness
What are the reasons to NOT use machine learning?
- Can be solved with traditional algorithms
- Does not require adapting to new data
- Requires 100% accuracy
- Requires full interpretability
Is my data ready for a machine learning solution?
- Is it easily accessible?
- Does it respect privacy?
- Is it relevant?
What is popularity bias in the context of machine learning?
Popularity bias refers to the phenomenon where more popular items are recommended more frequently by a system, often overlooking other items that could be just as pleasing to users.
Why is it important for data professionals to prioritize fairness in their data?
- It reduces the potential for unintended consequences of machine learning applications, including the perpetuation of human biases.
- It is part of responsible data stewardship.