Planning a Machine Learning Project Flashcards
The Planning a Machine Learning Project course introduces the requirements for determining whether ML is the appropriate solution to a business problem. The course is intended for business leaders and other decision-makers who are currently or potentially involved in ML projects.
How can I determine that machine learning is the right solution?
ML may be the right solution when the business problem is clear and quantifiable. In that case, the value of a model’s predictions can be measured against specific business objectives and success criteria.
What are the reasons to use machine learning?
An example of a business problem where the use of ML would be appropriate is generating personalized recommendations. In this case, the solution to the problem requires complex logic, and we would want to provide personalized recommendations at scale with quick turnaround times.
Requires complex logic
Since developing personalized recommendations requires complex logic, ML is an appropriate tool to consider.
Requires scalability
Serving millions of requests for personalized recommendations every second is a challenge.
Requires personalization
Delivering personalized recommendations at scale and being responsive at the same time is difficult to achieve with classical programming techniques.
Requires responsiveness
The ability to deliver personalized recommendations within a few seconds even while handling millions of requests per second is expected.
What are the reasons to NOT use machine learning?
Consider avoiding ML when traditional methods and rules are viable options, when there is little or no need to adapt to new data, when business goals require 100% accuracy, or when the model's behavior must be fully explainable.
Can be solved with traditional algorithms
If the problem is not overly complex, an ML solution might be overcomplicated.
Does not require adapting to new data
If data and conditions are not changing, a more traditional approach could be more appropriate.
Requires 100% accuracy
ML predictions often provide less than 100% accuracy.
Requires full interpretability
If explaining how a change in parameters or inputs will affect the outcome is a priority, ML might not be the best solution.
What is an example business case for machine learning?
Consider a financial institution that needs to determine which category of products and offerings is most interesting to a customer. The problem might not be effectively solved using simple hand-coded rules since the outcome might depend on many factors and overlapping rules. ML could solve this problem.
How can I identify a good problem to solve with ML?
- What is the strategy to achieve this goal?
- How could you use machine learning to achieve this goal?
- What aspects of the problem make it a good fit to apply ML?
Is my data ready for a machine learning solution?
Data readiness depends on the quality, quantity, diversity, and complexity of the data collected. After discovering and collecting all relevant data, the data should be cleansed, validated, transformed, and stored.
Data should exist for training and model development and should not require significant preparation before use.
Data should be in a reachable, on-demand state with access to store, retrieve, move, modify, or copy data from one place to another.
Should my data be used?
Personally identifiable information, such as citizenship or health information, might be labeled private and protected by privacy laws.
Industry regulations, government laws, and compliance policies determine the importance of various data types, and determine what and how data can be processed, stored, managed, or shared.
Is my data high quality?
Data used in an ML project should be relevant to produce valuable results. Data should be timely so that training data is as close to the actual data as possible. Data should be representative of the data across all data sources. Data selection should be unbiased.
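The timeliness and representativeness checks described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the segment labels, the expected shares, and the thresholds are all hypothetical, and a real project would choose them from its own data sources and business rules.

```python
from collections import Counter
from datetime import date


def stale_fraction(record_dates, today, max_age_days):
    """Fraction of records older than max_age_days (a simple timeliness check)."""
    stale = sum(1 for d in record_dates if (today - d).days > max_age_days)
    return stale / len(record_dates)


def representation_gaps(sample_labels, expected_shares, tolerance):
    """Segments whose share in the sample differs from the expected share
    by more than tolerance (a simple representativeness check)."""
    counts = Counter(sample_labels)
    total = len(sample_labels)
    gaps = {}
    for segment, expected in expected_shares.items():
        observed = counts.get(segment, 0) / total
        if abs(observed - expected) > tolerance:
            gaps[segment] = round(observed - expected, 3)
    return gaps


# Hypothetical example data, invented for illustration only.
dates = [date(2024, 1, 10), date(2024, 3, 1), date(2022, 6, 15), date(2024, 2, 20)]
labels = ["retail", "retail", "retail", "business"]
expected = {"retail": 0.5, "business": 0.5}

# One of the four records is more than a year old.
print(stale_fraction(dates, today=date(2024, 4, 1), max_age_days=365))
# Retail customers are over-represented and business customers under-represented.
print(representation_gaps(labels, expected, tolerance=0.1))
```

A non-empty result from `representation_gaps` or a high `stale_fraction` would suggest collecting more or newer data before training.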
What is an example business case for data readiness?
Consider a banking institution that wants to gather quantifiable insights about a segment of customers, and leadership must decide if they meet data readiness requirements to use ML as a solution.
What questions should you ask to identify a good problem to solve using machine learning and assess your data readiness?
- Is it easily accessible?
- Does it respect privacy?
- Is it relevant?
What should be my expectation in terms of time?
Machine learning can take significant time from the start of a project through production deployment. Expect it to take weeks or even months to deploy a production model.
What does a machine learning lifecycle look like?
- Problem Definition
- Data Exploration
- Data Preparation
- Model Exploration
- Model Training
- Model Testing
- Evaluation
- Production Deployment
- Model Update
Will my machine learning model change over time?
For a model to predict accurately, the data that it is making predictions on must have a distribution similar to that of the data on which the model was trained. Because data distributions can be expected to drift over time, deploying a model is not a one-time exercise but rather a continuous process. Continuous monitoring of incoming data can signal when to retrain your model on newer data, namely when the distribution of the incoming data has deviated significantly from the original training data distribution.
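One way to make this monitoring concrete is to compare a feature's values in the training data with its values in recent incoming data. The sketch below uses the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distributions), which is one common choice and not a method prescribed by the course. The function names and the 0.3 threshold are assumptions made for illustration.

```python
from bisect import bisect_right


def ks_statistic(reference, incoming):
    """Largest gap between the two empirical CDFs.

    0.0 means the samples are distributed identically;
    1.0 means they do not overlap at all.
    """
    ref = sorted(reference)
    new = sorted(incoming)
    largest_gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in ref + new:
        ref_cdf = bisect_right(ref, x) / len(ref)
        new_cdf = bisect_right(new, x) / len(new)
        largest_gap = max(largest_gap, abs(ref_cdf - new_cdf))
    return largest_gap


def needs_retraining(reference, incoming, threshold=0.3):
    """Flag drift when the gap exceeds a threshold (0.3 is an arbitrary example)."""
    return ks_statistic(reference, incoming) > threshold


# Hypothetical feature values, invented for illustration only.
training_values = [1, 2, 3, 4]
recent_values = [3, 4, 5, 6]

print(ks_statistic(training_values, recent_values))      # 0.5
print(needs_retraining(training_values, recent_values))  # True
```

In practice the threshold would be tuned per feature, and a flagged drift would trigger a review before any retraining rather than retraining automatically.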
What is an example business case of a project timeline challenge?
Consider a manufacturer that needs to use ML to solve a quantifiable business problem but is concerned about potential delays.
What questions should you ask to identify a good problem to solve using machine learning and assess your machine learning lifecycle?
- Have you previously worked on a task similar to your proposed ML solution?
- Have you explored your data and found faults?
- Is the performance of models meeting business requirements?
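The last question above can be checked by comparing a model's accuracy on held-out test data with the accuracy the business requires. The following is a minimal sketch: the function names, the labels, and the required accuracy values are hypothetical, and accuracy is only one of several metrics a project might choose.

```python
def accuracy(predictions, actuals):
    """Share of predictions that match the actual outcomes."""
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must have the same length")
    correct = sum(1 for p, a in zip(predictions, actuals) if p == a)
    return correct / len(actuals)


def meets_requirement(predictions, actuals, required_accuracy):
    """True when test accuracy reaches the business requirement."""
    return accuracy(predictions, actuals) >= required_accuracy


# Hypothetical test-set results, invented for illustration only.
predicted = ["A", "B", "A", "A", "B"]
observed = ["A", "B", "B", "A", "B"]

print(accuracy(predicted, observed))                 # 0.8
print(meets_requirement(predicted, observed, 0.75))  # True
print(meets_requirement(predicted, observed, 0.9))   # False
```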