Planning a Machine Learning Project Flashcards

The Planning a Machine Learning Project course introduces requirements to determine if ML is the appropriate solution to a business problem. This course focuses on business leaders and other decision-makers currently or potentially involved in ML projects.

1
Q

How can I determine that machine learning is the right solution?

A

ML may be the right solution when the business problem is clear and quantifiable. In that case, the value of a model’s predictions can be measured against specific business objectives and success criteria.

2
Q

What are the reasons to use machine learning?

A

An example of a business problem where the use of ML would be appropriate is generating personalized recommendations. In this case, the solution to the problem requires complex logic, and we would want to provide personalized recommendations at scale with quick turnaround times.

Requires complex logic
Since developing personalized recommendations requires complex logic, ML is an appropriate tool to consider.

Requires scalability
Serving millions of requests for personalized recommendations every second is a challenge.

Requires personalization
Delivering personalized recommendations at scale and being responsive at the same time is difficult to achieve with classical programming techniques.

Requires responsiveness
The ability to deliver personalized recommendations within a few seconds even while handling millions of requests per second is expected.

3
Q

What are the reasons to NOT use machine learning?

A

Business reasons to avoid ML depend on whether traditional methods and rules are viable options, if there are few or no requirements to adapt to new data, if business goals include 100% outcome accuracy, or if models must be explained or translated.

Can be solved with traditional algorithms
If the problem is not overly complex, an ML solution might be overcomplicated.

Does not require adapting to new data
If data and conditions are not changing, a more traditional approach could be more appropriate.

Requires 100% accuracy
ML predictions often provide less than 100% accuracy.

Requires full interpretability
If being able to explain what is going to happen if you change the parameters or input is a priority, ML might not be the best solution.

4
Q

What is an example business case for machine learning?

A

Consider a financial institution that needs to determine which category of products and offerings is most interesting to a customer. The problem might not be effectively solved using simple hand-coded rules since the outcome might depend on many factors and overlapping rules. ML could solve this problem.

5
Q

How can I identify a good problem to solve with ML?

A
  1. What is the strategy to achieve this goal?
  2. How could you use machine learning to achieve this goal?
  3. What aspects of the problem make it a good fit to apply ML?
6
Q

Is my data ready for a machine learning solution?

A

Data readiness depends on the quality, quantity, diversity, and complexity of the data collected. After discovering and collecting all relevant data, the data should be cleansed, validated, transformed, and stored.

Data should exist for training and model development and should not require significant preparation before use.

Data should be in a reachable, on-demand state with access to store, retrieve, move, modify, or copy data from one place to another.

7
Q

Should my data be used?

A

Personally identifiable information, such as citizenship or health information, might be labeled private and protected by privacy laws.

Industry regulations, government laws, and compliance policies determine the importance of various data types, and determine what and how data can be processed, stored, managed, or shared.

8
Q

Is my data high quality?

A

Data used in an ML project should be relevant to produce valuable results. Data should be timely so that training data is as close to the actual data as possible. Data should be representative of the data across all data sources. Data selection should be unbiased.
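
One way to spot-check whether a training sample is representative is to compare the share of each category in the sample with its share in the full data set. The sketch below is only an illustration: the function names, the customer segments, and the 0.05 tolerance are all assumptions, not part of the course.

```python
from collections import Counter

def share_gaps(population, sample):
    """Return, per category, sample share minus population share."""
    pop_counts, sample_counts = Counter(population), Counter(sample)
    return {
        category: sample_counts.get(category, 0) / len(sample)
        - pop_counts[category] / len(population)
        for category in pop_counts
    }

def underrepresented(population, sample, tolerance=0.05):
    """List categories whose sample share falls short by more than the tolerance."""
    gaps = share_gaps(population, sample)
    return sorted(c for c, gap in gaps.items() if gap < -tolerance)

# Hypothetical customer segments, for illustration only.
population = ["retail"] * 60 + ["business"] * 30 + ["student"] * 10
sample = ["retail"] * 90 + ["business"] * 10

print(underrepresented(population, sample))
```

A category that appears in this list is a hint that the selection may be biased and that more data for that group should be collected before training.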

9
Q

What is an example business case for data readiness?

A

Consider a banking institution that wants to gather quantifiable insights about a segment of customers, and leadership must decide if they meet data readiness requirements to use ML as a solution.

10
Q

What questions should you ask to identify a good problem to solve using machine learning and assess your data readiness?

A
  1. Is it easily accessible?
  2. Does it respect privacy?
  3. Is it relevant?
11
Q

What should be my expectation in terms of time?

A

Machine learning can take significant time from the start of a project through production deployment. Expect deploying production models to take weeks or even months.

12
Q

What does a machine learning lifecycle look like?

A
  1. Problem Definition
  2. Data Exploration
  3. Data Preparation
  4. Model Exploration
  5. Model Training
  6. Model Testing
  7. Evaluation
  8. Production Deployment
  9. Model Update
13
Q

Will my machine learning model change over time?

A

For a model to predict accurately, the data that it is making predictions on must have a distribution similar to that of the data on which the model was trained. Because data distributions can be expected to drift over time, deploying a model is not a one-time exercise but rather a continuous process. Continuous monitoring of incoming data can signal when to retrain your model on newer data, namely when the data distribution has deviated significantly from the original training data distribution.
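
As a rough illustration of monitoring for drift, the sketch below compares one numeric feature in the training data with the same feature in incoming data using the Population Stability Index (PSI). PSI is one of several possible drift measures and is chosen here as an assumption; the function names and the 0.2 threshold (a commonly quoted rule of thumb) are likewise assumptions, not something the course prescribes.

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between two numeric samples."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

def needs_retraining(training, incoming, threshold=0.2):
    """Flag the model for retraining when drift exceeds the threshold."""
    return psi(training, incoming) > threshold

# Synthetic data: incoming values are concentrated in the upper half of the range.
training = [i / 100 for i in range(1000)]
incoming = [5 + i / 200 for i in range(1000)]

print(needs_retraining(training, training))  # identical data, no drift
print(needs_retraining(training, incoming))  # shifted data
```

In practice a check like this would run on a schedule for each important input feature, and a flagged feature would prompt a review before any retraining is started.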

14
Q

What is an example business case of a project timeline challenge?

A

Consider a manufacturer that needs to use ML to solve a quantifiable business problem but is concerned about potential delays.

15
Q

What questions should you ask to identify a good problem to solve using machine learning and assess your machine learning lifecycle?

A
  1. Have you ever had a similar task to your proposed ML solution?
  2. Have you explored your data and found faults?
  3. Is the performance of models meeting business requirements?
16
Q

How do I take my machine learning solution into production?

A

While production deployment of an ML model is one of the last stages of the ML pipeline, ML production code differs in many ways from research code. The purpose of research code is to promote exploration and validate models using iterative processes, which might lack formal quality, stability, or scaling requirements. However, production code must meet objective and fixed requirements, facilitate collaboration through version control, maintain a code deployment history, and meet code reliability standards.

17
Q

What is the likely computational cost of generating predictions with your model?

A

Start thinking about the costs associated with launching your machine learning model into production (that is, storage, processing, and so on). If the project model meets your business requirements, fast implementation can expedite the benefits. However, failure to plan ahead on computational costs could hinder production later.
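
A back-of-the-envelope estimate can make the processing cost concrete. The sketch below assumes a simple model in which each server handles one prediction at a time; the function name, the 70% target utilization, the 730 hours per month, and every number in the example are hypothetical.

```python
import math

def monthly_inference_cost(requests_per_second, seconds_per_prediction,
                           hourly_instance_price, target_utilization=0.7,
                           hours_per_month=730):
    """Estimate the instance count and monthly cost of serving predictions."""
    # Average number of instances kept busy by the request load.
    busy_instances = requests_per_second * seconds_per_prediction
    # Leave headroom so instances are not run at 100% utilization.
    instances = max(1, math.ceil(busy_instances / target_utilization))
    return instances, instances * hourly_instance_price * hours_per_month

# Hypothetical workload: 200 requests per second, 50 ms per prediction,
# and an instance price of 0.50 per hour.
instances, cost = monthly_inference_cost(200, 0.05, 0.5)
print(instances, cost)
```

Storage, data transfer, and retraining costs are not included here and would need to be estimated separately.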

18
Q

How quickly does your data change?

A

Why is this important?

Start thinking about how complex your data is and how often it changes. Frequently changing data might require you to retrain the model often, which increases the time allotted for training because you will be cycling back from production to research and development.

19
Q

How significant are the changes needed to deploy?

A

Why is this important?

Start thinking about which changes you would like to release into production and how often. This will guide a strategy for maximizing the impact of updating your model.

20
Q

Does the model’s performance meet the business need?

A

Why is this important?

Start thinking about how your business conditions will change over time. Your current model might need to be adjusted as conditions change. Do you have new product lines launching? Are there new regulations in your business sector? Are you expanding into new geographies?
