Identifying Good Problems for ML Flashcards
Before starting with ML, ask yourself the following questions (in order):
- What problem is my product facing?
- Would it be a good problem for ML?
Know the problem before focusing on the data. If you understand the problem clearly, you should be able to ___
list some potential solutions to test in order to generate the best model.
What kind of data is going to be the most useful?
Data collected specifically for your task is going to be the most useful. In practice, you may not be able to collect it, and you’ll rely on whatever data you can get that’s close enough. That’s fine as long as you’re aware of the cost. Once you can eventually get product logs, you can use those to build something more targeted to your task.
Why shouldn’t you let ML discover all the useful features for you by simply throwing everything at the model and seeing what looks useful?
Your model will likely wind up overly complicated, expensive, and filled with unimportant features.
In __ __, a feature is more likely to be correlated with your label purely by chance within your sample of data.
smaller datasets
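Example (a rough sketch, not part of the original cards): the simulation below generates features that are pure noise, unrelated to the label by construction, and reports the strongest correlation any of them reaches. The function names, the choice of 50 features, and the sample sizes are all assumptions made for illustration. Under those assumptions, the strongest chance correlation should shrink as the sample grows.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def max_abs_chance_correlation(n_samples, n_features, seed=0):
    """Largest |correlation| between a random label and random features.

    Hypothetical helper: every feature is independent noise, so any
    correlation it shows with the label is due to chance alone.
    """
    rng = random.Random(seed)
    label = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, label)))
    return best


if __name__ == "__main__":
    # Compare a small, a medium, and a large sample with 50 noise features.
    for n in (20, 200, 2000):
        print(n, round(max_abs_chance_correlation(n, 50), 3))
```

Throwing more candidate features at a small sample only raises the odds that one of them looks useful by accident.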
What do we mean by decisions in ML?
By decisions, we mean that your product should take action on the output of the model. ML is better at making decisions than giving you insights.
Make sure your _____ allow you to take a useful action.
predictions
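Example (a minimal sketch, not part of the original cards): one way to check that a prediction supports a useful action is to write down the rule that turns the model output into something the product does. The churn scenario, the function name `decide_action`, the action labels, and the 0.5 threshold are all hypothetical.

```python
def decide_action(churn_probability, threshold=0.5):
    """Map a model output to a product action.

    Hypothetical scenario: the model predicts the probability that a
    user will cancel, and the product either sends an offer or does nothing.
    """
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn_probability must be between 0 and 1")
    if churn_probability >= threshold:
        return "send_retention_offer"
    return "no_action"


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(p, decide_action(p))
```

If no rule like this can be written for a prediction, the model may be producing an insight rather than supporting a decision.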