Data Science Methodology Flashcards
IBM Data Science Professional Certificate Course 3 / 10
What is a methodology?
A system of methods used in a particular field of study
What does a methodology include?
- Data collection forms
- Measurement strategies
- Comparisons of data analysis methods
What are the stages of data science methodology?
- Business understanding
- Analytic approach
- Data requirements
- Data collection
- Data understanding
- Data preparation
- Modeling
- Evaluation
- Deployment
- Feedback
What is the cornerstone of success in data science?
Asking questions
What are the data science methodology questions?
- What is the problem you are trying to solve?
- How can you use data to answer the business question?
- What data do you need to answer the question?
- Where is the data sourced from, and how will you receive the data?
- Does the data you collected represent the problem to be solved?
- What additional work is required to manipulate and work with the data?
- When you apply data visualisations, do you see answers that address the business problem?
- Does the data model answer the initial business question, or must you adjust the data?
- Can you put the model into practice?
- Can you get constructive feedback from the data and the stakeholder to answer the business question?
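The ten stages and the ten questions above appear in the same order, so they can be read as pairs. A minimal sketch of that pairing, assuming the match is by position (the deck does not state this explicitly); the names STAGES, QUESTIONS, and question_for are hypothetical:

```python
# Stage names and questions are copied from the two lists above.
# Pairing them by position is an assumption of this sketch.
STAGES = [
    "Business understanding",
    "Analytic approach",
    "Data requirements",
    "Data collection",
    "Data understanding",
    "Data preparation",
    "Modeling",
    "Evaluation",
    "Deployment",
    "Feedback",
]

QUESTIONS = [
    "What is the problem you are trying to solve?",
    "How can you use data to answer the business question?",
    "What data do you need to answer the question?",
    "Where is the data sourced from, and how will you receive the data?",
    "Does the data you collected represent the problem to be solved?",
    "What additional work is required to manipulate and work with the data?",
    "When you apply data visualisations, do you see answers that address the business problem?",
    "Does the data model answer the initial business question, or must you adjust the data?",
    "Can you put the model into practice?",
    "Can you get constructive feedback from the data and the stakeholder to answer the business question?",
]

# Guard against the two lists drifting out of step.
assert len(STAGES) == len(QUESTIONS)

STAGE_QUESTIONS = dict(zip(STAGES, QUESTIONS))


def question_for(stage: str) -> str:
    """Return the guiding question paired with a stage name."""
    return STAGE_QUESTIONS[stage]


if __name__ == "__main__":
    for stage, question in STAGE_QUESTIONS.items():
        print(f"{stage}: {question}")
```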
What does data science methodology begin with?
Spending time to seek clarification
What is business understanding?
Spending time with stakeholders to clarify the problem you are trying to solve
Why is having a clearly defined question vital in data science?
Because it ultimately directs the analytical approach that will be needed to address the question
What does establishing a clearly defined goal begin with?
It begins with understanding the goal of the stakeholder asking the question
Example:
If a business owner asks, “How can we reduce the costs of performing an activity?”
we need to understand whether the goal is to:
- improve the efficiency of the activity, or
- increase business profitability
Once the goal is clarified, the next piece of the puzzle…
is to figure out the objectives that support the goal
The analytic approach to a problem depends on…
the question being asked
What is an analytic approach?
It is how you use data to answer a question
If the goal is to determine the probabilities of an action or outcome…
use a predictive model
If the goal is to show relationships…
use a descriptive model
If the question requires a yes / no answer…
use a classification model
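The three cards above amount to a lookup from the goal of the question to a model type. A minimal sketch, assuming short goal labels chosen only for illustration; MODEL_FOR_GOAL and suggest_model are hypothetical names, and the fallback message is not from the course:

```python
# Hypothetical lookup table. The keys are illustrative labels for the
# three goals named on the cards; the values are the model types the
# cards pair with them.
MODEL_FOR_GOAL = {
    "probabilities of an outcome": "predictive model",
    "relationships": "descriptive model",
    "yes/no answer": "classification model",
}


def suggest_model(goal: str) -> str:
    """Return the model type paired with a goal label.

    Unknown labels get a reminder to refine the question, since the
    analytic approach depends on the question being asked.
    """
    key = goal.strip().lower()
    return MODEL_FOR_GOAL.get(key, "unclear: refine the question first")


if __name__ == "__main__":
    for goal in MODEL_FOR_GOAL:
        print(f"{goal} -> {suggest_model(goal)}")
```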
What does the correct analytic approach depend on?
It depends on the business requirements for the model
Approach is based on current status
Descriptive approach
Approach is based on what happened, or why it is happening
Diagnostic approach
Approach is based on what happens if the trends continue or what will happen next
Predictive approach
Approach is based on how to solve the problem
Prescriptive approach
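The four approach cards above can be kept together as one enumeration, which makes it easy to quiz yourself in either direction. A minimal sketch; the Approach class and the approach_for helper are hypothetical names, and the focus strings are paraphrased from the cards:

```python
from enum import Enum


class Approach(Enum):
    """Each member's value is the focus that the cards give for it."""

    DESCRIPTIVE = "current status"
    DIAGNOSTIC = "what happened, or why it is happening"
    PREDICTIVE = "what happens if the trends continue, or what will happen next"
    PRESCRIPTIVE = "how to solve the problem"


def approach_for(focus: str) -> Approach:
    """Look up an approach by its focus string (case-insensitive)."""
    wanted = focus.strip().lower()
    for approach in Approach:
        if approach.value == wanted:
            return approach
    raise ValueError(f"no approach matches focus: {focus!r}")


if __name__ == "__main__":
    for approach in Approach:
        print(f"{approach.name.capitalize()} approach: {approach.value}")
```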
What does data collection require?
That you know the source of the data elements you need, or where to find them