Data Science Methodology Flashcards

IBM Data Science Professional Certificate Course 3 / 10

1
Q

What is a methodology?

A

A system of methods used in a particular field of study

2
Q

What does a methodology include?

A
  • Data collection forms
  • Measurement strategies
  • Comparisons of data analysis methods
3
Q

What are the stages of data science methodology?

A
  1. Business understanding
  2. Analytic approach
  3. Data requirements
  4. Data collection
  5. Data understanding
  6. Data preparation
  7. Modeling
  8. Evaluation
  9. Deployment
  10. Feedback
4
Q

What is the cornerstone of success in data science?

A

Asking questions

5
Q

What are the data science methodology questions?

A
  1. What is the problem you are trying to solve?
  2. How can you use data to answer the business question?
  3. What data do you need to answer the question?
  4. Where is the data sourced from, and how will you receive the data?
  5. Does the data you collected represent the problem to be solved?
  6. What additional work is required to manipulate and work with the data?
  7. When you apply data visualisations, do you see answers that address the business problem?
  8. Does the data model answer the initial business question, or must you adjust the data?
  9. Can you put the model into practice?
  10. Can you get constructive feedback from the data and the stakeholder to answer the business question?
6
Q

What does data science methodology begin with?

A

Spending time to seek clarification

7
Q

What is business understanding?

A

Spending time with stakeholders to clarify the problem you are trying to solve

8
Q

Why is having a clearly defined question vital in data science?

A

Because it ultimately directs the analytical approach that will be needed to address the question

9
Q

What does establishing a clearly defined goal begin with?

A

It begins with understanding the goal of the stakeholder asking the question

10
Q

Example: a business owner asks, “How can we reduce the costs of performing an activity?” What must be clarified first?

A

Whether the goal is to improve the efficiency of the activity or to increase business profitability

11
Q

Once the goal is clarified, the next piece of the puzzle…

A

is to figure out the objectives that are in support of the goal.

12
Q

The analytic approach to a problem depends on…

A

the question being asked

13
Q

What is an analytic approach?

A

It is how you use data to answer a question

14
Q

If the goal is to determine the probabilities of an action or outcome…

A

use a predictive model

15
Q

If the goal is to show relationships…

A

use a descriptive model

16
Q

If the question requires a yes / no answer…

A

use a classification model

17
Q

What does the correct analytic approach depend on?

A

It depends on the business requirements for the model

18
Q

Approach is based on current status

A

Descriptive approach

19
Q

Approach is based on what happened, or why is this happening?

A

Diagnostic approach

20
Q

Approach is based on what happens if the trends continue or what will happen next

A

Predictive approach

21
Q

Approach is based on how you solve something

A

Prescriptive approach

22
Q

What does data collection require?

A

That you know the source, or know where to find the data elements that are needed

23
Q

When collecting data, it is alright to defer decisions about unavailable data, and attempt to acquire it at a later stage.

True or False?

A

True

24
Q

What does the chosen analytic approach determine?

A

The data requirements

25
Q

Data Requirements stage tasks include…

A

identifying the correct and necessary data content, data formats, and data sources for the specific analytical approach.

26
Q

During the Data Collection stage,…

A

expert data scientists meticulously revise data requirements and make critical decisions regarding the quantity and quality of data.

27
Q

Who determines how to collect and prepare data?

A

Data scientists

28
Q

What does data understanding seek to answer?

A

Is the data collected representative of the problem to be solved?

29
Q

What are histograms good for in data understanding?

A
  • understanding how values or variables are distributed
  • understanding what data preparation might be needed to make the variable more useful
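
As a rough illustration of the card above, the sketch below bins a numeric variable and prints a text histogram. The `ages` values and the function names are made up for this example; it uses only the standard library and is not a substitute for a plotting tool.

```python
def histogram(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins or 1  # fall back to 1 when all values are equal
    counts = [0] * num_bins
    for v in values:
        # The maximum value would land one past the end, so clamp it into the last bin.
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1
    return lo, width, counts


def print_histogram(values, num_bins=5):
    """Print one row per bin, with one '#' per value in that bin."""
    lo, width, counts = histogram(values, num_bins)
    for i, count in enumerate(counts):
        left = lo + i * width
        print(f"{left:8.1f} to {left + width:8.1f} | {'#' * count}")


# Hypothetical ages, including one suspicious value (130).
ages = [23, 25, 31, 35, 36, 41, 44, 45, 52, 58, 61, 130]
print_histogram(ages)
```

Most values cluster in the first two bins while a single value sits alone in the last one, which is the kind of pattern that suggests an invalid entry to handle during data preparation.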
30
Q

The Data Understanding Stage encompasses…

A

sorting the data

31
Q

What is the most time-consuming phase of a Data Science project?

A

Data preparation

32
Q

What does the data preparation stage seek to answer?

A

What are the ways in which data is prepared?

33
Q

In order to work effectively with data, it must be…

A

prepared in a way that addresses missing or invalid values and removes duplicates
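
A minimal sketch of what this can look like in practice, assuming records are plain dictionaries with hypothetical `id` and `age` fields. Dropping unusable rows is only one possible policy; filling in missing values is another, and the right choice depends on the problem.

```python
def prepare(records):
    """Drop rows with missing or invalid ages, then drop exact duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        age = rec.get("age")
        # Assumption: a missing age, or one outside 0..120, is unusable.
        if age is None or not (0 <= age <= 120):
            continue
        key = (rec["id"], age)
        if key in seen:  # same id and age as an earlier row
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Hypothetical raw data with a duplicate, a missing value, and an invalid value.
raw = [
    {"id": 1, "age": 34},
    {"id": 1, "age": 34},
    {"id": 2, "age": None},
    {"id": 3, "age": 130},
    {"id": 4, "age": 58},
]
clean = prepare(raw)
print(clean)
```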

34
Q

What is feature engineering?

A

The process of using domain knowledge of the data to create features that make the machine learning algorithms work
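
To make the idea concrete, here is a small sketch that derives new features from raw fields. The hospital-visit record, its field names, and the chosen features are all assumptions for illustration; which features actually help depends on the domain and the model.

```python
from datetime import date


def engineer_features(visit):
    """Derive model-friendly features from one raw visit record."""
    length_of_stay = (visit["discharged"] - visit["admitted"]).days
    return {
        "length_of_stay_days": length_of_stay,
        # weekday() returns 5 for Saturday and 6 for Sunday.
        "admitted_on_weekend": visit["admitted"].weekday() >= 5,
        "bmi": round(visit["weight_kg"] / visit["height_m"] ** 2, 1),
    }


# Hypothetical raw record.
visit = {
    "admitted": date(2024, 3, 2),
    "discharged": date(2024, 3, 6),
    "weight_kg": 81,
    "height_m": 1.8,
}
features = engineer_features(visit)
print(features)
```

None of the three derived values appears in the raw record; each encodes a piece of domain knowledge (stay length, weekend staffing, body mass index) in a form an algorithm can use directly.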

35
Q

When is feature engineering critical?

A

When machine learning tools are being applied to analyse the data

36
Q

Why is text analysis during the data preparation stage critical?

A

It is critical in validating that the proper groupings are set and that the programming is not overlooking hidden data.

37
Q

What question does modeling seek to answer?

A

In what way can the data be visualised to get to the answer that is required?

38
Q

What question does evaluation seek to answer?

A

Does the model used really answer the initial question or does it need to be adjusted?

39
Q

What models does data modeling focus on developing?

A

Descriptive or predictive models

40
Q

What do descriptive models try to answer?

A

If a person did this, then they’re likely to prefer that

41
Q

What do predictive models try to yield?

A

They try to yield yes/no or stop/go type outcomes

42
Q

What do data scientists use for predictive modelling?

A

A training set

43
Q

What is a training set?

A

A set of historical data in which the outcomes are already known
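
The sketch below shows one very simple way a training set can be used: it searches for the cutoff on a single variable that best reproduces the known outcomes (a one-level decision tree, sometimes called a decision stump). The readmission data and the function name are hypothetical.

```python
def fit_threshold(training_set):
    """Return the cutoff on x that best matches known outcomes.

    The resulting rule predicts True when x >= cutoff.
    """
    best_cutoff, best_correct = None, -1
    for cutoff in sorted({x for x, _ in training_set}):
        correct = sum((x >= cutoff) == outcome for x, outcome in training_set)
        if correct > best_correct:  # ties keep the smaller cutoff
            best_cutoff, best_correct = cutoff, correct
    return best_cutoff


# Hypothetical history: (number of prior admissions, was the patient readmitted?)
training_set = [
    (0, False), (1, False), (1, False), (2, True),
    (3, True), (4, True), (2, False),
]
cutoff = fit_threshold(training_set)
print(f"predict readmission when prior admissions >= {cutoff}")
```

Because the outcomes are already known, every candidate rule can be scored against them, which is what makes the historical data usable for training.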

44
Q

What is crucial to the success of data compilation, preparation, and modelling?

A

Understanding the problem at hand and taking the appropriate analytical approach

45
Q

What does the data support?

A

The answering of the question as well as setting the stage for the outcome

46
Q

What is the end goal of John Rollins’ descriptive Data Science Methodology?

A

Moving the data scientist to a point where a data model can be built to answer the question

47
Q

John Rollins’ descriptive Data Science Methodology is geared towards doing what 3 things?

A
  1. Understanding the question at hand
  2. Selecting an analytic approach or method to solve the problem
  3. Obtaining, understanding, preparing, and modeling the data
48
Q

Why are the modeling and evaluation stages done iteratively?

A

Because a model evaluation goes hand-in-hand with model building

49
Q

When is model evaluation performed?

A

During model development and before the model is deployed

50
Q

Why is evaluation necessary?

A

It allows the quality of the model to be assessed, and it also serves as an opportunity to see if the model meets the initial request

51
Q

What question is answered by evaluation?

A

Does the model used really answer the initial question or does it need to be adjusted?

52
Q

What are the two main phases of model evaluation?

A
  1. Diagnostic measures phase
  2. Statistical significance testing
53
Q

What are diagnostic measures used for?

A

To ensure the model is working as intended

54
Q

In the diagnostic measures phase, if the model is predictive…

A

a decision tree can be used to evaluate whether the answer the model outputs is aligned with the initial design.

55
Q

In the diagnostic measures phase, if the model is descriptive,…

A

a testing set with known outcomes can be applied, and the model can be refined as needed
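
As a hedged illustration of applying a testing set with known outcomes, the sketch below scores a model by the fraction of test rows it gets right. The `predict` rule and the data are placeholders for whatever model is actually being evaluated, and accuracy is only one of several possible measures.

```python
def accuracy(predict, testing_set):
    """Fraction of testing-set rows where the prediction matches the known outcome."""
    hits = sum(predict(x) == outcome for x, outcome in testing_set)
    return hits / len(testing_set)


def predict(x):
    # Hypothetical model under evaluation: flag anyone with 2+ prior admissions.
    return x >= 2


# Hypothetical held-out rows: (prior admissions, was the patient readmitted?)
testing_set = [(0, False), (3, True), (1, True), (2, True), (5, True)]
score = accuracy(predict, testing_set)
print(f"accuracy: {score:.2f}")
```

A low score on data the model has not seen is the signal to go back and refine the model, which is why modeling and evaluation are done iteratively.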

56
Q

Why is statistical significance testing applied to models?

A

So you can ensure that the data is being properly handled and interpreted within the model
