16 - Maximizing data science Flashcards

1
Q

What determines whether data science is a source of value?

A

The way in which data science and data scientists are deployed and managed.

2
Q

What is the concept of the ‘Long Tail’?

A

The idea that in the internet age there are millions of niche markets that can be serviced profitably.

3
Q

Who is Chris Anderson?

A

The person who introduced the concept of the Long Tail and the idea of the ‘Petabyte age’.

4
Q

What does Chris Anderson suggest about the use of data?

A

We no longer need a theory or model; we can just go with what works if we have enough data.

5
Q

What is data science?

A

The discipline that combines programming skills and knowledge of business, mathematics, and statistics to extract meaningful insights from data.

6
Q

What is a data warehouse?

A

A store of structured and filtered data used by data scientists to build, test, refine, and run their models.

7
Q

What is a data lake?

A

A store of raw data, the purpose of which is not yet defined, useful for informing and testing data science models.

8
Q

What is the first step in starting a data science project?

A

Ensure it has the business backing and adheres to the data strategy.

9
Q

What are the four Cs that data science can help businesses with?

A
  • Generate cash
  • Customer retention
  • Cost management
  • Care of employees
10
Q

What should the head of data science define for better integration with the business?

A

A process to work with data science that is repeatable, scalable, and described in non-technical language.

11
Q

Why is embedding analytics into production systems a major challenge?

A

It requires models to be tested and documented, and their outputs to be integrated into operational systems.

12
Q

What is a quick win in data science?

A

Demonstrating value aligned with a strategic priority to showcase the potential of data science.

13
Q

What should the data science team focus on before gaining insights?

A

Data engineering to ensure data quality and reliability.

14
Q

What is the importance of recognizing success in data science?

A

It helps demystify the discipline and builds support across the business.

15
Q

What role does executive sponsorship play in data science?

A

It engages the executive team to ask critical questions about the purpose and value of data science initiatives.

16
Q

What should be encouraged regarding skepticism about data science output?

A

Skepticism about the output should be encouraged, so that its value is critically evaluated.

17
Q

Fill in the blank: Data science can be a catalyst for _______.

18
Q

What is a key aspect of data science that the executive team understands?

A

Data science is speculative and does not always turn out as expected.

The executive team is aware of the commitment and potential outcomes involved in data science initiatives.

19
Q

Why should confidence intervals be considered in data analytics?

A

They show that a forecast is a strong signal but not the only one, reflecting the speculative nature of data science.

Informed debate inspired by models can improve decisions and enhance model accuracy.
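
As a rough illustration of the point above, the sketch below attaches a 95% confidence interval to a point forecast so that the range, not just the single number, is what gets debated. The forecast values are made up, the helper name `mean_confidence_interval` is hypothetical, and the normal approximation is an assumption (a t-distribution would be the more careful choice for a sample this small).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, confidence=0.95):
    """Return (low, high) for the mean, using a normal approximation."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    centre = mean(values)
    # Standard error of the mean from the sample standard deviation.
    standard_error = stdev(values) / sqrt(n)
    # Two-sided critical value, roughly 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return centre - z * standard_error, centre + z * standard_error


# Hypothetical weekly demand forecasts from repeated model runs (made-up numbers).
forecasts = [102.0, 98.0, 105.0, 97.0, 101.0, 99.0, 103.0, 95.0]
low, high = mean_confidence_interval(forecasts)
print(f"point forecast: {mean(forecasts):.1f}, 95% interval: {low:.1f} to {high:.1f}")
```

A wide interval is a prompt to bring other signals and business judgment into the decision rather than to act on the point forecast alone.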

20
Q

What does embracing the potential for failure in data science entail?

A

Accepting misunderstandings between the data science team and the business as part of the process.

Misunderstandings can arise due to the experimental nature of data science.

21
Q

What are the two well-established processes for delivering technology in the IT department?

A
  • Waterfall process
  • Agile development

Each method has its advantages and is typically used in traditional IT project management.

22
Q

How is data science fundamentally different from traditional IT development processes?

A

Data science is an experiment based on the scientific method rather than a predefined delivery plan.

This includes making observations, creating hypotheses, and testing models.

23
Q

What can be a positive outcome of a data science model demonstrating no value?

A

It can indicate a need to improve data quality or reveal limitations in the model.

For example, failing to identify customer defection because of insufficient data is a sign that the data needs improving.

24
Q

What was the outcome of the Netflix Prize competition?

A

The prize was won, but the improvements were not implemented due to cost-benefit considerations.

Netflix opted for simpler solutions to enhance recommendations instead.

25
Q

What is crucial for embedding data science into an organization?

A

Data science should not be viewed as a project that must be implemented at all costs.

Understanding its experimental nature is essential for effective integration.

26
Q

What was the reaction of the CIO when faced with the experimental nature of data science?

A

The CIO wanted the data science leader to resign, considering them a maverick.

The CEO understood the importance of the experimental approach in data science.

27
Q

What is the first step in a successful data science initiative?

A

Defining a leadership hypothesis and process that focuses the output for maximum impact.

Engagement at the executive level is vital for optimizing analytical capability.

28
Q

True or False: Data science can be managed to deliver a predetermined goal or return on investment.

A

False

Data science is inherently experimental and cannot be strictly controlled for specific outcomes.