16 - Maximizing data science Flashcards
What determines whether data science is a source of value?
The way in which data science and data scientists are deployed and managed.
What is the concept of the ‘Long Tail’?
The idea that in the internet age there are millions of niche markets that can be serviced profitably.
Who is Chris Anderson?
The person who introduced the concept of the Long Tail and the idea of the ‘Petabyte age’.
What does Chris Anderson suggest about the use of data?
We no longer need a theory or model; we can just go with what works if we have enough data.
What is data science?
The discipline that combines programming skills and knowledge of business, mathematics, and statistics to extract meaningful insights from data.
What is a data warehouse?
A store of structured and filtered data used by data scientists to build, test, refine, and run their models.
What is a data lake?
A store of raw data, the purpose of which is not yet defined, useful for informing and testing data science models.
What is the first step in starting a data science project?
Ensure it has the business backing and adheres to the data strategy.
What are the four Cs that data science can help businesses with?
- Generate cash
- Customer retention
- Cost management
- Care of employees
What should the head of data science define for better integration with the business?
A process to work with data science that is repeatable, scalable, and described in non-technical language.
Why is embedding analytics into production systems a major challenge?
It requires models to be tested and documented, and their outputs to be integrated into operational systems.
What is a quick win in data science?
Demonstrating value aligned with a strategic priority to showcase the potential of data science.
What should the data science team focus on before gaining insights?
Data engineering to ensure data quality and reliability.
What is the importance of recognizing success in data science?
It helps demystify the discipline and builds support across the business.
What role does executive sponsorship play in data science?
It engages the executive team to ask critical questions about the purpose and value of data science initiatives.
What should be encouraged regarding skepticism about data science output?
Skepticism should be welcomed, because it ensures the value of the output is critically evaluated.
Fill in the blank: Data science can be a catalyst for _______.
[change]
What is a key aspect of data science that the executive team understands?
Data science is speculative and does not always turn out as expected.
The executive team is aware of the commitment and potential outcomes involved in data science initiatives.
Why should confidence intervals be considered in data analytics?
They show that forecasts are strong signals but not the only ones, because data science is speculative and every forecast carries uncertainty.
Informed debate inspired by models can improve decisions and enhance model accuracy.
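To make the idea of a confidence interval concrete, here is a minimal sketch in Python. The forecast values are made-up numbers, and the function name `mean_confidence_interval` is hypothetical. It uses a normal approximation (z = 1.96 for roughly 95%); for a sample this small a t-distribution would usually be the better choice.

```python
import math
import statistics

def mean_confidence_interval(samples, z=1.96):
    """Return (low, high) for an approximate 95% confidence interval of the mean.

    Assumes a normal approximation; z=1.96 corresponds to roughly 95% coverage.
    """
    mean = statistics.mean(samples)
    # Standard error of the mean, based on the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean - z * sem, mean + z * sem

# Hypothetical weekly sales forecasts from repeated model runs (illustrative only).
forecasts = [102.0, 98.0, 105.0, 97.0, 101.0, 99.0, 103.0, 95.0]
low, high = mean_confidence_interval(forecasts)
print(f"forecast mean: {statistics.mean(forecasts):.1f}, "
      f"approx. 95% CI: ({low:.1f}, {high:.1f})")
```

Reporting the range alongside the point forecast gives the business something to debate: a wide interval suggests the forecast should be weighed against other signals.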
What does embracing the potential for failure in data science entail?
Accepting misunderstandings between the data science team and the business as part of the process.
Misunderstandings can arise due to the experimental nature of data science.
What are the two well-established processes for delivering technology in the IT department?
- Waterfall process
- Agile development
Each method has its advantages and is typically used in traditional IT project management.
How is data science fundamentally different from traditional IT development processes?
Data science is an experiment based on the scientific method rather than a predefined delivery plan.
This includes making observations, creating hypotheses, and testing models.
What can be a positive outcome of a data science model demonstrating no value?
It can indicate a need to improve data quality or reveal limitations in the model.
For example, a model that fails to identify customer defection because of insufficient data shows where data collection needs to improve.
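One way to see that a model "demonstrates no value" is to compare it with a trivial baseline. The sketch below is illustrative only: the data is made up, and the names `accuracy`, `model_predictions`, and `baseline_predictions` are hypothetical. It is not a method taken from the source.

```python
def accuracy(predictions, actuals):
    """Fraction of predictions that match the observed outcomes."""
    return sum(p == a for p, a in zip(predictions, actuals)) / len(actuals)

# Hypothetical outcomes: 1 = customer defected, 0 = customer stayed (made-up data).
actuals = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1]
# Hypothetical output of a defection model trained on sparse data.
model_predictions = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
# Baseline: always predict the most common outcome ("stayed").
baseline_predictions = [0] * len(actuals)

model_acc = accuracy(model_predictions, actuals)
baseline_acc = accuracy(baseline_predictions, actuals)

# The model adds value only if it beats the trivial baseline.
adds_value = model_acc > baseline_acc
print(f"model: {model_acc:.2f}, baseline: {baseline_acc:.2f}, adds value: {adds_value}")
```

In this made-up case the model does no better than the baseline, which is itself a useful finding: it points toward collecting richer data before investing further in the model.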
What was the outcome of the Netflix Prize competition?
The prize was won, but the improvements were not implemented due to cost-benefit considerations.
Netflix opted for simpler solutions to enhance recommendations instead.
What is crucial for embedding data science into an organization?
Data science should not be viewed as a project that must be implemented at all costs.
Understanding its experimental nature is essential for effective integration.
What was the reaction of the CIO when faced with the experimental nature of data science?
The CIO wanted the data science leader to resign, considering them a maverick.
The CEO understood the importance of the experimental approach in data science.
What is the first step in a successful data science initiative?
Defining a leadership hypothesis and process that focuses the output for maximum impact.
Engagement at the executive level is vital for optimizing analytical capability.
True or False: Data science can be managed to deliver a predetermined goal or return on investment.
False
Data science is inherently experimental and cannot be strictly controlled for specific outcomes.