02. Data Analytics Lifecycle Flashcards
List the 6 phases in the Data Analytics Lifecycle
1. Discovery; 2. Data Preparation; 3. Model Planning; 4. Model Building; 5. Communicate Results; 6. Operationalise
Describe the Discovery Phase in the Data Analytics Lifecycle
In Phase 1, the team learns the business domain, including relevant history such as whether the organisation or business unit has attempted similar projects in the past from which they can learn. The team assesses the resources available to support the project in terms of people, technology, time, and data. Important activities in this phase include framing the business problem as an analytics challenge that can be addressed in subsequent phases and formulating initial hypotheses (IHs) to test and begin learning the data.
Describe the Data Preparation Phase in the Data Analytics Lifecycle
Phase 2 requires an analytic sandbox in which the team can work with data and perform analytics for the duration of the project. The team needs to execute extract, load, and transform (ELT) or extract, transform, and load (ETL) to get data into the sandbox. Together, ELT and ETL are sometimes abbreviated as ETLT. Data should be transformed in the ETLT process so the team can work with it and analyze it. In this phase, the team also needs to familiarize itself with the data thoroughly and take steps to condition the data.
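The difference between ETL and ELT is only the order of the transform and load steps. A minimal sketch of that ordering, assuming an in-memory list or dict stands in for the sandbox; the records, field names, and the `transform`, `etl`, and `elt` helpers are all hypothetical and not part of the original material:

```python
# Hypothetical raw records; field names and values are invented for illustration.
RAW_ROWS = [
    {"customer": " Alice ", "amount": "120.50"},
    {"customer": "BOB", "amount": "n/a"},
    {"customer": "carol", "amount": "75"},
]


def transform(row):
    """Condition one record: trim and normalise the name, parse the amount."""
    try:
        amount = float(row["amount"])
    except ValueError:
        amount = None  # keep the row but flag the unparseable value
    return {"customer": row["customer"].strip().title(), "amount": amount}


def etl(rows):
    """ETL: transform first, so only conditioned data reaches the sandbox."""
    return [transform(r) for r in rows]


def elt(rows):
    """ELT: load the raw data as-is, then transform it inside the sandbox."""
    sandbox = {"raw": list(rows)}  # the raw copy stays available for analysis
    sandbox["clean"] = [transform(r) for r in sandbox["raw"]]
    return sandbox


etl_sandbox = etl(RAW_ROWS)
elt_sandbox = elt(RAW_ROWS)
```

Both routes produce the same conditioned records here; the ELT route additionally keeps the raw data in the sandbox, which is one reason a team might prefer it.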
Describe the Model Planning Phase in the Data Analytics Lifecycle
Phase 3 is model planning, where the team determines the methods, techniques, and workflow it intends to follow for the model building phase. In this phase the data science team identifies candidate models to apply to the data for clustering, classifying, or finding relationships in the data depending on the goal of the project. The team explores the data to learn about the relationships between variables and subsequently selects key variables and the most suitable models.
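One simple way to explore relationships between variables is to rank candidate inputs by their correlation with the target. This is a sketch under stated assumptions, not the method prescribed by the lifecycle: the variable names and values are invented, and Pearson correlation is just one possible screening measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical target and candidate variables, invented for illustration.
target = [10, 20, 30, 40, 50]
candidates = {
    "ad_spend": [1, 2, 3, 4, 5],
    "store_id": [3, 1, 4, 1, 5],
}

# Score each candidate, then rank by strength of linear relationship.
scores = {name: pearson(values, target) for name, values in candidates.items()}
ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
```

Here `ranked` lists `ad_spend` first because it moves in lockstep with the target. A ranking like this only captures linear relationships, so it is a starting point for variable selection rather than a final answer.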
Describe the Model Building Phase in the Data Analytics Lifecycle
In Phase 4, the team develops datasets for testing, training, and production purposes. In addition, in this phase the team builds and executes models based on the work done in the model planning phase. The team also considers whether its existing tools will suffice for running the models, or if it will need a more robust environment for executing models and workflows (for example, fast hardware and parallel processing, if applicable).
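Developing separate training and testing datasets usually starts with a reproducible random split. A minimal sketch, assuming an 80/20 split and a fixed seed; the `split_dataset` helper, the fraction, and the seed are illustrative choices, and a production dataset would be handled separately:

```python
import random


def split_dataset(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    shuffled = list(rows)                  # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical dataset: 100 record identifiers.
rows = list(range(100))
train, test = split_dataset(rows)
```

The two lists do not overlap and together contain every original record, so the model is never evaluated on rows it was trained on.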
Describe the Communicate Results Phase in the Data Analytics Lifecycle
In Phase 5, the team, in collaboration with major stakeholders, determines if the results of the project are a success or a failure based on the criteria developed in Phase 1. The team should identify key findings, quantify the business value, and develop a narrative to summarize and convey findings to stakeholders.
Describe the Operationalise Phase in the Data Analytics Lifecycle
In Phase 6, the team delivers final reports, briefings, code, and technical documents. In addition, the team may run a pilot project to implement the models in a production environment.