S2-m5 Flashcards
The Data Life Cycle
Describes the sequential steps all business data must go through, from creation through use and storage to final disposal.
1. Definition
2. Capture
3. Preparation
4. Synthesis
5. Analytics and usage
6. Publication
7. Archival
8. Purging
Definition
defining what data a business needs and where to capture or retrieve such data
Capture/Creation
Obtain the data either by creating it internally or by capturing it from the external source where it was created
Cleaning Data
- Remove unnecessary headings or subtotals
- Clean leading zeros and nonprintable characters
- Format negative numbers
- Identify and correct inconsistencies across data in general
- Address inconsistent data types
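The cleaning steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, the heading/subtotal labels, and the helper names (clean_cell, strip_leading_zeros, parse_amount, clean_rows) are hypothetical, and treating "(1,250.00)" and "75.50-" as negative numbers is one common accounting convention, not the only one.

```python
def clean_cell(value: str) -> str:
    """Drop nonprintable characters and surrounding whitespace."""
    return "".join(ch for ch in value if ch.isprintable()).strip()


def strip_leading_zeros(text: str) -> str:
    """Remove leading zeros, keeping a single "0" if nothing else is left."""
    return text.lstrip("0") or "0"


def parse_amount(text: str) -> float:
    """Convert text such as "(1,250.00)", "75.50-" or "-50" to a float."""
    text = clean_cell(text).replace(",", "")
    negative = (
        (text.startswith("(") and text.endswith(")"))
        or text.startswith("-")
        or text.endswith("-")
    )
    value = float(text.strip("()-"))
    return -value if negative else value


# Hypothetical labels that mark heading and subtotal rows in the export.
NON_DATA_LABELS = {"customer", "subtotal", "total"}


def clean_rows(rows):
    """Return (customer_id, amount) tuples with consistent data types."""
    cleaned = []
    for raw_id, raw_amount in rows:
        label = clean_cell(raw_id)
        if label.lower() in NON_DATA_LABELS:
            continue  # remove unnecessary headings and subtotals
        cleaned.append((strip_leading_zeros(label), parse_amount(raw_amount)))
    return cleaned


raw_rows = [
    ["Customer", "Amount"],      # heading
    ["00042", "(1,250.00)"],     # leading zeros, parenthesized negative
    ["00007\x00", " 300.00 "],   # nonprintable character, stray spaces
    ["Subtotal", "-950.00"],     # subtotal row
    ["Customer", "Amount"],      # repeated heading
    ["00113", "75.50-"],         # trailing-minus negative
]

cleaned_rows = clean_rows(raw_rows)
```

Note that stripping leading zeros is only appropriate when the zeros carry no meaning; identifiers such as ZIP codes would need to stay as text.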
Preparation
to determine whether the data is complete, clean, encrypted, and user-friendly
Ensuring Completeness and Integrity of Data
Any time data is moved, some of the required data may be lost. To validate the captured data:
1. Compare the number of records that you intend to capture to the number of records in the source database
2. Compare descriptive statistics for numeric fields
3. Validate that field formats are consistent
4. Compare character limits for the attributes in the source file
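The four validation checks above could look roughly like this in Python. It is a sketch, not a prescribed procedure: the function name validate_capture, the sample records, and the ID pattern are hypothetical, and check 4 is approximated by comparing the longest value in a text field as a signal of truncation.

```python
import math
import re
from statistics import mean


def validate_capture(source, captured, numeric_field, text_field, format_checks):
    """Return a list of human-readable issues; an empty list means no issue found."""
    issues = []

    # 1. Compare record counts.
    if len(source) != len(captured):
        issues.append(f"record count: source={len(source)} captured={len(captured)}")

    # 2. Compare descriptive statistics for a numeric field.
    if source and captured:
        for label, stat in (("sum", sum), ("min", min), ("max", max), ("mean", mean)):
            src = stat([r[numeric_field] for r in source])
            cap = stat([r[numeric_field] for r in captured])
            if not math.isclose(src, cap):
                issues.append(f"{label} of {numeric_field}: source={src} captured={cap}")

    # 3. Validate that field formats are consistent.
    for field, pattern in format_checks.items():
        for record in captured:
            if not re.fullmatch(pattern, record[field]):
                issues.append(f"format of {field}: unexpected value {record[field]!r}")

    # 4. Compare the longest text value, a rough proxy for character limits.
    src_len = max((len(r[text_field]) for r in source), default=0)
    cap_len = max((len(r[text_field]) for r in captured), default=0)
    if cap_len < src_len:
        issues.append(f"length of {text_field}: source max={src_len} captured max={cap_len}")

    return issues


source = [
    {"id": "A001", "name": "Alexandria Imports", "amount": 100.0},
    {"id": "A002", "name": "Bo", "amount": 250.5},
    {"id": "A003", "name": "Cy", "amount": 75.0},
]
# A faulty capture: one record lost, one ID malformed, one name truncated.
captured = [
    {"id": "A001", "name": "Alexandria", "amount": 100.0},
    {"id": "A2", "name": "Bo", "amount": 250.5},
]

issues = validate_capture(source, captured, "amount", "name", {"id": r"[A-Z]\d{3}"})
for issue in issues:
    print(issue)
```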
Data integration
when data is sourced externally, it is critical to design the data architecture to ensure that the data pipeline is integrated with the target location/database
Data Encryption
The sensitivity of the data and the need to protect its integrity generally require encryption both while the data is in transit and while it is in storage
Synthesis
A bridge between preparation and usage. Once you have determined how you intend to use the captured data, you can create calculated fields to prepare that data for quicker usage and analysis
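As a small illustration of calculated fields, the sketch below derives totals and margins once so that later analysis does not have to recompute them. The field names (quantity, unit_price, unit_cost, line_total, gross_margin, margin_pct) and the sample orders are hypothetical.

```python
def add_calculated_fields(order: dict) -> dict:
    """Return a copy of the order with derived fields added."""
    line_total = order["quantity"] * order["unit_price"]
    gross_margin = line_total - order["quantity"] * order["unit_cost"]
    return {
        **order,
        "line_total": line_total,
        "gross_margin": gross_margin,
        # Guard against division by zero for zero-value lines.
        "margin_pct": gross_margin / line_total if line_total else 0.0,
    }


orders = [
    {"order_id": 1, "quantity": 3, "unit_price": 20.0, "unit_cost": 12.5},
    {"order_id": 2, "quantity": 10, "unit_price": 4.5, "unit_cost": 3.0},
]

synthesized = [add_calculated_fields(o) for o in orders]
```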
Analytics and Usage
focuses on the data being useful inside the company, not on sharing it with external users
Publication
sharing data with users outside the company, for example sending monthly statements to clients, publishing financial statements, and sending quotes to customers
Archival
data sets are moved from active systems to passive systems for archiving to free up storage resources for the active systems
Purging
the end of the life cycle occurs when the data is completely removed from the company’s storage system
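The last two stages, archival and purging, can be pictured as an age-based rule. The sketch below is an assumption-laden example: the function name archive_and_purge, the record layout, and the thresholds (archive after 365 days, purge after 2,555 days) are hypothetical and do not reflect any particular retention policy.

```python
from datetime import date


def archive_and_purge(records, today, archive_after_days, purge_after_days):
    """Split records into (active, archived); records past the purge age are dropped."""
    active, archived = [], []
    for record in records:
        age_days = (today - record["created"]).days
        if age_days > purge_after_days:
            continue  # purged: removed from storage entirely
        if age_days > archive_after_days:
            archived.append(record)  # moved to the passive system
        else:
            active.append(record)  # stays in the active system
    return active, archived


records = [
    {"id": 1, "created": date(2023, 12, 1)},
    {"id": 2, "created": date(2022, 1, 1)},
    {"id": 3, "created": date(2015, 1, 1)},
]

active, archived = archive_and_purge(
    records,
    today=date(2024, 1, 1),
    archive_after_days=365,
    purge_after_days=2555,
)
```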
Extract, Transform, and Load
When data already exists, whether that data is internal or external, the data must be extracted from its original source, transformed into useful information, and loaded into the tool you choose to use for analysis
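A minimal extract, transform, and load sketch using only the Python standard library is shown below. The CSV text, the invoices table, and the rule of skipping records without a customer are all assumptions made for illustration; an in-memory SQLite database stands in for the analysis tool.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice this would come from a file or system export.
RAW_CSV = """invoice_id,customer,amount
1001, acme corp ,"1,200.50"
1002,Globex,300
1003,,45.25
"""


def extract(text):
    """Extract: read rows from the original source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: fix types and formatting, and skip incomplete records."""
    cleaned = []
    for row in rows:
        customer = row["customer"].strip().title()
        if not customer:
            continue  # assumption: records without a customer are not usable
        amount = float(row["amount"].replace(",", ""))
        cleaned.append((int(row["invoice_id"]), customer, amount))
    return cleaned


def load(rows, conn):
    """Load: write the transformed rows into the target database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoices "
        "(invoice_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO invoices VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded_count, loaded_total = conn.execute(
    "SELECT COUNT(*), SUM(amount) FROM invoices"
).fetchone()
```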
Active Data Collection
when you directly ask your users for data. This can come from survey or interview results, as well as from forms that gather personal information such as users' email addresses and phone numbers