Wk3:Chap3 - Data management, big data analytics, and records management Flashcards
Databases
- Collections of data sets or records stored in a systematic way.
- Store data generated by business apps, sensors, operations, and transaction-processing systems (TPS).
Data Warehouses
Integrate data from multiple databases and data silos, and organize them for complex analysis, knowledge discovery, and decision support.
Data Marts
- Small-scale data warehouses that support a single function or one department.
- Enterprises that cannot afford to invest in data warehousing may start with one or more data marts.
Business intelligence (BI)
Tools and techniques that process data and conduct statistical analysis for insight and discovery.
Database Management System (DBMS)
- Integrates with data collection systems such as TPS and business applications.
- Stores data in an organized way.
- Provides facilities for accessing and managing data.
Relational Database Management System (RDBMS)
Provides access to data using a declarative language.
Declarative Language
- Simplifies data access by requiring users to specify only what data they want, not how the data will be retrieved.
- Structured Query Language (SQL) is an example of a declarative language:
SELECT column_name(s)
FROM table_name
WHERE condition
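As a minimal sketch of the declarative style, the snippet below runs a SELECT ... FROM ... WHERE query through Python's built-in sqlite3 module. The `customers` table, its columns, and its rows are hypothetical, invented only for illustration. The query states which rows are wanted; the database engine decides how to find them.

```python
import sqlite3

# In-memory database; the table and its rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ana", "Sydney"), ("Ben", "Perth"), ("Cho", "Sydney")],
)

# SELECT column_name(s) FROM table_name WHERE condition
rows = conn.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY name",
    ("Sydney",),
).fetchall()

# Each row comes back as a tuple; unpack the single selected column.
sydney_names = [name for (name,) in rows]
print(sydney_names)
conn.close()
```

Nothing in the query says whether to scan the table or use an index; that choice is left to the DBMS.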
DBMS Functions
- Data filtering and profiling: check for errors, inconsistencies, and redundancies.
- Data integrity and maintenance: correct and verify the consistency of data.
- Data synchronization: integrate and match data from different sources.
- Data security: check and control data integrity over time.
- Data access: provide authorised access to data.
Latency
The delay or time elapsed between when data is created and when it is available for reporting.
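A small worked example of this definition, assuming each record carries a creation timestamp and the time it became available for reporting. The function name and the timestamps are hypothetical.

```python
from datetime import datetime


def data_latency_seconds(created_at: datetime, available_at: datetime) -> float:
    """Seconds elapsed between data creation and availability for reporting."""
    if available_at < created_at:
        raise ValueError("data cannot be available before it is created")
    return (available_at - created_at).total_seconds()


# Hypothetical case: a sale recorded at 09:00 appears in reports at 09:15.
created = datetime(2024, 1, 15, 9, 0, 0)
available = datetime(2024, 1, 15, 9, 15, 0)
latency = data_latency_seconds(created, available)
print(latency)  # 900.0 seconds, i.e. 15 minutes
```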
Online Transaction Processing (OLTP)
- DBMSs record and process transactions and support queries.
- Designed to manage volatile transaction data; OLTP designs break complex information down into simpler data tables to balance transaction-processing efficiency and query efficiency.
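To illustrate transaction processing, here is a minimal sketch using Python's built-in sqlite3 module. The `accounts` table, the `transfer` helper, and the amounts are assumptions made for the example, not part of the course material. The point it shows is that a transaction's updates either all apply or none do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, source_id, target_id, amount):
    """Move funds as one transaction: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, target_id),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, source_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was undone too.
        return False


ok = transfer(conn, 1, 2, 30)          # succeeds
rejected = transfer(conn, 1, 2, 500)   # would overdraw account 1, so it is rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall())
print(ok, rejected, balances)
```

The second transfer credits account 2 before the debit fails, so the unchanged balance of account 2 afterwards shows the rollback at work.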
Online Analytical Processing (OLAP)
- A means of organizing large business databases.
- Divided into one or more cubes that fit the way business is conducted.
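One way to picture a cube is as a set of facts that can be summed along any combination of dimensions. The sketch below is a simplified illustration in plain Python, not how an OLAP server is actually implemented; the dimensions (region, product, quarter) and the sales figures are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, product, quarter, amount).
DIMENSIONS = ("region", "product", "quarter")
facts = [
    ("East", "Laptop", "Q1", 1200),
    ("East", "Phone", "Q1", 800),
    ("West", "Laptop", "Q1", 1500),
    ("East", "Laptop", "Q2", 900),
    ("West", "Phone", "Q2", 700),
]


def roll_up(facts, by):
    """Sum the sales amount over the chosen dimensions."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # the last column is the measure
    return dict(totals)


by_region = roll_up(facts, ["region"])
by_region_quarter = roll_up(facts, ["region", "quarter"])
print(by_region)
print(by_region_quarter)
```

Choosing different dimensions for `by` corresponds to viewing the same cube from a different business angle, such as by region or by region and quarter.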
Dirty Data
- Lacks integrity/validation and reduces user trust.
- Incomplete, out of context, outdated, inaccurate, inaccessible, or overwhelming.
- Creates the need for integrity checks.
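A minimal sketch of what an integrity check might look like, flagging incomplete, inaccurate, and outdated records. The field names, the email test, and the 90-day threshold are all assumptions chosen for illustration.

```python
from datetime import date

REQUIRED_FIELDS = ("customer_id", "email", "updated_on")  # hypothetical schema


def integrity_issues(record, today, max_age_days=90):
    """Return the problems found in one record; an empty list means it passed."""
    issues = []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            issues.append(f"missing {field}")
    email = record.get("email")
    if email and "@" not in email:
        issues.append("invalid email")
    updated = record.get("updated_on")
    if updated and (today - updated).days > max_age_days:
        issues.append("outdated")
    return issues


today = date(2024, 3, 15)
clean = {"customer_id": 1, "email": "a@example.com", "updated_on": date(2024, 3, 1)}
dirty = {"customer_id": 2, "email": "not-an-email", "updated_on": date(2023, 1, 1)}
incomplete = {"customer_id": 3, "email": "", "updated_on": None}

clean_issues = integrity_issues(clean, today)
dirty_issues = integrity_issues(dirty, today)
incomplete_issues = integrity_issues(incomplete, today)
print(clean_issues, dirty_issues, incomplete_issues)
```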
Data Life Cycle
- Model illustrating how data travel throughout an organisation.
- Principle of Diminishing Data Value
  - The value of data diminishes as they age.
  - Blind spots (lack of data availability) of 30 days or longer inhibit peak performance.
  - Global financial services institutions rely on near-real-time data for peak performance.
- Principle of 90/90 Data Use
  - As much as 90 percent of data are seldom accessed after 90 days (except for auditing purposes).
  - Roughly 90 percent of data lose most of their value after 3 months.
- Principle of Data in Context
  - The capability to capture, process, format, and distribute data in near real time or faster requires a huge investment in data architecture.
  - The investment can be justified on the principle that data must be integrated, processed, analyzed, and formatted into “actionable information.”
Master Reference File and Data Entities:
As data volumes explode, database performance degrades.
Solution: master data and master data management (MDM); see Chapter 2.
MDM processes integrate data from a variety of sources to create a more complete view of an entity.
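The idea of a more complete view can be sketched as merging partial records about one entity. This is a simplified illustration only: the source systems (CRM, billing, support), the field names, and the "first non-empty value wins" rule are assumptions, and real MDM tools apply far richer matching and survivorship rules.

```python
def build_master_record(records):
    """Merge partial records for one entity; the first non-empty value per field wins."""
    master = {}
    for record in records:
        for field, value in record.items():
            if value not in (None, "") and field not in master:
                master[field] = value
    return master


# Hypothetical records for the same customer held in three separate systems.
crm = {"customer_id": 42, "name": "Dana Lee", "email": None}
billing = {"customer_id": 42, "email": "dana@example.com", "address": "1 Main St"}
support = {"customer_id": 42, "name": "D. Lee", "phone": "555-0100"}

master = build_master_record([crm, billing, support])
print(master)
```

No single source holds the name, email, address, and phone together; the merged record does.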
Market share
Percentage of total sales in a market captured by a brand, product, or company.
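As a quick worked example of the definition, market share is a company's sales divided by total market sales, times 100. The function name and the sales figures below are hypothetical.

```python
def market_share_percent(company_sales: float, total_market_sales: float) -> float:
    """Share of total market sales captured by one company, as a percentage."""
    if total_market_sales <= 0:
        raise ValueError("total market sales must be positive")
    return 100.0 * company_sales / total_market_sales


# Hypothetical figures: 250,000 in sales within a 1,000,000 market.
share = market_share_percent(250_000, 1_000_000)
print(share)  # 25.0
```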