General Enterprise Data Flow Flashcards
How does data get transformed into information in an enterprise?
Data flows from data sources into the enterprise data warehouse, then on to reporting and analysis tools
The three-system landscape
Data sources
Enterprise data warehouse
Reporting and analysis tools
Three data sources
ERP systems
Other databases
Flat files
All of an enterprise's data is fed into this, and all reports are obtained directly from it
ERP systems
They are still machines with limited capacity and processing power
ERP systems
They might not be able to address an enterprise's needs as the company grows larger
ERP systems
It makes extensive use of master data to help keep track of business partners and items
ERP systems
When a branch cannot use the ERP system, this serves as the database that records all transactions for the day
Other databases
It might be physically impossible to connect a branch of the company to the corporate network, so the branch will use this
Other databases
In some instances a branch is in such a remote location that no internet connection is available, so transactions are recorded in this
Flat files
These are usually Excel or delimited text files that business users create in order to make their own reports when needed
Flat files
Delimited text files are usually either
Tab-delimited
Comma-separated values (CSV)
Delimited text files can still be opened in Excel, but this kind of file might need a few extra steps before it can be read
Tab-delimited files
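A minimal sketch of how the two kinds of delimited text file differ, using Python's standard `csv` module. The sample rows and the `sku`/`qty` columns are made up for illustration; a real flat file would be read with `open(path, newline="")` instead of an in-memory string.

```python
import csv
import io

# Hypothetical sample data standing in for two flat files with the same records.
csv_text = "sku,qty\nA100,3\nB200,5\n"
tsv_text = "sku\tqty\nA100\t3\nB200\t5\n"

def read_delimited(text, delimiter):
    """Parse delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

csv_rows = read_delimited(csv_text, ",")
tsv_rows = read_delimited(tsv_text, "\t")

# Both inputs carry the same records; only the delimiter differs.
print(csv_rows == tsv_rows)
```

Note that every value is read as text, so numeric columns such as `qty` still need converting before any calculation.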
This is built in order to consolidate the disparate data sources so that only the data necessary for reporting will actually be used
Enterprise data warehouse
This is a tool to help build data warehouses
SAP business warehouse
This is essentially a large database, so it is likely that technical column names are still used instead of more common ones
Enterprise data warehouse
It is set up as a sort of translator so that business users can immediately understand what the data is
Semantic layer
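The translator idea can be sketched as a simple lookup from technical column names to business-friendly labels. The column names, labels, and the `to_business_view` helper below are all hypothetical; an actual semantic layer is configured inside the BI platform rather than hand-written like this.

```python
# Hypothetical technical column names and their business-friendly labels.
SEMANTIC_LAYER = {
    "CUS_ID": "Customer ID",
    "PROD_ID": "Product ID",
    "TXN_AMT": "Sales Amount",
}

def to_business_view(row, mapping=SEMANTIC_LAYER):
    """Rename technical columns to business labels; unmapped columns pass through unchanged."""
    return {mapping.get(col, col): value for col, value in row.items()}

warehouse_row = {"CUS_ID": 42, "PROD_ID": "A100", "TXN_AMT": 19.99}
business_row = to_business_view(warehouse_row)
print(business_row)
```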
This is one of the defining features of business analytics today
Self-service BI
These are created with an easy-to-understand interface so that the business user will be empowered to create their own reports
Reporting tools
The three-tier architecture
Development (DEV)
Quality assurance (QAS)
Production (PRD)
This is the most critical of the three as it contains live data
Production
It is the system that is used in the day-to-day transactions of the company
Production
A lot of redundancy might be required for this landscape as it is needed for the proper functioning of the enterprise
Production
When a new report needs to be created or a change in configuration needs to be made, it should be done here first
Development
This is essentially a copy of production that is kept separate from the other three landscapes
Disaster recovery
It will act as a contingency when production becomes subject to catastrophic failure
Disaster recovery
The five data reliability issues
Inconsistent terminology
Rounding errors and truncation
Nulls and zeros
Incorrect inputs
Outright data discrepancies
One department might refer to an SKU as a product, while another might refer to it as a material
Inconsistent terminology
It concerns the number of decimal places a given piece of numeric data has
Rounding errors
It has the same effect as rounding errors, but instead of the number being rounded, decimal places are omitted outright
Truncation
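A small worked example of how rounding and truncation can each shift a total, using Python's standard `decimal` module. The three prices are invented, and storing values to two decimal places is an assumption made for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_DOWN

# Hypothetical unit prices held with three decimal places at the source.
prices = [Decimal("1.005"), Decimal("2.675"), Decimal("3.999")]
cent = Decimal("0.01")

# Rounding keeps two decimal places and adjusts the last digit.
rounded = [p.quantize(cent, rounding=ROUND_HALF_UP) for p in prices]
# Truncation keeps two decimal places and simply drops the rest.
truncated = [p.quantize(cent, rounding=ROUND_DOWN) for p in prices]

exact_total = sum(prices)
rounded_total = sum(rounded)
truncated_total = sum(truncated)

print(exact_total, rounded_total, truncated_total)
```

With these values the exact total is 7.679, the rounded total is 7.69, and the truncated total is 7.66, so two reports built from the same source can disagree by a few cents.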
This is where the concept of "garbage in, garbage out" is very apparent
Incorrect input
It is the first data model that can be fully described mathematically
Relational model
It is the most common way to store and access enterprise data as it uses some form of structured query language
Relational model
It records all transactions that are entered into the system
TXN
It stores all customer information
CUS_MAS
It stores all product information
PROD_MAS
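A minimal sketch of how the three tables could relate, using Python's built-in `sqlite3` module as a stand-in for an enterprise database. The table names come from the cards above; every column name and sample value is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Table names come from the cards; every column name is an assumption.
    CREATE TABLE CUS_MAS (cus_id INTEGER PRIMARY KEY, cus_name TEXT);
    CREATE TABLE PROD_MAS (prod_id INTEGER PRIMARY KEY, prod_name TEXT);
    CREATE TABLE TXN (
        txn_id INTEGER PRIMARY KEY,
        cus_id INTEGER,
        prod_id INTEGER,
        qty INTEGER
    );
    INSERT INTO CUS_MAS VALUES (1, 'Acme Corp');
    INSERT INTO PROD_MAS VALUES (10, 'Widget');
    INSERT INTO TXN VALUES (100, 1, 10, 3);
""")

# Join the transaction to its customer and product descriptions.
rows = conn.execute("""
    SELECT t.txn_id, c.cus_name, p.prod_name, t.qty
    FROM TXN AS t
    JOIN CUS_MAS AS c ON c.cus_id = t.cus_id
    JOIN PROD_MAS AS p ON p.prod_id = t.prod_id
""").fetchall()
print(rows)
```

The transaction row stores only identifiers; the readable customer and product names are looked up from the master tables at query time.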
It is a representation of the abstract structure of domain information
Schema or logical data model
It is often expressed as a diagram and is used as the foundation for designing database structures
Schema or logical data model
It is the simplest approach used in designing an enterprise data warehouse
Star schema
It consists of a fact table referencing any number of dimension tables
Star schema
It records measurements for a specific event
Fact table
These are typically referred to as transaction tables that contain very granular numeric data
Fact table
Fact table is also referred to as
Transaction tables
It contains fewer records than fact tables
Dimension table
They do not contain transactions; rather, they contain descriptive information such as customer names, addresses, and ages
Dimension table
The data contained in the dimension table is sometimes referred to as
Master data
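The fact and dimension cards above can be sketched as a tiny star schema with one fact table and one dimension table, again using Python's built-in `sqlite3` module. All table names, column names, and rows here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- All table and column names here are hypothetical.
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE fact_sales (
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        quantity INTEGER,
        amount REAL
    );
    INSERT INTO dim_customer VALUES (1, 'Acme Corp', 'Manila'), (2, 'Beta Inc', 'Cebu');
    INSERT INTO fact_sales VALUES (1, 2, 20.0), (1, 1, 10.0), (2, 5, 50.0);
""")

# The fact table holds the granular numeric rows; the dimension holds fewer, descriptive rows.
fact_count = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
dim_count = conn.execute("SELECT COUNT(*) FROM dim_customer").fetchone()[0]

# Measures come from the fact table; the grouping label comes from the dimension.
totals = conn.execute("""
    SELECT d.city, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_customer AS d ON d.customer_key = f.customer_key
    GROUP BY d.city
    ORDER BY d.city
""").fetchall()
print(fact_count, dim_count, totals)
```

A fuller star schema would add more dimension tables (product, date, and so on) around the same fact table, each joined the same way.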
This ensures that each row of data within the table is unique
Keys
Keys can be
Primary or foreign
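A minimal sketch of both kinds of key in action, using Python's built-in `sqlite3` module. A primary key rejects a duplicate row identifier, and a foreign key rejects a reference to a row that does not exist in the other table. The `customer` and `orders` tables and the `is_rejected` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
    )
""")
conn.execute("INSERT INTO customer VALUES (1, 'Acme Corp')")
conn.execute("INSERT INTO orders VALUES (100, 1)")

def is_rejected(sql):
    """Return True if the database refuses the statement on a key constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Reusing customer_id 1 violates the primary key.
duplicate_primary_key = is_rejected("INSERT INTO customer VALUES (1, 'Duplicate')")
# Customer 999 does not exist, so the foreign key is violated.
missing_foreign_key = is_rejected("INSERT INTO orders VALUES (101, 999)")
print(duplicate_primary_key, missing_foreign_key)
```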