ITEC79 (Ma'am Karen) Flashcards
is composed of individual discrete facts that capture the
descriptive, quantitative, and qualitative values of business
interests.
Data
produced by corporate
applications, such as the one used to fill customer orders
for its products or the one used to manage financial
transactions.
Run-the-Business Data
built to improve the quality
of and synchronize two or more applications, such as a
master list of customers.
Integrate-the-Business Data
presented to end users for
reporting and decision support, such as financial
dashboards.
Monitor-the-Business Data
is an organized collection of data presented
in a specific and meaningful way.
Information
encompasses the familiarity, awareness,
understanding, and perceptions of a person about a given
subject.
Knowledge
is the process of doing something
Action
is a subject-oriented, integrated, non-volatile,
time-variant collection of data in support of management’s decisions.
- is a powerful database model that
significantly enhances the user’s ability to quickly analyze
large, multidimensional data sets. It cleanses and
organizes data to allow users to make business decisions
based on facts.
◼ is a collection of integrated, subject-oriented
databases designed to support the decision
support function where each unit of data is relevant to
some moment of time.
data warehouse
refers to de-duplicating information and
merging it from many sources into one consistent definition
Integrated Data
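A minimal sketch of what de-duplicating and merging could look like, assuming two hypothetical sources (`crm_rows`, `billing_rows`) that identify a customer by email. The field names and the "first non-empty value wins" rule are illustrative assumptions, not part of the course material.

```python
# Hypothetical customer records from two source systems (illustrative data only).
crm_rows = [
    {"email": "Ana@Example.com ", "name": "Ana Cruz", "phone": None},
    {"email": "ben@example.com", "name": "Ben Reyes", "phone": "555-0101"},
]
billing_rows = [
    {"email": "ana@example.com", "name": "Ana Cruz", "phone": "555-0199"},
]


def integrate(*sources):
    """Merge rows from several sources into one record per normalized email."""
    merged = {}
    for rows in sources:
        for row in rows:
            key = row["email"].strip().lower()  # one consistent key definition
            record = merged.setdefault(key, {"email": key})
            for field, value in row.items():
                # Assumed rule: keep the first non-empty value seen per field.
                if field != "email" and value and not record.get(field):
                    record[field] = value
    return list(merged.values())


customers = integrate(crm_rows, billing_rows)
for customer in customers:
    print(customer)
```

The two "Ana" rows collapse into a single record because their emails normalize to the same key, and the missing phone number is filled in from the second source.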
is a way of storing data and creating information through
leveraging data marts.
Data Warehousing
are segments or categories
of information and/or data that are grouped together to provide
insights into that segment or category.
Data Mart
is the leveraging of a data warehouse to help make business
decisions and recommendations.
Business Analytics
A system that keeps track of an organization’s daily
transactions and updates the warehouse at periodic intervals
On-Line Transaction Processing
A technology that uses a multi-dimensional view of aggregate
data to provide quick access to strategic information for further
analysis
Enables end-users to perform ad-hoc analysis of data in
multiple dimensions, thereby providing the insight and
understanding they need for better decision making
OLAP: On-Line Analytical Processing
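To make the "multi-dimensional view of aggregate data" idea concrete, here is a small sketch that rolls up one measure along different dimensions. The sales rows, the `DIMENSIONS` tuple, and the `roll_up` helper are hypothetical; a real OLAP tool would do this over a cube, not a Python list.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, product, quarter, amount).
sales = [
    ("North", "Laptop", "Q1", 1200),
    ("North", "Phone", "Q1", 800),
    ("South", "Laptop", "Q1", 900),
    ("North", "Laptop", "Q2", 1500),
]
DIMENSIONS = ("region", "product", "quarter")


def roll_up(rows, by):
    """Sum the amount measure grouped by the chosen dimensions."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(int)
    for row in rows:
        group = tuple(row[i] for i in positions)
        totals[group] += row[-1]  # the last column is the measure
    return dict(totals)


by_region = roll_up(sales, by=("region",))
by_region_quarter = roll_up(sales, by=("region", "quarter"))
print(by_region)
print(by_region_quarter)
```

Changing the `by` argument is the ad-hoc part: the same data can be viewed by region, by product, by quarter, or by any combination of them.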
A subject-oriented, integrated, volatile, current-valued data
store containing only corporate detailed data
ODS: Operational Data Store
of a data warehouse can
be legacy data sources. Data
is either pushed or pulled
into the landing area in a pre-determined
format from respective source systems.
Source Systems
is a volatile intermediate
area for operational data before
transformation takes place.
Landing Area
is a place where you
hold temporary tables on the data
warehouse server.
Staging Area
is defined as standardizing and
consolidating customer and/or business
data.
Data quality
is a set of software design patterns
used to determine the data that
has changed in a database so that
action can be taken on the
changed data.
- occurs mostly in data
warehouse environments
Change Data Capture (CDC)
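One simple way to picture CDC is a snapshot comparison: diff yesterday's copy of a table against today's and emit only the changed rows. This is a sketch of that one pattern under assumed data (`yesterday`, `today`, keyed by a hypothetical primary key); other CDC patterns rely on timestamps, triggers, or the database log instead.

```python
def capture_changes(previous, current):
    """Compare two snapshots keyed by primary key and list what changed."""
    changes = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))
        elif previous[key] != row:
            changes.append(("update", key, row))
    for key, row in previous.items():
        if key not in current:
            changes.append(("delete", key, row))
    return changes


# Hypothetical snapshots of a customer table (illustrative data only).
yesterday = {
    1: {"name": "Ana", "city": "Cavite"},
    2: {"name": "Ben", "city": "Manila"},
}
today = {
    1: {"name": "Ana", "city": "Laguna"},
    3: {"name": "Cara", "city": "Cebu"},
}

changes = capture_changes(yesterday, today)
for change in changes:
    print(change)
```

Only the three changed rows would need to be sent on to the warehouse, which is the "action taken on the changed data" in the definition.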
This involves analysis of metadata and
data values, and detection of
differences between defined and
inferred properties.
Analyze the Data
is a process to assess
current data conditions, or to monitor
data quality over time. It begins with
collecting measurements about data,
and then looking at the results
individually and in various combinations
to see where anomalies exist.
Profile the Data / Data profiling
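As a rough illustration of "collecting measurements about data", the sketch below computes a few basic statistics for one column. The `orders` rows and the choice of measurements (missing count, distinct count, minimum, maximum) are assumptions made for the example.

```python
def profile(rows, column):
    """Collect basic measurements for one column of a list of dict rows."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


# Hypothetical order lines (illustrative data only).
orders = [
    {"order_id": 1, "qty": 2},
    {"order_id": 2, "qty": None},
    {"order_id": 3, "qty": 500},
    {"order_id": 4, "qty": 2},
]

qty_profile = profile(orders, "qty")
print(qty_profile)
```

Looking at the results together is what surfaces anomalies: one missing quantity and a maximum of 500 next to typical values of 2 are both worth a closer look.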
is the act of detecting and
correcting (or removing) corrupt or
inaccurate records from a record set,
table, or database.
Cleanse the Data / Data cleansing
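A small sketch of detecting and correcting (or removing) bad records. The validation rules here (a non-empty name and an age between 0 and 120) are assumptions chosen only to show the idea; real cleansing rules come from the business.

```python
def cleanse(rows):
    """Correct fixable records and set aside the ones that cannot be repaired."""
    clean, rejected = [], []
    for row in rows:
        # Correct: trim stray whitespace and normalize capitalization.
        name = (row.get("name") or "").strip().title()
        try:
            age = int(row.get("age"))
        except (TypeError, ValueError):
            age = None
        # Detect: assumed rules, a name must exist and age must be plausible.
        if name and age is not None and 0 <= age <= 120:
            clean.append({"name": name, "age": age})
        else:
            rejected.append(row)
    return clean, rejected


# Hypothetical raw records (illustrative data only).
raw = [
    {"name": "  ana cruz ", "age": "34"},
    {"name": "Ben Reyes", "age": "-5"},
    {"name": "", "age": "40"},
    {"name": "Cara Lim", "age": "n/a"},
]

clean, rejected = cleanse(raw)
print(clean)
print(len(rejected), "records rejected")
```

Keeping the rejected rows instead of silently dropping them lets someone review why they failed.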
This involves integration and
consolidation of data from various
source systems to form a single system
of record.
Integrate the Data
transforms
different input formats into a
consolidated output format; helps in
creating single domain fields and
incorporating business and industry
standards.
Standardize the Data
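Standardizing can be pictured as mapping several input formats onto one output format. The sketch below does this for dates, assuming three hypothetical input formats listed in `KNOWN_FORMATS` and ISO 8601 (`YYYY-MM-DD`) as the consolidated output.

```python
from datetime import datetime

# Assumed input formats found across source systems (illustrative only).
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def standardize_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next known format
    return None


inputs = ("2024-03-05", "03/05/2024", "05.03.2024", "soon")
dates = [standardize_date(text) for text in inputs]
print(dates)
```

The first three inputs all describe the same day in different formats and come out identical; the value that matches no known format is returned as `None` so it can be handled separately.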
is a subject-oriented, integrated, volatile,
current-valued, detailed-only collection of data
in support of an organization’s need for
up-to-the-second, operational, integrated, collective
information
Operational Data Store
Is a body of decision-support data for a
department that has an architectural foundation
of a data warehouse; can also represent a
business process that can proliferate across
many departments
Data Mart
is defined as the extensive use of
data, statistical and quantitative analyses,
explanatory and predictive modeling, and
fact-based management to drive decision
making.
Analytics
provides information about
the past state or performance of a business and
its environment. It provides regular reports for
events that already happened and ad hoc
reports to help examine facts about what
happened, where, how often, and with how
many.
Descriptive analytics
helps predict (based on
data and statistical techniques) with confidence
what will happen next so that you can make
well-informed decisions and improve business
outcomes. It uses simulation models to suggest
what could happen.
Predictive analytics
recommends high-value
alternative actions or decisions given a complex
set of targets, limits, and choices. It predicts
future outcomes and suggests courses of
actions to take so that you can benefit from
those predictions.
Prescriptive analytics
is “data about data.” It refers to data that tries to
describe a data set in terms of its value, content, quality,
and significance.
Metadata
a data warehouse may be as broad as all the
informational data for the entire enterprise from the beginning
of time, or it may be as narrow as a personal data warehouse
for a single manager for a single year.
Scope
allow end users to get at operational
databases directly; it provides the ultimate in flexibility as well
as the minimum amount of redundant data that must be loaded
and maintained.
Virtual data warehouses
are single physical databases that
contain all data for a specific functional area, department,
division, or enterprise.
Central data warehouses
are those in which certain
components are distributed across a number of different
physical databases.
Distributed data warehouses
Is the specification of data structures and business rules to
represent business requirements
Data Model
Is a structured approach used to identify major components of an
information system’s specifications
◼ Is the process used to analyze the data, identify the relationships, and,
ultimately, create the data model
Data Modeling
is a structured business view
of the data required to support current business processes,
business events, and related performance measures
It is a single integrated data structure which reflects the
structure of business functions rather than the processing flow
or the physical arrangement of data
Conceptual Data Model (CDM)
builds upon the business
requirements and includes a further level of detail that supports
both the business and system requirements
Logical Data Model (LDM)
is specific to the software and
performance constraints of the specific database management
system to be used in the implementation
Physical Data Model (PDM)
is a logical design technique for structuring data so that it’s
intuitive to business users and delivers fast query performance.
- is widely accepted as the preferred approach for data
warehouse presentation.
Dimensional Modeling
is quite different from dimensional modeling; it is
a design technique that seeks to eliminate data redundancies.
Normalized modeling
are captured by the organization’s
business processes and their supporting operational
source systems.
- are usually numeric values; we refer to
them as facts.
Measurements
are surrounded by largely textual context that is true
at the moment the fact is recorded.
- serve as the Key Performance
Indicators (KPIs) of the organization.
Fact
is the business
definition of the measurement event that produces the
fact row.
fact table’s grain (granularity)
provide descriptive information about the fact.
- are composed of
attributes which are used for filtering
or labeling data within data
warehouse queries.
Dimensions
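The fact, grain, and dimension cards fit together in a query like the one sketched below, using Python's built-in `sqlite3` module. The table names (`fact_sales`, `dim_product`), columns, and rows are hypothetical, and the grain is assumed to be one fact row per product per sale date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension: descriptive attributes used for filtering and labeling.
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
-- Fact: numeric measurements. Assumed grain: one row per product per sale date.
CREATE TABLE fact_sales (product_key INTEGER, sale_date TEXT, quantity INTEGER, amount REAL);

INSERT INTO dim_product VALUES (1, 'Laptop', 'Computers'), (2, 'Mouse', 'Accessories');
INSERT INTO fact_sales VALUES
    (1, '2024-01-05', 1, 1200.0),
    (2, '2024-01-05', 3, 45.0),
    (1, '2024-01-06', 2, 2400.0);
""")

# The dimension attribute filters and labels; the facts are summed.
rows = conn.execute("""
    SELECT d.category, SUM(f.quantity), SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_key = f.product_key
    WHERE d.category = 'Computers'
    GROUP BY d.category
""").fetchall()
print(rows)
conn.close()
```

The `WHERE` and `GROUP BY` clauses touch only dimension attributes, while the aggregated columns come from the fact table, which mirrors the split described in the cards above.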