Tutorial 6 Flashcards
Data warehouse
A data warehouse is a centralized repository that integrates data from multiple sources to
support decision-making processes. It provides a single version of the truth, ensuring
consistency and reliability in business intelligence (BI) applications.
Purpose of Data Warehousing:
●Supports strategic decision-making by consolidating data from diverse business
functions.
●Enhances reporting, analytics, and forecasting.
●Enables organizations to perform trend analysis and business intelligence (BI).
Operational Data Store (ODS):
An Operational Data Store (ODS) consolidates data from multiple transactional systems to
provide a near real-time view of current business operations.
Data Warehouse vs. Operational Data Store (each row lists the data warehouse first, then the ODS)
Data Type: Historical vs. Current, near real-time
Purpose: Strategic analysis vs. Operational reporting
Data Refresh: Periodic (daily/monthly) vs. Continuous or frequent
Storage Duration: Long-term vs. Short-term (days/weeks)
Metadata
⇒ Information that describes and provides context for other data. It helps IT personnel and
end-users understand and work with the data stored in a warehouse. It provides essential
details like:
●Where and when the data was extracted
●Data transformation rules
●Scheduled reports and queries associated with the dataset
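As a minimal sketch, the three kinds of detail listed above could be kept together in one record per dataset. The class name, field names, and sample values below are hypothetical and chosen only for illustration; real warehouse tools keep metadata in their own catalog formats.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record for one warehouse dataset."""
    source_system: str       # where the data was extracted from
    extracted_at: datetime   # when the data was extracted
    transformation_rules: list[str] = field(default_factory=list)
    scheduled_reports: list[str] = field(default_factory=list)

    def describe(self) -> str:
        """Return a one-line summary an end-user could read."""
        return (
            f"Extracted from {self.source_system} on "
            f"{self.extracted_at:%Y-%m-%d}; "
            f"{len(self.transformation_rules)} transformation rule(s), "
            f"{len(self.scheduled_reports)} scheduled report(s)."
        )


# Illustrative values only (assumed, not taken from any real system).
sales_meta = DatasetMetadata(
    source_system="crm",
    extracted_at=datetime(2024, 1, 31, 2, 0),
    transformation_rules=["convert all amounts to EUR", "trim customer names"],
    scheduled_reports=["monthly sales summary"],
)
print(sales_meta.describe())
```

Keeping the extraction details, transformation rules, and linked reports in one place is what lets both IT staff and end-users trace where a figure in a report came from.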
Key Characteristics of a Data Warehouse (Inmon, 1992):
- Subject-Oriented
●Organized around key business subjects such as sales, customers, or
inventory rather than specific processes like order entry.
- Integrated
●Combines data from multiple sources, standardizing formats, naming
conventions, and coding structures.
- Time-Variant
●Stores historical data to allow trend analysis over time.
- Non-Volatile
●Data is read-only for users.
Data warehouses can be developed using two major approaches:
- Top-Down Approach (Enterprise Data Warehouse) – Bill Inmon
- Bottom-Up Approach (Data Mart Strategy) – Ralph Kimball
- Top-Down Approach (Enterprise Data Warehouse) – Bill Inmon
●Starts with an enterprise-wide data warehouse.
●Creates dependent data marts from a single repository.
●Ensures data consistency but takes longer to implement.
- Bottom-Up Approach (Data Mart Strategy) – Ralph Kimball
●Begins with individual data marts for specific business areas.
●Marts are later integrated into an enterprise warehouse.
●Faster implementation but requires careful planning to avoid “data silos.”
Hybrid Approach to data warehouse development
Combines elements of both strategies, ensuring flexibility while maintaining data
integrity.
What is ETL (extraction, transformation, loading)?
Definition:
ETL is the process of extracting data from source systems, transforming it into a usable
format, and loading it into a data warehouse.
ETL Phases:
- Extraction – Data is pulled from multiple sources (databases, CRM, ERP, etc.).
- Transformation – Data is cleansed, standardized, and aggregated.
- Loading – Transformed data is stored in the data warehouse.
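The three phases above can be sketched as three small functions. This is only an illustration under stated assumptions: the sample rows, the function names, and the `sales_by_region` table are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical raw rows, standing in for data pulled from two source systems.
CRM_ROWS = [
    {"customer": "Alice", "amount": "120.50", "region": " north"},
    {"customer": "Bob", "amount": "80.00", "region": "North"},
]
ERP_ROWS = [
    {"customer": "Carol", "amount": "not available", "region": "SOUTH"},
    {"customer": "Dave", "amount": "45.25", "region": "south"},
]


def extract():
    """Extraction: pull rows from every source into one list."""
    return CRM_ROWS + ERP_ROWS


def transform(rows):
    """Transformation: cleanse, standardize, and aggregate by region."""
    totals = {}
    for row in rows:
        try:
            amount = float(row["amount"])  # cleanse: skip unparseable amounts
        except ValueError:
            continue
        region = row["region"].strip().lower()  # standardize region codes
        totals[region] = totals.get(region, 0.0) + amount  # aggregate
    return sorted(totals.items())


def load(conn, region_totals):
    """Loading: store the transformed rows in the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_region (region TEXT, total REAL)"
    )
    conn.executemany("INSERT INTO sales_by_region VALUES (?, ?)", region_totals)
    conn.commit()


warehouse = sqlite3.connect(":memory:")  # stand-in for the data warehouse
load(warehouse, transform(extract()))
result = warehouse.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region"
).fetchall()
print(result)
```

Keeping each phase in its own function mirrors the definition: the warehouse only ever receives rows that have already been cleansed and standardized.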
Data Marts:
A data mart is a subset of a data warehouse that focuses on a specific business unit, such
as marketing or finance.
Types of Data Marts:
●Independent Data Mart: Built directly from operational systems without a central
data warehouse.
●Dependent Data Mart: Extracts data from an existing data warehouse, ensuring
consistency.
Advantages of Data Marts:
●Faster implementation and lower cost.
●Tailored to specific business needs.
●Improved query performance due to a smaller dataset.
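A dependent data mart can be pictured as a filtered subset of the warehouse. The sketch below assumes a hypothetical `warehouse_facts` table and models the mart as a SQL view in an in-memory SQLite database; in practice a mart is often a physically separate store, so treat the view as a simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table holding facts for several business units.
conn.execute(
    "CREATE TABLE warehouse_facts (department TEXT, metric TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [
        ("marketing", "campaign_spend", 5000.0),
        ("finance", "quarterly_revenue", 120000.0),
        ("marketing", "leads_generated", 340.0),
    ],
)

# Dependent data mart: a marketing-only subset derived from the warehouse.
conn.execute(
    "CREATE VIEW marketing_mart AS "
    "SELECT metric, value FROM warehouse_facts WHERE department = 'marketing'"
)

mart_rows = conn.execute(
    "SELECT metric, value FROM marketing_mart ORDER BY metric"
).fetchall()
print(mart_rows)
```

Because the mart is derived from the warehouse rather than from the operational systems, its figures stay consistent with the central repository, and queries scan only the marketing rows.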
What is data mining?
Data mining is the process of discovering patterns, relationships, and trends in large
datasets using statistical and machine learning techniques.
Data Mining Techniques:
- Classification – Assigning data into predefined categories (e.g., spam detection).
- Clustering – Grouping similar records together (e.g., customer segmentation).
- Association Rule Mining – Identifying relationships between variables (e.g., market
basket analysis).
- Regression Analysis – Predicting numerical outcomes (e.g., sales forecasting).
- Anomaly Detection – Identifying unusual patterns (e.g., fraud detection).
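To make one of these techniques concrete, here is a minimal sketch of the two standard measures used in association rule mining, support and confidence, applied to market basket data. The baskets and function names are hypothetical; production tools use more efficient algorithms over much larger datasets.

```python
# Hypothetical shopping baskets (one set of items per transaction).
BASKETS = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Share of baskets containing the antecedent that also contain the consequent."""
    both = support(antecedent | consequent, baskets)
    return both / support(antecedent, baskets)


rule_support = support({"bread", "butter"}, BASKETS)
rule_confidence = confidence({"bread"}, {"butter"}, BASKETS)
print(f"bread -> butter: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Here three of the five baskets contain both bread and butter, and three of the four bread baskets also contain butter, which is the kind of relationship a retailer might act on.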
Applications of Data Mining:
●Retail: Recommender systems (Amazon, Netflix).
●Finance: Credit risk assessment and fraud detection.
●Healthcare: Predicting disease outbreaks and treatment effectiveness.
The evolution of IS success has been categorized into five distinct eras, each representing a
shift in how information systems were utilized and evaluated:
- Data Processing Era (1950-1960)
- Management Reporting and Decision Support Era (1960-1980)
- Strategic and Personal Computing Era (1980-1990)
- Enterprise System and Networking Era (1990-2000)
- Customer-Focused Era (2000-Present)
- Data Processing Era (1950-1960)
○Focus: Automating simple computational tasks.
○Success Measurement: Technical efficiency, speed, and accuracy.
○Users: Military and financial sectors.
- Management Reporting and Decision Support Era (1960-1980)
○Focus: Information systems for managerial decision-making.
○Success Measurement: Decision-making effectiveness and cost reduction.
○Users: Managers using structured reports and decision-support tools.
- Strategic and Personal Computing Era (1980-1990)
○Focus: Aligning IT with business strategy and increasing personal
productivity.
○Success Measurement: Strategic alignment, productivity gains, and
competitive advantage.
○Users: Employees and managers leveraging personal computing.
- Enterprise System and Networking Era (1990-2000)
○Focus: Large-scale enterprise resource planning (ERP) and networking
technologies.
○Success Measurement: System integration, operational efficiency, and net
organizational benefits.
○Users: Organizations adopting ERP, CRM, and other enterprise-wide
solutions.
- Customer-Focused Era (2000-Present)
○Focus: Enhancing customer experience and creating social value.
○Success Measurement: Customer satisfaction, user engagement, and
business intelligence.
○Users: Customers, employees, and organizations using cloud-based and
AI-driven solutions.
Escalation of commitment refers to the tendency to continue investing in failing projects due
to psychological, social, and organizational factors.
(a) Self-Justification Theory (SJT)
●Individuals tend to escalate their commitment to a course of action to justify prior behavior.
●Based on cognitive dissonance theory, individuals continue investing in failing
projects to justify past decisions.
●The need for self-justification is both psychological (to maintain self-image) and
social (to save face in front of others).
●Leaders or managers who initially championed the project feel personally responsible
and are reluctant to admit failure.
(b) Prospect Theory
●Describes cognitive biases that influence human decision-making under conditions of uncertainty.
●Individuals are more likely to take risks when faced with potential losses.
●Decision-makers continue funding failing projects to avoid immediate financial or
reputational losses, even if this leads to greater long-term losses.
●This behavior is linked to the sunk cost fallacy, where past investments influence
decision-making irrationally.
●The result is throwing ‘good money after bad’.
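The sunk cost fallacy can be illustrated with a small worked example. This is a deliberately simplified sketch: the two decision rules, their names, and all figures are hypothetical, and the "biased" rule is only one way to model the fallacy, not a formula from prospect theory itself.

```python
def should_continue(expected_future_benefit, remaining_cost, sunk_cost=0.0):
    """Forward-looking rule: continue only if future benefit exceeds future cost.

    sunk_cost is accepted but deliberately ignored, because money already
    spent cannot be recovered by either choice.
    """
    return expected_future_benefit > remaining_cost


def biased_should_continue(expected_future_benefit, remaining_cost, sunk_cost):
    """Sunk-cost-fallacy rule: treats stopping as 'losing' the money already spent."""
    return expected_future_benefit + sunk_cost > remaining_cost


# Hypothetical project figures (in thousands).
sunk = 800       # already spent, unrecoverable
remaining = 500  # still needed to finish
benefit = 300    # expected payoff if the project is finished

rational = should_continue(benefit, remaining, sunk)
biased = biased_should_continue(benefit, remaining, sunk)
extra_loss = remaining - benefit  # additional loss incurred by continuing

print(f"forward-looking rule continues: {rational}")
print(f"sunk-cost rule continues: {biased}")
print(f"extra loss from continuing: {extra_loss}")
```

With these figures the forward-looking rule stops the project, while the biased rule keeps funding it and adds a further loss on top of the money already spent, which is what throwing good money after bad means.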
(c) Agency Theory
●Explains escalation through the principal-agent problem.
●Managers (agents) may continue projects that are not in the best interests of
shareholders (principals) due to asymmetry in information and personal
incentives (e.g., career growth, bonuses, reputation).
●If a project failure reflects poorly on the agent, they may hide negative information
from executives or justify continued investments.
(d) Approach-Avoidance Theory
●Explains escalation as a conflict between driving forces (e.g., sunk costs, desire to
complete) and restraining forces (e.g., risks, negative feedback).
●As a project nears completion, individuals feel a stronger motivation to see it
through (completion effect), even if it is failing.
●This explains why organizations continue failing projects instead of cutting losses
early.
●Decision-makers weigh the cost of persistence against the cost of abandonment.