Lecture Three - Flashcards
Digital Transformation - Definition
Integration of digital technologies in various sectors, transforming traditional business models and processes.
Digital Transformation Key Drivers - World Wide Web/Internet
Foundation for global connectivity and information dissemination.
Digital Transformation Key Drivers - Cloud Computing
On-demand computing resources offering scalability and flexibility.
Digital Transformation Key Drivers - Smartphones
Ubiquitous mobile devices that facilitate connectivity and application access.
Digital Transformation Key Drivers - Internet of Things
Interconnected devices generating real-time data for analysis and automation.
Digital Transformation Key Drivers - 5G Networks
Advanced mobile communication with high-speed data transfer and low latency.
Digital Transformation Sectoral Impacts - E-commerce, FinTech, E-government
Revolutionizing business and governance with digital platforms.
Digital Transformation Sectoral Impacts - Industry 4.0/5.0
Advancing manufacturing through automation and data exchange (Cyber-Physical Systems).
Digital Transformation Sectoral Impacts - Circular Economy
Enhancing resource efficiency and sustainability through data-driven asset management.
Digital Transformation Sectoral Impacts - Smart Cities
Integrating infrastructure and services for increased operational efficiency and improved quality of life.
New Emerging Paradigms - Industry 4.0/5.0
Automation and Data Exchange: Leveraging Cyber-Physical Systems to enhance manufacturing processes and productivity.
Impact: Streamlined operations, increased production efficiency, and improved product quality.
New Emerging Paradigms - Circular Economy
Data Utilization: Tracking and managing assets to maximize value and minimize waste through continuous resource upcycling.
Impact: Promotes environmental sustainability by optimizing resource usage and lifecycle management.
New Emerging Paradigms - Smart Cities
Infrastructure Integration: Virtualization and integration of urban services and infrastructure for improved efficiency.
Impact: Enhances urban living through innovative solutions and operational insights.
New Emerging Paradigms - Digital Health
Efficient Healthcare Delivery: Application of digital technologies in medicine for developing new treatments and improving patient care.
Impact: Facilitates aged and assisted living, supports chronic disease management, and enhances patient outcomes.
Big Data - Concept
Refers to large, complex datasets that are challenging to process using traditional data processing methods.
Big Data - Attributes (The 5 V’s)
Velocity: Speed at which data is generated, collected, and processed, often in real-time.
Volume: Massive amounts of data generated from diverse sources, measured in petabytes or exabytes.
Value: Economic and strategic benefits derived from analyzing and utilizing data effectively.
Variety: Diversity of data formats and sources, including structured, semi-structured, and unstructured data.
Veracity: Accuracy, reliability, and trustworthiness of data, which can be affected by factors like data quality and biases.
Big Data - Definition
Large-scale datasets characterized by high complexity and volume, requiring advanced technologies for management and analysis.
Big Data - Key Features
High Throughput Processing: Ability to manage and analyze vast volumes of data efficiently.
Diverse Sources: Data from social media, IoT devices, transaction systems, and more, contributing to a rich but complex data ecosystem.
Big Data - Challenges
Data Management: Storing and organizing data to facilitate easy retrieval and analysis.
Data Quality: Ensuring accuracy and consistency across diverse datasets.
Analytics: Developing methodologies and tools to derive meaningful insights and drive business intelligence.
Data Storage & Processing Over Time - Historical Evolution
3,000 BC: Ancient Egypt’s use of written records for crop storage management, marking the early use of data.
Circa 1450 AD: The printing press revolutionized data dissemination through mass-produced written materials.
1940s: Advent of digital computers with data stored on magnetic tape, requiring sequential reading.
1950s: Introduction of more affordable computers and the first database systems, enabling broader data management.
1960s: Development of specialized Database Management Systems (DBMS) to enhance data organization.
1970s: Emergence of relational databases offering data independence by separating physical and logical data representations.
1980s: Geographic expansion of businesses led to increased data sources and complexity.
1990s onwards: Big Data emerged as a strategic asset, providing a competitive advantage through advanced analytics.
Data Lake - Definition
A vast repository for raw, unprocessed data without a predefined purpose.
Data Lake - Usage
Ideal for storing diverse data types until specific processing and analysis needs are identified.
Data Warehouse - Definition
A centralized repository organized in a unified data model, designed to aggregate and curate data from multiple sources.
Data Warehouse - Usage
Supports business operations by providing clean, organized data ready for analysis and reporting.
Data Lake and Data Warehouse - Key Distinctions
Data Lakes: Focus on data ingestion and flexibility, allowing for experimentation and discovery.
Data Warehouses: Emphasize data quality, consistency, and integration to support business intelligence activities.
Why Data Warehouse?
Data Redundancy: Consolidates duplicated data across multiple systems and departments, ensuring consistency.
Data Consistency: Provides standardized definitions and formats for uniform data interpretation.
Heterogeneous Data Sources: Integrates data from various sources, including relational DBMS, OLTP systems, and unstructured files.
Data Warehouse - Benefits
Strategic Decision Support: Facilitates comprehensive data analysis to inform business strategy and operations.
Enhanced Data Quality: Ensures data accuracy by addressing issues like missing data and varying formats.
Cross-Functional Analysis: Enables analysis across business functions by providing a unified view of data.
Data Warehouses - Operational Support
Curates data for specific operational systems, such as accounting and billing, ensuring relevant and accurate data is available.
Supports historical analysis by retaining data over time, even as operational systems update or delete records.
Data Warehouses - Data Stability
Data remains stable and non-volatile within the warehouse, providing a consistent historical record.
Allows for longitudinal studies and trend analysis without the risk of data loss from operational changes.
Data Warehouse - Purpose
Supports strategic decision-making and advanced analytics.
Facilitates ad-hoc queries and report generation for business intelligence.
Enables data mining to uncover hidden patterns, correlations, and trends.
Data Warehouses: Data Organisation - Key Characteristics
Subject-Oriented: Focuses on specific business subjects or domains rather than operational processes.
Integrated: Combines heterogeneous data from different sources into a coherent, unified format.
Time-Variant: Maintains historical data, allowing users to analyze changes and trends over time.
Stable: Data is non-volatile, ensuring a consistent and reliable view of business operations.
Decision Support: Structured to facilitate complex queries and analyses for informed decision-making.
Data Warehouses: Architectural Properties - Key Properties
Separation: Distinction between analytical and transactional processing, minimizing interference and maximizing efficiency.
Scalability: Architecture can easily accommodate growth in data volume and complexity through hardware and software upgrades.
Extensibility: Supports the integration of new applications and technologies without extensive system redesign.
Security: Protects sensitive strategic data with robust access controls and monitoring.
Administrability: Designed for efficient management and maintenance, ensuring operational continuity.
Complementary Concepts in Data Warehousing - Data Mart
Definition: A specialized, smaller subset of a data warehouse, focusing on specific business lines or departments.
Purpose: Provides targeted data for specific analyses, improving query performance and relevance.
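Example: one way to picture a data mart is as a department-specific slice of a warehouse table. The sketch below uses an in-memory SQLite database and a view; the table, view, and column names and the rows are hypothetical, and real data marts are often separate physical stores rather than views.

```python
import sqlite3

# Hypothetical warehouse table; all names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE warehouse_sales (department TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [("retail", "laptop", 1200.0), ("retail", "mouse", 25.0), ("wholesale", "laptop", 9000.0)],
)

# The "data mart" here is a view restricted to one department.
conn.execute(
    "CREATE VIEW retail_mart AS "
    "SELECT product, amount FROM warehouse_sales WHERE department = 'retail'"
)

retail_rows = conn.execute("SELECT product, amount FROM retail_mart ORDER BY product").fetchall()
print(retail_rows)
```

Queries against retail_mart only ever see the retail rows, which is the "targeted data" idea in miniature.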
Complementary Concepts in Data Warehousing - ETL (Extract, Transform, Load)
Extract: Selecting and exporting data from source systems.
Transform: Reformatting data to match the destination system’s requirements, including cleaning and integration.
Load: Importing transformed data into the destination system, such as a data warehouse.
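Example: the three ETL steps can be sketched with the Python standard library. This is a minimal illustration, not a production pipeline; the CSV contents, the sales table, and the policy of dropping rows with a missing amount are all assumptions.

```python
import csv
import io
import sqlite3

# Hypothetical source export: inconsistent casing, stray whitespace, a missing amount.
SOURCE_CSV = """customer,amount
 alice ,10.50
BOB,
carol,7
"""

def extract(text):
    """Extract: read raw rows from the source system's export."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: clean names, convert types, and drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = (row["amount"] or "").strip()
        if not amount:
            continue  # one possible policy for missing data; others exist
        cleaned.append((row["customer"].strip().lower(), float(amount)))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into the destination table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)
loaded = conn.execute("SELECT customer, amount FROM sales ORDER BY customer").fetchall()
print(loaded)
```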
Complementary Concepts: OLTP
OLTP (Online Transaction Processing):
Manages transaction-oriented applications focused on day-to-day operations, such as sales and inventory management.
Characteristics include fast query processing and maintaining data integrity in multi-access environments.
Complementary Concepts: OLAP
OLAP (Online Analytical Processing):
Enables complex analytical queries, supporting multi-dimensional data analysis and business intelligence.
Facilitates decision-making by providing insights into historical performance, trends, and projections.
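Example: the contrast between the two workloads can be sketched with SQLite. OLTP-style statements are short writes touching a few rows; the OLAP-style query aggregates many rows along dimensions (here region and year). The orders table and its rows are hypothetical, and a single database is used only for brevity; in practice the workloads usually run on separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)")

# OLTP-style work: short transactions that each touch a few rows.
with conn:  # commits on success, rolls back if an exception is raised
    conn.execute("INSERT INTO orders (region, year, amount) VALUES ('north', 2023, 100.0)")
    conn.execute("INSERT INTO orders (region, year, amount) VALUES ('north', 2024, 150.0)")
    conn.execute("INSERT INTO orders (region, year, amount) VALUES ('south', 2024, 80.0)")
    conn.execute("INSERT INTO orders (region, year, amount) VALUES ('south', 2024, 20.0)")

# OLAP-style work: one query that aggregates many rows along dimensions.
summary = conn.execute(
    "SELECT region, year, SUM(amount) FROM orders GROUP BY region, year ORDER BY region, year"
).fetchall()
print(summary)
```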
3-Layer Data Warehouse Architecture - Architecture Layers
Heterogeneous Sources: Diverse data inputs from operational databases and external sources.
ETL Tools: Perform data staging, including redundancy removal, consistency checks, and data normalization.
Reconciled Data: Processed and integrated data stored in a centralized repository for consistency and accessibility.
3-Layer Data Warehouse Architecture - Outputs
Data Marts: Customized subsets of the warehouse for specific business functions.
Analytical Tools: Enable data mining, visualization, and advanced analytics to extract value from stored data.
History of Software Deployment
Evolution Stages:
Monolithic Applications on Physical Machines: Traditional, single-unit applications with limited scalability.
Virtual Machine Abstraction: Enabled resource virtualization, improving flexibility and resource utilization.
Stateless & Horizontally Scalable Apps: Applications designed for distributed systems, enhancing scalability and fault tolerance.
Microservices & Containers: Modern architecture focused on modular, portable, and independent services.
Virtual Machines (VMs)
Each VM includes its own guest OS, application, binaries, and libraries, providing strong isolation at the cost of high overhead.
Allow multiple OS instances to run on a single hardware platform, with each application isolated in its own OS and resources managed per VM.
Suitable for legacy applications and environments needing strong isolation.
Containers
Share the host OS kernel and contain only the app and its dependencies, reducing overhead and improving efficiency.
Run as isolated processes, offering portability and fast deployment and scaling across diverse environments.
Ideal for microservices architectures and cloud-native applications where rapid scaling and resource efficiency are priorities.
Bare Metal
Direct deployment on physical hardware, offering maximum performance and resource control.
Best for workloads requiring dedicated resources and high performance.
Software Development Evolution - Transition
From monolithic applications with tightly coupled components to a microservices architecture that emphasizes modularity and independence.
Software Development Evolution - Monoliths
Large, single-codebase applications with all components intertwined, posing challenges in scaling and maintaining.
Often lead to bottlenecks and rigid architectures.
Software Development Evolution - Microservices
Composed of smaller, independent services, each focused on a specific functionality, enhancing agility and scalability.
Supports continuous delivery and deployment, allowing faster innovation and responsiveness to change.
Software Development Evolution - Common Services
Core functionalities are developed as reusable base applications stored in a trusted registry, ensuring consistency and security.
From Monoliths to Microservices - Before
Applications were built as monolithic units with tightly integrated components, making changes complex and risk-prone.
Central IT departments managed infrastructure and deployment processes.
From Monoliths to Microservices - After
Applications transitioned to microservices architecture, breaking down into smaller, autonomous units with clear interfaces.
Teams leverage central IT-maintained registries for infrastructure provisioning and manage their services independently.
The DevOps cycle fosters collaboration and continuous integration/deployment, promoting agility and innovation.
Blockchain - Definition
A decentralized ledger technology that records transactions across multiple computers, ensuring transparency and security.
Blockchain - Applications Beyond Bitcoin
Smart contracts: Automating legal agreements with self-executing code.
Supply chain management: Enhancing traceability and efficiency in logistics.
Identity verification: Securing digital identities and credentials.
Blockchain - Key Features
Decentralization: Removes the need for a central authority, distributing trust among participants.
Security: Utilizes cryptographic techniques to ensure data integrity and authenticity.
Transparency: Provides an immutable record of transactions accessible to all network participants.
Open Ledgers - Concepts
A shared record-keeping system where multiple parties maintain synchronized copies of a ledger.
Open Ledgers - Scenario
Friends lending money to each other and regularly reconciling transactions, highlighting the challenges of trust and accuracy.
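Example: the reconciliation step in this scenario can be sketched as netting out a list of ledger entries. The names, amounts, and the (lender, borrower, amount) entry format are hypothetical.

```python
from collections import defaultdict

# Hypothetical ledger entries: (lender, borrower, amount).
ledger = [
    ("alice", "bob", 20),
    ("bob", "carol", 35),
    ("carol", "alice", 10),
]

def net_balances(entries):
    """Return each person's net position: positive means they are owed money."""
    balances = defaultdict(int)
    for lender, borrower, amount in entries:
        balances[lender] += amount
        balances[borrower] -= amount
    return dict(balances)

balances = net_balances(ledger)
print(balances)
```

The net positions always sum to zero. The trust problem is that nothing here stops someone from adding an entry the other party never agreed to.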
Open Ledgers - Centralized Ledger Challenges
Dependence on a single trusted entity can lead to biases or errors.
Limited scalability and potential bottlenecks in transaction processing.
Open Ledgers - Decentralized Ledgers
Each participant holds a copy of the ledger, reducing reliance on a central authority.
Consensus protocols ensure data integrity and conflict resolution among participants.
Open Distributed Ledgers - Centralized Ledger Limitations
Trustworthiness issues arise when a single party manages the ledger.
Scalability challenges limit transaction throughput and responsiveness.
Open Distributed Ledgers - Decentralized Solution
Each Participant Maintains a Copy: Ensures redundancy and resilience against data loss or manipulation.
Consensus Protocol: Defines rules for synchronizing ledger copies and resolving conflicts, enhancing trust and reliability.
Challenges Addressed by Blockchain
Validity of Transactions: Cryptographic techniques verify the authenticity and legitimacy of each transaction.
Timeliness: Ensures a reliable sequence of events, recording transactions in chronological order.
Tamper-Proof: Prevents unauthorized alterations to the ledger, providing an immutable record.
Inconsistencies: Synchronizes distributed copies of the ledger, ensuring consistency and coherence.
Double Spending: Prevents the reuse of digital assets, ensuring each transaction is unique and final.
Bitcoin Whitepaper
Combined existing techniques, such as digital signatures and cryptographic hash functions, in an innovative way to address these challenges.
Asymmetric Cryptography
Uses a pair of keys, public and private, for secure communication and verification.
Private Key: Kept secret, used to sign messages and create digital signatures.
Public Key: Shared publicly, used to verify signatures and authenticate message origin.
Digital Signature Process
Sign(Message, Private Key): Generates a digital signature unique to the message and sender.
Validate(Message, Digital Signature, Public Key): Confirms the authenticity and integrity of the message.
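Example: the Sign/Validate pair can be sketched with toy textbook RSA using only the standard library. This is for illustration only and is NOT secure: it uses small, publicly known primes and no padding. Bitcoin itself uses a different signature scheme (ECDSA). The key values and helper names are assumptions for the sketch, and it needs Python 3.8+ for the modular inverse.

```python
import hashlib

# TOY textbook RSA for illustration only: no padding, small fixed primes, NOT secure.
# Requires Python 3.8+ for pow(e, -1, phi).
P = 2**61 - 1          # known Mersenne prime
Q = 2**89 - 1          # known Mersenne prime
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))

PUBLIC_KEY = (E, N)    # shared with everyone
PRIVATE_KEY = (D, N)   # kept secret by the signer

def _digest(message, modulus):
    """Hash the message and reduce it into the range the toy key can handle."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % modulus

def sign(message, private_key):
    """Sign(Message, Private Key): only the private key holder can produce this."""
    d, n = private_key
    return pow(_digest(message, n), d, n)

def validate(message, signature, public_key):
    """Validate(Message, Digital Signature, Public Key): anyone can check this."""
    e, n = public_key
    return pow(signature, e, n) == _digest(message, n)

signature = sign(b"Alice pays Bob 10", PRIVATE_KEY)
ok_original = validate(b"Alice pays Bob 10", signature, PUBLIC_KEY)
ok_tampered = validate(b"Alice pays Bob 1000", signature, PUBLIC_KEY)
print(ok_original, ok_tampered)
```

The signature depends on the message, so it cannot be copied onto a different message: validation of the altered text fails.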
Digital Signatures and Asymmetric Encryption - Security Implications
Ensures non-repudiation, preventing the sender from denying a signed message.
Protects against forgery, with a vast keyspace making brute force attacks infeasible.
Cryptographic Hash Functions - Definition
Functions that map input data to a fixed-length output, known as a hash or digest.
Cryptographic Hash Functions - Key Properties
Consistency: The same input always produces the same hash output.
Irreversibility: Infeasible to reverse-engineer the input from its hash, ensuring one-way security.
Collision Resistance: Computationally infeasible to find two different inputs that produce the same hash, enhancing data integrity.
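Example: the properties above can be observed with SHA-256 from Python's hashlib. The inputs are arbitrary; the sketch shows behaviour on these inputs and does not prove the properties.

```python
import hashlib

def sha256_hex(data):
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()

a = sha256_hex(b"pay Bob 10")
b = sha256_hex(b"pay Bob 10")
c = sha256_hex(b"pay Bob 11")

print(a == b)          # same input, same digest
print(len(a), len(c))  # digest length is fixed (64 hex characters for SHA-256)
print(a == c)          # a one-character change gives a different digest
```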
Cryptographic Hash Functions - Common Uses
Data integrity checks and verification.
Digital signatures and certificates.
Secure password storage and verification.
Ledgers - Transaction Broadcasting
Participants asynchronously broadcast transactions they wish to record on the ledger.
Transactions must be validated and agreed upon by the network.
Ledgers - Consensus Challenges
Ensuring all participants maintain an identical copy of the ledger.
Determining the order of transactions and resolving any discrepancies.
Ledgers - Consensus Mechanism
Defines the rules and protocols for achieving agreement on the ledger state.
Ensures trust and integrity across the decentralized network.
Blocks - Batching Transactions
Transactions are grouped into blocks for efficient processing and verification.
Each transaction within a block is digitally signed, ensuring validity and authenticity.
Blocks - Proof of Work (PoW)
A consensus mechanism requiring miners to solve complex mathematical puzzles.
Validates blocks by ensuring the hash meets specific criteria (e.g., a hash starting with 32 zeros).
Balances network security and transaction timeliness, preventing tampering and fraud.
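Example: a minimal Proof of Work sketch. It searches for a nonce so that the SHA-256 hash of the block data plus nonce starts with a given number of zero hex digits. Counting hex digits, the "|" separator, and the function names are simplifying assumptions; real networks compare the hash against a numeric target, and the difficulty here is kept low so the search finishes quickly.

```python
import hashlib

def mine(block_data, difficulty):
    """Search for a nonce whose SHA-256 hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}|{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

def verify(block_data, nonce, difficulty):
    """Checking a claimed solution takes a single hash."""
    digest = hashlib.sha256(f"{block_data}|{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

nonce, digest = mine("alice->bob:10", difficulty=4)
print(nonce, digest)
print(verify("alice->bob:10", nonce, difficulty=4))
```

Finding the nonce takes many attempts while checking it takes one hash; this asymmetry is what makes the puzzle useful.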
Blocks and Miners - Roles of Miners
Miners validate blocks by solving Proof of Work challenges, using computational power to find solutions.
They listen for broadcasted transactions, batch them into blocks, and verify block validity.
Miners are incentivized with rewards, typically in the form of cryptocurrency, for their efforts.
Blocks and Miners - Proof of Work (PoW)
Regulates the network by adjusting challenge difficulty, ensuring consistent block generation times.
Provides security by requiring significant computational resources to alter the blockchain.
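Example: difficulty adjustment can be sketched as scaling the difficulty by how far recent block times drifted from the target. This is a simplified illustration, not any network's exact rule; the function name, parameters, and the clamp factor of 4 are assumptions.

```python
def retarget(old_difficulty, actual_seconds, target_seconds, max_factor=4.0):
    """Scale difficulty so the next period should take about target_seconds."""
    ratio = target_seconds / actual_seconds
    # Clamp the change so one unusual period cannot swing difficulty too far.
    ratio = max(1.0 / max_factor, min(max_factor, ratio))
    return old_difficulty * ratio

# Blocks arrived twice as fast as intended, so difficulty doubles.
faster = retarget(1000.0, actual_seconds=300, target_seconds=600)
# Blocks arrived at half the intended speed, so difficulty halves.
slower = retarget(1000.0, actual_seconds=1200, target_seconds=600)
print(faster, slower)
```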
Blockchains - Structure and Functionality
Blocks are linked sequentially, each containing a hash of the previous block to form a continuous chain.
Ensures chronological integrity and prevents unauthorized changes.
Blockchains - Tamper Resistance
Any alteration to a block requires recalculating the hashes of all subsequent blocks.
An attacker would need to control the majority of the network’s computational power (a 51% attack) to outpace honest miners and succeed.
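Example: hash linking and tamper detection can be sketched as below. Each block stores the previous block's hash, so editing one block breaks the link from the block after it. Proof of Work is omitted for brevity, and the block fields and function names are hypothetical.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(block):
    """Hash a block's contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": block["index"], "data": block["data"], "prev_hash": block["prev_hash"]},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

def build_chain(entries):
    chain, prev = [], GENESIS_PREV
    for i, data in enumerate(entries):
        block = {"index": i, "data": data, "prev_hash": prev}
        block["hash"] = block_hash(block)
        chain.append(block)
        prev = block["hash"]
    return chain

def is_valid(chain):
    prev = GENESIS_PREV
    for block in chain:
        if block["prev_hash"] != prev or block["hash"] != block_hash(block):
            return False
        prev = block["hash"]
    return True

chain = build_chain(["alice->bob:10", "bob->carol:5", "carol->alice:2"])
valid_before = is_valid(chain)

chain[1]["data"] = "bob->carol:500"       # tamper with the middle block
valid_after_tamper = is_valid(chain)

chain[1]["hash"] = block_hash(chain[1])   # re-hash only the tampered block
valid_after_rehash = is_valid(chain)      # still invalid: block 2 points at the old hash
print(valid_before, valid_after_tamper, valid_after_rehash)
```

To make the chain valid again, every later block would have to be re-hashed as well, and with Proof of Work each re-hash means redoing that block's puzzle.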
Blockchains - Use Cases
Secure financial transactions.
Decentralized applications and smart contracts.
Transparent and verifiable supply chain tracking.
Resolving Conflicts in Blockchain - Decentralized Network Challenges
Different segments of the network may work on separate branches or forks of the blockchain.
Forks can occur when two miners solve a block simultaneously, leading to multiple valid chains.
Resolving Conflicts in Blockchain - Longest Chain Rule
The network resolves conflicts by committing to the longest chain, representing the most computational work.
Ensures consistency and integrity across the distributed ledger.
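Example: fork resolution can be sketched as picking the candidate chain with the most cumulative work. The per-block work field and the block ids are hypothetical; when every block carries equal work, "most work" is the same as "longest".

```python
def cumulative_work(chain):
    """Sum the work of each block; `work` is a hypothetical per-block field."""
    return sum(block["work"] for block in chain)

def choose_chain(candidates):
    """Pick the candidate chain with the most cumulative work."""
    return max(candidates, key=cumulative_work)

# Two forks sharing the same first block "g".
fork_a = [{"id": "g", "work": 1}, {"id": "a1", "work": 1}, {"id": "a2", "work": 1}]
fork_b = [{"id": "g", "work": 1}, {"id": "b1", "work": 1}]

winner = choose_chain([fork_a, fork_b])
print([block["id"] for block in winner])
```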
Resolving Conflicts in Blockchain - Implications
Encourages cooperation among miners to maintain a single, cohesive chain.
Reduces the risk of data loss or divergence across the network.
Blockchain Concepts - Transaction Validation
Uses digital signatures to confirm transaction authenticity and origin.
Blockchain Concepts - Block Formation
Transactions are batched into blocks, each containing multiple verified entries.
Blockchain Concepts - Proof of Work
Miners solve cryptographic puzzles to validate blocks and ensure network security.
Blockchain Concepts - Network Consensus
Miners agree on a single, unified blockchain, maintaining ledger integrity.
Blockchain Concepts - Conflict Resolution
The longest chain is accepted as the authoritative record, resolving discrepancies.
Remarks on Blockchain - Canonical Blockchain Structure
The foundational model for decentralized ledgers, closely associated with Bitcoin.
Remarks on Blockchain - Variations and Adaptations
Different blockchain implementations may alter roles, consensus mechanisms, and applications.
Tailored to specific use cases, from cryptocurrency to supply chain management and identity verification.
Remarks on Blockchain - DLT vs. Blockchain
All blockchains are types of distributed ledgers, but not all distributed ledgers use blockchain technology.
Other DLTs may offer alternative structures or consensus models suited to particular requirements.
Blockchain Variations by Membership - Permissionless Blockchains
Open to the public, allowing any node to participate in the network.
Nodes have equal rights for reading, writing, and verifying transactions.
Examples include Bitcoin and Ethereum, fostering inclusivity and transparency.
Blockchain Variations by Membership - Permissioned Blockchains
Restricted access, often used by specific organizations or consortia.
Nodes have varying rights based on their role and the network’s governance model.
Ideal for enterprise applications requiring privacy and control.
Blockchain Variations by Membership - Hybrid Blockchains
Combine elements of both permissionless and permissioned blockchains.
Selected nodes participate in consensus, while others have limited roles.
Enables collaboration across multiple organizations with shared objectives.
Blockchain Variations by Consensus Mechanisms - Proof of Work (PoW)
Requires significant computational effort to validate transactions, ensuring security but consuming substantial resources.
Resolves conflicts by committing to the chain with the most cumulative work.
Blockchain Variations by Consensus Mechanisms - Proof of Stake (PoS)
Validators commit a certain amount of cryptocurrency to secure the network, with rewards based on their stake.
More energy-efficient and encourages long-term network health.
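Example: one common way to picture Proof of Stake is stake-weighted random selection of the next validator. This is a sketch under that assumption only; real protocols add locking, penalties, and verifiable randomness. The validator names, stakes, and fixed seed are hypothetical.

```python
import random

def pick_validator(stakes, rng):
    """Choose a validator with probability proportional to its stake."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

stakes = {"alice": 60, "bob": 30, "carol": 10, "dave": 0}
rng = random.Random(42)  # fixed seed so the sketch is repeatable
picks = [pick_validator(stakes, rng) for _ in range(1000)]
counts = {name: picks.count(name) for name in stakes}
print(counts)
```

Over many rounds, selection frequency tracks stake, and a participant with no stake is never chosen.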
Blockchain Variations by Consensus Mechanisms - Delegated Proof of Stake (DPoS)
Members vote for delegates to manage blockchain operations, incentivizing fair governance.
Balances efficiency with community engagement and representation.
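Example: delegate election can be sketched as a vote tally that keeps the top candidates. Weighting each vote by the voter's stake, the number of seats, and all names below are assumptions for illustration.

```python
from collections import Counter

def elect_delegates(votes, seats):
    """Tally stake-weighted votes and return the top `seats` candidates."""
    tally = Counter()
    for _voter, (candidate, stake) in votes.items():
        tally[candidate] += stake
    # Sort by votes (descending), then by name so ties resolve deterministically.
    ranked = sorted(tally.items(), key=lambda item: (-item[1], item[0]))
    return [candidate for candidate, _ in ranked[:seats]]

# Hypothetical votes: voter -> (chosen candidate, voter's stake).
votes = {
    "v1": ("node_a", 50),
    "v2": ("node_b", 30),
    "v3": ("node_a", 5),
    "v4": ("node_c", 40),
    "v5": ("node_d", 10),
}
delegates = elect_delegates(votes, seats=2)
print(delegates)
```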
Blockchain Variations by Consensus Mechanisms - Other Mechanisms
Proof of Capacity, Proof of Elapsed Time, and more, each with unique advantages and trade-offs.
Blockchain Variations by Implementation - Bitcoin
The first blockchain implementation and cryptocurrency, offering decentralized transaction processing without intermediaries.
Features a large market capitalization and widespread recognition.
Blockchain Variations by Implementation - Ethereum
Supports smart contracts and decentralized applications via the Ethereum Virtual Machine (EVM).
Enables programmable, automated transactions with diverse applications.
Blockchain Variations by Implementation - Hyperledger
An open-source collaboration led by the Linux Foundation, offering tools and frameworks for blockchain-based solutions.
Focuses on enterprise use cases and interoperability.
Blockchain Variations by Implementation - IOTA
Designed for the Internet of Things (IoT), using a unique approach with lightweight Proof of Work.
Lacks specialized miners, with blocks generated by a central node, emphasizing scalability and efficiency.
Layers of Blockchain Technology - Application Layer
Interfaces and applications built on blockchain, providing user interactions and functionalities.
Layers of Blockchain Technology - Contract Layer
Execution of smart contracts and business logic, enabling automated processes.
Layers of Blockchain Technology - Consensus Layer
Protocols that ensure agreement among nodes on the blockchain state.
Layers of Blockchain Technology - Network Layer
Manages communication and data exchange among network nodes.
Layers of Blockchain Technology - Data Layer
Structures and stores blockchain data, ensuring integrity and accessibility.