Chapter 2 Flashcards
Give an overview of the architecture of a system.
- It outlines the system's structure: its components, their functions, and their interactions.
- It specifies modules, interfaces, and interrelationships in terms of the data within the system.
What is a reference architecture?
- An architecture developed by standards developers to define standardized interfaces based on goals and context.
What are the three “reference” architectures for a distributed DBMS?
- client/server systems,
- peer-to-peer distributed DBMS,
- and multi-database systems.
What is the ANSI/SPARC architecture?
- The outcome of the work of the ANSI Standards Planning and Requirements Committee (SPARC), which set out to determine which aspects of database management systems should be standardized, if feasible.
The ANSI/SPARC architecture is a ______ approach to defining the architecture of a DBMS.
Datalogical (data-based)
What is the focus of the ANSI/SPARC architecture?
It focuses on the different user classes and roles and their varying views of the data.
What are the 3 levels / schemas of the ANSI/SPARC architecture?
- External Schema (User View)
- Conceptual Schema (Community View)
- Internal Schema (Physical View)
Explain the External Schema (User View).
- It is the way data is viewed by individual users or groups of users.
- It defines how different user classes or roles perceive the data they are interested in.
Explain the Conceptual Schema (Community View).
It provides an abstract, high-level view of the entire database, independent of how the data is stored or accessed.
Explain the Internal Schema (Physical View).
- It is the physical implementation of the database on the storage media.
- It defines how data is stored, indexed, and organized.
The separation of the external schemas from the conceptual schema enables _______ ?
logical data independence
The separation of the conceptual schema from the internal schema allows _______ ?
physical data independence
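The two kinds of data independence can be illustrated with a small sketch. It assumes SQLite (through Python's standard `sqlite3` module) as a stand-in DBMS; the `emp` table, the `emp_directory` view, and the sample rows are hypothetical and not part of the cards. The view plays the role of an external schema, the table definition the conceptual schema, and the index the internal schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Conceptual schema: the community-wide definition of the data (hypothetical table).
cur.execute("CREATE TABLE emp (eno INTEGER PRIMARY KEY, ename TEXT, dept TEXT, salary REAL)")
cur.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [(1, "Ada", "R&D", 120.0), (2, "Bob", "Sales", 90.0), (3, "Cy", "R&D", 100.0)],
)

# External schema: one user class sees only names and departments.
cur.execute("CREATE VIEW emp_directory AS SELECT ename, dept FROM emp")

def directory():
    """Query issued against the external schema only."""
    return cur.execute("SELECT ename, dept FROM emp_directory ORDER BY ename").fetchall()

before = directory()

# Change at the conceptual level: add an attribute.
# The view's users are unaffected (logical data independence).
cur.execute("ALTER TABLE emp ADD COLUMN title TEXT")
after_logical_change = directory()

# Change at the internal level: add an index.
# Query results are unaffected (physical data independence).
cur.execute("CREATE INDEX emp_dept_idx ON emp(dept)")
after_physical_change = directory()
```

In this sketch the same view query returns identical rows before and after both changes, which is the property the two cards describe.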
Query processing can be time-consuming, especially for complex queries.
True
An SQL query specifies both what data is needed and the optimal strategy for retrieving that data.
False
- It specifies only what data is needed; it does not dictate the strategy for retrieving it.
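As a rough illustration of this point, the sketch below runs the same SQL text twice and asks the DBMS which strategy it chose each time. It assumes SQLite and its `EXPLAIN QUERY PLAN` statement; the table, index name, and data are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (eno INTEGER PRIMARY KEY, ename TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(i, f"e{i}", f"d{i % 5}") for i in range(100)],
)

# The query says WHAT is wanted; it never mentions scans or indexes.
QUERY = "SELECT ename FROM emp WHERE dept = 'd3'"

def plan(sql):
    """Return the optimizer's chosen strategy as text.

    The last column of each EXPLAIN QUERY PLAN row is a readable detail string.
    """
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

plan_without_index = plan(QUERY)          # typically a full table scan
conn.execute("CREATE INDEX emp_dept_idx ON emp(dept)")
plan_with_index = plan(QUERY)             # typically a search using the new index
result = conn.execute(QUERY).fetchall()   # same query text, same answer
```

The query string is identical in both calls; only the strategy selected by the query processor changes.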
What are the three key characteristics used to classify the architecture of a DDBMS?
(1) the autonomy of local systems
(2) their distribution, and
(3) their heterogeneity
________ is the degree to which individual DBMSs can operate independently?
Autonomy
What are the Requirements of an autonomous system?
- Local operations of individual DBMSs remain unaffected by their participation in the distributed system.
- The processing of queries by individual DBMSs should not be influenced by the execution of global queries accessing multiple databases.
- System consistency should not be compromised when individual DBMSs join or leave the distributed system.
What are the three alternatives for autonomy in a distributed DBMS?
- Tight Integration
- Semiautonomous Systems
- Total Isolation
Explain tight integration.
- A single image of the entire database is available to users, even though the data may reside in multiple databases.
- From the users’ point of view, the data are logically integrated in one database.
In _______ systems, the data managers are implemented so that one of them is in control of the processing of each user request.
Tightly integrated
Describe semiautonomous systems.
- A federation of independent DBMSs that share their local data.
- Each DBMS determines which parts of its database are accessible to users of other DBMSs.
- They are not fully autonomous because they need to be modified to exchange information with one another.
Explain Total Isolation.
- Individual systems are stand-alone DBMSs.
- DBMSs are unaware of the existence of other DBMSs and can’t communicate with them.
Processing user transactions that access multiple databases is challenging in _______ ?
total isolation autonomy
Autonomy refers to the distribution of ______, while distribution refers to the distribution of ______.
Control, data
What are we considering in Distribution?
- the physical distribution of data over multiple sites
What are the two approaches to distributing a DBMS?
- Client/server distribution
▪ Data management duties are concentrated at the servers.
- Peer-to-peer distribution (full distribution)
▪ Each machine has full DBMS functionality.
What is heterogeneity?
It refers to the presence of diverse hardware, software, and network components.
What are the Three alternative architectures?
- client/server distributed DBMSs
- a peer-to-peer distributed DBMS
- a peer-to-peer distributed, heterogeneous multidatabase system
Explain a client/server architecture.
- Query processing, optimization, transaction management, and storage management are performed at the server.
- The client handles the application and the user interface, and manages cached data and transaction locks.
What are the Types of Client/Server Architecture?
Multiple Client/Single Server
Multiple Client/Multiple Server
What are the two management strategies for a client/server architecture (CSA) with multiple servers?
A. Client manages its own connection:
- Simplifies server code.
- Results in a “heavy client” system.
B. Client knows only its “home server”:
- Concentrates data management at the servers.
- Results in “light clients”, with transparency provided at the server interface.
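The difference between the two strategies can be sketched in a few lines. Everything here is hypothetical: the `CATALOG`, the server names, and the `Server` class are invented for illustration, and real systems would communicate over a network rather than through in-process calls.

```python
# Hypothetical catalog: which server stores which relation.
CATALOG = {"emp": "server_a", "proj": "server_b"}
DATA = {"server_a": {"emp": ["Ada", "Bob"]}, "server_b": {"proj": ["P1"]}}

class Server:
    def __init__(self, name):
        self.name = name

    def fetch_local(self, relation):
        """Return data stored at this server."""
        return DATA[self.name][relation]

    def fetch(self, relation, servers):
        """Strategy B: locate the data on the client's behalf and forward the request."""
        owner = CATALOG[relation]
        return servers[owner].fetch_local(relation)

SERVERS = {name: Server(name) for name in DATA}

def heavy_client_fetch(relation):
    # Strategy A: the client consults the catalog and contacts the right server itself.
    return SERVERS[CATALOG[relation]].fetch_local(relation)

def light_client_fetch(relation, home="server_a"):
    # Strategy B: the client only ever talks to its home server.
    return SERVERS[home].fetch(relation, SERVERS)
```

Both functions return the same data; what differs is where the knowledge of data location lives, in the client (A) or behind the server interface (B).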
The primary distinction between client/server systems and peer-to-peer ones is in the level of transparency.
False
- The distinction is in the architectural paradigm used to realize that transparency, not in its level.
What are the elements of the peer-to-peer architecture?
- Local Internal Schema (LIS): Individual internal schema definition at each site.
- Local Conceptual Schema (LCS): Describes the logical organization of data at each site.
- Global Conceptual Schema (GCS): Describes the enterprise-wide view of the data and supports location and replication transparencies.
- External Schemas (ESs): Support user applications and user access to the database.
_____ are defined above the global conceptual schema to support user applications and user access to the database.
External schemas (ESs)
_____ handles fragmentation and replication.
Local conceptual schema (LCS)
_____ is the union of the local conceptual schemas.
Global conceptual schema (GCS)
In a peer-to-peer system, the distributed DBMS translates global queries into a group of local queries.
True
- There is only one GCS, which maps to many LCSs.
What are the two basic components of a P2P distributed DBMS?
- User processor [ES, GCS]
- Data processor [LCS, LIS]
What are the user processor components in a P2P DBMS?
- The user interface handler
- The semantic data controller
- The global query optimizer and decomposer
- The distributed execution monitor
_____ interprets user commands and formats result data for users.
The user interface handler
Explain the role of the semantic data controller.
- Uses the integrity constraints defined in the global conceptual schema to check whether the user query can be processed.
- Is responsible for authorization.
Explain the global query optimizer and decomposer.
- Determines an execution strategy using the global and local conceptual schemas.
- Translates global queries into local ones.
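The translation of a global query into local ones can be sketched as follows. This is a minimal illustration under stated assumptions, not how any particular DDBMS implements it: the two sites are simulated with separate in-memory SQLite databases, the global relation `emp` is assumed to be horizontally fragmented into the hypothetical fragments `emp_f1` and `emp_f2`, and no cost-based optimization is attempted.

```python
import sqlite3

def make_site(fragment_name, rows):
    """Create a hypothetical site holding one fragment of the global relation."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {fragment_name} (ename TEXT, salary REAL)")
    conn.executemany(f"INSERT INTO {fragment_name} VALUES (?, ?)", rows)
    return conn

sites = {
    "site1": make_site("emp_f1", [("Ada", 120.0), ("Bob", 90.0)]),
    "site2": make_site("emp_f2", [("Cy", 100.0), ("Di", 80.0)]),
}

# Global-to-local mapping: the global relation is the union of its fragments.
GLOBAL_TO_LOCAL = {"emp": [("site1", "emp_f1"), ("site2", "emp_f2")]}

def decompose(global_relation, min_salary):
    """Translate one global selection into one local query per fragment."""
    return [
        (site, f"SELECT ename FROM {fragment} WHERE salary > ?", (min_salary,))
        for site, fragment in GLOBAL_TO_LOCAL[global_relation]
    ]

def execute_global(global_relation, min_salary):
    """Run every local query at its site and merge the partial results."""
    names = []
    for site, sql, params in decompose(global_relation, min_salary):
        names.extend(row[0] for row in sites[site].execute(sql, params))
    return sorted(names)

high_earners = execute_global("emp", 95.0)
```

The user only names the global relation `emp`; the mapping decides which sites and fragments are actually touched.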
Explain the distributed execution monitor.
- Coordinates the distributed execution of user requests.
- Also known as the distributed transaction manager.
What are the data processor components in a P2P DBMS?
1. The local query optimizer
2. The local recovery manager
3. The run-time support processor
______ acts as the access path selector in the data processor.
The local query optimizer
_____ ensures that the local database remains consistent even when failures occur.
The local recovery manager
_____ physically accesses the database according to the schedule generated by the query optimizer.
The run-time support processor
The local recovery manager acts as the interface to the operating system.
False
- The run-time support processor does.
Explain the database buffer (or cache) manager.
- A component within the run-time support processor that is responsible for maintaining the main memory buffers and managing data accesses.
What is the fundamental difference between a multi-DBMS (MDBS) and a distributed DBMS?
In a multi-DBMS, there are full-fledged DBMSs, each of which manages a different database.
From the perspective of individual DBMSs, the MDBS layer is just another application that submits requests and receives responses.
True
What are the 3 main Complications Introduced by Distribution?
- Replication Complexity
- Failure Complexity
- Synchronization Problem
What must a distributed database system do to handle replication complexity?
- Choose which replica to access for retrievals.
- Ensure that updates are reflected on all replicas.
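One simple way to meet both requirements is a read-one, write-all policy, sketched below. This is an assumption for illustration; the cards do not name a specific protocol. The `ReplicatedItem` class and site names are hypothetical, and the sketch ignores site failures and concurrent updates.

```python
class ReplicatedItem:
    """A data item with one copy per site (hypothetical, in-memory)."""

    def __init__(self, sites, value):
        self.copies = {site: value for site in sites}

    def read(self, preferred_site=None):
        # Retrieval: any single replica will do; prefer the requested one if it exists.
        site = preferred_site if preferred_site in self.copies else next(iter(self.copies))
        return site, self.copies[site]

    def write(self, value):
        # Update: the new value must reach every replica.
        for site in self.copies:
            self.copies[site] = value

item = ReplicatedItem(["paris", "tokyo", "lima"], value=10)
site_used, value_before = item.read(preferred_site="tokyo")
item.write(42)
values_after = set(item.copies.values())
```

After the write, every copy holds the same value, so later reads can again be served by any replica.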
What must a distributed database system do to handle failure complexity?
- Handle failures (hardware or software) that occur during updates.
- Ensure that the effects of those updates are reflected on the data once the system recovers.
What is the synchronization problem?
The lack of instantaneous information at each site about actions being carried out at other sites.
What are other Complications Introduced by Distribution?
- cost of replicating resources
- managing distribution
- devolution of control
List the Design Issues in Distributed Database Systems.
- Distributed Database Design
- Distributed Directory Management
- Distributed Query Processing
- Distributed Concurrency Control
- Distributed Deadlock Management
- Reliability of Distributed DBMS
Explain Distributed Database Design.
- Data can be partitioned (non-replicated) or replicated.
> Partitioned: The database is divided into disjoint partitions, each placed at a different site.
> Replicated: Either fully replicated (the entire database is stored at each site) or partially replicated (each partition is stored at multiple, but not all, sites).
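The three placement alternatives can be told apart mechanically. The sketch below assumes a hypothetical `allocation` mapping from partition name to the set of sites storing it; any placement that is neither strictly partitioned nor fully replicated is reported as partially replicated, which is a simplification.

```python
def classify(allocation, all_sites):
    """Classify a data placement.

    allocation maps each partition name to the set of sites that store it.
    """
    if all(len(sites) == 1 for sites in allocation.values()):
        return "partitioned"            # every partition lives at exactly one site
    if all(sites == set(all_sites) for sites in allocation.values()):
        return "fully replicated"       # every site stores every partition
    return "partially replicated"       # some partitions have multiple copies

SITES = {"s1", "s2", "s3"}
partitioned = classify({"p1": {"s1"}, "p2": {"s2"}}, SITES)
full = classify({"p1": set(SITES), "p2": set(SITES)}, SITES)
partial = classify({"p1": {"s1", "s2"}, "p2": {"s2", "s3"}}, SITES)
```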
Explain Distributed Directory Management.
Directory:
- Contains information about the data items in the database.
- May be global to the entire DDBS or local to each site.
- Can be centralized or distributed across sites.
- Is complex to manage.
Explain Distributed Query Processing.
- Involves designing algorithms to analyze queries and optimize their execution
- Considers data distribution, communication costs, and lack of locally-available information.
- Objective is to optimize performance within constraints.
______ involves the synchronization of accesses to the distributed database, such that the integrity of the database is maintained?
Distributed Concurrency Control
What is mutual consistency?
The condition that requires all copies of every data item to converge to the same value.
What are the fundamental primitives used in Distributed Concurrency Control?
Locking
Timestamping
_______ is based on the mutual exclusion of accesses to data items.
Locking
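Mutual exclusion through locking can be sketched with a tiny lock table. This is a simplified illustration with exclusive locks only; the `LockTable` class and transaction names are hypothetical, and real lock managers also support shared locks, wait queues, and distributed coordination.

```python
class LockTable:
    """Exclusive locks only: at most one transaction holds a given item."""

    def __init__(self):
        self.holder = {}  # data item -> id of the transaction holding its lock

    def acquire(self, txn, item):
        # Grant the lock if the item is free or already held by this transaction.
        current = self.holder.setdefault(item, txn)
        return current == txn

    def release_all(self, txn):
        # Release every lock held by the transaction (e.g. at commit or abort).
        for item in [i for i, t in self.holder.items() if t == txn]:
            del self.holder[item]

locks = LockTable()
t1_got_x = locks.acquire("T1", "x")
t2_got_x = locks.acquire("T2", "x")   # refused while T1 holds x
locks.release_all("T1")
t2_retry = locks.acquire("T2", "x")   # granted once T1 has released x
```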
What is timestamping?
An approach in which transaction executions are ordered based on timestamps.
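One common way to order executions by timestamp is the basic timestamp-ordering rule, sketched below. The cards do not specify a particular algorithm, so treat this as an assumption: each item remembers the largest timestamps that have read and written it, and an operation arriving "too late" is rejected (in practice the transaction would be restarted with a new timestamp). The `TimestampedItem` class is hypothetical.

```python
class TimestampedItem:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0   # largest timestamp that has read the item
        self.write_ts = 0  # largest timestamp that has written the item

    def read(self, ts):
        if ts < self.write_ts:
            return False, None   # a younger transaction already wrote: reject
        self.read_ts = max(self.read_ts, ts)
        return True, self.value

    def write(self, ts, value):
        if ts < self.read_ts or ts < self.write_ts:
            return False         # a younger transaction already read or wrote: reject
        self.value, self.write_ts = value, ts
        return True

x = TimestampedItem(0)
ok_w2 = x.write(ts=2, value="by T2")
ok_r1, _ = x.read(ts=1)                     # T1 is older than the last writer
ok_r3, seen = x.read(ts=3)
ok_w2_late = x.write(ts=2, value="again")   # T3 has already read x
```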
Intermediate results of a transaction should not be visible to other concurrently executing transactions.
True
Transaction isolation can be ensured trivially by _______ ?
Running transactions serially, for example in timestamp order
What is a deadlock?
A situation in which two or more transactions are blocked indefinitely, each waiting for another to release a lock.
What alternatives exist for dealing with deadlocks?
Prevention, avoidance, and detection followed by recovery.
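For the detection alternative, a common technique is to look for a cycle in a wait-for graph. The sketch below assumes the graph has already been collected at one place as a dictionary (hypothetical format: each transaction maps to the set of transactions it is waiting on); gathering that graph across sites is the hard part in a distributed system and is not shown.

```python
def find_deadlock(wait_for):
    """Return a list of transactions forming a cycle, or None if there is none.

    wait_for maps each transaction to the set of transactions it waits on.
    """
    visiting, done = [], set()

    def visit(txn):
        if txn in visiting:
            return visiting[visiting.index(txn):]   # back edge: cycle found
        if txn in done:
            return None                             # already fully explored
        visiting.append(txn)
        for other in sorted(wait_for.get(txn, ())):
            cycle = visit(other)
            if cycle:
                return cycle
        visiting.pop()
        done.add(txn)
        return None

    for txn in sorted(wait_for):
        cycle = visit(txn)
        if cycle:
            return cycle
    return None

deadlocked = find_deadlock({"T1": {"T2"}, "T2": {"T1"}})
no_deadlock = find_deadlock({"T1": {"T2"}, "T2": set()})
```

Once a cycle is found, recovery typically means aborting one of the transactions in it so the others can proceed.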
Explain the reliability of a distributed DBMS.
It requires mechanisms that ensure database consistency, detect failures, and recover from them.