Distributed DBMS Flashcards

1
Q

Advantages of DDBMS

A
  • Data is located near the sites of greatest demand
  • Faster data access
  • Process data at different sites
  • New sites can be added without affecting other sites
  • Cheaper to add nodes to a system than to upgrade a mainframe
  • Less danger of a single point of failure (SPOF)
2
Q

Disadvantages of DDBMS

A
  • Complexity of management and control
  • Technological difficulty: replication, query optimization, and transaction management
  • Increased storage requirements (for replication)
  • Higher cost
3
Q

Components of DDBMS

A
  • Transaction processor (TP)
  • Data processor (DP)
  • Communications network
4
Q

Distributed processing

A

A database’s logical processing is shared among
two or more physically independent sites connected through a network

5
Q

Distributed database

A

Stores a logically related database over two or
more physically independent sites connected through a computer network

6
Q

Database fragment

A

One of the parts into which a database is divided in a distributed
database system; each fragment can be stored at a different site

7
Q

Database level fragmentation

A

Table1 in Location1, Table2 in Location2

8
Q

Table level fragmentation

A

Same tables with different data in different locations
- e.g. Payroll & Ops tables in Halifax office with Halifax data, Payroll & Ops tables in Bedford office with Bedford data

9
Q

Single site processing, single site data

A
  • TP and DP in one computer
  • End user has dumb terminal
10
Q

Multi-site processing, single site data

A
  • Multiple TPs run on different computers and share a single data repository (DP)
  • Accessed through LAN
  • Client/server architecture
11
Q

Multi-site processing, multi-site data

A
  • Fully distributed database management system
  • Supports multiple DPs and TPs at multiple sites
12
Q

Homogeneous DDBMS

A
  • Integrates multiple instances of the same DBMS over a
    network
  • e.g. MySQL v5 at 3 locations
13
Q

Heterogeneous DDBMS

A
  • Integrates different types of DBMSs over a network
  • e.g. MySQL in Asia, Oracle in the EU, Microsoft SQL Server in the US
14
Q

Fully heterogeneous DDBMS

A
  • Supports different DBMSs, each supporting a different data model and running under a different computer system
  • DB level fragmentation
  • e.g. tables T1 and T2 on Oracle at location L1; tables T3 and T4 on Postgres at location L2
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Minimum desirable DDBMS transparency

A
  1. Distribution Transparency
  2. Transaction Transparency
  3. Failure Transparency
  4. Performance Transparency
  5. Heterogeneity Transparency
16
Q

What is Distribution Transparency?

A

Allows a physically distributed database to be treated as a single logical database

17
Q

Levels of Distribution Transparency

A

fragmentation (highest), location, and local mapping (lowest)

18
Q

Fragmentation Transparency

A

The query specifies neither the fragment name nor its location

19
Q

Location Transparency

A

The query specifies the fragment name but not its location

20
Q

Local Mapping Transparency

A

The query must specify both the fragment name and its location. Retrieval can be faster because the DDBMS does not have to look them up.
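
The three levels of distribution transparency differ in how much of the lookup the DDBMS does for the user. A minimal Python sketch of that idea follows. It assumes a hypothetical EMPLOYEE table split into fragments E1 to E3 at made-up sites; the catalog layout and the function name `resolve` are illustrative and do not come from any real DBMS.

```python
# Hypothetical catalog: table -> fragments, fragment -> site.
FRAGMENTS = {"EMPLOYEE": ["E1", "E2", "E3"]}
SITES = {"E1": "NY", "E2": "ATL", "E3": "MIA"}


def resolve(table=None, fragment=None, site=None):
    """Return the (fragment, site) pairs a query must touch.

    The fewer names the user supplies, the more the DDBMS looks up.
    """
    if fragment is None:
        # Fragmentation transparency: the user names only the table.
        return [(f, SITES[f]) for f in FRAGMENTS[table]]
    if site is None:
        # Location transparency: the user names the fragment, not the site.
        return [(fragment, SITES[fragment])]
    # Local mapping transparency: the user names both fragment and site.
    return [(fragment, site)]


print(resolve(table="EMPLOYEE"))
print(resolve(fragment="E2"))
print(resolve(fragment="E3", site="MIA"))
```

With local mapping transparency the catalog is never consulted, which is one way to read the "faster retrieval" remark on the card.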

21
Q

When to use local mapping transparency?

A

For security researchers to track data loss
For DBAs to track duplicate records

22
Q

What contains the entire description of Distributed DB?

A

distributed data dictionary (DDD) or distributed data
catalog (DDC)

23
Q

What is a distributed global schema?

A

A common database schema that the TPs use to translate user
requests into subqueries

24
Q

Pros & cons of storing a large number of details in the DDD (e.g. node name, IP address)

A

Pros: efficient data retrieval
Cons: need to update often

25
Are rows in each fragment unique?
Yes
26
What is Transaction Transparency?
  • Ensures that database transactions maintain the distributed database's integrity and consistency
  • Ensures that a transaction completes only when all database sites involved complete their part
27
Remote request
Single SQL statement accesses data processed by a single remote database processor
28
Remote transaction
A transaction composed of several requests that accesses data at a single remote site
29
Distributed transaction
A transaction that requests data from several different remote sites on a network; each individual request references only one site
30
Distributed request
Single SQL statement references data at several DP sites
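
The four terms on cards 27 to 30 can be told apart by two questions: how many statements are there, and how many sites does each statement touch? A small Python sketch of that decision rule follows. The function name `classify` and the way statements are modelled (one set of site names per SQL statement) are assumptions made for illustration.

```python
def classify(statements):
    """Classify a unit of work by the remote sites each statement touches.

    `statements` is a non-empty list of sets; each set holds the site
    names that one SQL statement references.
    """
    if any(len(sites) > 1 for sites in statements):
        # At least one single statement spans several sites.
        return "distributed request"
    all_sites = set().union(*statements)
    if len(all_sites) > 1:
        # Each statement stays at one site, but the sites differ.
        return "distributed transaction"
    if len(statements) > 1:
        # Several statements, all at the same remote site.
        return "remote transaction"
    return "remote request"


print(classify([{"B"}]))
print(classify([{"B"}, {"B"}]))
print(classify([{"B"}, {"C"}]))
print(classify([{"B", "C"}]))
```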
31
Solution to concurrency control in distributed DBMS?
2PC / 3PC (commit protocols that make every site commit a transaction, or none of them)
32
What is 2PC?
One node acts as the coordinator.
  • Prepare phase: the coordinator asks the other nodes whether they can commit the proposed transaction.
  • Commit phase: if every node votes yes, the coordinator tells all nodes to commit.
  • If any node votes no or fails to reply during the prepare phase, the coordinator issues a global abort and all nodes roll back; the transaction can be retried afterwards.
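
The coordinator's decision rule in 2PC can be sketched in a few lines of Python. This is a single-process illustration only: there is no networking, logging, or timeout handling, and the `Node` class with its `can_commit` flag is a hypothetical stand-in for a real participant's local checks.

```python
class Node:
    """A participant; `can_commit` stands in for its local checks."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(nodes):
    """Coordinator logic: commit only if every node votes yes."""
    votes = [node.prepare() for node in nodes]  # prepare phase
    if all(votes):
        for node in nodes:  # commit phase
            node.commit()
        return "commit"
    for node in nodes:  # global abort
        node.abort()
    return "abort"


print(two_phase_commit([Node("a"), Node("b")]))
print(two_phase_commit([Node("a"), Node("b", can_commit=False)]))
```

A single "no" vote is enough to abort every participant, including the ones that voted yes.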
33
Why is 2PC a blocking protocol?
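A node that has voted to commit must wait for the coordinator's global decision; if the coordinator fails at that point, the node stays blocked until the coordinator recovers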
34
What is 3PC?
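  • Prepare: the coordinator asks the nodes whether they can commit
  • Pre-commit: the coordinator tells the nodes that all of them voted yes, which guarantees that every node can commit
  • Commit: the nodes commit the transaction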
35
What is Performance transparency?
allows a DDBMS to perform as if it were a centralized database
36
What is Failure transparency?
Ensures the system continues to operate in case of a node or network failure
37
Advantages of query optimization
Lower total cost, which is the sum of:
  • Access time (I/O) cost of accessing data from multiple remote sites
  • Communication cost of transmitting data between sites
  • CPU time cost of the processing overhead
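
As a rough illustration of how the three cost components combine, the Python sketch below picks the cheaper of two candidate plans. The plan names and all cost figures are invented for the example; a real optimizer estimates them from statistics.

```python
def total_cost(plan):
    """Sum the three cost components named on the card."""
    return plan["io"] + plan["communication"] + plan["cpu"]


# Hypothetical plans with made-up cost units.
plans = {
    "ship_whole_table": {"io": 40, "communication": 500, "cpu": 10},
    "filter_at_remote_site": {"io": 60, "communication": 50, "cpu": 30},
}

best = min(plans, key=lambda name: total_cost(plans[name]))
print(best, total_cost(plans[best]))
```

In this made-up case, filtering at the remote site costs more I/O and CPU but wins overall because far less data crosses the network.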
38
Replica transparency
Hides the existence of multiple copies of data from the user
39
Network latency
delay imposed by the amount of time required for a data packet to make a round trip
40
Network partitioning
The delay imposed when nodes suddenly become unavailable because of a network failure that splits the network into groups of nodes that cannot reach each other
41
CAP theorem
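A distributed system can guarantee at most two of Consistency, Availability, and Partition tolerance at the same time; when a network partition occurs, it must give up either consistency or availability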
42
Data replication
Storage of data copies at multiple sites served by a computer network
43
Strategies of data fragmentation
Horizontal, vertical, mixed
44
How does horizontal fragmentation work?
Divide a table by rows on a partition key (e.g. location); every fragment has the same columns but different rows
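
A minimal Python sketch of horizontal fragmentation, reusing the Halifax and Bedford offices from card 8. The `payroll` rows, the column names, and the function name are made up for illustration.

```python
from collections import defaultdict

# Hypothetical payroll rows; "office" is the partition key.
payroll = [
    {"emp_id": 1, "office": "Halifax", "salary": 52000},
    {"emp_id": 2, "office": "Bedford", "salary": 48000},
    {"emp_id": 3, "office": "Halifax", "salary": 61000},
]


def horizontal_fragments(rows, key):
    """Group whole rows by the value of the partition key."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[key]].append(row)
    return dict(fragments)


by_office = horizontal_fragments(payroll, "office")
for office, rows in by_office.items():
    print(office, [row["emp_id"] for row in rows])
```

Every row lands in exactly one fragment, so the union of the fragments gives back the original table.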
45
How does vertical fragmentation work?
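Divide a table by columns; each fragment keeps the primary key so that rows can be rejoined. E.g. suppose the company is divided into two departments, the service department and the collections department: each department's fragment holds only the columns that department needs.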
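
A minimal Python sketch of vertical fragmentation for the two departments mentioned on the card. The customer rows and the choice of which columns each department needs are assumptions made for the example.

```python
# Hypothetical customer table; "id" is the primary key.
customers = [
    {"id": 1, "name": "Ann", "address": "12 Main St", "balance": 250.0},
    {"id": 2, "name": "Raj", "address": "9 Elm Ave", "balance": 0.0},
]


def vertical_fragment(rows, key, columns):
    """Keep only `columns`, plus the key so fragments can be rejoined."""
    return [{key: row[key], **{c: row[c] for c in columns}} for row in rows]


def rejoin(left, right, key):
    """Rebuild full rows by matching two fragments on the shared key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


service_frag = vertical_fragment(customers, "id", ["name", "address"])
collections_frag = vertical_fragment(customers, "id", ["balance"])

print(service_frag)
print(collections_frag)
print(rejoin(service_frag, collections_frag, "id") == customers)
```

Repeating the key in both fragments is what makes the join back to the full table possible.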
46
How does mixed fragmentation work?
Combines both: the table is divided by rows, and each subset of rows is then divided by columns
47
Two modes of data replication
Push & pull
48
What is push replication?
After a data change, the originating DP node sends the changes to the replica nodes to ensure that data is immediately updated
49
When to use push replication?
When consistency is important. The trade-off is reduced availability, because of the latency involved in keeping every replica consistent.
50
What is pull replication?
The originating DP node sends “messages” to the replica nodes to notify them of the update. The replica nodes decide when to apply the updates to their local fragment.
51
When to use pull replication?
When availability is important. The trade-offs:
  • Data updates propagate more slowly to the replicas
  • Replicas can be temporarily inconsistent
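
The difference between the two replication modes can be sketched in Python as follows. This is a simplified in-memory model under stated assumptions: the `Primary` and `Replica` classes are hypothetical, and in pull mode the notification message carries the new value so that the replica can apply it later without a second round trip.

```python
class Replica:
    def __init__(self):
        self.data = {}
        self.pending = []  # updates announced but not yet applied

    def apply(self, key, value):
        self.data[key] = value

    def notify(self, key, value):
        # Pull mode: remember the update; apply it when convenient.
        self.pending.append((key, value))

    def pull(self):
        # The replica decides when to catch up.
        for key, value in self.pending:
            self.apply(key, value)
        self.pending.clear()


class Primary:
    def __init__(self, replicas, mode):
        self.data = {}
        self.replicas = replicas
        self.mode = mode  # "push" or "pull"

    def write(self, key, value):
        self.data[key] = value
        for replica in self.replicas:
            if self.mode == "push":
                replica.apply(key, value)  # replica updated immediately
            else:
                replica.notify(key, value)  # replica updated later


pushed, pulled = Replica(), Replica()
Primary([pushed], "push").write("x", 1)
Primary([pulled], "pull").write("x", 1)
print(pushed.data, pulled.data)  # the pull replica is still stale here
pulled.pull()
print(pulled.data)
```

Between the write and the call to `pull()`, the pull replica serves stale data: that window is the temporary inconsistency traded for availability.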