Distributed Database Flashcards

1
Q

How have DB systems evolved?

A

Each unit defines and maintains its own data

Centralised DB systems: Data is defined and administered centrally under the control of a single DBMS.

Distributed DBMS: Data can be accessed at a set of distributed sites.

2
Q

Why were distributed DBs developed?

A

Data in all units is accessible

Data is stored in proximity to locations where it is most often used.

Should improve shareability of data and efficiency of data access.

Should resolve the "islands of information" problem.

3
Q

What are a distributed database and a distributed DBMS?

A

Distributed DATABASE:
A logically interrelated collection of shared data (and a description of this data), physically distributed over a computer network.

Distributed DBMS:
Software system that permits the management of the distributed DB and makes the distribution transparent to users.

4
Q

What are the components of DDB and DDBMS ?

A

DDB:
A single logical DB split into a number of FRAGMENTS.
Each fragment is stored on one or more computers under the control of a separate DBMS.
The computers communicate over a network.

DDBMS: Users access the distributed DB via applications:
LOCAL: Don’t require data from other sites.
GLOBAL: Require data from other sites.
A DDBMS has at least one global application.

5
Q

What are the characteristics of DDBMS?

A

Collection of logically related shared data.
Data is split into fragments, which may be replicated.
Fragments and replicas are allocated to sites.
Sites are linked by a communication network.

Data at each site is under the control of a DBMS.
Each DBMS participates in at least one global application.

6
Q

What is a parallel DBMS?

A

A DBMS designed to run across multiple processors and disks to improve performance.

Parallel DBMSs link multiple smaller machines to achieve the same throughput as a single larger machine, with greater reliability.

Architectures are: Shared memory, Shared disk and Shared nothing.

7
Q

What are the advantages and disadvantages of a DDBMS?

A

Advantages:
Reflects organizational structure
Improved availability and reliability
Modular growth
Improved performance

Disadvantages:
Complexity
Cost (manpower needed)
Lack of standards and experience
DB design more complex

Possible causes when a node is still waiting for a response to a packet it sent:
The remote node may have failed
The request may have been lost
The response may have been lost in the network

8
Q

Could timestamps save us?

A

Each machine has its own clock, driven by a quartz crystal oscillator.

These clocks aren’t accurate, so each machine has its own notion of time.

Machines can synchronize using the Network Time Protocol (NTP), which allows a computer’s clock to be adjusted according to the time reported by a group of servers.

How would a node know who its current leader is?
1. The leader acquires a lease with a timeout.
2. Only one node can hold the lease at a time.
3. The leader renews the lease before the timeout period expires.
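
The lease steps above can be sketched in Python. This is a minimal single-process illustration, not a real distributed protocol: the `LeaseTable` class, its method names, and the injected `clock` parameter are all hypothetical, and it assumes every node asks one shared lease table that has a single trustworthy clock.

```python
import time


class LeaseTable:
    """Tracks one lease; at most one node holds it at a time (hypothetical sketch)."""

    def __init__(self, duration_s, clock=time.monotonic):
        self.duration_s = duration_s
        self.clock = clock          # injected so the example can control time
        self.holder = None
        self.expires_at = 0.0

    def acquire(self, node_id):
        """Grant the lease if it is free, expired, or being renewed by its holder."""
        now = self.clock()
        if self.holder is None or now >= self.expires_at or self.holder == node_id:
            self.holder = node_id
            self.expires_at = now + self.duration_s
            return True
        return False

    def leader(self):
        """Return the current leader, or None if the lease has lapsed."""
        return self.holder if self.clock() < self.expires_at else None


# Usage with a fake clock (a one-element list we advance by hand).
now = [0.0]
lease = LeaseTable(duration_s=10, clock=lambda: now[0])
assert lease.acquire("A")          # A becomes leader at t=0
assert not lease.acquire("B")      # B is refused while A's lease is valid
now[0] = 8.0
assert lease.acquire("A")          # A renews before expiry; lease now runs to t=18
now[0] = 20.0
assert lease.leader() is None      # A did not renew in time
assert lease.acquire("B")          # B can now take over
```

Because each machine has its own notion of time, a real system cannot rely on one shared clock like this; the sketch only shows the acquire, hold, and renew cycle.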

9
Q

What are the types of DDBMS?

A

HOMOGENEOUS:
All sites use the same local DBMS product (SQL).
Easier to design and manage.
Similar to a centralised DB over a distributed system.
Provides incremental growth and performance.

HETEROGENEOUS:
Sites may run different DBMS products, with possibly different underlying data models.
Occurs when sites have their own DBs and integration is an afterthought.
Translations are needed between different hardware and DBMS products.
The typical solution is to use gateways.

10
Q

What are the functions of a DDBMS?

A

Extended communication service
Extended data dictionary and concurrency control
Distributed query processing

11
Q

What are the key issues in the design of a distributed DB?

A

FRAGMENTATION: a relation may be divided into sub-relations (fragments), which are then distributed.

ALLOCATION: each fragment is stored at the site with the optimal distribution.

REPLICATION: a copy of a fragment may be maintained at several sites.

12
Q

What are the four strategies regarding placement of data?

A

In data allocation:
CENTRALIZED: a single database and DBMS stored at one site, with users distributed across the network.

PARTITIONED: the DB is partitioned into fragments and each fragment is assigned to one site.

COMPLETE REPLICATION: a complete copy of the DB is maintained at each site.

SELECTIVE REPLICATION: a combination of partitioning, replication, and centralization.
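
The four strategies can be compared with a small Python sketch that maps fragments to sites. The `allocate` function, the fragment and site names, and the round-robin assignment are assumptions made for illustration; a real design would place fragments according to where they are used most.

```python
def allocate(fragments, sites, strategy, replicated=()):
    """Return {site: [fragments stored there]} for a placement strategy (sketch)."""
    if strategy == "centralized":
        # Everything lives at the first site; the other sites store nothing.
        return {s: (list(fragments) if i == 0 else []) for i, s in enumerate(sites)}
    if strategy == "partitioned":
        # Each fragment is assigned to exactly one site (round-robin here).
        placement = {s: [] for s in sites}
        for i, frag in enumerate(fragments):
            placement[sites[i % len(sites)]].append(frag)
        return placement
    if strategy == "complete_replication":
        # Every site holds a complete copy.
        return {s: list(fragments) for s in sites}
    if strategy == "selective_replication":
        # Partition first, then copy the chosen fragments to every site.
        placement = allocate(fragments, sites, "partitioned")
        for s in sites:
            for frag in replicated:
                if frag not in placement[s]:
                    placement[s].append(frag)
        return placement
    raise ValueError(f"unknown strategy: {strategy}")


frags, sites = ["F1", "F2", "F3"], ["london", "new_york"]
for name in ("centralized", "partitioned", "complete_replication"):
    print(name, allocate(frags, sites, name))
print("selective_replication",
      allocate(frags, sites, "selective_replication", replicated=("F1",)))
```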

13
Q

Why do we fragment?

A

Usage: Applications work with views rather than entire relations.

Efficiency: Data is stored close to where it’s used. Data not needed by local applications is not stored.

Parallelism: Transactions can be divided into sub-queries operating on separate fragments.

Security: Data not required by local applications is not stored, so it is not available to unauthorized users.

ADVANTAGES:
Locality of reference
Improved performance and reliability
Minimal communication costs

DISADVANTAGES:
Performance: apps pulling data from many fragments will be slower because of network delays.
Integrity: integrity control will be more difficult. Communicating updates to all replicas may be slow.

14
Q

What are the three main types of fragmentation?

A

1. HORIZONTAL: The tuples that belong to a fragment are identified by a selection query.
EX: an employee relation for a company with employees in London and New York.
We could fragment it so that the tuples for departments D1 and D3 are stored in New York
and the tuples for D2 (the London workers) are stored in London.
Reconstruct the original table with the UNION operation.

This can lead to HOTSPOTS.

2. VERTICAL: The relation is split by attributes. EX: for two vertical fragments, the first fragment can hold name, gender, etc. and the second can hold salary and SSN.

3. HYBRID:
A mixed fragment consists of a horizontal fragment that is subsequently vertically fragmented, or a vertical fragment that is subsequently horizontally fragmented.

EX: store the work-related and personal fragments at separate sites, but also store the vertical fragment of work-related data in New York or Dublin.

Reconstruct with outer joins and union operations.
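
The fragment and reconstruct steps can be sketched in Python on a tiny in-memory employee relation. The rows, attribute names, and helper functions below are made up for illustration. The sketch assumes each vertical fragment repeats the key (`emp_id`) so the fragments can be joined back together, and it uses a simple key join rather than a full outer join.

```python
employees = [
    {"emp_id": 1, "name": "Ann", "dept": "D1", "salary": 50000},
    {"emp_id": 2, "name": "Bob", "dept": "D2", "salary": 42000},
    {"emp_id": 3, "name": "Cat", "dept": "D3", "salary": 61000},
]


def horizontal(relation, predicate):
    """Selection: keep whole tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def vertical(relation, attributes, key="emp_id"):
    """Projection: keep the chosen attributes plus the key in every fragment."""
    return [{a: row[a] for a in (key, *attributes)} for row in relation]


def union(*fragments):
    """Rebuild a horizontally fragmented relation, ordered by key."""
    return sorted((row for frag in fragments for row in frag),
                  key=lambda r: r["emp_id"])


def join(left, right, key="emp_id"):
    """Rebuild a vertically fragmented relation by matching rows on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Horizontal: D1 and D3 go to New York, D2 goes to London.
new_york = horizontal(employees, lambda r: r["dept"] in ("D1", "D3"))
london = horizontal(employees, lambda r: r["dept"] == "D2")
assert union(new_york, london) == employees

# Vertical: personal attributes in one fragment, work-related in the other.
personal = vertical(employees, ["name"])
work = vertical(employees, ["dept", "salary"])
assert join(personal, work) == employees
```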

15
Q

What are the transparencies in a DDBMS?

A

The distribution should be transparent to the user.
There are three types:
DISTRIBUTION
TRANSACTION
PERFORMANCE

16
Q

What is distribution transparency and what are its types?

A

FRAGMENTATION:
The user doesn’t know the data is fragmented and may query any table as though it were not.

LOCATION:
The user is aware of the fragmentation but is unaware of the physical location of the data.

LOCAL MAPPING:
The user needs to know both how the data is fragmented and where it is located.
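
The three levels differ in how much the caller has to name. A minimal Python sketch, assuming a hypothetical catalogue that maps each fragment to its site (the fragment names, site names, rows, and function names are all invented for illustration):

```python
# Hypothetical catalogue: fragment name -> site that stores it.
CATALOGUE = {"emp_d1_d3": "new_york", "emp_d2": "london"}

# Hypothetical storage: (site, fragment) -> rows held there.
STORAGE = {
    ("new_york", "emp_d1_d3"): [{"emp_id": 1, "dept": "D1"},
                                {"emp_id": 3, "dept": "D3"}],
    ("london", "emp_d2"): [{"emp_id": 2, "dept": "D2"}],
}


def read_at(site, fragment):
    """Local mapping transparency: caller names both the fragment and its site."""
    return STORAGE[(site, fragment)]


def read_fragment(fragment):
    """Location transparency: caller names the fragment; the system finds the site."""
    return read_at(CATALOGUE[fragment], fragment)


def read_relation():
    """Fragmentation transparency: caller sees one relation; fragments are hidden."""
    return [row for fragment in CATALOGUE for row in read_fragment(fragment)]


print(read_at("london", "emp_d2"))   # user supplies site and fragment
print(read_fragment("emp_d2"))       # user supplies only the fragment
print(read_relation())               # user supplies neither
```

Each function is built on the one below it, which mirrors how the higher levels of transparency hide more of the distribution from the user.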

17
Q

What is transaction transparency?

A

Ensures that all distributed transactions maintain DB integrity and consistency.

A distributed transaction accesses data stored at more than one location.

Each transaction is divided into a set of sub-transactions, one for each site.

Both the transaction and each of its sub-transactions must be indivisible.

18
Q

What is performance transparency?

A

This requires that the DDBMS performs as if it were a centralised DBMS.

There should be no performance degradation due to the distributed nature of the system.

The DDBMS should also determine the most cost-effective way to execute a request.