Distributed Databases Flashcards
Distributed Databases
A distributed DBMS provides access to data stored at all sites.
Let's say we have one store in Liverpool. This store might eventually expand to Manchester, London, and so on.
The concrete definition of a distributed database is a collection of multiple logically interrelated databases that are distributed over a computer network.
Advantages of distributed databases
- provide access to data held at different sites
- we don’t have to specify where the data is stored; we can query it wherever it lives
- can give many users access to large datasets
- can answer queries faster by distributing tasks over the nodes
- easier to scale (just add a new node)
Fragmentation
Splitting a database into parts that can be stored at different nodes.
Horizontal Fragmentation
Fragmenting a relation by rows: each fragment holds a subset of the rows.
Each fragment stores complete tuples.
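A minimal sketch of horizontal fragmentation, assuming a relation held in memory as a list of dicts. The `customers` data and the `horizontal_fragments` helper are hypothetical, chosen only to illustrate splitting by rows.

```python
# Hypothetical relation: each dict is one tuple (row).
customers = [
    {"id": 1, "name": "Ann", "city": "Liverpool"},
    {"id": 2, "name": "Bob", "city": "Manchester"},
    {"id": 3, "name": "Cat", "city": "Liverpool"},
]


def horizontal_fragments(rows, key):
    """Group whole rows into fragments by the value of one attribute."""
    fragments = {}
    for row in rows:
        # Every row goes, unchanged, into exactly one fragment.
        fragments.setdefault(row[key], []).append(row)
    return fragments


# One fragment per city; each could be stored at that city's site.
by_city = horizontal_fragments(customers, "city")
```

Each fragment keeps every column, so the Liverpool site only needs the rows in `by_city["Liverpool"]`.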
Vertical Fragmentation
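Fragmenting a relation by columns: each fragment holds a subset of the columns.
Each fragment can be stored in a different database, usually together with the key so the rows can be matched up again.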
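A minimal sketch of vertical fragmentation, assuming a relation held in memory as a list of dicts. The `customers` data and the `vertical_fragment` helper are hypothetical; the sketch also assumes each fragment keeps the key column `id` so the rows can be matched up later.

```python
# Hypothetical relation: each dict is one tuple (row).
customers = [
    {"id": 1, "name": "Ann", "city": "Liverpool"},
    {"id": 2, "name": "Bob", "city": "Manchester"},
]


def vertical_fragment(rows, key, columns):
    """Project every row onto the key column plus the chosen columns."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


# Two fragments, each holding the key and one other column.
names = vertical_fragment(customers, "id", ["name"])
cities = vertical_fragment(customers, "id", ["city"])
```

Every fragment contains one entry per original row, but only some of the columns.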
Fragmentation Transparency
The user does not see this fragmentation, just the full relations.
Entire Relation
The union of the fragments (for horizontal fragments); vertical fragments are combined by joining them on the key.
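A minimal sketch of rebuilding the entire relation from fragments, assuming fragments are in-memory lists of dicts. Horizontal fragments are unioned; vertical fragments hold different columns, so this sketch joins them on a shared key instead. The helper names and sample data are hypothetical.

```python
def union_fragments(fragments):
    """Rebuild a relation from horizontal fragments (lists of whole rows)."""
    rows = []
    for fragment in fragments:
        rows.extend(fragment)
    return rows


def join_fragments(left, right, key):
    """Rebuild a relation from two vertical fragments sharing a key column."""
    right_by_key = {row[key]: row for row in right}
    # Merge the columns of the two rows that have the same key value.
    return [{**row, **right_by_key[row[key]]} for row in left]


# Horizontal case: whole rows stored at two sites.
liverpool = [{"id": 1, "city": "Liverpool"}]
manchester = [{"id": 2, "city": "Manchester"}]
whole = union_fragments([liverpool, manchester])

# Vertical case: the same rows split by column, both fragments keep "id".
names = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
cities = [{"id": 1, "city": "Liverpool"}, {"id": 2, "city": "Manchester"}]
joined = join_fragments(names, cities, "id")
```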
Fragmentation Advantage
Fragmentation can help resilience: if one store fails, only the fragments held there are affected, and the other stores still hold the remaining fragments of the database.
Types of replication
- full replication
- no replication
- partial replication
Full Replication
Each fragment is stored at every site.
No Replication
Each fragment is stored at exactly one site.
Partial Replication
Only some fragments are replicated, and the number of copies of each fragment is limited.
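A minimal sketch of the three replication types as a placement table, assuming a simple round-robin assignment of fragments to sites. The `place_fragments` helper, the fragment names, and the round-robin policy are hypothetical; for simplicity this sketch gives every fragment the same number of copies, whereas a real partial scheme might replicate only some fragments.

```python
def place_fragments(fragments, sites, copies):
    """Assign each fragment to `copies` distinct sites, round-robin.

    copies == 1          -> no replication
    copies == len(sites) -> full replication
    anything in between  -> partial replication
    """
    if not 1 <= copies <= len(sites):
        raise ValueError("copies must be between 1 and the number of sites")
    placement = {}
    for i, fragment in enumerate(fragments):
        placement[fragment] = [sites[(i + k) % len(sites)] for k in range(copies)]
    return placement


sites = ["Liverpool", "Manchester", "London"]
fragments = ["F1", "F2", "F3"]

no_replication = place_fragments(fragments, sites, 1)
partial_replication = place_fragments(fragments, sites, 2)
full_replication = place_fragments(fragments, sites, 3)
```

With `copies=2`, each fragment survives the failure of any single site, at the cost of storing it twice.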
Types of transparency
- fragmentation transparency
- replication transparency
- location transparency
- naming transparency
Fragmentation Transparency
The user does not need to know how the relations are fragmented.
Replication Transparency
Data items can be copied to different sites without the user needing to know that the copies exist.