ADBMS - unit 3 & 4: intro to DDBMS & Arch Flashcards
What is a distributed database system?
A distributed database is a database that runs and stores data across multiple computers, as opposed to doing everything on a single machine.
What is a node or instance?
Typically, distributed database systems operate on two or more interconnected servers on a computer network. Each location where a version of the database is running is often called an instance or a node.
How do instances run in a centralized database versus a distributed one?
A distributed database, for example, might have instances running in New York, Ohio, and California. Or it might have instances running on three separate machines in New York. A traditional single-instance database, in contrast, only runs in a single location on a single machine.
What is distributed in a DDBMS?
- program logic
- functions
- data
- control
Terms used loosely as synonyms for distributed processing / DDBMS
distributed data processing
multiprocessors / multi computers
satellite processing
backend processing
dedicated / special purpose computers
timeshared systems
functionally modular systems
peer to peer systems
What is a DDB (distributed database)?
- a DDB is a collection of multiple, logically interrelated databases distributed over a computer network
What is a DDBMS (distributed database management system)?
a DDBMS is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users
DDBMS = DB + communication
What is not a DDBMS?
a timesharing computer system
a loosely or tightly coupled multiprocessor system
a database system that resides at only one of the nodes of a network of computers; this is a centralized database on a network node
What are the promises of a distributed DBMS?
transparent management of distributed, fragmented, and replicated data
improved reliability / availability through distributed transactions
improved performance
easier and more economical system expansion
What is the meaning of “Promises of Distributed Databases”?
The promises of distributed databases are their advantages.
What is the first promise of distributed databases?
The first promise or advantage of distributed databases is
1. transparent management of distributed, fragmented, and replicated data
What is transparency?
Explain transparency of data, fragmentation and replication
- transparency refers to the separation of the higher-level semantics of a system from lower-level implementation issues
- a DDBMS hides all the added complexities of distribution, allowing users to think that they are working with a single centralized system
e.g.:
an engineering firm that has offices in Boston, Mumbai, Paris, and Delhi
- they run projects and maintain a database of their employees, projects, etc.
- let us assume that the database is relational and stored in the following two relations
Emp( eno, ename, title )
Proj( pno, pname, budget )
- a third relation stores salary information
SAL( title, amt )
- a fourth relation records which employees are assigned to which projects, with the duration and responsibility of each assignment
ASG( eno, pno, resp, dur )
if we want to find the names and salaries of employees who worked on a project for more than 12 months, the query we would write is:
SELECT ename, amt
FROM Emp, ASG, SAL
WHERE ASG.dur > 12
AND Emp.eno = ASG.eno
AND SAL.title = Emp.title
depending on the query, the system may have to search the databases at different sites (Boston, Paris, etc.)
to process queries quickly, we partition each of the relations and store each partition at a different site
this is known as fragmentation
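The corrected form of the query above can be tried on a single machine. The following is a minimal sketch using Python's built-in sqlite3 module; the sample rows and the helper name long_assignments are invented for illustration and are not part of the original example.

```python
import sqlite3

def long_assignments(conn, min_months=12):
    """Names and salaries of employees assigned to a project for more than min_months."""
    return conn.execute(
        """
        SELECT Emp.ename, SAL.amt
        FROM Emp, ASG, SAL
        WHERE ASG.dur > ?
          AND Emp.eno = ASG.eno
          AND SAL.title = Emp.title
        ORDER BY Emp.ename
        """,
        (min_months,),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emp (eno TEXT PRIMARY KEY, ename TEXT, title TEXT);
    CREATE TABLE SAL (title TEXT PRIMARY KEY, amt INTEGER);
    CREATE TABLE ASG (eno TEXT, pno TEXT, resp TEXT, dur INTEGER);
    -- Hypothetical sample rows, invented for illustration only.
    INSERT INTO Emp VALUES ('E1', 'Asha', 'Engineer'), ('E2', 'Ben', 'Analyst');
    INSERT INTO SAL VALUES ('Engineer', 40000), ('Analyst', 34000);
    INSERT INTO ASG VALUES ('E1', 'P1', 'Manager', 18), ('E2', 'P1', 'Analyst', 6);
""")

rows = long_assignments(conn)
print(rows)  # [('Asha', 40000)]
```

Only the employee whose assignment lasted 18 months is returned. In a real DDBMS the three relations could live at different sites, and the same query text would still be used.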
- data independence (DI)
- DI is a fundamental form of transparency
- it is the capacity to change the database schema at one level of a database system without affecting the schema at the next higher level
2 types
- logical DI
- physical DI
logical DI is the immunity of user applications to changes in the logical structure (schema) of the database
physical DI deals with hiding the details of the storage structure from user applications
- network transparency / distribution transparency
- besides the data, the user should be protected from the operational details of the network
- it allows a user to access a resource (application program or data) without needing to know whether the resource is located on the local machine or on a remote machine
- replication transparency
- replication transparency ensures that the replication of databases is hidden from the users
- it enables users to query a table as if only a single copy of the table existed
- fragmentation transparency
- each database relation is divided into smaller fragments, and each fragment is treated as a separate database object
- this is done for reasons of performance, availability and reliability; with fragmentation transparency, users still query the whole relation and the DDBMS maps the query onto the fragments
so, to provide easy and efficient access to the DBMS, we need full transparency
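Fragmentation transparency can be illustrated with a small sketch. It assumes the Proj relation is split horizontally, with each site holding only its own projects, and it uses one in-memory SQLite database per site. The site contents, SITES, make_site and projects_over are hypothetical names and data made up for this illustration; a real DDBMS does this mapping inside the query processor.

```python
import sqlite3

def make_site(rows):
    """Create one site's local database holding a horizontal fragment of Proj."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Proj (pno TEXT, pname TEXT, budget INTEGER)")
    conn.executemany("INSERT INTO Proj VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn

# Hypothetical fragments: each site stores only the projects it runs.
SITES = {
    "boston": make_site([("P1", "Instrumentation", 150000)]),
    "paris": make_site([("P2", "Database Development", 135000)]),
    "delhi": make_site([("P3", "CAD/CAM", 250000)]),
}

def projects_over(min_budget):
    """Global view of Proj: run the same selection on every fragment and merge the results."""
    rows = []
    for conn in SITES.values():
        rows.extend(
            conn.execute(
                "SELECT pno, pname, budget FROM Proj WHERE budget > ?",
                (min_budget,),
            ).fetchall()
        )
    return sorted(rows)

expensive = projects_over(140000)
print(expensive)  # [('P1', 'Instrumentation', 150000), ('P3', 'CAD/CAM', 250000)]
```

The caller of projects_over never names a site or a fragment, which is the point of fragmentation transparency: the union of the fragments behaves like the single relation Proj.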
What is the second promise or advantage of distributed databases?
reliability through distributed transactions
Explain reliability through distributed transactions
- distributed DBMSs are designed to improve reliability: having replicated components eliminates single points of failure
- so the failure of a single site or link does not bring down the entire system
- in a distributed database, proper care is taken so that even when one part has failed, users may still be permitted to access the other parts
- this proper care comes in the form of support for distributed transactions
- a transaction is a basic unit of consistent and reliable computing, consisting of a sequence of database operations executed as an atomic action
e.g.: a transaction based on the engineering firm
- assume that there is an application that raises the salaries of all the employees by 10%
- if the system fails in the middle of this transaction, we would like the DBMS to be able to determine, upon recovery, where it left off and continue with its operation
- distributed transactions execute at a number of sites, at each of which they access the local database
- the aim is that no transaction is left interrupted halfway
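The atomic behaviour described for the salary update can be sketched on a single site with sqlite3. This is only an illustration of atomicity (all updates happen or none do); it rolls back on failure rather than resuming, and a real distributed transaction would additionally need a commit protocol across sites. The function raise_salaries, its fail_after test hook and the sample rows are hypothetical.

```python
import sqlite3

def raise_salaries(conn, percent, fail_after=None):
    """Raise every salary by `percent` as one atomic transaction.

    fail_after is a hypothetical test hook: it simulates a crash after that many updates.
    Returns True if the transaction committed, False if it was rolled back.
    """
    titles = [t for (t,) in conn.execute("SELECT title FROM SAL ORDER BY title")]
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for done, title in enumerate(titles):
                if fail_after is not None and done == fail_after:
                    raise RuntimeError("simulated failure mid-transaction")
                conn.execute(
                    "UPDATE SAL SET amt = amt * (100 + ?) / 100 WHERE title = ?",
                    (percent, title),
                )
        return True
    except RuntimeError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SAL (title TEXT PRIMARY KEY, amt INTEGER)")
conn.executemany(
    "INSERT INTO SAL VALUES (?, ?)", [("Analyst", 34000), ("Engineer", 40000)]
)
conn.commit()

committed_first = raise_salaries(conn, 10, fail_after=1)  # fails after one update
after_failure = dict(conn.execute("SELECT title, amt FROM SAL"))

committed_second = raise_salaries(conn, 10)  # runs to completion
after_success = dict(conn.execute("SELECT title, amt FROM SAL"))

print(committed_first, after_failure)
print(committed_second, after_success)
```

After the simulated failure, the one salary that had already been raised is restored, so no employee ends up with a raise that the others did not get.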
What is the third promise or advantage of distributed databases?
improved performance
- a distributed DBMS fragments the conceptual database so that data is stored close to where it is used; this is also called data localization
advantages
1. since each site handles only a portion of the database, contention for CPU and I/O services is not as severe
2. localization reduces remote access delays
- distributed systems also have inherent parallelism that can be exploited
- this takes the form of inter-query and intra-query parallelism
- inter-query parallelism is the ability to execute multiple queries at the same time
- intra-query parallelism is achieved by breaking up a single query into a number of sub-queries, each of which is executed at a different site, accessing a different part of the distributed database
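Intra-query parallelism can be sketched with threads standing in for sites. In this illustration each "site" is just a Python list holding a hypothetical fragment of ASG, and FRAGMENTS, subquery and parallel_query are made-up names; real systems ship the sub-queries over the network to separate machines.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical horizontal fragments of ASG(eno, pno, resp, dur), one list per site.
FRAGMENTS = {
    "boston": [("E1", "P1", "Manager", 18), ("E2", "P1", "Analyst", 6)],
    "paris": [("E3", "P2", "Engineer", 24)],
    "delhi": [("E4", "P3", "Programmer", 12), ("E5", "P3", "Engineer", 36)],
}

def subquery(site, min_dur):
    """Sub-query run at one site: employee numbers in the local fragment with dur > min_dur."""
    return [eno for (eno, _pno, _resp, dur) in FRAGMENTS[site] if dur > min_dur]

def parallel_query(min_dur):
    """Split one query into per-site sub-queries, run them concurrently, merge the results."""
    with ThreadPoolExecutor(max_workers=len(FRAGMENTS)) as pool:
        parts = list(pool.map(lambda site: subquery(site, min_dur), FRAGMENTS))
    return sorted(eno for part in parts for eno in part)

result = parallel_query(12)
print(result)  # ['E1', 'E3', 'E5']
```

Inter-query parallelism would instead submit several independent queries to the pool at once, one task per query rather than one task per fragment.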
What is the fourth advantage or promise of distributed databases?
- easier system expansion
in a distributed environment it is much easier to accommodate increasing database sizes
in general, expansion can be handled by adding processing and storage power to the network
- how much is gained also depends on the overhead of distribution
- one aspect of easier system expansion is economics
- it normally costs much less to put together a system of smaller computers with the equivalent power of a single big machine
What are some problem areas of distributed databases?
complexity
data replication
overall cost
security issue
integrity control
lacking standards
Explain the complex nature of a DDBMS
distributed databases are networks of many computers at different locations, and they provide an outstanding level of performance, availability and, of course, reliability
therefore a distributed database is more complex in nature than a centralized database
we also need complex and advanced software to manage distributed databases
it must also keep data redundancy under control across sites, which adds even more complexity
Explain overall cost in detail
costs such as maintenance, procurement, hardware, network / communication and labor costs add up to the overall cost and make a DDBMS costlier than a centralized DBMS
Explain the security issues of distributed databases
- along with avoiding data redundancy/duplication, the security of the data as well as of the network is a prime concern
- the network can be attacked for data theft and misuse
Explain integrity control
- in a vast distributed database system, maintaining data consistency is important
- all changes made to data at one site must be reflected at all the sites
- enforcing data integrity therefore carries a high communication and processing cost in a distributed DBMS
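Why integrity enforcement is costly can be seen in a toy model where every write must reach every site. The class below is a deliberately simplified sketch (write to all copies, read from any one) that ignores failures and concurrency; ReplicatedTable and its methods are hypothetical names, not a real replica control protocol.

```python
class ReplicatedTable:
    """Toy fully replicated key-value table: every write goes to all sites."""

    def __init__(self, site_names):
        self.replicas = {name: {} for name in site_names}
        self.messages_sent = 0  # one update message per replica written

    def write(self, key, value):
        """Apply the change at every site so that all copies stay identical."""
        for replica in self.replicas.values():
            replica[key] = value
            self.messages_sent += 1

    def read(self, key, site):
        """Read from a single site; any copy will do because writes reach all of them."""
        return self.replicas[site][key]

    def is_consistent(self):
        """True when every site holds exactly the same data."""
        copies = list(self.replicas.values())
        return all(copy == copies[0] for copy in copies)

table = ReplicatedTable(["boston", "mumbai", "paris", "delhi"])
table.write("E1", 40000)
table.write("E1", 44000)

print(table.read("E1", "paris"))  # 44000
print(table.is_consistent())      # True
print(table.messages_sent)        # 8
```

Two logical updates cost eight messages with four sites, which is the communication overhead the card refers to.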
What is standardization?
the process of transforming data from various sources into a consistent format, ensuring uniformity in data structure, naming conventions, and values, which allows for easier integration and organization
What does "abstract view" mean?
An abstract view means understanding how the database works at a high level.
We do not look deep into its internals.
We look from the top at which components are involved
and which components communicate with each other.
These are the things we study.
What are architectural models? What do they show us?
Architectural models show us an abstract view of distributed systems.
How do architectural models help?
Architectural models help simplify reasoning about the system.
What are the architectural models for distributed DBMSs?
client-server
peer to peer
multi database
Explain client-server architecture.
1. What does it consist of?
2. What are the responsibilities of servers?
3. How are clients and servers connected?
- it consists of clients and servers
- servers host, manage and deliver services to clients
- clients and servers are connected through a computer network, and they often communicate over the internet
Client-server architecture
1. How does it work?
- when a client needs a service, it sends a request to the server; the server processes the request and sends a response back to the client
Explain step by step how a client interacts with a server
- the user enters a URL in the browser
- the browser sends a request to a DNS server to look up the IP address of the web server
- the DNS server sends the IP address of the web server to the browser
- the browser sends an HTTP request to the IP address of the server
- the web server sends the necessary files back to the browser
- the browser renders the files and displays the data
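The request and response steps above can be tried locally with Python's standard library. This is a minimal sketch: a tiny web server runs in a background thread and a client fetches one page from it. The DNS lookup step is skipped (127.0.0.1 stands in for the looked-up IP address), and HelloHandler and fetch are hypothetical names used only here.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    """Server side: answer every GET request with a small HTML page."""

    def do_GET(self):
        body = b"<h1>hello from the server</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet

def fetch(host, port, path="/"):
    """Client side: send an HTTP GET request and return (status, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()

# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

status, body = fetch("127.0.0.1", server.server_port)
print(status, body.decode())

server.shutdown()
server.server_close()
```

A browser would go one step further and render the returned HTML; here the client simply prints it.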
Which processes exist in the client-server model?
The client-server model has two types of processes:
- client process
- server process
Of the two processes, client process and server process,
which one is the requesting process?
Of the two processes, the client process is called the requesting process.
Of the two parts of a client-server architecture, clients and servers, which one is called "service providing"?
Of the two, the client is called the service requester,
and the server is called the service provider.
Which protocol does client-server work on?
Client-server works on a request/reply protocol.
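The request/reply pattern between a client process and a server process can be sketched with a connected pair of sockets. Threads stand in for the two processes here, the "service" (upper-casing the text) is invented for illustration, and each message is assumed small enough to arrive in a single recv call.

```python
import socket
import threading

def server_process(sock):
    """Service provider: wait for one request, compute the reply, send it back."""
    request = sock.recv(1024).decode()
    reply = request.upper()  # hypothetical service: upper-case the request text
    sock.sendall(reply.encode())
    sock.close()

def client_process(sock, message):
    """Service requester: send one request and block until the reply arrives."""
    sock.sendall(message.encode())
    reply = sock.recv(1024).decode()
    sock.close()
    return reply

# socketpair() returns two already-connected sockets, so no port is needed.
client_sock, server_sock = socket.socketpair()
worker = threading.Thread(target=server_process, args=(server_sock,))
worker.start()

reply = client_process(client_sock, "select ename from emp")
worker.join()
print(reply)  # SELECT ENAME FROM EMP
```

The client always speaks first and the server only answers, which is what distinguishes the requesting process from the service-providing process.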
What are the types of client-server architecture?
- two tier
- three tier
- N tier
advantages of client server
disadvantages of client server
use cases of client server
FTPs
Web servers
web browsers
DNS