Midterm 1 Flashcards

First Midterm Exam

1
Q

Is the result of the union of two
technologies: «Database Systems» and «Computer Networks»

A

Distributed Database System (DDBS)

2
Q

A number of autonomous processing elements (not necessarily
homogeneous) that are interconnected by a computer network and that
cooperate in performing their assigned tasks.

A

Distributed Computing

3
Q

Is a collection of multiple, logically
interrelated databases distributed over a computer network.

A

distributed database (DDB)

4
Q

Is the software
that manages the DDB and provides an access mechanism that makes
this distribution transparent to the users.

A

distributed database management system (D–DBMS)

5
Q

A “collection of files” individually stored at each node of a computer
network is a DDBS. (True or False)

A

False

6
Q

A database system which resides at one of the nodes of a network of
computers is a DDBS (True or False)

A

False, this is a centralized database on a network node

7
Q

Is related to how data are delivered from the sites where they are
stored to where the query is posed.

A

Data delivery

8
Q

The three orthogonal dimensions of data delivery alternatives
(DDA)

A

Delivery modes,
Delivery frequency,
Communication methods

9
Q

When a client request is received at a server, the server responds by locating the requested information. The arrival of new data items or updates to existing data items are carried out at a server without notification to clients unless clients explicitly poll the server

A

Delivery mode: Pull-only

10
Q

The transfer of the data from servers to clients is initiated by a server push in the absence of any specific request from clients. The main difficulty of this approach is in deciding which data would be of common interest, and when to send them to clients. Alternatives are periodic, irregular, or conditional. The usefulness of server push depends heavily upon the accuracy of a server to predict the needs of clients (Broadcast or Multicast).

A

Delivery mode: Push-only

11
Q

Combines the client-pull and server-push mechanisms. The continuous (or continual) query approach presents one possible way of combining the pull and push modes: the transfer of information from servers to clients is first initiated by a client pull (by posing the query), and the subsequent transfer of updated information to clients is initiated by a server push.

A

Delivery mode: Hybrid
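
The three delivery modes can be sketched in a few lines of Python. This is only an illustration: the `Server` class and its `pull`, `subscribe`, and `update` methods are hypothetical names, not part of the course material. A hybrid client pulls once and then receives pushed updates.

```python
class Server:
    """Hypothetical in-memory server illustrating pull, push, and hybrid delivery."""

    def __init__(self):
        self.data = {}          # item name -> current value
        self.subscribers = {}   # item name -> list of client callbacks

    def pull(self, key):
        # Pull-only: the server answers only when a client asks.
        return self.data.get(key)

    def subscribe(self, key, callback):
        # Push: the client registers interest; the server decides when to send.
        self.subscribers.setdefault(key, []).append(callback)

    def update(self, key, value):
        self.data[key] = value
        for callback in self.subscribers.get(key, []):
            callback(key, value)   # server-initiated transfer


server = Server()
server.update("price", 10)

received = []
# Hybrid (continuous query): first a client pull, then server pushes.
received.append(server.pull("price"))
server.subscribe("price", lambda key, value: received.append(value))
server.update("price", 12)
server.update("price", 15)
```

A pull-only client would call `pull` repeatedly (polling) and could miss intermediate values; the subscription removes that need at the cost of the server tracking its clients.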

12
Q

Data are sent from the server to clients at regular intervals. The intervals can be defined by system default or by clients using their profiles. Both pull and push can be performed with this frequency. Delivery is carried out on a regular, pre-specified repeating schedule.

A

Frequency: Periodic

13
Q

In this delivery, data are sent from servers whenever certain conditions installed by clients in their profiles are satisfied. Such conditions can be as simple as a given time span or as complicated as event-condition-action rules. It is mostly used in hybrid or push-only delivery systems.

A

Frequency: Conditional

14
Q

This delivery is performed mostly in a pure pull-based system. Data are pulled from servers to clients whenever clients request it. In contrast, periodic pull arises when a client uses polling to obtain data from servers based on a regular period (schedule).

A

Frequency: Ad-hoc or Irregular

15
Q

In this method, the communication is performed from a server to a client in a one-to-one fashion; the server sends data to one client using a particular delivery mode with some frequency.

A

Communication Methods: Unicast

16
Q

In this method, as the name implies, the server sends data to a number of clients. Note that we are not referring here to a specific protocol; _________ communication may use a multicast or broadcast protocol.

A

One-to-many

17
Q

Transparent management of distributed, fragmented, and replicated data
Improved reliability/availability through distributed transactions
Improved performance
Easier and more economical system expansion

A

Distributed DBMS Promises

18
Q

Refers to separation of the higher-level semantics of a system from lower-level implementation issues. A transparent system “hides” the implementation details from users. The advantage of a fully transparent DBMS is the high level of support that it provides for the development of complex applications

A

Transparency

19
Q

Is a fundamental form of transparency that we look for within a DBMS. It refers to the immunity of user applications to changes in the definition and organization of data, and vice versa.

A

Data independence

20
Q

Refers to the immunity of user applications to changes in the logical structure (schema) of the database.

A

Logical data independence

21
Q

Deals with hiding the details of the storage structure from user applications.

A

Physical data independence

22
Q

Protects the user from the operational details of the network, possibly even hiding the existence of the network.

A

Network transparency

23
Q

This refers to the fact that the command used to perform a task is independent of both the location of the data and the system on which an operation is carried out.

A

Location transparency

24
Q

This means that a unique name is provided for each object in the database. In the absence of ___________, users are required to embed the location name (or an identifier) as part of the object name

A

Naming transparency

25
For performance, reliability, and availability reasons, it is usually desirable to be able to distribute data in a replicated fashion across machines on a network. The ________________ issue is whether the users should be aware of the existence of copies or whether the system should handle the management of copies and the user should act as if there is a single copy of the data.
Replication transparency
26
It is commonly desirable to divide each database relation into smaller fragments and treat each fragment as a separate database object (i.e., another relation). This is done for reasons of performance, availability, and reliability.
Fragmentation transparency
27
A relation is partitioned into a set of sub-relations, each of which has a subset of the tuples (rows) of the original relation.
Horizontal fragmentation
28
Each sub-relation is defined on a subset of the attributes (columns) of the original relation.
Vertical fragmentation
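
A small Python sketch may help contrast the two kinds of fragmentation. The `EMP` relation, its attribute names, and the helper functions are invented for illustration; the sketch also assumes every vertical fragment keeps the key `eno` so the relation can be rebuilt.

```python
# Hypothetical EMP relation; each dict is one tuple.
EMP = [
    {"eno": 1, "name": "Ana", "city": "Paris"},
    {"eno": 2, "name": "Luis", "city": "Lima"},
    {"eno": 3, "name": "Mei", "city": "Paris"},
]

def horizontal_fragment(relation, predicate):
    # Subset of tuples (rows) that satisfy the predicate.
    return [t for t in relation if predicate(t)]

def vertical_fragment(relation, attributes):
    # Subset of attributes (columns) of every tuple.
    return [{a: t[a] for a in attributes} for t in relation]

emp_paris = horizontal_fragment(EMP, lambda t: t["city"] == "Paris")
emp_other = horizontal_fragment(EMP, lambda t: t["city"] != "Paris")

emp_names = vertical_fragment(EMP, ["eno", "name"])
emp_cities = vertical_fragment(EMP, ["eno", "city"])

# Horizontal fragments are recombined by union.
rebuilt_h = sorted(emp_paris + emp_other, key=lambda t: t["eno"])

# Vertical fragments are recombined by joining on the key.
cities_by_eno = {t["eno"]: t for t in emp_cities}
rebuilt_v = [{**t, **cities_by_eno[t["eno"]]} for t in emp_names]
```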
29
Is a basic unit of consistent and reliable computing, consisting of a sequence of database operations executed as an atomic action.
Transaction
30
It consists of the transformation of a consistent database state into another consistent database state, even when several such transactions are executed concurrently.
Concurrency transparency
31
Guarantees that, even when failures occur while several transactions are being carried out at the same time, either all of the operations of a transaction are performed or none of them are.
Failure atomicity
32
They are protocols that control simultaneous transactions in distributed databases.
Distributed concurrency control protocols
33
Protocols that perform commit operations and recover unfinished transactions.
Commit protocols
34
Great for read-intensive workloads, problematic for updates. It is a mechanism to improve reliability in large distributed databases (very common in NoSQL databases).
Data replication
35
Protocols designed to perform replication tasks considering principles of distributed computing.
Replication protocols.
36
A DBMS fragments the conceptual database, enabling data to be stored close to its points of use
Localization
37
This results from the ability to execute multiple queries at the same time.
Inter-query parallelism
38
This is achieved by breaking up a single query into several subqueries each of which is executed at a different site, accessing a different part of the distributed database.
Intra-query parallelism
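
As a rough illustration of intra-query parallelism, the sketch below splits one aggregate query into per-site subqueries and merges the partial results. The site names and values are made up, and threads on one machine stand in for separate sites, which is an assumption of the example rather than how a real distributed DBMS is deployed.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fragments of one relation, one fragment per site.
sites = {
    "site_a": [120, 80, 45],
    "site_b": [200, 15],
    "site_c": [60, 60, 60],
}

def local_sum(fragment):
    # Subquery executed against a single site's fragment.
    return sum(fragment)

# Each subquery runs concurrently; map() returns results in input order.
with ThreadPoolExecutor(max_workers=len(sites)) as pool:
    partial_sums = list(pool.map(local_sum, sites.values()))

total = sum(partial_sums)   # final merge of the subquery results
```

Inter-query parallelism would instead submit several independent queries to the pool at the same time.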
39
These are:
- Have as much of the data required by each application at the site where the application executes
- Full replication
- Mutual consistency
- Freshness of copies
Parallelism Requirements
40
DBMS Issues:
- How to distribute the database
- Replicated and non-replicated database distribution
- A related problem in directory management
Distributed Database Design
41
DBMS Issues:
- Convert user transactions to data manipulation instructions
- Optimization problem: min{cost = data transmission + local processing}
- General formulation is NP-hard
Query Processing
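
The optimization objective on this card, min{cost = data transmission + local processing}, can be illustrated with a toy comparison of candidate plans. The plan names and cost figures below are invented; a real optimizer estimates them from statistics and, because the general problem is NP-hard, cannot enumerate every plan as this sketch does.

```python
# Hypothetical candidate execution plans with made-up cost estimates.
plans = {
    "ship_whole_relation": {"transmission": 500.0, "local": 40.0},
    "semijoin_then_ship": {"transmission": 120.0, "local": 90.0},
    "join_at_remote_site": {"transmission": 200.0, "local": 30.0},
}

def total_cost(plan):
    # cost = data transmission + local processing
    return plan["transmission"] + plan["local"]

# Pick the plan with the minimum total cost.
best = min(plans, key=lambda name: total_cost(plans[name]))
```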
42
DBMS Issues:
- Synchronization of concurrent accesses
- Consistency and isolation of transactions' effects
- Deadlock management
Concurrency Control
43
DBMS Issues:
- How to make the system resilient to failures
- Atomicity and durability
Reliability
44
Defines the structure of the system:
- Components identified
- Functions of each component defined
- Interrelationships and interactions between components defined
Architecture
45
Related Issues:
- Operating system with proper support for database operations
- Dichotomy between general-purpose processing requirements and database processing requirements
Operating System Support
46
Related Issues:
- Distributed multidatabase systems
- More probable scenario
- Parallel issues
Open Systems and Interoperability
47
Dimensions of the problem: Whether the components of the system are located on the same machine or not.
Distribution
48
Dimensions of the problem: Various levels (hardware, communications, operating system); the DBMS level is the important one (data model, query language, transaction management algorithms).
Heterogeneity
49
Dimensions of the problem: Ability of a component DBMS to decide on issues related to its own design.
Design autonomy
50
Dimensions of the problem: Ability of a component DBMS to decide whether and how to communicate with other DBMSs.
Communication autonomy
51
Ability of a component DBMS to execute local operations in any manner it wants to.
Execution autonomy
52
These are:
- More efficient division of labor
- Horizontal and vertical scaling of resources
- Better price/performance on client machines
- Ability to use familiar tools on client machines
- Client access to remote data (via standards)
- Full DBMS functionality provided to client workstations
- Overall better system price/performance
Advantages of Client-Server Architectures
53
Is a finite, time-varying set of n-tuples (d1, d2, ..., dn) such that d1 ∈ Dom1, d2 ∈ Dom2, ..., dn ∈ Domn, and A1 ⊆ D1, A2 ⊆ D2, ..., An ⊆ Dn.
Relation with Attributes defined over n Domains
54
Tabular structure of data where:
- R is the table heading
- Attributes are table columns
- Each tuple is a row
Relational Model
55
Is the definition; i.e., a set of attributes
Relation scheme
56
Is a set of relation schemes: i.e., a set of sets of attributes
Relational Database Scheme
57
Is an instance of a relation scheme
Relation
58
Is a subset of the Cartesian product of the domains of all attributes, i.e., r ⊆ Dom1 × Dom2 × … × Domn
Relation over a relation scheme
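
The definition can be checked mechanically in Python, treating each domain as a set of acceptable values. The domain contents and variable names are hypothetical and chosen only to keep the example small.

```python
from itertools import product

# Hypothetical domains for two attributes.
dom_name = {"Ana", "Luis"}
dom_dept = {"Sales", "IT"}

# Cartesian product of the domains: every possible (name, dept) pair.
cartesian = set(product(dom_name, dom_dept))

# A relation is any subset of that product.
relation = {("Ana", "Sales"), ("Luis", "IT")}
is_valid = relation <= cartesian
```

A tuple such as `("Ana", "HR")` could not belong to any relation over these domains, because `"HR"` is not in `dom_dept`.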
59
Is a type in the programming language sense.
Domain
60
Is a set of acceptable values for a variable of a given type.
Domain values
61
Binary operations (e.g., comparison to one another, addition, etc.) can be performed on them.
Domain compatibility
62
Attribute values are repeated for each project that the employee is involved in.
Repetition Anomaly
63
If any attribute of project is updated, multiple tuples have to be updated to reflect the change.
Update Anomaly
64
It may not be possible to store information about a new project until an employee is assigned to it.
Insertion Anomaly
65
If an engineer, who is the only employee on a project, leaves the company, his personal information cannot be deleted, or the information about that project is lost. May have to delete many tuples.
Deletion Anomaly
66
Is a process of concept separation which applies a top-down methodology for producing a schema by subsequent refinements and decompositions.
Normalization
67
Criterion that the decomposed schemas should follow: the original relation can be recovered (no spurious joins).
Reconstructability
68
Criterion that the decomposed schemas should follow: no information loss.
Lossless decomposition
69
Criterion that the decomposed schemas should follow: the constraints (i.e., dependencies) that hold on the original relation should be enforceable by means of the constraints (i.e., dependencies) defined on the decomposed relations.
Dependency preservation
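
The reconstructability and lossless criteria can be tested on a toy relation: decompose, join the fragments back, and compare with the original. The relation `EMP_PROJ`, its attributes, and the helper functions `project` and `natural_join` are assumptions made for this sketch.

```python
def project(rel, attrs):
    # Projection with duplicate elimination; rel is a list of dicts.
    return {tuple(t[a] for a in attrs) for t in rel}

def natural_join(r, r_attrs, s, s_attrs):
    # Join on all attributes common to both fragments, keeping one copy of them.
    common = [a for a in r_attrs if a in s_attrs]
    out_attrs = r_attrs + [a for a in s_attrs if a not in common]
    result = set()
    for x in r:
        for y in s:
            xd, yd = dict(zip(r_attrs, x)), dict(zip(s_attrs, y))
            if all(xd[a] == yd[a] for a in common):
                merged = {**xd, **yd}
                result.add(tuple(merged[a] for a in out_attrs))
    return result

# Two different employees who happen to share a name.
EMP_PROJ = [
    {"eno": 1, "ename": "Ana", "pno": "P1"},
    {"eno": 2, "ename": "Ana", "pno": "P2"},
]
original = project(EMP_PROJ, ["eno", "ename", "pno"])

# Decomposition on the key eno: the join recovers the original relation.
good = natural_join(project(EMP_PROJ, ["eno", "ename"]), ["eno", "ename"],
                    project(EMP_PROJ, ["eno", "pno"]), ["eno", "pno"])

# Decomposition on the non-key ename: the join produces spurious tuples.
bad = natural_join(project(EMP_PROJ, ["eno", "ename"]), ["eno", "ename"],
                   project(EMP_PROJ, ["ename", "pno"]), ["ename", "pno"])
```

The second decomposition yields four tuples instead of two, so the original relation can no longer be told apart from the spurious combinations.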
70
Eliminates the relations within relations or relations as attributes of tuples.
First Normal Form (1NF)
71
Eliminates the partial functional dependencies of non-prime attributes on key attributes.
Second Normal Form (2NF)
72
Eliminates the transitive functional dependencies of non-prime attributes on key attributes.
Third Normal Form (3NF)
73
Eliminates the partial and transitive functional dependencies of prime (key) attributes on the key.
Boyce-Codd Normal Form (BCNF)
74
Specify how to obtain the result using a set of operators
Relational Algebra
75
Relational Algebra Operators: Produces a horizontal subset of the operand relation
Selection (sigma)
76
Relational Algebra Operators: Produces a vertical slice of a relation
Projection
77
Relational Algebra Operators: R ∪ S = {t | t ∈ R or t ∈ S}, where R, S are relations and t is a tuple variable. The result contains tuples that are in R, in S, or in both (duplicates removed). R, S should be union-compatible.
Union
78
Relational Algebra Operators: R – S = {t | t ∈ R and t ∉ S}, where R and S are relations and t is a tuple variable. The result contains all tuples that are in R but not in S. R – S ≠ S – R. R, S should be union-compatible.
Set Difference
79
Relational Algebra Operators: For R of degree k1 with n1 tuples and S of degree k2 with n2 tuples, the result of R × S is a relation of degree (k1 + k2) that consists of all n1 * n2 tuples, where each tuple is a concatenation of one tuple of R with one tuple of S.
Cartesian (Cross) Product
80
Relational Algebra Operators: R ∩ S = {t | t ∈ R and t ∈ S} = R – (R – S). R, S should be union-compatible.
Intersection
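
The basic operators map directly onto Python sets of tuples, which makes the definitions easy to check. The relations `R` and `S` below are made-up, union-compatible examples.

```python
# Two hypothetical union-compatible relations (same degree, same domains).
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (4, "d")}

union = R | S                        # tuples in R, in S, or in both
difference = R - S                   # tuples in R but not in S
intersection = R & S                 # tuples in both R and S
cross = {r + s for r in R for s in S}        # Cartesian product
selection = {t for t in R if t[0] >= 2}      # horizontal subset of R
projection = {(t[1],) for t in R}            # vertical slice of R
```

Because sets drop duplicates automatically, `union` has four tuples rather than five, and `R - (R - S)` gives the same result as `R & S`.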
81
Relational Algebra Operators: Is a derivative of the Cartesian product. There are various forms of join. The primary classification is between inner join and outer join.
Join
82
The formula F contains only equality (=) as the comparison operator, as in R ⋈_{R.A=S.B} S. This special case of the θ-join is called the
Equi-Join
83
It is an equi-join of two relations R and S over an attribute (or attributes) common to both R and S, projecting out one copy of those attributes.
Natural Join
84
Inner join requires the joined tuples from two operand relations to satisfy the join predicate. In contrast, in ___________ tuples exist in the result relation regardless. It ensures that tuples from one or both relations that do not satisfy the join condition still appear in the final result, with the other relation's attribute values set to NULL.
Outer-Join
85
The tuples from the left operand relation are always in the result.
Left outer join
86
The tuples from the right operand relation are always in the result.
Right outer join
87
Tuples from both relations are always in the result.
Full outer join
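
The difference between inner and outer joins shows up clearly on a small example. In this sketch the relations `EMP` and `DEPT` are hypothetical, tuples are plain Python tuples, and `None` plays the role of NULL.

```python
EMP = [(1, "Ana", 10), (2, "Luis", 20), (3, "Mei", None)]   # (eno, name, dno)
DEPT = [(10, "Sales"), (30, "IT")]                          # (dno, dname)

def inner_join(emp, dept):
    # Only pairs that satisfy the join predicate EMP.dno = DEPT.dno.
    return [e + d for e in emp for d in dept if e[2] == d[0]]

def left_outer_join(emp, dept):
    # Every EMP tuple appears; unmatched ones are padded with None.
    result = []
    for e in emp:
        matches = [e + d for d in dept if e[2] == d[0]]
        result.extend(matches or [e + (None, None)])
    return result

def full_outer_join(emp, dept):
    # Left outer join plus the DEPT tuples that matched no EMP tuple.
    result = left_outer_join(emp, dept)
    matched = {e[2] for e in emp}
    result += [(None, None, None) + d for d in dept if d[0] not in matched]
    return result

inner = inner_join(EMP, DEPT)
left = left_outer_join(EMP, DEPT)
full = full_outer_join(EMP, DEPT)
```

A right outer join is the mirror image of `left_outer_join` with the operands swapped.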
88
The ________ of relation R, defined over the set of attributes A, by relation S, defined over the set of attributes B, is the subset of tuples of R that participate in the join of R with S.
Semijoin
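
A semijoin keeps only the tuples of R that would take part in the join, without attaching any attributes of S. The relations and the `semijoin` helper below are illustrative assumptions.

```python
R = [(1, "Ana", 10), (2, "Luis", 20), (3, "Mei", 10)]   # (eno, name, dno)
S = [(10, "Sales"), (30, "IT")]                         # (dno, dname)

def semijoin(r, s, r_idx, s_idx):
    # Only the join column of S is needed, not its full tuples.
    join_values = {t[s_idx] for t in s}
    return [t for t in r if t[r_idx] in join_values]

result = semijoin(R, S, 2, 0)
```

In a distributed setting this matters because only the join column of S has to be shipped to the site of R, which is usually much smaller than S itself.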
89
R of degree k1 (R = {A1,…,Ak1}), S of degree k2 (S = {B1,…,Bk2}). Let A = {A1,…,Ak1} [i.e., R(A)] and B = {B1,…,Bk2} [i.e., S(B)], with B ⊆ A. T = R ÷ S gives T of degree k1 – k2 [i.e., T(Y) where Y = A – B] such that for a tuple t to appear in T, the values in t must appear in R in combination with every tuple in S.
Division (Quotient)
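
Division answers "which values appear with every tuple of S?". The sketch below assumes, for simplicity, that the attributes of S are the trailing attributes of R and that S is non-empty; the data and the `divide` helper are made up.

```python
# R(employee, project): who works on which project.
R = {("e1", "p1"), ("e1", "p2"), ("e2", "p1"), ("e3", "p1"), ("e3", "p2")}
# S(project): the projects of interest.
S = {("p1",), ("p2",)}

def divide(r, s):
    k = len(next(iter(s)))                  # degree of S
    candidates = {t[:-k] for t in r}        # projection of R on A - B
    # Keep a candidate only if it is paired in R with every tuple of S.
    return {c for c in candidates if all(c + b in r for b in s)}

T = divide(R, S)
```

Here `T` contains the employees assigned to both `p1` and `p2`; `e2` is excluded because it is paired only with `p1`.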
90
Specify the properties that the result should hold
Relational Calculus
91
Query of the form {t | F(t)}, where t is a tuple variable and F is a well-formed formula.
Tuple Relational Calculus
92
Query of the form x1, x2, …, xn | F(x1, x2, …, xn), where F is a well-formed formula in which x1, x2, …, xn are the free variables.
Domain Relational Calculus
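
Python set comprehensions read much like calculus queries, since they state the property the result must satisfy rather than the operators to apply. The `EMP` relation and the salary threshold below are invented for illustration.

```python
EMP = {("Ana", "Sales", 3000), ("Luis", "IT", 4500), ("Mei", "IT", 5200)}

# Tuple calculus style: the variable t ranges over whole tuples.
# {t | t in EMP and t.salary > 4000}
tuple_query = {t for t in EMP if t[2] > 4000}

# Domain calculus style: variables range over attribute values.
# {name | there exist dept, salary with EMP(name, dept, salary) and salary > 4000}
domain_query = {name for (name, dept, salary) in EMP if salary > 4000}
```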
93
An interconnected collection of autonomous computers that are capable of exchanging information among themselves.
Computer Network
94
Network of networks
Internet
95
✦ Distance between any two nodes > 20 km and can go as high as thousands of km
✦ Long delays due to distance traveled
✦ Heterogeneity of transmission media
✦ Speeds of 150 Mbps to 10 Gbps (OC192 on the backbone)
Wide Area Network (WAN)
96
✦ Limited in geographic scope (usually < 2 km)
✦ Speeds 10-1000 Mbps
✦ Short delays and low noise
Local Area Network (LAN)
97
✦ In between LAN and WAN
Metropolitan Area Network (MAN)
98
Types of Networks: Topologies
Irregular (Internet), Bus (Typical in LAN - Ethernet), Star, Ring, Mesh
99
- One or more (direct or indirect) links between each pair of nodes
- Communication always between two nodes
- Receiver and sender are identified by their addresses included in the message header
- Message may follow one of many links between the sender and receiver using switching or routing
Point-to-point (unicast)
100
Messages are transmitted over a shared channel and received by all the nodes. Each node checks the address and, if it is not the intended recipient, ignores the message.
Broadcast (Multi-point)
101
Message is sent to a subset of the nodes. It's a special case of broadcast.
Multi-cast
102
Communication Alternatives (Types)
Twisted pair, Coaxial, Fiber optic cable, Satellite, Microwave, Wireless
103
In data communication, hosts are connected by ________, each of which can carry one or more ________.
Links (physical entities); channels (logical entities)
104
The amount of information that can be transmitted over the channel in a given time unit
Capacity - Bandwidth
105
Messages are divided into fixed-size packets, each of which is routed from the source to the destination.
Packet Switching
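
The idea can be sketched as splitting a message into numbered packets and reassembling them at the destination. The function names and the packet size are assumptions; in this simplified version the last packet may be shorter than the others because no padding is added, and routing itself is not modeled.

```python
def packetize(message: bytes, size: int):
    # Split the message into (sequence number, payload) packets.
    count = (len(message) + size - 1) // size
    return [(seq, message[seq * size:(seq + 1) * size]) for seq in range(count)]

def reassemble(packets):
    # Packets may arrive in any order; sort by sequence number first.
    return b"".join(payload for _, payload in sorted(packets))

message = b"distributed databases"
packets = packetize(message, 8)
arrived = list(reversed(packets))   # simulate out-of-order arrival
restored = reassemble(arrived)
```

Circuit switching, by contrast, would reserve one channel for the whole session, so ordering and reassembly would not be needed.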
106
A dedicated channel is established between the sender and receiver for the duration of the session
Circuit Switching
107
Software that ensures error-free, reliable and efficient communication between hosts. TCP/IP is the best-known one (Used in the Internet)
Communication Protocols