Parcial 2 Flashcards

1
Q

Is a finite, time varying set of n-tuples (d1, d2, …, dn) such that d1 e Dom1, d2 e Dom2, …, dn e Domn, and A1 c D1, A2 c D2, …, An c Dn.

A

Relation with Attributes defined over n Domains

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Tabular structure of data where
- R is the table heading
- Attributes are table columns
- Each tuple is a row

A

Relational Model

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Is the definition; i.e., a set of attributes

A

Relation scheme

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Is a set of relation schemes: i.e., a set of sets of attributes

A

Relational Database Scheme

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Is an instance of a relation scheme

A

Relation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

is a subset of the Cartesian product of the domains of all attributes, i.e.

A

Relation over a relation scheme

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Is a type in the programming language sense.

A

Domain

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Is a set of acceptable values for a variable of a given type.

A

Domain values

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Binary operations (e.g., comparison to one another, addition, etc.) can be performed
on them.

A

Domain compatibility

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Attribute values are repeated for each
project that the employee is involved in.

A

Repetition Anomaly

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

If any attribute of project is updated, multiple tuples have to be updated to reflect the change.

A

Update Anomaly

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

It may not be possible to store information about a new project until an employee is assigned to it.

A

Insertion Anomaly

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

If an engineer, who is the only employee on a project, leaves the company, his personal information cannot be deleted, or the information about that project is lost.
May have to delete many tuples.

A

Deletion Anomaly

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Is a process of concept separation which applies a top-down methodology for producing a schema by subsequent refinements and decompositions.

A

Normalization

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Criteria that should the decomposed schemas follow: Recover the original relation -> no spurious joins

A

Reconstructability

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Criteria that should the decomposed schemas follow: No information loss

A

Lossless decomposition

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Criteria that should the decomposed schemas follow: The constraints (i.e., dependencies) that hold on the original relation should be enforceable by means of the constraints (i.e., dependencies) defined on the decomposed relations.

A

Dependency preservation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Eliminates the relations within relations or relations as attributes of tuples.

A

First Normal Form (1NF)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Eliminate the partial functional dependencies of non-prime attributes to key attributes

A

Second Normal Form (2NF)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Eliminate the transitive functional dependencies of non-prime attributes to key attributes

A

Third Normal Form (3NF)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Eliminate the partial and transitive functional dependencies of prime (key) attributes to key.

A

Boyce-Codd Normal Form (BCNF)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Specify how to obtain the result using a set of operators

A

Relational Algebra

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Relational Algebra Operators: Produces a horizontal subset of the operand relation

A

Selection (sigma)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Relational Algebra Operators: Produces a vertical slice of a relation

A

Projection

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Relational Algebra Operators: where R, S are relations, t is a tuple variable Result contains tuples that are in R or in S, but not both (duplicates removed) R, S should be union-compatible
Union
26
Relational Algebra Operators: where R and S are relations, t is a tuple variable Result contains all tuples that are in R, but not in S. R – S != S – R R, S union-compatible
Set Difference
27
Relational Algebra Operators: The result of R × S is a relation of degree (k1+ k2) and consists of all (n1* n2)-tuples where each tuple is a concatenation of one tuple of R with one tuple of S
Cartesian (Cross) Product
28
Relational Algebra Operators: R n S = {t | t e R and t e S} = R – (R – S) R, S union-compatible
Intersection
29
Relational Algebra Operators: Is derivate of Cartesian product. There are various forms of join. The primary classification is between inner join and outer join.
Join
30
The formula F only contains equality (=) as arithmetic operator. R ⋈R.A=S.B S The above example is a special case of q-join which is called the
Equi-Join
31
It is a equi-join of two relations R and S over an attribute (or attributes) common to both R and S and projecting out one copy of those attributes
Natural Join
32
Inner join requires the joined tuples from two operand relations to satisfy the join predicate. In contrast, in ___________ tuples exist in the result relation regardless Ensures that tuples from one or both relations that do not satisfy the join condition still appear in the final result with other relationʼs attribute values set to NULL
Outer-Join
33
The tuples from the left operand relation are always in the result.
Left outer join
34
The tuples from the right operand relation are always in the result.
Right outer join
35
Tuples from both relations are always in the result.
Full outer join
36
The ________ of relation R, defined over the set of attributes A, by relation S, defined over the set of attributes B, is the subset of tuples of R that participate in the join of R with S.
Semijoin
37
R of degree k1 (R = {A1,…,Ak1}) S of degree k2 (S = {B1,…,Bk2}) Let A = {A1,…,Ak1} [i.e., R(A)] and B = {B1,…,Bk2} [i.e., S(B)] and B c A. T = R ÷ S gives T of degree k1-k2 [i.e., T(Y) where Y = A-B] such that for a tuple t to appear in T, the values in t must appear in R in combination with every tuple in S.
Division (Quotient)
38
Specify the properties that the result should hold
Relational Calculus
39
Query of the form {t|F{t}} where t is a tuple variable F is a well-formed formula
Tuple Relational Calculus
40
Query of the form x1, x2, …, xn|F(x1, x2, …, xn) where F is a well-formed formula in which x1, x2, …, xn are the free variables
Domain Relational Calculus
41
An interconnected collection of autonomous computers that are capable of exchanging information among themselves.
Computer Network
42
Network of networks
Internet
43
✦ Distance between any two nodes > 20km and can go as high as thousands of kms ✦ Long delays due to distance traveled ✦ Heterogeneity of transmission media ✦ Speeds of 150Mbps to 10Gbps (OC192 on the backbone)
Wide Area Network (WAN)
44
✦ Limited in geographic scope (usually < 2km) ✦ Speeds 10-1000 Mbps ✦ Short delays and low noise
Local Area Network (LAN)
45
✦ In between LAN and WAN
Metropolitan Area Network (MAN)
46
Types of Networks: Topologies
Irregular (Internet), Bus (Typical in LAN - Ethernet), Star, Ring, Mesh
47
- One or more (direct or indirect) links between each pair of nodes - Communication always between two nodes Receiver and sender are identified by their addresses included in the message header - Message may follow one of many links between the sender and receiver using switching or routing
Point-to-point (unicast)
48
Messages are transmitted over a shared channel and received by all the nodes Each node checks the address and if it not the intended recipient, ignores
Broadcast (Multi-point)
49
Message is sent to a subset of the nodes. It´s a special case of Broadcast.
Multi-cast
50
Communication Alternatives (Types)
Twisted pair, Coaxial, Fiber optic cable, Satellite, Microwave, Wireless
51
In Data communication, Hosts are connected by ????, each of which can carry one or more ??????
Link: Physical entity Channel: Logical entity
52
The amount of information that can be transmitted over the channel in a given time unit
Capacity - Bandwidth
53
Messages are divided into fixed size packets, each of which is routed from the source to the destination
Packet Switching
54
A dedicated channel is established between the sender and receiver for the duration of the session
Circuit Switching
55
Software that ensures error-free, reliable and efficient communication between hosts. TCP/IP is the best-known one (Used in the Internet)
Communication Protocols
56
In Distributed DBMS, the placement of applications entails
placement of the distributed DBMS software; and placement of the applications that run on the database
57
each application and its data execute at one site, and there is not communication with any other program.
No sharing
58
All the programs are replicated at all sites, but data files are not. Accordingly, user requests are handled at the site where they originate, and the necessary data files are moved around the network.
Data sharing
59
both data and programs may be shared, meaning that a program at a given site can request a service from another program at second site.
Data-plus program sharing
60
Pattern Behavior: The user requests do not change over time (difficult to find in real apps).
Static
61
Pattern Behavior: The user request change over time more commonly in DDBS.
Dynamic
62
- One possibility is that designers do not have any information about how users will access the database. - The second possibility is that designers have complete information, where the access pattern can reasonably be predicted and do not deviate signifiable from these predictions. - The third possibility is that designers have partial information, where there are deviations from the predictions.
Level of knowledge
63
Approach for developing any database ✦ mostly in designing systems from scratch ✦ mostly in homogeneous systems
Top-down
64
Approach for developing any database ✦ when the databases already exist at a number of sites
Bottom-up
65
Starts from the general and moves to the specific. In other words, you start with a general idea of what is needed for the system and then work your way down to the more specific details of how the system will interact. ¡This process involves the identification of different entity types and the definition of each entity’s attributes.
The top-down design method
66
Defines the environment of the system and elicits both the data and processing needs of all potential database users. Also specifies where the final system is expected to stand with respect to the objectives of a distributed DBMS.
Requirement analysis
67
This activity deals with defining the interfaces for end users.
View design.
68
This is the process by which the enterprise is examined to determine entity types and relationships among these entities.
Conceptual design.
69
Rather than distributing relations, it is quite common to divide them into sub-relations, called ___________, which are then distributed.
fragments
70
✦ views are subsets of relations èlocality ✦ extra communication
Relation
71
✦ concurrent execution of a number of transactions that access different portions of a relation ✦ views that cannot be defined on a single fragment will require extra processing ✦ semantic data control (especially integrity enforcement) more difficult
Fragments of relations (sub-relations)
72
Relation instances are essentially tables, so the issue is one of finding alternative ways of dividing a table into smaller ones. There are two alternatives for this:
dividing it horizontally or dividing it vertically.
73
Decomposition of relation R into fragments R1, R2, ..., Rn is complete if and only if each data item in R can also be found in some Ri
Completeness
74
If relation R is decomposed into fragments R1, R2, ..., Rn, then there should exist some relational operator ∇ such that R = ∇1≤i≤nRi
Reconstruction
75
If relation R is decomposed into fragments R1, R2, ..., Rn, and data item di is in Rj, then di should not be in any other fragment Rk (k ≠ j ).
Disjointness
76
When data are allocated, it may either be replicated or maintained as a single copy. The reasons for replication are
Reliability and efficiency of read-only queries.
77
If there are multiple copies of a data item, there is a good chance that some copy of the data will be accessible somewhere even when ____________. Furthermore, read-only queries that access the same data items can be ______________________________.
1. System failure occur 2. executed in parallel since copies exists on multiple sites
78
Allocation Alternatives Each fragment resides at only one site
Non-Replicated -> Partitioned
79
Allocation Alternatives Each fragment at each site
Replicated -> Fully replicated
80
Allocation Alternatives Each fragment at some of the sites
Replicated -> Partially Replicated
81
Allocation Alternatives If read-only queries / update queries << 1,replication is advantageous, otherwise replication may cause problems
Rule of thumb
82
It is performed using predicates that are defined on the original relation.
Primary Horizontal Fragmentation (PHF)
83
It is the partitioning of a relation that results from predicates being defined on another relation.
Derived Horizontal Fragmentation (DHF)
84
The number of tuples of the relation that would be accessed by a user query which is specified according to a given minterm predicate mi.
minterm selectivities: sel(mi)
85
✦ The frequency with which a user application qi accesses data. ✦ Access frequency for a minterm predicate can also be defined.
access frequencies: acc(qi)
86
A set of simple predicates Pr is said to be complete if and only if the accesses to the tuples of the minterm fragments defined on Pr requires that two tuples of the same minterm fragment have the same probability of being accessed by any application.
Completeness of Simple Predicates
87
If a predicate influences how fragmentation is performed, (i.e., causes a fragment f to be further fragmented into, say, fi and fj) then there should be at least one application that accesses fi and fj differently.
Minimality of Simple Predicates
88
Since Pr' is complete and minimal, the selection predicates are complete
PHF: Completeness
89
If relation R is fragmented into FR = {R1,R2,…,Rr} R = U vRi e FR Ri
PHF: Reconstruction
90
Minterm predicates that form the basis of fragmentation should be mutually exclusive
PHF: Disjointness
91
Let R be the member relation of a link whose owner is relation S which is fragmented as FS = {S1, S2, ..., Sn}. Furthermore, let A be the join attribute between R and S. Then, for each tuple t of R, there should be a tuple t' of S such that t[A] = t' [A]
DHF: Correctness
92
Same as primary horizontal fragmentation
DHF: Reconstruction
93
Simple join graphs between the owner and the member fragments.
DHF: Disjointness
94
Has been studied within the centralized context design methodology physical clustering *More difficult than horizontal, because more alternatives exist.
Vertical Fragmentation
95
Aproaches of vertical fragmentation
Grouping: Attributes to fragments -> Overlapping Splitting: Relation to fragments -> Non-Overlapping
96
Ensure that authorized users perform correct operations on the database, contributing to the maintenance of the database integrity.
Objective of Semantic Data Control
97
They are generated from base relation(s) by a query Not stored as base relations
View - Virtual Relation
98
View definition storage should be treated as
Database storage
99
Query modification results in a
distributed query
100
Static copy of the view, avoid view derivation for each query But periodic recomputing of the view may be expensive Actual version of a view - Stored as a database relation, possibly with indices
Materialized view
101
Process of updating (refreshing) the view to reflect changes to base data - Resembles data replication but there are differences -- View expressions typically more complex -- Replication configurations more general
Materialized View Maintenance
102
As part of the updating transaction, e.g. through 2PC View always consistent with base data and fast queries But increased transaction time to update base data
When to Refresh a View - Immediate Mode
103
Through separate refresh transactions - No penalty on the updating transactions
When to Refresh a View - Deferred Mode
104
Triggered at different times with different trade-offs 1. ______ : just before evaluating a query on the view 2. _______: Periodically: every hour, every day, etc. 3. _______: Forcedly: after a number of predefined updates
1. Lazily 2. Periodically 3. Forcedly
105
How to Refresh a View - Efficient if there has been many changes
Full computing from base data
106
How to Refresh a View - Better if a small subset has been changed - Uses differential relations which reflect updated data only
Incremental computing by applying only the changes to the view
107
- Prevents the physical content of data to be understood by unauthorized users - Uses encryption/decryption techniques (Public key)
Data protection
108
Only authorized users perform operations they are allowed to on database objects
Access control
109
Long been provided by DBMS with authorization rules
Discretionary access control (DAC)
110
Increases security with security levels
Multilevel access control (MAC)
111
Subjects (users, groups of users) who execute operations Operations (in queries or application programs) Objects, on which operations are performed
Main actors
112
A malicious user can access unauthorized data through an authorized user
Problem with DAC
113
Rule of access - subject S is allowed to read an object of level L only if level(S) ≥ L - Protect data from unauthorized disclosure, e.g. a subject with secret clearance cannot read top secret data
No read up
114
Rule of access - subject S is allowed to write an object of level L only if level(S) ≤ L - Protect data from unauthorized change, e.g. a subject with top secret clearance can only write top secret data but not secret data (which could then contain top secret data)
No write down
115
Remote user authentication - Typically using a directory service -- Should be replicated at some sites for availability Management of DAC rules - Problem if users’ group can span multiple sites -- Rules stored at some directory based on user groups location -- Accessing rules may incur remote queries Covert channels in MAC
Problems in a distributed environment
116
Indirect means to access unauthorized data
Covert Channels
117
Maintain database consistency by enforcing a set of constraints defined on the database.
Semantic Integrity Control
118
Basic semantic properties inherent to a data model e.g., unique key constraint in relational model
Structural constraints
119
Regulate application behavior, e.g., dependencies in the relational model
Behavioral constraints
120
Control embedded in each application program
Procedural
121
Assertions in predicate calculus - Easy to define constraints - Definition of database consistency clear - But inefficient to check assertions for each update -- Limit the search space -- Decrease the number of data accesses/assertion -- Preventive strategies -- Checking at compile time
Declarative
122
Specify the more common constraints of the relational model
Predefined constraints
123
Types of predefined constraints
Not-null attribute Unique key Foreign key Functional dependency
124
Express preconditions that must be satisfied by all tuples in a relation for a given update type
Precompiled constraints (INSERT, DELETE, MODIFY)
125
Types of precompiled constraints
Domain constraint Ex. CHECK ON PROJ (BUDGET≥500000 AND BUDGET≤1000000) Domain constraint on deletion Ex. CHECK ON PROJ WHEN DELETE (BUDGET = 0) Transition constraint Ex. CHECK ON PROJ (NEW.BUDGET > OLD.BUDGET AND NEW.PNO = OLD.PNO)
126
Constraints that must always be true. Formulae of tuple relational calculus where all variables are quantified.
General constraints
127
Types of general constraints
Functional dependency Ex. CHECK ON e1:EMP, e2:EMP (e1.ENAME = e2.ENAME IF e1.ENO = e2.ENO) Constraint with aggregate function Ex. CHECK ON g:ASG, j:PROJ (SUM(g.DUR WHERE g.PNO = j.PNO) < 100 IF j.PNAME = "CAD/CAM")
128
Methods of integrity enforcements
- Detection - Preventive
129
Preventive Add the assertion qualification to the update query Only applicable to tuple calculus formulae with universally quantified variables
Query Modification
130
Definition of constraints - Consideration for fragments Where to store - Replication - Non-replicated : fragments Enforcement - Minimize costs
Problems of distributed integrity control
131
Types of distributed assertions ------------- -> Single relation, single variable -> Domain constraint
Individual assertions
132
Types of distributed assertions ------------- Single relation, multi-variable * functional dependency Multi-relation, multi-variable * foreign key
Set oriented assertions
133
- Similar to the centralized techniques - Transform the assertions to compiled assertions
Assertion Definition
134
Assertion Storage ------------------------ - One relation, only fragments - At each fragment site, check for compatibility - If compatible, store; otherwise reject - If all the sites reject, globally reject
Individual assertions
135
Assertion Storage ------------------------ - Involves joins (between fragments or relations) - May be necessary to perform joins to check for compatibility - Store if compatible
Set-oriented assertions
136
Assertion Enforcement ----------- Enforce at the site where the update is issued
Individual Assertions -> Update = Insert
137
Assertion Enforcement ----------- Send the assertions to all the sites involved Execute the qualification to obtain R+ and R- Each site enforces its own assertion
Individual Assertions -> Update = Qualified
138
Assertion Enforcement ----------- Similar to individual assertions with qualified updates
Set-oriented Assertions -> Single relation
139
Assertion Enforcement ----------- Move data to perform joins; then send the result to query master site
Set-oriented Assertions -> Multi-relation