Curriculum Flashcards

Question

What is a transaction?

Answer 1

An execution of a DB program. It ensures atomicity.

Answer 2

Each transaction, when executed completely, must leave the DB in a consistent state if the DB was consistent when the transaction began.

Answer 3

Locks. Before reading or writing an object, a transaction requests a lock on the object and waits until the DBMS grants it. All locks are released at the end of the transaction. This protocol is called the Strict Two-Phase Locking (Strict 2PL) protocol.

Answer 4

Keep a log while carrying out a set of operations. If a crash occurs, the DBMS can roll the DB back to a previous state. This ensures that either all operations of a transaction are carried out on the database or none of them are (the all-or-nothing property).

Answer 5

The transaction log contains enough information to undo all changes made to the data file as part of any individual transaction. The log records the start of a transaction, all the changes considered to be a part of it, and then the final commit or rollback of the transaction.

Answer 6

- Maintain database
- Query large datasets

Answer 7

- System crash recovery
- Concurrent access
- Quick application development
- Data integrity
- Security

Answer 8

First layer = View level. Second layer = Logical level. Third layer = Physical level.

Answer 9

It gives data independence

Answer 10

A popular high-level conceptual data model that is frequently used for the conceptual design of database applications.

Answer 11

An object in the real world with an independent existence that is distinguishable from other objects and is described (in the DB) using a set of attributes.

Answer 12

The particular properties that describe the entity.

Answer 13

Key Attribute: An attribute that uniquely identifies an entity in the entity set.

Answer 14

A collection of similar entities, e.g., all employees.

Answer 15

A relationship type represents the association between entity types. A relationship is uniquely identified by the participating entities.

Answer 16

A multivalued attribute can hold more than one value at a time for the same entity. For example, the skills of a surgeon form a multivalued attribute, since a surgeon can have more than one skill. Another common example is a person's phone numbers. (An address that splits into zip code, street address, state, etc. is a composite attribute, not a multivalued one.)

Answer 17

"Composite attribute is an attribute where the values of that attribute can be further subdivided into meaningful sub-parts." A typical example of a composite attribute is Name, which may be stored as first name, middle initial, and last name. Think of composite attributes as "attributes of attributes".

Answer 18

A derived attribute, as the name suggests, is one that can be derived or calculated from other attributes. For example, the 'age' of a student can be calculated from the 'date of birth' attribute. Another example: the derived attribute 'street_number' can be extracted from the 'address' attribute. In an ER diagram, a derived attribute is drawn as a dashed (dotted) oval.

Answer 19

In a relational database, a weak entity is an entity that cannot be uniquely identified by its attributes alone; it must therefore use a foreign key in conjunction with its own attributes to form a primary key. The foreign key is typically the primary key of the entity it is related to. Example: a ROOM can only exist in a BUILDING. On the other hand, a TIRE might be considered a strong entity, because it can exist without being attached to a CAR.

Answer 20

Example: An employee can work in many departments; a department can have many employees.

Answer 21

Example: A manager can manage many departments but a department can only have one manager.

Answer 22

A (silly) example: A person can only have one heart, one heart can only belong to one person. Another (better) example: In a school database, each student has only one student ID, and each student ID is assigned to only one person.

Answer 23

Relational Model represents how data is stored in Relational Databases. A relational database stores data in the form of relations (tables).

Answer 24

Cardinality is the number of rows of data in the table (not including the header row containing the column names). Degree is the number of columns in the table.

Answer 25

Yes, duplicate columns would serve no good purpose whatsoever.

Answer 26

SQL is short for Structured Query Language and is a query language for getting specific information/data from a database.

Answer 27

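ALTER TABLE Students ADD COLUMN degree varchar(255) AFTER age;
(AFTER is MySQL syntax; there is no BETWEEN ... AND ... form for positioning a new column. Placing degree after age puts it between age and address.)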

Answer 28

It should be rejected

Answer 29

A tabular representation of data.

Answer 30

Simple and intuitive.

Answer 31

SELECT column1, column2, ..., columnN FROM table WHERE qualification

Answer 32

DISTINCT is an optional keyword indicating that the answer should not contain duplicates.

Answer 33

For string matching.

Answer 34

The "_" stands for any ONE character, while "%" is ANY number of characters.
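The two wildcards can be tried out with a small sketch using Python's built-in sqlite3 module. The Sailors table and the names in it are assumptions made up for illustration; note also that SQLite's LIKE is case-insensitive for ASCII letters by default, which may differ from other systems.

```python
import sqlite3

# Hypothetical table and data, used only to illustrate the wildcards.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT)")
conn.executemany(
    "INSERT INTO Sailors (sid, sname) VALUES (?, ?)",
    [(1, "Bob"), (2, "Bab"), (3, "Barb"), (4, "Rob")],
)

def names_like(pattern):
    """Return the sailor names matching a LIKE pattern, sorted by name."""
    rows = conn.execute(
        "SELECT sname FROM Sailors WHERE sname LIKE ? ORDER BY sname",
        (pattern,),
    )
    return [name for (name,) in rows]

one_char = names_like("B_b")  # "_" matches exactly one character
any_run = names_like("B%b")   # "%" matches zero or more characters
```

Here "B_b" matches only three-letter names, while "B%b" also matches "Barb".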

Answer 35

Can be used to compute the union of any two union-compatible sets of tuples (which are themselves the result of SQL queries). Example:
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red'
UNION
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='green'
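The UNION query from this card can be run against a toy database as a rough check. This is a minimal sketch with Python's sqlite3 module; the column types and all of the rows are invented for illustration and are not part of the original card.

```python
import sqlite3

# Hypothetical schema and rows; only the query itself comes from the card.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT);
    CREATE TABLE Boats (bid INTEGER PRIMARY KEY, color TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
    INSERT INTO Sailors VALUES (22, 'Dustin'), (31, 'Lubber'), (58, 'Rusty');
    INSERT INTO Boats VALUES (101, 'blue'), (102, 'red'), (103, 'green');
    INSERT INTO Reserves VALUES (22, 102), (22, 103), (31, 103), (58, 101);
""")

query = """
    SELECT S.sid FROM Sailors S, Boats B, Reserves R
    WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
    UNION
    SELECT S.sid FROM Sailors S, Boats B, Reserves R
    WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'green'
"""
# Sailor 22 reserved both a red and a green boat but appears only once,
# because UNION eliminates duplicates.
sids = sorted(sid for (sid,) in conn.execute(query))
```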

Answer 36

A WHERE clause can itself contain an SQL query! Example:
SELECT S.sname
FROM Sailors S
WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid=103)
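A minimal sketch of the nested query, again using Python's sqlite3 module. The table contents are assumptions for illustration, and the ORDER BY clause is added only to make the result order predictable.

```python
import sqlite3

# Hypothetical rows; the nested query follows the card.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
    INSERT INTO Sailors VALUES (22, 'Dustin'), (31, 'Lubber'), (58, 'Rusty');
    INSERT INTO Reserves VALUES (22, 101), (22, 103), (31, 103), (58, 102);
""")

rows = conn.execute("""
    SELECT S.sname
    FROM Sailors S
    WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid = 103)
    ORDER BY S.sname
""").fetchall()
# The inner query yields the sids that reserved boat 103; the outer query
# keeps only the sailors whose sid is in that set.
names = [name for (name,) in rows]
```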

Answer 37

A procedure that starts automatically if specified changes occur to the database.
Parts:
• Event (activates the trigger)
• Condition (tests whether the trigger should run)
• Action (what happens if the trigger runs)
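The three parts of a trigger can be seen in a small sketch using SQLite's CREATE TRIGGER through Python's sqlite3 module. The tables, the trigger name, and the age threshold are hypothetical, and trigger syntax differs between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, age INTEGER);
    CREATE TABLE YoungSailorLog (sid INTEGER);

    CREATE TRIGGER log_young_sailor
    AFTER INSERT ON Sailors      -- event: a row is inserted
    WHEN NEW.age < 18            -- condition: only for young sailors
    BEGIN
        -- action: record the new sailor's id
        INSERT INTO YoungSailorLog (sid) VALUES (NEW.sid);
    END;
""")

conn.execute("INSERT INTO Sailors VALUES (1, 'Andy', 16)")   # fires the action
conn.execute("INSERT INTO Sailors VALUES (2, 'Rusty', 35)")  # condition is false
logged = [sid for (sid,) in conn.execute("SELECT sid FROM YoungSailorLog")]
```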

Answer 38

Allow manipulation and retrieval of data from a database.

Answer 39

- Relational Algebra
- Relational Calculus

Answer 40

More operational, very useful for representing execution plans.

Answer 41

Lets users describe what they want, rather than how to compute it. (Non operational, declarative.)

Answer 42

* Selection (σ): Selects a subset of rows from a relation.
* Projection (π): Deletes unwanted columns from a relation.
* Cross-product (×): Allows us to combine two relations.
* Set-difference (−): Tuples in relation 1, but not in relation 2.
* Union (∪): Tuples in relation 1 or in relation 2 (or both).
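The five basic operators can be sketched in a few lines of Python. The representation is an assumption chosen for brevity: a relation is a pair of a schema (tuple of attribute names) and a set of value tuples, and the sample instances S1 and S2 are invented.

```python
from itertools import product

def select(rel, predicate):
    """σ: keep the rows for which predicate(row-as-dict) is true."""
    schema, rows = rel
    return schema, {r for r in rows if predicate(dict(zip(schema, r)))}

def project(rel, attrs):
    """π: keep only the listed columns; the set removes duplicates."""
    schema, rows = rel
    idx = [schema.index(a) for a in attrs]
    return tuple(attrs), {tuple(r[i] for i in idx) for r in rows}

def cross(rel1, rel2):
    """×: pair every row of rel1 with every row of rel2."""
    (s1, r1), (s2, r2) = rel1, rel2
    return s1 + s2, {a + b for a, b in product(r1, r2)}

def difference(rel1, rel2):
    """−: rows in rel1 but not in rel2 (schemas assumed compatible)."""
    return rel1[0], rel1[1] - rel2[1]

def union(rel1, rel2):
    """∪: rows in rel1 or in rel2 (schemas assumed compatible)."""
    return rel1[0], rel1[1] | rel2[1]

SCHEMA = ("sid", "sname", "rating")
S1 = (SCHEMA, {(22, "Dustin", 7), (31, "Lubber", 8), (58, "Rusty", 10)})
S2 = (SCHEMA, {(28, "Yuppy", 9), (31, "Lubber", 8), (58, "Rusty", 10)})

high = select(S1, lambda t: t["rating"] > 8)
ratings = project(S1, ["rating"])
```

Note that this sketch of cross-product simply concatenates the schemas, so repeated attribute names are not renamed.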

Answer 43

Example: π name, rating (table) Symbol is pi.

Answer 44

Example: σ Rating>8 (table) Symbol is lowercase sigma.

Answer 45

When you want the results from both tables, including those that appear in both (duplicates are eliminated).

Answer 46

When you want to only show intersecting results. (Only where results are in table1 AND table2, but not where results are in one but not the other).

Answer 47

When you want to show only the results that are in table1 but not in table2. (Results that are in both tables, i.e. the intersection, are not shown.)

Answer 48

Used when you want to pair tables. Example: S1× R1

Answer 49

When you want to rename a relation. The symbol is ρ (rho). Example of use: ρ (C(1→ sid1,5→ sid2), S1× R1).

Answer 50

CONDITION JOIN is used when you want to combine tables based on a condition. CONDITION JOIN is also called THETA-JOIN.

Answer 51

A special case of condition join where the condition C contains only equalities.

Answer 52

∀ = FOR ALL

Answer 53

“there exists some tuple m in relation R ....”

Answer 54

“for every tuple b in boat there exists a tuple t in reserves such that b.bid = t.bid”

Answer 55

{S | S ∈ Sailors ∧ S.rating > 7}

Answer 56

{P | ∃ S ∈ Sailors (S.rating > 7 ∧ P.name = S.name ∧ P.age = S.age)}
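Tuple relational calculus queries read much like set comprehensions, so both calculus cards can be mirrored in Python. This is only an analogy under assumed data: the Sailor fields and the sample tuples are hypothetical.

```python
from collections import namedtuple

Sailor = namedtuple("Sailor", "sid name rating age")
Sailors = {
    Sailor(22, "Dustin", 7, 45.0),
    Sailor(31, "Lubber", 8, 55.5),
    Sailor(58, "Rusty", 10, 35.0),
}

# All sailors S in Sailors with S.rating > 7.
q1 = {S for S in Sailors if S.rating > 7}

# All tuples P such that some sailor S with S.rating > 7 has
# P.name = S.name and P.age = S.age. The existential quantifier becomes
# the loop variable; P has only the two fields the formula mentions.
P = namedtuple("P", "name age")
q2 = {P(S.name, S.age) for S in Sailors if S.rating > 7}
```

The comprehension says what the result contains, not how to scan for it, which is the declarative flavour of the calculus.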

Answer 57

To rectify issues caused by redundancy.

Answer 58

• Redundant storage: Information is stored repeatedly
• Insert anomalies: Might be impossible to store certain information unless other information is stored as well
• Delete anomalies: Might be impossible to delete certain information without losing some other information
• Update anomalies: One copy of repeated data is updated, but not all the copies, which creates an inconsistency

Answer 59

An integrity constraint

Answer 60

- Storage: RATE = 8 corresponding to WAGE = 10, repeated 3 times
- Update: WAGE could be updated without RATE
- Insertion: We cannot insert a tuple unless we know WAGE for RATE
- Deletion: If we delete all tuples with a given RATE, we lose the association

Answer 61

An FD occurs when one attribute (or set of attributes) in a relation uniquely determines another attribute.

Answer 62

A decomposition of a relation schema R:
• Replace R with two (or more) relation schemas
• The decomposed schemas together include all of the attributes in R

Answer 63

o A functional dependency X → Y holds over relation R if, for every allowable instance r of R:
• t1 ∈ r, t2 ∈ r, π_X(t1) = π_X(t2) implies π_Y(t1) = π_Y(t2)
• i.e., given two tuples in r, if the X values agree, then the Y values must also agree. (X and Y are sets of attributes.)
• Read as "X determines Y"
o An FD is a statement about all allowable relations.
• Must be identified from application semantics at design time
• Given some allowable instance r1 of R, we can check if it violates some FD f, but we cannot tell if f holds over R!
o A key constraint is a special case of an FD
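The point that an instance can only refute an FD, never prove it, can be sketched as a small check. The function name and the sample rows (ssn, rating, wage) are assumptions for illustration.

```python
def violates_fd(rows, X, Y):
    """Return True if two rows agree on attributes X but differ on Y."""
    seen = {}  # maps each X value to the Y value first seen with it
    for row in rows:
        x_val = tuple(row[a] for a in X)
        y_val = tuple(row[a] for a in Y)
        if seen.setdefault(x_val, y_val) != y_val:
            return True
    return False

# Hypothetical instance: every rating is paired with a single wage.
rows = [
    {"ssn": 1, "rating": 8, "wage": 10},
    {"ssn": 2, "rating": 8, "wage": 10},
    {"ssn": 3, "rating": 5, "wage": 7},
]
ok = violates_fd(rows, ["rating"], ["wage"])

# Adding a row with the same rating but a different wage violates the FD.
bad = violates_fd(rows + [{"ssn": 4, "rating": 8, "wage": 12}],
                  ["rating"], ["wage"])
```

A False result only means this particular instance does not violate the FD; whether the FD holds over the schema is still a design-time decision.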

Answer 64

- Help eliminate anomalies
- Help save storage space

Answer 65

Since ssn determines did and did determines lot, ssn determines lot.

Answer 66

X, Y, Z are sets of attributes. Reflexivity: If Y ⊆ X, then X → Y. Meaning: If Y is a subset of (or equal to) X, then X determines Y.

Answer 67

X, Y, Z are sets of attributes. Augmentation: If X → Y, then XZ → YZ for any Z. Meaning: If X determines Y, then the combined XZ determines YZ for any set of attributes Z.

Answer 68

X, Y, Z are sets of attributes. Transitivity: If X → Y and Y → Z, then X → Z. Meaning: If X determines the value of Y, and Y determines the value of Z, then X determines the value of Z.

Answer 69

● The set of all attributes that can be functionally determined from an attribute set is called the closure of that attribute set.
● The closure of attribute set {X} is denoted {X}+.

Answer 70

Step-01: Add the attributes contained in the attribute set for which the closure is being calculated to the result set.
Step-02: Recursively add to the result set the attributes that can be functionally determined from the attributes already contained in the result set.
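The two steps can be sketched as a fixed-point loop. The function name and the encoding of attribute sets as strings of single-letter attributes are assumptions; the FD set A → BC, BC → DE, D → F, CF → G is the one used in the worked closure cards of this deck.

```python
def closure(attrs, fds):
    """Compute the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)  # Step 1: start with the attributes themselves
    changed = True
    while changed:       # Step 2: repeat until nothing new can be added
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already determined, add the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Each attribute is one letter, so "BC" stands for the set {B, C}.
FDS = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]

a_plus = closure("A", FDS)
d_plus = closure("D", FDS)
bc_plus = closure("BC", FDS)
```

The loop stops as soon as a full pass over the FDs adds nothing, which must happen because the result set can only grow and the number of attributes is finite.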

Answer 71

Closure of attribute A:
A+ = { A }
= { A, B, C } (using A → BC)
= { A, B, C, D, E } (using BC → DE)
= { A, B, C, D, E, F } (using D → F)
= { A, B, C, D, E, F, G } (using CF → G)
Thus, A+ = { A, B, C, D, E, F, G }

Answer 72

D+ = { D }
= { D, F } (using D → F)
We cannot determine any other attribute using the attributes D and F contained in the result set.
Thus, D+ = { D, F }

Answer 73

{ B, C }+ = { B, C }
= { B, C, D, E } (using BC → DE)
= { B, C, D, E, F } (using D → F)
= { B, C, D, E, F, G } (using CF → G)
Thus, { B, C }+ = { B, C, D, E, F, G }

Answer 74

The purpose of normalization is to identify a suitable set of relations that support the data requirements of an enterprise. The characteristics of a suitable set of relations include the following:
1. The minimal number of attributes necessary to support the data requirements of the enterprise;
2. Attributes with a close logical relationship (described as functional dependency) are found in the same relation;
3. Minimal redundancy, with each attribute represented only once, with the important exception of attributes that form all or part of foreign keys, which are essential for the joining of related relations.

Answer 75

The benefits of using a database that has a suitable set of relations are that:
1. the database will be easier for the user to access and maintain;
2. it will take up minimal storage space on the computer.

Answer 76

- Normal forms help to find problems that might arise from the current schema
- If a relation is in a certain normal form, it is known that certain kinds of problems are avoided / minimized. This can be used to help us decide whether decomposing the relation will help.

Answer 77

1NF (First Normal Form) Rules
- Each table cell should contain a single value.
- Each record needs to be unique.
Meaning: Do not store name and age in the same cell, or address and salary in the same cell, etc. Each row should be unique in some way.

Answer 78

2NF (Second Normal Form) Rules
- Rule 1: Be in 1NF
- Rule 2: No partial dependencies
Meaning: The first normal form rules must already be satisfied. Every non-key attribute should depend on the whole primary key, not on just a part of it.

Answer 79

❖ Should be in 2NF
❖ Contains no transitive dependencies
Meaning: A transitive functional dependency exists when a non-key column determines another non-key column, so that changing the first might force the second to change. An example would be having address and country in the same table: if you update the address, the country could change. Here country depends on address, which in turn depends on the primary key, so country is transitively dependent on the key. A solution would be to move the address and country attributes into a separate table with address as the primary key.

Answer 80

❖ Should be in 3NF
❖ Only (super) keys should determine other attributes
❖ A non-key attribute should not determine (super) keys

Answer 81

For a table to satisfy the Fourth Normal Form, it should satisfy the following two conditions:
- It should be in the Boyce-Codd Normal Form.
- The table should not have any multi-valued dependency.

Answer 82

* Lossless-join decomposition (takes care of recovery)
* Dependency-preserving decomposition (takes care of checking integrity constraints)

Answer 83

Definition: A decomposition of R into two schemas with attribute sets X and Y is said to be a lossless-join decomposition with respect to F if, for every instance r of R that satisfies the dependencies in F, π_X(r) ⋈ π_Y(r) = r.
Informally: If we break a relation R into bits, then when we put the bits back together we should get exactly R back again.
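The "break into bits and put back together" idea can be tried on a single instance with a short sketch. The helper names and the sample relation R(A, B, C) are assumptions; the real definition quantifies over every instance satisfying F, so a check on one instance can expose a lossy decomposition but cannot prove a lossless one.

```python
def project(rows, attrs):
    """Projection onto attrs; each tuple becomes a tuple of (attr, value) pairs."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}

def natural_join(left, right):
    """Join two projected relations on their shared attributes."""
    out = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                out.add(frozenset({**ld, **rd}.items()))
    return out

def is_lossless_on(rows, X, Y):
    """True if projecting rows onto X and Y and rejoining gives rows back."""
    original = {frozenset(row.items()) for row in rows}
    return natural_join(project(rows, X), project(rows, Y)) == original

# Hypothetical instance of R(A, B, C).
r = [{"A": 1, "B": 2, "C": 3}, {"A": 4, "B": 2, "C": 5}]

lossy = is_lossless_on(r, ["A", "B"], ["B", "C"])     # shared B = 2 mixes rows
lossless = is_lossless_on(r, ["A", "B"], ["A", "C"])  # A identifies each row
```

Splitting on B produces spurious tuples such as (1, 2, 5) after the join, which is exactly the loss of information the definition rules out.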

Answer 84

Heap (unordered) file: Simplest file structure, suitable when typical access is a file scan retrieving all records. Records in a heap file are stored in random order across the pages of the file.

Answer 85

Sorted Files: Best if records must be retrieved in some order, or only a 'range' of records is needed.

Answer 86

Indexes: Data structures that organize records via trees or hashing to optimize certain kinds of retrieval operations.
• An index allows us to efficiently retrieve all records that satisfy search conditions on the search key fields of the index.
• We can create additional indexes on a given collection of data records, each with a different search key, to speed up search operations that are not efficiently supported by the file organization used to store the data records.

Answer 87

Three alternatives:
• Alternative 1: A data entry k* is an actual data record (with search key value k).
• Alternative 2: A data entry is a (k, rid) pair, where rid is the record id of a data record with search key value k.
• Alternative 3: A data entry is a (k, rid-list) pair, where rid-list is a list of record ids of data records with search key value k.

Answer 88

• The index is used to store the actual data records; each data entry k* is a data record with search key value k. The index structure is then itself a file organization for the data records, used instead of a heap file or sorted file.
• At most one index on a given collection of data records can use Alternative 1. (Otherwise, data records are duplicated, leading to redundant storage and potential inconsistency.)
• If data records are very large, the number of pages containing data entries is high. This typically implies that the auxiliary information in the index is also large.

Answer 89

• Data entries point to data records and are independent of the file organization that is used for the indexed file (i.e., the file that contains the data records).
• Data entries are typically much smaller than data records. So, better than Alternative 1 with large data records, especially if search keys are small. (The portion of the index structure used to direct the search, which depends on the size of the data entries, is much smaller than with Alternative 1.)
• Alternative 3 is more compact than Alternative 2, but leads to variable-sized data entries even if search keys are of fixed length.

Answer 90

If the order of the data records is the same as, or 'close to', the order of the data entries, then the index is called a clustered index. A good example of a clustered index would be a phone book, where the name (index) points to a number (value) on the same page.

Answer 91

The data is stored in one place, and the index is stored in another place. Since the data and the non-clustered index are stored separately, you can have multiple non-clustered indexes on a table. Imagine an unclustered index as the index in a book: you look up chapter 4 (key) and then have to look the chapter (value) up on another page in the book.

Answer 92

* Records can be organized using a technique called hashing to quickly find records that have a given search key value.
* Records in a file are grouped in buckets, where a bucket consists of a primary page and, possibly, additional pages linked in a chain.
* The bucket to which a record belongs can be determined by applying a special function, called a hash function, to the search key.
* Given a bucket number, a hash-based index structure allows us to retrieve the primary page for the bucket in one or two disk I/Os.
* On inserts, the record is inserted into the appropriate bucket, with 'overflow' pages allocated as necessary.
* To search for a record with a given search key value, we apply the hash function to identify the bucket to which such records belong and look at all pages in that bucket.
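The bucket, primary page, and overflow chain described here can be modelled in memory with a short sketch. The class name, the page capacity, and the use of Python's built-in hash with a modulus are all assumptions for illustration; a real DBMS works with disk pages, not Python lists.

```python
class HashFile:
    """Toy static hash file: each bucket is a chain of fixed-size pages."""

    def __init__(self, num_buckets=4, page_capacity=2):
        self.page_capacity = page_capacity
        # Every bucket starts with one empty primary page.
        self.buckets = [[[]] for _ in range(num_buckets)]

    def _bucket(self, key):
        # The hash function maps a search key to a bucket number.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, record):
        pages = self._bucket(key)
        if len(pages[-1]) >= self.page_capacity:
            pages.append([])  # last page is full: allocate an overflow page
        pages[-1].append((key, record))

    def search(self, key):
        # Look at all pages in the one bucket the key hashes to.
        return [rec for page in self._bucket(key) for k, rec in page if k == key]

f = HashFile()
for k in (0, 4, 8, 5):
    f.insert(k, f"rec{k}")
```

With four buckets, keys 0, 4, and 8 all land in bucket 0, so the third insert spills into an overflow page, while key 5 goes to bucket 1.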

Answer 93

• Organize records using a tree-like data structure.
• The data entries are arranged in sorted order (clustered) by search key value, and a hierarchical search data structure is maintained that directs searches to the correct page of data entries.
• The lowest level of the tree, called the leaf level, contains the data entries.
• A node of the tree is a page.

Answer 94

Atomicity = The entire transaction takes place at once or does not happen at all.
Consistency = The database must be consistent before and after the transaction.
Isolation = Multiple transactions occur independently, without interference.
Durability = The changes of a successful transaction persist even if a system failure occurs.

Answer 95

• BEGIN / START TRANSACTION: start a transaction;
• COMMIT: commit the current transaction and make its changes permanent;
• ROLLBACK: roll back the current transaction, cancelling its changes;
• SET autocommit = [0/OFF, 1/ON]: disable or enable auto-commit mode for the current session.
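BEGIN, COMMIT, and ROLLBACK can be tried out with Python's sqlite3 module. This is a sketch under assumptions: the Accounts table and its balances are invented, and SQLite has no SET autocommit statement, so the connection is opened with isolation_level=None to make Python leave transaction control to the explicit statements.

```python
import sqlite3

# isolation_level=None: Python does not open transactions implicitly,
# so the explicit BEGIN / COMMIT / ROLLBACK below are in full control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO Accounts VALUES ('alice', 100), ('bob', 50)")

def balances():
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM Accounts"))

def transfer(amount):
    """Move amount from alice to bob (two updates that belong together)."""
    conn.execute("UPDATE Accounts SET balance = balance - ? WHERE name = 'alice'", (amount,))
    conn.execute("UPDATE Accounts SET balance = balance + ? WHERE name = 'bob'", (amount,))

conn.execute("BEGIN")
transfer(30)
conn.execute("ROLLBACK")   # both updates are cancelled together
after_rollback = balances()

conn.execute("BEGIN")
transfer(30)
conn.execute("COMMIT")     # both updates become permanent together
after_commit = balances()
```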

Answer 96

❖ Each Xact must obtain an S (shared) lock on an object before reading, and an X (exclusive) lock on an object before writing.
❖ All locks held by a transaction are released when the transaction completes.
❖ If an Xact holds an X lock on an object, no other Xact can get a lock (S or X) on that object.
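The lock compatibility rules can be sketched as a tiny lock table. This is an illustrative model, not a real lock manager: the class and method names are hypothetical, a denied request returns False instead of blocking the transaction, and letting a sole holder upgrade from S to X is an extra assumption not stated on the card.

```python
class LockTable:
    """Toy lock table for Strict 2PL: S locks are shared, X locks exclusive."""

    def __init__(self):
        self.holders = {}  # object -> (mode, set of transaction ids)

    def acquire(self, xact, obj, mode):
        """Try to lock obj in mode 'S' or 'X'; return False if it must wait."""
        held = self.holders.get(obj)
        if held is None:
            self.holders[obj] = (mode, {xact})
            return True
        held_mode, owners = held
        if owners == {xact}:
            # Sole holder: keep the lock, upgrading S to X if requested.
            if mode == "X":
                self.holders[obj] = ("X", owners)
            return True
        if mode == "S" and held_mode == "S":
            owners.add(xact)  # several readers may share an S lock
            return True
        return False          # any conflict involving an X lock

    def release_all(self, xact):
        """Strict 2PL: release every lock only when the transaction ends."""
        for obj in list(self.holders):
            _, owners = self.holders[obj]
            owners.discard(xact)
            if not owners:
                del self.holders[obj]

locks = LockTable()
r1 = locks.acquire("T1", "A", "S")  # T1 reads A
r2 = locks.acquire("T2", "A", "S")  # T2 may share the S lock
w2 = locks.acquire("T2", "A", "X")  # denied while T1 still holds S
locks.release_all("T1")             # T1 completes
w2_retry = locks.acquire("T2", "A", "X")
r1_again = locks.acquire("T1", "A", "S")  # denied: T2 holds X
```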

Answer 97

❖ If a transaction Ti is aborted, all its actions have to be undone. Not only that, if Tj reads an object last written by Ti, Tj must be aborted as well!
❖ Most systems avoid such cascading aborts by releasing a transaction's locks only at commit time.
• If Ti writes an object, Tj can read this only after Ti commits.
❖ In order to undo the actions of an aborted transaction, the DBMS maintains a log in which every write is recorded. This mechanism is also used to recover from system crashes: all active Xacts at the time of the crash are aborted when the system comes back up.

Answer 98

❖ The following actions are recorded in the log:
• Ti writes an object: the old value and the new value. The log record must go to disk before the changed page!
• Ti commits/aborts: a log record indicating this action.
❖ Log records are chained together by Xact id, so it's easy to undo a specific Xact.
❖ The log is often duplexed and archived on stable storage.
❖ All log-related activities (and in fact, all CC-related activities such as lock/unlock, dealing with deadlocks, etc.) are handled transparently by the DBMS.
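How old and new values in the log make it possible to undo a specific transaction can be sketched with an in-memory store. The class name and record layout are hypothetical, the "disk" is just a dict, and None is used as a stand-in for "the object did not exist before".

```python
class LoggedStore:
    """Toy key-value store that logs every write before applying it."""

    def __init__(self):
        self.data = {}
        self.log = []  # records: ("write", xact, key, old, new) or (action, xact)

    def write(self, xact, key, new):
        # Append the log record first, then change the data (write-ahead order).
        self.log.append(("write", xact, key, self.data.get(key), new))
        self.data[key] = new

    def commit(self, xact):
        self.log.append(("commit", xact))

    def abort(self, xact):
        # Undo this transaction's writes by walking the log backwards
        # and restoring the old value from each of its write records.
        for rec in reversed(self.log):
            if rec[0] == "write" and rec[1] == xact:
                _, _, key, old, _ = rec
                if old is None:
                    self.data.pop(key, None)  # key did not exist before
                else:
                    self.data[key] = old
        self.log.append(("abort", xact))

store = LoggedStore()
store.write("T1", "x", 1)
store.commit("T1")
store.write("T2", "x", 2)
store.write("T2", "y", 5)
store.abort("T2")  # restores x to 1 and removes y
```

Filtering the log by transaction id plays the role of the per-Xact chaining of log records mentioned on the card.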

Answer 99

• Analysis: Scan the log forward (from the most recent checkpoint) to identify all Xacts that were active, and all dirty pages in the buffer pool, at the time of the crash.
• Redo: Redoes all updates to dirty pages in the buffer pool, as needed, to ensure that all logged updates are in fact carried out and written to disk.
• Undo: The writes of all Xacts that were active at the crash are undone (by restoring the before value of the update, which is in the log record for the update), working backwards in the log. (Some care must be taken to handle the case of a crash occurring during the recovery process!)


(129 cards)