Curriculum Flashcards

1
Q

What is a Relational Database?

A

A relational database is a type of database that stores and provides access to data points that are related to one another. Relational databases are based on the relational model, an intuitive, straightforward way of representing data in tables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is an Entity Relationship(ER) diagram?

A

An Entity Relationship (ER) Diagram is a type of flowchart that illustrates how “entities” such as people, objects or concepts relate to each other within a system.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is an ENTITY in an entity relationship diagram and how is an ENTITY shown in an ER diagram?

A

A definable thing—such as a person, object, concept or event—that can have data stored about it. Think of entities as nouns. Examples: a customer, student, car or product. Think of the entity name as the table name.

Will typically be shown as a rectangle.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is an ATTRIBUTE in an entity relationship diagram and how is the ATTRIBUTE displayed in the ER diagram?

A

An ATTRIBUTE is the property or characteristic of an ENTITY. Think of the attributes at the column names of a table.

An ATTRIBUTE is often shown as an oval or circle.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is a RELATIONSHIP in an ER diagram and how is the RELATIONSHIP displayed in the ER diagram?

A

How ENTITIES act upon each other or are associated with each other. Think of RELATIONSHIPS as verbs.
Examples would be “works in”, “goes to”, “teaches”, “studies”, “carrying” “has”, “contains” etc..

RELATIONSHIPS are typically displayed as diamonds.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

How would you portray the KEY ATTRIBUTE to distinguish it from other attributes of an ENTITY in an ER diagram?

A

The KEY ATTRIBUTE should be portrayed with underline.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is a RECURSIVE RELATIONSHIP in an ER diagram?

A

When there is a relationship between two entities of the same type, it is known as a recursive relationship. This means that the relationship is between different instances of the same entity type

Ex: An employee can supervise multiple employees. Hence, this is a recursive relationship of entity employee with itself. This is a 1 to many recursive relationship as one employee supervises many employees.

In an ER model this is shown as a relationship which “connects” to the same entity twice.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

How would you describe a PRIMARY KEY in DBMS?

A

A PRIMARY KEY is a column of a table or a set of columns that helps to identify every record present in that table uniquely.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Can a table contain more than one PRIMARY KEY?

A

No, there can be only one PRIMARY KEY per table.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Can a PRIMARY KEY contain NULL values?

Can a PRIMARY KEY contain duplicates?

A

No and…. no.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is a SUPER KEY?

A

A SUPER KEY is a single column or a set of columns which help identify a row uniquely.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is a CANDIDATE KEY?

A

CANDIDATE KEYS are those attributes that uniquely identify rows of a table. The PRIMARY KEY of a table is selected from one of the CANDIDATE KEYS.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is an ALTERNATE KEY?

A

When a PRIMARY KEY is chosen from the CANDIDATE KEYS the remaining ones that didn’t get chosen are considered ALTERNATE KEYS.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is a FOREIGN KEY?

A

FOREIGN KEYS is used to establish relationships between two tables. A FOREIGN KEY will require each value in a column or set of columns to match the PRIMARY KEY of the referential table.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is a COMPOSITE KEY?

A

COMPOSITE KEY is a set of two or more attributes that help identify each tuple in a table uniquely. The attributes in the set may not be unique when considered separately. However, when taken all together, they will ensure uniqueness.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is a DATABASE MANAGEMENT SYSTEM (DBMS)?

A

A set of programs to access database which provide a way to store and retrieve
database information that is both convenient and efficient

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Why use a DBMS? (5 reasons)

A
  • Data independence and efficient access.
  • Reduced application development time.
  • Data integrity and security.
  • Uniform data administration.
  • Concurrent access, recovery from crashes
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What is a data model?

A

A collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What FOUR categories of data models do we have?

A

▪ Relational Model

▪ Entity-Relationship Model

▪ Object-Based Data Model

▪ Semistructured Data Model

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What is data independence?

A

Capacity to change the schema at one level of a database system without
having to change the schema at the next higher level

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What is logical data independence?

A

Logical Data Independence is defined as the ability to make changes in the structure of the middle level of the Database Management System (DBMS) without affecting the highest-level schema or application programs

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What is physical data independence?

A

Physical Data Independence is defined as the ability to make changes in the structure of the lowest level of the Database Management System (DBMS) without affecting the higher-level schemas.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What is concurrency in DBMS?

A

Concurrent execution of user programs

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Why would concurrency be important in DBMS?

A

Good concurrent execution would mean better performance of the DBMS.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

What is a transaction?

A

An execution of a DB program. It ensures atomicity

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

What is important regarding a DB transaction?

A

Each transaction, executed completely, must leave the DB in a consistent state if DB is consistent when the transaction begins.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

How do DBMS ensure the concurrent executions of transactions are atomic?

Do you know the name of the protocol?

A

Locks ezclap

Before reading/writing an object, a transaction requests a lock on the object, and waits till the DBMS gives it the lock. All locks are released at the end of the transaction.

Protocol is called Strict 2-Phase-Locking protocol.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

How do a DBMS ensure atomicity during a transaction in regard to a system crash?

A

Keep a log while carrying out a set of operations. If there were to be a crash the DBMS can roll back the DB to a previous state. This ensures that all executions on the database is carried out, or none of them (all-or-nothing property).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

What actions are recorded in the log?

A

The transaction log contains enough information to undo all changes made to the data file as part of any individual transaction.

The log records the start of a transaction, all the changes considered to be a part of it, and then the final commit or rollback of the transaction.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

Give examples of what the DBMS is used for.

A
  • Maintain database

- Query large datasets

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

What are 5 benefits of a DBMS?

A
  • System crash recovery
  • Concurrent access
  • Quick application development
  • Data integrity
  • Security
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

A DBMS has a layered architecture, what are the three layers called?

A

First layer = View level.

Second layer = Logical level.

Third layer = Physical level.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

What are the benefit of DBMS abstraction?

A

It gives data independence

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

What is an ER model?

A

A popular high-level conceptual data model and is

frequently used for the conceptual design of database applications.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

What is an entity?

A

An object in the real world with an independent existence,
distinguishable from other objects and is described (in DB) using a set
of attributes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

What is an attribute?

A

The particular properties that describe the entity.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

What is a key attribute?

A

Key Attribute: An attribute that identifies an entity in the entity set.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

What is an Entity set?

A

A collection of similar entities.

E.g., all employees.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

What is a relationship in ER model?

A

A relationship type represents the association between
entity types. A relationship is uniquely identified by the participating
entities.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
40
Q

What is a multivalued attribute in ER model?

A

A multivalued attribute can have more than one value at a time for an attribute. For ex., the skills of a surgeon is a multivalued attribute since a surgeon can have more than one skill. Another common example is the address field, which can have multiple values like zip code, street address, state, etc.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
41
Q

What is a composite attribute?

A

“Composite attribute is an attribute where the values of that attribute can be further subdivided into meaningful sub-parts.” Typical examples for composite attribute are; Name – may be stored as first name, last name, middle initial.

Think of composite attributes as ‘‘attributes of attributes’’.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
42
Q

What is a derived attribute and how is it portrayed?

A

A derived attribute as the name suggests is the one that can be
derived or calculated with the help of other attributes present themselves.

For example – The ‘age’ of the student can be calculated from
‘date of the birth present as an attribute.

Another example - derived attribute ‘street_number’ can be calculated from the ‘address’ attribute.

A derived attribute is portrayed as a dotted oval.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
43
Q

What is a weak entity?

A

In a relational database, a weak entity is an entity that cannot be uniquely identified by its attributes alone; therefore, it must use a foreign key in conjunction with its attributes to create a primary key. The foreign key is typically a primary key of an entity it is related to. : Example: a ROOM can only exist in a BUILDING. On the other hand, a TIRE might be considered as a strong entity because it also can exist without being attached to a CAR

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
44
Q

Explain a many-to-many relationship.

A

Example: An employee can work in many departments; a department can have many employees.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
45
Q

Explain a one-to-many relationship.

A

Example: A manager can manage many departments but a department can only have one manager.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
46
Q

Explain one-to-one relationship

A

A (silly) example: A person can only have one heart, one heart can only belong to one person.

Another (better) example: In a school database, each student has only one student ID, and each student ID is assigned to only one person.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
47
Q

What is a relational model (RM)?

A

Relational Model represents how data is stored in Relational Databases. A relational database stores data in the form of relations (tables).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
48
Q

What is the cardinality and degree of a relational model?

A

Cardinality is number of rows of data present in the model( Not including the head row containing the column names)

Degree is the number of columns in a table.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
49
Q

Do all columns in a relation instance have to be distinct?

A

Yes, duplicate columns would serve no good purpose whatsoever.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
50
Q

What is SQL and what is it short for?

A

SQL is short for Structured Query Language and is a query language for getting specific information/data from a database.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
51
Q

How would you add a column “degree” to a ‘Students’ relation(table) between columns/attributes “age” and “address” using SQL?

A

ALTER TABLE Students
ADD COLUMN degree varchar(255)
BETWEEN age AND address;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
52
Q

What should be done if an Enrolled tuple with a non-existent student id is inserted into a ‘students’ table?

A

It should be rejected

53
Q

What is a Relational Model?

A

A tabular representation of data.

54
Q

Why would we use relational model?

A

Simple and intuitive,

55
Q

What type of data would you insert into 1, 2 and 3?

SELECT 1
FROM 2
WHERE

A

SELECT row1, row2, rowi
FROM table
WHERE qualification

56
Q

When would you use DISTINCT in a SELECT statement in SQL?

Example: SELECT DISTINCT name FROM students;

A

DISTINCT is an optional keyword indicating that the answer should not
contain duplicates.

57
Q

Are duplicates removed/eliminated by default in SQL SELECT statement.

A

No.

58
Q

In what situation would you use the LIKE keyword in a query?

A

For string matching.

59
Q

If you were using the LIKE keyword, when would you use “_” and when would you use “%”?

A

The “_” stands for any ONE character, while “%” is ANY number of characters.

60
Q

When would you use UNION?

A

Can be used to compute the union of any two union-compatible sets of tuples(which are themselves the result of SQL queries).

Example:
SELECT S.sid FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=’red’
UNION
SELECT S.sid FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=’green’

61
Q

What is a nested query?

A

A WHERE clause can itself contain an SQL query!

example:
SELECT S.sname FROM Sailors S WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid=103)

62
Q

What is a trigger in DBMS and what is the parts of a trigger?

A

A procedure that starts automatically if specified changes occur to the DBMS

Parts:
•Event (activates the trigger)
•Condition (tests whether the triggers should run)
•Action (what happens if the trigger runs)

63
Q

What is the job of a query language?

A

Allow manipulation and retrieval of data from a database.

64
Q

What is the name of the two mathematical query languages?

A
  • Relational Algebra

- Relational Calculus

65
Q

What is the characteristics of Relational Algebra?

A

More operational, very useful for representing execution plans.

66
Q

What is the characteristics of Relational Calculus?

A

Lets users describe what they want, rather than how to compute it. (Non operational, declarative.)

67
Q

What is the 5 basic operations of Relational Algebra?

A
  • Selection (σ ) Selects a subset of rows from relation.
  • Projection (π ) Deletes unwanted columns from relation.
  • Cross-product (×) Allows us to combine two relations.
  • Set-difference (−) Tuples in relation. 1, but not in relation. 2.
  • Union (U) Tuples in relation.1 and in relation. 2.
68
Q

How would you write project in relational algebra and what is the symbol for projection?

A

Example: π name, rating (table)

Symbol is pi.

69
Q

How would you use Selection in relational algebra and what is the symbol for selection?

A

Example: σ Rating>8 (table)

Symbol is lower sigma

70
Q

When to use UNION in relational algebra? (table1 ⋃ table2)

A

When you want the result from both tables including where they are equal.

71
Q

When would you want to use Intersection in relational algebra? (table1 ∩ table2)

A

When you want to only show intersecting results. (Only where results are in table1 AND table2, but not where results are in one but not the other).

72
Q

When do you want to use SET-DIFFERENCE? (table1 − table2)

A

When you want to show results which are only in table1 but not results which are in table1 and table2. (Not show intersection results)

73
Q

When to use the CROSS-PRODUCT in relational algebra?

A

Used when you want to pair tables.

Example:

S1× R1

74
Q

When to use the renaming operator in relational algebra and what is the symbol of rename?

A

When you want to rename a relation. The symbol is ρ (rho).

Example of use: ρ (C(1→ sid1,5→ sid2), S1× R1).

75
Q

When would you want to use CONDITION JOIN in relational algebra?

What is CONDITION JOIN also called.

A

CONDITION JOIN is used when you want to combine tables based on a condition.

CONDITION JOIN is also called THETA-JOIN.

76
Q

What is EQUI-JOIN.

A

A special case of condition join where the condition C contains only equalities.

77
Q

What do 𝑎 ∈ R mean in relational calculus?

A

a is in R

78
Q

⋀ meaning in relational calculus?

A

⋀ = AND

79
Q

⋁ meaning in relational calculus?

A

⋁ = OR

80
Q

¬ meaning in relational calculus?

A

¬ = NOT

81
Q

∃ meaning in relational calculus

A

exists

82
Q

∀ meaning in relational calculus?

A

∀ = FOR ALL

83
Q

How would you express ∃ m ∈ R verbally?

A

“there exists some tuple m in relation R ….”

84
Q

How would you express the following statement;

∀ b ∈ boat ∃ t ∈ reserves [ b(bid) = t(bid) ]

A

“for every tuple b in boat there exists a tuple t in reserves Such that b.bid = t.bid”

85
Q

How would you formulate the following SQL query as a relational calculus statement?

SELECT * FROM Sailors where rating >7

A

{S | S ε Sailors ^ S.rating >7}

86
Q

How would you formulate the following SQL query as a relational calculus statement?

SELECT sname, age FROM Sailors where rating >7

A

{P | ∃ S ε Sailors ( S.rating >7 ^ P.name = S.name ^ P.age =S.age)}

87
Q

Why use Schema Refinement and Normal Forms?

A

To rectify issues caused by redundancy.

88
Q

What are 4 problems associated with not having redundancy?

A

• Redundant storage: Information is stored repeatedly

• Insert anomalies: Might be impossible to store certain information
unless other information is stored as well

• Delete anomalies: Might be impossible to delete certain information
without losing some other information

• Update anomalies: When copy of repeated data is update but not all the
copies

89
Q

What is a Functional Dependency (FD)?

A

An integrity constraint

90
Q

Name the STORAGE, UPDATE, INSERTION, DELETION anomalies that could emerge from this table:

ID NAME LOT RATE WAGE HOURSE
123-22-3666 Attishoo 48 8 10 40
231-31-5368 Smiley 22 8 10 30
131-24-3650 Smethurst 35 5 7 30
434-26-3751 Guldu 35 5 7 32
612-67-4134 Madayan 35 8 10 40

Click the edit pen in the top right corner to view the table more clearly (this would reveal answers).

A
  • Storage: RATE = 8 corresponding to WAGE = 10, repeated 3 times
  • Update: WAGE could be updated without RATE
  • Insertion: We cannot insert a tuple unless we know WAGE for RATE
  • Deletion: If we delete all tuples with a given RATE , we lose
    association
91
Q

When do a functional dependency(FD) occur?

A

FD occurs when one attribute in a relation uniquely determined another attribute

92
Q

What is decomposition of a relation?

A

A decomposition of a relation schema R:
• Replace R with two (or more) relation schemas
• The decomposed schemas together include all of the attributes in R

93
Q

Explain functional dependency

A

o A functional dependency X → Y holds over relation R if, for every
allowable instance r of R:
• t1 ε r , t2 ε r, π X (t1) = π X (t2) implies π Y (t1) = π Y (t2)
• i.e., given two tuples in r, if the X values agree, then the Y values must also
agree. (X and Y are sets of attributes.)
• Read as X determines Y

o An FD is a statement about all allowable relations.
• Must be identified by application semantics and at design time
• Given some allowable instance r1of R, we can check if it violates some

FD f, but we cannot tell if f holds over R!

o A key constraint is a special instance of FD

94
Q

What are the benefits of reducing redundancy?

A

Help eliminating anomalies

Help save storage space

95
Q

Explain this regarding functional dependency with your own words:

ssn → did, did → lot, ssn → lot

A

Since ssn determines did and did determines lot, ssn determines lot.

96
Q

Explaing reflexivity regarding FD:

A

X, Y, Z are sets of attributes

Reflexivity: If Y ⊆ X, then X → Y

Meaning:
If Y is a subset or equal to X, then X determines Y.

97
Q

Explaing augmentation in FD.

A

X, Y, Z are sets of attributes

Augmentation: If X → Y, then XZ → YZ for any Z

Meaning:
If X determines Y, then combined XZ determines YZ for any value of Z.

98
Q

Explain transitivity in FD.

A

X, Y, Z are sets of attributes

Transitivity: If X → Y and Y → Z, then X → Z

Meaning:
If X determines the value Y, and Y determines the value of Z, then X determines the value of Z.

99
Q

Explain the term “Closure of an Attribute Set” in FD

A

● The set of all those attributes which can be functionally determined from an attribute set is called as a closure of that
attribute set.
● Closure of attribute set {X} is denoted as {X}+.

100
Q

Explain the Steps to Find Closure of an Attribute Set.

A

Step-01:
Add the attributes contained in the attribute set for which closure is being calculated to the result set.

Step-02:
Recursively add the attributes to the result set which can be functionally determined from the attributes already contained in the result set.

101
Q
Consider a relation R ( A , B , C , D , E , F , G ) with the functional dependencies-
A → BC
BC → DE
D → F
CF → G

Find the closure of attribute A.

A

Closure of attribute A:
A + = { A }
= { A , B , C } ( Using A → BC )
= { A , B , C , D , E } ( Using BC → DE )
= { A , B , C , D , E , F } ( Using D → F )
= { A , B , C , D , E , F , G } ( Using CF → G )

Thus,
A+ = { A , B , C , D , E , F , G }

102
Q
Consider a relation R ( A , B , C , D , E , F , G ) with the functional dependencies-
A → BC
BC → DE
D → F
CF → G

Find the closure of attribute D.

A

D+ = { D }
= { D , F } ( Using D → F )

We can not determine any other attribute using attributes D and F contained in the result set.
Thus,
D+ = { D , F }

103
Q
Consider a relation R ( A , B , C , D , E , F , G ) with the functional dependencies-
A → BC
BC → DE
D → F
CF → G

Find the closure of attribute B, C.

A

{ B , C } + = { B , C }
= { B , C , D , E } ( Using BC → DE )
= { B , C , D , E , F } ( Using D → F )
= { B , C , D , E , F , G } ( Using CF → G )

Thus,
{ B , C } + = { B , C , D , E , F , G }

104
Q

What is the purpose of normalization?

A

The purpose of normalization is to identify a suitable set of relations that support the
data requirements of an enterprise. The characteristics of a suitable set of relations
include the following:
1. The minimal number of attributes necessary to support the data requirements of the
enterprise;
2. Attributes with a close logical relationship (described as functional dependency) are
found in the same relation;
3. Minimal redundancy with each attribute represented only once with the important
exception of attributes that form all or part of foreign keys, which are essential for
the joining of related relations.

105
Q

What are the benefits of normalization?

A

The benefits of using a database that has a suitable set of relations is
that
1. the database will be easier for the user to access and maintain
2. take up minimal storage space on the computer.

106
Q

Why is normal forms helpful?

A
  • Normal forms help to find problems that might arise from the current schema
  • If a relation is in a certain normal form, it is known that certain kinds of
    problems are avoided / minimized. This can be used to help us decide
    whether decomposing the relation will help.
107
Q

Explain the rules of First Normal Form.

A

1NF (First Normal Form) Rules

- Each table cell should contain a single value.
- Each record needs to be unique

Meaning: Not having name and age in same cell, not having address and salary in same cell etc..

Each row should be unique in some way.

108
Q

Explain the rules of Second Normal Form.

A

2NF (Second Normal Form) Rules

- Rule 1- Be in 1NF
- Rule 2- No partial dependencies. 

Meaning: First normal form rules must be applied.

All attributes should be fully dependent on primary key.

109
Q

Explain the rules of Third Normal Form.

A

❖ Should be in 2NF
❖ Contains No Transitive Dependencies

Meaning:
A transitive functional dependency is when changing a non-key column, might cause any of the other non-key columns to change.

An example would be if you have address and country in the same table. If you update/change the address the country could change. This would mean that the address is transitively dependent on country and vice versa.
A solution would be to split the address and country attributes into a separate table with address as the primary key.

110
Q

Explain BCNF (Boyce-Codd Normal Form).

A

❖ Should be in 3NF
❖ Only (super) keys should determine other attributes
❖ A non key should not determine (super) keys

111
Q

Explain Fourth Normal Form

A

For a table to satisfy the Fourth Normal Form, it should satisfy the following two conditions:

- It should be in the Boyce-Codd Normal Form.
- And, the table should not have any Multi-valued Dependency.
112
Q

What are the 2 types of decompositions?

A
  • Lossless-join decomposition (takes care of recovery)

* Dependency-preserving decomposition (takes care of checking integrity constraints)

113
Q

What is the definition of Lossless-Join?

A

Definition: A decomposition of R into two schemas with attributes sets X and Y is said to be a lossless-join decomposition with respect to F if, for every instance r of R that satisfies the dependencies in F,
π X (r) ⋈ π Y (r) = r

i.e. Informally: If we break a relation, R, into bits, when we put the bits
back together, we should get exactly R back again

114
Q

What is the definition of a heap data structure and how is files stored in a heap?

A

Heap (unordered) file: Simplest file structure, suitable when typical access is a file scan retrieving all
records. Records in a heap file are stored in random order across the pages of the file

115
Q

When to use the Sorted Files file structure?

A

Sorted Files: Best if records must be retrieved in some order, or only a `range’ of records is needed.

116
Q

Explain Index file structure and why it is useful?

A

Indexes: Data structures to organize records via trees or hashing to optimize certain kinds of retrieval
operations
• An index allows us to efficiently retrieve all records that satisfy search conditions on the search key fields of
the index
• Additional indexes on a given collection of data records, each with a different search key, to speed up search
operations that are not efficiently supported by the file organization used to store the data records.

117
Q

Three alternatives for Data Entry k* in Index?

A

• Three alternatives:

• A data entry k* is an actual data record (with search key value k).
• A data entry is a (k, rid) pair, where rid is the record id of a data
record with search key value k.
• A data entry is a (k, rid-list) pair, where rid-list is a list of record
id’s of data records with search key value k.

118
Q

Explain: A data entry k* is an actual data record (with search key value k).

A

• The index is used to store actual data records; each entry b is a data
record with search key value k. E.g.: a heap or sorted files.

• At most one index on a given collection of data records can use
Alternative 1. (Otherwise, data records are duplicated, leading to
redundant storage and potential inconsistency.)

• If data records are very large, # of pages containing data entries is
high. Implies size of auxiliary information in the index is also large,
typically.

119
Q

Explain:
1. A data entry is a (k, rid) pair, where rid is the record id of a data
record with search key value k.

and

  1. A data entry is a (k, rid-list) pair, where rid-list is a list of record
    id’s of data records with search key value k.
A

• Contain data entries that point to data records, are independent of
the file organization that is used for the indexed file (i.e., the file
that contains the data records).

• Data entries are typically much smaller than data records. So, better
than Alternative 1 with large data records, especially if search keys
are small. (Portion of index structure used to direct search, which
depends on size of data entries, is much
smaller than with Alternative 1.)

• Alternative 2 more compact than Alternative 1, but leads to variable
sized data entries even if search keys are of fixed length.

120
Q

Explain: Clustered Index

A

If order of data records is the same as, or `close to’, order of data entries, then called clustered index.

A good example of a clustered index would be a phone book where the name(index) points to a number(value) on the same page.

121
Q

Explain: Unclustered index

A

The data is stored in one place, and index is stored in another place. Since, the data and non-clustered index is stored separately, then you can have multiple non-clustered index in a table.

Imagine Unclustered as an index in a book. You look up chapter 4(key) or something and then have to look the chapter(value) up on another page in the book.

122
Q

Explain: Hash-Based indexing.

A
  • Records can be organized using a technique called hashing to quickly find records that have a given search key value.
  • Records in a file are grouped in buckets, where a bucket consists of a primary page and, possibly, additional pages linked in a chain.
  • The bucket to which a record belongs can be determined by applying a special function, called a hash function, to the search key.
  • Given a bucket number, a hash-based index structure allows us to retrieve the primary page for the bucket in one or two disk I/Os.
  • On inserts, the record is inserted into the appropriate bucket, with ‘overflow’ pages allocated as necessary.
• To search for a record with a given search key value, we apply the hash
function to identify the bucket to which such records belong and look at all  pages in that bucket.
123
Q

Explain: B-Tree indexing

A

• Organize records using a tree- like data structure.

• The data entries are arranged in sorted order (clustered) by
search key value, and a hierarchical search data structure is
maintained that directs searches to the correct page of data
entries.

• The lowest level of the tree, called the leaf level, contains
the data entries

• A node on a tree is a page

124
Q

What are the ACID properties in DBMS?

A
Atomicity = The entire transaction takes place at once or does not happen at all.
Consistency = Database must be consistent before and after transaction
Isolation = Multiple transaction occur independently without interference
Durability = The changes of a successful transaction occurs even if the system failure occurs.
125
Q

Can you mention some commands regarding a Transaction?

A

• BEGIN / START TRANSACTION: start a transaction;
• COMMIT: commit the current transaction and make its changes permanent;
• ROLLBACK: uncommit the current transaction block;
• SET autocommit=[0/OFF, 1/ON]: disable or enable the auto-commit mode for the
current transaction

126
Q

Explain Strict Two-phase Locking (Strict 2PL) Protocol

A

❖ Each Xact must obtain a S (shared) lock on object before
reading, and an X (exclusive) lock on object before writing.

❖ All locks held by a transaction are released when the
transaction completes

❖ If an Xact holds an X lock on an object, no other Xact can get a
lock (S or X) on that object.

127
Q

What should happen during an aborted transaction to endure durability?

A

❖ If a transaction Ti is aborted, all its actions have to be undone. Not only
that, if Tj reads an object last written by Ti, Tj must be aborted as well!

❖ Most systems avoid such cascading aborts by releasing a
transaction’s locks only at commit time.
•If Ti writes an object, Tj can read this only after Ti commits.

❖ In order to undo the actions of an aborted transaction, the DBMS maintains
a log in which every write is recorded. This mechanism is also used to
recover from system crashes: all active Xacts at the time of the crash are
aborted when the system comes back up.

128
Q

Explain: Transaction log and what is recorded in it.

A

❖ The following actions are recorded in the log:
❖ Ti writes an object: the old value and the new value.
•Log record must go to disk before the changed page!
❖ Ti commits/aborts: a log record indicating this action.

❖ Log records are chained together by Xact id, so it’s easy to undo a
specific Xact.

❖ Log is often duplexed and archived on stable storage.

❖ All log related activities (and in fact, all CC related activities
such as lock/unlock, dealing with deadlocks etc.) are handled
transparently by the DBMS.

129
Q

Explain the 3 steps in recovering from a crash during querying a database.

A

•Analysis: Scan the log forward (from the most recent checkpoint) to
identify all Xacts that were active, and all dirty pages in the buffer pool at
the time of the crash.

•Redo: Redoes all updates to dirty pages in the buffer pool, as needed, to
ensure that all logged updates are in fact carried out and written to disk.

•Undo: The writes of all Xacts that were active at the crash are undone (by
restoring the before value of the update, which is in the log record for the
update), working backwards in the log. (Some care must be taken to
handle the case of a crash occurring during the recovery process!)