Exam 2 Flashcards

1
Q

Functional Dependency

A

A functional dependency is a relationship between attributes. A functional dependency is
desirable when a key determines the rest of the row.

However, if a “key” determines only some of the attributes, then it’s the key of some other
object and those attributes belong in their own table. If they stay in the same table then it
leads to redundancy and modification anomalies.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

First Normal Form

A

First Normal Form means that attributes in a table must be:

Atomic, meaning that a single attribute cannot contain a list, array, or other structure that
must be searched or processed to determine its contents.

Non-repeating, meaning that an attribute cannot be duplicated multiple times in a row to
produce a list within a single row.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Deletion Anomaly

A

A deletion anomaly is the side effect of placing data about more than one object in the same
table. When a row is deleted, data about both objects is eliminated.

Example: Deleting a part from a parts list also deletes the vendor from the database.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Second Normal Form

A

Second Normal Form means all the attributes in a row are functionally dependent on the
whole key, not just part of the key. This only affects tables with composite keys.

Put another way, there are no partial-key dependencies.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Why do normalization problems occur?

A

Normalization problems occur because a table represents two objects.

Dividing the table
into two, one for each object (“decomposition”), eliminates redundancy, solves modification
anomalies, and brings the table to a higher normal form.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What are second and third normal forms concerned with?

A

Second and Third Normal Forms are both concerned with an attribute that’s a non-key
determinant. In 2NF, the non-key determinant is a component of the table’s composite key,
and in 3NF the non-key determinant is outside of the table’s key.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Describe 1NF, 2NF and 3NF

A

1NF eliminates non-atomic and repeating attributes in a table. 2NF and 3NF describe a
determinant that doesn’t belong to the key.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Describe BCNF

A

BCNF is the general statement that covers all non-key determinants. That means that all
attributes describe the key.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What are the three characteristics of unnormalized tables?

A

An unnormalized table is undesirable because it has these negative characteristics:
- Unnormalized tables have data redundancies.
- Data redundancies cause modification anomalies.
- Modification anomalies make the data unreliable and hard to maintain.
the other answers (more tables, more joins, and slower performance) are drawbacks of
normalized tables. The benefits of normalization usually outweigh the drawbacks.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is Data Redundancy in a table?

A

Data Redundancy in a table is the same fact stated more than once.

  • It’s caused by two objects sharing the same table.
  • It’s undesirable because it leads to modification anomalies.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is the most common cause of unnormalized tables?

A

The cause of a table’s normalization problems is almost always that the table models more
than one real-world object. This causes the table to have data redundancy that results in
modification anomalies.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is Fourth Normal Form?

A

Fourth Normal Form says that a table may not contain two or more independent multivalued dependencies. A key point is independent. Two multi-valued dependencies that are
interrelated are not independent.

For example, a cable TV company may offer a set of channels (one multi-valued
dependency) and it may cover a set of regions (another multi-valued dependency). If it
offers the same set of channels in each of its regions, the two dependencies are
independent of each other and should not appear together in the same table.

However, if the cable TV company offers a different set of channels within each of its
different regions, then we have to know which region we’re considering before we know
which set of channels is offered. These two dependencies are not independent, and must
appear together in the same table to establish their relationship.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What problems do 2NF, 3NF and BCNF address?

A

2NF, 3NF, and BCNF deal with problems involving functional (non-key) dependencies in
which a determinant points to a specific value.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What problems do 4NF and 5NF address?

A

4NF and 5NF are about multi-valued

dependencies in which a determinant points to a set of values.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

How common are 5NF normalization issues?

A

Fifth Normal Form problems are rare, depend on business rules, and are seldom
encountered by database practitioners.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What are 2NF, 3NF and BCNF concerned with?

A

2NF, 3NF, and BCNF are all concerned with non-key functional dependencies. 2NF and
3NF describe specific configurations of non-key dependencies, while BCNF is a catch-all
that addresses any other kind of non-key dependency. Some textbooks don’t distinguish
among these three normal forms and lump them all under BCNF.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What problems are caused by Redundancy?

A

Redundancy:
- Causes modification anomalies.
- Is a maintenance headache to keep multiple copies of data synchronized.
- Requires extra storage.
Redundancy is a result of un-normalized tables, and the elimination of redundancy is the
goal of normalization.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What conditions must a table meet to be considered normalized?

A

If all of these conditions are met, then a table is very well (though perhaps not perfectly)
normalized:
- 1NF – All attributes are atomic
- 2NF, 3NF, BCNF – The only determinants in any functional dependency are keys
- 4NF – There are not two or more independent multi-valued dependencies.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What are some characteristics of Redundancy?

A

Repetition of a fact (including repetition of a pairing of values) is called Redundancy.

  • It causes modification anomalies
  • It’s caused by a non-key functional dependency.
  • Which in turn occurs when a single table models more than a single object.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Describe DKNF.

A

DKNF states that every constraint on a table is a consequence of its Domains and Keys. If
a table has a constraint that “date-of-birth” and “age” agree with each other, this constraint
cannot be enforced by either a key of the table, or the domains of the attributes.
If a table has a pair of columns that provide the same information but in different terms
(such as “date-of-birth” and “age”), this is a redundancy and the table isn’t in Domain/Key
Normal Form.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Describe decomposition measures.

A

Correcting a normalization problem almost always involves decomposing a table into two
new tables.

  • Repeating Attributes (1NF) – use the multivalued attribute patern
  • Non-Key (Functional) Dependencies (2NF, 3NF, BCNF) – separate table for each
    dependency
  • Two Independent Multi-valued Dependencies (4NF) – separate table for each
    dependency

Sometimes the problem is caused by an attribute being in the wrong table, and can be
corrected by simply moving the attribute into its correct table.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What is a Multi-Valued Dependency?

A

A Multi-valued dependency means a determinant (“key”) points to a set of values rather
than to just a single value.

A multi-valued dependency occurs in the intersection table forming a many-to-many
relationship.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What does the acronym ACID stand for?

A

ACID is an acronym for Atomic, Consistent, Isolated, Durable; these are the properties that
well behaved transactions must possess to be useful.

  • Atomic means that a transaction will execute in its entirety or not at all.
  • Consistent means that if the database was in a consistent state when a transaction
    started, it will still be in a consistent state when it ends.
  • Isolated means that transactions will not interfere with each other.
  • Durable means that a transaction will not produce results that will be inadvertently lost.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What is a Recoverable Transaction Schedule?

A

A Recoverable Transaction Schedule is one in which no transaction commits with data
read from another uncommitted transaction.

Why is committing with data read from an uncommitted transaction unrecoverable? If the
other transaction rolls back, the committed transaction has committed invalid data and it’s
too late to roll back. There’s no way to recover!

25
Q

What is a Cascading Rollback?

A

Cascading Rollback: A transaction reads data from an uncommitted second transaction.
The uncommitted transaction rolls back, so the first transaction must roll back too because
the data it read is now invalid.

26
Q

What is a Serial Transaction Schedule?

A

A Serial Transaction Schedule has no interleaving of the queries of different transactions
(no concurrency). This is a single-transaction strategy.

27
Q

What is a Serializable Transaction Schedule?

A

A Serializable Transaction Schedule interleaves its operations in such a way that
transactions don’t interfere with each other.

  • Conflict Equivalent to a serial transaction schedule.
  • Conflicting operations are in the same order as in a Serial schedule.
  • Transactions are correctly isolated.

In a Serial transaction schedule, on the other hand, the transactions aren’t interleaved. A
transaction starts after the previous one completes.

28
Q

What are read locks also called?

A

Shared Locks

29
Q

What are write locks also called?

A

Exclusive Locks

30
Q

What is an implicit lock?

A

An Implicit Lock is one that all SQL statements automatically set when they execute
outside a transaction. They prevent conflict with other SQL statements executing at the
same time.

31
Q

What is Two-Phase Locking?

A

Two-Phase Locking is a rule that requires each transaction to request and release locks in
two phases:
- A growing phase in which locks are established or upgraded,
- A shrinking phase in which locks are downgraded or released.
The benefit of Two-Phase Locking is that when all transactions use it, it always produces
Serializable schedules.

32
Q

Why use Two-Phase Locking?

A

The reason to use Two-Phase Locking is that it’s easy to automate, and when all transactions use it, then the transaction schedules are always Serializable.

A Serializable Schedule correctly isolates its transactions because it’s Conflict-Equivalent
to a Serial Schedule. That is, conflicting operations occur in the same order as they would
in a Serial Schedule.

33
Q

How does SQL automatically handle locking?

A

The Transaction Isolation Level is SQL’s way to automatically handle locking in
transactions.

34
Q

How do you enable automatic locking?

A
SET TRANSACTION ISOLATION LEVEL activates automatic Two-Phase Locking.
Depending on your application, you could choose the Serialiazible Isolation Level to ensure
complete isolation (serializablility) while avoiding unrecoverable schedules:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
This sets an environment variable that affects all transactions in the current session.

35
Q

What provides an up-to-the-second database backup?

A

The Transaction Log Files provide an up-to-the-second database backup. When a
change is made to a row, the query that caused the change is written into the transaction
log.

If the database is lost, it’s restored from the last full backup, then the operations in the
transaction logs are applied to the database. This restores the data to it’s state right before
the failure.

36
Q

What is Recovery Via Rollback?

A

Recovery via rollback means we use the transaction logs to undo the last transactions
from the database, undoing each transaction in the reverse order it was originally
performed.

If we discover that a transaction has corrupted the data, we can roll it back to a point before
the corruption occured.

37
Q

Do we need to make backups?

A

Even though the transaction logs save the transactions that have been applied to the
database (which acts as a real-time backup), we still need to make a periodic full backup
of the database.
During recovery, the full backup gives us a starting point to which we apply the latest
transactions from the log.

38
Q

What is the strength of a Distributed Database?

A

The Distributed Database’s strength is that the database takes care of geographically
distributing the tables on its own, without help from the application. The application never
knows that the database is geographically scattered.

39
Q

Do applications need to be specially written to use in a Distributed Database?

A

Applications don’t need to be specially written to work with a distributed database.

40
Q

What is the Two-Phase Commit Protocol?

A

Two-Phase Commit Protocol allows distributed transactions to either all commit together,
or all roll back together. The two phases are called the Prepare Phase, and the Commit
Phase.

41
Q

Describe the characteristics of a View in SQL.

A

A View in SQL is a virtual table.

  • Can be used in place of a real table in SQL statements.
  • Its contents are the output of a SELECT statement.
  • It’s maintained by the database and needs no special attention once it’s declared.
42
Q

What is a SQL Injection Attack?

A

A SQL Injection Attack is performed by an attacker who types SQL commands into user
input fields. These user input fields are intended to send data to the database, but instead
send the attacker’s queries to the database instead.

The goal is to gain unauthorized access to the data, or to vandalize it.

43
Q

What is a good way to protect the DB from a SQL Injection Attack?

A

Using Prepared Statements protects against SQL Injection because the query in a
Prepared Statement is pre-parsed: the query engine analyzes a statement’s syntax only
once, at startup time, rather than every time a statement is sent to the database.

Therefore, the query engine doesn’t look at the syntax of the code that the attacker entered.
The attacker’s code is treated as data instead of as an executable SQL statement.

44
Q

What is a Stored Procedure?

A

A stored procedure is invoked by a call from an application’s code, or by the DB
Administrator from the command line.

45
Q

What is a Trigger?

A

A trigger is fired in response to an INSERT, DELETE, or UPDATE statement.

46
Q

What is a View in SQL?

A

A SQL View is a pre-defined query that stands between the tables and the user.

A View appears to the user as if it were an ordinary table. This extra layer of processing doesn’t
improve performance. Purposes of a view are:

  • Display computed columns
  • Hide complicated syntax
  • Hide rows and columns
  • Hide schema changes
47
Q

What is a Page in the DBMS?

A

The DBMS sees the disk divided into chunks called Pages. A Page is the smallest unit of
disk I/O and the fundamental unit of disk storage for a DBMS.

48
Q

What is the most commonly used data structure in SQL?

A

The B+ Tree is the most commonly used data structure for DBMS systems. It provides
storage for very large sets of data. Indexes using a B+ tree can be searched very rapidly
using few disk I/O operations.

The Hash Table is also used for the same purpose, but is less common because the B+
tree is more efficient for range searches.

49
Q

In an ISAM file, what is used for the index and how are the data items stored?

A

In an ISAM file, the index is a B+ tree and the data is stored in a linked list. The B+ tree is
fast to search, and putting the data into a linked list keeps the data sorted to match the
order of the leaf nodes in the tree. This makes range searches fast.

50
Q

What are the advantages of a B+ tree over a Binary Tree?

A

Fewer disk reads are required to find a value in a B+ tree than in a Binary Tree. Since many
values are stored in each node of the B+ Tree, it’s wider but less deep than a Binary Tree.
Since a disk read is required for each level while traversing a tree (because a decision
must be made at each level), fewer levels mean fewer disk reads.

51
Q

What is a Query Tree?

A

A Query Tree is the sequence of relational algebra operations in a query execution plan.
The tree structure imparts a specific sequence on the relational operations embedded
within the tree.

52
Q

What layer of the DBMS works with or knows anything about the file system?

A

The Disk Space Manager is the only layer of the DBMS that sees the file system. Higher
layers of the DBMS, even the Buffer Manager, know nothing of the hardware or operating
system lurking below.

53
Q

How many relational algebra operators are there and is the language complicated?

A

There are only five fundamental relational operators, and a few more derived ones.
Relational algebra is a simple language.

54
Q

What are scripted relational query languages like SQL based on?

A

Tuple Relational Calculus (TRC)

55
Q

What are visual query systems based on?

A

Domain Relational Calculus (DRC)

56
Q

What is the Buffer Manager and what are its functions?

A

The Buffer Manager maintains the Buffer Pool, which is the area of memory in which the
working database is kept.
It has these functions:

  • Manages the part of the database residing in working memory.
  • Replaces the OS’s virtual memory page swapper with it’s own buffer replacement policy.
  • Decides what pages need to be written to disk.
57
Q

Why doesn’t the Query Manager just execute the first plan it comes up with?

A

It’s quicker for the Query Engine to generate many candidate execution plans and estimate
the execution cost of each one than it is to just execute the first plan it comes up with.

58
Q

What is the Buffer Manager?

A

The Buffer Manager is the DBMS layer that writes the logs used for crash recovery. It’s in
a position to control the sequence that the logs and the data are committed to the disk.
(Write the Logs first, and the Data second).
This sequence is important for committed transactions to be durable no matter at what
point in the process a failure occurs.