Exam 4 Flashcards

1
Q

Mapping of EER diagrams to relations will, in ________ cases, result in a database that is normalized?

a) All
b) Some
c) No

A

All

2
Q

The 4 elements of normalising a database

A

1) No redundancy of facts
2) No cluttering of facts
3) Must preserve information
4) Must preserve functional dependencies (e.g. if I know email, I know name and birthdate)

3
Q

How might you end up with a table that is not a relation (i.e. one in non-first normal form)?

A

Example: if you store tags against a record, the tag attribute is multivalued, so you would technically have many tags against one row. Instead, you replicate the rest of the record's info once for each tag, as in the sketch below.
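
A minimal sketch of the two representations, using a hypothetical article-with-tags example (the field names are illustrative, not from the course):

# Non-1NF: one row carries a multivalued "tags" attribute, so it is not a relation.
article_non_1nf = {"id": 7, "title": "Normalisation", "tags": ["sql", "theory", "exam"]}

# Staying in 1NF with a single table: repeat the other columns once per tag.
# The repetition is exactly the redundancy that later causes anomalies.
article_1nf_rows = [
    {"id": 7, "title": "Normalisation", "tag": "sql"},
    {"id": 7, "title": "Normalisation", "tag": "theory"},
    {"id": 7, "title": "Normalisation", "tag": "exam"},
]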

4
Q

What’s the problem with redundancy in a relation?

A

The same fact has to be updated in multiple places, which can lead to inconsistency.

5
Q

Insertions, based on functional requirements, cause what problems?

A

Can lead to NULL values.

For example, if you say anyone born in 1970 has a 40k salary, and there is no person born in 1970, then you have to insert NULL values just to record that statement.

6
Q

What is the problem with deletion anomalies?

A

If we delete a user who is the only example of our functional requirement (e.g. the only user born in 1970), then we lose that requirement too (i.e. we no longer have any record that someone born in 1970 earns 40k).

7
Q

What’s the problem with update anomalies?

A

If the same fact is stored in multiple rows and you update it in one row, you have to do extra work to update it everywhere else, or the rows become inconsistent.
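
A minimal sketch of that situation, reusing the birthyear/salary example from the earlier cards (the data itself is made up):

# One table stores person facts together with the "born in 1970 earns 40k" rule,
# so the rule is repeated once per matching person.
people = [
    {"email": "a@x.com", "birthyear": 1970, "salary": 40_000},
    {"email": "b@x.com", "birthyear": 1970, "salary": 40_000},
]

# Update the 1970 salary in only one row...
people[0]["salary"] = 45_000

# ...and the redundantly stored fact is now inconsistent.
salaries_for_1970 = {p["salary"] for p in people if p["birthyear"] == 1970}
print(salaries_for_1970)   # {40000, 45000}: two different "facts" for the same rule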

8
Q

One answer to anomalies is to decompose tables into multiple tables. What is the problem with this?

A

This can result in "information loss": when you recombine (join) the decomposed tables, tuples match in too many places, so you can no longer rely on any one combined tuple being correct (i.e. you have lost the real connection to the entity and, thus, the information). See the sketch below.
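
A minimal sketch of such a lossy decomposition, using made-up people data: splitting on a non-key attribute and rejoining produces spurious tuples.

# Original relation: (name, birthyear, salary). Key is name.
people = {("Ann", 1970, 40_000), ("Bob", 1970, 55_000)}

# A bad decomposition that drops the key from one of the pieces.
r1 = {(name, birthyear) for (name, birthyear, salary) in people}    # (name, birthyear)
r2 = {(birthyear, salary) for (name, birthyear, salary) in people}  # (birthyear, salary)

# Natural join on birthyear.
rejoined = {
    (name, by1, salary)
    for (name, by1) in r1
    for (by2, salary) in r2
    if by1 == by2
}

print(rejoined)
# Contains spurious tuples such as ("Ann", 1970, 55000): we can no longer tell
# which salary really belonged to which person, so the information is lost.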

9
Q

What is another problem with decomposing tables?

A

Dependency loss: we can no longer enforce a functional dependency (e.g. this birthdate requires this salary) once its attributes are split across two sub-tables that do not share those columns.

10
Q

How do you fix all the potential anomalies with a relation?

A

You decompose the relation into the right combination of columns so that the resulting tables line up with the functional dependencies.

11
Q

But how do we get to a place with no functional dependencies?

A

Functional dependencies (covered in the next cards).

12
Q

What is a functional dependency in discrete math terms?

A

Let X and Y be sets of attributes of R. Y is functionally dependent on X in R iff for each x that is an element of R[X], there is precisely one y that is an element of R[Y] associated with it.
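
A minimal sketch of checking this definition on a small relation (the attribute names and rows are illustrative): X -> Y holds exactly when no two tuples agree on X but disagree on Y.

def fd_holds(rows, X, Y):
    """Return True iff X -> Y holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        x_val = tuple(row[a] for a in X)
        y_val = tuple(row[a] for a in Y)
        if x_val in seen and seen[x_val] != y_val:
            return False          # same X value maps to two different Y values
        seen[x_val] = y_val
    return True

rows = [
    {"email": "a@x.com", "name": "Ann", "birthyear": 1970},
    {"email": "b@x.com", "name": "Bob", "birthyear": 1985},
]
print(fd_holds(rows, ["email"], ["name", "birthyear"]))  # True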

13
Q

How do you ensure FULL functional dependency?

A

You basically make them "functions" in the true mathematical sense: each unique key value x has exactly one value y.

14
Q

The 3 normal form types

A

All attributes must depend on the key (1NF), the whole key (2NF), and nothing but the key (3NF), so help me Codd

15
Q

Does something ever land at 3rd NF and not BCNF?

A

In theory yes, but in practice essentially no; the prof has never seen it happen.

16
Q

Transitivity Rule

A

X->Y and Y->Z implies X->Z

17
Q

Augmentation Rules

A

If X->Y, you can augment both sides with the same Z: ZX->ZY

18
Q

Reflexivity

A

If Y is part of X then X->Y
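
Taken together, reflexivity, augmentation and transitivity let you compute the closure of an attribute set (everything it functionally determines), which is how you check keys and normal forms. A minimal sketch, assuming FDs are represented as (left, right) pairs of attribute sets:

def closure(attrs, fds):
    """All attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)                      # reflexivity: X -> X
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            # augmentation + transitivity: if we already determine `left`,
            # we also determine everything on its right-hand side.
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result

fds = [({"email"}, {"name", "birthyear"}), ({"birthyear"}, {"salary"})]
print(closure({"email"}, fds))  # {'email', 'name', 'birthyear', 'salary'}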

19
Q

Extent Transfer times?

A

This means that once you have found the data, you pick up the extra data in the same extent and incur only transfer time, which is just 0.5-1 ms.

20
Q

Buffer management strategies for extent transfer?

A

LRU = Least Recently Used: when you need to overwrite a buffer page, you evict the page that was used least recently.

Good for merge joins (because you are done with the old pages once you have passed them).

Bad for nested loops, because you still need the pages from the top of the loop, and those are exactly the ones LRU evicts first. See the sketch below.
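
A minimal sketch of an LRU buffer pool using Python's OrderedDict (block ids, the pool size and the read_from_disk callback are illustrative):

from collections import OrderedDict

class LRUBufferPool:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()            # block_id -> page contents

    def get(self, block_id, read_from_disk):
        if block_id in self.pages:
            self.pages.move_to_end(block_id)  # mark as most recently used
            return self.pages[block_id]
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)    # evict the least recently used page
        page = read_from_disk(block_id)
        self.pages[block_id] = page
        return page

pool = LRUBufferPool(capacity=2)
pool.get(1, lambda b: f"block {b}")
pool.get(2, lambda b: f"block {b}")
pool.get(3, lambda b: f"block {b}")           # evicts block 1 (least recently used)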

21
Q

Primary Index

A

Used to help expedite searching over sorted data.

Basically it is a parallel file of records, each holding the key value at the front of a data block and a pointer to that block, i.e. an index of which key value starts each block.

You can then search the index instead of the data. Because the index picks up the leading values in order, it is itself sorted.

Also, you can use it for "point and range" search: you find the block whose starting key is at or before your search key, so you know the item must lie in the range before the next key listed in the index. This speeds up the search because you know exactly which range (block) to search in. See the sketch below.
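
A minimal sketch of a sparse primary index over sorted data, using bisect for the point-and-range lookup (block size and key values are made up):

import bisect

# Sorted data file, split into blocks of 3 records each (keys only, for brevity).
blocks = [[2, 5, 8], [11, 14, 17], [20, 23, 26]]

# Sparse primary index: the first key of each block, in block order.
index_keys = [blk[0] for blk in blocks]       # [2, 11, 20]

def lookup(key):
    # Find the last index entry whose key is <= the search key:
    # if the record exists, it must be in that block.
    pos = bisect.bisect_right(index_keys, key) - 1
    if pos < 0:
        return None
    return key if key in blocks[pos] else None

print(lookup(14))   # 14   -> found in block 1
print(lookup(15))   # None -> only one block scanned, not the whole file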

22
Q

Size of a primary index and fanout

A

A block pointer is 4 bytes. You have 200,000 records, and each email key is 50 bytes, so each index record is 54 bytes. Given a block of 4,000 bytes at only 80% utilisation, you have 3,200 usable bytes, which hold floor(3200 / 54) = 59 (roughly 60) index records per block. This number is called the "fanout".
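
The same arithmetic as a quick sanity check, extended with the (illustrative) number of index blocks needed if there is one index entry per record:

block_size   = 4000          # bytes
utilisation  = 0.80
key_size     = 50            # bytes per email key
pointer_size = 4             # bytes per block pointer

entry_size = key_size + pointer_size          # 54-byte index entry
usable     = int(block_size * utilisation)    # 3200 usable bytes per block
fanout     = usable // entry_size             # 59 index entries per index block

n_records    = 200_000
index_blocks = -(-n_records // fanout)        # ceiling division
print(fanout, index_blocks)                   # 59 3390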

23
Q

Sparse index block

A

Comes from the primary index pointing only at the start of each block, i.e. one index entry per data block.

24
Q

Dense index block

A

Comes from the index pointing at each individual record, i.e. one index entry per record.

25
Q

Secondary Index

A

Builds an index on an attribute other than the one the file is sorted on, so the underlying data is not sorted by it.

Good for point queries only, because the underlying data is not sorted on this attribute.

You can find a given value quickly, but you cannot use a range to find the next values, since neighbouring values may sit in completely different blocks. See the sketch below.
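
A minimal sketch of a dense secondary index kept as a dictionary from attribute value to record locations (the heap data and attribute names are illustrative):

# Heap file: records stored in no particular order, as (block_no, record) pairs.
heap = [
    (0, {"email": "a@x.com", "city": "Oslo"}),
    (0, {"email": "b@x.com", "city": "Bergen"}),
    (1, {"email": "c@x.com", "city": "Oslo"}),
]

# Dense secondary index on a non-ordering attribute: one entry per record.
city_index = {}
for block_no, rec in heap:
    city_index.setdefault(rec["city"], []).append(block_no)

print(city_index["Oslo"])   # [0, 1] -> point query answered without scanning the heap
# A range query over cities gains nothing: matching records can sit in arbitrary
# blocks, so the index does not group them together on disk.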

26
Q

Multilevel index

A

If you can build an index over one level, you can build an index over that index, and so on, level by level.

This makes everything faster, since each extra level divides the number of blocks to examine by the fanout.

But it makes the structure more vulnerable to overflows, because adding one record can cascade changes back up through the index levels. See the sketch below.
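
A minimal sketch of the idea: each level is just a sparse index over the level below, and you keep going until one block's worth of entries remains (the sizes are illustrative):

def build_level(sorted_keys, fanout):
    """One index level: the first key of every group of `fanout` entries below."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), fanout)]

level = sorted(range(0, 10_000, 7))     # pretend these are the data-block keys
levels = []
while len(level) > 1:
    level = build_level(level, fanout=60)
    levels.append(level)

print([len(lvl) for lvl in levels])     # [24, 1]: two index levels on top of the data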

27
Q

Hash index

A

(Could look this up in the lecture.)

Basically it is faster for point lookups, unless the buckets get too deep (long overflow chains), at which point it degrades back towards scanning a heap. See the sketch below.
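
A minimal sketch of a static hash index with overflow chains (bucket count and records are illustrative): lookups stay cheap while chains are short and degrade towards a heap scan when they are not.

N_BUCKETS = 8

def make_hash_index(records, key):
    """Static hash index: bucket number -> list (overflow chain) of records."""
    buckets = [[] for _ in range(N_BUCKETS)]
    for rec in records:
        buckets[hash(rec[key]) % N_BUCKETS].append(rec)
    return buckets

def lookup(buckets, key, value):
    chain = buckets[hash(value) % N_BUCKETS]
    # Cost is the length of this chain: O(1)-ish when chains are short,
    # close to a full scan when too many records pile into one bucket.
    return [rec for rec in chain if rec[key] == value]

idx = make_hash_index([{"email": f"user{i}@x.com"} for i in range(100)], key="email")
print(lookup(idx, "email", "user42@x.com"))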