Exam 4 Flashcards

1
Q

Mapping of EER diagrams to relations will, in ________ cases, result in a database that is normalized?

a) All
b) Some
c) No

A

All

2
Q

The 4 elements of normalising a database

A

1) No redundancy of facts
2) No cluttering of facts
3) Must preserve information
4) Must preserve functional dependencies (e.g. if I know email, I know name and birthdate)

3
Q

How might you end up with a table that is not a relation (i.e. one in non-first normal form)?

A

Example: if you store tags against a record, the tag attribute is multivalued, so you would technically have many tags against one row. Instead, you replicate the rest of the record's info once for each tag, as in the sketch below.
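
A minimal sketch of the two representations, using a hypothetical article-with-tags example (the field names are illustrative, not from the course):

# Non-1NF: one row carries a multivalued "tags" attribute, so it is not a relation.
article_non_1nf = {"id": 7, "title": "Normalisation", "tags": ["sql", "theory", "exam"]}

# Staying in 1NF with a single table: repeat the other columns once per tag.
# The repetition is exactly the redundancy that later causes anomalies.
article_1nf_rows = [
    {"id": 7, "title": "Normalisation", "tag": "sql"},
    {"id": 7, "title": "Normalisation", "tag": "theory"},
    {"id": 7, "title": "Normalisation", "tag": "exam"},
]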

4
Q

What’s the problem with redundancy in a relation?

A

The same fact has to be updated in multiple places, which can lead to inconsistency.

5
Q

Insertions, based on functional requirements, cause what problems?

A

Can lead to NULL values.

For example, if you say anyone born in 1970 has a 40k salary, and there is no person born in 1970, then you have to insert NULL values just to record that statement.

6
Q

What is the problem with deletion anomalies?

A

If we delete a user who is the only example of our functional requirement (e.g. the only user born in 1970), then we lose that requirement too (i.e. we no longer have any record that someone born in 1970 earns 40k).

7
Q

What’s the problem with update anomalies?

A

If the same fact is stored in multiple rows and you update it in one row, you have to do extra work to update it everywhere else, or the rows become inconsistent.
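
A minimal sketch of that situation, reusing the birthyear/salary example from the earlier cards (the data itself is made up):

# One table stores person facts together with the "born in 1970 earns 40k" rule,
# so the rule is repeated once per matching person.
people = [
    {"email": "a@x.com", "birthyear": 1970, "salary": 40_000},
    {"email": "b@x.com", "birthyear": 1970, "salary": 40_000},
]

# Update the 1970 salary in only one row...
people[0]["salary"] = 45_000

# ...and the redundantly stored fact is now inconsistent.
salaries_for_1970 = {p["salary"] for p in people if p["birthyear"] == 1970}
print(salaries_for_1970)   # {40000, 45000}: two different "facts" for the same rule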

8
Q

One answer to anomalies is to decompose tables into multiple tables. What is the problem with this?

A

This can result in "information loss": when you recombine (join) the decomposed tables, tuples match in too many places, so you can no longer rely on any one combined tuple being correct (i.e. you have lost the real connection to the entity and, thus, the information). See the sketch below.
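
A minimal sketch of such a lossy decomposition, using made-up people data: splitting on a non-key attribute and rejoining produces spurious tuples.

# Original relation: (name, birthyear, salary). Key is name.
people = {("Ann", 1970, 40_000), ("Bob", 1970, 55_000)}

# A bad decomposition that drops the key from one of the pieces.
r1 = {(name, birthyear) for (name, birthyear, salary) in people}    # (name, birthyear)
r2 = {(birthyear, salary) for (name, birthyear, salary) in people}  # (birthyear, salary)

# Natural join on birthyear.
rejoined = {
    (name, by1, salary)
    for (name, by1) in r1
    for (by2, salary) in r2
    if by1 == by2
}

print(rejoined)
# Contains spurious tuples such as ("Ann", 1970, 55000): we can no longer tell
# which salary really belonged to which person, so the information is lost.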

9
Q

What is another problem with decomposing tables?

A

Dependency loss: we can no longer enforce a functional dependency (e.g. this birthdate requires this salary) once its attributes are split across two sub-tables that do not share those columns.

10
Q

How do you fix all the potential anomalies with a relation?

A

You decompose the relation into the right combination of columns so that the resulting tables line up with the functional dependencies.

11
Q

But how do we get to a place with no functional dependencies?

A

Functional dependencies (covered in the next cards).

12
Q

What is a functional dependency in discrete math terms?

A

Let X and Y be sets of attributes of R. Y is functionally dependent on X in R iff for each x that is an element of R[X], there is precisely one y that is an element of R[Y] associated with it.
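
A minimal sketch of checking this definition on a small relation (the attribute names and rows are illustrative): X -> Y holds exactly when no two tuples agree on X but disagree on Y.

def fd_holds(rows, X, Y):
    """Return True iff X -> Y holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        x_val = tuple(row[a] for a in X)
        y_val = tuple(row[a] for a in Y)
        if x_val in seen and seen[x_val] != y_val:
            return False          # same X value maps to two different Y values
        seen[x_val] = y_val
    return True

rows = [
    {"email": "a@x.com", "name": "Ann", "birthyear": 1970},
    {"email": "b@x.com", "name": "Bob", "birthyear": 1985},
]
print(fd_holds(rows, ["email"], ["name", "birthyear"]))  # True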

13
Q

How do you ensure FULL functional dependency?

A

You basically make them "functions" in the true mathematical sense: each unique key value x has exactly one value y.

14
Q

The 3 normal form types

A

All attributes must depend on the key (1NF), the whole key (2NF), and nothing but the key (3NF), so help me Codd

15
Q

Does something ever land at 3rd NF and not BCNF?

A

In theory yes, but in practice essentially no; the prof has never seen it happen.

16
Q

Transitivity Rule

A

X->Y and Y->Z implies X->Z

17
Q

Augmentation Rules

A

If X->Y, you can augment both sides with the same Z: ZX->ZY

18
Q

Reflexivity

A

If Y is part of X then X->Y
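
Taken together, reflexivity, augmentation and transitivity let you compute the closure of an attribute set (everything it functionally determines), which is how you check keys and normal forms. A minimal sketch, assuming FDs are represented as (left, right) pairs of attribute sets:

def closure(attrs, fds):
    """All attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)                      # reflexivity: X -> X
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            # augmentation + transitivity: if we already determine `left`,
            # we also determine everything on its right-hand side.
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result

fds = [({"email"}, {"name", "birthyear"}), ({"birthyear"}, {"salary"})]
print(closure({"email"}, fds))  # {'email', 'name', 'birthyear', 'salary'}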

19
Q

Extent Transfer times?

A

This means that once you have found the data, you pick up the extra data in the same extent and incur only transfer time, which is just 0.5-1 ms.

20
Q

Buffer management strategies for extent transfer?

A

LRU = Least Recently Used: when you need to overwrite a buffer page, you evict the page that was used least recently.

Good for merge joins (because you are done with the old pages once you have passed them).

Bad for nested loops, because you still need the pages from the top of the loop, and those are exactly the ones LRU evicts first. See the sketch below.
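
A minimal sketch of an LRU buffer pool using Python's OrderedDict (block ids, the pool size and the read_from_disk callback are illustrative):

from collections import OrderedDict

class LRUBufferPool:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()            # block_id -> page contents

    def get(self, block_id, read_from_disk):
        if block_id in self.pages:
            self.pages.move_to_end(block_id)  # mark as most recently used
            return self.pages[block_id]
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)    # evict the least recently used page
        page = read_from_disk(block_id)
        self.pages[block_id] = page
        return page

pool = LRUBufferPool(capacity=2)
pool.get(1, lambda b: f"block {b}")
pool.get(2, lambda b: f"block {b}")
pool.get(3, lambda b: f"block {b}")           # evicts block 1 (least recently used)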

21
Q

Primary Index

A

Used to help expedite searching over sorted data.

Basically it is a parallel file of records, each holding the key value at the front of a data block and a pointer to that block, i.e. an index of which key value starts each block.

You can then search the index instead of the data. Because the index picks up the leading values in order, it is itself sorted.

Also, you can use it for "point and range" search: you find the block whose starting key is at or before your search key, so you know the item must lie in the range before the next key listed in the index. This speeds up the search because you know exactly which range (block) to search in. See the sketch below.
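
A minimal sketch of a sparse primary index over sorted data, using bisect for the point-and-range lookup (block size and key values are made up):

import bisect

# Sorted data file, split into blocks of 3 records each (keys only, for brevity).
blocks = [[2, 5, 8], [11, 14, 17], [20, 23, 26]]

# Sparse primary index: the first key of each block, in block order.
index_keys = [blk[0] for blk in blocks]       # [2, 11, 20]

def lookup(key):
    # Find the last index entry whose key is <= the search key:
    # if the record exists, it must be in that block.
    pos = bisect.bisect_right(index_keys, key) - 1
    if pos < 0:
        return None
    return key if key in blocks[pos] else None

print(lookup(14))   # 14   -> found in block 1
print(lookup(15))   # None -> only one block scanned, not the whole file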

22
Q

Size of a primary index and fanout

A

A block pointer is 4 bytes. You have 200,000 records, and each email key is 50 bytes, so each index record is 54 bytes. Given a block of 4,000 bytes at only 80% utilisation, you have 3,200 usable bytes, which hold floor(3200 / 54) = 59 (roughly 60) index records per block. This number is called the "fanout".
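
The same arithmetic as a quick sanity check, extended with the (illustrative) number of index blocks needed if there is one index entry per record:

block_size   = 4000          # bytes
utilisation  = 0.80
key_size     = 50            # bytes per email key
pointer_size = 4             # bytes per block pointer

entry_size = key_size + pointer_size          # 54-byte index entry
usable     = int(block_size * utilisation)    # 3200 usable bytes per block
fanout     = usable // entry_size             # 59 index entries per index block

n_records    = 200_000
index_blocks = -(-n_records // fanout)        # ceiling division
print(fanout, index_blocks)                   # 59 3390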

23
Q

Sparse index block

A

Comes from the primary index pointing only at the start of each block, i.e. one index entry per data block.

24
Q

Dense index block

A

Comes from the index pointing at each individual record, i.e. one index entry per record.

25
Q

Secondary Index

A

Builds an index on an attribute other than the one the file is sorted on, so the underlying data is not sorted by it.

Good for point queries only, because the underlying data is not sorted on this attribute.

You can find a given value quickly, but you cannot use a range to find the next values, since neighbouring values may sit in completely different blocks. See the sketch below.
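
A minimal sketch of a dense secondary index kept as a dictionary from attribute value to record locations (the heap data and attribute names are illustrative):

# Heap file: records stored in no particular order, as (block_no, record) pairs.
heap = [
    (0, {"email": "a@x.com", "city": "Oslo"}),
    (0, {"email": "b@x.com", "city": "Bergen"}),
    (1, {"email": "c@x.com", "city": "Oslo"}),
]

# Dense secondary index on a non-ordering attribute: one entry per record.
city_index = {}
for block_no, rec in heap:
    city_index.setdefault(rec["city"], []).append(block_no)

print(city_index["Oslo"])   # [0, 1] -> point query answered without scanning the heap
# A range query over cities gains nothing: matching records can sit in arbitrary
# blocks, so the index does not group them together on disk.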

26
Q

Multilevel index

A

If you can build an index over one level, you can build an index over that index, and so on, level by level.

This makes everything faster, since each extra level divides the number of blocks to examine by the fanout.

But it makes the structure more vulnerable to overflows, because adding one record can cascade changes back up through the index levels. See the sketch below.
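
A minimal sketch of the idea: each level is just a sparse index over the level below, and you keep going until one block's worth of entries remains (the sizes are illustrative):

def build_level(sorted_keys, fanout):
    """One index level: the first key of every group of `fanout` entries below."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), fanout)]

level = sorted(range(0, 10_000, 7))     # pretend these are the data-block keys
levels = []
while len(level) > 1:
    level = build_level(level, fanout=60)
    levels.append(level)

print([len(lvl) for lvl in levels])     # [24, 1]: two index levels on top of the data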

27
Q

Hash index

A

(Could look this up in the lecture.)

Basically it is faster for point lookups, unless the buckets get too deep (long overflow chains), at which point it degrades back towards scanning a heap. See the sketch below.
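
A minimal sketch of a static hash index with overflow chains (bucket count and records are illustrative): lookups stay cheap while chains are short and degrade towards a heap scan when they are not.

N_BUCKETS = 8

def make_hash_index(records, key):
    """Static hash index: bucket number -> list (overflow chain) of records."""
    buckets = [[] for _ in range(N_BUCKETS)]
    for rec in records:
        buckets[hash(rec[key]) % N_BUCKETS].append(rec)
    return buckets

def lookup(buckets, key, value):
    chain = buckets[hash(value) % N_BUCKETS]
    # Cost is the length of this chain: O(1)-ish when chains are short,
    # close to a full scan when too many records pile into one bucket.
    return [rec for rec in chain if rec[key] == value]

idx = make_hash_index([{"email": f"user{i}@x.com"} for i in range(100)], key="email")
print(lookup(idx, "email", "user42@x.com"))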