File Organisation Flashcards

1
Q

what are heap, ordered and hash files

A

Heap:records are on disk in no order.

Ordered:records are ordered by value of specific field.

Hash: records are placed in hash function

Data stored on disk-files of records- collection of data values as facts. Records are stored to locate them efficiently.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is primary, secondary and tertiary storage ?

A

Primary:Fastest with cache and main. Data accessed by cpu. Limited capacity and power supply

Secondary:Media is on magnetic disks. Cost less and greater capacity. Slower access. Data loaded to prim before operation. No power supply.

Tertiary:used for backup, tapes DVD.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is access method and how is it applied ?

A

Storing and retrieving. Each tuple-map to record in OS. Each field in record value- one attribute value.
User-requests tuple- DBMS maps logical onto physical and retrieves and buffers from 1 storage.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What are physical records

A

Unit of Transfer between disk and primary storage. Has more than one tuple, single logical record which corresponds to one/more physical ones. Called BLOCKS/ PAGES

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

How are heap files inserted and what is it’s advantage ?

A

Simplest and most basic type. Records are stored in the ORDER in which they are INSERTED. New records- last page of file. No space then new page is added. INSERTION IS VERY EFFICIENT- O(1) COMPLEXITY. Advantage is that they are very efficient at BULK LOADING data into table. No overhead of calculating what page the record should go on.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

How are heap files searched for ?

A

SEARCHING IS INEFFICIENT. Since no order to find a value, a LINEAR search must be performed. A Linear search- reading pages from file until the required one is found.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

How are heap files deleted ?

A

Physical deletion leaves unused space in block. A large amount of space is wasted if lot of deletion. File size & disk space will increase and performance progressively deteriorate

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What are ordered files ?

A

Records in a file- physically ordered based on value of one or more of fields. The field that file is sorted on is called ORDERING FIELD.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

How are ordered files searched for ?

A

We use the ORDER BY statement to search. There is also binary search which has following steps:
Retrieve the mid-page of the file.
Check if the required record is between the first and last record of this page.
If yes, the record is found on this page, and no more pages need to be retrieved.
If no, proceed to the next steps.
If the value of the key field in the first record on the page is greater than the required value:
Repeat the process using the top half of the file as the new search area.
If the value of the key field in the last record on the page is less than the required value:
Repeat the process using the bottom half of the file as the new search area.
With each iteration, half of the search space is eliminated, optimizing the search for the required record.
More efficient than linear o(log(n))

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How are ordered files inserted and deleted ?

A

it is a problem because order has to be maintained.

Insert :-find the correct position in file and find space to insert record. If space is there then page is REORDERED if not then MOVE records to NEXT PAGE which causes CASCADING EFFECT.
A solution is-OVERFLOW/TRANSACTION FILE. Insertions added periodically and merged.
Deletion- reorganise the records to remove the free slot.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What are hash files ?

A

A hash function is used to calculate address of page in which record is stored based on one/more fields-O(1) COMPLEXITY.
Base field- HASH FIELD
if hash field is a key field-HASH KEY
Records will appear randomly distributed across space- RANDOM FILES.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What are the hash files examples and explain them

A

FOLDING-applies an arithmetic function to diff parts of has field. Strings can convert to numeric value. Result of arithmetic determines ADDRESS OF DISK PAGE.

DIVISION-REMAINDER-More popular and uses the MODULO function where it / pre determined integer and takes REMAINDER AS DISK ADDRESS.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What are the problems that occur with hash files ?

A

Do not guarantee a unique address because of values a hash field can have.
Each address corresponds to a page(Bucket) with slots for multiple records. Within bucket -record are placed in order of arrival.
When same address is generated for two or more a COLLISION is occured and records are called SYNONYMNS.
Collision management degrades overall performance.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

4 techniques used for collision management

A
  1. OPEN ADDRESSING(linear probing)- if required position is occupied then program checks subsequent positions in turn until an available space is located.

2.UNCHAINED OVERFLOW- area is maintained for collisions. Record that collides with another will be placed here.

3.CHAINED OVERFLOW-each bucket has an additional field(synonym pointer) that indicated if collision has occurred and if yes then points to overflow page.
linked list for each hash address of overflow records is maintained.

4.MULTIPLE HASHING-when a second hash function is applied.

THESE ARE ALL STATIC

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is dynamic hashing ?

A

When the space is full, it is said to be saturated. This is when the file size changes dynamically to accommodate shrinking and growth of DB

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What are the limitations of hashing ?

A

The retrievals depends upon the complete hash field. It is inappropriate for pattern matching.

not useful for retrievals based on any other field than hash.