Organisation and structure of data flashcards
Paper 2
fixed length record 5 facts
- easier to program and search, as every record has the same allocation of memory and space
- easier to process, as the start and end locations are known: the position of record n can be calculated directly, as shown in the sketch after this list
- binary search can be used to locate records
- wastes space if the data is smaller than the field
- truncates fields if the data is too large
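A minimal Python sketch of why fixed-length records allow direct access, assuming each record is a 4-byte integer ID plus a 20-byte name field (the file name and layout here are hypothetical):

```python
import struct

RECORD_FORMAT = "<i20s"                        # 4-byte int + 20-byte name field
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)   # every record is the same size

def write_record(f, index, rec_id, name):
    f.seek(index * RECORD_SIZE)        # start location is directly calculable
    # struct pads short names with null bytes (wasted space) and
    # truncates names longer than 20 bytes (truncated fields)
    f.write(struct.pack(RECORD_FORMAT, rec_id, name.encode()))

def read_record(f, index):
    f.seek(index * RECORD_SIZE)
    rec_id, name = struct.unpack(RECORD_FORMAT, f.read(RECORD_SIZE))
    return rec_id, name.rstrip(b"\x00").decode()

with open("students.dat", "w+b") as f:
    write_record(f, 0, 1, "Ada")
    write_record(f, 1, 2, "Grace")
    print(read_record(f, 1))           # jump straight to record 1, no scanning
```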
variable length record 5 facts
- a different number of bytes for each record
- slow to search, as the marker at the end of each record must be found before the next record can be located
- if the file is updated, it all needs to be rebuilt
- can only use linear search
- better if records are different sizes, as no space is wasted (see the sketch after this list)
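A minimal Python sketch of variable-length records, assuming a newline acts as the end-of-record marker (the file name and field layout are hypothetical):

```python
def append_record(path, fields):
    with open(path, "a", encoding="utf-8") as f:
        f.write(",".join(fields) + "\n")   # record is only as long as its data

def find_record(path, key):
    # there is no way to jump to record n: scan marker by marker (linear search)
    with open(path, encoding="utf-8") as f:
        for line in f:
            fields = line.rstrip("\n").split(",")
            if fields[0] == key:
                return fields
    return None

append_record("people.txt", ["1", "Ada"])
append_record("people.txt", ["2", "Grace Hopper"])  # different sizes, no waste
print(find_record("people.txt", "2"))
```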
master file 5 facts
- large file that is sorted into key order
- data is updated from the transaction file
- data is searched regularly and used for information by staff
- data is not always completely up to date
- holds permanent data
transaction file 3 facts
- serial file, stored in the order the data is submitted
- used as a temporary file before the data is added to the master file, usually for a short amount of time
- serially searched (slow)
updating the master file process
usually occurs overnight (see the sketch after these steps)
1. the transaction file is sorted into the same order as the master file
2. data is copied from the old master to the new master until the point where data from the transaction file is needed
3. that data is copied from the transaction file to the new master file
4. this is repeated until all data from the transaction file is in its correct location
5. an error log is generated at the end of the process
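A minimal Python sketch of the overnight update, assuming one record per line with the primary key first, and treating every transaction as an insertion (all file names are hypothetical; real systems also handle updates and deletions):

```python
def update_master(old_path, trans_path, new_path, log_path):
    # 1. sort the transaction file into the same order as the master file
    with open(trans_path) as f:
        trans = sorted(line.rstrip("\n") for line in f if line.strip())
    with open(old_path) as f:
        master = [line.rstrip("\n") for line in f]

    errors = []          # validation is omitted in this sketch, so this stays empty
    with open(new_path, "w") as new:
        t = 0
        for record in master:
            # 2. copy from the old master until a transaction record is due
            while t < len(trans) and trans[t] < record:
                new.write(trans[t] + "\n")   # 3. copy from the transaction file
                t += 1
            new.write(record + "\n")
        # 4. repeat until every transaction is in its correct location
        for record in trans[t:]:
            new.write(record + "\n")

    # 5. the error log is generated at the end of the process
    with open(log_path, "w") as log:
        log.write("\n".join(errors))
```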
serial files
records stored in chronological order (the order they were added)
must be linearly searched (slow)
adding records is fast, as the new record is just appended after the last record (see the sketch below)
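A minimal Python sketch of a serial file: appending is fast, but lookup must scan every record (the file name is hypothetical):

```python
def add(path, record):
    with open(path, "a") as f:       # fast: appended straight after the last record
        f.write(record + "\n")

def search(path, target):
    with open(path) as f:            # slow: records are in chronological order,
        for record in f:             # so every record may have to be checked
            if record.rstrip("\n") == target:
                return True
    return False
```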
sequential files
stores records in order of primary key
fast to locate a specific record - binary search can be used
when a new record is added, a new file is made and records are copied across, with the new record inserted when the appropriate location is reached (see the sketch below)
the same applies when a record is deleted (the deleted record is simply not copied across)
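A minimal Python sketch of inserting into a sequential file by rewriting it, assuming one record per line sorted on a numeric primary key (file names and layout are hypothetical):

```python
import os

def insert_record(path, new_record, key=lambda r: int(r.split(",")[0])):
    inserted = False
    with open(path) as old, open(path + ".tmp", "w") as new:
        for record in old:
            record = record.rstrip("\n")
            # copy across until the appropriate key position is reached
            if not inserted and key(new_record) < key(record):
                new.write(new_record + "\n")
                inserted = True
            new.write(record + "\n")
        if not inserted:                     # new record belongs at the end
            new.write(new_record + "\n")
    os.replace(path + ".tmp", path)          # the new file replaces the old one

# deletion works the same way: copy every record except the one being deleted
```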
indexed sequential files
records split into 2 components: the index (primary key and pointer) and the bulk of the record
primary keys are stored in order, each linked to a pointer which identifies where on the disk the rest of the record can be found
keys are added and deleted in the same manner as in a sequential file, but it is faster as only the index needs to be copied - the pointers remain the same
speeds up searching, as less of the file needs to be searched (see the sketch below)
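A minimal Python sketch of an indexed sequential file: the index pairs each primary key with a pointer (here a byte offset) to the rest of the record (the file layout is hypothetical):

```python
def build_index(path):
    index = {}                       # primary key -> pointer (byte offset)
    with open(path, "rb") as f:
        while True:
            offset = f.tell()
            line = f.readline()
            if not line:
                break
            key = line.split(b",")[0].decode()
            index[key] = offset
    return index

def fetch(path, index, key):
    with open(path, "rb") as f:
        f.seek(index[key])           # follow the pointer: only the small index
        return f.readline().decode().rstrip("\n")   # is searched, not the data
```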
multilevel index
same as an index, but the top-level index doesn't contain pointers to the records: it points to second-level indexes, which contain the pointers to the records (or to a 3rd level, etc.)
useful when different clusters of records are stored in different physical locations (see the sketch below)
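A minimal Python sketch of a two-level index: the top level maps key ranges to second-level indexes, which hold the actual record pointers (all keys and offsets here are hypothetical):

```python
second_level_a = {"ADAMS": 0, "BAKER": 64}       # key -> disk offset
second_level_b = {"NEWTON": 128, "TURING": 192}

top_level = [                          # highest key held by each second-level index
    ("M", second_level_a),             # keys up to "M" live in the first index
    ("Z", second_level_b),             # keys up to "Z" live in the second index
]

def lookup(key):
    for highest_key, second_level in top_level:   # search only the small top level,
        if key <= highest_key:
            return second_level.get(key)          # then one second-level index
    return None

print(lookup("TURING"))   # -> 192
```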
direct file access
records are stored and retrieved according to either their disk address or their relative position within the file
this means the program which stores and retrieves the records must specify the address within the file - a hashing algorithm is applied to the key to generate the start position of the block, and the block can then be serially searched to find the required record (see the sketch below)
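A minimal Python sketch of direct file access, assuming fixed 64-byte blocks addressed by hashing the key (the block size, toy hash, and record layout are all hypothetical):

```python
BLOCK_SIZE = 64
NUM_BLOCKS = 100

def block_start(key):
    # the hashing algorithm applied to the key gives the block's start position
    return (sum(key.encode()) % NUM_BLOCKS) * BLOCK_SIZE

def find(path, key):
    with open(path, "rb") as f:
        f.seek(block_start(key))                 # jump straight to the block
        block = f.read(BLOCK_SIZE)
    for record in block.split(b";"):             # then search the block serially
        if record.startswith(key.encode() + b","):
            return record.decode()
    return None
```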
properties of a good hashing function
deterministic
uniformity
data normalisation
continuity
non-invertible
deterministic hash function
when given the same key, it should always produce the same result
uniformity hash function
keys should be spread evenly over the available block range, each block with the same probability, to reduce the number of records hashing to the same block
data normalisation hash function
keys should be normalised before being hashed (e.g. making all characters lower case), so that equivalent keys hash to the same value (see the sketch below)
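A minimal Python sketch of normalising a key before hashing, so keys differing only in case or spacing hash deterministically to the same block (the hash itself is a hypothetical toy example):

```python
NUM_BLOCKS = 100

def normalise(key):
    return key.strip().lower()       # one canonical form for equivalent keys

def hash_key(key):
    key = normalise(key)
    return sum(key.encode()) % NUM_BLOCKS   # deterministic: same key, same block

print(hash_key("Smith") == hash_key("  SMITH "))   # -> True
```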
continuity hash function
keys that differ by a small amount should result in hash values that also differ only by a small amount (useful when similar keys should be stored close together)
non-invertible hash function
it should not be possible to reconstruct the original key from its hash value