Organisation and structure of data Flashcards

paper 2

1
Q

fixed length record 5 facts

A
  • easier to program and search, as every record is allocated the same amount of storage
  • easier to process, as the start and end position of each record can be calculated
  • binary search can be used to locate records (if the file is sorted by key)
  • wastes space when the data is shorter than the field (the rest is padding)
  • truncates data that is longer than the field
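A minimal sketch of the points above, assuming a made-up layout of a 4-byte ID, a 10-byte name and a 4-byte age (the layout and the names `RECORD`, `pack_record` and `read_record` are illustrative only). It shows the padding, the truncation and the calculated start position.

```python
import struct

# Hypothetical layout: 4-byte int ID, 10-byte name, 4-byte int age ("<" = no alignment gaps).
RECORD = struct.Struct("<i10si")

def pack_record(rec_id, name, age):
    # struct pads a short name with null bytes and truncates a long one to 10 bytes.
    return RECORD.pack(rec_id, name.encode("ascii"), age)

def read_record(data, n):
    # Record n always starts at byte n * RECORD.size, so no scanning is needed.
    start = n * RECORD.size
    rec_id, raw_name, age = RECORD.unpack_from(data, start)
    return rec_id, raw_name.rstrip(b"\x00").decode("ascii"), age

data = b"".join([
    pack_record(1, "Ann", 17),                # short name: 7 bytes of padding wasted
    pack_record(2, "Bartholomew-Smith", 18),  # long name: cut to "Bartholome"
])

print(RECORD.size)           # bytes per record
print(read_record(data, 1))  # jumps straight to the second record
```

Because every record is the same size, record n can be read without touching the records before it, which is what makes binary search possible.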
2
Q

variable length record 5 facts

A
  • each record can take up a different number of bytes
  • slow to search, as the marker at the end of each record has to be found
  • if the file is updated, it all needs to be rebuilt
  • can only use linear search
  • better when records vary in size, as no space is wasted on padding
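A rough sketch of why searching is linear, assuming a newline as the end-of-record marker and a comma as the field separator (both are assumptions, as is the name `find_record`).

```python
END = "\n"  # assumed end-of-record marker
SEP = ","   # assumed field separator

def find_record(data, wanted_id):
    # Record positions cannot be calculated, so each record is read
    # up to its end marker in turn until the wanted one is found.
    for record in data.split(END):
        if not record:
            continue  # skip the empty piece after the final marker
        fields = record.split(SEP)
        if fields[0] == wanted_id:
            return fields
    return None

# Each record only uses the bytes it needs.
data = "1,Ann,17\n2,Bartholomew-Smith,18\n3,Cy,16\n"
print(find_record(data, "2"))
```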
3
Q

master file 5 facts

A
  • large file, kept sorted in a set order
  • updated with data from the transaction file
  • searched regularly and used for information by staff
  • not always completely up to date
  • holds permanent data
4
Q

transaction file 3 facts

A
  • serial files: records are stored in the order the data is submitted
  • temporary files, usually kept for a short time, holding data before it is added to the master file
  • searched serially (slow)
5
Q

updating the master file process

A

usually occurs overnight
1. the transaction file is sorted into the same order as the master file
2. records are copied from the old master file to a new master file until the point where the next transaction record belongs
3. that record is copied from the transaction file to the new master file
4. steps 2 and 3 are repeated until all the transaction data is in the correct location
5. an error log is generated at the end of the process
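The steps above can be sketched as a merge, assuming both files are lists of (key, data) tuples and that a transaction whose key already exists in the master is rejected and logged. That error rule and the name `update_master` are assumptions made for this example.

```python
def update_master(old_master, transactions):
    # old_master is already in key order; transactions arrive in any order.
    transactions = sorted(transactions, key=lambda t: t[0])  # step 1
    new_master, errors = [], []
    i = 0
    for key, data in transactions:
        # Step 2: copy old master records until the transaction's position is reached.
        while i < len(old_master) and old_master[i][0] < key:
            new_master.append(old_master[i])
            i += 1
        if i < len(old_master) and old_master[i][0] == key:
            # Assumed rule: a key that already exists is rejected and logged.
            errors.append(f"duplicate key {key}")
        else:
            new_master.append((key, data))  # step 3
    new_master.extend(old_master[i:])  # copy whatever is left of the old master
    return new_master, errors          # step 5: the error log

old = [(1, "a"), (3, "c"), (5, "e")]
new, log = update_master(old, [(4, "d"), (2, "b"), (3, "x")])
print(new)
print(log)
```

The old master is left untouched, so it can be kept as a backup of the previous generation.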

6
Q

serial files

A

records stored in chronological order (the order they were added)
must be searched linearly (slow)
adding records is fast, as each new record is just appended to the end of the file

7
Q

sequential files

A

stores records in order of primary key
fast to locate specific records - can use binary search
when a new record is added, a new file is made and the records are copied across, with the new record written in when its correct position is reached
deleting works the same way (the deleted record is just not copied across)
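A small sketch of adding and deleting by copying, assuming the file is a list of (key, data) tuples in key order. The function names are made up for this example.

```python
def insert_record(old_file, new_record):
    # Copy records into a new file, writing the new record in
    # just before the first record with a larger key.
    new_file, placed = [], False
    for record in old_file:
        if not placed and new_record[0] < record[0]:
            new_file.append(new_record)
            placed = True
        new_file.append(record)
    if not placed:
        new_file.append(new_record)  # largest key so far: goes at the end
    return new_file

def delete_record(old_file, key):
    # Copy everything across except the record being deleted.
    return [record for record in old_file if record[0] != key]

students = [(101, "Ann"), (103, "Cy")]
students = insert_record(students, (102, "Bea"))
print(students)
print(delete_record(students, 101))
```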

8
Q

indexed sequential files

A

records split into 2 components: the index (primary key and pointer) and the bulk of the record
primary keys are stored in order, each linked to a pointer which identifies where on the disk the rest of the record can be found
keys are added and deleted in the same manner as in a sequential file, but this is faster as only the index needs to be copied; the pointers to existing records remain the same
speeds up searching, as less of the file needs to be searched
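A minimal sketch of the idea, assuming a list stands in for the disk and a list position stands in for a disk address. The names `disk`, `index`, `isam_lookup` and `isam_add` are hypothetical.

```python
from bisect import bisect_left

# The bulk of each record sits wherever there was space on the "disk".
disk = ["Cy,16", "Ann,17", "Bea,18"]
# The index holds (primary key, pointer) pairs, kept in key order.
index = [(101, 1), (102, 2), (103, 0)]

def isam_lookup(key):
    # Binary search the small index, then follow the pointer to the record.
    keys = [k for k, _ in index]
    pos = bisect_left(keys, key)
    if pos < len(index) and index[pos][0] == key:
        return disk[index[pos][1]]
    return None

def isam_add(key, body):
    # The bulk is appended; only the index is reordered, and
    # the pointers to existing records stay the same.
    disk.append(body)
    keys = [k for k, _ in index]
    index.insert(bisect_left(keys, key), (key, len(disk) - 1))

print(isam_lookup(103))
isam_add(100, "Dee,15")
print(index)
```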

9
Q

multilevel index

A

same as an index, but the top-level index doesn’t contain pointers to the records; it points to second-level indexes, which contain pointers to the records (or to a 3rd level, etc.)
useful when different clusters of records are stored in different physical locations
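A rough two-level sketch, assuming the top level stores the highest key covered by each second-level index. That scheme, the cluster names and the function name are assumptions for illustration.

```python
# Top level: (highest key covered, name of the second-level index to use).
top_index = [(199, "A"), (299, "B")]

# Second level: key -> pointer into that cluster's records.
second_level = {
    "A": {101: 0, 150: 1},
    "B": {205: 0, 260: 1},
}

# Each cluster of records could live in a different physical location.
clusters = {
    "A": ["Ann", "Bea"],
    "B": ["Cy", "Dee"],
}

def multilevel_lookup(key):
    # The top level only says which second-level index to look in.
    for highest_key, name in top_index:
        if key <= highest_key:
            pointer = second_level[name].get(key)
            return None if pointer is None else clusters[name][pointer]
    return None  # key is beyond every range in the top level

print(multilevel_lookup(205))
```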

10
Q

direct file access

A

records are stored and retrieved according to either their disk address or their relative position within the file
the program which stores and retrieves the records must specify the address within the file: a hashing algorithm is applied to the key to generate the start position of a block, and the block is then searched serially to find the record
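A minimal sketch, assuming integer keys and a simple "key modulo number of blocks" hashing algorithm. The algorithm, the block count and the function names are assumptions for this example.

```python
NUM_BLOCKS = 5  # assumed number of blocks in the file

def block_address(key):
    # Assumed hashing algorithm: integer key modulo the number of blocks.
    return key % NUM_BLOCKS

blocks = [[] for _ in range(NUM_BLOCKS)]

def store_direct(key, data):
    blocks[block_address(key)].append((key, data))

def retrieve_direct(key):
    # Jump straight to the block, then search that block serially.
    for k, data in blocks[block_address(key)]:
        if k == key:
            return data
    return None

store_direct(12, "Ann")
store_direct(17, "Bea")  # hashes to the same block as 12
store_direct(4, "Cy")
print(retrieve_direct(17))
```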

11
Q

properties of a good hashing function

A

deterministic
uniformity
data normalisation
continuity
non-invertible

12
Q

deterministic hash function

A

given the same key, it should always produce the same hash value

13
Q

uniformity hash function

A

keys should be spread evenly over the available block range, with each block equally likely, to reduce the number of records that land in the same block

14
Q

data normalisation hash function

A

keys should be normalised before being hashed (e.g. by making all characters lower case)
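A small sketch of normalising before hashing, assuming a toy hash that sums the character codes and takes the remainder. The hash and the names `normalise` and `hash_key` are made up for illustration and are not a recommended hash.

```python
def normalise(key):
    # Remove surrounding spaces and make every character lower case.
    return key.strip().lower()

def hash_key(key, num_blocks=10):
    # Assumed toy hash: sum of character codes modulo the number of blocks.
    return sum(ord(ch) for ch in normalise(key)) % num_blocks

# Different spellings of the same key now land in the same block.
print(hash_key("Smith"), hash_key("SMITH"), hash_key(" smith "))
```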

15
Q

continuity hash function

A

keys that differ by a small amount should produce hash values that also differ by only a small amount

16
Q

non-invertible hash function

A

it should not be possible to reverse the hash value to get the original key

17
Q

hashing collision

A

when two record keys generate the same address
solved by using the next available free space or an overflow area
the success of a hashing function depends on the number of collisions that occur so real data must be run through to test it
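A sketch of the "next available free space" approach, assuming integer keys, a table of 7 slots and a modulo hash. All of these, and the function names, are assumptions for this example.

```python
SIZE = 7
table = [None] * SIZE

def insert_key(key, data):
    # Start at the hashed address and step forward until a free slot is found.
    start = key % SIZE
    for step in range(SIZE):
        slot = (start + step) % SIZE
        if table[slot] is None:
            table[slot] = (key, data)
            return slot
    raise OverflowError("table is full")

def find_key(key):
    # Follow the same path; an empty slot means the key was never stored.
    start = key % SIZE
    for step in range(SIZE):
        slot = (start + step) % SIZE
        if table[slot] is None:
            return None
        if table[slot][0] == key:
            return table[slot][1]
    return None

slots = [insert_key(10, "Ann"), insert_key(17, "Bea"), insert_key(24, "Cy")]
print(slots)  # all three keys hash to the same address
```

The more collisions there are, the further each search has to step, which is why the hashing function should be tested with real data.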

18
Q

overflow

A

must be searched linearly
if too many records end up in the overflow area, access to the file slows down and its efficiency falls
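A minimal sketch of a separate overflow area, assuming one record per home address and a modulo hash (assumptions, as are the names used).

```python
HOME_SIZE = 5
home = [None] * HOME_SIZE  # one record per hashed address
overflow = []              # shared overflow area

def store_record(key, data):
    slot = key % HOME_SIZE
    if home[slot] is None:
        home[slot] = (key, data)
    else:
        overflow.append((key, data))  # collision: goes to the overflow area

def retrieve_record(key):
    slot = key % HOME_SIZE
    if home[slot] is not None and home[slot][0] == key:
        return home[slot][1]
    # Not at its home address, so the overflow area is searched linearly.
    for k, data in overflow:
        if k == key:
            return data
    return None

for key, name in [(3, "Ann"), (8, "Bea"), (13, "Cy")]:
    store_record(key, name)
print(len(overflow), retrieve_record(13))
```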

19
Q

what is a backup

A

copying files from the main area where they are used to a separate area for security (e.g. in case a file is accidentally deleted)
can take a long time so often done overnight

20
Q

backup policy creation factors

A

where will the backup be stored?
what will it be stored on?
how often will the backup be taken?
how long will a backup be kept?

21
Q

incremental backup

A

the first backup copies all files; after that, only the files that have changed since the last backup are copied
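A rough sketch, assuming "changed" is judged by each file's last-modified time and that only the top level of one folder is backed up. The function name and its parameters are made up for this example.

```python
import os
import shutil

def incremental_backup(source_dir, backup_dir, last_backup_time):
    # Copy only the files modified after the previous backup.
    # Passing 0 as last_backup_time copies everything (the first, full backup).
    copied = []
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        if os.path.isfile(path) and os.path.getmtime(path) > last_backup_time:
            shutil.copy2(path, os.path.join(backup_dir, name))
            copied.append(name)
    return copied
```

After each run the program would record the current time and pass it in as `last_backup_time` on the next run.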

22
Q

archiving

A

process of moving a file/record from the main system to a separate archived system to speed up running of main system while still giving access to archived file.

23
Q

file version management

A

creating a copy of the file every time fundamental changes are made, so you can revert to an earlier version if necessary