6. Organisation of Data Flashcards
What is a file?
A collection of related records or data handled as a single unit. It has a filename that users can use to access the data at a later time.
What is a record?
A collection of related fields. A field is a single piece of data about an entity, e.g. surname is a field that could appear in customer data.
What is a fixed length record?
When the length of the record is stated at the beginning and cannot be changed. If data is too large to be stored, it is truncated.
Examples of fields that suit a fixed length:
- DOB
- gender
What is a variable length record?
When the length of the record can change depending on what data needs to be stored.
Examples of fields that suit a variable length:
- name
- address
How do fixed length and variable length records compare?
fixed length:
- each field takes up the same number of bytes in every record
- easier to program as it is easier to calculate how much storage is required
- quicker to process
- fields with blank space waste storage
variable length:
- a field can take up a different number of bytes in each record
- harder to program
- slower to process
- no blank space so only necessary amount of storage is taken up
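The comparison above can be sketched in Python. This is a minimal illustration only: the field sizes (10-byte surname, 8-byte DOB, 1-byte gender) and the 1-byte length prefix used for the variable length version are assumptions made for the example, not a standard layout.

```python
import struct

# Hypothetical fixed length layout: 10-byte surname, 8-byte DOB (YYYYMMDD), 1-byte gender
FIXED_LAYOUT = struct.Struct("10s8s1s")

def pack_fixed(surname, dob, gender):
    # struct truncates values that are too long and pads short ones with null bytes
    return FIXED_LAYOUT.pack(
        surname.encode("ascii"), dob.encode("ascii"), gender.encode("ascii")
    )

def pack_variable(surname, dob, gender):
    # Each field is prefixed with a 1-byte length, so only the bytes needed are stored
    out = b""
    for value in (surname, dob, gender):
        data = value.encode("ascii")
        out += struct.pack("B", len(data)) + data
    return out

short_rec = pack_fixed("Ng", "20010131", "F")
long_rec = pack_fixed("Featherstonehaugh", "19991231", "M")
var_rec = pack_variable("Ng", "20010131", "F")
```

Both fixed length records occupy 19 bytes, so the position of record n can be calculated directly, but the long surname loses its last characters and the short one is padded. The variable length record stores only what is needed, at the cost of reading the length prefixes to find each field.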
What is a master file?
A master file stores a record of everything that has ever happened and is updated with new batches of information to keep it up to date. Because of this it is large and accessed infrequently. Its records are stored sequentially, in a logical order (e.g. by primary key).
What is a transaction file?
A transaction file stores day-to-day data that is copied to the master file at the end of the day. Data is stored serially (in the order it arrives, with no sorting) for a set period of time.
How is the master file updated?
The transaction file is sorted into primary key order. Each record in the sorted transaction file is then compared with the matching record in the master file, and the change is applied. Once every transaction has been processed, the outputs are a new master file containing the updated data, error reports, and printed reports for use within the company, e.g. gas bills.
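The update process can be sketched as a merge of two sorted lists. This is a simplified illustration: the record layout, the action names ("add", "update", "delete") and the assumption of at most one transaction per key are all made up for the example.

```python
def update_master(old_master, transactions):
    # old_master: list of (key, data) tuples already in key order
    # transactions: list of (key, action, data) tuples in arrival (serial) order
    # Assumption for this sketch: at most one transaction per key
    sorted_tx = sorted(transactions, key=lambda t: t[0])
    new_master, errors = [], []
    i = 0
    for key, action, data in sorted_tx:
        # Copy across unchanged master records that come before this key
        while i < len(old_master) and old_master[i][0] < key:
            new_master.append(old_master[i])
            i += 1
        match = i < len(old_master) and old_master[i][0] == key
        if action == "add" and not match:
            new_master.append((key, data))
        elif action == "update" and match:
            new_master.append((key, data))
            i += 1
        elif action == "delete" and match:
            i += 1  # skip the old record so it is not copied
        else:
            errors.append((key, action))  # goes into the error report
    new_master.extend(old_master[i:])  # copy the rest of the old master file
    return new_master, errors

master = [(1, "A"), (3, "C"), (5, "E")]
tx = [(5, "update", "E2"), (2, "add", "B"), (3, "delete", None), (9, "update", "Z")]
new_master, errors = update_master(master, tx)
```

Sorting the transactions first means both files are read once, from start to finish, which suits sequential storage. The update to key 9 has no matching master record, so it ends up in the error list rather than the new master file.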
What is serial file access?
When records are stored in the order in which they are added to the file. It is used when no particular order is required. Searching is slow because a linear search must be used; however, adding a record is very quick as it is simply appended to the end.
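A minimal sketch of a serial file, assuming a plain text file with one "key,data" record per line (the file name and record format are invented for the example):

```python
import os
import tempfile

def append_record(path, key, data):
    # Appending is quick: the new record simply goes on the end of the file
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"{key},{data}\n")

def linear_search(path, key):
    # Records are in no useful order, so each one is checked in turn
    with open(path, encoding="utf-8") as f:
        for line in f:
            record_key, data = line.rstrip("\n").split(",", 1)
            if record_key == key:
                return data
    return None  # reached the end without finding the key

path = os.path.join(tempfile.mkdtemp(), "serial.txt")
for k, d in [("C3", "Patel"), ("A1", "Jones"), ("B2", "Smith")]:
    append_record(path, k, d)
```

The records stay in arrival order (C3, A1, B2), so a search for a missing key has to read the whole file before giving up.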
What is sequential file access?
When records are stored in order of record key. When a new record is added, the records from the old file are copied to a new file up to the point where the new record belongs. The new record is then written, and the rest of the records from the old file are copied after it. Records in sequential files can be searched for using a binary search.
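A sketch of both operations, using Python lists of (key, data) tuples to stand in for the old and new files. The binary search assumes any record can be reached directly by its position, as is the case in memory or with fixed length records on a direct access device.

```python
def insert_sequential(old_file, new_record):
    # old_file: list of (key, data) in key order; a new list plays the part of the new file
    new_file = []
    inserted = False
    for record in old_file:
        if not inserted and new_record[0] < record[0]:
            new_file.append(new_record)  # the new record goes in at its key position
            inserted = True
        new_file.append(record)  # copy the old record across
    if not inserted:
        new_file.append(new_record)  # new key is the largest, so it goes last
    return new_file

def binary_search(records, key):
    low, high = 0, len(records) - 1
    while low <= high:
        mid = (low + high) // 2
        if records[mid][0] == key:
            return records[mid]
        if records[mid][0] < key:
            low = mid + 1   # key must be in the upper half
        else:
            high = mid - 1  # key must be in the lower half
    return None

old = [(10, "a"), (20, "b"), (40, "d")]
new = insert_sequential(old, (30, "c"))
```

Because the records are kept in key order, each comparison in the binary search halves the number of records left to check.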
How are records deleted from files?
A new copy of the file is made containing every record except the one being deleted. The original file is then deleted and the new file remains.
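A minimal sketch of deletion by copying, again assuming a text file with one "key,data" record per line. The file names are invented for the example, and os.replace is used here to remove the original and keep the new copy in a single step.

```python
import os
import tempfile

def delete_record(path, key):
    new_path = path + ".new"
    with open(path, encoding="utf-8") as old, \
         open(new_path, "w", encoding="utf-8") as new:
        for line in old:
            # Copy every record except the one whose key matches
            if line.split(",", 1)[0] != key:
                new.write(line)
    # The new file takes the place of the original
    os.replace(new_path, path)

path = os.path.join(tempfile.mkdtemp(), "customers.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("A1,Jones\nB2,Smith\nC3,Patel\n")
delete_record(path, "B2")
```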
What is an indexed sequential file?
Records within indexed sequential files are split into two components: the index, and the bulk of the record. Each index entry holds the record key and a pointer to where the rest of the record is stored on the disk. When copying the file to add or delete a record, only the index entries need to be copied, and the rest of the record can remain where it is.
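A minimal sketch of the idea, using an in-memory byte stream in place of the disk. The index layout (key, offset, length) is an assumption made for the example.

```python
import io

data_file = io.BytesIO()  # stands in for the record bodies stored on disk
index = []                # list of (key, offset, length), kept in key order

def add_record(key, body):
    data_file.seek(0, io.SEEK_END)
    offset = data_file.tell()
    data_file.write(body)  # the body is written wherever there is space
    index.append((key, offset, len(body)))
    index.sort()           # only the small index is put back in key order

def read_record(key):
    for k, offset, length in index:
        if k == key:
            data_file.seek(offset)  # follow the pointer to the record body
            return data_file.read(length)
    return None

def delete_record(key):
    # Only the index entry is removed; the stored body is left where it is
    index[:] = [entry for entry in index if entry[0] != key]

add_record(30, b"Patel")
add_record(10, b"Jones")
add_record(20, b"Smith")
delete_record(10)
```

The bodies sit in arrival order (Patel, Jones, Smith) while the index stays in key order, so adding or deleting a record never moves the bulk of the data.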
What is multilevel indexing?
When the top level index of a record points to the second level index of the record, which contains pointers to the rest of the record. As many levels as needed can be used. This is useful when different clusters of records are stored in different locations.
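A two-level index might look something like the sketch below. The block names, keys and location strings are all hypothetical; the top level stores the lowest key held in each second level index.

```python
import bisect

# Second level indexes: each maps record keys to a (hypothetical) storage location
second_level = {
    "block_A": {100: "disk1:0", 150: "disk1:64"},
    "block_B": {200: "disk2:0", 275: "disk2:64"},
}

# Top level index: the lowest key in each second level index, in key order
top_keys = [100, 200]
top_pointers = ["block_A", "block_B"]

def locate(key):
    # Find the last top level entry whose lowest key is <= the key wanted
    pos = bisect.bisect_right(top_keys, key) - 1
    if pos < 0:
        return None  # key is smaller than anything in the file
    # Follow the pointer to the second level index and look the key up there
    return second_level[top_pointers[pos]].get(key)
```

Only one second level index has to be searched for any key, which keeps lookups short even when the clusters of records are held in different places.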
What is data validation?
Validation checks data inputted by a user before it is committed to storage. It is used to ensure that input provided by the user is possible and/or sensible.
What are the types of validation?
- type check: ensures inputs are of the correct data type
- range check: ensures data falls within specified range
- presence check: ensures data has been entered into a field
- format check: ensures data conforms to a certain format e.g. dates as dd/mm/yy
- length check: ensures inputs are within a certain length
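Each check can be written as a short function. This is a sketch only: the function names are made up, the type check assumes a whole number is expected, and the format check assumes the dd/mm/yy pattern mentioned above.

```python
import re

def type_check(value):
    # True if the text entered can be read as a whole number
    try:
        int(value)
        return True
    except ValueError:
        return False

def range_check(value, low, high):
    # True if the value lies between low and high inclusive
    return low <= value <= high

def presence_check(value):
    # True if something other than blank space was entered
    return value.strip() != ""

def format_check(value):
    # True if the text has the pattern dd/mm/yy (two digits, slash, two digits, slash, two digits)
    return re.fullmatch(r"[0-9]{2}/[0-9]{2}/[0-9]{2}", value) is not None

def length_check(value, min_len, max_len):
    # True if the number of characters is within the allowed limits
    return min_len <= len(value) <= max_len
```

The format check only confirms the pattern, so an impossible date such as 99/99/99 would still pass; a range check on the day and month would be needed as well.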