Data storage Flashcards
Syllabus: 1.1.3
What is data validation?
List 8 different validation checks
Validation means checking carried out by the computer to make sure that input data are within the limits of what a user might reasonably enter.
- Type check (e.g. to only accept numerals)
- Range check (e.g. must be >0 and <100)
- Limit check (e.g. date of birth must not be numerically greater than today’s date)
- Length check (e.g. password must be longer than 8 characters)
- Character check (e.g. :!* not accepted for file name)
- Format check (e.g. checking ID codes)
- Presence check (e.g. email address required)
- Consistency check (checking that different fields in the same record correspond correctly)
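The eight checks above can be sketched in code. This is a minimal illustration, not a standard routine: the function name `failed_checks`, the record fields, and the ID pattern (two capital letters followed by four digits) are all assumptions made for the example.

```python
import re
from datetime import date


def failed_checks(record, today):
    """Return the names of the validation checks that a record fails.

    The field names and the ID pattern are hypothetical.
    """
    failed = []

    # Presence check: an email address is required.
    if not record["email"]:
        failed.append("presence")

    # Type check: age must contain numerals only.
    age_is_numeric = re.fullmatch(r"[0-9]+", record["age"]) is not None
    if not age_is_numeric:
        failed.append("type")
    # Range check: age must be >0 and <100.
    elif not 0 < int(record["age"]) < 100:
        failed.append("range")

    # Limit check: date of birth must not be later than today.
    if record["dob"] > today:
        failed.append("limit")

    # Length check: password must be longer than 8 characters.
    if len(record["password"]) <= 8:
        failed.append("length")

    # Character check: the characters : ! * are not accepted in a file name.
    if any(ch in ":!*" for ch in record["filename"]):
        failed.append("character")

    # Format check: assumed ID pattern of two capital letters then four digits.
    if re.fullmatch(r"[A-Z]{2}[0-9]{4}", record["id_code"]) is None:
        failed.append("format")

    # Consistency check: the age field must agree with the date of birth.
    if age_is_numeric and record["dob"] <= today:
        dob = record["dob"]
        had_birthday = (today.month, today.day) >= (dob.month, dob.day)
        expected_age = today.year - dob.year - (0 if had_birthday else 1)
        if int(record["age"]) != expected_age:
            failed.append("consistency")

    return failed
```

An empty list means the record passed every check; otherwise the list names the checks that failed, in the order they were run.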
What is data verification?
List two types of data verification to minimise copying errors by humans
Data verification means double-checking for accuracy when data is copied.
- Visual check - a single operator reads through a document, comparing it with the original
- Double entry - the data is entered twice, either by two different operators or by the same operator, e.g. when entering a new password
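Double entry can be sketched as a comparison of the two typed versions. The helper name `differing_positions` is hypothetical; it simply reports where the two entries disagree so that the operator can re-check those characters.

```python
from itertools import zip_longest


def differing_positions(first_entry, second_entry):
    """Return the indexes at which two typed entries differ."""
    return [
        index
        for index, (a, b) in enumerate(zip_longest(first_entry, second_entry))
        if a != b  # a missing character (None) also counts as a difference
    ]
```

An empty result means the two entries match and the data can be accepted.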
List 4 methods of error detection and correction
- Parity checks
- Check digits
- Checksums
- Automatic Repeat reQuests (ARQ)
Describe how a parity check works
- A parity bit is added at an agreed position in each byte to make the total number of 1’s in the byte either odd or even
- Odd parity: odd number of 1’s, e.g. 11001110
- Even parity: even number of 1’s, e.g. 11011110
- If the receiver device checks the parity of received data and it has the wrong parity, it may re-read the byte that was sent, output an error message and/or request re-transmission.
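A parity check can be sketched as follows, assuming that the agreed position for the parity bit is the last bit of the byte and that bytes are written as strings of `0` and `1`. The function names are illustrative only.

```python
def add_parity_bit(data_bits, even=True):
    """Append a parity bit to a 7-bit string such as '1101111'."""
    ones = data_bits.count("1")
    if even:
        parity = ones % 2        # 1 only if the count of 1s is currently odd
    else:
        parity = (ones + 1) % 2  # 1 only if the count of 1s is currently even
    return data_bits + str(parity)


def has_correct_parity(byte_bits, even=True):
    """Check a received byte against the agreed parity."""
    ones_are_even = byte_bits.count("1") % 2 == 0
    return ones_are_even if even else not ones_are_even
```

For example, `add_parity_bit("1101111", even=True)` gives `11011110`, the even-parity byte shown above. Flipping any single bit of that byte makes `has_correct_parity` return `False`, but two flipped bits cancel out and go undetected.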
Describe how a check digit works
- A check digit is an extra digit that is calculated from all the original digits using an algorithm in order to summarise them
- The algorithm used depends on the type of code, but relies on assigning a numerical ‘weight’ to each original digit that depends on its position in the number
- The check digit checks for three types of error reliably:
  - two adjacent digits transposed
  - an incorrect digit entered
  - an omitted or extra digit
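As one concrete example of a weighted check digit, the sketch below uses the ISBN-10 scheme, in which the first nine digits are weighted 10 down to 2 and the check digit makes the weighted total divisible by 11. The choice of ISBN-10 is an assumption for illustration; other codes use different weights and moduli.

```python
def isbn10_check_digit(first_nine):
    """Calculate the ISBN-10 check digit for a string of nine digits."""
    weights = range(10, 1, -1)  # 10, 9, ..., 2
    total = sum(weight * int(digit) for weight, digit in zip(weights, first_nine))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)  # 'X' stands for the value 10


def isbn10_is_valid(code):
    """Return True if a 10-character code carries the correct check digit."""
    if len(code) != 10 or not code[:9].isdigit():
        return False  # catches an omitted or extra digit
    return isbn10_check_digit(code[:9]) == code[9]
```

With the valid code `0306406152`, swapping two adjacent digits (`0360406152`) or changing one digit (`0306406162`) makes `isbn10_is_valid` return `False`.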
Describe how a checksum works
- A checksum is a way of summarising a block of data to check that it is not corrupted
- Arithmetic is applied to the elements of the block (e.g. the sum of all numerical values)
- The sum is reduced to a standard number of digits and transmitted with the block
- The same calculation is performed by the receiving device and the result is compared with the received checksum
- If the checksums do not match, the data is rejected; otherwise, the integrity of the data is taken to be maintained
- Cryptography can be used to try to prevent someone from maliciously substituting different data with the same checksum
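The steps above can be sketched with a very simple checksum: the sum of all byte values in the block, reduced to a single byte by taking it modulo 256. The function names and the one-byte size are assumptions for the example; real protocols use a variety of algorithms.

```python
def checksum(block):
    """Sum the byte values of the block and reduce the result to one byte."""
    return sum(block) % 256


def make_frame(block):
    """Sender: transmit the block followed by its checksum byte."""
    return block + bytes([checksum(block)])


def check_frame(frame):
    """Receiver: recompute the checksum and compare it with the one received.

    Returns the block if the checksums match, or None if it is rejected.
    """
    block, received = frame[:-1], frame[-1]
    return block if checksum(block) == received else None
```

A sum this simple is easy to defeat: two corrupted bytes whose changes cancel out give the same checksum, which is why stronger algorithms are used where deliberate tampering is a concern.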
Describe how an Automatic Repeat reQuest (ARQ) works
- Also called Automatic Repeat Query
- An error-control protocol that automatically initiates a call to retransmit any data packet or frame after receiving flawed or incorrect data.
- When the transmitting device fails to receive an acknowledgement signal confirming that the data has been received, it usually retransmits the data after a predefined timeout. It repeats this process until it receives the acknowledgement or reaches a predetermined number of attempts.
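The retransmission loop can be sketched as below. This is a simplified simulation, not a real network protocol: `transmit` is a hypothetical function that sends the packet and returns `True` if an acknowledgement arrives before the timeout, and `max_attempts` stands for the predetermined retry limit.

```python
def send_with_arq(packet, transmit, max_attempts=3):
    """Send a packet, retransmitting until it is acknowledged.

    Returns the number of attempts used, or None if every attempt
    timed out without an acknowledgement.
    """
    for attempt in range(1, max_attempts + 1):
        acknowledged = transmit(packet)  # False models a timeout
        if acknowledged:
            return attempt
    return None
```

For instance, if the first two transmissions time out and the third is acknowledged, the function returns `3`; with `max_attempts=2` it would give up and return `None`.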
What are the following file formats?
- MIDI
- jpeg
- MP3
- MP4
- MIDI - Musical Instrument Digital Interface, used by musicians when creating songs
- jpeg - Joint Photographic Experts Group, a lossy form of compression for colour images
- MP3 - lossy form of audio compression
- MP4 - video format
What is data compression?
- Storing data in a format that requires less space than usual
- Data compression is particularly useful in communications because it enables devices to transmit or store the same amount of data in fewer bits
- Compression can be either lossy or lossless
What is lossy compression?
- This type of compression applies an algorithm that deletes ‘unnecessary’ bits of data to reduce the file size
- Quality of the file is reduced, e.g. poor picture quality
- The original file cannot be recreated from the compressed file
- Creates a smaller file size than lossless compression, so it takes up less space and transmission is faster
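A toy illustration of the idea, assuming the data is a list of 8-bit sample values (0 to 255): the four least significant bits of each sample are treated as ‘unnecessary’ and thrown away. Real lossy formats such as JPEG and MP3 use far more sophisticated methods; the function names here are hypothetical.

```python
def lossy_compress(samples):
    """Keep only the top 4 bits of each 8-bit sample."""
    return [sample >> 4 for sample in samples]


def lossy_decompress(codes):
    """Rebuild approximate 8-bit samples; the discarded bits cannot be recovered."""
    return [code << 4 for code in codes]
```

Compressing `[200, 201, 37, 255]` and decompressing the result gives `[192, 192, 32, 240]`: close to the original, but the exact values are lost.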
What is lossless compression?
- This type of compression applies an algorithm to represent the same data in a smaller number of bits, e.g. by replacing a string of ten repeated digits with a command to repeat the digit ten times.
- Quality of the file is maintained
- The original file can be recreated from the compressed file as no data is lost
- Creates a larger file size than lossy compression, so it takes up more space and transmission is slower
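The repeated-digit example above describes run-length encoding, which can be sketched as follows. The function names and the use of (character, count) pairs are choices made for this example, not a specific file format.

```python
def rle_encode(text):
    """Replace each run of repeated characters with a (character, count) pair."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((ch, 1))              # start a new run
    return runs


def rle_decode(runs):
    """Recreate the original text exactly from the list of runs."""
    return "".join(ch * count for ch, count in runs)
```

Encoding `"7777777777"` gives `[("7", 10)]`, and decoding always returns the original text, so no data is lost. On data with few repeated runs this encoding can come out larger than the input.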