1.3 Exchanging Data Flashcards
What does compression do?
Reduce requirements on file storage
Reduce download times
Make best use of bandwidth
What are the two types of compression?
Lossy
Lossless
What is lossy compression?
Type of compression that removes data to reduce the file size, stripping out the least important data
Can the original be recreated in lossy compression?
No because detail is removed
What is lossy compression typically used for?
Multimedia files
What are examples of lossy compression?
JPEG
MP3
MPEG
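GIF is often listed here, but its LZW compression is lossless (detail is lost only when an image is reduced to GIF’s 256-colour palette)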
What is lossless compression?
Type of compression to reduce the file size where data is not lost
Typically less effective at reducing file size
Can the original be recreated with lossless compression?
Yes
What is lossless compression used for?
Essential for file types like computer programs
What are some examples of lossless compression?
ZIP
PNG
What are two methods of lossless compression?
Run Length Encoding (RLE)
Dictionary Based Encoding
How does RLE work?
Finds runs of a repeated binary pattern and replaces each run with a single instance of the pattern and a number that specifies how many times the pattern is repeated
Is a real-life image suitable for RLE?
No because a real-life image has too much detail, so long runs of identical pixels are rare
Does RLE have to be used on image data?
No
How are runs typically encoded with RLE?
Two bytes
One byte for the pattern, one byte for the number of repetitions in a run
What is a disadvantage of RLE?
Only achieves significant reductions in file size if there are long runs of data
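A minimal sketch of the two-byte scheme described above (one byte for the pattern, one byte for the run length). The function names and sample data are illustrative assumptions; runs are capped at 255 so that the count fits in a single byte.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode each run as a (pattern byte, count byte) pair."""
    out = bytearray()
    i = 0
    while i < len(data):
        pattern = data[i]
        run = 1
        # The count must fit in one byte, so a run is capped at 255.
        while i + run < len(data) and data[i + run] == pattern and run < 255:
            run += 1
        out += bytes([pattern, run])
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Expand each (pattern, count) pair back into a run."""
    out = bytearray()
    for j in range(0, len(encoded), 2):
        pattern, run = encoded[j], encoded[j + 1]
        out += bytes([pattern]) * run
    return bytes(out)


long_runs = b"AAAAAAAABBBBBBBBBBBB"  # 8 x A then 12 x B: 20 bytes
no_runs = b"ABCDEFGH"                # no repeats: 8 bytes

packed = rle_encode(long_runs)       # 4 bytes: A,8,B,12
bloated = rle_encode(no_runs)        # 16 bytes: every byte gains a count
```

The second input shows the disadvantage: with no long runs, the encoded data is twice the size of the original.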
How does dictionary based encoding work?
Searching for matches between the text to be compressed and a set of strings contained in a data structure (dictionary) maintained by the coder
When the encoder finds a match, it substitutes a reference to the string’s position in the dictionary
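A simplified word-level sketch of the idea, assuming the dictionary is a list of words built up as the text is read. The function names are hypothetical, and real dictionary coders (for example LZW) work on byte strings and are more sophisticated.

```python
def dict_encode(text: str):
    """Replace each word with its position in a dictionary of words seen so far."""
    dictionary = []   # position in this list is the reference
    positions = {}    # word -> position, for fast matching
    refs = []
    for word in text.split(" "):
        if word not in positions:          # no match: add a new entry
            positions[word] = len(dictionary)
            dictionary.append(word)
        refs.append(positions[word])       # substitute the reference
    return dictionary, refs


def dict_decode(dictionary, refs):
    """Look each reference up in the dictionary to rebuild the text."""
    return " ".join(dictionary[r] for r in refs)


message = "the cat sat on the mat the cat ran"
dictionary, refs = dict_encode(message)
# dictionary: ['the', 'cat', 'sat', 'on', 'mat', 'ran']
# refs:       [0, 1, 2, 3, 0, 4, 0, 1, 5]
```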
What is plaintext?
The data that is being stored or is going to be transferred
The data to be encrypted
What is ciphertext?
The encrypted text
What is a cipher?
Algorithm used to encrypt the data
What is a key?
Data that is used within the cipher
What is decryption?
Converting the ciphertext back into the original plaintext
What is encryption?
Process of converting a message from plaintext into ciphertext using a cipher so it cannot be understood if the message is intercepted
How does symmetric encryption work?
Sender and receiver use the same key to encrypt and decrypt data
Is symmetric encryption faster than asymmetric encryption?
Yes because it uses less complex mathematical operations
What is a disadvantage of symmetric encryption?
Key has to be exchanged across a network, which could be intercepted by an attacker
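A toy sketch of symmetric encryption, assuming a simple repeating-key XOR cipher (chosen only because it is short, it is not secure). The same `shared_key` encrypts and decrypts, which is why it has to reach the receiver somehow.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR every byte with the repeating key; applying it twice restores the data."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


shared_key = b"secret"                 # must be known to sender AND receiver
plaintext = b"Meet at noon"

ciphertext = xor_cipher(plaintext, shared_key)   # sender encrypts
recovered = xor_cipher(ciphertext, shared_key)   # receiver decrypts with the same key
```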
What is asymmetric encryption?
A different key is used to encrypt and decrypt the data (public and private key pair)
Public key used to encrypt the message, which can be known by anyone, but cannot be used to decrypt the message
How does asymmetric encryption work?
Public key is used to encrypt the message, which can be known to anyone. Therefore recipient can send their public key to a device that wants to send them data, which is used to encrypt the data
Public key CANNOT be used to decrypt the message so can therefore be sent over the internet
Private key is NOT transferred over the internet and is only known by the recipient and is used to decrypt the message
Neither key can be used to determine the other; together they are known as a key pair
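A toy sketch of a public/private key pair, assuming textbook RSA with tiny primes (far too small for real use, and real systems add padding). All the numbers are illustrative; `pow(e, -1, m)` needs Python 3.8 or later.

```python
# Recipient builds a key pair from two small primes (illustrative values only).
p, q = 61, 53
n = p * q                            # 3233, shared by both keys
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (modular inverse)

public_key = (e, n)    # can be sent to anyone
private_key = (d, n)   # never leaves the recipient

message = 65                                 # plaintext as a number smaller than n
ciphertext = pow(message, *public_key)       # sender encrypts with the PUBLIC key
decrypted = pow(ciphertext, *private_key)    # recipient decrypts with the PRIVATE key

# Applying the public key a second time does NOT undo the encryption.
public_again = pow(ciphertext, *public_key)
```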
What are the problems with asymmetric encryption?
Public key is widely known, so any device can use it to encrypt data and pretend to be the sender (so it is useful to authenticate the sender as well)
How does authentication work?
Sender encrypts the data using their PRIVATE key, then encrypts that ciphertext using the recipient’s PUBLIC key
Data is then sent
Recipient decrypts the received data using their PRIVATE key, leaving the ciphertext that was produced with the sender’s PRIVATE key
Recipient decrypts that ciphertext using the sender’s PUBLIC key
Data can now be read and transfer has been authenticated
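The four steps above can be sketched with two toy RSA key pairs. This is an illustration under assumptions: the primes are tiny, the helper names are hypothetical, and the sender's modulus is deliberately smaller than the recipient's so the inner ciphertext fits.

```python
def make_key_pair(p, q, e=17):
    """Return (public_key, private_key) for toy RSA built from primes p and q."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))   # Python 3.8+
    return (e, n), (d, n)


def apply_key(value, key):
    """Encrypt or decrypt a number with the given (exponent, modulus) key."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


sender_public, sender_private = make_key_pair(61, 53)        # n = 3233
recipient_public, recipient_private = make_key_pair(89, 97)  # n = 8633

message = 1234
# 1. Sender encrypts with their PRIVATE key, then with the recipient's PUBLIC key.
inner = apply_key(message, sender_private)
sent = apply_key(inner, recipient_public)
# 2. Recipient decrypts with their PRIVATE key, leaving the inner ciphertext.
inner_received = apply_key(sent, recipient_private)
# 3. Recipient decrypts that with the sender's PUBLIC key.
received = apply_key(inner_received, sender_public)
```

Only the holder of the sender's private key could have produced an inner ciphertext that the sender's public key turns back into a readable message.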
What does SSL stand for?
Secure Sockets Layer
What is the SSL handshake?
HTTPS protocol encrypts communication between web browser and web server in both directions
Creates encrypted link between server and a client
How does the SSL handshake work?
Browser sends HTTPS request to web server it wants to communicate with
Web server sends back a digital certificate containing its public key
Symmetric session key generated on browser and encrypted using the server’s public key
This encrypted session key is sent to the server
Server retrieves symmetric session key by decrypting the data received using its private key
Symmetric session key is key that will be used to encrypt and decrypt data during the transfer
Symmetric session key cannot be decrypted during transfer by anyone that intercepts it
Allows for fast data transfer over the internet
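A heavily simplified simulation of the key exchange described above, assuming toy RSA for the server's key pair and a repeating-key XOR as the symmetric cipher. Certificate checking is left out, and every name and value here is illustrative rather than how a real TLS library works.

```python
import secrets
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: the same key encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Server's key pair (toy RSA); the public part travels in its certificate.
p, q, e = 61, 53, 17
n = p * q
server_public = (e, n)
server_private = (pow(e, -1, (p - 1) * (q - 1)), n)   # Python 3.8+

# Browser: generate a session key and encrypt it with the server's public key.
session_key = 2 + secrets.randbelow(n - 2)            # random number in [2, n - 1]
encrypted_session_key = pow(session_key, *server_public)

# Server: recover the session key with its private key.
server_session_key = pow(encrypted_session_key, *server_private)

# Both sides now use the same symmetric key for the actual data.
request = b"GET /account"
ciphertext = xor_cipher(request, session_key.to_bytes(2, "big"))
received = xor_cipher(ciphertext, server_session_key.to_bytes(2, "big"))
```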
What is a hash function?
Generates an output (hash value) for an input (key), which is always the same for each key
With the key, you can calculate the hash value
With a hash value, you can’t calculate the key
What is a hash value?
A string of characters of a fixed size which should ideally be different for each key
What is a collision?
When two different keys produce the same hash value
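A short sketch of these properties using Python's standard `hashlib` (SHA-256 is chosen here as an example). `toy_hash` is a deliberately weak, made-up function used only to show a collision.

```python
import hashlib


def sha256_hex(key: str) -> str:
    """Hash value of a key as a fixed-size hexadecimal string."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def toy_hash(key: str) -> int:
    """Weak illustrative hash: sum of character codes, modulo 10."""
    return sum(ord(c) for c in key) % 10


same_each_time = sha256_hex("password") == sha256_hex("password")
short_len = len(sha256_hex("a"))            # 64 hex characters
long_len = len(sha256_hex("a" * 10_000))    # still 64 hex characters

# "ab" and "ba" contain the same characters, so toy_hash gives both the same value.
collision = toy_hash("ab") == toy_hash("ba")
```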
How can hashing algorithms be used for integrity validation?
Generate a hash called a checksum that’s appended to the end of the message being sent
Makes sure that the message has been issued by the right person and not tampered with before reaching its destination
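A minimal sketch of appending a checksum and re-checking it on arrival, assuming SHA-256 as the hashing algorithm. The function names are hypothetical. Note that a plain hash like this only detects changes; confirming who sent the message also needs a key (for example a keyed hash or a digital signature).

```python
import hashlib

DIGEST_SIZE = hashlib.sha256().digest_size   # 32 bytes


def add_checksum(message: bytes) -> bytes:
    """Append the hash of the message to the end of the message."""
    return message + hashlib.sha256(message).digest()


def verify(packet: bytes) -> bool:
    """Recalculate the hash on arrival and compare it with the appended one."""
    message, checksum = packet[:-DIGEST_SIZE], packet[-DIGEST_SIZE:]
    return hashlib.sha256(message).digest() == checksum


packet = add_checksum(b"Pay 100")
tampered = b"Pay 900" + packet[7:]   # message altered, old checksum kept

intact_ok = verify(packet)       # True
tampered_ok = verify(tampered)   # False
```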
How can hashing algorithms be used?
Barcodes and ISBN book numbers use a similar approach called check digit
CVV number of a credit card is a form of checksum used to validate credit cards
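As a worked example of a check digit, here is a sketch of the ISBN-13 calculation: the first 12 digits are weighted 1, 3, 1, 3, ... and the check digit brings the total up to a multiple of 10. The function names are illustrative.

```python
def isbn13_check_digit(first12: str) -> int:
    """Calculate the 13th digit from the first 12 digits."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Check that the last digit matches the one calculated from the rest."""
    digits = isbn.replace("-", "")
    if len(digits) != 13 or not digits.isdigit():
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])


check = isbn13_check_digit("978030640615")     # 7
good = is_valid_isbn13("978-0-306-40615-7")    # True
typo = is_valid_isbn13("978-0-306-40615-8")    # False: wrong final digit
```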
What is a database?
Persistent organised store of data
What does a persistent database mean?
Data stored on secondary storage device and is non-volatile
What does an organised database mean?
Data organised into records and fields
What is a relational database?
A database with multiple tables linked by primary key-foreign key relationships
What is a flat file database?
A database with a single table with information about a single entity
What is a primary key?
Unique identifier for each record in a table
What is a foreign key?
Field in a table that links to the primary key field in another table
What is an attribute in a database?
The different fields within a database
What is a record in a database?
A row within a database
What is a composite key?
Use multiple (two or more) fields to create a unique identifier for a record
What is data integrity?
The accuracy of the data within the database
What is data redundancy?
Duplicating data in multiple places in the database
What is a secondary key?
A field that will be used to search the table often
What is an entity?
Category of object, person, event or thing of interest about which data needs to be recorded
What is SQL?
Structured Query Language
Standard language used to query a relational database
How do you SELECT from a database?
SELECT field1, field2
FROM table_name
WHERE filtering criteria
ORDER BY fields ASC/DESC
How do you INSERT INTO a database?
INSERT INTO table_name (field1, field2)
VALUES (value1, value2)
How do you UPDATE a database?
UPDATE table_name
SET field1 = value1, field2 = value2
WHERE filtering criteria
How do you DELETE from a database?
DELETE
FROM table_name
WHERE filtering criteria
How do you create a table in a database?
CREATE TABLE IF NOT EXISTS table_name
(field1 TEXT NOT NULL PRIMARY KEY,
field2 INTEGER NOT NULL, field3 FLOAT)
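The CREATE TABLE, INSERT, UPDATE, DELETE and SELECT patterns above can be tried out with Python's built-in `sqlite3` module. This is a sketch using an in-memory database; the table name, field names and values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # temporary database held in RAM

conn.execute("""CREATE TABLE IF NOT EXISTS student_table
                (student_id INTEGER NOT NULL PRIMARY KEY,
                 surname TEXT NOT NULL, mark FLOAT)""")

# INSERT INTO ... VALUES, with ? placeholders for the values
conn.executemany(
    "INSERT INTO student_table (student_id, surname, mark) VALUES (?, ?, ?)",
    [(1, "Khan", 72.5), (2, "Lopez", 58.0), (3, "Adams", 91.0)],
)

# UPDATE ... SET ... WHERE
conn.execute("UPDATE student_table SET mark = ? WHERE student_id = ?", (60.0, 2))

# DELETE FROM ... WHERE
conn.execute("DELETE FROM student_table WHERE student_id = ?", (1,))

# SELECT ... FROM ... WHERE ... ORDER BY
rows = conn.execute("""SELECT surname, mark
                       FROM student_table
                       WHERE mark >= 60
                       ORDER BY mark DESC""").fetchall()
# rows: [('Adams', 91.0), ('Lopez', 60.0)]
```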
How do you create a table with a foreign key and a composite key? (using the example of student, course, enrolled)
CREATE TABLE IF NOT EXISTS enrolled_table
(date_started TEXT NOT NULL, student_id INTEGER NOT NULL, course_code INTEGER NOT NULL,
PRIMARY KEY(student_id, course_code),
FOREIGN KEY(student_id) REFERENCES student_table(student_id) ON DELETE CASCADE,
FOREIGN KEY(course_code) REFERENCES course_table(course_code) ON DELETE CASCADE)
How do you SELECT data from three tables? (using an example of appointment, doctor, patient)
SELECT patient_table.firstname, patient_table.surname, patient_table.contact, doctor_table.firstname, doctor_table.surname
FROM appointment_table
JOIN patient_table
ON patient_table.patient_id = appointment_table.patient_id
JOIN doctor_table
ON doctor_table.doctor_id = appointment_table.doctor_id
WHERE appointment_table.appointment_id = 1
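A runnable sketch of the three-table query above using `sqlite3`. The table and field names follow the flashcard; the column types and sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient_table (patient_id INTEGER PRIMARY KEY,
                            firstname TEXT, surname TEXT, contact TEXT);
CREATE TABLE doctor_table (doctor_id INTEGER PRIMARY KEY,
                           firstname TEXT, surname TEXT);
CREATE TABLE appointment_table (appointment_id INTEGER PRIMARY KEY,
    patient_id INTEGER REFERENCES patient_table(patient_id),
    doctor_id INTEGER REFERENCES doctor_table(doctor_id));
INSERT INTO patient_table VALUES (1, 'Ana', 'Silva', '07000 000000');
INSERT INTO patient_table VALUES (2, 'Tom', 'Reid', '07000 000001');
INSERT INTO doctor_table VALUES (1, 'Ben', 'Okoro');
INSERT INTO appointment_table VALUES (1, 1, 1);
INSERT INTO appointment_table VALUES (2, 2, 1);
""")

# The appointment row supplies the foreign keys that link patient and doctor.
rows = conn.execute("""
    SELECT patient_table.firstname, patient_table.surname, patient_table.contact,
           doctor_table.firstname, doctor_table.surname
    FROM appointment_table
    JOIN patient_table
      ON patient_table.patient_id = appointment_table.patient_id
    JOIN doctor_table
      ON doctor_table.doctor_id = appointment_table.doctor_id
    WHERE appointment_table.appointment_id = ?""", (1,)).fetchall()
# rows: [('Ana', 'Silva', '07000 000000', 'Ben', 'Okoro')]
```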
What are the different relationships between two entities?
One-to-one
One-to-many/many-to-one
Many-to-many
What is an example of a one-to-one relationship?
Husband — Wife
What is an example of a one-to-many relationship?
Mother —<- Child
What is a many-to-many relationship?
Not allowed
Have to be split up using additional entity
What does a split up many-to-many relationship look like?
Patient —<- Appointment ->— Doctor
Are entities singular or plural?
Singular
What does ERD stand for?
Entity Relationship Diagram
What is an ERD?
Represents all the entities/tables with their attributes/fields and the relationships between entities
What should a structure of a database enable a user to do?
Enable a user to enter as many or as few records as required so make sure there are no many-to-many relationships
What is normalisation?
Splitting up data in databases and arranging the data to be in 1st, 2nd and 3rd normal form
What is the purpose of normalisation?
To make a database design more efficient and easier to maintain
Removing unneeded and redundant data - as redundant data takes up storage and may make searching longer
Organising the data in a logical structure so all data in tables is related
What is 1st normal form?
Each data item cannot be broken down any further (ATOMIC)
Each row is unique (has a primary key)
Should be no repeating groups of attributes
What is 2nd normal form?
Be in 1NF
Every non-key attribute must depend fully on the primary key (depend on each part of the primary key) - NO PARTIAL DEPENDENCIES
What is 3rd normal form?
Be in 2NF
All non-primary key attributes must be dependent on only the primary key (must not be dependent on any other non-key attributes)
What is data integrity?
Accuracy and reliability of data
What is data redundancy?
Where data is stored multiple times
May happen in a flat file database but shouldn’t happen in a relational database that’s correctly designed
The more redundant data there is, the more storage is used and the greater the chance of a lack of data integrity
What is referential integrity?
Adds constraints to the data updated, deleted and entered into a relational database to ensure the data is as accurate as possible
How can referential integrity affect when data is entered into a database?
Could not add a record with a foreign key value if no record in the linked table has that value as its primary key
PRAGMA foreign_keys = ON (turns on foreign key enforcement in SQLite)
How does referential integrity affect when data is deleted from a database?
Deletes all records in linked tables whose foreign key refers to that primary key
ON DELETE CASCADE
How does referential integrity affect when a primary key of a record is updated in a database?
Changes all instances of that primary key in all tables
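The three effects above (on entering, deleting and updating data) can be sketched in SQLite through Python's `sqlite3` module. The tables are a cut-down version of the student/enrolled example. `ON UPDATE CASCADE` is an assumption added here so that the primary key change propagates, and the sample data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite does not enforce foreign keys by default

conn.execute("""CREATE TABLE student_table
                (student_id INTEGER NOT NULL PRIMARY KEY, surname TEXT NOT NULL)""")
conn.execute("""CREATE TABLE enrolled_table
                (student_id INTEGER NOT NULL, course_code INTEGER NOT NULL,
                 PRIMARY KEY(student_id, course_code),
                 FOREIGN KEY(student_id) REFERENCES student_table(student_id)
                     ON DELETE CASCADE ON UPDATE CASCADE)""")
conn.execute("INSERT INTO student_table VALUES (1, 'Khan')")
conn.execute("INSERT INTO enrolled_table VALUES (1, 101)")

# Entering data: there is no student 99, so this enrolment is refused.
try:
    conn.execute("INSERT INTO enrolled_table VALUES (99, 101)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Updating a primary key: the change cascades to the linked enrolment.
conn.execute("UPDATE student_table SET student_id = 5 WHERE student_id = 1")
after_update = conn.execute(
    "SELECT student_id, course_code FROM enrolled_table").fetchall()

# Deleting: the student's enrolments are deleted too.
conn.execute("DELETE FROM student_table WHERE student_id = 5")
after_delete = conn.execute("SELECT COUNT(*) FROM enrolled_table").fetchone()[0]
```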
What are the problems with carrying out operations on a database?
Multiple users trying to change data at the same time
A transaction being partly completed but not fully completed
What is ACID?
Rules that should be followed to maintain consistency within a database
What does ACID stand for?
Atomicity
Consistency
Isolation
Durability
What is atomicity?
All or nothing
Requires that a transaction is processed in its entirety or not at all
In any situation, including power cuts or hard disk crashes, it is not possible to process only part of a transaction
If any part of the transaction fails, roll it back and don’t complete any of it
What is an example of atomicity?
Bank transfers
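A sketch of an all-or-nothing bank transfer using `sqlite3`. The table, names and amounts are made up. The `CHECK (balance >= 0)` rule is an assumed validation rule that makes the second half of an oversized transfer fail, so the first half has to be rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE account
                (name TEXT PRIMARY KEY,
                 balance INTEGER NOT NULL CHECK (balance >= 0))""")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, sender, recipient, amount):
    """Move money between accounts; both updates happen or neither does."""
    try:
        with conn:   # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, recipient))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, sender))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


failed = transfer(conn, "alice", "bob", 500)   # debit would break the CHECK rule
after_failed = balances(conn)                  # unchanged: the credit was rolled back
succeeded = transfer(conn, "alice", "bob", 30)
after_success = balances(conn)
```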
What is consistency?
Ensures that no transaction can violate any of the defined validation rules
Referential integrity, specified when the database is set up, will always be upheld
Cannot process a transaction that would break the rules we’ve set up on the database
What is an example of consistency?
Referential integrity
Validation
- length check
- NOT NULL
- range check
- type check
- format check
What is isolation?
Ensures that concurrent execution of transactions leads to the same result as if the transactions were processed one after the other
Crucial in a multi-user database
IF PROCESSING INSTRUCTIONS CONCURRENTLY, THE OUTCOME SHOULD BE THE SAME AS IF WE WERE PROCESSING THEM SEQUENTIALLY
Record locking and time stamping
What is an example of isolation?
If two people are booking tickets for a show at the same time, the outcome should be the same if they were doing it one after another
What is record locking?
Prevents simultaneous access to records in a database in order to prevent updates being lost or inconsistencies arising in the data
Using record locking, a record is locked when a user retrieves it for editing or updating
Anyone else attempting to retrieve it is denied access until the transaction is completed or cancelled
What is deadlock?
Problem with record locking
If two users are attempting to update two records, a situation can arise in which neither can proceed, because each has locked one record and is waiting for the record locked by the other
What is serialisation?
Ensures transactions do not overlap in time and therefore cannot interfere with each other or lead to updates being lost
What is timestamp ordering?
Serialisation technique
Every record in the database has a READ TIMESTAMP and a WRITE TIMESTAMP
These are updated whenever an object is read or written
When a user tries to save an update, if the READ TIMESTAMP is not the same as it was when they started the transaction, the DBMS knows another user has accessed the same object
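A much-simplified sketch of the timestamp check described above, using a logical counter instead of real clock times. The `Record` and `Transaction` classes are hypothetical and only model a single record; a real DBMS handles this internally.

```python
import itertools

clock = itertools.count(1)   # logical clock: 1, 2, 3, ...


class Record:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0     # READ TIMESTAMP
        self.write_ts = 0    # WRITE TIMESTAMP


class Transaction:
    def __init__(self, record):
        self.record = record
        self.value = record.value
        record.read_ts = next(clock)          # reading updates the read timestamp
        self.seen_read_ts = record.read_ts    # remember what we saw at the start
        self.seen_write_ts = record.write_ts

    def save(self, new_value):
        r = self.record
        # If either timestamp has changed, another user has accessed the record.
        if r.read_ts != self.seen_read_ts or r.write_ts != self.seen_write_ts:
            return False                      # reject: the user must start again
        r.value = new_value
        r.write_ts = next(clock)              # writing updates the write timestamp
        return True


seats = Record(10)
t1 = Transaction(seats)              # user 1 reads 10 seats
t2 = Transaction(seats)              # user 2 reads 10 seats
t2_saved = t2.save(t2.value - 1)     # accepted: nobody accessed it after user 2's read
t1_saved = t1.save(t1.value - 1)     # rejected: user 2 accessed it after user 1's read
t3 = Transaction(seats)              # user 1 starts again and now sees 9 seats
t3_saved = t3.save(t3.value - 1)
```

Without the check, both users would have written 9 and one booking would have been lost; with it, the final value is 8.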
What is durability?
ENSURES THAT ONCE A TRANSACTION HAS BEEN COMMITTED, IT WILL REMAIN SO, EVEN IN THE EVENT OF A POWER CUT
As each part of a transaction is completed, it is held in a buffer on a disk until all elements of the transaction are completed
Only then will the changes to the database tables be made
Ensures that if the user is told a transaction is successful then the changes will actually be committed to the database
Redundancy
What is redundancy?
Many organisations have built-in redundancy in their computer systems
Duplicate hardware, located in different geographical areas, mirrors every transaction that takes place on the main system
If this fails, backup system automatically takes over
What are some examples of redundancy?
RAID setup mirroring data
Having redundant backup hardware
What does DBMS stand for?
Database Management System
What is the DBMS?
Software application that allows a database administrator to maintain one or more relational databases
Hides the complexity of the physical implementation, allowing the administrator to define the database structures at a conceptual or logical level
What are examples of methods of capturing data?
Input forms
Barcodes
Magnetic strip readers
OMR (Optical Mark Reader)
ANPR (Automatic Number Plate Recognition)
OCR (Optical Character Recognition)
Are input forms automatic or manual?
Manual
What are the benefits of input forms?
Drop-down lists, checkboxes etc. can be used to reduce data entry errors
If well designed can be clear to the end user what data they are expected to enter
What are the drawbacks of input forms?
Can be slow to enter large amounts of data
If poorly designed may result in the end user not being clear on what is expected
If suitable validation is not in place then errors in the data can occur
Are barcodes automatic or manual?
Automatic
What can barcodes be used for?
Shopping items
What are the benefits of barcodes?
Faster and easier to ring up items
More accurate
What are the drawbacks of barcodes?
No read/write capabilities
Can be slow as have to scan individually
Are magnetic strip readers automatic or manual?
Automatic
What are the uses of magnetic strip readers?
Bank cards
What are the benefits of magnetic strip readers?
Low costs
Rewriteable data
Quick and easy to use
What are the drawbacks of magnetic strip readers?
Tape can be damaged
Special equipment has to be purchased
Limited storage capacity
Are OMRs automatic or manual?
Automatic
What are the uses of OMRs?
Multiple choice tests
Lottery tickets
Exam questions
What are the benefits of OMRs?
Fast and efficient way of collecting data and inputting it into a database
Significantly reduces human error
Accurate
What are the drawbacks of OMRs?
If marks aren’t dark enough or in the right space, won’t be read correctly
Not suitable for text input
Needs the answers on a prepared form
Are ANPRs automatic or manual?
Automatic
What are the uses of ANPRs?
Car parks
Toll gates
What are the benefits of ANPRs?
Efficient and reliable
Enhanced security
What are the drawbacks of ANPRs?
Privacy concerns
Bad weather can affect accuracy
Are OCRs automatic or manual?
Automatic
What are the uses of OCRs?
Reading postcodes and routine mail
What are the benefits of OCRs?
Automatically reads text by interpreting the shape of letters
Fast and efficient
What are the drawbacks for OCRs?
Doesn’t work as well with handwriting
Sometimes inaccurate
What does JSON stand for?
JavaScript Object Notation
What is JSON?
Open standard and language independent
Text-based
What are the two structures of JSON?
An ordered list of values
A collection of name/value pairs
What are the advantages of JSON?
Easy for humans to understand
Easy for computers to parse so quick to process
What are the disadvantages of JSON?
Supports limited types of data
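A short sketch using Python's standard `json` module, showing the two structures (name/value pairs and an ordered list) and one consequence of the limited data types. The sample data is made up.

```python
import json
from datetime import date

student = {                              # collection of name/value pairs
    "name": "Ana",
    "age": 17,
    "subjects": ["Maths", "Physics"],    # ordered list of values
    "enrolled": True,
    "tutor": None,
}

text = json.dumps(student)    # Python object -> JSON text
parsed = json.loads(text)     # JSON text -> Python object

# JSON has no date type, so the standard encoder refuses a date object.
try:
    json.dumps({"started": date(2024, 9, 1)})
    date_supported = True
except TypeError:
    date_supported = False
```

A common workaround is to store such values as text, for example a date as an ISO-format string.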