Unit 4 Exchanging Data Flashcards
What name is given to the public and private keys used in asymmetric encryption?
Key pair
Name the two categories of compression
Lossy and lossless
In which type of compression is the quality of a file not degraded?
Lossless
What is the purpose of encryption?
To keep data secure during transmission
Name one type of lossless compression
One from:
● Run length encoding
● Dictionary encoding
In which form of encryption do the sender and receiver share the same private key?
Symmetric encryption
How many keys are used in asymmetric encryption?
Two (one public and one private)
If person A wants to send a message to person B using asymmetric encryption, which key should they use to encrypt the message?
B’s public key
A message encrypted with B’s public key can only be decrypted with B’s private key, which only B has access to
What is said to have occurred when two keys map to the same hash?
A collision
In which kind of lossless compression are repeated characters replaced by one occurrence and the number of times to repeat the character?
Run length encoding
What name is given to the process of turning an input into a fixed size value?
Hashing
Which data structure uses hashing to store information with constant lookup time?
Hash table
What is meant by compression?
The process of reducing the space required to store a file
Name two properties that a hashing algorithm should have
Two from:
● Low chance of collision
● Quick to calculate
● Output smaller than input
Lossy compression
Lossy compression reduces the size of a file while also removing some of its information. This could result in a more pixelated image or less clear audio recording.
Lossless compression
With lossless compression, the original file can be recovered exactly from the compressed version. This is not possible with lossy compression, which reduces the size of the file by permanently discarding some information.
Run Length Encoding
Run length encoding is a method of lossless compression in which repeated values are removed and replaced with one occurrence of the data followed by the number of times it should be repeated.
What would the string AAAAAABBBBBCCC be represented as?
A6B5C3
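The worked answer above can be checked with a short sketch. The function names rle_encode and rle_decode are illustrative, and the decoder assumes the original data contains no digits.

```python
from itertools import groupby

def rle_encode(text: str) -> str:
    """Replace each run of a repeated character with the character and its run length."""
    return "".join(f"{char}{len(list(run))}" for char, run in groupby(text))

def rle_decode(encoded: str) -> str:
    """Rebuild the original text. Assumes the original data contained no digits."""
    result = []
    i = 0
    while i < len(encoded):
        char = encoded[i]
        i += 1
        digits = ""
        # Collect every digit that follows the character (runs can exceed 9).
        while i < len(encoded) and encoded[i].isdigit():
            digits += encoded[i]
            i += 1
        result.append(char * int(digits))
    return "".join(result)

encoded = rle_encode("AAAAAABBBBBCCC")
print(encoded)              # A6B5C3
print(rle_decode(encoded))  # AAAAAABBBBBCCC
```

Because decoding restores the exact input, this is a lossless method. Note that data with few repeats can come out longer than the original.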
Dictionary encoding
Dictionary encoding is another example of a method of lossless compression. Frequently occurring pieces of data are replaced with an index and compressed data is stored alongside a dictionary which matches the frequently occurring data to an index. The original data can then be restored using the dictionary.
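A minimal sketch of the idea, assuming the data is text split into words and that every distinct word (not only the frequent ones) is given an index. The function names and the sample sentence are made up for illustration.

```python
def dictionary_encode(text: str):
    """Replace each word with its index in a dictionary built from the text."""
    dictionary = []   # index -> word, stored alongside the compressed data
    index_of = {}     # word -> index, used only while encoding
    encoded = []
    for word in text.split(" "):
        if word not in index_of:
            index_of[word] = len(dictionary)
            dictionary.append(word)
        encoded.append(index_of[word])
    return encoded, dictionary

def dictionary_decode(encoded, dictionary):
    """Restore the original text by looking each index up in the dictionary."""
    return " ".join(dictionary[i] for i in encoded)

message = "the cat sat on the mat and the dog sat on the cat"
encoded, dictionary = dictionary_encode(message)
print(encoded)     # [0, 1, 2, 3, 0, 4, 5, 0, 6, 2, 3, 0, 1]
print(dictionary)  # ['the', 'cat', 'sat', 'on', 'mat', 'and', 'dog']
```

The saving comes from repeated words: "the" is stored once in the dictionary and then referred to by a small index each time it appears.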
Encryption
Encryption is used to keep data secure when it’s being transmitted.
Symmetric Encryption
In symmetric encryption, both the sender and receiver share the same private key, which they distribute to each other in a process called a key exchange. This key is used for both encrypting and decrypting data.
It’s important that the private key is kept secret. If the key is intercepted during the key exchange then any communications sent can be intercepted and decrypted using the key. Asymmetric encryption gets around this issue.
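The idea of one shared key doing both jobs can be sketched with a toy XOR cipher. This is only an illustration and is not secure; real systems use established algorithms. The name xor_cipher and the sample key are assumptions made for the example.

```python
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the repeating key.

    Applying the function a second time with the same key restores the data,
    so the same key both encrypts and decrypts.
    """
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

shared_key = b"secret"  # both sender and receiver must hold this key

ciphertext = xor_cipher(b"meet at noon", shared_key)  # sender encrypts
plaintext = xor_cipher(ciphertext, shared_key)        # receiver decrypts
print(plaintext)  # b'meet at noon'
```

Anyone who intercepts shared_key can run the same function on the ciphertext, which is exactly the weakness described above.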
Asymmetric Encryption
When sending information using asymmetric encryption, two keys are used: one public and a second, private, key. The public key can be published anywhere, free for the world to see, while the private key must be kept secret. Together, these keys are known as a key pair and are mathematically related to one another.
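How a key pair can be mathematically related is shown in the toy sketch below. It assumes an RSA-style scheme with tiny primes, which the notes do not name. The numbers are far too small to be secure and the helper apply_key is hypothetical.

```python
# Toy RSA-style key pair for person B (tiny primes, for illustration only).
p, q = 61, 53
n = p * q                   # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                      # public exponent
d = pow(e, -1, phi)         # private exponent: modular inverse (Python 3.8+)

b_public_key = (e, n)       # can be published anywhere
b_private_key = (d, n)      # must be kept secret by B

def apply_key(value: int, key: tuple) -> int:
    """Raise value to the key's exponent, modulo the key's modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65                                        # message encoded as a number below n
ciphertext = apply_key(message, b_public_key)       # A encrypts with B's public key
recovered = apply_key(ciphertext, b_private_key)    # only B's private key reverses it
print(recovered)  # 65
```

Person A never needs B's private key, so no secret has to be exchanged before communicating.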
Hashing
Hashing is the name given to a process in which an input (called a key) is turned into a fixed size value (called a hash).
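The sketch below illustrates a fixed-size hash, a collision, and a hash table. The deliberately weak simple_hash function and the HashTable class are assumptions made for illustration. Collisions are handled here by keeping a list in each slot, which is one common approach among several.

```python
import hashlib

# A real hashing algorithm produces a fixed-size value for any input size.
short_digest = hashlib.sha256(b"a").hexdigest()
long_digest = hashlib.sha256(b"a" * 10_000).hexdigest()
print(len(short_digest), len(long_digest))  # 64 64

def simple_hash(key: str, table_size: int = 8) -> int:
    """Weak hash for illustration: sum of character codes, modulo the table size."""
    return sum(ord(c) for c in key) % table_size

class HashTable:
    """Hash table that stores colliding keys together in the same bucket."""

    def __init__(self, size: int = 8):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def put(self, key: str, value) -> None:
        bucket = self.buckets[simple_hash(key, self.size)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key: str):
        for existing_key, value in self.buckets[simple_hash(key, self.size)]:
            if existing_key == key:
                return value
        raise KeyError(key)

table = HashTable()
table.put("ab", 1)
table.put("ba", 2)  # same letters, same sum: collides with "ab"
print(simple_hash("ab") == simple_hash("ba"))  # True
print(table.get("ab"), table.get("ba"))        # 1 2
```

Lookup goes straight to one bucket rather than searching every entry, which is why a low chance of collision matters for fast access.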
Relational Databases
Relational databases work by splitting data about different entities (types of things) into separate relations (tables). Each relation only contains data about one entity and relationships (connections) are made between each of the relations through the use of primary and foreign keys.
Flat File
A flat file is a database that consists of a single file. The flat file will most likely be based around a single entity and its attributes.
Flat files are typically written out in the following way:
Entity1(Attribute1, Attribute2, Attribute3 …)
Primary Key
A primary key is a unique identifier for each record in the table.
Foreign Key
A foreign key is the attribute which links two tables together. The foreign key will exist in one table as the primary key and act as the foreign key in another.
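A minimal sketch of primary and foreign keys, using Python's built-in sqlite3 module. The Patient and Appointment tables and their columns are hypothetical. SQLite only enforces foreign keys once the PRAGMA shown is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# PatientID is the primary key of Patient...
conn.execute("CREATE TABLE Patient (PatientID INTEGER PRIMARY KEY, Surname TEXT)")
# ...and appears in Appointment as a foreign key linking the two tables.
conn.execute("""
    CREATE TABLE Appointment (
        AppointmentID INTEGER PRIMARY KEY,
        PatientID INTEGER REFERENCES Patient(PatientID),
        AppointmentDate TEXT
    )
""")
conn.execute("INSERT INTO Patient VALUES (1, 'Khan')")
conn.execute("INSERT INTO Appointment VALUES (10, 1, '2024-05-01')")

def try_insert_orphan() -> str:
    """Try to add an appointment for a patient that does not exist."""
    try:
        conn.execute("INSERT INTO Appointment VALUES (11, 99, '2024-05-02')")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

print(try_insert_orphan())  # rejected
```

The second insert is refused because there is no Patient record with PatientID 99 for the foreign key to point to.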
Secondary Key
A secondary key allows a database to be searched quickly. For example, in a table of patients, a patient is unlikely to remember their patientID but will know their surname. Therefore, a secondary index (secondary key) is set up on the surname attribute. This makes it possible to order and search by surname, which makes it easier to find specific patients in the database.
One-to-one
Each entity can only be linked to one other entity, such as the relationship between a husband and wife. The husband entity can only be associated with one wife entity and vice versa.
One-to-many
One entity can be associated with many other entities, such as a mother having multiple children. Multiple child entities can be linked to the same mother entity, but each child entity is linked to only one mother entity.
Many-to-many
One entity can be associated with many other entities and the same applies the other way round. An example is students and courses - each student can enrol in more than one course and each course can have more than one student.
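One common way to store a many-to-many relationship is a linking table that holds pairs of foreign keys. The sketch below uses sqlite3 and assumes that approach. The table names, column names and sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Course (CourseID INTEGER PRIMARY KEY, Title TEXT);
    -- Each row links one student to one course.
    CREATE TABLE Enrolment (
        StudentID INTEGER REFERENCES Student(StudentID),
        CourseID INTEGER REFERENCES Course(CourseID),
        PRIMARY KEY (StudentID, CourseID)
    );
    INSERT INTO Student VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO Course VALUES (10, 'Maths'), (20, 'Physics');
    INSERT INTO Enrolment VALUES (1, 10), (1, 20), (2, 10);
""")

# One student can be on many courses...
courses_for_asha = [row[0] for row in conn.execute("""
    SELECT Course.Title FROM Course
    JOIN Enrolment ON Course.CourseID = Enrolment.CourseID
    WHERE Enrolment.StudentID = 1
    ORDER BY Course.Title
""")]

# ...and one course can have many students.
students_on_maths = [row[0] for row in conn.execute("""
    SELECT Student.Name FROM Student
    JOIN Enrolment ON Student.StudentID = Enrolment.StudentID
    WHERE Enrolment.CourseID = 10
    ORDER BY Student.Name
""")]

print(courses_for_asha)   # ['Maths', 'Physics']
print(students_on_maths)  # ['Asha', 'Ben']
```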
Normalisation
The process of coming up with the best possible layout for a relational database is called normalisation.
Normalisation tries to accomplish the following things:
● No redundancy (unnecessary duplicates).
● Consistent data throughout linked tables.
● Records can be added and removed without issues.
● Complex queries can be carried out.
First Normal Form
There must be no attribute that contains more than a single value.
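As an illustration, the sketch below takes records where a hypothetical Courses attribute holds several values and rewrites them so each record holds a single value. The data is made up for the example.

```python
# Not in first normal form: the Courses attribute contains more than one value.
unnormalised = [
    {"StudentID": 1, "Name": "Asha", "Courses": "Maths, Physics"},
    {"StudentID": 2, "Name": "Ben", "Courses": "Maths"},
]

# One record per student and course, so every attribute holds a single value.
first_normal_form = [
    {"StudentID": record["StudentID"], "Name": record["Name"], "Course": course.strip()}
    for record in unnormalised
    for course in record["Courses"].split(",")
]

for record in first_normal_form:
    print(record)
```

The student's name is now repeated across records, which is the kind of redundancy the later normal forms aim to remove.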
Second Normal Form
A database which doesn’t have any partial dependencies and is in first normal form can be said to be in second normal form. This means that no attributes can depend on part of a composite key.
Third Normal Form
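If the database is in second normal form and contains no non-key dependencies, it is in third normal form. A non-key dependency is one where an attribute depends on another attribute that is not part of the primary key. Having no non-key dependencies therefore means every attribute depends only on the value of the primary key and nothing else.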
Indexing
Indexing is a method used to store the position of each record ordered by a certain attribute. This is used to look up and access data quickly. The primary key is automatically indexed; however, users rarely search by the primary key since it is not normally remembered. This is why secondary keys are used. Secondary keys are indexed to make the table easier and faster to search through on those particular attributes.
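A secondary index can be sketched in sqlite3 as follows. The Patient table, the index name idx_patient_surname and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Patient (PatientID INTEGER PRIMARY KEY, Surname TEXT)")
conn.executemany(
    "INSERT INTO Patient VALUES (?, ?)",
    [(1, "Adams"), (2, "Khan"), (3, "Lopez")],
)

# Secondary index on Surname so searches on that attribute avoid a full scan.
conn.execute("CREATE INDEX idx_patient_surname ON Patient (Surname)")

rows = conn.execute(
    "SELECT PatientID, Surname FROM Patient WHERE Surname = ?", ("Khan",)
).fetchall()
print(rows)  # [(2, 'Khan')]

# List the indexes that now exist on the table.
index_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'Patient'"
)]
print(index_names)  # ['idx_patient_surname']
```

The query itself does not change once the index exists; the database decides when to use it.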
Capturing Data
Data needs to be input into the database and there are multiple methods of doing this. The chosen method is always dependent on the context. For example, if pedestrians are participating in a survey, their responses will need to be manually entered into the database.
Data is also captured when people pay by cheque. Banks scan cheques using Magnetic Ink Character Recognition (MICR). All of the details excluding the amount are printed in a special magnetic ink which can be recognised by a computer, but the amount must be entered manually. Optical Mark Recognition (OMR) is used for multiple choice questions on a test. Other forms use Optical Character Recognition (OCR), which converts printed or handwritten text into machine-readable characters.
SQL
SQL stands for Structured Query Language and is a declarative language used to manipulate databases. SQL enables the creating, removing and updating of databases.
SELECT, FROM, WHERE
The SELECT statement is used to collect fields from a given table and can be paired with the FROM statement to specify which table(s) the information will come from. The WHERE statement can be used in conjunction to specify the search criteria.
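A short sketch of SELECT, FROM and WHERE working together, run through Python's sqlite3 module. The Book table, its fields and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Book (Title TEXT, Author TEXT, DatePublished TEXT)")
conn.executemany("INSERT INTO Book VALUES (?, ?, ?)", [
    ("Book A", "Author X", "2001-03-14"),
    ("Book B", "Author Y", "1998-07-02"),
    ("Book C", "Author X", "2010-11-30"),
])

# SELECT chooses the fields, FROM names the table, WHERE gives the search criteria.
titles_by_x = conn.execute(
    "SELECT Title, DatePublished FROM Book WHERE Author = 'Author X'"
).fetchall()
print(sorted(titles_by_x))  # [('Book A', '2001-03-14'), ('Book C', '2010-11-30')]
```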
ORDER BY
The ORDER BY clause specifies the field by which results are sorted and whether they appear in ascending or descending order. Values are placed in ascending order by default; adding ‘Desc’ to the end of the statement causes values to be displayed in descending order:
ORDER BY DatePublished Desc
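The effect of the default ordering and of Desc can be compared in a small sqlite3 sketch. The Book table and its rows are hypothetical, and dates are stored as year-month-day text so that they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Book (Title TEXT, DatePublished TEXT)")
conn.executemany("INSERT INTO Book VALUES (?, ?)", [
    ("Book A", "2001-03-14"),
    ("Book B", "1998-07-02"),
    ("Book C", "2010-11-30"),
])

# Ascending order is the default.
oldest_first = [row[0] for row in conn.execute(
    "SELECT Title FROM Book ORDER BY DatePublished"
)]
# Adding Desc reverses the order.
newest_first = [row[0] for row in conn.execute(
    "SELECT Title FROM Book ORDER BY DatePublished Desc"
)]

print(oldest_first)  # ['Book B', 'Book A', 'Book C']
print(newest_first)  # ['Book C', 'Book A', 'Book B']
```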
JOIN
JOIN provides a method of combining rows from multiple tables based on a common field between them. The example below shows the joining of two tables, Movies and Directors.
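A join between Movies and Directors might look like the sketch below, run through sqlite3. The column names (DirectorID, Name, Title) and the sample rows are assumptions, since the notes only name the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Directors (DirectorID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Movies (
        MovieID INTEGER PRIMARY KEY,
        Title TEXT,
        DirectorID INTEGER REFERENCES Directors(DirectorID)
    );
    INSERT INTO Directors VALUES (1, 'Director One'), (2, 'Director Two');
    INSERT INTO Movies VALUES (100, 'Film A', 1), (101, 'Film B', 2), (102, 'Film C', 1);
""")

# Combine rows from both tables wherever the common field DirectorID matches.
joined = conn.execute("""
    SELECT Movies.Title, Directors.Name
    FROM Movies
    JOIN Directors ON Movies.DirectorID = Directors.DirectorID
    ORDER BY Movies.Title
""").fetchall()

for title, name in joined:
    print(title, "-", name)
```

Each output row pairs a film title with the name of its director, even though the two values are stored in different tables.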
CREATE
The CREATE statement allows you to make new database tables, as shown below:
CREATE TABLE TableName
Data Types
- CHAR(n): this is a string of fixed length n
- VARCHAR(n): this is a string of variable length with upper limit n
- BOOLEAN: TRUE or FALSE values
- INTEGER/INT: integer
- FLOAT: number with a floating decimal point
- DATE: the date in the format Day/Month/Year
- TIME: the time in the format Hour/Minute/Second
- CURRENCY: sets the number as a monetary amount
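A fuller CREATE TABLE statement using several of these data types might look like the sketch below. The Product table and its fields are hypothetical. SQLite, used here through Python's sqlite3 module, records the declared types but does not strictly enforce them, and CURRENCY is left out because it is not available in every database system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Product (
        ProductID INTEGER PRIMARY KEY,
        Code CHAR(8),          -- fixed-length string
        Name VARCHAR(50),      -- variable-length string, at most 50 characters
        InStock BOOLEAN,
        Weight FLOAT,
        DateAdded DATE,
        TimeAdded TIME
    )
""")

# Read back the column names and declared types.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Product)")]
for name, declared_type in columns:
    print(name, declared_type)
```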
Transaction Processing
A transaction is defined as a single operation executed on data. However, a collection of operations can also sometimes be considered a transaction.
ACID
Atomicity, Consistency, Isolation, Durability
Atomicity
A transaction must be processed in its entirety or not at all
Consistency
Ensures that no transaction can violate any of the defined validation rules for maintaining the integrity of the database. When a database is created, referential integrity rules will be specified between linked tables.
Isolation
Simultaneous executions of transactions should lead to the same result as if they were executed one after the other
Durability
Once a transaction has been executed it will remain so regardless of the circumstances surrounding it, such as in the event of a power cut.
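Atomicity and consistency can be sketched with a transfer between two accounts in sqlite3. The Account table, the rule that a balance cannot go below zero, and the transfer function are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Validation rule: a balance may never be negative.
conn.execute(
    "CREATE TABLE Account (AccountID INTEGER PRIMARY KEY, "
    "Balance INTEGER CHECK (Balance >= 0))"
)
conn.execute("INSERT INTO Account VALUES (1, 100), (2, 50)")
conn.commit()

def transfer(amount: int, source: int, target: int) -> bool:
    """Move money between accounts as one transaction: both updates or neither."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE Account SET Balance = Balance + ? WHERE AccountID = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE Account SET Balance = Balance - ? WHERE AccountID = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the rule was broken, so the credit above was undone too

def balances() -> dict:
    return dict(conn.execute("SELECT AccountID, Balance FROM Account"))

print(transfer(500, 1, 2), balances())  # False {1: 100, 2: 50}
print(transfer(30, 1, 2), balances())   # True {1: 70, 2: 80}
```

In the failed transfer the credit to account 2 had already been applied, but it is rolled back when the debit breaks the rule, so the transaction is never left half done.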
Record Locking
The process of preventing simultaneous access to records in a database is called record locking, and it is used to prevent inconsistencies or lost updates. While one person is editing a record, the record is ‘locked’, which prevents others from accessing it until the edit is complete.
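The idea can be sketched in memory with Python threads standing in for users. This is an analogy, not how a database management system implements locking. The record, the lock and the add_stock function are hypothetical.

```python
import threading

record = {"stock": 0}
record_lock = threading.Lock()  # stands in for the lock on a single record

def add_stock(times: int) -> None:
    """Repeatedly read the record, then write back an updated value."""
    for _ in range(times):
        with record_lock:  # only one thread may edit the record at a time
            current = record["stock"]
            record["stock"] = current + 1

# Four 'users' update the same record at once.
threads = [threading.Thread(target=add_stock, args=(10_000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(record["stock"])  # 40000
```

Without the lock, two threads could read the same value and both write back that value plus one, so one of the updates would be lost.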
Redundancy
Some information is too important for people and companies to risk losing. This is where redundancy comes in. Redundancy means keeping one or more copies of the data in physically different locations. If one copy is damaged, the others remain unaffected and the data can be recovered from them.