Unit 4 - Exchanging Data Flashcards
what is lossy compression
non-essential data is permanently removed
how does lossy compression work on images
different shades of the same colour in an image will be made into a single shade
how does lossy compression work on sound files
frequencies outside the human hearing range are removed
what is lossless compression
patterns in the data are spotted and summarised in a shorter format without permanently removing any information
what is run length encoding
a basic method of compression that summaries consecutive patterns of the same data. this works well with image and sound where data can be repeated many times
what is dictionary compression
spots regularly occurring data and stores it separately in a dictionary. the reference to the entry in the dictionary is stored in the main file thereby reducing the original data stored
what is encryption
the transformation of data from one form to another to prevent an unauthorised third party from being able to understand it
what is plain text
the original data or message
what is cipher text
the encrypted data
what is the cipher
the encryption method or algorithm
what is the key
the secret information to lock or unlock the message
give an overview of the caesar cipher
a type of substitution cipher that works by shifting the letters of the alphabet along by a given number of characters
give an overview of the vernam cipher
the one-time pad (key) must be equal or longer in characters than the plaintext, random and only used once. one-time pads are used in pairs where the sender and recipient are both party to the key
what is symmetric encryption
- private key
- the same key is used to encrypt and decrypt
- this means that the key must also be transferred to the recipient
what is asymmetric encryption
- public key
- one key is made public so that others wishing to send you data can use this key to encrypt it
- the public key cannot decrypt the data
- a private key is used to decrypt the data
what is hashing
hashing turns your password or PIN into a short sting of letters and/or numbers using a hashing algorithm
what is a database
a database is an organised store of data usually controlled by a DBMS
what is a flat file database
consisting of information about a single entity
define entity
an entity is a category, object, person, event or thing of interest about which data needs to be recorded
define field
a field is one piece of information about the object (columns)
define record
a record is all of the fields about the individual object within an entity (rows)
how is an entity represented
table
define attribute
attributes define what type of data is stored in an entity
what is a primary key (entity identifier)
- each entity needs an identifier which uniquely identifies a particular record
- in a relational database, the identifier is knows as the primary key
- it is underlined in the entity description
what is the secondary key
- the primary key is automatically indexed so that any particular record can be found very quickly
- in some databases, searches may often need to be made on many fields
- so a secondary key is introduced
what is a composite primary key
- sometimes one or more attributes are needed to uniquely define a record
- in this case there is still one primary key, but made up of data from two or more fields
what are the three different relationships between entities
- one-to-one
- one-to-many
- many-to-many
how does a relational database work
a separate table is created for each entity identified in the system. where a relationship exists between entities, an extra field called a foreign key links the two tables
what is a foreign key
- a foreign key is an attribute that creates a join between two tables (relation)
- it is the attribute that is common to both tables
- the primary key in one table is the foreign key in the table to which it is linked
what entity relationship doesn’t work in databases
many-to-many
how to get around a many-to-many relationship
introduce a new table to split the many-to-many relationship
what is referential integrity
no foreign key in one table can reference a non-existent record in a related table
what is normalisation
a process used to come up with the best possible design for a relational database
how should a normalised table be organised
- no data is unnecessarily duplicated
- data is consistent throughout the database. consistency should be an automatic consequence if not holding any duplicated data
- structure is flexible
- users should be able to make complex queries
what is a complex query
involves two or more search criteria
what is first normal form
- no repeating attributes or groups of attributes
- all attributes must be atomic - a single attribute cannot consist of two data items such as a first name and surname
what is second normal form
- must be in 1NF
- contains no partial dependencies
- this can only occur if the primary key is a composite key
what is third normal form
- must be in 2NF
- contains no non -key dependencies
- “all attributes are dependent on the key, the whole key not nothing but the key”
advantages of normalisation
- maintaining and modifying the database
- faster sorting and searching
- deleting records
what is SQL
structured query language - a declarative language used for querying and updating tables in a relational database. it can also be used to create tables.
SELECT…FROM…WHERE
used to extract a collection of fields from a given table
SQL: dates
- # …# (dates in american format)
SQL: not equal to
<>
SQL: JOIN
an alternative method for combining records (rows) from two or more tables
SQL: defining a database table
CREATE TABLE tblProduct
(
Product ID CHAR(4) NOT NULL PRIMARY KEY,
Description VARCHAR(20) NOT NULL,
Price CURRENCY
)
SQL: CHAR(n)
character string of fixed length n
SQL: VARCHAR(n)
character string variable length, max. n
SQL: adding a column in a table
ALTER TABLE tblProduct
ADD QtyInStock INTEGER
SQL: deleting or altering a table structure
to delete a column:
ALTER TABLE tblProduct
DROP QtyInStock
to change the data type of a column:
ALTER TABLE tblProduct
MODIFY COLUMN Description VARCHAR(30) NOT NULL
SQL: defining linked tables
CREATE TABLE ProductComponent
(ProductID. CHAR(4) NOT NULL,
CompID CHAR(6) NOT NULL,
Quantity INTEGER,
FOREIGN KEY ProductID REFERENCES Product(ProductID)
FOREIGN KEY CompID REFERENCES Component(CompID)
PRIMARY KEY (ProductID, CompID)
SQL: inserting data
INSERT INTO Product(ProductID, Description,Price),
VALUES (“A345”, “Pink Rabbit”, 7.50)
SQL: updating a table
UPDATE Product,
SET Description = “Blue Rabbit”, Price = 8.25
SQL: deleting a record
DELETE FROM Product
manual methods of data capture
forms (filled out by hand)
automated methods of data capture
- smart card readers
- barcode readers
- scanners
- optical character recognition
- sensors
examples of inputting data
- automatically using the DMBS software
- typing it in
- importing form a spreadsheet or file
- using EDI
what is EDI
electronic data interchange - the computer-to-computer exchange of documents
- it replaces post, email or fax
- all documents must be in a standard form so that the computer can understand them
how is data selected
data may be selected if it matches certain criteria. can be done using SQL, for example, Query by Example in Access
what is transaction processing
in the context of databases, a single logical operation is defined as a transaction. it may consist of several operations
what is ACID
this is a set of properties to ensure that the integrity of the database is maintained under all circumstances. it guarantees that transactions are processed reliably
define A in ACID
atomicity - this property requires that a transaction is processed in its entirety or not all. in any situation, including power cuts or hard disk crashes it is not possible to process only part of a transaction
define C in ACID
consistency - this property ensures that no transaction can violate any of the defined validation rules. referential integrity, specified when the database is set up, will always be upheld
define I in ACID
isolation - ensures that concurrent execution of transactions leads to the same result as if the transactions were processed one after the other. this is crucial in a multi user database
define D in ACID
durability - this ensures that once a transaction has been committed, it will remain so, even in the event of a power cut. as each part of the transaction is completed, it is held in a buffer in the disk until all elements of the transaction are completed. only then will the changes to the database tables be made
what is record locking
record locking prevent simultaneous access to objects in a database in order to prevent updates being lost or inconsistencies in the data arising. using record locking, a record is locked when a data is retrieved for editing and updating. anyone else attempting to retrieve it is denied access until the transaction is completed or cancelled
problems with record locking
if two users are attempting to update two records, a situation can arise in which neither can proceed, known as deadlock
what is serialisation
ensures that transactions do not overlap in time and therefore cannot integrate with each other or lead to updates being lost
what is time stamp ordering
every object in the database has a read timestamp and a write timestamp. these are updated whenever an object is read or written. when a user tries to save the update, if the read timestamp is not the same as it was when they started the transaction, the DBMS knows another user has accessed the same object.
what is commitment ordering
this is another serialisation technique to ensure that no transactions are lost if two clients are simultaneously trying to update a record. transactions are ordered in terms of their dependencies on one another as well as the time they were initiated. it can be used to prevent deadlock by blocking one request until another is complete.
what is redundancy
many organisations cannot afford to have their computer systems go down for even a short time. therefore, they have built-in redundancy in their computer systems - this is called a failover system. it is duplicate hardware, located in different geographical areas, mirrors every transaction that takes place on the main system. if this fails, the backup system automatically takes over.