Applications of Computer System Flashcards
Check Sum
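An error-detection method. The check-sum is calculated by summing all the values in a block of data, and the result is included with the data when it is stored or transmitted. The sum is recalculated when the data is read or received; a mismatch indicates that an error has occurred.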
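The idea can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this card, and reducing the sum modulo 256 so that it fits in one byte is an assumption, not something the definition requires.

```python
def checksum(block: bytes, modulus: int = 256) -> int:
    """Sum all byte values in the block, reduced modulo `modulus` (assumed: 256)."""
    return sum(block) % modulus


def attach_checksum(block: bytes) -> bytes:
    """Append the one-byte check-sum to the data before it is stored or sent."""
    return block + bytes([checksum(block)])


def verify(packet: bytes) -> bool:
    """Recompute the check-sum over the data part and compare it with the stored value."""
    data, stored = packet[:-1], packet[-1]
    return checksum(data) == stored


packet = attach_checksum(b"HELLO")
# Simulate a transmission error: the first byte changes but the check-sum does not.
corrupted = b"JELLO" + packet[-1:]
```

Here `verify(packet)` succeeds and `verify(corrupted)` fails. A simple sum cannot detect every error, for example two bytes swapping places.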
Flat File
A Flat File refers to a single file used to hold all of the data within a specific problem.
Flat File Problems
- Data redundancy.
- Data inconsistency.
- Loss of data integrity.
Relational database
A relational database is a collection of tables that hold records. The records may be connected through relationships (links).
Data Consistency
This deals with ensuring data is accurate and valid.
Problems can arise if data goes out of date, e.g. storing age instead of date of birth.
Problems can also arise if the same data is stored more than once, because the copies may not all be updated together.
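The age example can be made concrete with a small sketch: store the date of birth and work the age out when it is needed, so the stored value never goes out of date. The function name `age_on` is hypothetical.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Derive age in whole years from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Take one year off if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```

For instance, `age_on(date(2000, 6, 15), date.today())` gives the right answer on any day, whereas a stored age would need updating every year.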
Data redundancy
Data items are said to be redundant if they are stored more than once. Redundancy can lead to loss of consistency.
Data Independence
Data can become inconsistent in a flat-file system because the same item may be held in different formats in different files. For example, a date could be stored as a text field in one file but as a date/time field in another, making the data incompatible; names may also be spelled differently in different files. In a relational database (RDBMS) the attributes of any one entity are held in a single table and each record is stored only once, so there is no risk of the same attribute being stored in a different format elsewhere.
Normalizing
Normalizing is the process of ensuring that data is correctly organized. There are three tests, called normal forms, which should be applied to the data to ensure that the data is correctly organized.
1st Normal Form
A database is in 1st Normal Form if it has no repeating attributes or groups of attributes.
2nd Normal Form
A database is in 2nd Normal Form if it is in 1st Normal Form and the non-key fields depend on the whole key and not just part of it.
3rd Normal Form
A database is in 3rd Normal Form if data items are dependent on the (primary) key, the whole (primary) key and nothing but the (primary) key.
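As a rough illustration of the "nothing but the key" test, the sketch below splits a flat list of orders into two tables. The table and field names are invented for the example. In the flat data, `customer_name` depends on `customer_id` rather than on the key `order_id`, so it is moved into its own table.

```python
# Flat data: the customer's name is repeated on every order (redundancy).
flat_orders = [
    {"order_id": 1, "customer_id": "C1", "customer_name": "Ann Lee", "item": "Pen"},
    {"order_id": 2, "customer_id": "C1", "customer_name": "Ann Lee", "item": "Ink"},
    {"order_id": 3, "customer_id": "C2", "customer_name": "Bob Ray", "item": "Pad"},
]

# Customer table: each name is stored once, keyed by customer_id.
customers = {row["customer_id"]: row["customer_name"] for row in flat_orders}

# Order table: keeps only fields that depend on order_id, plus the link to the customer.
orders = [
    {"order_id": r["order_id"], "customer_id": r["customer_id"], "item": r["item"]}
    for r in flat_orders
]
```

After the split, correcting a customer's name means changing one entry in `customers` instead of every matching order.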
Primary Key
A primary key uniquely identifies a record in a file. No two records in the file will have the same primary key.
Foreign Key
A foreign key is a field in one table that holds the primary key of a record in another table. It forms a relationship from a record in one entity to a record in another entity.
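Both kinds of key can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch using made-up `customer` and `booking` tables; note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE booking ("
    " booking_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ann Lee')")
conn.execute("INSERT INTO booking VALUES (100, 1)")  # links booking 100 to customer 1


def try_insert(sql: str) -> bool:
    """Return True if the insert is accepted, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second customer with primary key 1 is rejected: keys must be unique.
duplicate_key_ok = try_insert("INSERT INTO customer VALUES (1, 'Bob Ray')")
# A booking for customer 99 is rejected: no such customer exists to link to.
dangling_link_ok = try_insert("INSERT INTO booking VALUES (101, 99)")
```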
DBMS
A Database Management System (DBMS) is software responsible for managing, storing and retrieving data in a database for other applications to use.
View
A view is the name given to the part of the data structure that can be seen by a particular user or application.
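One way to see this in practice is an SQL view, sketched here with Python's `sqlite3` module. The `staff` table and the `staff_directory` view are invented for the example; the view exposes names but hides salaries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [(1, "Ann Lee", 30000), (2, "Bob Ray", 28000)],
)

# The view shows only the columns this group of users is meant to see.
conn.execute("CREATE VIEW staff_directory AS SELECT staff_id, name FROM staff")

cursor = conn.execute("SELECT * FROM staff_directory ORDER BY staff_id")
columns = [column[0] for column in cursor.description]
rows = cursor.fetchall()
```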
Transaction Support
A transaction is a sequence of events that must be carried out either completely or not at all. If the transaction fails then all of the steps already carried out must be reversed (rolled back).
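The all-or-nothing behaviour can be sketched with a money transfer in `sqlite3`. The `account` table, the `transfer` function and the rule that balances may not go negative are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(src: str, dst: str, amount: int) -> bool:
    """Move money between accounts; both updates happen or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            # This step fails the CHECK constraint if src would go below zero.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances() -> dict:
    return dict(conn.execute("SELECT name, balance FROM account"))
```

If `transfer("A", "B", 500)` is attempted, the credit to B is carried out first, the debit from A then fails, and the credit is reversed so that both balances are unchanged.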
Concurrency Support
This DBMS function enables many user processes to access a database at the same time. In particular, when user processes involve transactions that update the same data, the DBMS must perform the updates in a way that prevents them from interfering with each other, for example by record locking.
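Record locking can be imitated with threads in Python. This is a toy model, not how any particular DBMS does it: the `records` dictionary stands in for the database, and each record has its own lock so that only one process at a time can update it.

```python
import threading

records = {"stock:widget": 0}
# One lock per record: updating one record does not block updates to others.
record_locks = {key: threading.Lock() for key in records}


def add_stock(key: str, times: int) -> None:
    for _ in range(times):
        with record_locks[key]:  # lock the record, update it, then release
            records[key] = records[key] + 1


threads = [
    threading.Thread(target=add_stock, args=("stock:widget", 10_000)) for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place the final value is always 40000; without it, two threads could read the same old value and one update could be lost.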
Database Security
It covers:
- Making sure each user has access only to what they need to do their job.
- Ensuring data is not lost through hardware or software failures.
Preventing Loss of Data is managed by:
- Timed back-ups.
- Log/transaction files.
Query Languages
A query language is a (simple) language that allows data to be extracted from a database.
Example: SQL
SQL
Structured Query Language. A fourth-generation language (4GL) used to define, interrogate and manage databases.
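The three uses can be shown with a few SQL statements run through Python's `sqlite3` module. The `pupil` table and its contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define: create the structure of a table.
conn.execute("CREATE TABLE pupil (pupil_id INTEGER PRIMARY KEY, name TEXT, form TEXT)")
conn.executemany(
    "INSERT INTO pupil (name, form) VALUES (?, ?)",
    [("Ann Lee", "10A"), ("Bob Ray", "10B"), ("Cy Dow", "10A")],
)

# Interrogate: extract the names of the pupils in form 10A.
names_in_10a = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM pupil WHERE form = ? ORDER BY name", ("10A",)
    )
]

# Manage: move every pupil in 10A up to 11A.
conn.execute("UPDATE pupil SET form = '11A' WHERE form = '10A'")
remaining_in_10a = conn.execute(
    "SELECT COUNT(*) FROM pupil WHERE form = '10A'"
).fetchone()[0]
```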
Data dictionary
A data dictionary is a document that describes each table (file) in the database, each field in terms of data type and validation, and identifies primary and foreign keys.
Database Administrator
The database administrator manages the structure of and access to the data in a database.
Recovery
Restoring a computer system after a failure, possibly by copying files from a backup to the working store.
Distributed Systems
Distributed systems refer to the use of more than one computer in order to store or process data. This generally takes place behind the scenes so the user is not aware that more than one computer is being used.
Distributed Processing
Distributed processing is the sharing of processing requirements among many computers accessed across a network. It is a form of parallel processing.
Distributed Processing Problems:
- Speed of data transfer.
- Not suitable for all tasks.
Distributed Database
A distributed database is one where several computers on a network each hold part of the data and cooperate to make it available to a user.
Distributed Database Benefit
Data can be held as close as possible to the users who need it, which reduces network traffic and speeds up data transfer.
Distributed Database drawback
It is difficult to ensure that the data on all the computers is always up to date, which makes it hard to maintain integrity.
Data Warehousing
Data Warehousing is the accumulation of data possibly from several sources, for future processing.
Data Mining
Data Mining is the analysis of large amounts of data in a data warehouse to provide new information. The analysis will try to find statistically significant correlations or trends in the data.
Uses of data mining (supermarket)
- To identify unexpected shopping patterns in supermarkets.
- Optimize website profitability by making appropriate offers to each visitor.
- Identify suspicious (unusual) behavior, as part of a fraud detection process.
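A very small sketch of the shopping-pattern idea: count how often pairs of items appear in the same basket. The baskets are invented data, and real data mining uses far larger data sets and proper statistical tests; this only shows the counting step.

```python
from collections import Counter
from itertools import combinations

# Each set is one customer's basket (made-up data).
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"nappies", "beer"},
    {"bread", "jam"},
    {"nappies", "beer", "bread"},
]

# Count every pair of items bought together; sorting gives each pair one fixed order.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1


def support(item_a: str, item_b: str) -> float:
    """Fraction of all baskets that contain both items."""
    return pair_counts[tuple(sorted((item_a, item_b)))] / len(baskets)
```

A pair with unexpectedly high support, such as `support("nappies", "beer")` here, is the kind of pattern a supermarket might investigate further.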
Uses of data mining (Insurance companies)
- Spot and understand trends in claims.
- Identify high risk customers.
- Identify customers who may not renew policy (so they can be targeted).
- Predict and detect fraudulent or risky behavior.
Problems of Data Mining
- Concerns over access to personal data.
- People being refused services, e.g. insurance, based on age or post code.
Data security
Refers to making sure data is not lost.
Bio-metric Security
Identifying users by measuring some aspect of the body, e.g. fingerprint, iris, retina or voice.
Process of bio-metric security
Take the user's fingerprint and register it on a database.
To log on, take a fingerprint and compare it with the stored prints.
If they match, let the user in.
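The register, compare and admit steps can be sketched as below. Everything here is a simplification made for illustration: a fingerprint is represented as a short tuple of features, and the similarity measure and the 0.8 threshold are assumptions. Real systems use far more elaborate matching, because two scans of the same finger are never identical.

```python
# Stored prints: user name -> fingerprint template (hypothetical feature tuple).
registered = {}


def register(user: str, template: tuple) -> None:
    """Step 1: take the fingerprint and store it in the database."""
    registered[user] = template


def similarity(a: tuple, b: tuple) -> float:
    """Fraction of feature positions where the two prints agree."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))


def log_on(scan: tuple, threshold: float = 0.8):
    """Steps 2 and 3: compare the scan with stored prints; return the user if one matches."""
    for user, template in registered.items():
        if similarity(scan, template) >= threshold:
            return user
    return None  # no match, so access is refused


register("ann", (1, 0, 1, 1, 0, 1, 0, 0, 1, 1))
```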
Indexed Sequential File
An indexed sequential file is a sequential file (records are held in key order) that also has an index. The index allows direct access to any particular record if the key is known.
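A toy model of the idea, with a list standing in for the file on disk and a dictionary standing in for the index. The record contents and function names are made up.

```python
# The "file": records held in key order, so it can still be read sequentially.
records = [(101, "Ann Lee"), (205, "Bob Ray"), (310, "Cy Dow")]

# The index: key -> position of the record in the file.
index = {key: position for position, (key, _) in enumerate(records)}


def read_direct(key: int):
    """Direct access: look the key up in the index and jump straight to the record."""
    position = index.get(key)
    return None if position is None else records[position]


def read_sequential() -> list:
    """Sequential access: read every record in key order."""
    return [name for _, name in records]
```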
Advantages of Indexed sequential file
- Rapid search since the index is in RAM.
- Keys might be of different sizes if the file uses variable-length records.
Multi-Level Index
A fully indexed file whose index is so large that the index itself also has an index.
The high-level index identifies the block in the low-level index where the record key is to be found.
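A two-level lookup might be sketched as follows. The layout is an assumption made for the example: the low-level index is split into blocks of (key, record position) pairs, and the high-level index stores the highest key in each block.

```python
from bisect import bisect_left

# Low-level index, split into blocks of (key, record position) entries in key order.
low_level_blocks = [
    [(101, 0), (150, 1), (205, 2)],
    [(310, 3), (420, 4), (555, 5)],
    [(600, 6), (750, 7)],
]

# High-level index: the highest key held in each low-level block.
high_level = [block[-1][0] for block in low_level_blocks]


def find_record_position(key: int):
    """Use the high-level index to pick a block, then search only that block."""
    block_no = bisect_left(high_level, key)  # first block whose highest key >= key
    if block_no == len(high_level):
        return None  # key is larger than every key in the file
    for entry_key, position in low_level_blocks[block_no]:
        if entry_key == key:
            return position
    return None
```

Only one block of the low-level index has to be searched for each lookup, which is the point of adding the extra level.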
How can a file become corrupted?
- Hacking - deliberate damage.
- Physical damage to hard disk.
- System error.
- Virus.
Full Backup
At regular intervals a copy of the file is made and kept secure.
Incremental Backup
An incremental backup is a type of backup that only copies files that have changed since the last backup.
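The selection step can be sketched as a comparison of each file's last-modified time with the time of the last backup. The file names and times are made up, and a dictionary is used instead of a real file system to keep the example self-contained.

```python
def files_to_back_up(modified_times: dict, last_backup_time: float) -> list:
    """Return the files changed since the last backup, in name order."""
    return sorted(
        name for name, mtime in modified_times.items() if mtime > last_backup_time
    )


# File name -> time of last change (arbitrary units for the example).
modified_times = {"orders.dat": 120.0, "staff.dat": 80.0, "stock.dat": 150.0}
changed = files_to_back_up(modified_times, last_backup_time=100.0)
```

Only `orders.dat` and `stock.dat` are copied here; `staff.dat` has not changed since the last backup and is skipped.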
Incremental Backup - Advantages
- Less storage needed.
- Can be run more often, so less data is lost.
- Faster.
File Generations
File generations refer to the successive backup copies made of a file that changes frequently: grandfather, father and son.
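Keeping three generations can be sketched with a fixed-length queue: each new backup becomes the son, the previous son becomes the father, and so on, with the oldest copy discarded. The function names and the choice of exactly three generations follow the card; the rest is an assumption for illustration.

```python
from collections import deque

# Newest copy on the left; maxlen=3 drops the oldest copy automatically.
generations = deque(maxlen=3)


def make_backup(snapshot: str) -> None:
    """The new copy becomes the son; older copies each move up a generation."""
    generations.appendleft(snapshot)


def named_generations() -> dict:
    """Label the copies currently held, newest first."""
    return dict(zip(["son", "father", "grandfather"], generations))


for version in ["v1", "v2", "v3", "v4"]:
    make_backup(version)
```

After four backups the oldest copy, `v1`, has been discarded and `v2` is the grandfather.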