F453 Key terms Flashcards
Flat file database
- A single table containing all information.
Relational database
- Linked tables, each representing an entity in the system
- Pieces of data are stored once and looked up via links from other tables
- DBMS manages tables and relationships between them
Data redundancy
- The same piece of data in a database is stored in more than one place
Data inconsistency
- Two versions of data in a database may be different.
Advantages of flat file databases
- Databases are simple to construct
Advantages of relational databases
- Reduced data duplication
- Reduced data redundancy
- Improved data consistency
- Improved data integrity/security
- Control over access to data
Disadvantages of relational databases
- Complex to set up
- Requires a good DBMS to control efficient views of data
Entity
- A thing about which data is stored
Relationship
- A link between two entities
Normalisation
- Resolves many to many relationships
- Ensures data is stored in one place
- Ensures data is stored in the most appropriate place
First Normal Form (1NF)
- No repeating attributes
Second Normal Form (2NF)
- No partial-key dependencies
- Non-key attributes are dependent on all of the primary key
Third Normal Form (3NF)
- No non-key dependencies
- No non-key attributes can be dependent on another non-key attribute
Primary key
- A unique identifier within a table
- Referred to as a composite key when comprised of multiple fields
Composite key
- Two or more attributes which act to uniquely identify an entity occurrence
Foreign key
- A primary key from one table used as an attribute in another
- Used to link the two tables
Where foreign keys are always found
- In the table with the ‘many’ side of the relationship
Secondary key
- An attribute that can be used to search or sort data
- They are indexed which allows for fast searching of data
Database Management System (DBMS)
- Software to handle the complexities of managing a database
- Could provide a UI
- May used SQL to communicate with other programs
- Provides different views of data
Tasks performed by the DBMS
- Manages access rights
- Adds new data
- Updates existing data
- Finds data
- Maintains indexes
- Enforces data integrity rule
Access rights
- Controls what data each user is allowed to view
- Controls what each user is able to do with data (view, update etc)
Data dictionary
- A file containing descriptions of data in a database
- Uses metadata to define tables
- Used by database managers when altering database structure
Data contained within the data dictionary
- Names of table
- Relationships between data
- Primary and foreign keys
- Field characteristics (data type, length etc)
- Which programs can access the data
Data Definition Language
- Used to create a database
- Used to create tables, the users, their access right, alter the table (eg defining a foreign key)
- Records the attributes, data types, validation used and relationships between entities, creates users, grant access rights
Data Manipulation Language
- Used to query and sort data
- Used to add, update and delete data from a database
SQL
- Structured Query Language
Reports
- A presentation of selected data, usually in table form
- Features include: a query and a display order
Static data structure
- Data structure with a fixed size when created
Dynamic data structure
- Data structure which changes size as data is added and removed.
Advantage of static data structures
- The amount of storage required is known so easier to program
Disadvantage of dynamic data structures
- The amount of storage required is not known so more complex to program
Stack
- A list in which insertion and deletion take place at the same end
- A LIFO data structure
- Has only one pointer; TOP
LIFO
Last In First Out
Uses of a stack in a computer system
- Storing return addresses (during subroutine calling)
- Passing parameters
- Storing contents of registers while processing interrupt
- Reversing the order of a set of data
Error that could be encountered when adding to a stack
- Stack overflow error (stack is alredy full)
Error that could be encountered when removing from a stack
- Stack underflow error (stack is already empty)
Algorithm for adding an item to a stack (push item onto stack)
- If Stack is full then
- Output error and stop
- Else
- Increment StackPointer
- Add item at position of StackPointer
- End If
Algorithm for removing an item from a stack (pop item off stack)
- If Stack is empty then
- Output error and stop
- Else
- Return data item at position of StackPointer
- Decrement StackPointer
- End If
Queue
- A list
- Insertion and deletion take place at opposite ends
- A FIFO data structure
- Has two pointers: Front and Next
FIFO
- First In First Out
Queue overflow error
- When after an attempt to enqueue an item onto a full queue
Queue underflow error
When after an attempt to dequeue an item off of an empty queue
Uses of a queue in a computer system
- Jobs waiting for a printer in a spool queue
- Job queue batch processing system
- Keyboard buffer
Shuffle queue
Front index always fixed (items shuffle up to front when an item is dequeued)
Circular queue
- When enqueuing next index moves forward
- When dequeuing the front index moves forward
- The Next pointer wraps to beginning once last index is reached.
Algorithm for circular queue enqueue
- If Queue is full then
- Output error and stop
- Else
- Add data item at position NextPointer
- Increment NextPointer
- If NextPointer > MaxIndex Then
- Assign the value of 1 to NextPointer
- End If
- End If
Algorithm for circular queue dequeue
- If Queue is empty then
- Output error and stop
- Else
- Retreive data item at position FrontPointer
- Increment FrontPointer
- If FrontPointer > MaxIndex Then
- Assign the value of 1 to FrontPointer
- End If
- End If
Tree
- Represents the relationship between data items
Uses of trees
- Binary search tree: storing data in such a way that it makes it easy and fast to search
- Representing sorted lists of data
- Most databases use this concept as the basis of storing, searching and sorting its records
Algorithm for inserting an item to a tree
- Start at root node
- REPEAT
- If new data < current data Then
- Follow left pointer
- Else
- Follow right pointer
- End If
- UNITIL pointer is Null
- Write new data
- Create (null) pointers for new data
Algorithm for retrieving an item from a tree
- Start at root node
- WHILE pointer is not Null
- If current data = required data Then
- Return current data
- Else If required data < current data Then
- Follow left pointer
- Else
- Follow right pointer
- End If
- End WHILE
Algorithm for deleting an item from a tree
Could be done by:
- Marking a node as deleted but leave it in the tree
- Deleting a node and reattaching any child nodes to the tree
- More difficult than just removing a node
Binary search
- More efficient than linear searching
- Requires data to already be sorted in order
Binary search algorithm
- Look at the item at the midpoint of the list (the current item)
- If the current item is the item sought, then it has been found
- If the current item is not the item sought, discard the half of the list that can’t contain the item sought and repeat the same method with the other half of the list
- Keep repeating these steps until the item is found or the list contains no items
Advantages of binary searching
- Usually faster because half of the data is discarded at each step and fewer items checked
- More efficient for large files
Disadvantages of binary searching
- Items must be in an order to allow appropriate items to be discarded
Serial (linear) search
- Search starts at the beginning of the list
- Each element in turn is compared with the required value until a match is found or the end of the list is reached
Advantages of serial searching
- A serial search can be used on unordered sets of data
Disadvantages of serial searching
- Generally slower because all data must be checked until the item is found or the end of the list is reached
Algorithm for merging data files
- Open existing files
- Create a new file
- Check existing files are not empty
- Use pointers to identify records for comparison
- REPEAT
- Compare records indicated by pointers
- Copy earlier value record to new file
- Move correct pointer Until end of one file
- Copy the remaining records from the other file
- Close the files
Insertion sort
- Insert one number at a time into correct position…
- So list of sorted numbers is built up
Advantages of insertion sort
- Simple sorting algorithm and relatively efficient for small sets of data and mostly-sorted lists
- Is relatively fast for adding a small number of new items periodically
Disadvantages of insertion sort
- Inefficient for large sets of data
Quick sort
- If the first pointer is less than the last pointer (if there are elements in the list) Then
- Select an item at random to act as the ‘pivot’
- Create two new lists
- … One with all items less than pivot
- … Other with all items greater than pivot
- Repeat these four steps until all lists only have one item
Advantages of quick sort
- Very quick for large sets of data
Disadvantages of quick sort
- Initial arrangement of data affects the time taken
- Harder to code
Stepwise refinement
Breaking down a problem into progressively smaller sections until each section can be represented by a procedure / function
Top-down design
- The produced result of stepwise refinemeent
Procedure
- A block of code which performs a task
- Receives parameter values
- Uses local variables
- May/may not produce a single value
Function
- A block of code which performs a task
- Uses local variables
- Returns a single value
Benefits of using procedures and functions
- Each one can be tested separately
- Program is more easily maintainable
- Code is clearly structured
- Code is reusable
- Library routines save time
Variable
.- An identifier / memory location representing a value needed by the program
Parameter
- An item of data supplied to a procedure or function
- That can be passed by reference or by value
- Uses local variables
Pass by value
- Only actual value passed
- Procedure / function can only modify a copy of original variable
Pass by reference
- Provides procedure/function with actual address to original variable therefore allowing it to permanently alter its contents
Local variables
- A variable defined and only accessible within one part of the program (procedure/function)
- Data contained is lost when execution of that part of program is completed
- The same variable names can be used in different procedures and functions
Example of a local variable
- A loop counter
Global variables
- Defined at the start of the program and exists throughout the program (including within functions and procedures)
- Allow data to be shared by modules
- Are overridden by local variables with the same name
Example of a global variable
- VAT rate
Uses of a stack with relation to procedures
- To help control the flow of instructions
- When a subroutine is called the return address is placed on the stack followed by the parameters of the subroutine (if there are any)
- When the subroutine executes, it first pops the parameters off the stack
- When it finishes it pops the return address off the stack so control can pass back to the instruction at the calling address
Backus-Naur Form (BNF)
- A notation technique used to unambiguously describe the syntax of a programming language / computer language
Recursion in BNF
::= 0|1|2|3|4|5|6|7|8|9
::= |
Infix notation
- The common arithmetic and logical formula notation, in which operators are written between the operands
- Eg. 2 + 4
Reverse Polish Notation (RPN)
- Also known as postfix notation
- A mathematical notation in which every operator follows all of its operands
- Eg. 24+
- Does not require parentheses
RPN to infix conversion
- Completely parenthesize the infix expression according to order of priority you want
- Move each operator to its corresponding right parenthesis
- Remove all parentheses
RPN and infix - stacks
- Stacks are used in the evaluation of Reverse Polish Notation expressions
RPN and infix - trees
- Trees can be traversed in-order to obtain the infix and post-order to obtain the RPN (postfix
In-order tree traversal
- Place a dot under each node
- Trace around tree starting to left of root
- As you pass underneath a node output its data
Post-order traversal
- Place a dot on the right of each node
- Trace around tree starting to left of root
- As you pass to right of node output its data
In-order algorithm
- Traverse the left branch until there is no left branch
- Visit the root of the current branch
- Traverse the right sub tree if there is one
- Repeat previous steps until every node has been visited
Post-order algorithm
- Traverse the left sub tree until there isn’t one
- Traverse the right sub tree (if there is one)
- Re-visit the root of the current branch
- Repeat previous steps until every node has been visited
Low-level language
- Machine-oriented
- Usually one instruction to one machine code instruction
- Mnemonics for operations
- Labels for addresses