Section Seven: Data structures Flashcards

You may prefer our related Brainscape-certified flashcards:
1
Q

Chapter 33 – Arrays, tuples and records
Data structures

A

Computer languages such as Python, Pascal and VB have built-in elementary data types such as integer, real, Boolean and char. They also have some built-in structured data types such as string, array and record. These are made up of a number of elements of a specified type such as integer, real or string.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Chapter 33 – Arrays, tuples and records
1-dimensional arrays

A

An array is defined as a finite, ordered set of elements of the same type, such as integer, real or char. Finite means that there is a specific number of elements in the array. Ordered implies that there is a first, second, third etc. element of the array.
For example, (assuming the first element of the array is myArray[0]:
myArray = [51, 72, 35, 37, 0, 3]
x = myArray[2] #assigns 35 to x

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Chapter 33 – Arrays, tuples and records
2-dimensional arrays

A

An array can have two or more dimensions. A two-dimensional array can be visualised as a table, rather like a spreadsheet.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Chapter 33 – Arrays, tuples and records
Arrays of three dimensions

A

Arrays may have more than two dimensions. An n-dimensional array is a set of elements of the same type, indexed by n integers. In a 3-dimensional array x, a particular element may be referred to as x[4,5,2], for example. The first element would be referred to as x[0,0,0].

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Chapter 33 – Arrays, tuples and records
Tuples

A

A tuple is an ordered set of values, which could be elements of any type such as strings, integers or real numbers, or even graphic images, sound files or arrays. Unlike arrays, the elements do not all have to be of the same type. However, a tuple, like a string, is immutable, which means that its elements cannot be changed, and you cannot dynamically add elements to or delete elements from a tuple.

In Python a tuple is written in parentheses, for example:
pupil = (“John”, 78, “a”)
You can refer to individual elements of a tuple, for example:
name = pupil[0]
but the following staement is invalid:
pupil[0] = “Mary”

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Chapter 33 – Arrays, tuples and records
Records

A

If you want to store data permanently so that you can read or update it at a future date, the data needs to be stored in a file on disk. The most common way of storing large amounts of data conveniently is to use a database, but sometimes you need to create and interrogate your own files.

Generally, a file consists of a number of records. A record contains a number of fields, each holding one item of data. For example, in a file holding data about students, you might have the following record structure:
ID Firstname Surname DateOfBirth Class
1453 Gemma Baines 01/05/2004 2G
1768 Paul Gerrard 17/11/2003 2G
2016 Brian Davidson 03/08/2002 3H

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Chapter 34 – Queues
Abstract data types

A

An abstract data type is one that is created by the programmer, rather than defined within the programming language. They include structures such as queues, stacks, trees and graphs. These can easily be shown in graphical form, and it is not hard to understand how to perform operations such as adding, deleting or counting elements in each structure. However, programming languages require data types to represent them.

An abstract data type (ADT) is a logical description of how the data is viewed and the operations that can be performed on it, but how this is to be done is not necessarily known to the user. It is up to the programmer who creates the data structure to decide how to implement it, and it may be built in to the programming language. This is a good example of data abstraction, and by providing this level of abstraction we are creating an encapsulation around the data, hiding the details
of implementation from the user.

As a programmer, you will be quite familiar with this concept. When you call a built-in function such as random to generate a random number, or sqrt to find the square root of a number, you are not at all concerned with how these functions are implemented.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Chapter 34 – Queues
Queues

A

A queue is a First In First Out (FIFO) data structure. New elements may only be added to the end of a queue, and elements may only be retrieved from the front of a queue. The sequence of data items in a queue is determined, therefore, by the order in which they are inserted. The size of the queue depends on the number of items in it, just like a queue at traffic lights or at a supermarket checkout.

Queues are used in a variety of applications:
- Output waiting to be printed is commonly stored in a queue on disk. In a room full of networked computers, several people may send work to be printed at more or less the same time. By putting the output into a queue on disk, the output is printed on a first come, first served basis as soon as the printer is free.
- Characters typed at a keyboard are held in a queue in a keyboard buffer.
- Queues are useful in simulation problems. A simulation program is one which attempts to model a real-life situation so as to learn something about it. An example is a program that simulates customers arriving at random times at the check-outs in a supermarket store, and taking random times to pass through the checkout. With the aid of a simulation program, the optimum number of check-out counters can be established.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Chapter 34 – Queues
Operations on a queue

A

The abstract data type queue is defined by its structure and the operations which can be performed on it. It is described as an ordered collection of items which are added at the rear of the queue, and removed from the front.

The following queue operations are needed:
- enQueue(item) Add a new item to the rear of the queue
- deQueue() Remove the front item from the queue and return it
- isEmpty() Test to see whether the queue is empty
- isFull() Test to see whether queue is full

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Chapter 34 – Queues
Dynamic vs static data structures

A

A dynamic data structure refers to a collection of data in memory that has the ability to grow or shrink in size. It does this with the aid of the heap, which is a portion of memory from which space is automatically allocated or de-allocated as required.

Languages such as Python, Java and C support dynamic data structures, such as the built-in list data type in Python. A potential drawback of using a dynamic data structure is that the data structure may cause overflow if it exceeds the maximum memory limit.

Dynamic data structures are very useful for implementing data structures such as queues when the maximum size of the data structure is not known in advance. The queue can be given some arbitrary maximum to avoid causing memory overflow, but it is not necessary to allocate space in advance. A
further advantage of using a built-in dynamic data structure such as a list is that many methods or functions such as append, remove, length, insert, search and pop may already be written and can be used in the implementation of other data structures such as a queue or stack.

A static data structure such as an array is fixed in size, and cannot increase in size or free up memory while the program is running. An array is suitable for storing a fixed number of items such as the months of the year, monthly sales or average monthly temperatures. The disadvantage of using an array to implement a dynamic data structure such as a queue is that the size of the array has to be decided in advance by the programmer, and if the number of items added fills up the array, then no more can be added, regardless of how much free space there is in memory. Python does not have a built-in array data structure.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Chapter 34 – Queues
Implementing a linear queue

A

There are basically two ways to implement a linear queue in an array or list:

  1. As items leave the queue, all of the other items move up one space so that the front of the queue is always the first element of the structure, e.g. q[0]. With a long queue, this may require significant processing time.
  2. A linear queue can be implemented as an array with pointers to the front and rear of the queue. An integer holding the size of the array (the maximum size of the queue) is needed, as well as a variable giving the number of items currently in the queue. However, clearly a problem will arise as many items are added to and deleted from the queue, as space is created at the front of the queue which cannot be filled, and items are added until the rear pointer points to the last element of the data structure.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Chapter 34 – Queues
A circular queue

A

One way of overcoming the limitation of a static data structure such as an array is to implement the queue as a circular queue, so that when the array fills up and the rear pointer points to the last element of the array, say q[5], it will be made to point to the first element, q[0], when the next person joins the queue, assuming this element is empty. This solution requires some extra effort on the part of the programmer, and is less flexible than a dynamic data structure if the maximum number of items is not known in advance.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Chapter 34 – Queues
Priority queues

A

In some situations where items are placed in a queue, a system of priorities is used. For example an operating system might schedule jobs in order of priority, or a printer may give shorter print jobs priority over longer ones.

A priority queue acts like a queue in that items are dequeued by removing them from the front of the queue. However, the logical order of items within the queue is determined by their priority, with the highest priority items at the front of the queue and the lowest priority items at the back. It is therefore possible that a new item joins the queue at the front, rather than at the rear.

Such a queue could be implemented by checking the priority of each item in the queue, starting at the rear and moving it along one place until an item with the same or lower priority is found, at which point the new item can be inserted.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Chapter 35 – Lists and linked lists
Definition of a list

A

In computer science, a list is an abstract data type consisting of a number of items in which the same item may occur more than once. The list is sequenced so can refer to the first, second, third, item and we can also refer to the last element of the list.
A list is a very useful data type for a wide variety of operations, and can be used, for example, to implement other data structures such as a queue, stack or tree.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Chapter 35 – Lists and linked lists
Operations on lists

A

List operation, Description, Example, list contents, Return
value
isEmpty(), Test for empty list, a.isEmpty(), [45, 13, 19, 13, 8], False

append(item), Add a new item to list to the end of the list, a.append(33), [45, 13, 19, 13, 8, 33]

remove(item), Remove the first occurrence of an item from list, a.remove(13), [45, 19, 13, 8, 33]

search(item), Search for an item in list, a.search(22), [45, 19, 13, 8, 33], False

length(), Return the number of items, a.length(), [45, 19, 13, 8, 33], 5

index(item), Return the position of item, a.index(8), [45, 19, 13, 8, 33], 3

insert(pos,item), Insert a new item at position pos a.insert(2,7), [45, 19, 7, 13, 8, 33]

pop() Remove and return the last item in the list, a.pop(), [45, 19, 7, 13, 8], 33

pop(pos), Remove and return the item at position, pos a.pop(1), [45, 7, 13, 8], 18

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Chapter 35 – Lists and linked lists
Using an array

A

It is possible to maintain an ordered collection of data items using an array, which is a static data structure. This may be an option if the programming language does not support the list data type and if the maximum number of data items is small, and is known in advance.
The programmer then has to work out and code algorithms for each list operation. The empty array must be declared in advance as being a particular length, and this could be used, for example, to hold a priority queue.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Chapter 35 – Lists and linked lists
Inserting a new name in the list

A

If the list needs to be held in sequential order in the array, the algorithm could first determine where a new item has to be added, and then if necessary, start at the end of the list and move the rest of the items along in order to make room for it.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Chapter 35 – Lists and linked lists
Linked lists

A

A linked list is a dynamic data structure used to hold a sequence, as described below:
- The items which form the sequence are not necessarily held in contiguous data locations, or in the order in which they occur in the sequence
- Each item in the list is called a node and contains a data field and a next address field called a link or pointer field (the data field may consist of several subfields.)
- The data field holds the actual data associated with the list item, and the pointer field contains the address of the next item in the sequence
- The link field in the last item indicates that there are no further items by the use of a null pointer
- Associated with the list is a pointer variable which points to (i.e. contains the address of) the first node in the list

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Chapter 36 – Stacks
Concept of a stack

A

A stack is a Last In, First Out (LIFO) data structure. This means that, like a stack of plates in a cafeteria, items are added to the top and removed from the top.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Chapter 36 – Stacks
Applications of stacks

A

A stack is an important data structure in Computing. Stacks are used in calculations, and to hold return addresses when subroutines are called. When you use the Back button in your Web browser, you will be taken back through the previous pages that you looked at, in reverse order as their URLs are removed from the stack and reloaded. When you use the Undo button in a word processing package, the last operation you carried out is popped from the stack and undone.

21
Q

Chapter 36 – Stacks
Implementation of a stack

A

A stack may be implemented as either a static or dynamic data structure. A static data structure such as an array can be used with two additional variables, one being a pointer to the top of the stack and the other holding the size of the array (the maximum size of the stack).

22
Q

Chapter 36 – Stacks
Operations on a stack

A

The following operations are required to implement a stack:
- push(item) adds a new item to the top of the stack
- pop() removes and returns the top item from the stack
- peek() returns the top item from the stack but does not remove it
- isEmpty() tests to see whether the stack is empty, and returns a Boolean value
- isFull() tests to see whether the stack is full, and returns a Boolean value

23
Q

Chapter 36 – Stacks
Overflow and underflow

A

A stack will always have a maximum size, because memory cannot grow indefinitely. If the stack is implemented as an array, a full stack can be tested for by examining the value of the stack pointer. An attempt to push another item onto the stack would cause overflow so an error message can be given to the user to avoid this. Similarly, if the stack pointer is -1, the stack is empty and underflow will occur if an attempt is made to pop an item.

24
Q

Chapter 36 – Stacks
Functions of a call stack

A

A major use of the stack data structure is to store information about the active subroutines while a computer program is running. The details are hidden from the user in all high level languages.

25
Q

Chapter 36 – Stacks
Holding return addresses

A

The call stack keeps track of the address of the instruction that control should return to when a subroutine ends (the return address). Several subroutines may be nested, so that the stack may contain several return addresses which will be popped as each subroutine completes. For example, a subroutine which draws a robot may call subroutines drawCircle, drawRectangle etc. Subroutine drawRectangle may in turn call a subroutine drawLine. A recursive subroutine may contain several calls to itself, so that with each call, a new item (the return address) is pushed onto the stack.

When the recursion finally ends, the return addresses that have been pushed onto the stack each time the routine is called are popped one after the other, each time the end of the subroutine is reached. If the programmer makes an error and the recursion never ends, sooner or later memory will run out, the stack will overflow and the program will crash.

26
Q

Chapter 36 – Stacks
Holding parameters

A

Parameters required for a subroutine (such as, for example, the centre coordinates, line colour and thickness for a circle subroutine) may be held on the call stack. Each call to a subroutine will be given separate space on the call stack for these values.

27
Q

Chapter 36 – Stacks
Local variables

A

A subroutine frequently uses local variables which are known only within the subroutine. These may also be held in the call stack. Each separate call to a subroutine gets its own space for its local variables. Storing local variables on the call stack is much more efficient than using dynamic memory allocation, which uses heap space.

28
Q

Chapter 36 – Stacks
The stack frame

A

A call stack is composed of stack frames. Each stack frame corresponds to a call to a subroutine which has not yet terminated.

29
Q

Chapter 37 – Hash tables
Hashing

A

Large collections of data, for example customer records in a database, need to be accessible very quickly without having to look through all the records. This can be done by holding an index of the physical address on the file where the data is held. But how is the index created?

The answer is that a hashing algorithm is applied to the value in the key field of each record to transform it into an address. Normally there are many more possible keys than actual records that need to be stored. For example, if 300 records are to be stored, each having a unique 6-digit identifier or key, 1000 free spaces may be allocated to store the records.

30
Q

Chapter 37 – Hash tables
Hash table

A

A hash table is a collection of items stored in such a way that they can quickly be located. The hash table could be implemented as an array or list of a given size with a number of empty spaces. An empty hash table that can store a maximum of 11 items is shown below, with spaces labelled 0,1, 2,…10.

31
Q

Chapter 37 – Hash tables
Searching for an item

A

When searching for an item, these steps are followed:
- apply the hashing algorithm
- examine the resulting cell in the list
- if the item is there, return the item
- if the cell is empty, the item is not in the table
- if there is another item in that spot, keep moving forward until either the item is found or a blank cell is encountered, when it is apparent that the item is not in the table

32
Q

Chapter 37 – Hash tables
Folding method

A

divides the item into equal parts, and adds the parts to give the hash value.

33
Q

Chapter 37 – Hash tables
Collision resolution

A

The fuller the hash table becomes, the more likely it is that there will be collisions, and this needs to be taken into account when designing the hashing algorithm and deciding on the table size. For example, the size of the table could be designed so that when all the items are stored, only 70% of the table’s cells are occupied.

Rehashing is the name given to the process of finding an empty slot when a collision has occurred. The rehashing algorithm used above simply looks for the next empty slot. It will loop round to the first cell if the table of the end is reached. A variation on this would be to look at every third cell, for example (the “plus 3” rehash). Alternatively, the hash value could be incremented by 1, 3, 5, 7… until a free space is found.

Different hashing and rehashing methods will work more efficiently on different data sets – the aim is to minimise collisions.

34
Q

Chapter 37 – Hash tables
Uses of hash tables

A

Hash tables are primarily used for efficient lookup, so that for example an index would typically be organised as a hash table. A hash table could be used to look up, say a person’s telephone number given their name, or vice versa. They can also be used to store data such as user codes and encrypted passwords that need to be looked up and verified quickly.

Hash tables are used in the implementation of the data structure called a dictionary, which is discussed below.

35
Q

Chapter 37 – Hash tables
Dictionaries

A

A dictionary is an abstract data type consisting of associated pairs of items, where each pair consists of a key and a value. It is a built-in data structure in Python and Visual Basic, for example. When the user supplies the key, the associated value is returned. Items can easily be amended, added to or removed from the dictionary as required.

In Python, dictionaries are written as comma-delimited pairs in the format key:value and enclosed in curly braces. For example:
IDs = {342:’Harry’, 634:’Jasmine’, 885:’Max’,571:’Sheila’}

36
Q

Chapter 37 – Hash tables
Operations on dictionaries

A

It is possible to implement a dictionary using either a static or a dynamic data structure. The implementation needs to include the following operations:
- Create a new empty dictionary
- Add a new key:value pair to the dictionary
- Delete a key:value pair from the dictionary
- Amend the value in a key:value pair
- Return a value associated with key k
- Return True or False depending on whether key is in the dictionary
- Return the length of the dictionary

37
Q

Chapter 38 – Graphs
Definition of a graph

A

A graph is a set of vertices or nodes connected by edges or arcs. The edges may be one-way or two way. In an undirected graph, all edges are bidirectional. If the edges in a graph are all one-way, the graph is said to be a directed graph or digraph.

The weights in this example represent distances between towns. A human driver can find their way from one town to another by following a map, but a computer needs to represent the information about distances and connections in a structured, numerical representation.

38
Q

Chapter 38 – Graphs
The adjacency matrix

A

A two-dimensional array can be used to store information about a directed or undirected graph. Each of the rows and columns represents a node, and a value stored in the cell at the intersection of row i, column j indicates that there is an edge connecting node i and node j.

39
Q

Chapter 38 – Graphs
Advantages and disadvantages of the adjacency matrix

A

An adjacency matrix is very convenient to work with, and adding an edge is very simple. However, a sparse graph with many nodes but not many edges will leave most of the cells empty, and the larger the graph, the more memory space will be wasted. Another consideration is that using a static two- dimensional array, it is harder to add or delete nodes.

40
Q

Chapter 38 – Graphs
The adjacency list

A

An adjacency list is a more space-efficient way to implement a sparsely connected graph. A list of all the nodes is created, and each node points to a list of all the adjacent nodes to which it is directly linked. The adjacency list can be implemented as a list of dictionaries, with the key in each dictionary being the node and the value, the edge weight.

41
Q

Chapter 38 – Graphs
Traversing a graph

A

There are two ways to traverse a graph so that every node is visited.
- A depth-first traversal
- A breadth-first traversal

42
Q

Chapter 38 – Graphs
Applications of graphs

A

Graphs may be used to represent, for example:
- computer networks, with nodes representing computers and weighted edges representing the bandwidth between them
- roads between towns, with edge weights representing distances, rail fares or journey times
- tasks in a project, some of which have to be completed before others
- states in a finite state machine
- web pages and links

43
Q

Chapter 39 – Trees
Concept of a tree

A

Trees are a very common data structure in many areas of computer science and other contexts. Like a tree in nature, a rooted tree has a root, branches and leaves, the difference being that a tree in computer science has its root at the top and its leaves at the bottom.
Typical uses for rooted trees include:
- manipulating hierarchical data, such as folder structures or moves in a game
- making information easy to search
- manipulating sorted lists of data

44
Q

Chapter 39 – Trees
A binary search tree

A

A binary tree is a rooted tree in which each node has a maximum of two children. A binary search tree holds items in such a way that the tree can be searched quickly and easily for a particular item, new items can be easily added, and the whole tree can be printed out in sequence. A binary search tree is a typical use of a rooted tree.

45
Q

Chapter 39 – Trees
Traversing a binary tree

A

There are three ways of traversing a tree:
- Pre-order traversal
- In-order traversal
- Post-order traversal
The names refer to whether the root of each sub-tree is visited before, between or after both branches have been traversed.

46
Q

Chapter 39 – Trees
Implementation of a binary search tree

A

A binary search tree can be implemented using an array of records, with each node consisting of:
- left pointer
- data item
- right pointer
Alternatively, it could be held in a list of tuples, or three separate lists or arrays, one for each of the pointers and one for the data items.

47
Q

Chapter 39 – Trees
Tree traversal algorithms

A

We have looked at three tree traversal algorithms: in-order, pre-order and post-order. The pseudocode algorithm for each of these traversals is recursive. If you have not yet covered recursion, you may like to leave the pseudocode algorithms to be covered at a later date.
The algorithm for an in-order traversal is:
- traverse the left subtree
- visit the root node
- traverse the right subtree

48
Q

Chapter 39 – Trees
Tree traversal algorithms

A

We have looked at three tree traversal algorithms: in-order, pre-order and post-order. The pseudocode algorithm for each of these traversals is recursive. If you have not yet covered recursion, you may like to leave the pseudocode algorithms to be covered at a later date.
The algorithm for an in-order traversal is:
- traverse the left subtree
- visit the root node
- traverse the right subtree