Section Twelve: Algorithms Flashcards

1
Q

Chapter 59 – Analysis and design of algorithms
Comparing algorithms

A

Algorithms may be compared on how much time they need to solve a particular problem. This is referred to as the time complexity of the algorithm. The goal is to design algorithms which run quickly while using the minimum of resources such as memory.

In order to compare the efficiency of different algorithms in terms of execution time, we need to quantify the number of basic operations or steps that the algorithm will need, in terms of the number of items to be processed.

2
Q

Chapter 59 – Analysis and design of algorithms
A linear function

A

A linear function is expressed in general terms as f(x) = ax + c

3
Q

Chapter 59 – Analysis and design of algorithms
A polynomial function

A

A polynomial function is expressed in general terms as f(x) = axᵐ + bx + c

4
Q

Chapter 59 – Analysis and design of algorithms
An exponential function

A

An exponential function takes the form f(x) = abˣ.
This function grows very large, very quickly!

5
Q

Chapter 59 – Analysis and design of algorithms
A logarithmic function

A

A logarithmic function takes the form f(x) = a logₙ x

6
Q

Chapter 59 – Analysis and design of algorithms
Permutations

A

The permutation of a set of objects is the number of ways of arranging the objects. For example, if you have 3 objects A, B and C you can choose any of A, B or C to be the first object. You then have two choices for the second object, making 3 x 2 = 6 different ways of arranging the first two objects, and then just one way of placing the third object. The six permutations are ABC, ACB, BAC, BCA, CAB, CBA.

The formula for calculating the number of permutations of four objects is 4 x 3 x 2 x 1, written 4! and spoken as “four factorial”. (Note that 10! = 3.6 million so don’t try getting 10 students to line up in all possible ways!)
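
If you want to check these figures, Python's standard library can do both calculations (math.factorial and itertools.permutations are standard library functions, not part of the textbook):

import math
from itertools import permutations

print(math.factorial(4))     # 24 ways of arranging 4 objects
print(math.factorial(10))    # 3628800, roughly 3.6 million

# Generate the six permutations of A, B and C
for p in permutations("ABC"):
    print("".join(p))        # ABC, ACB, BAC, BCA, CAB, CBA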

7
Q

Chapter 59 – Analysis and design of algorithms
Big-O notation

A

Having covered the necessary maths, we can now study the so-called Big-O notation, which is used to express the time complexity, or performance, of an algorithm. (‘O’ stands for ‘Order’.)

8
Q

Chapter 59 – Analysis and design of algorithms
O(1) (Constant time)

A

O(1) describes an algorithm that takes constant time (the same amount of time) to execute regardless of the size of the input data set.
Suppose array a has n items. The statement
length = len(a)
will take the same amount of time to execute however many items are held in the array.

9
Q

Chapter 59 – Analysis and design of algorithms
O(n) (linear time)

A

O(n) describes an algorithm whose performance will grow in linear time, in direct proportion to the size of the data set. For example, a linear search of an array of 1000 unsorted items will take 1000 times longer than searching an array of 1 item.

10
Q

Chapter 59 – Analysis and design of algorithms
O(n²) (Polynomial time)

A

O(n²) describes an algorithm whose performance is directly proportional to the square of the size of the data set. A program with two nested loops, each performed n times, will typically have an order of time complexity O(n²). The running time of the algorithm grows in polynomial time.
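
As an illustration of the nested-loop pattern described above, here is a minimal sketch (the function name is illustrative) that counts the basic operations performed for n items:

def count_pairs(data):
    # Two nested loops, each performed n times: n * n comparisons
    operations = 0
    for x in data:
        for y in data:
            operations += 1
    return operations

print(count_pairs(list(range(10))))    # 100 operations for 10 items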

11
Q

Chapter 59 – Analysis and design of algorithms
O(2ⁿ) (Exponential time)

A

O(2ⁿ) describes an algorithm where the time taken to execute will double with every additional item added to the data set. The execution time grows in exponential time and quickly becomes very large.

12
Q

Chapter 59 – Analysis and design of algorithms
O(log n) (Logarithmic time)

A

The time taken to execute an algorithm of order O(log n) (logarithmic time) will grow very slowly as the size of the data set increases. A binary search is a good example of an algorithm of time complexity O(log₂ n). Doubling the size of the data set has very little effect on the time the algorithm takes to complete.

13
Q

Chapter 59 – Analysis and design of algorithms
O(n!) (Factorial time)

A

The time taken to execute an algorithm of order O(n!) will grow very quickly, faster than O(2ⁿ).
Suppose that the problem is to find all the permutations of n letters. If n = 2, there are 2 permutations to find. If n = 6, there are 720 permutations – far more than 2ⁿ, which is only 64.

14
Q

Chapter 59 – Analysis and design of algorithms
Calculating the time complexity of an algorithm

A

Here are two different algorithms for finding the smallest element in an array called arrayX of size n. Assume the index starts at 0.

The first algorithm assigns the first value in the array to a variable called minimum. It then compares each subsequent item in the array with minimum and, if it is smaller, replaces minimum with the new lowest value.

minimum = arrayX[0]
for k = 1 to n - 1
    if arrayX[k] < minimum then
        minimum = arrayX[k]
    endif
next k

To calculate the time complexity of the algorithm in Big-O notation, we need to count the number of basic operations it performs. There is one initial statement and n - 1 if statements, so roughly 1 + n steps. However, as we have already discussed, the constant term is insignificant compared to n, so this algorithm executes in linear time and has time complexity O(n).
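
A runnable Python version of the same algorithm might look like this (Python lists are 0-indexed, matching the pseudocode):

def find_minimum(arrayX):
    minimum = arrayX[0]              # one initial statement
    for k in range(1, len(arrayX)):  # the loop body runs n - 1 times
        if arrayX[k] < minimum:
            minimum = arrayX[k]
    return minimum

print(find_minimum([7, 2, 9, 4]))    # 2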

15
Q

Chapter 60 – Searching algorithms
Linear search

A

Sometimes it is necessary to search for items in a file, or in an array in memory. If the items are not in any particular sequence, the data items have to be searched one by one until the required one is found or the end of the list is reached. This is called a linear search.
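
A minimal Python sketch of a linear search; the function name and return convention (index of the item, or -1 if absent) are illustrative assumptions rather than the textbook's:

def linear_search(items, target):
    for i in range(len(items)):
        if items[i] == target:
            return i          # found: return its position
    return -1                 # reached the end without finding it

print(linear_search([14, 2, 3, 11, 7], 11))   # 3
print(linear_search([14, 2, 3, 11, 7], 5))    # -1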

16
Q

Chapter 60 – Searching algorithms
Time complexity of linear search

A

We can determine the algorithm’s efficiency in terms of execution time, expressed in Big-O notation. To do this, you need to compute the number of operations that the algorithm will require for n items. The loop is performed n times for a list of length n, and there are two steps in the loop (an IF statement and an assignment statement), giving a total of 3 + 2n steps (including 3 steps at the start). The constant term and the coefficient of n become insignificant as n increases in size, and the time complexity of the algorithm basically depends on how often the loop has to be performed in the worst-case scenario. The linear search is therefore O(n).

17
Q

Chapter 60 – Searching algorithms
Binary search

A

The binary search is a much more efficient method of searching a list for an item than a linear search but, crucially, the items in the list must be sorted. If they are not sorted, a linear search is the only option. The algorithm works by repeatedly dividing in half the portion of the list that could contain the required data item, until either the item is found or only one item remains.

18
Q

Chapter 60 – Searching algorithms
Binary search algorithm

A

The ordered array is divided into three parts: a middle item, the first part of the array from aList[0] up to the middle item, and the second part starting after the middle item and ending with the final item in the list. The middle item is examined to see if it is equal to the sought item.

If it is not, and the middle item is greater than the sought item, the second half of the array is of no further interest; similarly, if it is smaller, the first half can be discarded. The number of items being searched is therefore halved, and the process is repeated until the sought item is found or no items remain, with either the first or the second half of the array being eliminated at each pass.
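
An iterative Python sketch of this process, assuming a sorted list aList (the variable names are illustrative):

def binary_search(aList, item):
    first = 0
    last = len(aList) - 1
    while first <= last:
        midpoint = (first + last) // 2
        if aList[midpoint] == item:
            return midpoint               # found the sought item
        elif aList[midpoint] > item:
            last = midpoint - 1           # discard the second half
        else:
            first = midpoint + 1          # discard the first half
    return -1                             # item not in the list

print(binary_search([2, 5, 8, 12, 16, 23, 38], 16))   # 4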

19
Q

Chapter 60 – Searching algorithms
Time complexity of binary search

A

The binary search halves the search area with each execution of the loop – an excellent example of a divide and conquer strategy. If we start with n items, there will be approximately n/2 items left after the first comparison, n/4 after 2 comparisons, n/8 after 3 comparisons, and n/2ⁱ after i comparisons. The number of comparisons needed to end up with a list of just one item is i, where n/2ⁱ = 1. One further comparison would be needed to check if this item is the one being searched for or not.

Solving this equation for i: n = 2ⁱ
Taking the logarithm of each side: log₂ n = i log₂ 2, giving i = log₂ n (since log₂ 2 = 1)
Therefore, the binary search is O(log n).

20
Q

Chapter 60 – Searching algorithms
A recursive algorithm

A

The basic concept of the binary search is in fact recursive, and a recursive algorithm is given below. The procedure calls itself, eventually “unwinding” when the procedure ends. When recursion is used there must always be a condition that, if true, terminates the recursive procedure, or the recursion will continue forever.

Once again, first, last and midpoint are integer variables used to index elements of the array, with first starting at 0 and last starting at the upper limit of the array index.
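
The textbook’s own listing is not reproduced on this card, but a recursive sketch consistent with the description (first starting at 0, last at the upper limit of the array index) might be:

def binary_search_rec(aList, item, first, last):
    if first > last:
        return -1                         # terminating condition: not found
    midpoint = (first + last) // 2
    if aList[midpoint] == item:
        return midpoint                   # terminating condition: found
    elif aList[midpoint] > item:
        return binary_search_rec(aList, item, first, midpoint - 1)
    else:
        return binary_search_rec(aList, item, midpoint + 1, last)

aList = [2, 5, 8, 12, 16, 23, 38]
print(binary_search_rec(aList, 23, 0, len(aList) - 1))   # 5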

21
Q

Chapter 60 – Searching algorithms
Binary tree search

A

The recursive algorithm for searching a binary tree is similar to the binary search algorithm above, except that instead of looking at the midpoint of a list, or a subset of the list, on each pass, half of the tree or subtree is eliminated each time its root is examined.

22
Q

Chapter 60 – Searching algorithms
Time complexity of binary tree search

A

Like the binary search, the number of items to be searched is halved with each pass through the algorithm. The time complexity is the same as the binary search, i.e. O(log n).

23
Q

Chapter 61 – Bubble sort and insertion sort
Sorting algorithms

A

Sorting is a very common task in data processing, and frequently the number of items may be huge, so using a good algorithm can considerably reduce the time spent on the task. There are many different sorting algorithms and we will start by looking at a simple but inefficient example.

24
Q

Chapter 61 – Bubble sort and insertion sort
Bubble sort

A

The Bubble sort is one of the most basic sorting algorithms and the simplest to understand. The basic idea is to bubble up the largest (or smallest) item to the end of the list, then the second largest, then the third largest and so on until no more swaps are needed.

Suppose you have an array of n items:
- Go through the array, comparing each item with the one next to it. If it is greater, swap them.
- The last element of the array will be in the correct place after the first pass
- Repeat n-2 times, reducing by one on each pass the number of elements to be examined (see the sketch below)
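
A straightforward Python sketch of these steps, stopping early once a pass makes no swaps:

def bubble_sort(items):
    n = len(items)
    for i in range(n - 1):
        swapped = False
        # The last i elements are already in their correct places
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:
            break            # no swaps on this pass: list is sorted
    return items

print(bubble_sort([5, 1, 4, 2, 8]))   # [1, 2, 4, 5, 8]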

25
Q

Chapter 61 – Bubble sort and insertion sort
Insertion Sort

A

This is a sorting algorithm that sorts one data item at a time. It is rather similar to how you might sort a hand of cards. The algorithm takes one data item from the list and places it in the correct location in the list. This process is repeated until there are no more unsorted data items in the list. Although more efficient than the bubble sort, it is not as efficient as the merge sort or quick sort.
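
A minimal Python sketch mirroring the “hand of cards” description above (the function name is illustrative):

def insertion_sort(items):
    for i in range(1, len(items)):
        current = items[i]          # the next "card" to place
        j = i - 1
        # Shift larger sorted items right to open a gap
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current      # insert into the correct location
    return items

print(insertion_sort([8, 3, 5, 1]))   # [1, 3, 5, 8]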

26
Q

Chapter 61 – Bubble sort and insertion sort
Time complexity of bubble and insertion sorts

A

The bubble sort requires close to n passes through the list, with each pass requiring a maximum of n – 1 swaps. It is of order O(n²).

The insertion sort also has two nested loops and so has time complexity O(n²). However, if the list is already almost sorted, the time complexity is reduced to close to O(n).

27
Q

Chapter 62 – Merge sort and quick sort
Merge sort

A

The merge sort uses a divide and conquer approach. The list is successively divided in half, forming two sublists, until each sublist is of length one. Pairs of sublists are then merged into larger sorted sublists until they are recombined into a single sorted list.

The basic steps are:
- Divide the unsorted list into n sublists, each containing one element
- Repeatedly merge sublists to produce new sorted sublists until there is only one sublist remaining (see the sketch below).
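
A recursive Python sketch of these steps; the merging logic is an illustrative implementation, not the textbook’s listing:

def merge_sort(items):
    if len(items) <= 1:
        return items                     # a one-item list is sorted
    mid = len(items) // 2
    left = merge_sort(items[:mid])       # sort each half recursively
    right = merge_sort(items[mid:])
    # Merge the two sorted halves into one sorted list
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])              # append whatever remains
    merged.extend(right[j:])
    return merged

print(merge_sort([38, 27, 43, 3, 9, 82, 10]))   # [3, 9, 10, 27, 38, 43, 82]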

28
Q

Chapter 62 – Merge sort and quick sort
Time complexity of merge sort

A

The merge sort is another example of a divide and conquer algorithm. Successively halving the list gives log n levels of sublists, but in this case all n items have to be merged at each level, so the log n has to be multiplied by a factor of n.
The time complexity is therefore O(n log n).

29
Q

Chapter 62 – Merge sort and quick sort
Space complexity

A

The amount of resources such as memory that an algorithm requires, known as the space complexity, is also a consideration when comparing the efficiency of algorithms. The bubble sort, for example, requires n memory locations for a list of size n. The merge sort, on the other hand, requires additional memory to hold the left half and right half of the list, so takes twice the amount of memory space.

30
Q

Chapter 62 – Merge sort and quick sort
Quick sort

A

The quick sort algorithm, like the merge sort, uses a divide and conquer approach to quickly reduce the size of the problem, but without using the additional storage required by the merge sort.

The steps in the quick sort are as follows (a Python sketch follows the list):
1. Select a value called the pivot value. There are different ways to choose the pivot value, but we will choose the first item in the list. The actual position where the pivot value belongs in the final sorted list, called the split point, will be used to divide the list for subsequent calls. In the chapter’s example list, 9 is the first pivot value.
2. Divide the remainder of the list into two partitions:
  - all elements less than the pivot value must be in the first partition
  - all elements greater than the pivot value must be in the second partition
3. 3 and 15 are now the pivots in the left and right partitions. Recursively repeat the process.
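
A compact Python sketch of the scheme, taking the first item as the pivot as described above. Building new lists keeps the sketch short; the usual textbook treatment partitions in place around a split point, and placing items equal to the pivot in the second partition is an assumption made here:

def quick_sort(items):
    if len(items) <= 1:
        return items
    pivot = items[0]                                # first item as pivot
    first = [x for x in items[1:] if x < pivot]     # first partition
    second = [x for x in items[1:] if x >= pivot]   # second partition
    return quick_sort(first) + [pivot] + quick_sort(second)

print(quick_sort([9, 5, 3, 15, 12, 7]))   # [3, 5, 7, 9, 12, 15]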
31
Q

Chapter 62 – Merge sort and quick sort
Advantages and disadvantages of the quick sort algorithm

A

The quicksort algorithm is extremely fast. If the partition always occurs in the middle of the list, there will be log n divisions in a list of length n, and each of the n items needs to be checked against the pivot value to find the split point. It therefore has time complexity O(n log n).

Another advantage is that, unlike the merge sort, it does not need additional memory.

A disadvantage is that if the split points are not near the middle of the list, but are close to the start or end of the list, the division will be very uneven. If the split point is, for example, the first item in the sequenced list, the division results in a list of 0 items and a list of n-1 items. The list of n-1 items divides into 0 items and n-2 items and so on. The resulting time complexity is O(n²).

If the list is very large, and recursion continues too long, it may cause stack overflow and the program will crash.

32
Q

Chapter 62 – Merge sort and quick sort
Summary of sort algorithms

A
  • Bubble sort is the slowest of the sorts, with time complexity O(n²)
  • Insertion sort is O(n²) but if the list is already almost sorted, this reduces to O(n)
  • Merge sort is O(n log n) but requires additional memory space for the merging process
  • Quick sort is generally the fastest sort, but is dependent on using a pivot that is not close to the smallest or largest elements of the list. There are several methods for selecting a pivot to ensure this does not happen. It has average time complexity O(n log n). It does not require additional memory space.
33
Q

Chapter 63 – Graph-traversal algorithms
Graph traversals

A

There are two ways to traverse a graph so that every node is visited. Each of them uses a supporting data structure to keep track of which nodes have been visited, and which node to visit next.
- A depth-first traversal uses a stack, which is implemented automatically during execution of a recursive routine to hold local variables, parameters and return addresses each time a subroutine is called. Alternatively, a non-recursive routine could be written and the stack maintained as part of the routine.
- A breadth-first traversal uses a queue.

34
Q

Chapter 63 – Graph-traversal algorithms
Depth-first traversal

A

In this traversal, we go as far down one route as we can before backtracking and taking the next route.

The following recursive subroutine dfs is called initially from the main program, which passes it a graph, defined here as an adjacency list (see Chapter 38) and implemented as a dictionary with nodes A, B, C, … as keys, and the neighbours of each node as data. Thus, in the chapter’s example graph, if "A" is the current vertex, graph["A"] returns the list ["B", "D", "E"].
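
The listing itself does not appear on this card, so here is a recursive sketch consistent with the description; the graph beyond A’s neighbours is an assumed example, not the textbook’s diagram:

def dfs(graph, current_vertex, visited=None):
    if visited is None:
        visited = []
    visited.append(current_vertex)
    for neighbour in graph[current_vertex]:
        if neighbour not in visited:
            dfs(graph, neighbour, visited)    # recursive call uses the stack
    return visited

# Assumed example graph: A's neighbours are B, D and E, as on the card
graph = {"A": ["B", "D", "E"], "B": ["A", "C"], "C": ["B"],
         "D": ["A"], "E": ["A"]}
print(dfs(graph, "A"))    # ['A', 'B', 'C', 'D', 'E']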

35
Q

Chapter 63 – Graph-traversal algorithms
Breadth-first traversal

A

With a breadth-first traversal, starting at A we first visit all the nodes adjacent to A before moving to B and repeating the process for each node at this ‘level’, before moving to the next level. Instead of a stack, a queue is used to keep track of nodes that we still have to visit. In the chapter’s diagrams, nodes are coloured pale blue when queued and dark blue when dequeued and added to the list of nodes that have been visited.

36
Q

Chapter 63 – Graph-traversal algorithms
Pseudocode algorithm for breadth-first traversal

A

The breadth-first traversal is an iterative, rather than recursive, routine. The first node (‘A’ in this example) is appended to the empty queue as soon as the subroutine is entered. A Python definition of the graph as a dictionary is given below for interest, but it is not directly used in the pseudocode, as implementations will vary in different languages.
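
The original pseudocode and Python dictionary are not reproduced on this card; a minimal iterative sketch consistent with the description (using collections.deque as the queue, and the same assumed example graph as before) could be:

from collections import deque

def bfs(graph, start):
    visited = []
    queue = deque([start])        # first node queued on entry
    discovered = {start}          # 'pale blue': queued but not yet visited
    while queue:
        vertex = queue.popleft()  # dequeue: 'dark blue', now visited
        visited.append(vertex)
        for neighbour in graph[vertex]:
            if neighbour not in discovered:
                discovered.add(neighbour)
                queue.append(neighbour)
    return visited

graph = {"A": ["B", "D", "E"], "B": ["A", "C"], "C": ["B"],
         "D": ["A"], "E": ["A"]}
print(bfs(graph, "A"))    # ['A', 'B', 'D', 'E', 'C']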

37
Q

Chapter 63 – Graph-traversal algorithms
Applications of depth-first search

A

Applications of the depth-first search include the following:
- In scheduling jobs where a series of tasks is to be performed, and certain tasks must be completed before the next one begins.
- In solving problems such as mazes, which can be represented as a graph.

38
Q

Chapter 63 – Graph-traversal algorithms
Applications of breadth-first search

A

Breadth-first searches are used to solve many real-life problems.
For example:
- A major application of a breadth-first search is to find the shortest path between two points A and B, and this will be explained in detail in the next chapter. Finding the shortest path is important in, for example, GPS navigation systems and computer networks.
- Facebook. Each user profile is regarded as a node or vertex in the graph, and two nodes are connected if they are each other’s friends. This example is considered in more depth in Chapter 72, Big Data.
- Web crawlers. A web crawler can analyse all the sites you can reach by following links out from a particular website.

39
Q

Chapter 64 – Optimisation algorithms
Optimisation problems

A

We increasingly rely on computers to find the optimum solution to a range of different problems.
For example:
- scheduling aeroplanes and staff so that air crews always have the correct minimum rest time between flights
- finding the best move in a chess problem
- timetabling classes in schools and colleges
- finding the shortest path between two points – for building circuit boards, route planning, communications networks and many other applications

Finding the shortest path from A to B has numerous applications in everyday life and in computer-related problems. For example, if you visit a site like Google Maps to get directions from your current location to a particular destination, you probably want to know the shortest route. The software that finds it for you will use representations of street maps or roads as graphs, with estimated driving times or distances as edge weights.

40
Q

Chapter 64 – Optimisation algorithms
Dijkstra’s shortest path algorithm

A

Dijkstra (pronounced dike-stra) lived from 1930 to 2002. He was a Dutch computer scientist who received the Turing award in 1972 for fundamental contributions to developing programming languages. He wrote a paper in 1968 which was published under the heading “GO TO Statement Considered Harmful” and was an advocate of structured programming.

Dijkstra’s algorithm is designed to find the shortest path between one particular start node and all other nodes in a weighted graph. Its approach is similar to a breadth-first search.

The weights could represent, for example, distances or time taken to travel between towns, or the cost of travel between airports.
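
The chapter’s worked example is not shown on this card; a compact sketch of the algorithm using a priority queue (Python’s heapq), with an assumed example graph and weights, might look like this:

import heapq

def dijkstra(graph, start):
    # Shortest known distance from start to every node
    distances = {node: float("inf") for node in graph}
    distances[start] = 0
    heap = [(0, start)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > distances[node]:
            continue                  # stale entry: a shorter route was found
        for neighbour, weight in graph[node]:
            new_dist = dist + weight
            if new_dist < distances[neighbour]:
                distances[neighbour] = new_dist
                heapq.heappush(heap, (new_dist, neighbour))
    return distances

# Assumed example: each node maps to (neighbour, edge weight) pairs
graph = {"A": [("B", 4), ("C", 2)],
         "B": [("C", 1), ("D", 5)],
         "C": [("B", 1), ("D", 8)],
         "D": []}
print(dijkstra(graph, "A"))   # {'A': 0, 'B': 3, 'C': 2, 'D': 8}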

41
Q

Chapter 64 – Optimisation algorithms
The A* algorithm

A

Dijkstra’s algorithm is a special case of a more general path-finding algorithm called the A* algorithm. Dijkstra’s algorithm has one cost function, which is the real cost value (e.g. distance) from the source node to every other node.

The A* algorithm has two cost functions:
1. g(x) – as with Dijkstra’s algorithm, this is the real cost from the source to a given node.
2. h(x) – this is the approximate cost from node x to the goal node. It is a heuristic function, meaning that it gives a good or adequate estimate, but not necessarily an exact one. The algorithm stipulates that the heuristic function should never overestimate the cost; therefore the real cost should be greater than or equal to h(x).

The total cost of each node is calculated as f(x) = g(x) + h(x).

The A* algorithm focusses only on reaching the goal node, unlike Dijkstra’s algorithm which finds the lowest cost or shortest path to every node. It is used, for example, in video games to enable characters to navigate the world.
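
To make the relationship concrete, here is a minimal sketch in which the only change from Dijkstra’s algorithm is that nodes are prioritised by f(x) = g(x) + h(x) rather than by g(x) alone; the graph and the heuristic estimates are assumptions for illustration, chosen so that h never overestimates the real cost:

import heapq

def a_star(graph, h, start, goal):
    # Priority is f(x) = g(x) + h(x); using g alone would give Dijkstra
    heap = [(h(start), 0, start, [start])]
    best_g = {start: 0}
    while heap:
        f, g, node, path = heapq.heappop(heap)
        if node == goal:
            return path, g            # goal reached with lowest f(x)
        for neighbour, weight in graph[node]:
            new_g = g + weight
            if new_g < best_g.get(neighbour, float("inf")):
                best_g[neighbour] = new_g
                heapq.heappush(heap, (new_g + h(neighbour), new_g,
                                      neighbour, path + [neighbour]))
    return None, float("inf")

# Assumed example: estimates play the role of h(x) and never overestimate
graph = {"A": [("B", 4), ("C", 2)], "B": [("D", 5)],
         "C": [("B", 1), ("D", 8)], "D": []}
estimates = {"A": 6, "B": 4, "C": 5, "D": 0}
path, cost = a_star(graph, estimates.get, "A", "D")
print(path, cost)   # ['A', 'C', 'B', 'D'] 8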