Algorithms - CL1 Flashcards
Algorithm complexity (understanding, big O notation, complexity of common algorithms)
In mathematics and computer science, an algorithm (/ˈælɡərɪðəm/) is a sequence of instructions, typically to solve a class of problems or perform a computation. Algorithms are unambiguous specifications for performing calculation, data processing, automated reasoning, and other tasks. As an effective method, an algorithm can be expressed within a finite amount of space and time[1] and in a well-defined formal language[2] for calculating a function.[3] Starting from an initial state and initial input (perhaps empty),[4] the instructions describe a computation that, when executed, proceeds through a finite[5] number of well-defined successive states, eventually producing "output"[6] and terminating at a final ending state. The transition from one state to the next is not necessarily deterministic; some algorithms, known as randomized algorithms, incorporate random input.[7]

The concept of algorithm has existed for centuries. Greek mathematicians used algorithms in the sieve of Eratosthenes for finding prime numbers, and the Euclidean algorithm for finding the greatest common divisor of two numbers.[8] The word algorithm itself is derived from the name of the 9th-century mathematician Muḥammad ibn Mūsā al-Khwārizmī, Latinized Algoritmi.

A partial formalization of what would become the modern concept of algorithm began with attempts to solve the Entscheidungsproblem (decision problem) posed by David Hilbert in 1928. Later formalizations were framed as attempts to define "effective calculability"[9] or "effective method".[10] Those formalizations included the Gödel–Herbrand–Kleene recursive functions of 1930, 1934 and 1935, Alonzo Church's lambda calculus of 1936, Emil Post's Formulation 1 of 1936, and Alan Turing's Turing machines of 1936–37 and 1939.
The idea behind big O notation
Big O notation is the language we use for talking about how long an algorithm takes to run. It’s how we compare the efficiency of different approaches to a problem.
It’s like math except it’s an awesome, not-boring kind of math where you get to wave your hands through the details and just focus on what’s basically happening.
With big O notation we express the runtime in terms of—brace yourself—how quickly it grows relative to the input, as the input gets arbitrarily large.
Let’s break that down:
1. how quickly the runtime grows—It’s hard to pin down the exact runtime of an algorithm. It depends on the speed of the processor, what else the computer is running, etc. So instead of talking about the runtime directly, we use big O notation to talk about how quickly the runtime grows.
2. relative to the input—If we were measuring our runtime directly, we could express our speed in seconds. Since we’re measuring how quickly our runtime grows, we need to express our speed in terms of…something else. With big O notation, we use the size of the input, which we call “n.” So we can say things like the runtime grows “on the order of the size of the input” (O(n)) or “on the order of the square of the size of the input” (O(n^2)).
3. as the input gets arbitrarily large—Our algorithm may have steps that seem expensive when n is small but are eclipsed eventually by other steps as n gets huge. For big O analysis, we care most about the stuff that grows fastest as the input grows, because everything else is quickly eclipsed as n gets very large. (If you know what an asymptote is, you might see why “big O analysis” is sometimes called “asymptotic analysis.”)
If this seems abstract so far, that’s because it is. Let’s look at some examples.
Some examples

def print_first_item(items):
    print(items[0])
This function runs in O(1) time (or “constant time”) relative to its input. The input list could be 1 item or 1,000 items, but this function would still just require one “step.”
def print_all_items(items):
    for item in items:
        print(item)
This function runs in O(n) time (or “linear time”), where n is the number of items in the list. If the list has 10 items, we have to print 10 times. If it has 1,000 items, we have to print 1,000 times.
def print_all_possible_ordered_pairs(items):
    for first_item in items:
        for second_item in items:
            print(first_item, second_item)
Here we’re nesting two loops. If our list has n items, our outer loop runs n times and our inner loop runs n times for each iteration of the outer loop, giving us n^2 total prints. Thus this function runs in O(n^2) time (or “quadratic time”). If the list has 10 items, we have to print 100 times. If it has 1,000 items, we have to print 1,000,000 times.
N could be the actual input, or the size of the input
Both of these functions have O(n) runtime, even though one takes an integer as its input and the other takes a list:
def say_hi_n_times(n):
    for time in range(n):
        print("hi")
def print_all_items(items):
    for item in items:
        print(item)
So sometimes n is an actual number that’s an input to our function, and other times n is the number of items in an input list (or an input map, or an input object, etc.).
Drop the constants
This is why big O notation rules. When you’re calculating the big O complexity of something, you just throw out the constants. So like:
def print_all_items_twice(items):
    for item in items:
        print(item)

    # Once more, with feeling
    for item in items:
        print(item)
This is O(2n), which we just call O(n).
def print_first_item_then_first_half_then_say_hi_100_times(items):
    print(items[0])

    middle_index = len(items) // 2
    index = 0
    while index < middle_index:
        print(items[index])
        index += 1

    for time in range(100):
        print("hi")
This is O(1 + n/2 + 100), which we just call O(n).
Why can we get away with this? Remember, for big O notation we’re looking at what happens as n gets arbitrarily large. As n gets really big, adding 100 or dividing by 2 has a decreasingly significant effect.
Drop the less significant terms
For example:
def print_all_numbers_then_all_pair_sums(numbers):
    print("these are the numbers:")
    for number in numbers:
        print(number)

    print("and these are their sums:")
    for first_number in numbers:
        for second_number in numbers:
            print(first_number + second_number)
Here our runtime is O(n + n^2), which we just call O(n^2). Even if it was O(n^2/2 + 100n), it would still be O(n^2).
Similarly:
O(n^3 + 50n^2 + 10000) is O(n^3)
O((n + 30) * (n + 5)) is O(n^2)

Again, we can get away with this because the less significant terms quickly become, well, less significant as n gets big.
We’re usually talking about the “worst case”
Often this “worst case” stipulation is implied. But sometimes you can impress your interviewer by saying it explicitly.
Sometimes the worst case runtime is significantly worse than the best case runtime:
def contains(haystack, needle):
    # Does the haystack contain the needle?
    for item in haystack:
        if item == needle:
            return True
    return False
Here we might have 100 items in our haystack, but the first item might be the needle, in which case we would return in just 1 iteration of our loop.
In general we’d say this is O(n) runtime and the “worst case” part would be implied. But to be more specific we could say this is worst case O(n) and best case O(1) runtime. For some algorithms we can also make rigorous statements about the “average case” runtime.
Space complexity: the final frontier
Sometimes we want to optimize for using less memory instead of (or in addition to) using less time. Talking about memory cost (or “space complexity”) is very similar to talking about time cost. We simply look at the total size (relative to the size of the input) of any new variables we’re allocating.
This function takes O(1) space (we use a fixed number of variables):
def say_hi_n_times(n):
    for time in range(n):
        print("hi")
This function takes O(n) space (the size of hi_list scales with the size of the input):
def list_of_hi_n_times(n):
    hi_list = []
    for time in range(n):
        hi_list.append("hi")
    return hi_list
Usually when we talk about space complexity, we’re talking about additional space, so we don’t include space taken up by the inputs. For example, this function takes constant space even though the input has n items:
def get_largest_item(items):
    largest = float('-inf')
    for item in items:
        if item > largest:
            largest = item
    return largest
Sometimes there’s a tradeoff between saving time and saving space, so you have to decide which one you’re optimizing for.
Big O analysis is awesome except when it’s not
You should make a habit of thinking about the time and space complexity of algorithms as you design them. Before long this’ll become second nature, allowing you to see optimizations and potential performance issues right away.
Asymptotic analysis is a powerful tool, but wield it wisely.
Big O ignores constants, but sometimes the constants matter. If we have a script that takes 5 hours to run, an optimization that divides the runtime by 5 might not affect big O, but it still saves you 4 hours of waiting.
Beware of premature optimization. Sometimes optimizing time or space negatively impacts readability or coding time. For a young startup it might be more important to write code that’s easy to ship quickly or easy to understand later, even if this means it’s less time and space efficient than it could be.
But that doesn’t mean startups don’t care about big O analysis. A great engineer (startup or otherwise) knows how to strike the right balance between runtime, space, implementation time, maintainability, and readability.
You should develop the skill to see time and space optimizations, as well as the wisdom to judge if those optimizations are worthwhile.
Links:
https://www.interviewcake.com/article/python/big-o-notation-time-and-space-complexity?
https://www.bigocheatsheet.com/
Array sorting methods (bubble sort, quick sort, merge sort)
A sorted array is an array that is in a particular order. For example, [a, b, c, d] is sorted alphabetically, [1, 2, 3, 4, 5] is a list of integers sorted in increasing order, and [5, 4, 3, 2, 1] is a list of integers sorted in decreasing order.
A sorting algorithm takes an array as input and outputs a permutation of that array that is sorted.
There are two broad types of sorting algorithms: integer sorts and comparison sorts.
Comparison Sorts
Comparison sorts compare elements at each step of the algorithm to determine if one element should be to the left or right of another element.
Comparison sorts are usually more straightforward to implement than integer sorts, but comparison sorts are limited by a lower bound of O(n log n), meaning that, on average, comparison sorts cannot be faster than O(n log n). A lower bound for an algorithm is the worst-case running time of the best possible algorithm for a given problem. The “on average” part here is important: there are many algorithms that run very quickly if the input list is already sorted, or has some very particular (and overall unlikely) property. There is only one permutation of a list that is sorted, but n! possible lists, so the chance that the input is already sorted is very small, and on average, the list will not be very sorted.
Integer Sorts
Integer sorts are sometimes called counting sorts (though there is a specific integer sort algorithm called counting sort). Integer sorts do not make comparisons, so they are not bounded by Ω(n log n). Integer sorts determine, for each element x, how many elements are less than x. If there are 14 elements that are less than x, then x will be placed in the 15th slot. This information is used to place each element into the correct slot immediately—no need to rearrange lists.
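A minimal sketch of this idea in Python, assuming the keys are small non-negative integers (the counting_sort name and the max_value parameter are illustrative, not taken from the source):

def counting_sort(values, max_value):
    # counts[v] = how many times the key v appears in the input.
    counts = [0] * (max_value + 1)
    for v in values:
        counts[v] += 1

    # Rebuild the list in order; every element lands in its final slot
    # without ever being compared against another element.
    result = []
    for v in range(max_value + 1):
        result.extend([v] * counts[v])
    return result

print(counting_sort([3, 1, 4, 1, 5], max_value=5))  # [1, 1, 3, 4, 5]

Because the work is proportional to the number of elements plus the size of the key range, this runs in O(n + k) time rather than O(n log n).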
Properties of Sorting Algorithms
All sorting algorithms share the goal of outputting a sorted list, but the way that each algorithm goes about this task can vary. When working with any kind of algorithm, it is important to know how fast it runs and in how much space it operates—in other words, its time complexity and space complexity. As shown in the section above, comparison-based sorting algorithms have a time complexity of Ω(n log n), meaning the algorithm can’t be faster than n log n. However, usually, the running time of algorithms is discussed in terms of big O, and not Omega. For example, if an algorithm has a worst-case running time of O(n log n), then it is guaranteed that the algorithm will never be slower than O(n log n), and if an algorithm has an average-case running time of O(n^2), then on average, it will not be slower than O(n^2).
The running time describes how many operations an algorithm must carry out before it completes. The space complexity describes how much space must be allocated to run a particular algorithm. For example, if an algorithm takes in a list of size n and, for some reason, makes a new list of size n for each element of the input, the algorithm needs n^2 space.
Additionally, for sorting algorithms, it is sometimes useful to know whether a sorting algorithm is stable, that is, whether it keeps elements that compare as equal in their original relative order.
Common Sorting Algorithms
There are many different sorting algorithms, with various pros and cons. Here are a few examples of common sorting algorithms.
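The linked article’s per-algorithm walkthroughs are not reproduced here, but as a rough sketch of the three sorts named in this card’s topic (bubble sort, merge sort, quick sort), the following minimal Python versions illustrate the usual trade-offs; the function names and the middle-element pivot choice are illustrative, not taken from the source:

def bubble_sort(items):
    # Repeatedly swap adjacent out-of-order pairs: O(n^2) time, O(1) extra space.
    items = list(items)
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
    return items

def merge_sort(items):
    # Split in half, sort each half, merge: O(n log n) time, O(n) extra space.
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

def quick_sort(items):
    # Partition around a pivot and recurse: O(n log n) on average, O(n^2) worst case.
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]
    return (quick_sort([x for x in items if x < pivot])
            + [x for x in items if x == pivot]
            + quick_sort([x for x in items if x > pivot]))

print(bubble_sort([5, 2, 4, 1]))  # [1, 2, 4, 5]
print(merge_sort([5, 2, 4, 1]))   # [1, 2, 4, 5]
print(quick_sort([5, 2, 4, 1]))   # [1, 2, 4, 5]

Note that merge_sort as written is also stable (the <= keeps equal elements in their original order), while this quick_sort copies sub-lists for clarity; production quick sorts usually partition in place.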
Link:
https://brilliant.org/wiki/sorting-algorithms/
Tree structure (construction, traversal)
The tree is one of the most powerful of the advanced data structures and it often pops up in even more advanced subjects such as AI and compiler design. Surprisingly though, the tree is important in a much more basic application - namely the keeping of an efficient index.
Whenever you use a database there is a 99% chance that an index is involved somewhere. The simplest type of index is a sorted listing of the key field. This provides a fast lookup because you can use a binary search to locate any item without having to look at each one in turn.
The trouble with a simple ordered list only becomes apparent once you start adding new items and have to keep the list sorted - it can be done reasonably efficiently but it takes some advanced juggling. A more important defect in these days of networking and multi-user systems is related to the file locking properties of such an index. Basically if you want to share a linear index and allow more than one user to update it then you have to lock the entire index during each update. In other words a linear index isn’t easy to share and this is where trees come in - I suppose you could say that trees are shareable.
Binary trees
A worthwhile simplification is to consider only binary trees. A binary tree is one in which each node has at most two descendants - a node can have just one but it can’t have more than two.
Clearly each node in a binary tree can have a left and/or a right descendant. The importance of a binary tree is that it can create a data structure that mimics a “yes/no” decision making process.
For example, if you construct a binary tree to store numeric values such that each left sub-tree contains larger values and each right sub-tree contains smaller values then it is easy to search the tree for any particular value. The algorithm is simply a tree search equivalent of a binary search:
start at the root
REPEAT until you reach a terminal node
    IF value at the node = search value
        THEN found
    IF value at node < search value
        THEN move to left descendant
        ELSE move to right descendant
END REPEAT
Of course if the loop terminates because it reaches a terminal node then the search value isn’t in the tree, but the fine detail only obscures the basic principles.
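A minimal Python sketch of the same search, together with an insert routine for growing the tree as data arrives (the point made a little further on); it follows the article’s convention that larger values go into the left sub-tree, and the Node/insert/search names are illustrative:

class Node:
    def __init__(self, value):
        self.value = value
        self.left = None    # larger values, per the convention above
        self.right = None   # smaller values

def insert(root, value):
    # Walk down exactly as in the search and hang the new node off a terminal position.
    if root is None:
        return Node(value)
    if root.value < value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root

def search(root, value):
    # Tree-search equivalent of a binary search.
    node = root
    while node is not None:
        if node.value == value:
            return True
        node = node.left if node.value < value else node.right
    return False

root = None
for v in [8, 3, 10, 1, 6]:
    root = insert(root, v)
print(search(root, 6), search(root, 7))  # True False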
The next question is how the shape of the tree affects the efficiency of the search. We all have a tendency to imagine complete binary trees like the one in Figure 2a and in this case it isn’t difficult to see that in the worst case a search would have to go down to the full depth of the tree. If you are happy with maths you will know that if the tree in Figure 2a contains n items then its depth is log2 n, and so at best a tree search is as fast as a binary search.
The worst possible performance is produced by a tree like that in Figure 2b. In this case all of the items are lined up on a single branch making a tree with a depth of n. The worst case search of such a tree would take n compares which is the same as searching an unsorted linear list.
So depending on the shape of the tree search efficiency varies from a binary search of a sorted list to a linear search of an unsorted list. Clearly if it is going to be worth using a tree we have to ensure that it is going to be closer in shape to the tree in Figure 2a than that in 2b.
All a question of balance
You might at first think that the solution is always to order the nodes so that the search tree is a perfect example of the complete tree in Figure 2a.
The first problem is that not all trees have enough nodes to be complete. For example, a tree with a single node is complete but one with two nodes isn’t and so on. It doesn’t take a genius to work out that complete trees always have one less than a power of two nodes. With other numbers of nodes the best we can do is to ask that a tree’s terminal nodes are as nearly as possible on the same level.
You can think of this as trying to produce a tree with ‘branches’ of as nearly the same length as possible. In practice it turns out to be possible always to arrange a tree so that the total number of nodes in each node’s right and left sub-trees differ by one at most, see Figure 3.
Such trees are called perfectly balanced trees because they are as balanced as it is possible to be for that number of nodes. If you have been following the argument it should be obvious that the search time is at a minimum for perfectly balanced trees.
At this point it looks as though all the problems are solved. All we have to do is make sure that the tree is perfectly balanced and everything will be as efficient as it can be. Well this is true but it misses the point that ensuring that a tree is perfectly balanced isn’t easy. If you have all of the data before you begin creating the tree then it is easy to construct a perfectly balanced tree but it is equally obvious that this task is equivalent to sorting the data and so we might as well just use a sorted list and binary search approach.
The only time that a tree search is to be preferred is if the tree is built as data arrives because there is the possibility of building a well shaped search tree without sorting.
Terminology used in trees
Node
A node is a structure which may contain a value or condition, or represent a separate data structure.
Root
The top node in a tree, the prime ancestor.
Child
A node directly connected to another node when moving away from the root, an immediate descendant.
Parent
The converse notion of a child, an immediate ancestor.
Siblings
A group of nodes with the same parent.
Neighbor
Parent or child.
Descendant
A node reachable by repeated proceeding from parent to child. Also known as subchild.
Ancestor
A node reachable by repeated proceeding from child to parent.
Leaf / External node (not common)
A node with no children.
Branch node / Internal node
A node with at least one child.
Degree
For a given node, its number of children. A leaf is necessarily degree zero. The degree of a tree is the degree of its root.
Degree of tree
The degree of the root.
Edge
The connection between one node and another.
Path
A sequence of nodes and edges connecting a node with a descendant.
Distance
The number of edges along the shortest path between two nodes.
Depth
The distance between a node and the root.
Level
1 + the number of edges between a node and the root, i.e. (Depth + 1)
Height
The number of edges on the longest path between a node and a descendant leaf.
Width
The number of nodes in a level.
Breadth
The number of leaves.
Height of tree
The height of the root node or the maximum level of any node in the tree
Forest
A set of n ≥ 0 disjoint trees.
Sub Tree
A subtree of a tree T is a tree consisting of a node in T and all of its descendants in T.
Ordered Tree
A rooted tree in which an ordering is specified for the children of each vertex.
Size of a tree
Number of nodes in the tree.
Terminology
A node is a structure which may contain a value or condition, or represent a separate data structure (which could be a tree of its own). Each node in a tree has zero or more child nodes, which are below it in the tree (by convention, trees are drawn growing downwards). A node that has a child is called the child’s parent node (or ancestor node, or superior). A node has at most one parent.
An internal node (also known as an inner node, inode for short, or branch node) is any node of a tree that has child nodes. Similarly, an external node (also known as an outer node, leaf node, or terminal node) is any node that does not have child nodes.
The topmost node in a tree is called the root node. Depending on definition, a tree may be required to have a root node (in which case all trees are non-empty), or may be allowed to be empty, in which case it does not necessarily have a root node. Being the topmost node, the root node will not have a parent. It is the node at which algorithms on the tree begin, since as a data structure, one can only pass from parents to children. Note that some algorithms (such as post-order depth-first search) begin at the root, but first visit leaf nodes (access the value of leaf nodes), only visit the root last (i.e., they first access the children of the root, but only access the value of the root last). All other nodes can be reached from it by following edges or links. (In the formal definition, each such path is also unique.) In diagrams, the root node is conventionally drawn at the top. In some trees, such as heaps, the root node has special properties. Every node in a tree can be seen as the root node of the subtree rooted at that node.
The height of a node is the length of the longest downward path to a leaf from that node. The height of the root is the height of the tree. The depth of a node is the length of the path to its root (i.e., its root path). This is commonly needed in the manipulation of the various self-balancing trees, AVL Trees in particular. The root node has depth zero, leaf nodes have height zero, and a tree with only a single node (hence both a root and leaf) has depth and height zero. Conventionally, an empty tree (tree with no nodes, if such are allowed) has height −1.
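A small sketch of the height definition above, using the convention that an empty tree has height −1 and a lone node has height 0 (the TreeNode class is illustrative; depth could be computed symmetrically by walking parent pointers, if those are stored):

class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def height(node):
    # Length, in edges, of the longest downward path from node to a leaf.
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))

root = TreeNode(1, TreeNode(2, TreeNode(4)), TreeNode(3))
print(height(root))        # 2
print(height(root.right))  # 0: a leaf
print(height(None))        # -1: an empty tree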
A subtree of a tree T is a tree consisting of a node in T and all of its descendants in T.[f][21] Nodes thus correspond to subtrees (each node corresponds to the subtree of itself and all its descendants) – the subtree corresponding to the root node is the entire tree, and each node is the root node of the subtree it determines; the subtree corresponding to any other node is called a proper subtree (by analogy to a proper subset).
Drawing trees
Trees are often drawn in the plane. Ordered trees can be represented essentially uniquely in the plane, and are hence called plane trees, as follows: if one fixes a conventional order (say, counterclockwise), and arranges the child nodes in that order (first incoming parent edge, then first child edge, etc.), this yields an embedding of the tree in the plane, unique up to ambient isotopy. Conversely, such an embedding determines an ordering of the child nodes.
If one places the root at the top (parents above children, as in a family tree) and places all nodes that are a given distance from the root (in terms of number of edges: the “level” of a tree) on a given horizontal line, one obtains a standard drawing of the tree. Given a binary tree, the first child is on the left (the “left node”), and the second child is on the right (the “right node”).
Representations
There are many different ways to represent trees; common representations represent the nodes as dynamically allocated records with pointers to their children, their parents, or both, or as items in an array, with relationships between them determined by their positions in the array (e.g., binary heap).
Indeed, a binary tree can be implemented as a list of lists (a list where the values are lists): the head of a list (the value of the first term) is the left child (subtree), while the tail (the list of second and subsequent terms) is the right child (subtree). This can be modified to allow values as well, as in Lisp S-expressions, where the head (value of first term) is the value of the node, the head of the tail (value of second term) is the left child, and the tail of the tail (list of third and subsequent terms) is the right child.
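A minimal illustration of that encoding in Python, writing each node as [value, left, right] in the S-expression style just described, with None standing in for a missing child (the helper names are illustrative):

# Each node is a list: [value, left_subtree, right_subtree].
tree = ["a",
        ["b", ["d", None, None], None],  # left child "b", which has a left child "d"
        ["c", None, ["e", None, None]]]  # right child "c", which has a right child "e"

def value(node):
    return node[0]

def left(node):
    return node[1]

def right(node):
    return node[2]

print(value(left(tree)))          # b
print(value(right(right(tree))))  # e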
In general a node in a tree will not have pointers to its parents, but this information can be included (expanding the data structure to also include a pointer to the parent) or stored separately. Alternatively, upward links can be included in the child node data, as in a threaded binary tree.
Traversal methods
Main article: Tree traversal
Stepping through the items of a tree, by means of the connections between parents and children, is called walking the tree, and the action is a ‘walk’ of the tree. Often, an operation might be performed when a pointer arrives at a particular node. A walk in which each parent node is traversed before its children is called a pre-order walk; a walk in which the children are traversed before their respective parents are traversed is called a post-order walk; a walk in which a node’s left subtree, then the node itself, and finally its right subtree are traversed is called an in-order traversal. (This last scenario, referring to exactly two subtrees, a left subtree and a right subtree, assumes specifically a binary tree.) A level-order walk effectively performs a breadth-first search over the entirety of a tree; nodes are traversed level by level, where the root node is visited first, followed by its direct child nodes and their siblings, followed by its grandchild nodes and their siblings, etc., until all nodes in the tree have been traversed.
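A minimal Python sketch of the four walks just described, assuming a simple binary Node class (the names are illustrative):

from collections import deque

class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def pre_order(node, visit):
    if node is None:
        return
    visit(node.value)             # parent before its children
    pre_order(node.left, visit)
    pre_order(node.right, visit)

def in_order(node, visit):
    if node is None:
        return
    in_order(node.left, visit)    # left subtree, then the node, then right subtree
    visit(node.value)
    in_order(node.right, visit)

def post_order(node, visit):
    if node is None:
        return
    post_order(node.left, visit)
    post_order(node.right, visit)
    visit(node.value)             # children before their parent

def level_order(node, visit):
    # Breadth-first: visit level by level, starting from the root.
    queue = deque([node] if node is not None else [])
    while queue:
        current = queue.popleft()
        visit(current.value)
        if current.left is not None:
            queue.append(current.left)
        if current.right is not None:
            queue.append(current.right)

root = Node(1, Node(2, Node(4), Node(5)), Node(3))
for walk in (pre_order, in_order, post_order, level_order):
    values = []
    walk(root, values.append)
    print(walk.__name__, values)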
Common operations
Enumerating all the items
Enumerating a section of a tree
Searching for an item
Adding a new item at a certain position on the tree
Deleting an item
Pruning: Removing a whole section of a tree
Grafting: Adding a whole section to a tree
Finding the root for any node
Finding the lowest common ancestor of two nodes
Common uses
Representing hierarchical data such as syntax trees
Storing data in a way that makes it efficiently searchable (see binary search tree and tree traversal)
Representing sorted lists of data
As a workflow for compositing digital images for visual effects
Storing Barnes-Hut trees used to simulate galaxies.
Link:
https://en.wikipedia.org/wiki/Tree_(data_structure)
Binary search algorithm
Link:
https://algorithms.openmymind.net/search/binarysearch.html
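The linked article is not reproduced here, but as a minimal sketch of the standard iterative binary search over a sorted list (the textbook algorithm, not necessarily the article’s exact code):

def binary_search(sorted_items, target):
    # Repeatedly halve the search range: O(log n) time on a sorted list.
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid           # index of the target
        if sorted_items[mid] < target:
            low = mid + 1        # the target can only be in the right half
        else:
            high = mid - 1       # the target can only be in the left half
    return -1                    # not found

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
print(binary_search([1, 3, 5, 7, 9, 11], 4))  # -1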
Hash table(creating, collisions)
In computing, a hash table (hash map) is a data structure that implements an associative array abstract data type, a structure that can map keys to values. A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found.
Ideally, the hash function will assign each key to a unique bucket, but most hash table designs employ an imperfect hash function, which might cause hash collisions where the hash function generates the same index for more than one key. Such collisions must be accommodated in some way.
In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table. Many hash table designs also allow arbitrary insertions and deletions of key-value pairs, at (amortized[2]) constant average cost per operation.[3][4]
In many situations, hash tables turn out to be on average more efficient than search trees or any other table lookup structure. For this reason, they are widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches, and sets.
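As a rough sketch of the bucket-array idea above, resolving collisions by separate chaining (one common strategy; open addressing is another) and leaning on Python’s built-in hash() as the hash function; the class and method names are illustrative:

class HashTable:
    def __init__(self, num_buckets=8):
        # The array of buckets; each bucket is a list of (key, value) pairs,
        # so keys whose hashes collide on the same index can coexist.
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # Hash the key, then map the hash code onto a bucket index.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)    # overwrite an existing key
                return
        bucket.append((key, value))         # new key (possibly a collision)

    def get(self, key):
        bucket = self.buckets[self._index(key)]
        for existing_key, value in bucket:
            if existing_key == key:
                return value
        raise KeyError(key)

table = HashTable()
table.put("apple", 3)
table.put("banana", 5)
print(table.get("apple"))  # 3

A production table would also grow the bucket array and rehash once the load factor (entries per bucket) climbs too high; that resizing is what keeps the average lookup cost constant as the table fills up.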
Link:
https://en.wikipedia.org/wiki/Hash_table
Stack, queue, linked list(construction, understanding, usage)
https://www.studytonight.com/data-structures/introduction-to-linked-list.php
https://www.studytonight.com/data-structures/stack-data-structure.php
https://www.studytonight.com/data-structures/queue-data-structure.php
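The linked tutorials are not reproduced here, but as a minimal sketch of the three structures in this card’s topic, here is a hand-rolled singly linked list plus Python’s usual stand-ins for a stack (a list used LIFO) and a queue (collections.deque used FIFO); the class and method names are illustrative:

from collections import deque

class ListNode:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

class LinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, value):
        # O(1): the new node simply points at the old head.
        self.head = ListNode(value, self.head)

    def to_list(self):
        values, node = [], self.head
        while node is not None:
            values.append(node.value)
            node = node.next
        return values

linked = LinkedList()
linked.push_front("b")
linked.push_front("a")
print(linked.to_list())  # ['a', 'b']

# Stack: last in, first out (append and pop both work at the end of a list).
stack = []
stack.append(1)
stack.append(2)
print(stack.pop())       # 2

# Queue: first in, first out (deque gives O(1) removal from the front).
queue = deque()
queue.append(1)
queue.append(2)
print(queue.popleft())   # 1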