SWE Flashcards
SWE: What does it mean for function f(n) to be big-O of g(n), or f(n) in O(g(n)), or f(n) = O(g(n))
There exist a number n* and a constant c > 0 such that, for all n > n*,
f(n) ≤ c·g(n)
SWE: At a high level, what does it mean for f(x) to be big-Omega of g(x)?
It means g(x) is big-O of f(x); it reverses the relationship.
What type of bound does big-O provide on the runtime of our algorithm? What about big-Omega? What about big-Theta?
Big-O provides an upper bound
Big-Omega provides a lower bound
Big-Theta provides both an upper and lower bound, or a tight bound
SWE: What does it mean for f(n) to be big-Theta of g(n)
f(n) is both big-O of g(n) and big-Omega of g(n)
SWE: When we use big-O, are we typically describing best-case, worst-case, or expected-case run time?
Worst case. (But it can be used for the other two.)
SWE: What does asymptotic analysis do for function f(n)? What is our primary tool for asymptotic analysis?
It describes the limiting behavior for function f(n). We use big-O.
SWE: If we have two arrays with lengths a and b respectively, what is big-O run time for nested loops iterating through all elements of one array for all elements of another?
O(ab)
SWE: What are the two main rules for calculating big-O of a function f(n)?
Drop any constants, either multiplicative or additive, and drop any additive terms that are asymptotically not the largest.
SWE: What is big-O of f(n) = 4·2^n + 3n^10 + 10?
O(2^n)
SWE: What “general pattern” emerges for O(log n) runtimes?
Continually dividing n by some constant, such as 2, until you reach 1. (Or, continually multiplying a constant by another constant until you reach n.)
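Binary search is one common instance of this halving pattern; a minimal sketch:

```python
def binary_search(arr, target):
    """Return the index of target in sorted arr, or -1 if absent.

    Each step halves the remaining search space, so the runtime
    is O(log n): we keep dividing n by 2 until we reach 1.
    """
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1  # target can only be in the right half
        else:
            hi = mid - 1  # target can only be in the left half
    return -1
```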
SWE: Why can string questions be approached as array questions, and vice versa?
Because a string is just an array of characters.
SWE: What is a (singly) linked list?
A list of nodes, where each node has a value and a pointer to the next node.
SWE: What is a doubly linked list?
A list of nodes, where each node has a value and a pointer to both the previous and next node.
SWE: What is an advantage of linked lists over traditional arrays? What is a disadvantage?
Advantage: You can add and remove elements from the front of the list in constant time (whereas with a resizable array, even appends are only amortized constant time)
Disadvantage: You can no longer use an index to find a value in constant time. If you want the kth element, you must iterate through k nodes.
SWE: In a linked list, what is a null node, and what is it used for? (In 122’s particular implementation, which I’ll use)
It is a node with a null data value and a null pointer to the next node. It serves as the end of the linked list, or the final node. When we iterate through the list and find a null node, we know the list is over.
SWE: In a linked list, what is a root node?
The root node is the current first node.
SWE: How would we draw a 2 element linked list, where we have 3 nodes: a root node, a normal node, and our null node?
SWE: When we add a node to a linked list, and the linked list is drawn in our typical way, do we add the node to the left or the right? Why?
The left! Since we only keep track of the root node, adding on the left just means pointing the new node at the old root, which takes constant time.
SWE: How do we add a new node to a linked list?
Set the new node to be the root node, and set the pointer of the new node to go to the old root node.
SWE: How do you implement a linked list in python, with add and remove methods?
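One possible answer, a minimal sketch (here None stands in for the null node, and new values are added on the left, as in the cards above):

```python
class Node:
    """A singly linked list node: a value and a pointer to the next node."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

class LinkedList:
    def __init__(self):
        self.root = None  # empty list: None plays the role of the null node

    def add(self, value):
        # New node becomes the root and points to the old root: O(1).
        self.root = Node(value, self.root)

    def remove(self):
        # Remove and return the root's value.
        if self.root is None:
            raise IndexError("remove from empty linked list")
        value = self.root.value
        self.root = self.root.next
        return value
```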
SWE: Are linked list problems commonly recursive or iterative?
Recursive, because linked lists have a naturally recursive structure. (But there are many problems with iterative solutions too!)
SWE: What is the runner technique for linked lists?
You examine the linked list using two pointers: one that is “slow”, and one that is “fast”, which is generally farther along in the list and moves more quickly than the slow pointer.
This is a pretty general technique, but be aware that it exists: examining with two pointers might be handy for many problems.
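As one illustration (a sketch, assuming a simple Node class), the classic slow/fast version finds the middle node of a list in a single pass:

```python
class Node:
    """Minimal singly linked list node."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def find_middle(head):
    """Runner technique: slow advances one node per step, fast advances two.

    When fast reaches the end of the list, slow is at the middle.
    """
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
    return slow
```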
SWE: What is a hash table used for, and what makes it so useful?
A hash table is a data structure that maps keys to values; you insert key-value pairs, and then search for a key to get its value.
It is useful because it enables fast lookup of elements, in basically constant time as long as your hash table has enough available memory.
SWE: What Python data structure implements the hash table?
Dictionaries (or defaultdicts, which are convenient)
SWE: How is a hash table implemented? What components are needed, and how are they used to insert a key-value pair, or retrieve a key-value pair? (There are multiple possible implementation strategies, we just focus on one in particular.)
Our hash table is an array of linked lists. So each entry in the array is its own linked list.
We use a hash function to map keys to an entry in the array, and this mapping is used for both insertion and retrieval.
To insert pair (key,val), get your hash value h(key), and then find your array index using h(key)%m, where m is the length of the array. We use modulus to keep the index within bounds. We then add our (key,value) pair to the linked list: if the key is already there, we update the value; otherwise, we add a new node with that key-value pair.
To check the value for a key, we get our index h(key)%m again, and look for our key in the linked list at that index. If it isn’t there, then that key hasn’t been inserted.
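A minimal sketch of this design, with two simplifying assumptions: Python lists stand in for the chains (rather than hand-rolled linked lists), and Python’s built-in hash() serves as the hash function:

```python
class HashTable:
    """Hash table with separate chaining: an array of m buckets."""
    def __init__(self, m=16):
        self.buckets = [[] for _ in range(m)]

    def _index(self, key):
        # h(key) % m keeps the index within the array's bounds.
        return hash(key) % len(self.buckets)

    def insert(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # key already present: update
                return
        bucket.append((key, value))       # otherwise add a new pair

    def lookup(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        raise KeyError(key)  # key was never inserted
```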
SWE: What is a hash function? What in general makes a “good” hash function in the context of, say, a hash table?
A hash function maps some value to an integer, in a non-random way. A good hash function will “evenly distribute” values across the different integers. In the context of a hash table, it should distribute the integers evenly across each of the cells of the array. This is good because it leads to fewer long chains in the hash table.
SWE: When making a hash table, why do we need a linked list in every cell? Why can’t we just insert key-value pairs into the cell directly?
Because of collisions: if 2 pairs are mapped to the same cell, they collide, and we can’t put them both in the same cell. Hence, a linked list is used.
SWE: When discussing hash tables, what is a load factor? What does it describe
The load factor is the number of keys in the hash table divided by the length of the array. It is n/m.
It describes how “full” the array is, or how long on average the chains are: if you have a load factor well over 1, then you have several collisions and longer chains.
SWE: What is the worst-case runtime for a lookup in a hash table, in terms of the number of keys n? In what sorts of situations would this happen, and how do these situations cause this runtime?
The worst-case runtime is O(n). This would happen if your array isn’t large enough, and as a result you have some long chains in some of the array indexes.
When looking for a key in a chain, lookup is O(k), where k is the number of keys in the chain. If you have a lot of keys in a chain – an amount that is substantially “on the order of” n, such as n/10 – then you’re looking at linear time.
SWE: When using a hash table, how can we assure constant-time insertion and lookups?
You need to keep your load factor low; you need to have a large enough array so there are few collisions, leading to chains that are at most a few elements long.
SWE: Hash tables are typically used for storing (key,value) pairs, but what is an intuitive way to have them store just keys? So like a set in python rather than a dictionary?
In traditional hash tables, you hash the key, then insert the key-value pair at the corresponding location. But there’s no reason you need a value: you could just insert the key.
(If you have a hash table set up like a dictionary, you could also just give every key a value of NULL which you ignore.)
(This is pretty simple, but I want to remember you can also easily make a hash table that works like a set!)
SWE: Is a stack first-in-first-out or first-in-last-out? Is a queue first-in-first-out or first-in-last-out?
A stack is first-in-last-out
A queue is first-in-first-out
SWE: What is a stack? How is it used at a high level?
A stack is a data structure which stores elements. You can add an element (or “push” it), and you can remove an element (or “pop” it). Elements are removed in reverse order of how they’re added, or first-in-last-out. This is like a stack of blocks where you continually add a block to the top when you want to add, and remove a block from the top when you want to remove.
SWE: How do you simply implement a stack – with pop, push, peek and isEmpty methods – if you already have an implementation of another key data structure?
You need a linked list implementation. Recall that a linked list has a null node as the “final node” on the right with no descendants, and the “first node” being the root node which has nothing pointing to it. To add an element to the linked list, you make it the root and make the old root its descendant. To remove an element, you make its descendant the root.
In other words, a linked list is basically already an implementation of a stack. To push, you just add the value to the LL. To pop, you just remove the root from the LL. To peek, you just look at the value in the root of the LL. To check if it’s empty, you see if the root node is the null node.
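A minimal sketch of that idea (with is_empty as the idiomatic Python name for isEmpty, and None standing in for the null node):

```python
class Node:
    """Singly linked list node."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

class Stack:
    def __init__(self):
        self.root = None  # None plays the role of the null node

    def push(self, value):
        # New node becomes the root, pointing at the old root.
        self.root = Node(value, self.root)

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from empty stack")
        value = self.root.value
        self.root = self.root.next  # descendant becomes the new root
        return value

    def peek(self):
        if self.is_empty():
            raise IndexError("peek at empty stack")
        return self.root.value

    def is_empty(self):
        return self.root is None
```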
SWE: What is a queue? How is it used at a high level?
A queue is a data structure which stores elements. You can add an element (or “push” it), and you can remove an element (or “pop” it). Elements are removed in the same order as they’re added, or first-in-first-out. This is like a queue of people waiting in line, where the first person to get in line is the first person to get out.
SWE: How do we implement a queue, with push, pop, peek, and isEmpty, using another common data structure?
We will implement the queue using a slight variation of a linked list.
With a normal linked list, the list is visualized as several nodes with arrows pointing left to right. On the far left is the root node; the linked list data structure keeps track of the root node, and a newly inserted value enters on the left and becomes the new root. The linked list also has a null node on the far right.
In a queue, we are going to insert elements on the right side of the linked list, or the back, and we (same as a normal linked list) remove elements from the left side of the linked list, or the front. So the arrows between nodes point from front to back, which may seem unintuitive.
Thus, we now keep track of two nodes rather than just one. We keep track of the front, or the root, as well as the back, which is the last node in the list. The back node doesn’t point to anything; it’s easier in a queue to not have a null node and instead keep an “empty” bool in the data structure. When the queue’s empty, set the front and back pointers to just point to NULL.
To push a value onto the back of the list, we make a new node, have the current back point to it, and set the new node as the back. To pop a value from the list, we set the front’s descendant as the new front. To peek, we return the value at the front. For isEmpty, we check the bool.
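A minimal sketch of that design (one simplification: rather than keeping a separate “empty” bool, this version treats the front pointer being None as empty, which serves the same purpose):

```python
class Node:
    """Linked list node; arrows point from front to back."""
    def __init__(self, value):
        self.value = value
        self.next = None

class Queue:
    def __init__(self):
        self.front = None  # where we pop from
        self.back = None   # where we push onto

    def is_empty(self):
        return self.front is None

    def push(self, value):
        node = Node(value)
        if self.is_empty():
            self.front = self.back = node
        else:
            self.back.next = node  # current back points to the new node
            self.back = node       # new node becomes the back

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from empty queue")
        value = self.front.value
        self.front = self.front.next  # front's descendant is the new front
        if self.front is None:
            self.back = None
        return value

    def peek(self):
        if self.is_empty():
            raise IndexError("peek at empty queue")
        return self.front.value
```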
SWE: What is a tree?
A tree is a data structure which organizes nodes in terms of parent-child relationships, where each node can have at most one parent, but nodes could have several children.
A tree is basically a DAG, or directed acyclic graph, in which each node has at most one parent and which is also connected, meaning it’s not like two or more graphs that don’t touch each other.
SWE: In a tree, what is a parent and what is a child?
Simply, if node A is directly above node B, A is the parent of B and B is the child of A
SWE: In a tree, what is an ancestor and what is a descendant?
A descendant of node A is any node you can get to by following a branch down from A.
An ancestor of A is any node for which A is a descendant.
SWE: In trees, what is a branch, starting at say node A
A branch is a “path” from start node A down through multiple parent-child relationships, always moving from parent to child, until we reach a leaf node.
SWE: In trees, what is a leaf? What is the root?
A leaf is a node with no children; the root is the node with no parent.
SWE: Given a tree, what is the subtree with node A as the root?
It’s simply the tree with node A as the root, containing all of A’s descendants. It’s basically the tree you get if you erase everything from the original tree that isn’t A or a descendant of A.
SWE: What makes a tree binary? What makes a tree ternary?
A binary tree is one where each node can have at most two children (but only 1 parent). A ternary tree is one where each node can have at most three children (but only 1 parent). And so on.
SWE: What is a binary search tree?
It’s a binary tree where, for every node n, you have the property that:
all left descendants of n < n < all right descendants of n.
(There can be differing opinions on equality in the tree. Sometimes the left side will read ≤ n.)
SWE: How do you search for an element in a binary search tree? How do you insert a new element into a binary search tree (where it doesn’t necessarily need to be balanced)?
To search for an element x, you follow a branch down the tree using the information that the tree is in a way “sorted”. So if you’re at node y and x < y, you move down the left branch, because x would then definitionally need to be on the left. If x > y, you move down the right branch.
To insert, you do the same thing and move down the tree in this way. If you find x, you typically wouldn’t insert it again. If you find a leaf without finding x, you insert it on the appropriate side of the leaf.
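A minimal sketch of both operations, assuming a simple node class with data, left, and right attributes (duplicates are ignored on insert, following one of the conventions above):

```python
class TreeNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def search(node, x):
    """Follow one branch down: left if x is smaller, right if larger."""
    while node is not None:
        if x == node.data:
            return node
        node = node.left if x < node.data else node.right
    return None  # reached past a leaf without finding x

def insert(root, x):
    """Insert x into the BST; returns the (possibly new) root."""
    if root is None:
        return TreeNode(x)  # found the open spot: hang a new leaf here
    if x < root.data:
        root.left = insert(root.left, x)
    elif x > root.data:
        root.right = insert(root.right, x)
    # x == root.data: already present, don't insert again
    return root
```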
SWE: Why is a binary search tree useful for searching for elements?
Because of the ordering properties of the BST, you only need to search down one branch rather than the whole tree. This means that if the tree is balanced, you can search for an element in only log(n) time.
SWE: What is a complete binary tree?
A complete binary tree has every level fully filled out, except for potentially the last level. If the last level isn’t full, it’s filled in from left to right.
SWE: What is the height of a tree? What is the height of a tree with just a root, or an arbitrary tree?
The height of a tree is the length of its longest branch. Specifically, it’s the number of edges along the longest branch.
We say that just a root node has a height of 0, because the longest path has no edges.
To find the height of an arbitrary tree, just follow all the branches down to their leaves, adding 1 to that branch’s length for each node past the root, and thus for each edge. Take the max branch length as the height.
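That procedure has a natural recursive form; a sketch, assuming a binary-tree node class:

```python
class TreeNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def height(node):
    """Height = number of edges along the longest branch.

    An empty tree is given height -1 so that a lone root node
    comes out to height 0, matching the convention above.
    """
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))
```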
SWE: What is one way you learned to achieve balanced binary trees?
AVL Trees
SWE: What is the technical definition of a balanced tree? And what is a more colloquial, slightly more relaxed definition of a balanced tree?
Technically, a balanced tree is a tree such that, for every subtree present, the height of its left subtree and its right subtree differ by at most 1.
Colloquially, it’s basically a tree that is balanced “enough” among all subtrees such that we can expect traversals of a single branch to take O(log n) time, rather than linear time.
SWE: How do you implement a tree where each node can have an arbitrary number of descendants? What datatype is necessary, and what datatype is optional but usually not useful for interviews?
Also, how would you adapt this implementation to be specifically a binary tree?
You will need a treeNode class, which has a data attribute holding its value and a descendants attribute which holds a list of its descendants.
You can also include a Tree class, whose only attribute is a root node, but it’s usually not helpful for interviews.
If you want specifically a binary tree, your treeNode class would have three attributes: data, leftDescendant, and rightDescendant.
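A minimal sketch of both node classes as described above:

```python
class TreeNode:
    """General tree node: a value plus a list of descendants."""
    def __init__(self, data):
        self.data = data
        self.descendants = []  # any number of child TreeNodes

class BinaryTreeNode:
    """Binary tree node: at most two descendants."""
    def __init__(self, data):
        self.data = data
        self.leftDescendant = None
        self.rightDescendant = None
```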
SWE: What is an in-order traversal of a binary tree? How would you write a function to perform an in-order traversal of a binary tree, which takes a root node as input and prints a node’s data whenever it “visits” that node?
In an in-order traversal, at every node, you explore the left subtree of the node (and visit all of the nodes in that subtree), then visit the node itself, then visit all of the nodes in the right subtree.
A recursive function is as follows, which implicitly handles the base case by doing nothing if we’ve reached a null node:
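One such sketch, assuming a node class with data, left, and right attributes, and None standing in for a null node:

```python
class TreeNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def in_order(node):
    if node is None:  # base case: null node, do nothing
        return
    in_order(node.left)   # 1. visit everything in the left subtree
    print(node.data)      # 2. visit the node itself
    in_order(node.right)  # 3. visit everything in the right subtree
```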
SWE: What is a pre-order traversal of a binary tree? How would you write a function to perform a pre-order traversal of a binary tree, which takes a root node as input and prints a node’s data whenever it “visits” that node?
In a pre-order traversal, at every node, you visit the node itself first, then explore the left subtree of the node (and visit all of the nodes in that subtree), and then visit all of the nodes in the right subtree.
A recursive function is as follows, which implicitly handles the base case by doing nothing if we’ve reached a null node:
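One such sketch, assuming a node class with data, left, and right attributes, and None standing in for a null node:

```python
class TreeNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def pre_order(node):
    if node is None:  # base case: null node, do nothing
        return
    print(node.data)       # 1. visit the node itself first
    pre_order(node.left)   # 2. visit everything in the left subtree
    pre_order(node.right)  # 3. visit everything in the right subtree
```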
SWE: What is a post-order traversal of a binary tree? How would you write a function to perform a post-order traversal of a binary tree, which takes a root node as input and prints a node’s data whenever it “visits” that node?
In a post-order traversal, at every node, you explore the left subtree of the node (and visit all of the nodes in that subtree), then visit all of the nodes in the right subtree, and lastly you visit the node itself.
A recursive function is as follows, which implicitly handles the base case by doing nothing if we’ve reached a null node:
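One such sketch, assuming a node class with data, left, and right attributes, and None standing in for a null node:

```python
class TreeNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def post_order(node):
    if node is None:  # base case: null node, do nothing
        return
    post_order(node.left)   # 1. visit everything in the left subtree
    post_order(node.right)  # 2. visit everything in the right subtree
    print(node.data)        # 3. lastly, visit the node itself
```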
SWE: What is a max heap, in the context of a min heap? Why is it very easy to implement a max heap if you understand min heaps?
In a min heap, all nodes’ descendants are larger than the node. For a max heap it’s simply the opposite: all nodes’ descendants are smaller, so the max is at the top. So the implementation is basically the exact same, you just do the orderings in reverse.
SWE: What is a binary min-heap?
A complete binary tree where each node is smaller than its children, so the min of the tree is always at the root.