Basic Data Structures and Algorithms Flashcards

Question 1

Q

What are the 5 steps for handling technical questions?

Answer

A

Ask questions
Design an algorithm
Pseudocode
Code
Test

Question 2

Q

What are the 5 approaches to algorithms?

Answer

A

Examplify
Pattern Matching
Simplify and Generalize
Base Case and Build
Data Structure Brainstorm

Question 3

Q

What is a linked list?

Answer

A

a data structure in which objects are arranged in linear order. Unlike an array (where order is determined by array indicies) the order is determined by a pointer in each object.

Question 4

Q

What is a singly linked list?

Answer

A

a linked list where the prev pointer is omitted in each element

Question 5

Q

What is a doubly linked list?

Answer

A

a linked list with both a ‘prev’ and next pointer in each element

Question 6

Q

What is a circular list?

Answer

A

a list where the prev pointer of the head of the list (first item in the list) points to the tail of the list. And the next pointer points to the head of the list.

Question 7

Q

What is a binary tree?

Answer

A

A binary tree is made of nodes, where each node contains a “left” pointer, a “right” pointer, and a data element. The “root” pointer points to the topmost node in the tree. The left and right pointers recursively point to smaller “subtrees” on either side.

Question 8

Q

What’s are tries?

Answer

A

A trie is a tree structure, where each edge represents one character, and the root represents the null string.

Thus, each path from the root represents a string, described by the characters labeling the edges traversed. Any ﬁnite set of words deﬁnes a trie, and two words with common preﬁxes branch oﬀ from each other at the ﬁrst distinguishing character. Each leaf denotes the end of a string.

Question 9

Q

When are tries useful?

Answer

A

Tries are useful for testing whether a given query string q is in the set.

We traverse the trie from the root along branches deﬁned by successive characters of q. If a branch does not exist in the trie, then q cannot be in the set of strings

Question 10

Q

What is a suﬃx tree?

Answer

A

A trie of all the proper suﬃxes of S.

The suﬃx tree enables you to test whether q is a substring of S, because any substring of S is the preﬁx of some suﬃx.

The search time is again linear in the length of q.

Question 11

Q

What are 3 things a suﬃx tree can be used for?

Answer

A

Find all occurrences of q as a substring of S
Longest substring common to a set of strings
Find the longest palindrome in S

Question 12

Q

Define Big-O notation.

Answer

A

Big O notation is most commonly used by programmers as an approximate measure of how long a computation (algorithm) will take to complete expressed as a function of the size of the input set.

Big O is useful to compare how well two algorithms will scale up as the number of inputs is increased.

More precisely Big O notation is used to express the asymptotic behavior of a function. That means how the function behaves as it approaches infinity.

Question 13

Q

What is linear complexity in Big-O notation?

Answer

A

O(n)

ex.) counting the number of items in a linked list

Question 14

Q

What is quadratic complexity in Big-O notation?

Answer

A

O(n2)

ex.) bubble sort

Question 15

Q

What is logarithmic complexity in Big-O notation?

Answer

A

O(log n)

ex.) binary search tree

Question 16

Q

What is constant complexity in Big-O notation?

Answer

A

O(1)

ex.) accessing an array element by index

Question 17

Q

What is factorial or combinatorial complexity in Big-O notation?

Answer

A

O(n!)

ex.) traveling salesman problem brute-force solution

Question 18

Q

What is linearithmic complexity in Big-O notation?

Answer

A

O(n log n)

ex.) heap sort and quick sort

Question 19

Q

What is a stack?

Answer

A

a dynamic set with a LIFO (last in first out) policy.

Think of it as a stack of cafeteria trays, when adding new trays to the stack they go on top, when taking trays off of the stack they come from the top.

Question 20

Q

What is a queue?

Answer

A

a dynamic set with a FIFO (first in, first out) policy.

Question 21

Q

What happens when you try to pop an empty stack?

Answer

A

the stack underflows (normally resulting in an error)

Question 22

Q

What happens if you try adding an element to a stack when it is already holding the maximum amount of elements it can hold?

Answer

A

the stack overflows

Question 23

Q

what is the insert operation on a stack called?

Question 24

Q

what is the delete operation on a stack called?

Question 25

Q

what is the insert operation on a queue called?

Question 26

Q

what is the delete operation on a queue called?

Question 27

Q

what is the head and the tail of a queue?

Answer

A

When an element is enqueued (added) to a queue it happens at the tail of the queue. When an element is it dequeued (removed) from a queue it is always the one at the head of the queue.

Question 28

Q

what happens if you try to dequeue an element from an empty queue?

Answer

A

the queue underflows

Question 29

Q

what happens if you try to enqueue an element to a full queue?

Answer

A

the queue overflows

Question 30

Q

what is the difference between a stack and a queue?

Answer

A

a stack is LIFO and a queue is FIFO

Question 31

Q

what is a hash table?

Answer

A

a Python dictionary

an unordered set of key: value pairs, where the keys are unique and immutable

under reasonable assumptions the average time to search for an element in a hash table is O(1)

only support dictionary operations INSERT, SEARCH, and DELETE

Question 32

Q

what are “collisions” in a hash table?

Answer

A

when more than one key maps to the same array index

Question 33

Q

when is a hash table a better alternative to directly addressing an array?

Answer

A

when the number of keys actually stored is small relative to the total number of possible keys

Question 34

Q

what is the worst-case running time for most search-tree operations?

Answer

A

worst-case running time is proportional to the height of the tree

Question 35

Q

which sorting algorithm is typically the best practical choice for storting and why?

Answer

A

Quicksort. it is remarkably efficient on the average

Question 36

Q

how does quicksort work?

Answer

A

It is a divide-and-conquer algorithm.

Choose any element of the array to be the pivot.
Divide all other elements (except the pivot) into two partitions.
- Elements less than the pivot must be in the first partition.
- elements greater in the second partition.
Use recursion to sort both partitions.

(Once the pivot is selected, we select successive elements from the beginning of the array that are greater than or equal to the pivot and select elements from the end of the array that are less than or equal to the pivot. Once we find one of each of these, we swap their values and move the selectors toward the opposite ends of the array and repeat until the selectors meet at the pivot. This partitions the array.)

Join the first sorted partition, the pivot, and the second sorted partition.

The runtime of Quicksort ranges from O(n log n) with the best pivots, to O(n2) with the worst pivots, where n is the number of elements in the array.

Question 37

Q

what is a set?

Answer

A

A set is a collection (container) of certain values, without any particular order, and no repeated values. It corresponds with a finite set in mathematics. A set can be implemented as an associative array (partial mapping) in which the value of each key-value pair is ignored.

Question 38

Q

what is a powerset?

Answer

A

Given a set S, the power set (or powerset) of S, written P(S), or 2S, is the set of all subsets of S.

For example, the power set of {1,2,3,4} is {{}, {1}, {2}, {1,2}, {3}, {1,3}, {2,3}, {1,2,3}, {4}, {1,4}, {2,4}, {1,2,4}, {3,4}, {1,3,4}, {2,3,4}, {1,2,3,4}}.

Question 39

Q

what is the runtime of quicksort?

Answer

A

Worst Case: O(N^2)
Best Case: O(N log N)
Average Case: O(N log N)

The runtime of Quicksort ranges from O(n log n) with the best pivots, to O(n2) with the worst pivots, where n is the number of elements in the array.

Question 40

Q

how does mergesort work?

Answer

A

Mergesort is an example of a divide and conquer algorithm.

Mergesort is a recursive algorithm. Given a list A, divide it into two halves. Recursively sort the two halves and then merge the result into one sorted list.

Question 41

Q

what is the runtime of mergesort?

Answer

A

Mergesort runs in O(N log N) running time and the number of comparisons is almost optimal.

Worst Case: O(N log N)
Best Case: O(N log N)
Average Case: O(N log N)

Question 42

Q

what are some advantages and disadvantages of mergesort?

Answer

A

Advantages:
Mergesort runs in O(N log N) running time and the number of comparisons is almost optimal.

Disadvantages:

it uses linear extra memory
the extra eﬀort of moving items between the temporary list and main list slows down the sort considerably
the recursive nature of the algorithm causes an overhead on short lists.

Mergesort is seldom used for main memory sorts because of these disadvantages

Question 43

Q

what impacts the efficiency of quicksort?

Answer

A

The eﬃciency of quicksort depends critically on the details of the implementation.

How do you pick the pivot?
How do you partition?
When do you drop out of the recursion?

Question 44

Q

what are the differences between quicksort and mergesort?

Answer

A

mergesort is always O(N log N)
quicksort’s worst case is O(N^2)

Unlike MergeSort, Quicksort does not always recurse on subproblems of equal size. The size of the recursive calls depends on the pivot. The more unequal the size of the partitions, the worse the performance.

Question 45

Q

when is it not a good idea to use recursive sorting algorithms (such as quicksort and mergesort)?

Answer

A

when few items are being sorted.

In any recursive sorting algorithm, there is a tradeoﬀ when the size of the inputs falls below a certain point. For example, do you really want to call Mergesort on an list of size 2? Size 3? Size 4?

Question 46

Q

what is the heap?

Answer

A

The heap is memory set aside for dynamic allocation. Unlike the stack, there’s no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time.

Question 47

Q

what is the stack? (memory)

Answer

A

The stack is the memory set aside as scratch space for a thread of execution. When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data. When that function returns, the block becomes unused and can be used the next time a function is called. The stack is always reserved in a LIFO order; the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.

Question 48

Q

what is the difference between the heap and the stack? what’s the same?

Answer

A

Same:
Both stored in RAM

Different:
Think of a stack as a stack of papers while a heap is a pile of stuff with no clear top.

The heap is memory set aside for dynamic allocation.

Unlike the stack, there’s no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time.

This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time; there are many custom heap allocators available to tune heap performance for different usage patterns.

Question 49

Q

what are three ways to represent a graph in memory?

Answer

A

objects & pointers - nodes as objects and edges as pointers
matrix - a matrix containing all edge weights between numbered node x and node y
adjacency list - a list of edges between numbered nodes

Question 50

Q

what are the advantages / disadvantages to each way we can represent a graph in memory?

Answer

A

** you should be able to figure this out, this isn’t the full answer **

objects & pointers - nodes as objects and edges as pointers
matrix - a matrix containing all edge weights between numbered node x and node y
adjacency list - a list of edges between numbered nodes

Question 51

Q

what is breadth-first search (BFS)?

Answer

A

uses a queue data structure to store intermediate results as it traverses the graph

a strategy for searching in a graph when search is limited to essentially two operations: (a) visit and inspect a node of a graph; (b) gain access to visit the nodes that neighbor the currently visited node. The BFS begins at a root node and inspects all the neighboring nodes. Then for each of those neighbor nodes in turn, it inspects their neighbor nodes which were unvisited, and so on.

Question 52

Q

what is depth-first search (DFS)?

Answer

A

uses a stack data structure to store intermediate results as it traverses the graph

an algorithm for traversing or searching tree or graph data structures. One starts at the root (selecting some node as the root in the graph case) and explores as far as possible along each branch before backtracking.

Question 53

Q

when should you use breadth-first search (BFS) and when should you use depth-first search (DFS)?

Answer

A

BFS is better for problems related to finding the shortest paths or somewhat related problems.

DFS on the other end helps more in connectivity problems and also in finding cycles in graph (though I think you might be able to find cycles with a bit of modification of BFS too)

Question 54

Q

what is the traveling salesman problem?

Answer

A

Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?

It is an NP-hard problem in combinatorial optimization.

Question 55

Q

what is the knapsack problem?

Answer

A

Given a set of items, each with a weight and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible. It derives its name from the problem faced by someone who is constrained by a fixed-size knapsack and must fill it with the most valuable items.

Question 56

Q

what does NP-hard mean?

Answer

A

a class of problems that are, informally, “at least as hard as the hardest problems in NP”

NP-hard stands for “Non-deterministic Polynomial-time hard”

Question 57

Q

what does NP-complete mean?

Answer

A

A decision problem L is NP-complete if it is in the set of NP problems and also in the set of NP-hard problems. The abbreviation NP refers to “nondeterministic polynomial time.”

NP-complete problems are often addressed by using approximation algorithms.

Question 58

Q

what is a combination?

Answer

A

a way of selecting several things out of a larger group, where (unlike permutations) order does not matter. In smaller cases it is possible to count the number of combinations.

For example given three fruit, say an apple, orange and pear, there are three combinations of two that can be drawn from this set: an apple and a pear; an apple and an orange; or a pear and an orange.

Question 59

Q

what is n-choose-k useful for?

Answer

A

Useful for many problems involving independent events, like counting equipment failures over time.

ex) what is the number of different k-member teams that can be formed among n potential players?

“how many ways can I choose 2 items out of 5”

Question 60

Q

what is a tree?

Answer

A

it has a set of nodes that are linked to each other. A node on a tree may contain a condition or value; it can also be a tree of its own or it can represent a separate data structure.

Question 61

Q

what is a n-ary tree?

Answer

A

a rooted tree in which each node has no more than n children

Question 62

Q

what is a balanced binary tree?

Answer

A

a binary tree in which the depth of the left and right subtres of every node differ by 1 or less. Trees that are balanced have predictable depth.

Question 63

Q

what is a red/black tree?

Answer

A

a binary search tree with one extra bit of storage per node: its color, which can be red or black.

Question 64

Q

how does splay tree work?

Answer

A

A splay tree is a self-adjusting binary search tree with the additional property that recently accessed elements are quick to access again.

All normal operations on a binary search tree are combined with one basic operation, called splaying. Splaying the tree for a certain element rearranges the tree so that the element is placed at the root of the tree. One way to do this is to first perform a standard binary tree search for the element in question, and then use tree rotations in a specific fashion to bring the element to the top. Alternatively, a top-down algorithm can combine the search and the tree reorganization into a single phase.

Answer 61

A

to go through something

Answer 62

A

The primary difference between the preorder, postorder and in-order traversals is where the node is visited in relation to the recursive calls; i.e., before, after or in-between.

Answer 63

A

preorder, postorder, and inorder

Answer 64

A

A*, BFS, DFS

these are also graph traversal algorithms

Answer 65

A

a data structure that simulates a hierarchical tree structure with a root value and subtrees of children represented as a set of linked nodes

Answer 66

A

A graph is a specific data structure known in the computer science, that is often used to give a model of different kind of problems where a set of objects relate to each other in some way.

Answer 67

A

Basically, a graph is a more complex tree.

A tree is a specialized case of a graph. A tree is a connected graph with no circuits and no self loops.

Answer 68

A

in a directed graph direction matters (there can by one way relationships)

An undirected graph means that in case there is an edge between the nodes i and j we shell assume that there is a path from i to j, as well as from j to i. In the case of directed graph, we’ll assume that if (i,j) exists there only path from node i to node j and there’s no path between j and i.

Answer 69

A

in a weighted graph, each edge is associated with a weight (which can be negative). An example is a graph of cities where the distance between cities is included on the edges between them

Answer 70

A

a two-dimensional array

Answer 71

A

O ( | V | + | E | )

assuming that the queue operations are O(1).

Answer 72

A

Time complexity of DFS is O ( | V | + | E | )

Why? each vertex is changed from
white to gray to black exactly once; each edge (u; v) is explored exactly once,
when DFV[u] is run. All steps are constant time

Answer 73

A

A* uses a best-first search and finds a least-cost path from a given initial node to one goal node (out of one or more possible goals). As A* traverses the graph, it follows a path of the lowest expected total cost or distance, keeping a sorted priority queue of alternate path segments along the way.

It uses a knowledge-plus-heuristic cost function of node (usually denoted) to determine the order in which the search visits nodes in the tree.

Answer 74

A

It picks the unvisited vertex with the lowest-distance, calculates the distance through it to each unvisited neighbor, and updates the neighbor’s distance if smaller. Mark visited (set to red) when done with neighbors.

Answer 75

A

The time complexity of A* depends on the heuristic. In the worst case, the number of nodes expanded is exponential in the length of the solution (the shortest path), but it is polynomial when the search space is a tree, there is a single goal state, and the heuristic function h meets the following condition:

Answer 76

A

Run time: O(|V|)

Answer 77

A

DFS uses a stack data structure to store intermediate results as it traverses the graph and BFS uses a queue data structure

Answer 78

A

A* is a modification of Dijkstra’s algorithm that uses a heuristic to determine which vertices to search.

Answer 79

A

an abstract data type which is like a regular queue or stack data structure, but where additionally each element has a “priority” associated with it. In a priority queue, an element with high priority is served before an element with low priority. If two elements have the same priority, they are served according to their order in the queue.

CS BOOK:
a data structure for maintaing a set of S elements, each with an associated value called a key.

Answer 80

A

A preorder traversal can be defined (recursively) as follows:

visit the root
perform a preorder traversal of the first subtree of the root
perform a preorder traversal of the second subtree of the root
etc. for all the subtrees of the root

Answer 81

A

A postorder traversal is similar to a preorder traversal, except that the root of each subtree is visited last rather than first:

perform a postorder traversal of the first subtree of the root
perform a postorder traversal of the second subtree of the root
etc. for all the subtrees of the root
visit the root

Answer 82

A

An in-order traversal involves visiting the root “in between” visiting its left and right subtrees. Therefore, an in-order traversal only makes sense for binary trees. The (recursive) definition is:

perform an in-order traversal of the left subtree of the root
visit the root
perform an in-order traversal of the right subtree of the root

Answer 83

A

The idea of a level-order traversal is to visit the root, then visit all nodes “1 level away” (depth 2) from the root (left to right), then all nodes “2 levels away” (depth 3) from the root, etc.

A level-order traversal requires using a queue (rather than a recursive algorithm, which implicitly uses a stack).

Answer 84

A

a binary search tree that is height balanced: the height of the left and right subtrees must differ by at most 1

IN ENGLISH: The tree must be balanced at all times.

Answer 85

A

the root is colored black
all the paths from root to the leaves have the same number of black nodes
no path from the root to the leaf has two consecutive red nodes

Answer 86

A

Searching a binary search tree for a specific key can be a recursive or iterative process.
We begin by examining the root node. If the tree is null, the key we are searching for does not exist in the tree. Otherwise, if the key equals that of the root, the search is successful. If the key is less than the root, search the left subtree. Similarly, if it is greater than the root, search the right subtree. This process is repeated until the key is found or the remaining subtree is null. If the searched key is not found before a null subtree is reached, then the item must not be present in the tree.

Answer 87

A

Red Black trees are good for creating well-balanced trees.

The major problem with binary search trees is that you can make them unbalanced very easily. Imagine your first number is a 15. Then all the numbers after that are increasingly smaller than 15. You’ll have a tree that is very heavy on the left side and has nothing on the right side.

Red Black trees solve that by forcing your tree to be balanced whenever you insert or delete. It accomplishes this through a series of rotations between ancestor nodes and child nodes.

Answer 88

A

Insertion begins as a search would begin; if the key is not equal to that of the root, we search the left or right subtrees as before. Eventually, we will reach an external node and add the new key-value pair (here encoded as a record ‘newNode’) as its right or left child, depending on the node’s key. In other words, we examine the root and recursively insert the new node to the left subtree if its key is less than that of the root, or the right subtree if its key is greater than or equal to the root.

Answer 89

A

There are three possible cases to consider:

Deleting a leaf (node with no children): Deleting a leaf is easy, as we can simply remove it from the tree.
Deleting a node with one child: Remove the node and replace it with its child.
Deleting a node with two children: Call the node to be deleted N. Do not delete N. Instead, choose either its in-order successor node or its in-order predecessor node, R. Replace the value of N with the value of R, then delete R.

As with all binary trees, a node’s in-order successor is the left-most child of its right subtree, and a node’s in-order predecessor is the right-most child of its left subtree. In either case, this node will have zero or one children. Delete it according to one of the two simpler cases above.

Answer 90

A

a n-ary (or k-ary) tree might result in better memory access patterns, because each node contains nodes next to eachother, which means the height of the tree is shorter and traversal might jump around less as well, since leaf nodes can contain multiple in-order keys.

Answer 91

A

Simple implementation—simpler than other self-balancing binary search trees, such as red-black trees or AVL trees.
Comparable performance—average-case performance is as efficient as other trees.
Small memory footprint—splay trees do not need to store any bookkeeping data.
Possibility of creating a persistent data structure version of splay trees—which allows access to both the previous and new versions after an update. This can be useful in functional programming, and requires amortized O(log n) space per update.
Working well with nodes containing identical keys—contrary to other types of self-balancing trees. Even with identical keys, performance remains amortized O(log n). All tree operations preserve the order of the identical nodes within the tree, which is a property similar to stable sorting algorithms. A carefully designed find operation can return the leftmost or rightmost node of a given key.

Answer 92

A

Perhaps the most significant disadvantage of splay trees is that the height of a splay tree can be linear. For example, this will be the case after accessing all n elements in non-decreasing order. Since the height of a tree corresponds to the worst-case access time, this means that the actual cost of an operation can be slow. However the amortized access cost of this worst case is logarithmic, O(log n). Also, the expected access cost can be reduced to O(log n) by using a randomized variant.[2]
A splay tree can be worse than a static tree by at most a constant factor.
Splay trees can change even when they are accessed in a ‘read-only’ manner (i.e. by find operations). This complicates the use of such splay trees in a multi-threaded environment. Specifically, extra management is needed if multiple threads are allowed to perform find operations concurrently.

Answer 93

A

a way to store an organize data in order to facilitate access and modifications.