CS_Algos Flashcards

1
Q

divide-and-conquer

A
  • take a problem and break it down into smaller sub-problems
  • solve them recursively
  • combine the results of the sub-problems to get the solution to the original problem
2
Q

Merge sort

A
  • recursive algorithm
  • canonical divide-and-conquer:
      • split the data in 2 parts,
      • solve recursively left half,
      • solve recursively right half
      • merge the 2 parts
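A minimal Python sketch of the steps above:

```python
def merge_sort(a):
    # Base case: lists of length 0 or 1 are already sorted.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])    # solve the left half recursively
    right = merge_sort(a[mid:])   # solve the right half recursively
    # Merge the two sorted halves in linear time.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out
```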
3
Q

Merge sort runtime

A
O(n log n)
- at level j in the *recursive tree*:
  -- 2^j sub-problems
  -- n /(2^j) items per sub-problem
  -- upper bound per sub-problem:
     --- 6 * #items
- total lines executed at level j:
2^j * [6 * n / (2^j)] = 6n
- total levels from root to leaf in *recursive tree*:
  log_2(n) + 1
- total lines executed in recursive tree:
  [log_2(n) + 1] * 6n
4
Q

Asymptotic time complexity

A
  • big-O notation: (default) upper bound of the asymptotic computational complexity (worst-case scenario)
  • big-Omega notation: lower bound of the asymptotic computational complexity
  • big-Theta notation: (tight estimates) upper and lower bounds coincide: the function on the left can be sandwiched between constant multiples of the function on the right: max[f, g] = Theta(f + g)
5
Q

Counting inversions

A
  • an inversion in a permutation p is a pair (i, j) with i < j for which p(i) > p(j)
  • a comparison sorting algorithm (e.g. merge sort) can be adapted to count inversions in a sequence
  • can be used as a measure of similarity between the rating sequences of 2 users
  • merging and counting the split inversions are done in O(n), and the overall sorting and counting in O(n log n)
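A merge-sort variant that also counts inversions; the `split` counter is the only addition to plain merging:

```python
def sort_and_count(a):
    # Returns (sorted copy of a, number of inversions in a).
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, x = sort_and_count(a[:mid])    # inversions inside the left half
    right, y = sort_and_count(a[mid:])   # inversions inside the right half
    merged, i, j, split = [], 0, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            # Every element still remaining in left forms a split inversion
            # with right[j], because they all precede it yet are larger.
            split += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, x + y + split
```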
6
Q

Quick Sort

A
  • choose a good pivot
  • split the array in 2 parts: (1) < pivot, (2) > pivot
      • in linear time and in place (by swapping elements)
      • pivot ends up in the right position
      • keep the seen elements split into the 2 partitions
      • use 2 counters: (a) the boundary between the 2 parts of the seen elements, (b) the next element not yet seen
  • sort the partition (1)
  • sort the partition (2)
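An in-place sketch of the steps above; picking the pivot uniformly at random is one standard way to get a good pivot in expectation:

```python
import random

def quick_sort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    # Choose a random pivot and move it to the front.
    p = random.randint(lo, hi)
    a[lo], a[p] = a[p], a[lo]
    pivot = a[lo]
    i = lo + 1                       # boundary: a[lo+1 .. i-1] < pivot
    for j in range(lo + 1, hi + 1):  # j: next unseen element
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[lo], a[i - 1] = a[i - 1], a[lo]  # pivot lands in its final position
    quick_sort(a, lo, i - 2)  # sort partition (1): elements < pivot
    quick_sort(a, i, hi)      # sort partition (2): elements > pivot
```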
7
Q

Random contraction algo

A
  • while there are more than 2 vertices in the graph:
      • pick a remaining edge (u,v) uniformly at random
      • merge u and v into a single node
      • remove self-loops and allow for parallel edges
  • return the cut represented by the final 2 vertices
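One way to sketch this in Python, using a small union-find to track which super-node each original vertex belongs to (an implementation choice, not part of the card):

```python
import random

def karger_cut(edges, n):
    # edges: list of (u, v) pairs on vertices 0..n-1; parallel edges allowed.
    parent = list(range(n))

    def find(x):
        # Find the super-node representative, with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    remaining = n
    while remaining > 2:
        u, v = random.choice(edges)   # pick a remaining edge uniformly
        ru, rv = find(u), find(v)
        if ru != rv:                  # skip self-loops
            parent[ru] = rv           # contract: merge the two super-nodes
            remaining -= 1
    # Edges crossing the final two super-nodes form the cut.
    return sum(1 for u, v in edges if find(u) != find(v))
```

A single run only finds a minimum cut with some probability, so in practice the algorithm is repeated many times and the best result kept.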
8
Q

BFS

A
  • explore the entire graph (all nodes) in layers
  • used for shortest path discovery (using dist(v) property), connected components (using unexplored property)
  • uses a queue (FIFO):
      • initialize it with the start node
      • append new unseen (unexplored property) items to the back
      • extract the next item to process from the front
      • stop when the queue is empty
  • O(m+n)
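A sketch using collections.deque, assuming an adjacency-dict representation; `dist` doubles as the explored marker:

```python
from collections import deque

def bfs_dist(adj, start):
    # adj: dict mapping node -> list of neighbours.
    dist = {start: 0}           # dist(v) property; also marks explored nodes
    q = deque([start])          # FIFO queue initialized with the start node
    while q:
        u = q.popleft()         # extract from the front
        for v in adj[u]:
            if v not in dist:   # unexplored
                dist[v] = dist[u] + 1
                q.append(v)     # append to the back
    return dist
```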
9
Q

DFS

A
  • explore the entire graph (all nodes) aggressively, backtracking when stuck
  • used for topological sort (in DAGs) and strongly connected components (in directed graphs)
  • uses a stack (LIFO):
      • initialize it with the start node
      • push new unseen (unexplored property) items onto the top
      • pop the next item to process from the top
      • stop when the stack is empty
  • O(m+n)
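The same loop with a stack instead of a queue gives an iterative DFS (assumed adjacency-dict representation; the exact visit order depends on push order):

```python
def dfs_order(adj, start):
    # adj: dict mapping node -> list of neighbours.
    seen = {start}      # unexplored property
    stack = [start]     # LIFO stack initialized with the start node
    order = []
    while stack:
        u = stack.pop()          # pop from the top
        order.append(u)
        for v in adj[u]:
            if v not in seen:    # unexplored
                seen.add(v)
                stack.append(v)  # push onto the top
    return order
```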
10
Q

Topological sort

A

in a DAG, it is a node labelling f such that:

  • f(v) takes values in the set {1..n}
  • for any edge (u, v), f(u) < f(v)
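A recursive DFS sketch: labelling nodes in reverse order of their finishing times yields a topological order:

```python
def topo_sort(adj):
    # adj: dict mapping node -> list of successors in a DAG.
    order, seen = [], set()

    def dfs(u):
        seen.add(u)
        for v in adj[u]:
            if v not in seen:
                dfs(v)
        order.append(u)   # u finishes only after all its successors

    for u in adj:
        if u not in seen:
            dfs(u)
    order.reverse()       # reversed finishing order = topological order
    return order
```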
11
Q

Strongly connected components

A
  • for a directed graph (Kosaraju's two-pass algorithm):
    1. reverse the graph
    2. on the reversed graph run DFS and record each node's ‘finishing time’ as a new labelling (starting from the main sink node, the end of the graph)
    3. on the normal graph run DFS repeatedly, processing nodes in decreasing order of finishing time, and mark every node discovered by one DFS with its leader (start node): each such group is an SCC
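A compact recursive sketch of the two passes (fine for small graphs; CPython's recursion limit matters for large ones):

```python
from collections import defaultdict

def kosaraju_scc(adj):
    # adj: dict mapping node -> list of successors in a directed graph.
    rev = defaultdict(list)
    for u, vs in adj.items():
        for v in vs:
            rev[v].append(u)           # build the reversed graph

    finish, seen = [], set()
    def dfs1(u):                       # pass 1: DFS on the reversed graph
        seen.add(u)
        for v in rev[u]:
            if v not in seen:
                dfs1(v)
        finish.append(u)               # record finishing time

    for u in adj:
        if u not in seen:
            dfs1(u)

    leader, seen = {}, set()
    def dfs2(u, lead):                 # pass 2: DFS on the normal graph
        seen.add(u)
        leader[u] = lead               # every node reached shares this leader
        for v in adj[u]:
            if v not in seen:
                dfs2(v, lead)

    for u in reversed(finish):         # decreasing finishing time
        if u not in seen:
            dfs2(u, u)
    return leader                      # nodes with the same leader form an SCC
```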
12
Q

Heap

A
  • container of keyed objects with parent’s key smaller than the children keys and the following operations:
      • EXTRACT-MIN O(log n)
      • INSERT O(log n)
      • HEAPIFY (batch of n items) O(n)
      • DELETE O(log n)
  • speeds up Dijkstra’s algorithm to O(m log n)
  • the array representation is by levels in a binary tree such that children of node i are 2i and 2i + 1
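Python's heapq module maintains this invariant on a plain list (note: heapq is 0-indexed, so the children of node i sit at 2i + 1 and 2i + 2):

```python
import heapq

# HEAPIFY a batch of n items in O(n), then the usual O(log n) operations.
h = [9, 4, 7, 1, 3]
heapq.heapify(h)             # in-place, O(n)
heapq.heappush(h, 2)         # INSERT, O(log n)
smallest = heapq.heappop(h)  # EXTRACT-MIN, O(log n)
# Repeated EXTRACT-MIN yields the keys in sorted order (heapsort).
```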
13
Q

Sorted arrays

A
  • static data structure with following operations
      • SEARCH O(log n)
      • RANK O(log n)
      • SELECT O(1)
      • MIN / MAX O(1)
      • PRED / SUCC O(1)
      • OUTPUT values O(n)
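The bisect module covers SEARCH/RANK on a sorted Python list; the O(1) operations are plain indexing:

```python
import bisect

a = [2, 3, 5, 7, 11, 13]           # sorted, static array

# SEARCH / RANK in O(log n): bisect_left returns the count of elements < x.
rank = bisect.bisect_left(a, 7)
found = rank < len(a) and a[rank] == 7

# SELECT / MIN / MAX / PRED / SUCC in O(1) by direct indexing.
third_smallest = a[2]
smallest, largest = a[0], a[-1]
pred_of_7, succ_of_7 = a[rank - 1], a[rank + 1]
```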
14
Q

Balanced binary search trees

A
  • support the same operations as sorted arrays but as a tree, with the key of the left child smaller than the parent’s and the key of the right child greater than the parent’s
  • dynamic data structure with following operations
      • SEARCH O(log n)
      • RANK O(log n)
      • SELECT O(log n)
      • MIN / MAX O(log n)
      • PRED / SUCC O(log n)
      • INSERT O(log n)
      • DELETE O(log n)
      • OUTPUT values O(n)
15
Q

Search tree property

A
  • all keys in the left search subtree are smaller than the current node key
  • all keys in the right search subtree are greater than the current node key
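A search sketch that follows this property (the Node class here is a hypothetical minimal representation):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def bst_search(node, key):
    # Iterative search: O(height) comparisons.
    while node is not None:
        if key == node.key:
            return node
        # The search-tree property tells us which subtree can hold the key.
        node = node.left if key < node.key else node.right
    return None
```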
16
Q

Hash tables

A
  • a hash function maps keys to buckets, with different ways of resolving collisions:
      • chaining: using linked lists for buckets
      • open addressing: probe a sequence of hashed buckets, e.g. linear probing (first empty bucket in the sequence) or double hashing (2 hash functions, with the 2nd used as an offset)
  • no order guarantee
  • operations:
      • INSERT O(1)
      • DELETE O(1)
      • LOOKUP O(1)
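A toy chaining sketch; a real table would also resize as the load factor grows:

```python
class ChainedHashTable:
    # Each bucket is a Python list of (key, value) pairs (the "chain").
    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        b = self._bucket(key)
        for i, (k, _) in enumerate(b):
            if k == key:
                b[i] = (key, value)    # overwrite an existing key
                return
        b.append((key, value))

    def lookup(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None

    def delete(self, key):
        b = self._bucket(key)
        b[:] = [(k, v) for k, v in b if k != key]
```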
17
Q

Bloom filters

A
  • similar to hash tables but with more space-efficient allocation; used to remember values already seen
  • aka HASH SETS with NO FALSE NEGATIVES guarantee
  • cons:
      • cannot store pointers to the objects just the values
      • can have FALSE POSITIVES with SMALL probability (may say an item is present when it actually is not), e.g. when validating a new item against a list of weak passwords
      • cannot do deletes
  • operations:
      • INSERT
      • LOOKUP
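A toy sketch; deriving the k bit positions from sha256 with an index prefix is just one convenient choice:

```python
import hashlib

class BloomFilter:
    # m-bit array with k hash positions per item.
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = [False] * m

    def _positions(self, item):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def insert(self, item):
        for p in self._positions(item):
            self.bits[p] = True

    def lookup(self, item):
        # True may be a false positive; False is always correct
        # (no false negatives).
        return all(self.bits[p] for p in self._positions(item))
```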
18
Q

Python list - time complexity

A
(https://wiki.python.org/moin/TimeComplexity)
avg case, amortized worst case (CPython impl.):
Copy O(n), O(n)
Append O(1), O(1)
Pop last O(1), O(1)
Pop intermediate O(k), O(k)
Insert O(n), O(n)
Get Item O(1), O(1)
Set Item O(1), O(1)
Delete Item O(n), O(n)
Iteration O(n), O(n)
Get Slice O(k), O(k)
Del Slice O(n), O(n)
Set Slice O(k+n), O(k+n)
Extend O(k), O(k)
Sort O(n log n), O(n log n)
Multiply O(nk), O(nk)
x in s O(n)
min(s), max(s) O(n)
Get Length O(1), O(1)
19
Q

Python dict/set - time complexity

A
(https://wiki.python.org/moin/TimeComplexity)
avg case, amortized worst case (CPython impl.):
Copy O(n), O(n)
Get Item O(1), O(n)
Set Item O(1), O(n)
Delete Item O(1), O(n)
Iteration O(n), O(n) 
(set) x in s O(1), O(n)
20
Q

Python deque - time complexity

A
(https://wiki.python.org/moin/TimeComplexity)
avg case, amortized worst case (CPython impl.):
Copy O(n), O(n)
append O(1), O(1)
appendleft  O(1), O(1)
pop O(1), O(1)
popleft O(1), O(1)
extend O(k), O(k)
extendleft O(k), O(k)
rotate O(k), O(k)
remove O(n), O(n)
21
Q

Python open() - buffering

A

the default buffering policy for open():
- binary files are buffered in fixed-size chunks; the buffer size is chosen by a heuristic (the device block size, falling back to io.DEFAULT_BUFFER_SIZE)
- text files use line buffering only if isatty() is true (an interactive terminal); otherwise they follow the binary policy above
- otherwise specify the buffering explicitly: an integer > 1 (bytes) for binary files, or 1 (line buffering) for text files
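For illustration, passing explicit buffer sizes (the 64 KiB value and file names are arbitrary examples):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.bin")

# Binary file with an explicit 64 KiB buffer.
with open(path, "wb", buffering=64 * 1024) as f:
    f.write(b"\x00" * 10)

# Text file opened line-buffered (buffering=1): each '\n' triggers a flush.
with open(path + ".txt", "w", buffering=1) as f:
    f.write("one line\n")
```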