Subset 1 Flashcards

Subset 1 of Coding Problems

1
Q

<p>This problem was recently asked by Google.</p>

<p>Given a list of numbers and a number <code>k</code>, return whether any two numbers from the list add up to <code>k</code>.</p>

<p>For example, given <code>[10, 15, 3, 7]</code> and <code>k</code> of <code>17</code>, return true since <code>10 + 7</code> is <code>17</code>.</p>

<p>Bonus: Can you do this in one pass?</p>

A

<p>This problem can be solved in several different ways.</p>

<p>Brute force way would involve a nested iteration to check for every pair of numbers:</p>

<pre><code>def two_sum(lst, k):
    for i in range(len(lst)):
        for j in range(len(lst)):
            if i != j and lst[i] + lst[j] == k:
                return True
    return False
</code></pre>

<p>This would take O(N^2) time. Another way is to use a set to remember the numbers we've seen so far.
Then, for a given number, we can check whether the set contains <code>k</code> minus that number.
This would be O(N) time, since set lookups are O(1) on average.</p>

<pre><code>def two_sum(lst, k):
    seen = set()
    for num in lst:
        if k - num in seen:
            return True
        seen.add(num)
    return False
</code></pre>

<p>Yet another solution involves sorting the list. We can then iterate through the list and run a binary search on <code>K - lst[i]</code>. Sorting takes O(N log N), and we run a binary search for each of the N elements, so this takes O(N log N) overall with O(1) extra space if we sort in place.</p>

<pre><code>from bisect import bisect_left

def two_sum(lst, K):
    lst.sort()

    for i in range(len(lst)):
        target = K - lst[i]
        j = binary_search(lst, target)

        # Check that binary search found the target and that it's not in the same index
        # as i. If it is in the same index, we can check lst[i + 1] and lst[i - 1] to see
        # if there's another number that's the same value as lst[i].
        if j == -1:
            continue
        elif j != i:
            return True
        elif j + 1 < len(lst) and lst[j + 1] == target:
            return True
        elif j - 1 >= 0 and lst[j - 1] == target:
            return True
    return False

def binary_search(lst, target):
    lo = 0
    hi = len(lst)
    ind = bisect_left(lst, target, lo, hi)

    if 0 <= ind < hi and lst[ind] == target:
        return ind
    return -1
</code></pre>

2
Q

<p>This problem was asked by Uber.</p>

<p>Given an array of integers, return a new array such that each element at index <code>i</code> of the new array is the product of all the numbers in the original array except the one at <code>i</code>.</p>

<p>For example, if our input was <code>[1, 2, 3, 4, 5]</code>, the expected output would be <code>[120, 60, 40, 30, 24]</code>.
If our input was <code>[3, 2, 1]</code>, the expected output would be <code>[2, 3, 6]</code>.</p>

<p>Follow-up: what if you can't use division?</p>

A

<p>This problem would be easy with division: an optimal solution could
just find the product of all numbers in the array and then divide
by each of the numbers.</p>
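<p>As a sketch, that division approach might look like this (a hypothetical helper; note it breaks if the array contains a zero, which is one reason the follow-up forbids division):</p>

```python
def products_with_division(nums):
    # Multiply everything together, then divide each element back out.
    # Assumes integer inputs with no zeros.
    total = 1
    for num in nums:
        total *= num
    return [total // num for num in nums]
```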

<p>Without division, another approach would be to first see that the ith
element simply needs the product of numbers before i and the product
of numbers after i. Then we could multiply those two numbers to get
our desired product.</p>

<p>In order to find the product of numbers before i, we can generate
a list of prefix products. Specifically, the ith element in the list
would be the product of all numbers up to and including i. Similarly, we would generate
the list of suffix products.</p>

<pre><code>def products(nums):
    # Generate prefix products
    prefix_products = []
    for num in nums:
        if prefix_products:
            prefix_products.append(prefix_products[-1] * num)
        else:
            prefix_products.append(num)

    # Generate suffix products
    suffix_products = []
    for num in reversed(nums):
        if suffix_products:
            suffix_products.append(suffix_products[-1] * num)
        else:
            suffix_products.append(num)
    suffix_products = list(reversed(suffix_products))

    # Generate result
    result = []
    for i in range(len(nums)):
        if i == 0:
            result.append(suffix_products[i + 1])
        elif i == len(nums) - 1:
            result.append(prefix_products[i - 1])
        else:
            result.append(prefix_products[i - 1] * suffix_products[i + 1])
    return result
</code></pre>

<p>This runs in O(N) time and space, since iterating over the input array takes O(N)
time and the prefix and suffix arrays take up O(N) space.</p>

3
Q

<p>This problem was asked by Google.</p>

<p>Given the root to a binary tree, implement <code>serialize(root)</code>, which serializes the tree into a
string, and <code>deserialize(s)</code>, which deserializes the string back into the tree.</p>

<p>For example, given the following <code>Node</code> class</p>

<pre><code>class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right
</code></pre>

<p>The following test should pass:</p>

<pre><code>node = Node('root', Node('left', Node('left.left')), Node('right'))
assert deserialize(serialize(node)).left.left.val == 'left.left'
</code></pre>

A

<p>There are many ways to serialize and deserialize a binary tree, so don't worry
if your solution differs from this one. We will be only going through one
possible solution.</p>

<p>We can approach this problem by first figuring out what we would like the
serialized tree to look like. Ideally, it would contain the minimum
information required to encode all the necessary information about
the binary tree. One possible encoding might be to borrow <a>S-expressions</a>
from Lisp. The tree <code>Node(1, Node(2), Node(3))</code> would then look like
'(1 (2 () ()) (3 () ()))', where the empty brackets denote nulls.</p>

<p>To minimize data over the hypothetical wire, we could go a step further
and prune out some unnecessary brackets. We could also replace the
2-character '()' with '#'. We can then infer leaf nodes by their form
'val # #' and thus get the structure of the tree that way. Then our
tree would look like <code>1 2 # # 3 # #</code>.</p>

<pre><code>def serialize(root):
    if root is None:
        return '#'
    return '{} {} {}'.format(root.val, serialize(root.left), serialize(root.right))

def deserialize(data):
    def helper():
        val = next(vals)
        if val == '#':
            return None
        # Keep the value as a string so the test above (which uses string
        # values like 'left.left') passes; cast to int here if values are numeric.
        node = Node(val)
        node.left = helper()
        node.right = helper()
        return node
    vals = iter(data.split())
    return helper()
</code></pre>

<p>This runs in O(N) time and space, since we iterate over the whole tree when serializing and deserializing.</p>

4
Q

<p>This problem was asked by Stripe.</p>

<p>Given an array of integers, find the first missing positive integer in linear time and constant space.
In other words, find the lowest positive integer that does not exist in the array.
The array can contain duplicates and negative numbers as well.</p>

<p>For example, the input <code>[3, 4, -1, 1]</code> should give <code>2</code>. The input <code>[1, 2, 0]</code> should give <code>3</code>.</p>

<p>You can modify the input array in-place.</p>

A

<p>Our lives would be easier without the linear time constraint:
we could just sort the array, filter out the non-positive numbers,
and iterate over the sorted result, returning the first value that doesn't
match its expected position. However, sorting takes O(n log n), so we can't use that here.</p>
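<p>As a sketch, that simpler sort-based version might look like this (hypothetical helper name):</p>

```python
def first_missing_positive_sorted(nums):
    # O(n log n): sort, then walk the positive numbers looking for a gap.
    expected = 1
    for num in sorted(nums):
        if num == expected:
            expected += 1
        elif num > expected:
            break  # found a gap
    return expected
```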

<p>Clearly we have to use some sort of trick here to get it running in linear time.
Since the first missing positive number must be between 1 and len(array) + 1 (why?),
we can ignore any negative numbers and numbers bigger than len(array).
The basic idea is to use the indices of the array itself to reorder the elements
to where they should be. We traverse the array, and whenever we see a value between 1
and the length of the array, we swap it to its value's index (value v belongs at index v - 1).
We stay at each index and keep swapping until it holds its correct value or an out-of-range one.</p>

<p>By the end of this process, all the first positive numbers should be grouped in
order at the beginning of the array. We don't care about the others.
This only takes O(N) time, since we swap each element at most once.</p>

<p>Then we can iterate through the array and return the index of the first number that doesn't match, just like before.</p>

<pre><code>def first_missing_positive(nums):
    if not nums:
        return 1
    for i, num in enumerate(nums):
        while i + 1 != nums[i] and 0 < nums[i] <= len(nums):
            v = nums[i]
            nums[i], nums[v - 1] = nums[v - 1], nums[i]
            if nums[i] == nums[v - 1]:
                break
    for i, num in enumerate(nums, 1):
        if num != i:
            return i
    return len(nums) + 1
</code></pre>

<p>Another way we can do this is by adding all the numbers to a set, and then
use a counter initialized to 1. Then continuously increment the counter and
check whether the value is in the set.</p>

<pre><code>def first_missing_positive(nums):
    s = set(nums)
    i = 1
    while i in s:
        i += 1
    return i
</code></pre>

<p>This is much simpler, but runs in O(N) time and space, whereas the previous algorithm uses no extra space.</p>

5
Q

<p>This problem was asked by Jane Street.</p>

<p><code>cons(a, b)</code> constructs a pair, and <code>car(pair)</code> and <code>cdr(pair)</code> return the first and last element of that pair. For example, <code>car(cons(3, 4))</code> returns <code>3</code>, and <code>cdr(cons(3, 4))</code> returns <code>4</code>.</p>

<p>Given this implementation of cons:</p>

<pre><code>def cons(a, b):
    def pair(f):
        return f(a, b)
    return pair
</code></pre>

<p>Implement <code>car</code> and <code>cdr</code>.</p>

A

<p>This is a really cool example of using <a>closures</a> to store data. We must look
at the signature type of cons to retrieve its first and last elements. cons takes in
a and b, and returns a new anonymous function, which itself takes in f, and calls
f with a and b. So the input to car and cdr is that anonymous function, which is <code>pair</code>. To get a
and b back, we must feed it yet another function, one that takes in two parameters
and returns the first (if car) or last (if cdr) one.</p>

<pre><code>def car(pair):
    return pair(lambda a, b: a)

def cdr(pair):
    return pair(lambda a, b: b)
</code></pre>

<p>Fun fact: cdr is pronounced "cudder"!</p>

6
Q

<p>This problem was asked by Google.</p>

<p>An XOR linked list is a more memory efficient doubly linked list.
Instead of each node holding <code>next</code> and <code>prev</code> fields, it holds a field named <code>both</code>,
which is an XOR of the next node and the previous node. Implement
an XOR linked list; it has an <code>add(element)</code> which adds the element to the
end, and a <code>get(index)</code> which returns the node at index.</p>

<p>If using a language that has no pointers (such as Python), you can assume you have access to <code>get_pointer</code> and
<code>dereference_pointer</code> functions that convert between nodes and memory addresses.</p>

A

<p>For the head, <code>both</code> will just be the address of next, and if it's the tail, it should
just be the address of prev. And intermediate nodes should have an XOR of <code>next</code> and <code>prev</code>.</p>

<p>Here's an example XOR linked list which meets the above conditions:</p>

<pre><code>A <-> B <-> C <-> D

A.both = B
B.both = A ⊕ C
C.both = B ⊕ D
D.both = C
</code></pre>

<p>Let's work through <code>get</code> first, assuming that the above conditions are maintained. Then, given a
node, to go to the next node, we have to XOR the current node's <code>both</code> with the previous node's
address. And to handle getting the next node from the head, we would initialize the previous node's address as 0.</p>

<p>So in the above example, <code>A</code>'s <code>both</code> is <code>B</code> which when XOR'd with <code>0</code> would become <code>B</code>.
Then <code>B</code>'s <code>both</code> is <code>A ⊕ C</code>, which when XOR'd with <code>A</code> becomes C, etc.</p>

<p>To implement <code>add</code>, we would update the current tail's <code>both</code> to be its current <code>both</code> XOR'd with the new node's memory address.
Then the new node's <code>both</code> would just be the memory address of the current tail. Finally, we'd update
the current tail to be the new node.</p>

<pre><code>import ctypes


# This is hacky. It's a data structure for C, not Python.
class Node(object):
    def __init__(self, val):
        self.val = val
        self.both = 0


class XorLinkedList(object):
    def __init__(self):
        self.head = self.tail = None
        self.__nodes = []  # This is to prevent garbage collection

    def add(self, node):
        if self.head is None:
            self.head = self.tail = node
        else:
            self.tail.both = id(node) ^ self.tail.both
            node.both = id(self.tail)
            self.tail = node

        # Without this line, Python thinks there is no way to reach nodes between
        # head and tail.
        self.__nodes.append(node)

    def get(self, index):
        prev_id = 0
        node = self.head
        for i in range(index):
            next_id = prev_id ^ node.both

            if next_id:
                prev_id = id(node)
                node = _get_obj(next_id)
            else:
                raise IndexError('Linked list index out of range')
        return node


def _get_obj(obj_id):
    return ctypes.cast(obj_id, ctypes.py_object).value
</code></pre>

<p><code>add</code> runs in O(1) time and <code>get</code> runs in O(N) time.</p>

7
Q

<p>This problem was asked by Facebook.</p>

<p>Given the mapping a = 1, b = 2, ... z = 26, and an encoded message, count the number of ways it can be decoded.</p>

<p>For example, the message '111' would give 3, since it could be decoded as 'aaa', 'ka', and 'ak'.</p>

<p>You can assume that the messages are decodable. For example, '001' is not allowed.</p>

A

<p>This looks like a problem that is ripe for solving with recursion. First, let's try to
think of a recurrence we can use for this problem. We can try some cases:</p>

<ul><li>"", the empty string and our base case, should return 1.</li><li>"1" should return 1, since we can parse it as "a" + "".</li><li>"11" should return 2, since we can parse it as "a" + "a" + "" and "k" + "".</li><li>"111" should return 3, since we can parse it as:<ul><li>"a" + "k" + ""</li><li>"k" + "a" + ""</li><li>"a" + "a" + "a" + "".</li></ul></li><li>"011" should return 0, since no letter starts with 0 in our mapping.</li><li>"602" should also return 0 for similar reasons.</li></ul>

<p>We have a good starting point. We can see that the recursive structure is
as follows:</p>

<ul><li>If the string starts with zero, then there's no valid encoding.</li><li>If the string's length is less than or equal to 1, there is only 1 encoding.</li><li>If the first two digits form a number <code>k</code> that is less than or equal to 26, we can recursively
count the number of encodings assuming we pick <code>k</code> as a letter.</li><li>We can also pick the first digit as a letter and count the number of encodings with this assumption.</li></ul>

<pre><code>def num_encodings(s):
    if s.startswith('0'):
        return 0
    elif len(s) <= 1:  # This covers the empty string
        return 1

    total = 0

    if int(s[:2]) <= 26:
        total += num_encodings(s[2:])

    total += num_encodings(s[1:])
    return total
</code></pre>

<p>However, this solution is not very efficient. Every call branches into two further recursive calls,
so our runtime is O(2^n). We can do better by using dynamic programming.</p>

<p>All the following code does is repeat the same computation as above except starting from
the base case and building up the solution. Since each iteration takes O(1), the whole
algorithm now takes O(n).</p>

<pre><code>from collections import defaultdict

def num_encodings(s):
    # On lookup, this hashmap returns a default value of 0 if the key doesn't exist.
    # cache[i] gives us the number of ways to encode the substring s[i:]
    cache = defaultdict(int)
    cache[len(s)] = 1  # The empty string has 1 valid encoding

    for i in reversed(range(len(s))):
        if s[i] == '0':
            cache[i] = 0
        elif i == len(s) - 1:
            cache[i] = 1
        else:
            if int(s[i:i + 2]) <= 26:
                cache[i] = cache[i + 2]
            cache[i] += cache[i + 1]
    return cache[0]
</code></pre>

8
Q

<p>This problem was asked by Google.</p>

<p>A unival tree (which stands for "universal value") is a tree where all nodes under it have the same value.</p>

<p>Given the root to a binary tree, count the number of unival subtrees.</p>

<p>For example, the following tree has 5 unival subtrees:</p>

<pre><code>   0
  / \
 1   0
    / \
   1   0
  / \
 1   1
</code></pre>

A

<p>To start off, we should go through some examples.</p>

<pre><code>   a
  / \
 a   a
    / \
   a   a
        \
         A
</code></pre>

<p>This tree has 3 unival subtrees: the two 'a' leaves, and the one 'A' leaf. The 'A' leaf causes all its
parents to not be counted as a unival tree.</p>

<pre><code>   a
  / \
 c   b
    / \
   b   b
        \
         b
</code></pre>

<p>This tree has 5 unival subtrees: the leaf at 'c', and every 'b'.</p>

<p>We can start off by first writing a function that checks whether a tree is unival or not.
Then, perhaps we could use this to count up all the nodes in the tree.</p>

<p>To check whether a tree is a unival tree, we must check that every node in the tree has
the same value. To start off, we could define an <code>is_unival</code> function that takes in
a root to a tree. We would do this recursively with a helper function. Recall that
a leaf qualifies as a unival tree.</p>

<pre><code>def is_unival(root):
    return unival_helper(root, root.value)

def unival_helper(root, value):
    if root is None:
        return True
    if root.value == value:
        return unival_helper(root.left, value) and unival_helper(root.right, value)
    return False
</code></pre>
</code></pre>

<p>And then our function that counts the number of subtrees could simply use that function:</p>

<pre><code>def count_unival_subtrees(root):
    if root is None:
        return 0
    left = count_unival_subtrees(root.left)
    right = count_unival_subtrees(root.right)
    return 1 + left + right if is_unival(root) else left + right
</code></pre>

<p>However, this runs in O(n^2) time. For each node of the tree, we're evaluating
each node in its subtree again as well. We can improve the runtime by starting
at the leaves of the tree, and keeping track of the unival subtree count and value
as we percolate back up. This should evaluate each node only once, making it
run in O(n) time.</p>

<pre><code>def count_unival_subtrees(root):
    count, _ = helper(root)
    return count

# Returns the number of unival subtrees, and whether the tree is itself a unival subtree.
def helper(root):
    if root is None:
        return 0, True

    left_count, is_left_unival = helper(root.left)
    right_count, is_right_unival = helper(root.right)
    total_count = left_count + right_count

    if is_left_unival and is_right_unival:
        if root.left is not None and root.value != root.left.value:
            return total_count, False
        if root.right is not None and root.value != root.right.value:
            return total_count, False
        return total_count + 1, True
    return total_count, False
</code></pre>

9
Q

<p>This problem was asked by Airbnb.</p>

<p>Given a list of integers, write a function that returns the largest sum of non-adjacent numbers. Numbers can be <code>0</code> or negative.</p>

<p>For example, <code>[2, 4, 6, 2, 5]</code> should return <code>13</code>, since we pick <code>2</code>, <code>6</code>, and <code>5</code>. <code>[5, 1, 1, 5]</code> should return <code>10</code>, since we pick <code>5</code> and <code>5</code>.</p>

<p>Follow-up: Can you do this in O(N) time and constant space?</p>

A

<p>This problem seems easy on the surface, but is actually quite tricky. It's tempting
to try a greedy strategy, like picking the largest number (or the first), then the second-largest if
it's non-adjacent, and so on, but these strategies don't work: there will always be some edge case that
breaks them.</p>

<p>Instead, we should look at this problem recursively. Say we had a function that already
returns the largest sum of non-adjacent integers on smaller inputs. How could we use it to figure out
what we want?</p>

<p>Say we used this function on <code>a[1:]</code> and <code>a[2:]</code>. Then our answer should be
the larger of the result on <code>a[1:]</code> and <code>a[0]</code> plus the result on <code>a[2:]</code>,
since picking <code>a[0]</code> precludes us from picking <code>a[1]</code>. So, we could write a straightforward recursive solution like this:</p>

<pre><code>def largest_non_adjacent(arr):
    if not arr:
        return 0

    return max(
        largest_non_adjacent(arr[1:]),
        arr[0] + largest_non_adjacent(arr[2:]))
</code></pre>

<p>However, this solution runs in O(2^n) time, since with each call, we're making two further recursive calls.
We could memoize the results, or use dynamic programming to store, in an array, the largest sum of
non-adjacent numbers from index <code>0</code> up to that point. Like so:</p>

<pre><code>def largest_non_adjacent(arr):
    if not arr:
        return 0
    if len(arr) <= 2:
        return max(0, max(arr))

    cache = [0 for i in arr]
    cache[0] = max(0, arr[0])
    cache[1] = max(cache[0], arr[1])

    for i in range(2, len(arr)):
        num = arr[i]
        cache[i] = max(num + cache[i - 2], cache[i - 1])
    return cache[-1]
</code></pre>

<p>This code should run in O(n) time and O(n) space. But we can improve this even further. Notice that we only
ever use the last two elements of the cache when iterating through the array. This suggests that we could
get rid of most of the array and just store them as variables:</p>

<pre><code>def largest_non_adjacent(arr):
    if not arr:
        return 0
    if len(arr) <= 2:
        return max(0, max(arr))

    max_excluding_last = max(0, arr[0])
    max_including_last = max(max_excluding_last, arr[1])

    for num in arr[2:]:
        prev_max_including_last = max_including_last

        max_including_last = max(max_including_last, max_excluding_last + num)
        max_excluding_last = prev_max_including_last

    return max(max_including_last, max_excluding_last)
</code></pre>

10
Q

<p>This problem was asked by Apple.</p>

<p>Implement a job scheduler which takes in a function <code>f</code> and an integer <code>n</code>, and calls <code>f</code> after <code>n</code> milliseconds.</p>

A

<p>We can implement the job scheduler in many different ways, so don't worry if
your solution is different from ours. Here is just one way:</p>

<p>First, let's try the most straightforward solution. That would probably be
to spin off a new thread on each function we want to delay, sleep the
requested amount, and then run the function. It might look something like this:</p>

<pre><code>import threading
from time import sleep

class Scheduler:
    def __init__(self):
        pass

    def delay(self, f, n):
        def sleep_then_call():
            sleep(n / 1000)
            f()
        t = threading.Thread(target=sleep_then_call)
        t.start()
</code></pre>

<p>While this works, there is a huge problem with this method: we spin off a new
thread each time we call delay! That means the number of threads we use could
easily explode. We can get around this by having only one dedicated thread
to call the functions, and storing the functions we need to call in some data
structure. In this case, we use a list. We also have to do some sort of polling now to
check when to run a function. We can store each function along with a unix epoch timestamp
that tells it when it should run by. Then we'll poll some designated tick amount and
check the list for any jobs that are due to be run, run them, and then remove them
from the list.</p>

<pre><code>from time import sleep, time
import threading

class Scheduler:
    def __init__(self):
        self.fns = []  # list of (fn, due) tuples
        t = threading.Thread(target=self.poll)
        t.start()

    def poll(self):
        while True:
            now = time() * 1000
            for fn, due in self.fns:
                if now > due:
                    fn()
            self.fns = [(fn, due) for (fn, due) in self.fns if due > now]
            sleep(0.01)

    def delay(self, f, n):
        self.fns.append((f, time() * 1000 + n))
</code></pre>

<p>We'll stop here, but you can go much farther with this. Some extra credit work:</p>

<ul><li>Extend the scheduler to allow calling delayed functions with variables</li><li>Use a heap instead of a list to keep track of the next job to run more efficiently</li><li>Use a condition variable instead of polling (it just polls lower in the stack)</li><li>Use a threadpool or other mechanism to decrease the chance of starvation (one thread
not being able to run because of another running thread)</li></ul>
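<p>As a sketch of the heap idea above (an assumption-laden variant, not a full solution): keep a min-heap ordered by due time, so the polling thread only ever needs to inspect the earliest job, and use a sequence number as a tie-breaker so functions are never compared directly.</p>

```python
import heapq
import itertools
import threading
from time import sleep, time

class HeapScheduler:
    def __init__(self):
        self.jobs = []  # min-heap of (due_ms, seq, fn)
        self.counter = itertools.count()  # tie-breaker: never compare fns
        self.lock = threading.Lock()
        threading.Thread(target=self.poll, daemon=True).start()

    def poll(self):
        while True:
            now = time() * 1000
            with self.lock:
                # Only the heap root can be due; pop jobs until it isn't.
                while self.jobs and self.jobs[0][0] <= now:
                    _, _, fn = heapq.heappop(self.jobs)
                    fn()
            sleep(0.01)

    def delay(self, f, n):
        with self.lock:
            heapq.heappush(self.jobs, (time() * 1000 + n, next(self.counter), f))
```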

11
Q

<p>This problem was asked by Twitter.</p>

<p>Implement an autocomplete system. That is, given a query string <code>s</code> and a set of all possible query strings,
return all strings in the set that have s as a prefix.</p>

<p>For example, given the query string <code>de</code> and the set of strings [<code>dog</code>, <code>deer</code>, <code>deal</code>], return [<code>deer</code>, <code>deal</code>].</p>

<p>Hint: Try preprocessing the dictionary into a more efficient data structure to speed up queries.</p>

A

<p>The naive solution here is very straightforward: we need only iterate over the
dictionary and check whether each word starts with our prefix. If it does, we
add it to our set of results, which we return once we're done.</p>

<pre><code>WORDS = ['foo', 'bar', ...]

def autocomplete(s):
    results = set()
    for word in WORDS:
        if word.startswith(s):
            results.add(word)
    return results
</code></pre>

<p>This runs in O(N) time, where N is the number of words in the dictionary.
Let's think about making this more efficient. We can preprocess the words,
but what data structure would be best for our problem?</p>

<p>If we pre-sort the list, we could use binary search to find the first word
that includes our prefix and then the last, and return everything in between.</p>
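<p>A sketch of that binary-search idea using Python's bisect module (hypothetical helper; the sentinel trick assumes plain ASCII words):</p>

```python
from bisect import bisect_left, bisect_right

def autocomplete_sorted(words, prefix):
    # words must be pre-sorted. Every word sharing the prefix forms one
    # contiguous run in sorted order; binary-search for its two ends.
    lo = bisect_left(words, prefix)
    # '\xff' compares greater than any ASCII character, so prefix + '\xff'
    # sorts just past the last word that starts with the prefix.
    hi = bisect_right(words, prefix + '\xff')
    return words[lo:hi]
```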

<p>Alternatively, we could use a tree for this. Not a binary tree, but a tree where each child
represents one character of the alphabet. For example, let's say we had the
words 'a' and 'dog' in our dictionary. Then the tree would look like this:</p>

<pre><code>  x
 / \
a   d
     \
      o
       \
        g
</code></pre>

<p>Then, to find all words beginning with 'do', we could start at the root,
go into the 'd' child, and then the 'o' child, and gather up all the words
under there. We would also need some sort of terminal value to mark whether or not
'do' is itself a word in our dictionary. This data structure is known
as a <a>trie</a>.</p>

<p>So the idea is to preprocess the dictionary into this tree, and then when
we search for a prefix, go into the trie and get all the words under
that prefix node and return those. While the worst-case runtime would still be
O(n) if all the search results have that prefix, if the words are uniformly
distributed across the alphabet, it should be much faster on average since we
no longer have to evaluate words that don't start with our prefix.</p>

<pre><code>ENDS_HERE = '__ENDS_HERE'

class Trie(object):
    def __init__(self):
        self._trie = {}

    def insert(self, text):
        trie = self._trie
        for char in text:
            if char not in trie:
                trie[char] = {}
            trie = trie[char]
        trie[ENDS_HERE] = True

    def elements(self, prefix):
        d = self._trie
        for char in prefix:
            if char in d:
                d = d[char]
            else:
                return []
        return self._elements(d)

    def _elements(self, d):
        result = []
        for c, v in d.items():
            if c == ENDS_HERE:
                subresult = ['']
            else:
                subresult = [c + s for s in self._elements(v)]
            result.extend(subresult)
        return result

trie = Trie()
for word in WORDS:
    trie.insert(word)

def autocomplete(s):
    suffixes = trie.elements(s)
    return [s + w for w in suffixes]
</code></pre>

12
Q

<p>This problem was asked by Amazon.</p>

<p>There exists a staircase with N steps, and you can climb up either 1 or 2 steps at a time.
Given N, write a function that returns the number of unique ways you can climb the staircase.
The order of the steps matters.</p>

<p>For example, if N is 4, then there are 5 unique ways:</p>

<ul><li>1, 1, 1, 1</li><li>2, 1, 1</li><li>1, 2, 1</li><li>1, 1, 2</li><li>2, 2</li></ul>

<p>What if, instead of being able to climb 1 or 2 steps at a time, you could climb any number
from a set of positive integers X? For example, if X = {1, 3, 5}, you could climb 1, 3, or 5
steps at a time.</p>

A

<p>It's always good to start off with some test cases. Let's start with small cases
and see if we can find some sort of pattern.</p>

<ul><li>N = 1: [1]</li><li>N = 2: [1, 1], [2]</li><li>N = 3: [1, 2], [1, 1, 1], [2, 1]</li><li>N = 4: [1, 1, 2], [2, 2], [1, 2, 1], [1, 1, 1, 1], [2, 1, 1]</li></ul>

<p>What's the relationship?</p>

<p>The only ways to get to N = 3 are to first get to N = 1 and then go up by 2
steps, or to get to N = 2 and go up by 1 step. So f(3) = f(2) + f(1).</p>

<p>Does this hold for N = 4? Yes, it does. Since we can only get to the 4th step
by getting to the 3rd step and going up by one, or by getting to the 2nd step
and going up by two. So f(4) = f(3) + f(2).</p>

<p>To generalize, f(n) = f(n - 1) + f(n - 2). That's just the <a>Fibonacci sequence</a>!</p>

<pre><code>def staircase(n):
    if n <= 1:
        return 1
    return staircase(n - 1) + staircase(n - 2)
</code></pre>

<p>Of course, this is really slow (O(2^N)), since we are doing a lot of repeated computations!
We can do it a lot faster by computing iteratively:</p>

<pre><code>def staircase(n):
    a, b = 1, 2
    for _ in range(n - 1):
        a, b = b, a + b
    return a
</code></pre>

<p>Now, let's try to generalize what we've learned so that it works if you can take a number of steps
from the set X. Similar reasoning tells us that if X = {1, 3, 5}, then our algorithm should be
f(n) = f(n - 1) + f(n - 3) + f(n - 5). If n < 0, then we should return 0 since we can't start
from a negative number of steps.</p>

<pre><code>def staircase(n, X):
    if n < 0:
        return 0
    elif n == 0:
        return 1
    else:
        return sum(staircase(n - x, X) for x in X)
</code></pre>

<p>This is, again, very slow (O(|X|^N)), since we are repeating computations. We can
use dynamic programming to speed it up.</p>

<p>Each entry cache[i] will contain the number of ways we can get to step i with the set X.
Then, we'll build up the array from zero using the same recurrence as before:</p>

<pre><code>def staircase(n, X):
    cache = [0 for _ in range(n + 1)]
    cache[0] = 1
    for i in range(1, n + 1):
        cache[i] += sum(cache[i - x] for x in X if i - x >= 0)
    return cache[n]
</code></pre>

<p>This now takes O(N * |X|) time and O(N) space.</p>

13
Q

<p>This problem was asked by Amazon.</p>

<p>Given an integer k and a string s, find the length of the longest substring that contains at most k distinct characters.</p>

<p>For example, given s = "abcba" and k = 2, the longest substring with k distinct characters is "bcb".</p>

A

<p>The most obvious brute force solution here is to simply try every possible substring of the string
and check whether it contains at most <code>k</code> distinct characters. If it does and it is longer than
the current longest valid substring, then update the current one. This takes O(n^3) time,
since there are O(n^2) possible substrings and collecting the distinct characters of each takes up to O(n).</p>

<pre><code>def longest_substring_with_k_distinct_characters(s, k):
    current_longest_substring = ''
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            substring = s[i:j]
            if len(set(substring)) <= k and len(substring) > len(current_longest_substring):
                current_longest_substring = substring
    return len(current_longest_substring)
</code></pre>

<p>We can improve this by instead keeping a running window of our longest substring.
We'll keep a dictionary that maps characters to the index of their last occurrence. Then,
as we iterate over the string, we'll check the size of the dictionary. If it's larger
than k, then our window is too big, so we have to pop the character whose last
occurrence is earliest and recompute the bounds. If adding a character to the dictionary
doesn't push its size over k, then we're safe: the dictionary hasn't been
filled up yet, or it's a character we've seen before.</p>

<pre><code>def longest_substring_with_k_distinct_characters(s, k):
    if k == 0:
        return 0

    # Keep a running window
    bounds = (0, 0)
    h = {}
    max_length = 0
    for i, char in enumerate(s):
        h[char] = i
        if len(h) <= k:
            new_lower_bound = bounds[0]  # lower bound remains the same
        else:
            # otherwise, pop the character with the earliest last occurrence
            key_to_pop = min(h, key=h.get)
            new_lower_bound = h.pop(key_to_pop) + 1

        bounds = (new_lower_bound, bounds[1] + 1)
        max_length = max(max_length, bounds[1] - bounds[0])

    return max_length
</code></pre>

<p>This takes O(n * k) time and O(k) space.</p>
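<p>A quick check of the windowed version against the example (the function is restated so the snippet runs standalone):</p>

```python
def longest_substring_with_k_distinct_characters(s, k):
    if k == 0:
        return 0

    bounds = (0, 0)   # running window
    h = {}            # char -> index of its last occurrence
    max_length = 0
    for i, char in enumerate(s):
        h[char] = i
        if len(h) <= k:
            new_lower_bound = bounds[0]
        else:
            # evict the character whose last occurrence is oldest
            key_to_pop = min(h, key=h.get)
            new_lower_bound = h.pop(key_to_pop) + 1
        bounds = (new_lower_bound, bounds[1] + 1)
        max_length = max(max_length, bounds[1] - bounds[0])
    return max_length

print(longest_substring_with_k_distinct_characters("abcba", 2))  # 3 ("bcb")
```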

14
Q

<p>This problem was asked by Google.</p>

<p>The area of a circle is defined as πr^2. Estimate π to 3 decimal places using a Monte Carlo method.</p>

<p>Hint: The basic equation of a circle is x^2 + y^2 = r^2.</p>

A

<p>Monte Carlo methods rely on random sampling. In this case, if we take a cartesian plane and inscribe a
circle with radius <code>r</code> inside a square with side length <code>2r</code>, then the area of the circle will be πr^2 while
the area of the square will be (2r)^2 = 4r^2. Then, the ratio of the areas of the circle to the square
is <code>π / 4</code>.</p>

<p>So, what we can do is the following:</p>

<ul><li>Set r to be 1 (the unit circle)</li><li>Randomly generate points within the square with corners (-1, -1), (1, 1), (1, -1), (-1, 1)</li><li>Keep track of the points that fall inside and outside the circle<ul><li>You can check whether a point (x, y) is inside the circle if x^2 + y^2 < r^2, which is another way of representing a circle</li></ul></li><li>Divide the number of points that fall inside the circle by the total number of points -- that should give us an approximation of π / 4.</li></ul>

<pre><code>from random import uniform

def generate():
    return (uniform(-1, 1), uniform(-1, 1))

def is_in_circle(coords):
    return coords[0] * coords[0] + coords[1] * coords[1] < 1

def estimate():
    iterations = 10000000
    in_circle = 0
    for _ in range(iterations):
        if is_in_circle(generate()):
            in_circle += 1
    pi_over_four = in_circle / iterations
    return pi_over_four * 4
</code></pre>

<p>Note that this doesn't give a perfect approximation -- we need more iterations to get a closer estimate.
We want the digits of pi up to 3 decimal places. This translates to an error of < 10^(-3). The error
scales inversely with the square root of the number of samples, which means we need about 10^6 iterations
to reach our desired precision. If we want more precision, we'll have to crank up the iterations further.</p>

<p>This problem <em>is</em> <a>embarrassingly parallel</a>.
None of the samples depend on each other, so we can parallelize this problem easily -- divide the workload among the <code>P</code> processes you have,
and then add up all the in-circle counts at the end. Extra credit: make this program multi-process.</p>
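<p>For the extra credit, here is a minimal sketch using the standard library's <code>multiprocessing.Pool</code> (the worker function and iteration counts are illustrative choices, not part of the original solution):</p>

```python
from multiprocessing import Pool
from random import uniform

def count_in_circle(iterations):
    # Each worker samples independently and reports its own hit count.
    hits = 0
    for _ in range(iterations):
        x, y = uniform(-1, 1), uniform(-1, 1)
        if x * x + y * y < 1:
            hits += 1
    return hits

def estimate_parallel(total_iterations=10_000_000, processes=4):
    per_process = total_iterations // processes
    with Pool(processes) as pool:
        counts = pool.map(count_in_circle, [per_process] * processes)
    # Combine the workers' counts into a single estimate of pi.
    return 4 * sum(counts) / (per_process * processes)

if __name__ == '__main__':
    print(estimate_parallel())
```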

15
Q

<p>This problem was asked by Facebook.</p>

<p>Given a stream of elements too large to store in memory, pick a random element from the stream with uniform probability.</p>

A

<p>Naively, we could process the stream and store all the elements we encounter in a list, find its size,
and pick a random element from [0, size - 1]. The problem with this approach is that it would take
O(N) space for a large N.</p>

<p>Instead, let’s attempt to solve this using loop invariants. On the ith iteration of our loop to pick a
random element, let’s assume we already picked an element uniformly from [0, i - 1]. In order to
maintain the loop invariant, we would need to pick the ith element as the new random element with
probability 1 / (i + 1). For the base case where i = 0, let’s say the random element is the first one. Then we know it
works because</p>

<ul><li>For i >= 0, before the loop began, any element K in [0, i - 1] had a 1 / i chance of being chosen as the random element. We want K to have a 1 / (i + 1) chance of being chosen after the iteration. This is the case, since the chance of having been chosen already but not getting swapped with the ith element is 1 / i * (1 - (1 / (i + 1))), which is 1 / i * i / (i + 1), or 1 / (i + 1)</li></ul>

<p>Let’s see how the code would look:</p>

<pre><code>import random

def pick(big_stream):
    random_element = None

    for i, e in enumerate(big_stream):
        if random.randint(1, i + 1) == 1:
            random_element = e
    return random_element
</code></pre>

<p>Since we are only storing a single variable, this only takes up constant space!</p>

<p>By the way, this is called <a>reservoir sampling</a>!</p>

16
Q

<p>This problem was asked by Twitter.</p>

<p>You run an e-commerce website and want to record the last <code>N</code> <code>order</code> ids in a log.
Implement a data structure to accomplish this, with the following API:</p>

<ul><li>record(order_id): adds the order_id to the log</li><li>get_last(i): gets the ith last element from the log. i is guaranteed to be smaller than or equal to N.</li></ul>

<p>You should be as efficient with time and space as possible.</p>

A

<p>It seems like an array would be the perfect fit for this problem. We can just initialize the
array to have size N, and index it in constant time. Then, when we record any orders, we can
pop off the first order and append it to the end. Getting the ith last order would then just
be indexing the array at <code>length - i</code>.</p>

<pre><code>class Log(object):
    def __init__(self, n):
        self._log = []
        self.n = n

    def record(self, order_id):
        if len(self._log) >= self.n:
            self._log.pop(0)
        self._log.append(order_id)

    def get_last(self, i):
        return self._log[-i]
</code></pre>

<p>There is one issue with this solution, however: when the array is full and we pop off the first
element, every other element must be moved down by 1. That means <code>record</code> takes O(N)
time. How can we improve this?</p>

<p>What we can do to avoid moving every element down by 1 is to keep a current index and move it up
each time we record something. For <code>get_last</code>, we can simply take <code>current - i</code> to get the appropriate element.
Now, both <code>record</code> and <code>get_last</code> take constant time.</p>

<pre><code>class Log(object):
    def __init__(self, n):
        self.n = n
        self._log = []
        self._cur = 0

    def record(self, order_id):
        if len(self._log) == self.n:
            self._log[self._cur] = order_id
        else:
            self._log.append(order_id)
        self._cur = (self._cur + 1) % self.n

    def get_last(self, i):
        return self._log[self._cur - i]
</code></pre>

<p>By the way, this is called a ring buffer or <a>circular buffer</a>!</p>
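<p>As a quick sanity check, the circular-buffer version can be exercised like this (the class is restated so the snippet runs on its own):</p>

```python
class Log(object):
    def __init__(self, n):
        self.n = n
        self._log = []
        self._cur = 0

    def record(self, order_id):
        # Overwrite the oldest slot once the buffer is full.
        if len(self._log) == self.n:
            self._log[self._cur] = order_id
        else:
            self._log.append(order_id)
        self._cur = (self._cur + 1) % self.n

    def get_last(self, i):
        return self._log[self._cur - i]

log = Log(3)
for order_id in [1, 2, 3, 4, 5]:
    log.record(order_id)
# Only the last 3 orders survive: 3, 4, 5.
print(log.get_last(1))  # 5
print(log.get_last(3))  # 3
```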

17
Q

<p>This problem was asked by Google.</p>

<p>Suppose we represent our file system by a string in the following manner:</p>

<p>The string <code>"dir\n\tsubdir1\n\tsubdir2\n\t\tfile.ext"</code> represents:</p>

<pre><code>dir
    subdir1
    subdir2
        file.ext
</code></pre>

<p>The directory <code>dir</code> contains an empty sub-directory <code>subdir1</code> and a sub-directory <code>subdir2</code> containing a file <code>file.ext</code>.</p>

<p>The string <code>"dir\n\tsubdir1\n\t\tfile1.ext\n\t\tsubsubdir1\n\tsubdir2\n\t\tsubsubdir2\n\t\t\tfile2.ext"</code> represents:</p>

<pre><code>dir
    subdir1
        file1.ext
        subsubdir1
    subdir2
        subsubdir2
            file2.ext
</code></pre>

<p>The directory <code>dir</code> contains two sub-directories <code>subdir1</code> and <code>subdir2</code>. <code>subdir1</code> contains a file <code>file1.ext</code>
and an empty second-level sub-directory <code>subsubdir1</code>. <code>subdir2</code> contains a second-level sub-directory
<code>subsubdir2</code> containing a file <code>file2.ext</code>.</p>

<p>We are interested in finding the longest (number of characters) absolute path to a file within our
file system. For example, in the second example above, the longest absolute path is <code>"dir/subdir2/subsubdir2/file2.ext"</code>,
and its length is 32 (not including the double quotes).</p>

<p>Given a string representing the file system in the above format, return the length of the longest
absolute path to a file in the abstracted file system. If there is no file in the system, return 0.</p>

<p>Note:</p>

<p>The name of a file contains at least a period and an extension.</p>

<p>The name of a directory or sub-directory will not contain a period.</p>

A

<p>There are two steps in solving this question: we must first parse the string
representing the file system and then get the longest absolute path to a file.</p>

<h6>Step 1: Parsing the file system</h6>

<p>Ideally, we would initially parse the string given into a dictionary of some sort.
That would mean a string like:</p>

<pre><code>dir\n\tsubdir1\n\t\tfile1.ext\n\t\tsubsubdir1\n\tsubdir2\n\t\tsubsubdir2\n\t\t\tfile2.ext
</code></pre>

<p>would become:</p>

<pre><code>{
    "dir": {
        "subdir1": {
            "file1.ext": True,
            "subsubdir1": {}
        },
        "subdir2": {
            "subsubdir2": {
                "file2.ext": True
            }
        }
    }
}
</code></pre>

<p>where each key with a dictionary as its value represents a directory, and a key
with <code>True</code> as its value represents an actual file.</p>

<p>To achieve this, we can first split the string by the newline character, meaning each
item in our array represents a file or directory. Then, we create an empty dictionary
to represent our parsed file system and traverse the file system on each entry.
We keep track of the last path we've seen so far in <code>current_path</code> because we may need
to return to some level in that path, depending on the number of tabs. Once we are at
the correct place to put down the new directory or file, we check the name for a <code>.</code>
and set the correct value to either <code>True</code> (if file) or <code>{}</code> (if directory).</p>

<pre><code>def build_fs(input):
    fs = {}
    files = input.split('\n')

    current_path = []
    for f in files:
        indentation = 0
        while f.startswith('\t'):
            indentation += 1
            f = f[1:]

        current_node = fs
        for subdir in current_path[:indentation]:
            current_node = current_node[subdir]

        if '.' in f:
            current_node[f] = True
        else:
            current_node[f] = {}

        current_path = current_path[:indentation]
        current_path.append(f)

    return fs
</code></pre>

<h6>Step 2: Computing the longest path</h6>

<p>After we've constructed a native representation of the file system, we can write a fairly
straightforward recursive function that takes the current root, recursively calculates the
<code>longest_path</code> of all the subdirectories and files under the root, and returns the longest
one. Remember that since we specifically want the longest path to a file, we discard any paths
that do not have a <code>.</code> in them. And if there are no paths starting at this root, then we can
simply return the empty string.</p>

<pre><code>def longest_path(root):
    paths = []
    for key, node in root.items():
        if node == True:
            paths.append(key)
        else:
            paths.append(key + '/' + longest_path(node))
    # filter out unfinished paths
    paths = [path for path in paths if '.' in path]
    if paths:
        return max(paths, key=lambda path: len(path))
    else:
        return ''
</code></pre>

<h6>Step 3: Putting it together</h6>

<p>Now that the hard part is done, we just need to put the two together:</p>

<pre><code>def longest_absolute_path(s):
    return len(longest_path(build_fs(s)))
</code></pre>

<p>This runs in O(n), since we iterate over the input string twice to build the
file system, and then in the worst case we go through the string again
to compute the longest path.</p>
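<p>As an aside, the same O(n) bound can also be reached in a single pass, without building the tree at all, by tracking the running path length at each depth. A sketch (the function name is ours, not from the solution above):</p>

```python
def longest_absolute_path_one_pass(s):
    # depth_len[d] = length of the current path prefix at depth d,
    # including the trailing '/' separators.
    depth_len = {0: 0}
    best = 0
    for line in s.split('\n'):
        name = line.lstrip('\t')
        depth = len(line) - len(name)  # number of leading tabs
        if '.' in name:
            # A file: candidate for the longest absolute path.
            best = max(best, depth_len[depth] + len(name))
        else:
            # A directory: extend the prefix for the next depth (+1 for '/').
            depth_len[depth + 1] = depth_len[depth] + len(name) + 1
    return best

s = "dir\n\tsubdir1\n\t\tfile1.ext\n\t\tsubsubdir1\n\tsubdir2\n\t\tsubsubdir2\n\t\t\tfile2.ext"
print(longest_absolute_path_one_pass(s))  # 32
```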

18
Q

<p>This problem was asked by Google.</p>

<p>Given an array of integers and a number k, where 1 <= k <= length of the array, compute the maximum values of each subarray of length k.</p>

<p>For example, given array = [10, 5, 2, 7, 8, 7] and k = 3, we should get: [10, 7, 8, 8], since:</p>

<ul><li>10 = max(10, 5, 2)</li><li>7 = max(5, 2, 7)</li><li>8 = max(2, 7, 8)</li><li>8 = max(7, 8, 7)</li></ul>

<p>Do this in O(n) time and O(k) space. You can modify the input array in-place and you do not need to store the results. You can simply print them out as you compute them.</p>

A

<p>Even though the question states O(n), in an interview it's always useful to first write out
a brute force solution, which may provide us with some insight on some deeper structure
in the problem.</p>

<p>So let's first write out a naive solution: we can simply take each subarray of k length and compute
their maxes.</p>

<pre><code>def max_of_subarrays(lst, k):
    for i in range(len(lst) - k + 1):
        print(max(lst[i:i + k]))
</code></pre>

<p>This takes O(n * k) time, which doesn't get us quite to where we want. How can we make this faster?</p>

<p>One possible idea is this: we could use a max-heap of size k and add the first k elements to the heap
initially, and then pop off the max and add the next element for the rest of the array. This is better,
but adding and extracting from the heap will take O(log k), so this algorithm will take O(n * log k),
which is still not enough. How can we do better?</p>
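<p>For comparison, here is a sketch of that heap-based O(n log k) approach. Python's <code>heapq</code> is a min-heap, so values are negated; rather than strictly evicting one element per step, we lazily discard heap entries whose indices have fallen out of the window. It collects results in a list instead of printing:</p>

```python
import heapq

def max_of_subarrays_heap(lst, k):
    # Store (negated value, index) so the min-heap behaves as a max-heap.
    heap = [(-lst[i], i) for i in range(k)]
    heapq.heapify(heap)
    result = [-heap[0][0]]
    for i in range(k, len(lst)):
        heapq.heappush(heap, (-lst[i], i))
        # Lazily discard maxes that are no longer inside the window.
        while heap[0][1] <= i - k:
            heapq.heappop(heap)
        result.append(-heap[0][0])
    return result

print(max_of_subarrays_heap([10, 5, 2, 7, 8, 7], 3))  # [10, 7, 8, 8]
```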

<p>Notice that, for example, with the input [1, 2, 3, 4, 5, 6, 7, 8, 9] and k = 3, after evaluating the max of
the first range, since 3 is at the end, we only need to check whether 4 is greater than 3. If it is, then
we can print 4 immediately, and if it isn't, we can stick with 3.</p>

<p>On the other hand, for the input [9, 8, 7, 6, 5, 4, 3, 2, 1] and k = 3, after evaluating the max of
the first range, we can't do the same thing, since we can't use 9 again. We have to look at 8 instead,
and then once we move on to the next range, we have to look at 7.</p>

<p>These two data points suggest an idea: we can keep a double-ended queue with max size k and only keep
what we need to evaluate in it. That is, if we see [1, 3, 5], then we only need to keep [5], since
we know that 1 and 3 cannot possibly be the maxes.</p>

<p>So what we can do is maintain an ordered list of indices, keeping only the elements we care about.
That is, we will maintain the loop invariant that our queue is always ordered and contains only
the indices we care about (i.e., there is no later element that is greater, since we would just pick
that greater element as the max instead).</p>

<p>It will help to go over an example. Consider our test input: [10, 5, 2, 7, 8, 7] and k = 3. Our queue at each step would look like this (recall that these are indices):</p>

<h3>Preprocessing</h3>

<p>After processing 10: [0]
After processing 5: [0, 1] # 5 is smaller than 10, and 10 is still valid until we hit the 3rd index
After processing 2: [0, 1, 2] # 2 is smaller than 5, and 10 is still valid</p>

<h3>Main Loop</h3>

<p>Print value of first element in our queue: <strong>10</strong></p>

<p>After processing 7: [3] # 10 is no longer valid (we can tell since current index - 0 >= k), so we dequeue it from the front. 7 is bigger than 5 and 2, so we get rid of them from the back and add the 7</p>

<p>Print value of first element in our queue: <strong>7</strong></p>

<p>After processing 8: [4] # 8 is bigger than 7, so no point in keeping 7 around. We get rid of it from the back and replace it with the 8</p>

<p>Print value of first element in our queue: <strong>8</strong></p>

<p>After processing 7: [4, 5] # 7 is smaller than 8, so we enqueue it at the back</p>

<p>Print value of first element in our queue: <strong>8</strong></p>

<h3>Code</h3>

<pre><code>from collections import deque

def max_of_subarrays(lst, k):
    q = deque()
    for i in range(k):
        while q and lst[i] >= lst[q[-1]]:
            q.pop()
        q.append(i)

    # Loop invariant: q holds indices whose corresponding values are in descending order.
    for i in range(k, len(lst)):
        print(lst[q[0]])
        while q and q[0] <= i - k:
            q.popleft()
        while q and lst[i] >= lst[q[-1]]:
            q.pop()
        q.append(i)
    print(lst[q[0]])
</code></pre>

19
Q

<p>This problem was asked by Facebook.</p>

<p>A builder is looking to build a row of N houses that can be of K different colors. He has
a goal of minimizing cost while ensuring that no two neighboring houses are of the same color.</p>

<p>Given an N by K matrix where the nth row and kth column represents the cost to build the nth
house with kth color, return the minimum cost which achieves this goal.</p>

A

<p>The brute force solution here would be to generate all possible combinations of houses
and colors, filter out invalid combinations, and keep track of the lowest cost seen.
Since each of the N houses can take any of K colors, this would take O(K^N) time.</p>

<p>We can solve this problem faster using dynamic programming. We can maintain a matrix cache
where every entry [i][j] represents the minimum cost of painting house i the color j,
as well as painting every house < i. We can calculate this by looking at the minimum
cost of painting each house < i - 1, and painting house i - 1 any color except j,
since that would break our constraint. We'll initialize the first row with zeroes to start.
Then, we just have to look at the smallest value in the last row of our cache, since
that represents the minimum cost of painting every house.</p>

<pre><code>def build_houses(matrix):
    n = len(matrix)
    k = len(matrix[0])
    solution_matrix = [[0] * k]

    # solution_matrix[i][j] represents the minimum cost to build house i with color j,
    # together with every house before it.
    for r, row in enumerate(matrix):
        row_cost = []
        for c, val in enumerate(row):
            row_cost.append(min(solution_matrix[r][i] for i in range(k) if i != c) + val)
        solution_matrix.append(row_cost)
    return min(solution_matrix[-1])
</code></pre>

<p>This runs in O(N * K^2) time and O(N * K) space. Can we do even better than this?</p>

<p>First off, notice that we're only ever looking at the last row when computing the next
row's cost. That suggests that we only need to keep track of one array of size K instead
of a whole matrix of size N * K:</p>

<pre><code>def build_houses(matrix):
    k = len(matrix[0])
    soln_row = [0] * k

    for r, row in enumerate(matrix):
        new_row = []
        for c, val in enumerate(row):
            new_row.append(min(soln_row[i] for i in range(k) if i != c) + val)
        soln_row = new_row
    return min(soln_row)
</code></pre>

<p>Now we're only using O(K) space! Can we improve this any more?</p>

<p>Hold on a second. When we're looking at the previous row's total cost, it looks like we're almost
computing the same thing each time: the minimum of the previous row that isn't the current index.</p>

<p>For every element that <strong>isn't</strong> that index, it will be the same value. When it <strong>is</strong> that index,
it will be the second-smallest value.</p>

<p>Now, armed with this insight, we only need to keep track of three variables:</p>

<ul><li>The lowest cost of the current row</li><li>The index of the lowest cost</li><li>The second lowest cost</li></ul>

<p>Then, when looking at the value at each row, we only need to do the following:</p>

<ul><li>Check if the index is the index of the lowest cost of the previous row. If it is, then we can't use
this color -- we'll use the second lowest cost instead. Otherwise, use the lowest cost
of the previous row</li><li>Calculate the minimum cost if we painted this house this particular color</li><li>Update our new lowest cost/index or second lowest cost if appropriate</li></ul>

<p>Now we'll always have our lowest cost in a variable, and once we've gone
through the matrix we can just return that.</p>

<pre><code>from math import inf

def build_houses(matrix):
    lowest_cost, lowest_cost_index = 0, -1
    second_lowest_cost = 0

    for r, row in enumerate(matrix):
        new_lowest_cost, new_lowest_cost_index = inf, -1
        new_second_lowest_cost = inf
        for c, val in enumerate(row):
            prev_lowest_cost = second_lowest_cost if c == lowest_cost_index else lowest_cost
            cost = prev_lowest_cost + val
            if cost < new_lowest_cost:
                new_second_lowest_cost = new_lowest_cost
                new_lowest_cost, new_lowest_cost_index = cost, c
            elif cost < new_second_lowest_cost:
                new_second_lowest_cost = cost
        lowest_cost = new_lowest_cost
        lowest_cost_index = new_lowest_cost_index
        second_lowest_cost = new_second_lowest_cost

    return lowest_cost
</code></pre>

<p>Now the runtime is only O(N * K) and the space complexity is O(1) - constant, since we keep track of only three
variables!</p>

<p>Thanks to Alexander Shirkov for pointing out these optimizations!</p>

20
Q

<p>This problem was asked by Google.</p>

<p>Given two singly linked lists that intersect at some point, find the intersecting node. The lists
are non-cyclical.</p>

<p>For example, given A = 3 -> 7 -> 8 -> 10 and B = 99 -> 1 -> 8 -> 10, return the node with value 8.</p>

<p>In this example, assume nodes with the same value are the exact same node objects.</p>

<p>Do this in O(M + N) time (where M and N are the lengths of the lists) and constant space.</p>

A

<p>We might start this problem by first ignoring the time and space constraints, in
order to get a better grasp of the problem.</p>

<p>Naively, we could iterate through one of the lists and add each node to a set
or dictionary, then we could iterate over the other list and check each node
we're looking at to see if it's in the set. Then we'd return the first node
that is present in the set. This takes O(M + N) time but also O(max(M, N))
space (since we don't know initially which list is longer). How can we reduce
the amount of space we need?</p>

<p>We can get around the space constraint with the following trick: first, get
the length of both lists. Find the difference between the two, and then keep
two pointers at the head of each list. Move the pointer of the larger list
up by the difference, and then move the pointers forward in conjunction and
check if they match.</p>

<pre><code>def length(head):
    if not head:
        return 0
    return 1 + length(head.next)

def intersection(a, b):
    m, n = length(a), length(b)
    cur_a, cur_b = a, b

    if m > n:
        for _ in range(m - n):
            cur_a = cur_a.next
    else:
        for _ in range(n - m):
            cur_b = cur_b.next

    while cur_a != cur_b:
        cur_a = cur_a.next
        cur_b = cur_b.next
    return cur_a
</code></pre>
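<p>To see the two-pointer idea in action, here is a compact iterative variant, with a minimal <code>Node</code> class assumed for illustration:</p>

```python
class Node:
    def __init__(self, val, next=None):
        self.val, self.next = val, next

def length(head):
    n = 0
    while head:
        n, head = n + 1, head.next
    return n

def intersection(a, b):
    m, n = length(a), length(b)
    # Advance the longer list's pointer by the length difference,
    # then walk both pointers in lockstep until they meet.
    while m > n:
        a, m = a.next, m - 1
    while n > m:
        b, n = b.next, n - 1
    while a is not b:
        a, b = a.next, b.next
    return a

# A = 3 -> 7 -> 8 -> 10 and B = 99 -> 1 -> 8 -> 10 share the node with value 8.
shared = Node(8, Node(10))
a = Node(3, Node(7, shared))
b = Node(99, Node(1, shared))
print(intersection(a, b).val)  # 8
```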

21
Q

<p>This problem was asked by Snapchat.</p>

<p>Given an array of time intervals (start, end) for classroom lectures (possibly overlapping), find the minimum number of rooms required.</p>

<p>For example, given [(30, 75), (0, 50), (60, 150)], you should return 2.</p>

A

<p>First, notice that the minimum number of rooms required is the maximum number of simultaneously overlapping intervals.</p>

<p>Now let's consider the naive approach. We could go through each interval and
check every other interval and see if it overlaps, keeping track of the largest
number of overlapping intervals.</p>

<pre><code>def overlaps(a, b):
    start_a, end_a = a
    start_b, end_b = b
    # The intervals don't overlap if they look like this:
    # |start_a .... end_a| <---> |start_b ... end_b|
    # or like this:
    # |start_b .... end_b| <---> |start_a ... end_a|
    # so return the negation of either of those cases.
    return not (end_a < start_b or start_a > end_b)

def max_overlapping(intervals):
    current_max = 0
    for interval in intervals:
        num_overlapping = sum(overlaps(interval, other_interval)
                              for other_interval in intervals
                              if interval is not other_interval)
        current_max = max(current_max, num_overlapping)
    return current_max
</code></pre>

<p>This would take O(n^2) time, since we're checking each interval pairwise. Can we do any better?</p>

<p>One solution is to extract the start times and end times of all the intervals and sort them.
Then we can start two pointers on each list, and consider the following:</p>

<ul><li>If the current start is before the current end, then we have a new overlap. Increment the start pointer.</li><li>If the current start is after the current end, then our overlap closes. Increment the end pointer.</li></ul>

<p>All that's left to do is keep a couple variables to keep track of the maximum number of overlaps we've seen so far
and the current number of overlaps.</p>

<pre><code>def max_overlapping(intervals):
    starts = sorted(start for start, end in intervals)
    ends = sorted(end for start, end in intervals)

    current_max = 0
    current_overlap = 0
    i, j = 0, 0
    while i < len(intervals) and j < len(intervals):
        if starts[i] < ends[j]:
            current_overlap += 1
            current_max = max(current_max, current_overlap)
            i += 1
        else:
            current_overlap -= 1
            j += 1
    return current_max
</code></pre>

<p>This runs in O(n log n) time, since we have to sort the intervals.</p>
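<p>A quick check of the sweep version against the example (the function is restated so the snippet runs on its own):</p>

```python
def max_overlapping(intervals):
    # Sort the starts and ends separately, then sweep with two pointers.
    starts = sorted(start for start, end in intervals)
    ends = sorted(end for start, end in intervals)

    current_max = current_overlap = 0
    i = j = 0
    while i < len(intervals) and j < len(intervals):
        if starts[i] < ends[j]:
            current_overlap += 1
            current_max = max(current_max, current_overlap)
            i += 1
        else:
            current_overlap -= 1
            j += 1
    return current_max

print(max_overlapping([(30, 75), (0, 50), (60, 150)]))  # 2
```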

22
Q

<p>This problem was asked by Microsoft.</p>

<p>Given a dictionary of words and a string made up of those words (no spaces), return the original
sentence in a list. If there is more than one possible reconstruction, return any of them. If
there is no possible reconstruction, then return null.</p>

<p>For example, given the set of words 'quick', 'brown', 'the', 'fox', and the string "thequickbrownfox",
you should return ['the', 'quick', 'brown', 'fox'].</p>

<p>Given the set of words 'bed', 'bath', 'bedbath', 'and', 'beyond', and the string "bedbathandbeyond",
return either ['bed', 'bath', 'and', 'beyond'] or ['bedbath', 'and', 'beyond'].</p>

A

<p>We might be initially tempted to take a greedy approach to this problem, by,
for example, iterating over the string and checking whether our current prefix matches
a word so far. However, we quickly find that this can't work: consider
the dictionary {'the', 'theremin'} and the string 'theremin': we would find
'the' first, and then we wouldn't be able to match 'remin'.</p>

<p>So this greedy approach doesn't work, since we would need to go back if we get stuck.
This gives us a clue that we might want to use <a>backtracking</a>
to help us solve this problem. We also have the following idea for a recurrence:
If we split up the string into a prefix and suffix, then we can return the prefix
extended with a list of the rest of the sentence, but only if they're both valid.
So what we can do is the following:</p>

<ul><li>Iterate over the string and split it into a prefix and suffix</li><li>If the prefix is valid (appears in the dictionary), then recursively call on the suffix</li><li>If that's valid, then return. Otherwise, continue searching.</li><li>If we've gone over the entire sentence and haven't found anything, then return empty.</li></ul>

<p>We'll need a helper function to tell us whether the string can actually be broken up
into a sentence as well, so let's define <code>find_sentence_helper</code> that also returns
whether or not the sentence is valid.</p>

<pre><code>def find_sentence(dictionary, s):
    sentence, valid = find_sentence_helper(dictionary, s)
    if valid:
        return sentence

def find_sentence_helper(dictionary, s):
    if len(s) == 0:
        return [], True

    for i in range(len(s) + 1):
        prefix, suffix = s[:i], s[i:]
        if prefix in dictionary:
            rest, valid = find_sentence_helper(dictionary, suffix)
            if valid:
                return [prefix] + rest, True
    return [], False
</code></pre>

<p>This will run in O(2^N) time, however. This is because in the worst case,
say, for example, s = "aaaaab" and dictionary = ["a", "aa", "aaa", "aaaa", "aaaaa"],
we will end up exploring every single path, or every combination of letters, and
the total number of combinations of characters is 2^N.</p>

<p>We can improve the running time by using dynamic programming to store repeated
subcomputations. This reduces the running time to just O(N^2). We'll keep a
dictionary that maps from indices to the last word that can be made up
to that index. We'll call these starts. Then, we just need to do two nested
for loops, one that iterates over the whole string and tries to find a
start at that index, and a loop that checks each start to see if a new word can
be made from that start to the current index.</p>

<p>Now we can simply take the start at the last index and build our sentence backwards:</p>

<pre><code>def find_sentence(s, dictionary):
    starts = {0: ''}
    for i in range(len(s) + 1):
        new_starts = starts.copy()
        for start_index, _ in starts.items():
            word = s[start_index:i]
            if word in dictionary:
                new_starts[i] = word
        starts = new_starts.copy()

    result = []
    current_length = len(s)
    if current_length not in starts:
        return None
    while current_length > 0:
        word = starts[current_length]
        current_length -= len(word)
        result.append(word)

    return list(reversed(result))
</code></pre>

<p>Now this runs in O(N^2) time and O(N) space.</p>

23
Q

<p>This problem was asked by Google.</p>

<p>You are given an M by N matrix consisting of booleans that represents a board.
Each True boolean represents a wall. Each False boolean represents a tile you
can walk on.</p>

<p>Given this matrix, a start coordinate, and an end coordinate, return the minimum
number of steps required to reach the end coordinate from the start. If there is
no possible path, then return null. You can move up, left, down, and right. You cannot
move through walls. You cannot wrap around the edges of the board.</p>

<p>For example, given the following board:</p>

<pre><code>[[f, f, f, f],
 [t, t, f, t],
 [f, f, f, f],
 [f, f, f, f]]
</code></pre>

<p>and start = <code>(3, 0)</code> (bottom left) and end = <code>(0, 0)</code> (top left), the minimum number of steps
required to reach the end is 7, since we would need to go through <code>(1, 2)</code> because there is
a wall everywhere else on the second row.</p>

A

<p>The idea here is to use either BFS or DFS to explore the board, starting
from the start coordinate, and keep track of what we've seen so far as
well as the steps from the start until we find the end coordinate.</p>

<p>In our case, we'll use BFS. We'll create a queue and initialize it with our
start coordinate, along with a count of 0. We'll also initialize a <code>seen</code> set
to ensure we only add coordinates we haven't seen before.</p>

<p>Then, as long as there's something still in the queue, we'll dequeue from the
queue and first check if it's our target coordinate -- if it is, then we can
just immediately return the count. Otherwise, we'll get the valid neighbours
of the coordinate we're working with (valid means not off the board and not a
wall), and enqueue them to the end of the queue.</p>

<p>To make sure the code doesn't get too messy, we'll define some helper
functions: <code>walkable</code>, which returns whether or not a tile is valid,
and <code>get_walkable_neighbours</code> which returns the valid neighbours of a coordinate.</p>

<pre><code>from collections import deque

# Given a row and column, returns whether that tile is walkable.
def walkable(board, row, col):
    if row < 0 or row >= len(board):
        return False
    if col < 0 or col >= len(board[0]):
        return False
    return not board[row][col]

# Gets walkable neighbouring tiles.
def get_walkable_neighbours(board, row, col):
    return [(r, c) for r, c in [
        (row, col - 1),
        (row - 1, col),
        (row + 1, col),
        (row, col + 1)]
        if walkable(board, r, c)
    ]

def shortest_path(board, start, end):
    seen = set()
    queue = deque([(start, 0)])
    while queue:
        coords, count = queue.popleft()
        if coords == end:
            return count
        seen.add(coords)
        neighbours = get_walkable_neighbours(board, coords[0], coords[1])
        queue.extend((neighbour, count + 1) for neighbour in neighbours
                     if neighbour not in seen)

board = [[False, False, False, False],
         [True, True, False, True],
         [False, False, False, False],
         [False, False, False, False]]

print(shortest_path(board, (3, 0), (0, 0)))  # 7
</code></pre>

<p>This code should run in O(M * N) time and space, since in the worst case we
need to examine the entire board to find our target coordinate.</p>

24
Q

<p>This problem was asked by Google.</p>

<p>Implement locking in a binary tree. A binary tree node can be locked or unlocked
only if all of its descendants or ancestors are not locked.</p>

<p>Design a binary tree node class with the following methods:</p>

<ul><li><code>is_locked</code>, which returns whether the node is locked</li><li><code>lock</code>, which attempts to lock the node. If it cannot be locked, then it should return false.
Otherwise, it should lock it and return true.</li><li><code>unlock</code>, which unlocks the node. If it cannot be unlocked, then it should return false.
Otherwise, it should unlock it and return true.</li></ul>

<p>You may augment the node to add parent pointers or any other property you would like.
You may assume the class is used in a single-threaded program, so there is no need for actual locks or mutexes.
Each method should run in O(h), where h is the height of the tree.</p>

A

<p>A relatively easy way to implement this would be to augment each node with an <code>is_locked</code>
attribute as well as a parent pointer. We can then implement the methods
in a straightforward manner:</p>

<ul><li><code>is_locked</code> simply returns the node's attribute</li><li><code>lock</code> searches the node's descendants and ancestors for a true <code>is_locked</code> attribute.
If it is set to true on any of them, then return false. Otherwise, set the current
node's <code>is_locked</code> to true and return true.</li><li><code>unlock</code> simply changes the node's attribute to false. If we want to be safe,
then we should search the node's descendants and ancestors as in <code>lock</code> to make sure
we can actually unlock the node, but that case shouldn't ever arise.</li></ul>

<p>While <code>is_locked</code> is O(1) time, <code>lock</code> and <code>unlock</code> will take O(m + h) time where
m is the number of nodes in the node's subtree (since we have to traverse through all its descendants)
and h is the height of the node (since we have to traverse through the node's ancestors).</p>

<p>We can improve the performance of <code>lock</code> and <code>unlock</code> by adding another field to the node
that keeps track of the count of locked descendants. That way, we can immediately
see whether any of its descendants are locked. This will reduce our <code>lock</code> and <code>unlock</code>
functions to only O(h). We can maintain this field by doing the following:</p>

<ul><li>When locking, if the locking succeeds, traverse the node's ancestors and increment each one's count</li><li>When unlocking, traverse the node's ancestors and decrement each one's count</li></ul>

<p>The code will look something like the following:</p>

<pre><code>class LockingBinaryTreeNode(object):
    def __init__(self, val, left=None, right=None, parent=None):
        self.val = val
        self.left = left
        self.right = right
        self.parent = parent
        # Stored under a private name so it doesn't shadow the is_locked() method.
        self._is_locked = False
        self.locked_descendants_count = 0

    def _can_lock_or_unlock(self):
        if self.locked_descendants_count > 0:
            return False

        cur = self.parent
        while cur:
            if cur._is_locked:
                return False
            cur = cur.parent
        return True

    def is_locked(self):
        return self._is_locked

    def lock(self):
        if self._is_locked:
            return False  # Node already locked.

        if not self._can_lock_or_unlock():
            return False

        # Not locked, so lock the node and increment the count in all ancestors.
        self._is_locked = True

        cur = self.parent
        while cur:
            cur.locked_descendants_count += 1
            cur = cur.parent
        return True

    def unlock(self):
        if not self._is_locked:
            return False  # Node already unlocked.

        if not self._can_lock_or_unlock():
            return False

        self._is_locked = False

        # Update the count in all ancestors.
        cur = self.parent
        while cur:
            cur.locked_descendants_count -= 1
            cur = cur.parent
        return True
</code></pre>

<p>Now, <code>is_locked</code> is still O(1), but <code>lock</code> and <code>unlock</code> are both O(h) instead of O(m + h).</p>
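<p>As a sanity check of this scheme, here's a minimal self-contained sketch of the same idea (the names <code>Node</code>, <code>locked</code>, and <code>locked_descendants</code> are simplified, hypothetical names introduced for this sketch, not the class above):</p>

```python
class Node:
    # Minimal sketch of the O(h) locking node: each node tracks its own lock
    # state plus a count of locked nodes in its subtree.
    def __init__(self, parent=None):
        self.parent = parent
        self.locked = False
        self.locked_descendants = 0  # locked nodes anywhere in this subtree

    def _ancestors(self):
        cur = self.parent
        while cur:
            yield cur
            cur = cur.parent

    def lock(self):
        # Fail fast if a descendant or ancestor is locked.
        if self.locked or self.locked_descendants > 0:
            return False
        if any(a.locked for a in self._ancestors()):
            return False
        self.locked = True
        for a in self._ancestors():
            a.locked_descendants += 1
        return True

    def unlock(self):
        if not self.locked:
            return False
        self.locked = False
        for a in self._ancestors():
            a.locked_descendants -= 1
        return True
```

Both methods walk only the ancestor chain, so each call is O(h).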

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

<p>This problem was asked by Facebook.</p>

<p>Implement regular expression matching with the following special characters:</p>

<ul><li><code>.</code> (period) which matches any single character</li><li><code>*</code> (asterisk) which matches zero or more of the preceding element</li></ul>

<p>That is, implement a function that takes in a string and a valid regular expression
and returns whether or not the string matches the regular expression.</p>

<p>For example, given the regular expression "ra." and the string "ray", your function
should return true. The same regular expression on the string "raymond" should return false.</p>

<p>Given the regular expression ".*at" and the string "chat", your function should return
true. The same regular expression on the string "chats" should return false.</p>

A

<p>This problem should strike you as recursive. The string should match the regex
if we can match the head of the string with the head of the regex and the rest of the
string with the rest of the regex. The special characters <code>.</code> and <code>*</code> make implementing
this a bit trickier, however, since the <code>*</code> means we can match 0 or any number of characters
in the beginning.</p>

<p>The basic idea then is to do the following. Let's call the string we want to match <code>s</code> and
the regex <code>r</code>.</p>

<ul><li>Base case: if <code>r</code> is empty, then return whether <code>s</code> is empty or not.</li><li>Otherwise, if the first character in <code>r</code> is not followed by a <code>*</code>, then match the first character
of both <code>r</code> and <code>s</code>, and if they match, return <code>match(r[1:], s[1:])</code>. If they don't, then return false.</li><li>If the first character in <code>r</code> _is_ followed by a <code>*</code>, then try every suffix substring of <code>s</code> on <code>r[2:]</code>
and return true if any suffix substring works.</li></ul>

<p>The code should look something like this:</p>

<pre><code>def matches_first_char(s, r):
    # Check the length first so we never index into an empty string.
    return len(s) > 0 and (s[0] == r[0] or r[0] == '.')

def matches(s, r):
    if r == '':
        return s == ''

    if len(r) == 1 or r[1] != '*':
        # The first character in the regex is not followed by a *.
        if matches_first_char(s, r):
            return matches(s[1:], r[1:])
        else:
            return False
    else:
        # The first character is followed by a *.
        # First, try zero length.
        if matches(s, r[2:]):
            return True
        # If that doesn't match straight away, then try globbing more prefixes
        # until the first character of the string doesn't match anymore.
        i = 0
        while matches_first_char(s[i:], r):
            if matches(s[i + 1:], r[2:]):
                return True
            i += 1
        return False
</code></pre>

<p>This takes O(len(s) * len(r)) time and space, since we potentially need to iterate over each suffix substring again
for each character.</p>
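<p>The recurrence can also be stated more compactly. Here is a condensed variant (the name <code>regex_match</code> is introduced for this sketch) that's easy to check against the problem's examples:</p>

```python
def regex_match(s, r):
    # Condensed recursive matcher for '.' and '*'.
    if not r:
        return not s
    # Does the first character of s match the first element of r?
    first = bool(s) and r[0] in (s[0], '.')
    if len(r) >= 2 and r[1] == '*':
        # Either skip the "x*" element entirely, or consume one matching char.
        return regex_match(s, r[2:]) or (first and regex_match(s[1:], r))
    return first and regex_match(s[1:], r[1:])
```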

<p>Fun fact: Stephen Kleene introduced the <code>*</code> operator in regular expressions and as such, it
is sometimes referred to as the Kleene star.</p>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

<p>This problem was asked by Google.</p>

<p>Given a singly linked list and an integer k, remove the kth last element from the list. k is guaranteed to be smaller than the length of the list.</p>

<p>The list is very long, so making more than one pass is prohibitively expensive.</p>

<p>Do this in constant space and in one pass.</p>

A

<p>If we didn't have the constraint of needing only to make one pass, this problem
would be trivial to implement. We could simply iterate over the whole list to
find out the total length N of the list, and then restart from the beginning
and iterate N - k steps and remove the node there. That would take constant
space as well.</p>

<p>However, given that we have the constraint of needing to make only one pass,
we have to find some way of getting the N - kth node in the list
in one shot.</p>

<p>What we can do, then, is this:</p>

<ul><li>Set up two pointers at the head of the list (let's call them <code>fast</code> and <code>slow</code>)</li><li>Move <code>fast</code> up by <code>k</code></li><li>Move both <code>fast</code> and <code>slow</code> together until <code>fast</code> reaches the end of the list</li><li>Now <code>slow</code> is at the N - kth node, remove it</li></ul>

<p>That only makes one pass and uses constant space. The code should look something like
this:</p>

<pre><code>class Node:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next

    def __str__(self):
        current_node = self
        result = []
        while current_node:
            result.append(current_node.val)
            current_node = current_node.next
        return str(result)

def remove_kth_from_linked_list(head, k):
    slow, fast = head, head
    for i in range(k):
        fast = fast.next

    prev = None
    while fast:
        prev = slow
        slow = slow.next
        fast = fast.next

    prev.next = slow.next

head = Node(1, Node(2, Node(3, Node(4, Node(5)))))
print(head)
remove_kth_from_linked_list(head, 3)
print(head)
</code></pre>
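<p>The two-pointer step can be verified end to end. A compact, testable restatement (the helper names <code>remove_kth_from_end</code> and <code>to_list</code> are introduced here):</p>

```python
class Node:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next

def remove_kth_from_end(head, k):
    # Advance fast k nodes ahead, then move both pointers until fast falls off.
    slow = fast = head
    for _ in range(k):
        fast = fast.next
    prev = None
    while fast:
        prev, slow, fast = slow, slow.next, fast.next
    prev.next = slow.next  # k < length guarantees prev is not None

def to_list(head):
    # Collect the list values for easy checking.
    out = []
    while head:
        out.append(head.val)
        head = head.next
    return out
```

Removing the 3rd-last element of 1→2→3→4→5 drops the 3.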

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

<p>This problem was asked by Facebook.</p>

<p>Given a string of round, curly, and square open and closing brackets, return
whether the brackets are balanced (well-formed).</p>

<p>For example, given the string "([])[]({})", you should return true.</p>

<p>Given the string "([)]" or "((()", you should return false.</p>

A

<p>In this case, it's easy to start with a simplified case of the problem, which is
dealing with only round brackets. Notice that in this case, we just need to keep
track of the current number of open brackets -- each closing bracket should be
matched with the rightmost open bracket. So we can keep a counter and increment it
for every open bracket we see and decrement it on every closing bracket.
If we get to the end of the string and have a non-zero number, then it means it's unbalanced.
A negative number would indicate more closing brackets than open ones, and a positive number
would indicate the opposite.</p>

<p>In the case of round, curly, and square brackets, we need to also keep track
of what <em>kind</em> of brackets they are as well, because we can't match a round
open bracket with a curly closing bracket. In this case, we can use a stack to keep
track of the actual bracket character and push onto it whenever we encounter
an open bracket, and pop if we encounter a matching closing bracket to the top
of the stack. If the stack is empty or it's not the correct matching bracket,
then we'll return false. If, by the end of the iteration, we have something left
over in the stack, then it means it's unbalanced -- so we'll return whether
it's empty or not.</p>

<pre><code>def balance(s):
    stack = []
    for char in s:
        if char in ["(", "[", "{"]:
            stack.append(char)
        else:
            # Check character is not unmatched
            if not stack:
                return False

            # Char is a closing bracket; check whether the top of the stack matches.
            if (char == ")" and stack[-1] != "(") or \
               (char == "]" and stack[-1] != "[") or \
               (char == "}" and stack[-1] != "{"):
                return False
            stack.pop()

    return len(stack) == 0
</code></pre>
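<p>The matching checks can be condensed with a dict mapping each closing bracket to its opener (a variant sketch; <code>balanced</code> is a name introduced here to avoid clashing with <code>balance</code> above):</p>

```python
def balanced(s):
    # Map each closing bracket to its expected opening bracket.
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for char in s:
        if char in '([{':
            stack.append(char)
        elif not stack or stack.pop() != pairs[char]:
            # Either an unmatched closer, or the wrong kind of closer.
            return False
    return not stack
```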

<p>Fun fact: "(())" is not a palindrome, nor is "()()". "())(" is a palindrome, though.</p>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

<p>This problem was asked by Palantir.</p>

<p>Write an algorithm to justify text. Given a sequence of words
and an integer line length k, return a list of strings which represents each
line, fully justified.</p>

<p>More specifically, you should have as many words as possible in each line.
There should be at least one space between each word. Pad extra spaces when necessary
so that each line has exactly length k. Spaces should be distributed as
equally as possible, with the extra spaces, if any, distributed starting
from the left.</p>

<p>If you can only fit one word on a line, then you should pad the right-hand side
with spaces.</p>

<p>Each word is guaranteed not to be longer than k.</p>

<p>For example, given the list of words ["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"] and k = 16, you should return the following:</p>

<pre><code>["the quick brown", # 1 extra space on the left
"fox jumps over", # 2 extra spaces distributed evenly
"the lazy dog"] # 4 extra spaces distributed evenly
</code></pre>

A

<p>It seems like the justification algorithm is independent of the groupings, so
immediately we should figure out two things:</p>

<ul><li>How to group lines together so that it is as close to k as possible (without going over)</li><li>Given a grouping of lines, justifying the text by appropriately distributing spaces</li></ul>

<p>To solve the first part, let's write a function <code>group_lines</code> that takes in all the
words in our input sequence as well as our target line length k, and returns a list of
lists of words that represents the lines that we will eventually justify. Our main
strategy will be to iterate over all the words, keep a list of words for the current
line, and because we want to fit as many words as possible per line, estimate the
current line length, assuming only one space between each word. Once we go over
<code>k</code>, then save the word and start a new line with it. So our function will look
something like this:</p>

<pre><code>def min_line(words):
    return ' '.join(words)

def group_lines(words, k):
    '''
    Returns groupings of |words| whose total length, including 1 space in between,
    is at most |k|.
    '''
    groups = []
    current_line = []
    for word in words:
        # Check if adding the next word would push it over
        # the limit. If it does, then add |current_line| to
        # groups. Also reset |current_line| properly.
        if len(min_line(current_line + [word])) > k:
            groups.append(current_line)
            current_line = []
        current_line.append(word)

    # Add the last line to groups.
    groups.append(current_line)
    return groups
</code></pre>

<p>Then, we'll want to actually justify each line. We know for sure each line we feed
from <code>group_lines</code> is the maximum number of words we can pack into a line and no
more. What we can do is first figure out how many spaces we have available to
distribute between each word. Then from that, we can calculate how much base
space we should have between each word by dividing it by the number of words minus one.
If there are any leftover spaces to distribute, then we can keep track of that
in a counter, and as we rope in each new word we'll add the appropriate
number of spaces. We can't add more than one leftover space per word.</p>

<pre><code>from math import floor

def justify(words, length):
    '''
    Precondition: |words| can fit in |length|.
    Justifies the words using the following algorithm:
    - Find the smallest spacing between each word (available_spaces / spaces)
    - Add a leftover space one-by-one until we run out
    '''
    if len(words) == 1:
        word = words[0]
        num_spaces = length - len(word)
        spaces = ' ' * num_spaces
        return word + spaces

    spaces_to_distribute = length - sum(len(word) for word in words)
    number_of_spaces = len(words) - 1
    smallest_space = floor(spaces_to_distribute / number_of_spaces)
    leftover_spaces = spaces_to_distribute - (number_of_spaces * smallest_space)
    justified_words = []
    for word in words:
        justified_words.append(word)
        current_space = ' ' * smallest_space
        if leftover_spaces > 0:
            current_space += ' '
            leftover_spaces -= 1
        justified_words.append(current_space)
    return ''.join(justified_words).rstrip()
</code></pre>

<p>The final solution should just combine our two functions:</p>

<pre><code>def justify_text(words, k):
    return [justify(group, k) for group in group_lines(words, k)]
</code></pre>
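<p>The whole pipeline can be sanity-checked against the problem's example. A condensed end-to-end variant (the name <code>fully_justify</code> and the <code>divmod</code>-based spacing are introduced for this sketch):</p>

```python
def fully_justify(words, k):
    # Greedily group words (at most k chars per line with single spaces),
    # then pad each line to exactly k characters.
    lines, line = [], []
    for word in words:
        if len(' '.join(line + [word])) > k:
            lines.append(line)
            line = []
        line.append(word)
    lines.append(line)

    result = []
    for line in lines:
        if len(line) == 1:
            result.append(line[0].ljust(k))  # single word: pad the right side
            continue
        gaps = len(line) - 1
        total = k - sum(len(w) for w in line)
        base, extra = divmod(total, gaps)  # extra spaces go to the leftmost gaps
        out = ''
        for i, w in enumerate(line):
            out += w
            if i < gaps:
                out += ' ' * (base + (1 if i < extra else 0))
        result.append(out)
    return result
```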

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

<p>This problem was asked by Amazon.</p>

<p>Run-length encoding is a fast and simple method of encoding strings. The basic idea is to represent
repeated successive characters as a single count and character. For example, the string "AAAABBBCCDAA" would
be encoded as "4A3B2C1D2A".</p>

<p>Implement run-length encoding and decoding. You can assume the string to be encoded has no digits
and consists solely of alphabetic characters. You can assume the string to be decoded is valid.</p>

A

<p>We can implement <code>encode</code> by iterating over our input string
and keeping a current count of whatever the current character is,
and once we encounter a different one, appending the count (as a string)
and the actual character to our result string.</p>

<pre><code>def encode(s):
    if not s:
        return ''

    result = ''
    current_char = s[0]
    current_count = 1
    # Start from the second character; the first is already counted.
    for char in s[1:]:
        if char == current_char:
            current_count += 1
        else:
            result += str(current_count) + current_char
            current_char = char
            current_count = 1
    result += str(current_count) + current_char
    return result
</code></pre>

<p>We can implement <code>decode</code> by iterating over the encoded string and checking
each character for a digit. If it is, then calculate the correct count, and
once we find its corresponding character, extend the result with the character
count number of times and then reset the count.</p>

<pre><code>def decode(s):
    count = 0
    result = ''
    for char in s:
        if char.isdigit():
            count = count * 10 + int(char)
        else:
            # char is alphabetic
            result += char * count
            count = 0
    return result
</code></pre>
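<p>The same pair can be written very compactly with the standard library (a variant sketch; the names <code>encode_rle</code> and <code>decode_rle</code> are introduced here):</p>

```python
from itertools import groupby
import re

def encode_rle(s):
    # Collapse each run of identical characters into "<count><char>".
    return ''.join(str(len(list(group))) + char for char, group in groupby(s))

def decode_rle(s):
    # Expand each "<count><char>" token back into a run; counts may be multi-digit.
    return ''.join(char * int(count) for count, char in re.findall(r'(\d+)(\D)', s))
```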

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

<p>This problem was asked by Facebook.</p>

<p>You are given an array of non-negative integers that represents a two-dimensional elevation map where each element
is a unit-width wall and the integer is the height. Suppose it will rain and all spots between two walls get filled up.</p>

<p>Compute how many units of water remain trapped on the map in O(N) time and O(1) space.</p>

<p>For example, given the input [2, 1, 2], we can hold 1 unit of water in the middle.</p>

<p>Given the input [3, 0, 1, 3, 0, 5], we can hold 3 units in the first index, 2 in the second,
and 3 in the fourth index (we cannot hold 5 since it would run off to the left), so we can
trap 8 units of water.</p>

A

<p>Notice that the amount of water that can be filled up at a certain index i is the
smaller of the largest height to the left and the largest height to the right minus
the actual value at that point, because it will be trapped by the smaller of the two sides.
So what we can do is to create two arrays that represent the running maximum
heights, one from the left and one from the right. Then to count the total
capacity, we can run through both arrays and add up the smaller of the two
values at each index, minus the height at that index.</p>

<pre><code>def capacity(arr):
    n = len(arr)
    left_maxes = [0 for _ in range(n)]
    right_maxes = [0 for _ in range(n)]

    current_left_max = 0
    for i in range(n):
        current_left_max = max(current_left_max, arr[i])
        left_maxes[i] = current_left_max

    current_right_max = 0
    for i in range(n - 1, -1, -1):
        current_right_max = max(current_right_max, arr[i])
        right_maxes[i] = current_right_max

    total = 0
    for i in range(n):
        total += min(left_maxes[i], right_maxes[i]) - arr[i]
    return total
</code></pre>

<p>This is O(N) time, but also O(N) space, and we want constant space. So instead,
we can do this. We can find the largest element in the array, and then when we're
looking to the left of it, we only need to keep the running maximum from the left
(since we know the largest element of the array is on its right). And then do a
similar thing, but starting from the right side. So the general gist is this:</p>

<ul><li>Find the maximum element in the array -- let's say it's at index i</li><li>Initialize a running maximum on the left to arr[0]</li><li>Iterate from index 1 to i. At each step, update the running maximum if necessary
and then increment a variable counter with the running maximum minus the value at that index.</li><li>Do the same thing but from len(arr) - 2 to i backwards, and keep the running maximum
on the right.</li></ul>

<pre><code>def capacity(arr):
    if not arr:
        return 0

    total = 0
    max_i = arr.index(max(arr))

    # Iterate by index (rather than slicing) to keep the space usage constant.
    left_max = arr[0]
    for i in range(1, max_i):
        # Update the running maximum first so the contribution is never negative.
        left_max = max(left_max, arr[i])
        total += left_max - arr[i]

    right_max = arr[-1]
    for i in range(len(arr) - 2, max_i, -1):
        right_max = max(right_max, arr[i])
        total += right_max - arr[i]

    return total
</code></pre>
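<p>A well-known alternative that also achieves O(N) time and O(1) space is the two-pointer formulation, handy for cross-checking (a variant sketch, not the code above; <code>capacity_two_pointer</code> is a name introduced here):</p>

```python
def capacity_two_pointer(arr):
    # Move inward from both ends; whichever side is currently smaller is
    # bounded by its own running maximum, so its water can be settled now.
    left, right = 0, len(arr) - 1
    left_max = right_max = total = 0
    while left < right:
        if arr[left] <= arr[right]:
            left_max = max(left_max, arr[left])
            total += left_max - arr[left]
            left += 1
        else:
            right_max = max(right_max, arr[right])
            total += right_max - arr[right]
            right -= 1
    return total
```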

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

<p>This problem was asked by Google.</p>

<p>The edit distance between two strings refers to the minimum number of character insertions, deletions,
and substitutions required to change one string to the other. For example, the edit distance between
“kitten” and “sitting” is three: substitute the “k” for “s”, substitute the “e” for “i”, and append
a “g”.</p>

<p>Given two strings, compute the edit distance between them.</p>

A

<p>First, notice that we can probably define this problem recursively. How can we notice this?
If we look at the example (kitten -> sitting) and its solution path (kitten -> sitten -> sittin -> sitting),
we can see that it's the minimum distance between sitten and sitting plus one.</p>

<p>The recurrence, then, looks like this:</p>

<ul><li>If either <code>s1</code> or <code>s2</code> are empty, then return the size of the larger of the two strings (since
we can trivially turn an empty string into a string by inserting all its characters)</li><li>Otherwise, return the minimum between:<ul><li>The edit distance between each string and the last n - 1 characters of the other plus one</li><li>If the first character in each string is the same, then the edit distance between s1[1:] and s2[1:], otherwise the same edit distance + 1</li></ul></li></ul>

<p>So, the naive recursive solution would look like this:</p>

<pre><code>def distance(s1, s2):
    if len(s1) == 0 or len(s2) == 0:
        return max(len(s1), len(s2))

    return min(distance(s1[1:], s2) + 1,
               distance(s1, s2[1:]) + 1,
               distance(s1[1:], s2[1:]) if s1[0] == s2[0]
               else distance(s1[1:], s2[1:]) + 1)
</code></pre>

<p>However, this runs very slowly due to repeated subcomputations. We can speed it up by using
dynamic programming and storing the subcomputations in a 2D matrix. The index at i, j will
contain the edit distance between <code>s1[:i]</code> and <code>s2[:j]</code>. Then, once we fill it up, we can
return the value of the matrix at A[-1][-1].</p>

<pre><code>def distance(s1, s2):
    x = len(s1) + 1  # the length of the x-coordinate
    y = len(s2) + 1  # the length of the y-coordinate

    A = [[-1 for i in range(x)] for j in range(y)]
    for i in range(x):
        A[0][i] = i

    for j in range(y):
        A[j][0] = j

    for i in range(1, y):
        for j in range(1, x):
            if s1[j - 1] == s2[i - 1]:
                A[i][j] = A[i - 1][j - 1]
            else:
                A[i][j] = min(
                    A[i - 1][j] + 1,
                    A[i][j - 1] + 1,
                    A[i - 1][j - 1] + 1
                )
    return A[y - 1][x - 1]  # return the edit distance between the two strings
</code></pre>

<p>This now takes O(N * M) time and space, where N and M are the lengths of the strings.</p>
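<p>Since each row of the table only depends on the previous row, the space can be cut to O(M) by keeping just two rows (a variant sketch; <code>edit_distance</code> is a name introduced here):</p>

```python
def edit_distance(s1, s2):
    # prev[j] holds the distance between s1[:i-1] and s2[:j]; cur builds row i.
    prev = list(range(len(s2) + 1))
    for i in range(1, len(s1) + 1):
        cur = [i] + [0] * len(s2)
        for j in range(1, len(s2) + 1):
            if s1[i - 1] == s2[j - 1]:
                cur[j] = prev[j - 1]
            else:
                # Deletion, insertion, or substitution, each costing 1.
                cur[j] = 1 + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[-1]
```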

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

<p>This problem was asked by Jane Street.</p>

<p>Suppose you are given a table of currency exchange rates, represented as a 2D array.
Determine whether there is a possible arbitrage: that is, whether there is some sequence
of trades you can make, starting with some amount A of any currency, so that you can end up
with some amount greater than A of that currency.</p>

<p>There are no transaction costs and you can trade fractional quantities.</p>

A

<p>In this question, we can model the currencies and the exchange rates
as a graph, where the nodes are the currencies and the edges are the
exchange rates between each pair of currencies. Since our table is complete,
the graph is also complete. Then, to solve this problem, we need to
find a cycle whose edge weights product is greater than 1.</p>

<p>This seems hard to do faster than brute force, so let's try to reduce
it down to a problem we already know we can solve faster than brute force.
Hint: <code>log(a * b) = log(a) + log(b)</code>. So if we take the negative log of
the edge weights, the problem of finding a cumulative product that's
greater than 1 turns into the problem of finding a negative sum cycle.</p>

<p>The Bellman-Ford algorithm can detect negative cycles. So if we run
Bellman-Ford on our graph and discover one, then that means its
corresponding edge weights multiply out to more than 1, and thus
we can perform an arbitrage.</p>

<p>As a refresher, the Bellman-Ford algorithm is commonly used to find
the shortest path between a source vertex and each of the other vertices.
If the graph contains a negative cycle, however, it can detect it
and throw an exception (or, in our case, return true). The main idea of
Bellman-Ford is this:</p>

<p>Since the longest simple path in any graph has at most |V| - 1 edges, if we
take all the direct edges from our source node, then we have all the
one-edged shortest paths; once we take edges from there, we have
all the two-edged shortest paths; all the way until |V| - 1 sized paths.</p>

<p>If, after |V| - 1 iterations of this, we can still find a smaller
path, then there must be a negative cycle in the graph.</p>

<pre><code>from math import log

def arbitrage(table):
    transformed_graph = [[-log(edge) for edge in row] for row in table]

    # Pick any source vertex -- we can run Bellman-Ford from any vertex and
    # get the right result
    source = 0
    n = len(transformed_graph)
    min_dist = [float('inf')] * n

    min_dist[source] = 0

    # Relax edges |V - 1| times
    for i in range(n - 1):
        for v in range(n):
            for w in range(n):
                if min_dist[w] > min_dist[v] + transformed_graph[v][w]:
                    min_dist[w] = min_dist[v] + transformed_graph[v][w]

    # If we can still relax edges, then we have a negative cycle
    for v in range(n):
        for w in range(n):
            if min_dist[w] > min_dist[v] + transformed_graph[v][w]:
                return True

    return False
</code></pre>

<p>Because of the triply-nested for loop, this runs in O(N^3) time.</p>
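<p>We can check the idea on a tiny hypothetical two-currency table: with rates 2 and 0.6, a round trip multiplies to 1.2 &gt; 1, so an arbitrage exists. A self-contained sketch (restating the function so the snippet runs on its own):</p>

```python
from math import log

def arbitrage(table):
    # Negative log transform: a cycle with product > 1 becomes a negative-sum cycle.
    graph = [[-log(edge) for edge in row] for row in table]
    n = len(graph)
    min_dist = [float('inf')] * n
    min_dist[0] = 0

    # Standard Bellman-Ford relaxation, |V| - 1 rounds from vertex 0.
    for _ in range(n - 1):
        for v in range(n):
            for w in range(n):
                if min_dist[w] > min_dist[v] + graph[v][w]:
                    min_dist[w] = min_dist[v] + graph[v][w]

    # Any further relaxation implies a negative cycle, i.e. an arbitrage.
    return any(min_dist[w] > min_dist[v] + graph[v][w]
               for v in range(n) for w in range(n))
```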

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

<p>This problem was asked by Microsoft.</p>

<p>Compute the running median of a sequence of numbers. That is, given a stream of numbers,
print out the median of the list so far on each new element.</p>

<p>Recall that the median of an even-numbered list is the average of the two middle numbers.</p>

<p>For example, given the sequence [2, 1, 5, 7, 2, 0, 5], your algorithm should print out:</p>

<pre><code>2
1.5
2
3.5
2
2
2
</code></pre>

A

<p>For this problem, the trick is to use two heaps: a min-heap and a max-heap.
We keep all elements smaller than the median in the max-heap and all elements
larger than the median in the min-heap. We'll keep these heaps balanced so that
the median is always either the root of the min-heap or the max-heap (or both).</p>

<p>When we encounter a new element from the stream, we'll first add it to one of our heaps:
the max-heap if the element is smaller than the median, or the min-heap if it's bigger.
We can make the max-heap the default heap if they're equal or there are no elements.</p>

<p>Then we re-balance if necessary by moving the root of the larger heap to the smaller one.
This is only necessary if one heap is larger than the other by more than 1 element.</p>

<p>Finally, we can print out our median: it will just be the root of the larger heap,
or the average of the two roots if they're of equal size.</p>

<p>Since Python's built-in <code>heapq</code> module only provides a bare-bones min-heap, we'll pretend we have some heap
objects that have the standard interface:</p>

<pre><code>def get_median(min_heap, max_heap):
    if len(min_heap) > len(max_heap):
        return min_heap.find_min()
    elif len(min_heap) < len(max_heap):
        return max_heap.find_max()
    else:
        min_root = min_heap.find_min()
        max_root = max_heap.find_max()
        return (min_root + max_root) / 2

def add(num, min_heap, max_heap):
    # If empty, then just add it to the max heap.
    if len(min_heap) + len(max_heap) <= 1:
        max_heap.insert(num)
        return

    median = get_median(min_heap, max_heap)
    if num > median:
        # Add it to the min heap.
        min_heap.insert(num)
    else:
        max_heap.insert(num)

def rebalance(min_heap, max_heap):
    if len(min_heap) > len(max_heap) + 1:
        root = min_heap.extract_min()
        max_heap.insert(root)
    elif len(max_heap) > len(min_heap) + 1:
        root = max_heap.extract_max()
        min_heap.insert(root)

def print_median(min_heap, max_heap):
    print(get_median(min_heap, max_heap))

def running_median(stream):
    min_heap = minheap()
    max_heap = maxheap()
    for num in stream:
        add(num, min_heap, max_heap)
        rebalance(min_heap, max_heap)
        print_median(min_heap, max_heap)
</code></pre>

<p>This runs in O(N) space. In terms of time, each new element takes O(log N) time to manipulate the heaps,
so this will run in O(N log N) time.</p>
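<p>In practice, the two-heap scheme can be realized with <code>heapq</code> by negating values for the max-heap side. A sketch that returns the medians instead of printing them (<code>running_medians</code> is a name introduced here):</p>

```python
import heapq

def running_medians(stream):
    # heapq only provides a min-heap, so the max-heap stores negated values.
    min_heap, max_heap = [], []  # upper half, lower half (negated)
    medians = []
    for num in stream:
        if not max_heap or num <= -max_heap[0]:
            heapq.heappush(max_heap, -num)
        else:
            heapq.heappush(min_heap, num)
        # Rebalance so the heap sizes never differ by more than one.
        if len(max_heap) > len(min_heap) + 1:
            heapq.heappush(min_heap, -heapq.heappop(max_heap))
        elif len(min_heap) > len(max_heap) + 1:
            heapq.heappush(max_heap, -heapq.heappop(min_heap))
        # The median is the root of the larger heap, or the average of both roots.
        if len(max_heap) > len(min_heap):
            medians.append(-max_heap[0])
        elif len(min_heap) > len(max_heap):
            medians.append(min_heap[0])
        else:
            medians.append((-max_heap[0] + min_heap[0]) / 2)
    return medians
```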

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

<p>This problem was asked by Quora.</p>

<p>Given a string, find the palindrome that can be made by inserting the fewest characters possible anywhere in the word.
If there is more than one palindrome of minimum length that can be made, return the lexicographically earliest one (the first one alphabetically).</p>

<p>For example, given the string "race", you should return "ecarace", since we can add three letters to it (which is the smallest amount to make a palindrome).
There are seven other palindromes that can be made from "race" by adding three letters, but "ecarace" comes first alphabetically.</p>

<p>As another example, given the string "google", you should return "elgoogle".</p>

A

<p>Notice that whenever we add a character, it should ideally match the one on the other side of the string.
We can use the following recurrence to solve this problem:</p>

<ul><li>If <code>s</code> is already a palindrome, then just return <code>s</code> -- it's already the shortest palindrome we can make</li><li>If the first character of <code>s</code> (let's call it <code>a</code>) is the same as the last, then return <code>a + make_palindrome(s[1:-1]) + a</code></li><li>If the first character of <code>s</code> is different from the last (let's call this <code>b</code>), then return the minimum between:<ul><li><code>a + make_palindrome(s[1:]) + a</code></li><li><code>b + make_palindrome(s[:-1]) + b</code>
or the lexicographically earliest one if their lengths are equal.</li></ul></li></ul>

<p>So a naive recursive solution might look like this:</p>

<pre><code>def is_palindrome(s):
    return s == s[::-1]

def make_palindrome(s):
    if is_palindrome(s):
        return s
    if s[0] == s[-1]:
        return s[0] + make_palindrome(s[1:-1]) + s[-1]
    else:
        one = s[0] + make_palindrome(s[1:]) + s[0]
        two = s[-1] + make_palindrome(s[:-1]) + s[-1]
        if len(one) < len(two):
            return one
        elif len(one) > len(two):
            return two
        else:
            return min(one, two)
</code></pre>

<p>Recall that the min of two strings in python will return the lexicographically earliest one!</p>

<p>However, this algorithm runs in O(2^N) time, since we could potentially make two recursive calls each time.
We can speed up using dynamic programming, as usual. We can either <a>memoize</a>
our results so that we don't duplicate any work, or use a table and do bottom-up programming.</p>

<p>Let's start with memoization. We can keep a cache and store all our results when we compute them in the cache.
If we come across a string we've seen before, then we just need to look it up in the cache.</p>

<pre><code>cache = {}

def is_palindrome(s):
    return s == s[::-1]

def make_palindrome(s):
    if s in cache:
        return cache[s]

    if is_palindrome(s):
        cache[s] = s
        return s
    if s[0] == s[-1]:
        result = s[0] + make_palindrome(s[1:-1]) + s[-1]
    else:
        one = s[0] + make_palindrome(s[1:]) + s[0]
        two = s[-1] + make_palindrome(s[:-1]) + s[-1]
        # Prefer the shorter palindrome; break ties lexicographically.
        result = min(one, two, key=lambda p: (len(p), p))
    cache[s] = result
    return result
</code></pre>

<p>However, this is still inefficient due to the deep recursion building up in the call stack. We can build a 2D table instead.
We'll store, in each index, the shortest palindrome that can be made in the substring defined
from <code>i</code> to <code>i + j</code>. Then instead of calling ourselves recursively, we'll just look up the values in
our table:</p>

<pre><code>def make_palindrome(s):
    if len(s) <= 1:
        return s
    table = [['' for i in range(len(s) + 1)] for j in range(len(s) + 1)]

    for i in range(len(s)):
        table[i][1] = s[i]

    for j in range(2, len(s) + 1):
        for i in range(len(s) - j + 1):
            term = s[i:i + j]
            first, last = term[0], term[-1]
            if first == last:
                table[i][j] = first + table[i + 1][j - 2] + last
            else:
                one = first + table[i + 1][j - 1] + first
                two = last + table[i][j - 1] + last
                if len(one) < len(two):
                    table[i][j] = one
                elif len(one) > len(two):
                    table[i][j] = two
                else:
                    table[i][j] = min(one, two)

    return table[0][-1]
</code></pre>

<p>Because we store a part of our input string in each index of our matrix, the time and space complexity
for this solution is O(N^3).</p>

35
Q

<p>This problem was asked by Google.</p>

<p>Given an array of strictly the characters 'R', 'G', and 'B', segregate the values of the array
so that all the Rs come first, the Gs come second, and the Bs come last. You can only
swap elements of the array.</p>

<p>Do this in linear time and in-place.</p>

<p>For example, given the array ['G', 'B', 'R', 'R', 'B', 'R', 'G'], it should become ['R', 'R', 'R', 'G', 'G', 'B', 'B'].</p>

A

<p>It may be easier to first consider an easier problem: one with only two possible values, say
'R' and 'G'. Then we could maintain the following loop invariant quite easily:</p>

<ul><li>Maintain three sections of the array using two indices, <code>low</code> and <code>high</code>:<ul><li>Strictly 'R's: array[:low]</li><li>Unknown: array[low:high]</li><li>Strictly 'G's: array[high:]</li></ul></li></ul>

<p>Initially, low will be 0 and high will be <code>len(array) - 1</code>, since the whole array is unknown.
As we iterate over the array, we'll swap any 'G's we see to the third section and decrement <code>high</code>.
If we see an 'R', then we just need to increment <code>low</code>, since that's where it belongs. We can terminate
once <code>low</code> crosses <code>high</code>. So we can gradually shrink our unknown section through the following algorithm:</p>

<pre><code>def partition(arr):
    low, high = 0, len(arr) - 1
    while low <= high:
        if arr[low] == 'R':
            low += 1
        else:
            arr[low], arr[high] = arr[high], arr[low]
            high -= 1
</code></pre>

<p>This correctly partitions our array into two separate categories. How can we extend this to three partitions?
Let's maintain four sections using 3 indices, <code>low</code>, <code>mid</code>, and <code>high</code>:</p>

<ul><li>Strictly 'R's: array[:low]</li><li>Strictly 'G's: array[low:mid]</li><li>Unknown: array[mid:high]</li><li>Strictly 'B's: array[high:]</li></ul>

<p>We'll initialize <code>low</code> and <code>mid</code> both to 0, and <code>high</code> to <code>len(array) - 1</code> so that our unknown section is the whole array, as before.
To maintain this invariant, we should do the following:</p>

<ul><li>Look at array[mid]:<ul><li>If it's <code>R</code>, then swap <code>array[low]</code> with <code>array[mid]</code> and increment <code>low</code> and <code>mid</code></li><li>If it's <code>G</code>, then just increment <code>mid</code>; it's where it should be</li><li>If it's <code>B</code>, then swap <code>array[mid]</code> with <code>array[high]</code> and decrement <code>high</code></li></ul></li></ul>

<p>Once <code>mid</code> crosses over with <code>high</code>, then our unknown section is gone and we can terminate.</p>

<p>Our solution looks like this:</p>

<pre><code>def partition(arr):
    low, mid, high = 0, 0, len(arr) - 1
    while mid <= high:
        if arr[mid] == 'R':
            arr[low], arr[mid] = arr[mid], arr[low]
            low += 1
            mid += 1
        elif arr[mid] == 'G':
            mid += 1
        else:
            arr[mid], arr[high] = arr[high], arr[mid]
            high -= 1
</code></pre>

<p>P.S. This problem is also called the <a>Dutch national flag problem</a>!</p>

36
Q

<p>This problem was asked by Dropbox.</p>

<p>Given the root to a binary search tree, find the second largest node in the tree.</p>

A

<p>An in-order traversal of the binary search tree would give us all the nodes of the tree
in sorted order. So the naive solution here might be to do an in-order traversal of the
tree, store it in an array, and return the second-to-last element of the array.</p>

<p>This takes O(N) time and space since we have to go through and store every node in the tree.</p>

<p>We can do better. Notice that an in-order traversal always explores the left subtree before
the current node. We can do something similar by exploring the right subtree first.</p>

<p>Let's do a reverse in-order traversal, where we first call ourselves recursively on the right node.
Because it's reversed, that should give us the binary tree in reverse sorted order.</p>

<p>So we can keep a counter, and once we start processing the current node we can increment the counter.
Once it hits 2, that means the current node we're looking at is the second largest, so we can stuff
it in a variable and eventually return that.</p>

<pre><code>def second_largest(root):
    def inorder(node):
        if not node or count[0] == 2:
            return

        if node.right:
            inorder(node.right)

        count[0] += 1
        if count[0] == 2:
            val.append(node.val)
            return

        if node.left:
            inorder(node.left)

    count = [0]
    val = []
    inorder(root)
    return val[0]
</code></pre>

<p>Unfortunately, because of Python's <a>demented scoping rules</a>,
we have to wrap <code>count</code> and <code>val</code> in lists. Ugly!</p>
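<p>In Python 3, the list-wrapping can be avoided with <code>nonlocal</code>. A sketch, assuming a simple <code>Node</code> class with <code>val</code>, <code>left</code>, and <code>right</code> attributes:</p>

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def second_largest(root):
    count = 0
    result = None

    def reverse_inorder(node):
        # nonlocal lets us rebind count and result in the enclosing scope.
        nonlocal count, result
        if not node or count == 2:
            return
        reverse_inorder(node.right)
        if count == 2:
            return
        count += 1
        if count == 2:
            result = node.val
            return
        reverse_inorder(node.left)

    reverse_inorder(root)
    return result
```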

37
Q

<p>This problem was asked by Google.</p>

<p>The power set of a set is the set of all its subsets. Write a function that, given a set, generates its power set.</p>

<p>For example, given the set <code>{1, 2, 3}</code>, it should return <code>{{}, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}</code>.</p>

<p>You may also use a list or array to represent a set.</p>

A

<p>To gain some intuition about this problem, let's try some examples:</p>

<ul><li>If we're given the empty set (<code>{}</code>), then the power set is a set with only the empty set in it: <code>{{}}</code></li><li>If we're given a set with one element in it (<code>{a}</code>), then the power set is a set with two sets: an empty set and a set with the element in it: <code>{{}, {a}}</code></li><li>If we're given a set with two elements in it (<code>{a, b}</code>), then the power set has four sets: <code>{{}, {a}, {b}, {a, b}}</code></li></ul>

<p>What's the pattern?</p>

<p>Notice that going from the empty set to <code>{a}</code>, we keep the empty set in our result and add another set containing <code>a</code>.
Similarly, going from one element to two, we keep the same one-element result (<code>{}, {a}</code>), but we also add a copy of each set with <code>b</code> added (<code>{b}, {a, b}</code>).</p>

<p>So we can use the following recursive formula to generate the power set:</p>

<ul><li>If the input set is empty, return a set with an empty set in it</li><li>Otherwise, take an element from our set. Let's call it <code>x</code>.</li><li>Generate the power set of our input set without <code>x</code>. Let's call it <code>result</code>, for lack of a better name.</li><li>Return the union of <code>result</code> with <code>result + x</code> (that is, every set in <code>result</code> with <code>x</code> added to it)</li></ul>

<pre><code>def power_set(s):
    if not s:
        return [[]]
    result = power_set(s[1:])
    return result + [subset + [s[0]] for subset in result]
</code></pre>

<p>This runs in O(2^N) time and space, since that's how many subsets there are.</p>
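<p>An equivalent non-recursive formulation, included here as a sketch: map each subset to an n-bit mask, where bit <code>i</code> decides whether element <code>i</code> is included. The function name is mine, not part of the original solution.</p>

```python
def power_set_iter(s):
    n = len(s)
    # Masks 0 .. 2^n - 1 enumerate every subset exactly once:
    # bit i of the mask selects element s[i].
    return [[s[i] for i in range(n) if mask >> i & 1] for mask in range(1 << n)]
```

<p>This also runs in O(2^N) time and space, but avoids recursion entirely.</p>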

38
Q

<p>This problem was asked by Microsoft.</p>

<p>You have an N by N board. Write a function that, given N, returns the number of possible arrangements of the
board where N queens can be placed on the board without threatening each other, i.e. no two queens
share the same row, column, or diagonal.</p>

A

<p>If we were to attempt to solve this problem using brute force, we would quickly
find out that it would be prohibitively expensive. Consider a typical 8 by 8 board:
we have 64 spots to place 8 queens, so that's 64 choose 8 (over four billion) possible placements.
In general, the number of placements grows much faster than exponentially in N!</p>

<p>This problem is ripe for solving with backtracking. In backtracking, we
can visualize the search space like a tree, and we would explore it depth-first.
Each node would be a possible configuration. If the configuration contains
N queens and is valid, then we're done and we can add it to our count.
Otherwise, we can try to place another queen somewhere on the board and
search from there. If we encounter an invalid board, then we can just prune
the entire subtree from our search -- there's no point in exploring a board
that we know won't work.</p>

<p>Notice we can pare down the search space by ensuring we only place queens
in distinct rows, since we know that two queens can never occupy the same row.</p>

<p>Now we can just represent the board as a one-dimensional array of max size N,
where each value represents which column the queen is on. For example, one
solution for N = 4 would just be [1, 3, 0, 2].</p>

<pre><code>def n_queens(n, board=[]):
    if n == len(board):
        return 1

    count = 0
    for col in range(n):
        board.append(col)
        if is_valid(board):
            count += n_queens(n, board)
        board.pop()
    return count

def is_valid(board):
    current_queen_row, current_queen_col = len(board) - 1, board[-1]
    # Iterate over all already-placed queens and check if any of them can attack
    # each other.
    for row, col in enumerate(board[:-1]):
        diff = abs(current_queen_col - col)
        if diff == 0 or diff == current_queen_row - row:
            return False
    return True
</code></pre>

<p>If you're interested in optimizing this problem even further, check out <a>this paper</a>
that uses constant space by representing all columns and diagonals simply with integers! However, this depends on n being smaller than the number of bits in your integer.</p>
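<p>As a rough sketch of that bitwise idea (the names are mine, not the paper's): each row passes down three bitmasks marking the columns and the two diagonal directions already under attack, and the diagonal masks shift by one as we move down a row.</p>

```python
def n_queens_bits(n, cols=0, ldiag=0, rdiag=0):
    # All n column bits set means n non-attacking queens have been placed.
    if cols == (1 << n) - 1:
        return 1
    count = 0
    # Columns in this row not attacked by any earlier queen.
    free = ~(cols | ldiag | rdiag) & ((1 << n) - 1)
    while free:
        bit = free & -free  # lowest free column
        free -= bit
        # Shift the diagonal masks as we descend one row.
        count += n_queens_bits(n, cols | bit, (ldiag | bit) << 1, (rdiag | bit) >> 1)
    return count
```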

39
Q

<p>This problem was asked by Dropbox.</p>

<p>Conway's Game of Life takes place on an infinite two-dimensional board of square cells.
Each cell is either dead or alive, and at each tick, the following rules apply:</p>

<ul><li>Any live cell with less than two live neighbours dies.</li><li>Any live cell with two or three live neighbours remains living.</li><li>Any live cell with more than three live neighbours dies.</li><li>Any dead cell with exactly three live neighbours becomes a live cell.</li></ul>

<p>A cell neighbours another cell if it is horizontally, vertically, or diagonally adjacent.</p>

<p>Implement Conway's Game of Life. It should be able to be initialized with a starting list of
live cell coordinates and the number of steps it should run for. Once initialized, it should print
out the board state at each step. Since it's an infinite board, print out only the relevant coordinates, i.e.
from the top-leftmost live cell to bottom-rightmost live cell.</p>

<p>You can represent a live cell with an asterisk (<code>*</code>) and a dead cell with a dot (<code>.</code>).</p>

A

<p>This is a straightforward implementation problem, so your solution may differ.
Since our board is infinite, we can't create a matrix that represents our whole board.</p>

<p>Instead, we'll represent each cell simply as a pair of cartesian coordinates (row, col).
In this solution, we keep the set of cells as a property on our class. Each
tick, we create a new set of cells that represents the next generation. We
pretty much have to do this so that changing the board doesn't affect the future
cells we process from the current generation.</p>

<p>We look at each live cell, compute the number of neighbours for each one, and
preserve it according to the rules.</p>

<p>Similarly, we look at all the neighbouring cells of all the live cells, since
any of them could potentially become alive due to rule #4. If any of them have
exactly 3 neighbours, then we should add them to the set of new cells.</p>

<p>For printing the board, we need to find the top-leftmost cell and the bottom-rightmost
cell. These are our boundaries for the board. Then we can print each row cell by cell,
checking whether the current spot is in our set of cells.</p>

<p>It's useful to create some helper functions here. In our case, we have:</p>

<ul><li><code>get_number_of_live_neighbours</code></li><li><code>get_neighbouring_cells</code></li><li><code>get_boundaries</code></li></ul>

<pre><code>class GameOfLife:
    def __init__(self, n, cells=set()):
        # Each cell will be a tuple (row, col)
        self.cells = cells
        for _ in range(n):
            self.print_board()
            self.next()

    def get_number_of_live_neighbours(self, row, col):
        count = 0
        for cell_row, cell_col in self.cells:
            if abs(cell_row - row) > 1:
                continue
            if abs(cell_col - col) > 1:
                continue
            if cell_row == row and cell_col == col:
                continue
            count += 1
        return count

    def get_neighbouring_cells(self, row, col):
        return set([
            (row - 1, col - 1),
            (row, col - 1),
            (row + 1, col - 1),
            (row - 1, col),
            (row + 1, col),
            (row - 1, col + 1),
            (row, col + 1),
            (row + 1, col + 1),
        ])

    def next(self):
        new_cells = set()
        # Go through each cell, look for neighbours, decide whether to append to new list
        for row, col in self.cells:
            num_of_neighbours = self.get_number_of_live_neighbours(row, col)
            if 2 <= num_of_neighbours <= 3:
                new_cells.add((row, col))

        potential_live_cells = set()
        for row, col in self.cells:
            potential_live_cells = potential_live_cells.union(self.get_neighbouring_cells(row, col))
        potential_live_cells = potential_live_cells - self.cells

        # Go through each potential live cell, get the number of neighbours, and add if = 3
        for row, col in potential_live_cells:
            num_of_neighbours = self.get_number_of_live_neighbours(row, col)
            if num_of_neighbours == 3:
                new_cells.add((row, col))

        self.cells = new_cells

    def get_boundaries(self):
        top = min(self.cells, key=lambda cell: cell[0])[0]
        left = min(self.cells, key=lambda cell: cell[1])[1]
        bottom = max(self.cells, key=lambda cell: cell[0])[0]
        right = max(self.cells, key=lambda cell: cell[1])[1]
        return top, left, bottom, right

    def print_board(self):
        top, left, bottom, right = self.get_boundaries()
        print('--------------------------------------')
        for i in range(top, bottom + 1):
            for j in range(left, right + 1):
                if (i, j) in self.cells:
                    print('*', end='')
                else:
                    print('.', end='')
            print('')
        print('--------------------------------------')
</code></pre>

40
Q

<p>This problem was asked by Google.</p>

<p>Given an array of integers where every integer occurs three times except for one integer, which only occurs once,
find and return the non-duplicated integer.</p>

<p>For example, given [6, 1, 3, 3, 3, 6, 6], return 1. Given [13, 19, 13, 13], return 19.</p>

<p>Do this in O(N) time and O(1) space.</p>

A

<p>We can find the unique number in an array of <em>two</em> duplicates by XORing all the numbers
in the array. This cancels out every bit that is set an even number of times,
leaving exactly the bits of the unique number.</p>
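<p>For reference, that two-duplicate version is a one-liner (a sketch assuming every element appears exactly twice except one):</p>

```python
from functools import reduce
from operator import xor

def find_unique_pairs(arr):
    # Pairs cancel bitwise: x ^ x == 0, and 0 ^ y == y.
    return reduce(xor, arr)
```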

<p>Let's try to extend this technique to three duplicates. Instead of cancelling out bits
that are set an even number of times, we want to cancel out bits that are set a
multiple of three times.</p>

<p>Let's assume all integers fit in 32 bits. We'll create an array of 32 zeroes, and
while iterating over each number in our array, add each of its bits to the proper spot in the
array, keeping every count modulo 3. Any bit that has been set some multiple of 3 times will
effectively be cleared, leaving only the bits of the unique number.</p>

<pre><code>def find_unique(arr):
    result_arr = [0] * 32
    for num in arr:
        for i in range(32):
            bit = num >> i & 1
            result_arr[i] = (result_arr[i] + bit) % 3

    result = 0
    for i, bit in enumerate(result_arr):
        if bit:
            result += 2 ** i

    return result
</code></pre>

<p>This runs in linear time, since we iterate over the array once, and in constant space, since we initialize an array of constant size.</p>

41
Q

<p>This problem was asked by Facebook.</p>

<p>Given an unordered list of flights taken by someone, each represented as (origin, destination) pairs, and a starting airport,
compute the person's itinerary. If no such itinerary exists, return null. If there are multiple possible itineraries, return
the lexicographically smallest one. All flights must be used in the itinerary.</p>

<p>For example, given the list of flights [('SFO', 'HKO'), ('YYZ', 'SFO'), ('YUL', 'YYZ'), ('HKO', 'ORD')] and starting airport 'YUL',
you should return the list ['YUL', 'YYZ', 'SFO', 'HKO', 'ORD'].</p>

<p>Given the list of flights [('SFO', 'COM'), ('COM', 'YYZ')] and starting airport 'COM', you should return null.</p>

<p>Given the list of flights [('A', 'B'), ('A', 'C'), ('B', 'C'), ('C', 'A')] and starting airport 'A',
you should return the list ['A', 'B', 'C', 'A', 'C'] even though ['A', 'C', 'A', 'B', 'C'] is also a valid
itinerary. However, the first one is lexicographically smaller.</p>

A

<p>This problem is similar to the N queens problem a few days ago: we have a desired
final state (all the flights are used up), we can construct partial itineraries and
reject them, and at each step we have potentially multiple avenues to explore. That
suggests that backtracking is again a very likely candidate for solving our problem.</p>

<p>In particular, we can do the following:</p>

<ul><li>Keep a list of itinerary candidates</li><li>Keep a current itinerary initialized with our starting airport</li><li>Then, recursively:<ul><li>Iterate over all the flights that start from the last airport in our
itinerary</li><li>For each flight, temporarily add the destination to our itinerary and remove
it from the flight list. Then call ourselves recursively with the new itinerary and flight list.</li><li>Concatenate all the results to our list of itinerary candidates.</li></ul></li><li>Sort our itinerary candidates and pick the lexicographically smallest one.</li></ul>

<p>To speed this up, we'll store all the flights into a dictionary with the origin
as a key and a list of flight destinations from that origin as the value.
Then we can look up our options in O(1) time instead of O(N) time.</p>

<pre><code>from collections import defaultdict

def get_itinerary(flights, start):
    # Store all the flights into a dictionary key:origin -> val:list of destinations
    flight_map = defaultdict(list)
    for origin, destination in flights:
        flight_map[origin] += [destination]

    def visit(flight_map, total_flights, current_itinerary):
        # If our itinerary uses up all the flights, we're done here.
        if len(current_itinerary) == total_flights + 1:
            return [current_itinerary[:]]

        last_stop = current_itinerary[-1]
        # If we haven't used all the flights yet but we have no way
        # of getting out of this airport, then we're stuck. Backtrack out.
        if not flight_map[last_stop]:
            return []

        # Otherwise, let's try all the options out of the current stop recursively.
        # We temporarily take them out of the mapping once we use them.
        potential_itineraries = []
        for i, flight in enumerate(flight_map[last_stop]):
            flight_map[last_stop].pop(i)
            current_itinerary.append(flight)
            potential_itineraries.extend(visit(flight_map, total_flights, current_itinerary))
            flight_map[last_stop].insert(i, flight)
            current_itinerary.pop()
        return potential_itineraries

    valid_itineraries = visit(flight_map, len(flights), [start])
    if valid_itineraries:
        return sorted(valid_itineraries)[0]
</code></pre>

42
Q

<p>This problem was asked by Google.</p>

<p>Given a list of integers S and a target number k, write a function that returns a subset of S that adds up to k.
If such a subset cannot be made, then return null.</p>

<p>Integers can appear more than once in the list. You may assume all numbers in the list are positive.</p>

<p>For example, given S = [12, 1, 61, 5, 9, 2] and k = 24, return [12, 9, 2, 1] since it sums up to 24.</p>

A

<p>Let's consider the brute force method: selecting all subsets, summing them, and checking if they equal k.
That would take O(2^N * N) time, since generating all subsets takes O(2^N) and we need to sum everything
in the subset.</p>

<p>We can do a little better by computing the sum implicitly. For each call, we choose either to pick
some element (let's say the last) from our set and recursively look for <code>k - last</code> in the remaining part
of the list, or to exclude the last element and keep looking for <code>k</code> in the remaining part of the list.</p>

<pre><code>def subset_sum(nums, k):
    if k == 0:
        return []
    if not nums and k != 0:
        return None

    nums_copy = nums[:]
    last = nums_copy.pop()

    with_last = subset_sum(nums_copy, k - last)
    without_last = subset_sum(nums_copy, k)
    if with_last is not None:
        return with_last + [last]
    if without_last is not None:
        return without_last
</code></pre>

<p>This makes O(2^N) recursive calls, but since we copy the whole array on each call, the total work is actually O(2^N * N).</p>

<p>Let's try to improve the running time by using dynamic programming. We have the recursive formula nailed down.
How can we use bottom-up dynamic programming to improve the runtime?</p>

<p>We can construct a table <code>A</code> that's size <code>len(nums) + 1</code> by <code>k + 1</code>. At each index <code>A[i][j]</code>, we'll keep a subset of the list from <code>0..i</code> (including lower, excluding upper bound)
that can add up to <code>j</code>, or null if no list can be made. Then we will fill up the table using pre-computed values and once we're done,
we should be able to just return the value at <code>A[-1][-1]</code>. Let's first initialize the list:</p>

<pre><code>A = [[None for _ in range(k + 1)] for _ in range(len(nums) + 1)]
</code></pre>

<p>To begin, we can initialize each element of the first column (<code>A[i][0] for i in range(len(nums) + 1)</code>) with the empty list, since any subset of the list can make 0: just don't pick anything!</p>

<pre><code>for i in range(len(nums) + 1):
    A[i][0] = []
</code></pre>

<p>Each element of the first row (<code>A[0][j] for j in range(1, k + 1)</code>) should be null, since we can't make anything other than 0 with the empty set.
Since we've initialized our whole table to be null, we don't need to do anything here.</p>

<pre><code>[],   None, None, None, ...
[],   None, None, None, ...
[],   None, None, None, ...
...
</code></pre>

<p>Now we can start populating the table. Iterating over each row starting at 1, and then each column starting at 1, we can use
the following formula to compute <code>A[i][j]</code>:</p>

<ul><li><p>First, let's consider the last element of the list we're looking at: <code>nums[i - 1]</code>. Let's call this <code>last</code>.</p></li><li><p>If <code>last</code> is greater than <code>j</code>, then we definitely can't make <code>j</code> with <code>nums[:i]</code> including <code>last</code> (since it would obviously go over). So let's just copy
over whatever we had from <code>A[i - 1][j]</code>. If we can make <code>j</code> without <code>last</code>, then we can still make <code>j</code>. If we can't, then we still can't.</p></li><li>If <code>last</code> is smaller than or equal to <code>j</code>, then we still might be able to make <code>j</code> using <code>last</code>:<ul><li>If we can make <code>j</code> without <code>last</code> (look up <code>A[i - 1][j]</code> and check that it's not null), then use that.</li><li>Else, if we can't make <code>j</code> without <code>last</code>, check if we can make it <em>with</em> <code>last</code> by looking up the value at <code>A[i - 1][j - last]</code>. If we can, then copy over the list from there and append <code>last</code> to it.</li><li>Else, we can't make <code>j</code> with or without <code>last</code>, so set <code>A[i][j]</code> to null.</li></ul></li></ul>

<pre><code>for i in range(1, len(nums) + 1):
    for j in range(1, k + 1):
        last = nums[i - 1]
        if last > j:
            A[i][j] = A[i - 1][j]
        else:
            if A[i - 1][j] is not None:
                A[i][j] = A[i - 1][j]
            elif A[i - 1][j - last] is not None:
                A[i][j] = A[i - 1][j - last] + [last]
            else:
                A[i][j] = None
</code></pre>

<p>Putting it all together:</p>

<pre><code>def subset_sum(nums, k):
    A = [[None for _ in range(k + 1)] for _ in range(len(nums) + 1)]

    for i in range(len(nums) + 1):
        A[i][0] = []

    for i in range(1, len(nums) + 1):
        for j in range(1, k + 1):
            last = nums[i - 1]
            if last > j:
                A[i][j] = A[i - 1][j]
            else:
                if A[i - 1][j] is not None:
                    A[i][j] = A[i - 1][j]
                elif A[i - 1][j - last] is not None:
                    A[i][j] = A[i - 1][j - last] + [last]
                else:
                    A[i][j] = None

    return A[-1][-1]
</code></pre>
</code></pre>

<p>This runs in O(k * N) time and space.</p>

43
Q

<p>This problem was asked by Amazon.</p>

<p>Implement a stack that has the following methods:</p>

<ul><li>push(val), which pushes an element onto the stack</li><li>pop(), which pops off and returns the topmost element of the stack. If there are no elements in the stack, then it should throw an error or return null.</li><li>max(), which returns the maximum value in the stack currently. If there are no elements in the stack, then it should throw an error or return null.</li></ul>

<p>Each method should run in constant time.</p>

A

<p>Implementing the stack part (push and pop) of this problem is easy -- we can just use a typical list to
implement the stack with <code>append</code> and <code>pop</code>. However, getting the max in constant time is a little trickier.
We could obviously do it in linear time if we popped off everything on the stack while keeping track of the
maximum value, and then put everything back on.</p>

<p>We can use a secondary stack that <em>only</em> keeps track of the max values at any time. It will stay in sync
with our primary stack -- it will have exactly the same number of elements at any point in time --
but its top will always contain the maximum value of the stack.</p>

<p>When pushing, we compare the new element with the top of the secondary stack (the current max)
and push the larger of the two onto the secondary stack. When popping, we pop from both stacks.</p>

<pre><code>class MaxStack:
    def __init__(self):
        self.stack = []
        self.maxes = []

    def push(self, val):
        self.stack.append(val)
        if self.maxes:
            self.maxes.append(max(val, self.maxes[-1]))
        else:
            self.maxes.append(val)

    def pop(self):
        if self.maxes:
            self.maxes.pop()
        return self.stack.pop()

    def max(self):
        return self.maxes[-1]
</code></pre>

<p>Everything should run in O(1) time.</p>

44
Q

<p>This problem was asked by Google.</p>

<p>We can determine how "out of order" an array A is by counting the number of
inversions it has. Two elements <code>A[i]</code> and <code>A[j]</code> form an inversion if <code>A[i] > A[j]</code> but <code>i < j</code>.
That is, a smaller element appears after a larger element.</p>

<p>Given an array, count the number of inversions it has. Do this faster than O(N^2) time.</p>

<p>You may assume each element in the array is distinct.</p>

<p>For example, a sorted list has zero inversions. The array [2, 4, 1, 3, 5] has three inversions: (2, 1), (4, 1), and (4, 3).
The array [5, 4, 3, 2, 1] has ten inversions: every distinct pair forms an inversion.</p>

A

<p>The brute force solution here should come naturally from the definition: we can
run a doubly nested for loop over all pairs, and increment a counter whenever
we encounter a larger element before a smaller element. That would look like this:</p>

<pre><code>def count_inversions(arr):
    count = 0
    for i in range(len(arr) - 1):
        for j in range(i + 1, len(arr)):
            if arr[i] > arr[j]:
                count += 1
    return count
</code></pre>

<p>However, this would run in O(N^2), and we want something faster. We can use the following
recursive, divide-and-conquer algorithm to count the number of inversions in O(N log N) time.</p>

<ul><li>First, let's separate our input array into two halves A and B</li><li>Count the number of inversions in each list recursively</li><li>Count the inversions <em>between</em> A and B</li><li>Return the sum of all three counts</li></ul>

<p>If we are able to count all the inversions between A and B in linear time, then according to
the <a>master theorem for divide-and-conquer recurrences</a>,
our algorithm will run in O(N log N) time, since we have the same recurrence relationship as merge sort.</p>

<p>How can we count the inversions between A and B in linear time? If we expand our <code>count_inversions</code>
function to also sort the array as well, we can use a similar technique to merge sort to merge
and also count the inversions between A and B. To be specific, assuming A and B are sorted, we
can construct a helper function that does the following:</p>

<ul><li>Scan A and B from left to right, with two pointers <code>i</code> and <code>j</code></li><li>Compare <code>A[i]</code> and <code>B[j]</code><ul><li>If <code>A[i]</code> is smaller than <code>B[j]</code>, then <code>A[i]</code> is not inverted with anything from <code>B</code>, since it's expected that everything in <code>A</code> would be smaller than everything in <code>B</code> if <code>A + B</code> was sorted.</li><li>If <code>A[i]</code> is greater than <code>B[j]</code>, then <code>B[j]</code> is inverted with everything from <code>A[i:]</code>, since <code>A</code> is sorted. Increment our counter by the number of elements in <code>A[i:]</code>.</li></ul></li><li>Append the smaller element between <code>A[i]</code> and <code>B[j]</code> to our sorted list</li><li>Return the sorted list</li></ul>

<pre><code>def count_inversions(arr):
    count, _ = count_inversions_helper(arr)
    return count

def count_inversions_helper(arr):
    if len(arr) <= 1:
        return 0, arr
    mid = len(arr) // 2
    a = arr[:mid]
    b = arr[mid:]
    left_count, left_sorted_arr = count_inversions_helper(a)
    right_count, right_sorted_arr = count_inversions_helper(b)
    between_count, sorted_arr = merge_and_count(left_sorted_arr, right_sorted_arr)
    return left_count + right_count + between_count, sorted_arr

def merge_and_count(a, b):
    count = 0
    sorted_arr = []
    i, j = 0, 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            sorted_arr.append(a[i])
            i += 1
        elif a[i] > b[j]:
            sorted_arr.append(b[j])
            count += len(a) - i
            j += 1
    sorted_arr.extend(a[i:])
    sorted_arr.extend(b[j:])
    return count, sorted_arr
</code></pre>

45
Q

<p>This problem was asked by Two Sigma.</p>

<p>Using a function <code>rand5()</code> that returns an integer from 1 to 5 (inclusive) with uniform
probability, implement a function <code>rand7()</code> that returns an integer from 1 to 7 (inclusive).</p>

A

<p>We can solve this by calling <code>rand5()</code> twice. This gives us more than 7 outcomes to choose
from. However, we must be careful not to take the sum or product of the results -- that would
skew the probability distribution. Consider that there's only one way to make 2 from two
<code>rand5()</code>s but two ways to make 3.</p>
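<p>We can see the skew by enumerating all 25 equally likely pairs and tallying their sums:</p>

```python
from collections import Counter

# Every (r1, r2) pair is equally likely, but their sums are not.
sums = Counter(r1 + r2 for r1 in range(1, 6) for r2 in range(1, 6))
assert sums[2] == 1   # only 1 + 1
assert sums[6] == 5   # 1+5, 2+4, 3+3, 4+2, 5+1
```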

<p>So we must consider each distinct pair of <code>rand5()</code> results. This gives us 5^2 = 25 equally
likely outcomes. Ideally, we would divide these evenly among 7 values, but no
power of 5 is also a multiple of 7 (consider the prime factorization of 5^N), so we will have
to make do. For our solution, we'll make a table of results:</p>

<pre><code>        r2=1   r2=2   r2=3   r2=4   r2=5
r1=1      1      1      1      6      7
r1=2      2      2      2      6      7
r1=3      3      3      3      6      7
r1=4      4      4      4      R      R
r1=5      5      5      5      R      R
</code></pre>

<p>R means we need to reroll.</p>

<pre><code>def rand7():
    r1, r2 = rand5(), rand5()
    if r2 <= 3:
        return r1
    elif r2 == 4:
        if r1 <= 3:
            return 6
        else:
            return rand7()
    else:  # r2 == 5
        if r1 <= 3:
            return 7
        else:
            return rand7()
</code></pre>

<p>This method has a potentially unbounded runtime, since it's possible (though increasingly
unlikely) that we keep hitting the reroll cases. In expectation, though, we only need
25/21 ≈ 1.19 pairs of rolls.</p>
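<p>We can verify that the table yields a uniform distribution by enumerating all 25 outcomes (a sketch; <code>None</code> stands for a reroll, and <code>table_value</code> is just a name for the lookup):</p>

```python
from collections import Counter

def table_value(r1, r2):
    # Mirrors the table: columns 1-3 return r1; columns 4 and 5 map
    # rows 1-3 to 6 and 7 respectively; everything else is a reroll.
    if r2 <= 3:
        return r1
    if r1 <= 3:
        return 6 if r2 == 4 else 7
    return None  # reroll

counts = Counter(table_value(r1, r2) for r1 in range(1, 6) for r2 in range(1, 6))
# Each of 1..7 occupies exactly 3 of the 25 cells, so conditioned on not
# rerolling, every value has probability 3/21 = 1/7.
```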

46
Q

<p>This problem was asked by Amazon.</p>

<p>Given a string, find the longest palindromic contiguous substring. If there is more than
one with the maximum length, return any one.</p>

<p>For example, the longest palindromic substring of "aabcdcb" is "bcdcb". The longest
palindromic substring of "bananas" is "anana".</p>

A

<p>We can compute the longest palindromic contiguous substring in O(N^3) using brute force.
We can just iterate over each substring of the string and check whether it's a palindrome.</p>

<pre><code>def is_palindrome(s):
    return s[::-1] == s

def longest_palindrome(s):
    longest = ''
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            substring = s[i:j]
            if is_palindrome(substring) and len(substring) > len(longest):
                longest = substring
    return longest
</code></pre>

<p>We can improve the running time of this algorithm by using dynamic programming to store
any repeated computations. Let's keep an N by N table <code>A</code>, where N is the length of the
input string. Then, at each index <code>A[i][j]</code> we'll keep whether or not the substring
<code>s[i:j + 1]</code> (from index <code>i</code> to <code>j</code>, inclusive) is a palindrome. We'll make use of the following relationships:</p>

<ul><li>All strings of length 1 are palindromes</li><li><code>s</code> is a palindrome if <code>s[1:-1]</code> is a palindrome and the first and last character of <code>s</code> are the same</li></ul>

<p>So, when we fill up our table, we can do the following:</p>

<ul><li>First, set each item along the diagonal <code>A[i][i]</code> to true, since strings of length 1 are always palindromes</li><li>Then, set <code>A[i][i + 1]</code> to true if <code>s[i] == s[i + 1]</code> and false otherwise (this checks all strings of length 2)</li><li>Finally, iterate over the matrix from top to bottom, left to right, only examining the upper diagonal. Note that it
doesn't make sense for <code>j</code> to be smaller than <code>i</code>, so that's why we only need to deal with half of the matrix.
Set <code>A[i][j]</code> to true only if <code>A[i + 1][j - 1]</code> is true and <code>s[i]</code> == <code>s[j]</code>.</li><li>Keep track of the longest palindromic substring we've seen so far.</li></ul>

<p>Let's go through an example with the word "bananas".</p>

<pre><code>     b   a   n   a   n   a   s
b    t   f   f   f   f   f   f
a        t   f   t   f   t   f
n            t   f   t   f   f
a                t   f   t   f
n                    t   f   f
a                        t   f
s                            t
</code></pre>

<p>We can see from this table that the longest palindromic substring here is "anana", since
<code>A[1][5]</code> is the true entry spanning the most characters (the substring <code>s[1:6]</code>).</p>

<pre><code>def longest_palindrome(s):
    if not s:
        return ''

    longest = s[0]
    A = [[None for _ in range(len(s))] for _ in range(len(s))]

    # Set all substrings of length 1 to be true
    for i in range(len(s)):
        A[i][i] = True

    # Try all substrings of length 2
    for i in range(len(s) - 1):
        A[i][i + 1] = s[i] == s[i + 1]
        if A[i][i + 1] and len(longest) < 2:
            longest = s[i:i + 2]

    i, k = 0, 3
    while k <= len(s):
        while i < len(s) - k + 1:
            j = i + k - 1
            A[i][j] = A[i + 1][j - 1] and s[i] == s[j]
            # Update longest if necessary
            if A[i][j] and len(s[i:j + 1]) > len(longest):
                longest = s[i:j + 1]
            i += 1
        k += 1
        i = 0
    return longest
</code></pre>

<p>This runs in O(N^2) time and space.</p>
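<p>As a sanity check, here's a compact, self-contained version of the DP (for-loops in place of the while-loops, with length-2 palindromes also feeding into <code>longest</code>) run against the prompt's examples:</p>

```python
def longest_palindrome(s):
    if not s:
        return ''
    n = len(s)
    longest = s[0]
    A = [[False] * n for _ in range(n)]
    # length-1 substrings are palindromes
    for i in range(n):
        A[i][i] = True
    # length-2 substrings
    for i in range(n - 1):
        A[i][i + 1] = s[i] == s[i + 1]
        if A[i][i + 1] and len(longest) < 2:
            longest = s[i:i + 2]
    # longer substrings, shortest first so A[i + 1][j - 1] is already filled
    for k in range(3, n + 1):
        for i in range(n - k + 1):
            j = i + k - 1
            A[i][j] = A[i + 1][j - 1] and s[i] == s[j]
            if A[i][j] and k > len(longest):
                longest = s[i:j + 1]
    return longest
```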

47
Q

<p>This problem was asked by Facebook.</p>

<p>Given an array of numbers representing the stock prices of a company in chronological order,
write a function that calculates the maximum profit you could have made from buying
and selling that stock once. You must buy before you can sell.</p>

<p>For example, given [9, 11, 8, 5, 7, 10], you should return 5, since you could buy the stock
at 5 dollars and sell it at 10 dollars.</p>

A

<p>The brute force solution here is to iterate over our list of stock prices, and for each
price, find the largest profit we could make by selling at any later price (future price minus current price), keeping
track of the largest profit found overall. That would look like this:</p>

<pre><code>def buy_and_sell(arr):
    max_profit = 0
    for i in range(len(arr) - 1):
        for j in range(i + 1, len(arr)):
            buy_price, sell_price = arr[i], arr[j]
            max_profit = max(max_profit, sell_price - buy_price)
    return max_profit
</code></pre>

<p>This would take O(N^2). Can we speed this up?</p>

<p>The maximum profit comes from the greatest difference between the highest price and lowest
price, where the higher price must come after the lower one. But if we see a high price x
and then a higher price y afterwards, then we can always discard x. So, if we keep track
of the highest price in the future for each price, we can immediately find how much profit
buying at that price can make.</p>

<p>That means we can look at the array backwards and always keep track of the highest price
we've seen so far. Then, at each step, we can look at the current price and check how much
profit we would have made buying at that price by comparing with our maximum price in the future.
Then we only need to make one pass!</p>

<pre><code>def buy_and_sell(arr):
    current_max, max_profit = 0, 0
    for price in reversed(arr):
        current_max = max(current_max, price)
        potential_profit = current_max - price
        max_profit = max(max_profit, potential_profit)
    return max_profit
</code></pre>

<p>This runs in O(N) time and O(1) space.</p>
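<p>Equivalently, we can scan forward while tracking the lowest price seen so far; this sketch of the same idea makes one forward pass:</p>

```python
def buy_and_sell_forward(arr):
    max_profit = 0
    min_price = float('inf')
    for price in arr:
        min_price = min(min_price, price)                # best buy so far
        max_profit = max(max_profit, price - min_price)  # profit if we sold today
    return max_profit
```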

48
Q

<p>This problem was asked by Google.</p>

<p>Given pre-order and in-order traversals of a binary tree, write a function to
reconstruct the tree.</p>

<p>For example, given the following preorder traversal:</p>

<p>[a, b, d, e, c, f, g]</p>

<p>And the following inorder traversal:</p>

<p>[d, b, e, a, f, c, g]</p>

<p>You should return the following tree:</p>

<pre><code>    a
   / \
  b   c
 / \ / \
d  e f  g
</code></pre>

A

<p>Recall the definitions of preorder and inorder traversals:</p>

<p>For preorder:</p>

<pre><code>- Evaluate root node
- Evaluate left node recursively
- Evaluate right node recursively
</code></pre>

<p>For inorder:</p>

<pre><code>- Evaluate left node recursively
- Evaluate root node
- Evaluate right node recursively
</code></pre>

<p>It's helpful to go over an example. Consider the following tree:</p>

<pre><code>    a
   / \
  b   c
 / \ / \
d  e f  g
</code></pre>

<p>The preorder traversal for this tree would be [a, b, d, e, c, f, g].</p>

<p>The inorder traversal for this tree would be [d, b, e, a, f, c, g].</p>

<p>Notice that because we always evaluate the root node first in a preorder traversal,
the first element in the preorder traversal will always be the root. The second element
is then either the root of the left subtree if there is one, or the root of the right subtree.
But how do we know which?</p>

<p>We can look at the inorder traversal.</p>

<p>Because we look at the left node first in an inorder traversal, all the elements up until
the root will be part of the left subtree. All elements after the root will be the right
subtree.</p>

<pre><code>Preorder:
[a, b, d, e, c, f, g]
 | r |   left  | right |

Inorder:
[d, b, e, a, f, c, g]
 |   left  | r | right |

(r = root)
</code></pre>

<p>This gives us an idea for how to solve the problem:</p>

<ul><li>Find the root by looking at the first element in the preorder traversal</li><li>Find out how many elements are in the left subtree and right subtree by searching for the index of the root in the inorder traversal</li><li>Recursively reconstruct the left subtree and right subtree</li></ul>

<p>The code for this problem would look like this:</p>

<pre><code>def reconstruct(preorder, inorder):
    if not preorder and not inorder:
        return None
    if len(preorder) == len(inorder) == 1:
        return preorder[0]

    root = preorder[0]
    root_i = inorder.index(root)
    root.left = reconstruct(preorder[1:1 + root_i], inorder[0:root_i])
    root.right = reconstruct(preorder[1 + root_i:], inorder[root_i + 1:])
    return root
</code></pre>
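<p>Here's the routine exercised end to end with a minimal <code>Node</code> class (an assumption for illustration; the traversal lists must contain the actual node objects, since <code>inorder.index</code> matches them by identity):</p>

```python
class Node:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

def reconstruct(preorder, inorder):
    if not preorder and not inorder:
        return None
    if len(preorder) == len(inorder) == 1:
        return preorder[0]
    root = preorder[0]
    root_i = inorder.index(root)  # splits inorder into left and right subtrees
    root.left = reconstruct(preorder[1:1 + root_i], inorder[0:root_i])
    root.right = reconstruct(preorder[1 + root_i:], inorder[root_i + 1:])
    return root

a, b, c, d, e, f, g = (Node(x) for x in 'abcdefg')
root = reconstruct([a, b, d, e, c, f, g], [d, b, e, a, f, c, g])
```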

49
Q

<p>This problem was asked by Amazon.</p>

<p>Given an array of numbers, find the maximum sum of any contiguous subarray of the array.</p>

<p>For example, given the array [34, -50, 42, 14, -5, 86], the maximum sum would be 137, since we
would take elements 42, 14, -5, and 86.</p>

<p>Given the array [-5, -1, -8, -9], the maximum sum would be 0, since we would not take any elements.</p>

<p>Do this in O(N) time.</p>

A

<p>The brute force approach here would be to iterate over every contiguous subarray
and calculate its sum, keeping track of the largest one seen.</p>

<pre><code>def max_subarray_sum(arr):
    current_max = 0
    for i in range(len(arr)):
        for j in range(i + 1, len(arr) + 1):
            current_max = max(current_max, sum(arr[i:j]))
    return current_max
</code></pre>

<p>This would run in O(N^3) time. How can we make this faster?</p>

<p>We can work backwards from our desired solution by iterating over the array and
computing the maximum possible subarray that ends at each index.
At each index, we either extend the best subarray ending at the previous index by the current element, or start a new subarray there.</p>

<p>We can then keep track of the maximum subarray we've seen so far in a variable
<code>max_so_far</code>, compute the maximum subarray that includes <code>x</code> at each iteration,
and choose whichever one is bigger.</p>

<pre><code>def max_subarray_sum(arr):
    max_ending_here = max_so_far = 0
    for x in arr:
        max_ending_here = max(x, max_ending_here + x)
        max_so_far = max(max_so_far, max_ending_here)
    return max_so_far
</code></pre>

<p>This algorithm is known as Kadane's algorithm, and it runs in O(N) time and O(1) space.</p>
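<p>Checking the implementation against the two examples from the prompt:</p>

```python
def max_subarray_sum(arr):
    max_ending_here = max_so_far = 0
    for x in arr:
        # either extend the previous best subarray or start fresh at x
        max_ending_here = max(x, max_ending_here + x)
        max_so_far = max(max_so_far, max_ending_here)
    return max_so_far
```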

50
Q

<p>This problem was asked by Microsoft.</p>

<p>Suppose an arithmetic expression is given as a binary tree. Each leaf is an integer and each internal node is one of '+', '−', '∗', or '/'.</p>

<p>Given the root to such a tree, write a function to evaluate it.</p>

<p>For example, given the following tree:</p>

<pre><code>    *
   / \
  +   +
 / \ / \
3  2 4  5
</code></pre>

<p>You should return 45, as it is (3 + 2) * (4 + 5).</p>

A

<p>This problem should be straightforward from the definition. It will be recursive. We check the
value of the root node. If it's one of our arithmetic operators, then we take
the evaluated value of our node's children and apply the operator on them.</p>

<p>If it's not an arithmetic operator, it has to be a raw number, so we can just return that.</p>

<pre><code>class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

PLUS = "+"
MINUS = "-"
TIMES = "*"
DIVIDE = "/"

def evaluate(root):
    if root.val == PLUS:
        return evaluate(root.left) + evaluate(root.right)
    elif root.val == MINUS:
        return evaluate(root.left) - evaluate(root.right)
    elif root.val == TIMES:
        return evaluate(root.left) * evaluate(root.right)
    elif root.val == DIVIDE:
        return evaluate(root.left) / evaluate(root.right)
    else:
        return root.val
</code></pre>

<p>This runs in O(N) time and space.</p>
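<p>A compact usage sketch for the example tree; this variant dispatches through an operator table, which is equivalent to the if/elif chain above:</p>

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

OPS = {'+': lambda a, b: a + b,
       '-': lambda a, b: a - b,
       '*': lambda a, b: a * b,
       '/': lambda a, b: a / b}

def evaluate(root):
    # internal nodes hold an operator string; leaves hold a number
    if root.val in OPS:
        return OPS[root.val](evaluate(root.left), evaluate(root.right))
    return root.val

tree = Node('*', Node('+', Node(3), Node(2)), Node('+', Node(4), Node(5)))
```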

51
Q

<p>This problem was asked by Facebook.</p>

<p>Given a function that generates perfectly random numbers between 1 and k (inclusive),
where k is an input, write a function that shuffles a deck of cards represented as an
array using only swaps.</p>

<p>It should run in O(N) time.</p>

<p>Hint: Make sure each one of the 52! permutations of the deck is equally likely.</p>

A

<p>The most common mistake people make when implementing this shuffle is something like this:</p>

<ul><li>Iterate through the array with an index <code>i</code></li><li>Generate a random index <code>j</code> between 0 and <code>n - 1</code></li><li>Swap <code>A[i]</code> and <code>A[j]</code></li></ul>

<p>That code would look something like this:</p>

<pre><code>from random import randint

def shuffle(arr):
    n = len(arr)
    for i in range(n):
        j = randint(0, n - 1)
        arr[i], arr[j] = arr[j], arr[i]
    return arr
</code></pre>

<p>This looks like it would reasonably shuffle the array. However, the issue with
this code is that it slightly biases certain outcomes. Consider the following
array: [a, b, c]. At each step <code>i</code>, we have three different possible outcomes:
swapping the element at <code>i</code> with any of the three indices in the array. Since we swap
three times, we have 3^3 = 27 possible (and equally likely) swap sequences. But there are only 6 orderings,
and they all need to be equally likely:</p>

<ul><li>[a, b, c]</li><li>[a, c, b]</li><li>[b, a, c]</li><li>[b, c, a]</li><li>[c, a, b]</li><li>[c, b, a]</li></ul>

<p>Since 6 doesn't divide 27 evenly, it must be the case that some outcomes are over-represented. Indeed,
if we run this algorithm a million times, we see some skew:</p>

<pre><code>(2, 1, 3): 184530
(1, 3, 2): 185055
(3, 2, 1): 148641
(2, 3, 1): 185644
(3, 1, 2): 147995
(1, 2, 3): 148135
</code></pre>

<p>Recall that we want every permutation to be equally likely:
in other words, any element should have a <code>1 / n</code> probability to end up in any spot.
To make sure each element has <code>1 / n</code> probability of ending up in any spot, we can do
the following:</p>

<ul><li>Iterate through the array with an index <code>i</code></li><li>Generate a random index <code>j</code> between <code>i</code> and <code>n - 1</code></li><li>Swap <code>A[i]</code> and <code>A[j]</code></li></ul>

<p>Why does this generate a uniform distribution? Let's use a loop invariant to prove this.</p>

<p>Our loop invariant will be the following: at each index <code>i</code> of our loop, all indices before i
have an equally random probability of being any element from our array.</p>

<p>Consider the first iteration, where <code>i = 0</code>. Since we are swapping <code>A[0]</code> with an index that spans the entire array,
<code>A[0]</code> has a uniform probability of being any element in the array. So our invariant
is true in this case.</p>

<p>Assume our loop invariant is true until <code>i</code> and consider the loop at <code>i + 1</code>. Then we should
calculate the probability of some element ending up at index <code>i + 1</code>. That's equal to the probability of
not picking that element up until <code>i</code> and then choosing it.</p>

<p>All the remaining prospective elements must not have been picked yet, which means each avoided being
picked from step 0 to step <code>i</code>. That's a probability of <code>((n - 1) / n) * ((n - 2) / (n - 1)) * ... * ((n - i - 1) / (n - i))</code>.</p>

<p>Finally, we need to actually choose it. Since there are <code>n - i - 1</code> remaining elements to choose from at step <code>i + 1</code>, that's a probability of <code>1 / (n - i - 1)</code>.</p>

<p>Putting them together, we have a probability of <code>((n - 1) / n) * ((n - 2) / (n - 1)) * ... * ((n - i - 1) / (n - i)) * (1 / (n - i - 1))</code>. Notice that
everything beautifully cancels out and we are left with a probability of <code>1 / n</code>!</p>

<p>Here's what the code looks like:</p>

<pre><code>from random import randint

def shuffle(arr):
    n = len(arr)
    for i in range(n - 1):
        j = randint(i, n - 1)
        arr[i], arr[j] = arr[j], arr[i]
    return arr
</code></pre>

<p>P.S. This algorithm is called the Fisher-Yates shuffle.</p>
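<p>We can check this empirically too: shuffling [1, 2, 3] many times with the corrected procedure tallies all six permutations at roughly equal counts (illustrative simulation):</p>

```python
from collections import Counter
from random import randint

def shuffle(arr):
    n = len(arr)
    for i in range(n - 1):
        j = randint(i, n - 1)  # choose from the remaining positions only
        arr[i], arr[j] = arr[j], arr[i]
    return arr

tallies = Counter()
for _ in range(60000):
    tallies[tuple(shuffle([1, 2, 3]))] += 1
# all 6 permutations appear, each near 10000
```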

52
Q

<p>This problem was asked by Google.</p>

<p>Implement an LRU (Least Recently Used) cache. It should be able to be initialized with a cache size <code>n</code>,
and contain the following methods:</p>

<ul><li><code>set(key, value)</code>: sets <code>key</code> to <code>value</code>. If there are already <code>n</code> items in the cache and we are adding a new
item, then it should also remove the least recently used item.</li><li><code>get(key)</code>: gets the value at <code>key</code>. If no such key exists, return null.</li></ul>

<p>Each operation should run in O(1) time.</p>

A

<p>To implement both these methods in constant time, we'll need to use a hash table along
with a linked list. The hash table will map keys to nodes in the linked list, and the
linked list will be ordered from least recently used to most recently used. Then, for <code>set</code>:</p>

<ul><li>First look at our current capacity. If it's < n, then create a node with the val,
add it to the back of the list (the most recently used end), and add it as an entry in the dictionary.</li><li>If it's equal to n, then add our node as usual, but also evict the least recently used
node by removing the head of our linked list and also deleting the entry from our dictionary.
We'll need to keep track of the key in each node so that we know which entry to evict.</li></ul>

<p>For <code>get</code>:</p>

<ul><li>If the key doesn't exist in our dictionary, then return null.</li><li>Otherwise, look up the relevant node through the dictionary. Before returning it, update
the linked list by moving the node to the front of the list.</li></ul>

<p>To help us out, we can use the following tricks:</p>

<ul><li>Using dummy nodes for the head and tail of our list, which will simplify creating the list when nothing's initialized.</li><li>Implementing the helper class LinkedList to reuse code when adding and removing nodes from our linked list.</li><li>When we need to bump a node to the back of the list (like when we fetch it), we can just remove it and re-add it.</li></ul>

<p>In the end, the code would look like this:</p>

<pre><code>class Node:
    def __init__(self, key, val):
        self.key = key
        self.val = val
        self.prev = None
        self.next = None

class LinkedList:
    def __init__(self):
        # dummy nodes
        self.head = Node(None, 'head')
        self.tail = Node(None, 'tail')
        # set up head <-> tail
        self.head.next = self.tail
        self.tail.prev = self.head

    def get_head(self):
        return self.head.next

    def get_tail(self):
        return self.tail.prev

    def add(self, node):
        prev = self.tail.prev
        prev.next = node
        node.prev = prev
        node.next = self.tail
        self.tail.prev = node

    def remove(self, node):
        prev = node.prev
        nxt = node.next
        prev.next = nxt
        nxt.prev = prev

class LRUCache:
    def __init__(self, n):
        self.n = n
        self.dict = {}
        self.list = LinkedList()

    def set(self, key, val):
        if key in self.dict:
            self.list.remove(self.dict[key])
        n = Node(key, val)
        self.list.add(n)
        self.dict[key] = n
        if len(self.dict) > self.n:
            head = self.list.get_head()
            self.list.remove(head)
            del self.dict[head.key]

    def get(self, key):
        if key not in self.dict:
            return None
        n = self.dict[key]
        # bump to the back of the list by removing and re-adding the node
        self.list.remove(n)
        self.list.add(n)
        return n.val
</code></pre>

<p>All operations run in O(1) time.</p>
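<p>For comparison, the standard library's <code>OrderedDict</code> maintains insertion order with O(1) <code>move_to_end</code> and <code>popitem</code>, so the same cache can be sketched much more compactly (an alternative, not the linked-list version above):</p>

```python
from collections import OrderedDict

class LRUCache:
    # same interface, backed by OrderedDict's O(1) move_to_end/popitem
    def __init__(self, n):
        self.n = n
        self.d = OrderedDict()

    def set(self, key, val):
        if key in self.d:
            del self.d[key]
        self.d[key] = val
        if len(self.d) > self.n:
            self.d.popitem(last=False)  # evict the least recently used entry

    def get(self, key):
        if key not in self.d:
            return None
        self.d.move_to_end(key)  # bump to most recently used
        return self.d[key]

cache = LRUCache(2)
cache.set('a', 1)
cache.set('b', 2)
cache.get('a')      # 'a' is now most recently used
cache.set('c', 3)   # evicts 'b'
```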

53
Q

<p>This problem was asked by Apple.</p>

<p>Implement a queue using two stacks. Recall that a queue is a FIFO (first-in, first-out) data structure with the
following methods: <code>enqueue</code>, which inserts an element into the queue, and <code>dequeue</code>, which removes it.</p>

A

<p>We can implement this by noticing that while a stack is LIFO (last in, first out),
if we empty one stack element by element into another stack and then pop from that second stack,
we get FIFO (first in, first out) order.</p>

<p>Consider enqueuing three elements: 1, 2, and then 3:</p>

<pre><code>stack1: [1, 2, 3]
stack2: []
</code></pre>

<p>Then emptying stack1 into stack2:</p>

<pre><code>stack1: []
stack2: [3, 2, 1]
</code></pre>

<p>Then dequeuing three times:</p>

<pre><code>1
2
3
</code></pre>

<p>Which retains the original order. So when enqueuing, we can simply push to our first stack.
When dequeuing, we'll first check our second stack to see if any residual elements are there
from a previous emptying, and if not, we'll empty all of stack one into stack two immediately so that
the order of elements is correct (we shouldn't empty some elements into stack two, pop only some of them,
and then empty some more, for example).</p>

<pre><code>class Queue:
    def __init__(self):
        self.s1 = []
        self.s2 = []

    def enqueue(self, val):
        self.s1.append(val)

    def dequeue(self):
        if self.s2:
            return self.s2.pop()
        if self.s1:
            # empty all of s1 into s2
            while self.s1:
                self.s2.append(self.s1.pop())
            return self.s2.pop()
        return None
</code></pre>
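<p>A brief interleaved usage check of the two-stack queue (self-contained copy of the class above):</p>

```python
class Queue:
    def __init__(self):
        self.s1 = []
        self.s2 = []

    def enqueue(self, val):
        self.s1.append(val)

    def dequeue(self):
        if not self.s2:
            # refill only when s2 is empty, preserving FIFO order
            while self.s1:
                self.s2.append(self.s1.pop())
        return self.s2.pop() if self.s2 else None

q = Queue()
q.enqueue(1)
q.enqueue(2)
first = q.dequeue()   # 1
q.enqueue(3)          # enqueued while s2 still holds residual elements
rest = [q.dequeue(), q.dequeue(), q.dequeue()]
```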

54
Q

<p>This problem was asked by Dropbox.</p>

<p>Sudoku is a puzzle where you're given a partially-filled 9 by 9 grid with digits.
The objective is to fill the grid with the constraint that every row, column, and
box (3 by 3 subgrid) must contain all of the digits from 1 to 9.</p>

<p>Implement an efficient sudoku solver.</p>

A

<p>Trying brute force on a sudoku board will take a really long time: we would need to
try every combination of the digits 1-9 for all the empty squares.</p>

<p>Let's try using backtracking to solve this problem instead. What we can do is try
filling each empty cell one by one, and backtrack once we hit an invalid state.</p>

<p>To do this, we'll need a <code>valid_so_far</code> function that tests the board for its
validity by checking all the rows, columns, and squares. Then we'll backtrack
as usual:</p>

<pre><code>X = None  # Placeholder empty value

def sudoku(board):
    if is_complete(board):
        return board

    r, c = find_first_empty(board)
    # set r, c to a val from 1 to 9
    for i in range(1, 10):
        board[r][c] = i
        if valid_so_far(board):
            result = sudoku(board)
            if is_complete(result):
                return result
    board[r][c] = X
    return board

def is_complete(board):
    return all(all(val is not X for val in row) for row in board)

def find_first_empty(board):
    for i, row in enumerate(board):
        for j, val in enumerate(row):
            if val == X:
                return i, j
    return False

def valid_so_far(board):
    if not rows_valid(board):
        return False
    if not cols_valid(board):
        return False
    if not blocks_valid(board):
        return False
    return True

def rows_valid(board):
    for row in board:
        if duplicates(row):
            return False
    return True

def cols_valid(board):
    for j in range(len(board[0])):
        if duplicates([board[i][j] for i in range(len(board))]):
            return False
    return True

def blocks_valid(board):
    for i in range(0, 9, 3):
        for j in range(0, 9, 3):
            block = []
            for k in range(3):
                for l in range(3):
                    block.append(board[i + k][j + l])
            if duplicates(block):
                return False
    return True

def duplicates(arr):
    c = {}
    for val in arr:
        if val in c and val is not X:
            return True
        c[val] = True
    return False
</code></pre>

55
Q

<p>This problem was asked by Microsoft.</p>

<p>Implement a URL shortener with the following methods:</p>

<ul><li><code>shorten(url)</code>, which shortens the url into a six-character alphanumeric string, such as <code>zLg6wl</code>.</li><li><code>restore(short)</code>, which expands the shortened string into the original url. If no such shortened string exists, return null.</li></ul>

<p>Hint: What if we enter the same URL twice?</p>

A

<p>Clearly, we need a random string generator for this problem. If you're in an interview
and you don't know how to generate a random string by heart, that's fine -- you can
just assume you have access to a function that generates N random characters. In this
case, we'll create a helper function called <code>_generate_short</code> that does it for us.</p>

<p>The idea for this problem is to generate a shortened url and store it in a dictionary
where the shortened url is the key and the actual url is the value. Then, when retrieving
the actual url we can just look it up in the dictionary.</p>

<p>However, we need to be careful not to accidentally overwrite an existing entry
when shortening a url. So what we'll do is continuously generate urls until we find one
that doesn't already exist, and then use that one. We do that in the helper function
<code>_generate_unused_short</code>.</p>

<pre><code>import random
import string

class URLShortener:
    def __init__(self):
        self.short_to_url = {}

    def _generate_short(self):
        return ''.join(random.choice(string.ascii_letters + string.digits) for _ in range(6))

    def _generate_unused_short(self):
        t = self._generate_short()
        while t in self.short_to_url:
            t = self._generate_short()
        return t

    def shorten(self, url):
        short = self._generate_unused_short()
        self.short_to_url[short] = url
        return short

    def restore(self, short):
        return self.short_to_url.get(short, None)
</code></pre>

<p>We can improve this a bit. What if we shorten the same url twice? We could potentially re-use
the existing shortened url, but we don't know how to access it without querying all values
in our dict!</p>

<p>So we can create a second dict that maps urls to shortened urls and update it appropriately.
When we see a url we've seen before, we can then just re-use that shortened url.</p>

<pre><code>import random
import string

class URLShortener:
    def __init__(self):
        self.short_to_url = {}
        self.url_to_short = {}

    def _generate_short(self):
        return ''.join(random.choice(string.ascii_letters + string.digits) for _ in range(6))

    def _generate_unused_short(self):
        t = self._generate_short()
        while t in self.short_to_url:
            t = self._generate_short()
        return t

    def shorten(self, url):
        if url in self.url_to_short:
            return self.url_to_short[url]
        short = self._generate_unused_short()
        self.short_to_url[short] = url
        self.url_to_short[url] = short
        return short

    def restore(self, short):
        return self.short_to_url.get(short, None)
</code></pre>
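<p>Putting it together (the 62-character alphabet gives 62^6 ≈ 5.7 × 10^10 possible codes, so random collisions stay rare at moderate scale):</p>

```python
import random
import string

class URLShortener:
    def __init__(self):
        self.short_to_url = {}
        self.url_to_short = {}

    def _generate_short(self):
        alphabet = string.ascii_letters + string.digits
        return ''.join(random.choice(alphabet) for _ in range(6))

    def _generate_unused_short(self):
        t = self._generate_short()
        while t in self.short_to_url:
            t = self._generate_short()
        return t

    def shorten(self, url):
        if url in self.url_to_short:
            return self.url_to_short[url]  # re-use the existing code
        short = self._generate_unused_short()
        self.short_to_url[short] = url
        self.url_to_short[url] = short
        return short

    def restore(self, short):
        return self.short_to_url.get(short, None)

s = URLShortener()
code = s.shorten('https://example.com')
```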

56
Q

<p>This problem was asked by Google.</p>

<p>Given an undirected graph represented as an adjacency matrix and an integer k,
write a function to determine whether each vertex in the graph can be colored
such that no two adjacent vertices share the same color using at most k colors.</p>

A

<p>We can use backtracking to solve this problem. More specifically, we start at
vertex 0, try out every color from 0 to <code>k - 1</code>, and then see if we can recursively
paint the rest of the graph without any conflicting colors. We'll create a helper
function <code>valid(graph, colors)</code> that looks at the last colored vertex and all its
neighbours to see if it conflicts with any of its neighbours (i.e. has the same
color). We can skip over all uncolored vertices here.</p>

<p>To represent the colors, we can just keep a separate colors list that maps 1-to-1
with the vertices. You can also convert the graph into nodes and add a color property
as well.</p>

<pre><code>def valid(graph, colors):
    last_vertex, last_color = len(colors) - 1, colors[-1]
    colored_neighbors = [i
                         for i, has_edge
                         in enumerate(graph[last_vertex])
                         if has_edge and i < last_vertex]
    for neighbor in colored_neighbors:
        if colors[neighbor] == last_color:
            return False
    return True

def colorable(graph, k, colors=None):
    # Use None instead of a mutable default argument, which would
    # otherwise persist state across separate top-level calls.
    if colors is None:
        colors = []
    if len(colors) == len(graph):
        return True

    for i in range(k):
        colors.append(i)
        if valid(graph, colors):
            if colorable(graph, k, colors):
                return True
        colors.pop()

    return False
</code></pre>

<p>This runs in O(k^N) time and O(N) space, where N is the number of vertices, since
we try up to k colors at each vertex and the colors list and recursion depth grow to N.</p>
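<p>For example, a triangle is 3-colorable but not 2-colorable. A self-contained sketch (with <code>colors</code> passed explicitly to avoid Python's mutable-default-argument pitfall):</p>

```python
def valid(graph, colors):
    # check the most recently colored vertex against its colored neighbors
    last_vertex, last_color = len(colors) - 1, colors[-1]
    for i, has_edge in enumerate(graph[last_vertex]):
        if has_edge and i < last_vertex and colors[i] == last_color:
            return False
    return True

def colorable(graph, k, colors=None):
    if colors is None:
        colors = []
    if len(colors) == len(graph):
        return True
    for c in range(k):
        colors.append(c)
        if valid(graph, colors) and colorable(graph, k, colors):
            return True
        colors.pop()
    return False

triangle = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]
```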

57
Q

<p>This problem was asked by Amazon.</p>

<p>Given a string s and an integer k, break up the string into multiple lines such
that each line has a length of k or less. You must break it up so that words don't
break across lines. Each line has to have the maximum possible amount of words.
If there's no way to break the text up, then return null.</p>

<p>You can assume that there are no spaces at the ends of the string and that there
is exactly one space between each word.</p>

<p>For example, given the string "the quick brown fox jumps over the lazy dog" and
k = 10, you should return: ["the quick", "brown fox", "jumps over", "the lazy", "dog"].
No string in the list has a length of more than 10.</p>

A

<p>We can break up the string greedily. First we'll split <code>s</code> into an array of words.
Then, we can use a buffer for the current line and tentatively add words to it, checking
that the newly extended line still fits within <code>k</code>. If adding the new word would overflow, then
we flush the current buffer into a result array and start a new line with the new word.</p>

<p>Notice that if any word is longer than <code>k</code>, then there's no way to break up the text, so
we should return <code>None</code>. It's helpful to define a helper function that returns the length of a list of words
with spaces added in between.</p>

<p>Finally, we return <code>all</code>, which contains the lines we want.</p>

<pre><code>def break_string(s, k):  # 'break' is a reserved word in Python
    words = s.split()

    if not words:
        return []

    current = []
    all = []

    for word in words:
        if length(current + [word]) <= k:
            current.append(word)
        elif length([word]) > k:
            return None
        else:
            all.append(current)
            current = [word]
    all.append(current)

    return [' '.join(line) for line in all]

def length(words):
    if not words:
        return 0
    return sum(len(word) for word in words) + (len(words) - 1)
</code></pre>
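<p>A runnable sketch against the prompt's example; since <code>break</code> is a reserved word in Python, the function is named <code>break_string</code> here:</p>

```python
def break_string(s, k):
    words = s.split()
    if not words:
        return []
    current, lines = [], []
    for word in words:
        if len(word) > k:
            return None  # a single word that can't fit on any line
        if length(current + [word]) <= k:
            current.append(word)
        else:
            lines.append(' '.join(current))
            current = [word]
    lines.append(' '.join(current))
    return lines

def length(words):
    # total length with one space between each word
    return sum(len(w) for w in words) + max(len(words) - 1, 0)
```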

58
Q

<p>This problem was asked by Amazon.</p>

<p>A sorted array of integers was rotated an unknown number of times.</p>

<p>Given such an array and a target element, find the index of the element in faster than linear time. If the element
doesn't exist in the array, return null.</p>

<p>For example, given the array [13, 18, 25, 2, 8, 10] and the element 8, return 4 (the index of 8 in the array).</p>

<p>You can assume all the integers in the array are unique.</p>

A

<p>We can obviously do this problem in linear time by looking at each element in the array.
However, we need to do it faster than linear time. A big clue should be that the array of
integers was previously sorted, and then rotated. If it were just sorted, we could do a binary
search. However, this array was also rotated, so we can't do a regular binary search. We can
modify it slightly to get to where we want, however.</p>

<p>In our solution, we first find the rotation point using binary search. We do this by:</p>

<ul><li>Checking the midpoint for the rotation point (by comparing it to the previous number and
seeing if it's larger)</li><li>Moving our check up or down the array:<ul><li>If the number we're looking at is larger than the first item in the array, then the rotation must
occur later, so add <code>dist</code></li><li>If not, then it must occur before, so subtract <code>dist</code></li></ul></li><li>And then update dist by dividing it by 2 and taking its floor (so it's proper binary search).</li></ul>

<p>Then, once we have the rotation point, we can do binary search as usual by remembering
to offset the correct amount.</p>

<p>The code would look like this:</p>

<pre><code>def shifted_array_search(lst, num):
    # First, find where the breaking point is in the shifted array
    i = len(lst) // 2
    dist = i // 2
    while True:
        if lst[0] > lst[i] and lst[i - 1] > lst[i]:
            break
        elif dist == 0:
            break
        elif lst[0] <= lst[i]:
            i = i + dist
        elif lst[i - 1] <= lst[i]:
            i = i - dist
        else:
            break
        dist = dist // 2

    # Now that we have the bottom, we can do binary search as usual,
    # wrapping around the rotation.
    low = i
    high = i - 1
    dist = len(lst) // 2
    while True:
        if dist == 0:
            return None

        guess_ind = (low + dist) % len(lst)
        guess = lst[guess_ind]
        if guess == num:
            return guess_ind

        if guess < num:
            low = (low + dist) % len(lst)
        if guess > num:
            high = (len(lst) + high - dist) % len(lst)

        dist = dist // 2
</code></pre>

<p>This solution runs in O(log n). However, this is definitely not the only solution!
There are many other possible ways to implement this, but as long as you have the
idea of doing binary search, you've got it.</p>
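<p>Another common approach folds everything into a single binary search: at every step, at least one half of the current range is sorted, so we can check whether the target lies inside that sorted half and narrow accordingly. A sketch of that alternative:</p>

```python
def rotated_search(lst, num):
    lo, hi = 0, len(lst) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if lst[mid] == num:
            return mid
        if lst[lo] <= lst[mid]:
            # the left half lst[lo..mid] is sorted
            if lst[lo] <= num < lst[mid]:
                hi = mid - 1
            else:
                lo = mid + 1
        else:
            # the right half lst[mid..hi] is sorted
            if lst[mid] < num <= lst[hi]:
                lo = mid + 1
            else:
                hi = mid - 1
    return None
```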

59
Q

<p>This problem was asked by Google.</p>

<p>Implement a file syncing algorithm for two computers over a low-bandwidth network.
What if we know the files in the two computers are mostly the same?</p>

A

<p>If the files on the two computers are radically different, then we have basically no choice:
we must make the sender send over the whole file. We can compress it to save some space,
but that's about it.</p>

<p>We can do a bit more if the files are similar. Ideally, we would like to just send over
the deltas, i.e. the differences between the two files. However, the problem here is we
don't know what's different, so we don't know what to send! So we're back to sending over
the whole file.</p>

<p>We know that we can definitely send over deltas -- after all, it's the basis of utilities
like <code>rsync</code>, and is also widely used for patching games and software! How is it done?</p>

<p>The basic idea is to have the receiver compute a small checksum or fingerprint for non-overlapping blocks
of the file it has, and send that over. Then, the sender can just verify, using the same process, which
blocks are different, and then send only the data required for those. Now, if the files are identical,
we no longer need to send the whole file! We only need to send the fingerprints for the file over, which
should be tiny.</p>

<p>Sounds great! But there's one problem: what if the files are of different lengths? Or worse: what if data was inserted near the
<em>beginning</em> of one file? Then every block boundary shifts, the checksums no longer line up, and we'd need to send
over the whole file again, even though only one section changed!</p>

<p>The solution to this problem is to change how we match blocks. After the receiver sends all its checksums,
the sender computes the checksum at every possible offset of its own file, looking for blocks that match. When a block
matches, the sender sends the literal data between the previous match and the current block, along with a signal
identifying which block matched.</p>

<p>P.S. This algorithm is how the <code>rsync</code> utility is implemented, and was first described <a>here</a>.
It's surprisingly short and easy to read! A few things in the paper that aren't mentioned here:</p>

<ul><li>Rolling checksums for efficiently computing checksums at every possible offset.</li><li>Using a weak (rolling) checksum and a strong one for efficiency</li><li>Storing the checksums in a hashtable for easier lookup</li></ul>
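<p>The weak rolling checksum can be sketched roughly as follows. This is a minimal illustration in the spirit of the rsync paper's checksum, not rsync's actual code; the function names and the 16-bit modulus are assumptions for the sketch:</p>

```python
M = 2 ** 16  # modulus; the rsync paper uses 2^16

def weak_checksum(block):
    # block is a bytes object. a is the plain sum of bytes; b weights
    # each byte by its distance from the end of the block.
    a = sum(block) % M
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % M
    return (b << 16) | a

def roll(checksum, old_byte, new_byte, block_len):
    # Slide the window one byte to the right in O(1), instead of
    # recomputing the checksum over the whole block.
    a = checksum & 0xFFFF
    b = checksum >> 16
    a = (a - old_byte + new_byte) % M
    b = (b - block_len * old_byte + a) % M
    return (b << 16) | a
```

<p>Sliding the window with <code>roll</code> agrees with recomputing <code>weak_checksum</code> from scratch, which is exactly what lets the sender cheaply try every offset.</p>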

60
Q

<p>This problem was asked by Facebook.</p>

<p>Given a multiset of integers, return whether it can be partitioned into two subsets
whose sums are the same.</p>

<p>For example, given the multiset <code>{15, 5, 20, 10, 35, 15, 10}</code>, it would return true, since we
can split it up into <code>{15, 5, 10, 15, 10}</code> and <code>{20, 35},</code> which both add up to <code>55</code>.</p>

<p>Given the multiset <code>{15, 5, 20, 10, 35}</code>, it would return false, since we can't split
it up into two subsets that add up to the same sum.</p>

A

<p>The naive, brute force solution would be to try every combination of two subsets
and check their sums. We could do this by trying to generate each subset of our
input set, and then checking the sum of that subset with the sum of everything
not in the subset.</p>

<p>To speed this up, notice that we really only need to find a subset that adds up
to half of the total sum of all the integers: if one subset adds up to half of the
sum, the remaining elements necessarily add up to the other half.</p>

<p>So, we can generate the powerset of our set and check if any of them sum to <code>k / 2</code>,
where <code>k</code> is the sum of the set. We know immediately that if <code>k</code> is odd, then we can't
partition the sets, so we can immediately return False.</p>

<p>We did powerset in Daily Coding Problem #37, so let's reuse that:</p>

<pre><code>def power_set(s):
    if not s:
        return [[]]
    result = power_set(s[1:])
    return result + [subset + [s[0]] for subset in result]
</code></pre>

<p>Then partition will just be:</p>

<pre><code>def partition(s):
    k = sum(s)
    if k % 2 != 0:
        return False
    powerset = power_set(s)
    for subset in powerset:
        if sum(subset) == k / 2:
            return True
    return False
</code></pre>

<p>This will run in O(N * 2^N) time though, since we must generate every subset and sum them up. Can we make this any faster?</p>

<p>Notice that we've reduced the problem into finding a subset of integers that add up to <code>k / 2</code>, which is exactly the same
Daily Coding Problem #42: finding a subset of integers that sum up to <code>k</code> (a different k).</p>

<p>Recall that we solved that problem by creating a matrix of size <code>len(nums) + 1</code> by <code>k + 1</code>, and then using dynamic
programming to fill up the matrix. We can do something similar here, except we'll use <code>k / 2</code> as our target.</p>

<p>Each entry <code>A[i][j]</code> in our matrix will represent whether or not we can make the integer <code>i</code> with the elements
of our set from 0 to j. So we'll do the following:</p>

<ul><li>Create a matrix of size <code>k / 2 + 1</code> by <code>len(s) + 1</code> of booleans (all initialized to False).</li><li>Initialize the top row to True, since we can make 0 with anything (by not picking anything)</li><li>Initialize the left column to False (except for the one in the first row), since we can't make anything other than 0 with nothing</li><li>Iterate over the matrix from top-to-bottom, then left-to-right:<ul><li>At each index <code>A[i][j]</code>, look at <code>A[i][j - 1]</code> and <code>A[i - s[j - 1]][j - 1]</code>, and set it to True if either is True.</li></ul></li><li>Return the value at the bottom-right of the matrix.</li></ul>

<pre><code>def partition(s):
    k = sum(s)
    if k % 2 != 0:
        return False
    k_over_two = k // 2

    A = [[False for _ in range(len(s) + 1)] for _ in range(k_over_two + 1)]

    for j in range(len(s) + 1):
        A[0][j] = True

    for i in range(1, k_over_two + 1):
        A[i][0] = False

    for i in range(1, k_over_two + 1):
        for j in range(1, len(s) + 1):
            using_last = i - s[j - 1]
            if using_last >= 0:
                A[i][j] = A[i][j - 1] or A[using_last][j - 1]
            else:
                A[i][j] = A[i][j - 1]
    return A[-1][-1]
</code></pre>

<p>This will take O(K * N) time and space, just like in the knapsack problem.</p>
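<p>As an aside, the O(K * N) space can be cut down to O(K): since each column only depends on the previous one, we can keep a single boolean array of achievable sums and update it right-to-left. A sketch (with <code>can_partition</code> as an illustrative name, assuming non-negative integers):</p>

```python
def can_partition(s):
    total = sum(s)
    if total % 2 != 0:
        return False
    target = total // 2

    # reachable[i] is True if some subset of the elements seen so far sums to i.
    reachable = [False] * (target + 1)
    reachable[0] = True
    for x in s:
        # Iterate right-to-left so each element is used at most once.
        for i in range(target, x - 1, -1):
            if reachable[i - x]:
                reachable[i] = True
    return reachable[target]
```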

61
Q

<p>This problem was asked by Google.</p>

<p>Implement integer exponentiation. That is, implement the <code>pow(x, y)</code> function, where <code>x</code> and <code>y</code> are integers and returns <code>x^y</code>.</p>

<p>Do this faster than the naive method of repeated multiplication.</p>

<p>For example, <code>pow(2, 10)</code> should return 1024.</p>

A

<p>Implementing exponentiation naively is quite straightforward. We can either do it iteratively:</p>

<pre><code>def power(x, y):
    if y < 0:
        base = 1 / x
        exponent = -y
    else:
        base = x
        exponent = y
    result = 1
    for _ in range(exponent):
        result *= base
    return result
</code></pre>

<p>or recursively:</p>

<pre><code>def power(x, y):
    if y < 0:
        return power(1 / x, -y)
    elif y == 0:
        return 1
    else:
        return x * power(x, y - 1)
</code></pre>

<p>Just remember to deal with negative exponents!</p>

<p>However, we need to do better than naive repeated multiplication. How can we do this?</p>

<p>Notice that the main bottleneck in performance here is doing multiplications <code>y</code> times. Since the process of multiplication takes about the same amount of time
regardless of the actual sizes of the numbers, we should look at trying to move some of the work from the exponent to the base.</p>

<p>We can rewrite <code>x^y</code> as the following.</p>

<ul><li>If <code>y</code> is even, then <code>x^y</code> = <code>(x^2) ^ (y/2)</code></li><li>If <code>y</code> is odd, then <code>x^y</code> = <code>x * ((x^2) ^ ((y - 1) / 2))</code></li></ul>

<p>Now, by squaring the base, we have half as many multiplications to do! Let's go through an example. Say we want to compute <code>2^20</code>. We can then do it like this:</p>

<ul><li><code>2^20 = 4^10 = 16^5 = 16 * (256)^2 = 16 * 256 * 256</code></li></ul>

<p>We've reduced the number of multiplications we need to do from 20 to 4. Let's code it up.</p>

<p>Again, we can do this iteratively:</p>

<pre><code>def power(x, y):
    if y == 0:
        return 1
    if y < 0:
        base = 1 / x
        exponent = -y
    else:
        base = x
        exponent = y
    coeff = 1
    while exponent > 1:
        if exponent % 2 == 0:
            base *= base
            exponent = exponent // 2
        else:
            coeff *= base
            base *= base
            exponent = (exponent - 1) // 2
    return coeff * base
</code></pre>

<p>Or recursively, although it takes up more space on the call stack:</p>

<pre><code>def power(x, y):
    if y < 0:
        return power(1 / x, -y)
    elif y == 0:
        return 1
    elif y == 1:
        return x
    elif y % 2 == 0:
        return power(x * x, y // 2)
    else: # y is odd
        return x * power(x * x, y // 2)
</code></pre>

<p>Since we're nearly halving the number of multiplications we need to do at each step, this will run in O(log y) time.</p>
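<p>The same squaring idea can also be phrased as a loop over the bits of the exponent: square the base at each step, and fold it into the result whenever the current bit is set. A sketch, with <code>fast_pow</code> as an illustrative name:</p>

```python
def fast_pow(x, y):
    if y < 0:
        return fast_pow(1 / x, -y)
    result = 1
    while y > 0:
        if y & 1:          # lowest bit set: fold the current base into the result
            result *= x
        x *= x             # square the base for the next bit
        y >>= 1
    return result
```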

62
Q

<p>This problem was asked by Facebook.</p>

<p>There is an N by M matrix of zeroes. Given N and M, write a function to count the number of ways of starting at the top-left corner and getting to the bottom-right corner. You can only move right or down.</p>

<p>For example, given a 2 by 2 matrix, you should return 2, since there are two ways to get to the bottom-right:</p>

<ul><li>Right, then down</li><li>Down, then right</li></ul>

<p>Given a 5 by 5 matrix, there are 70 ways to get to the bottom-right.</p>

A

<p>Notice that, to get to any cell, we only have two ways: either directly from above,
or from the left, unless we can't go up or left anymore, in which case there's only
one way. This leads to the following recurrence:</p>

<ul><li>If either N or M is 1, then return 1</li><li>Otherwise, <code>f(n, m) = f(n - 1, m) + f(n, m - 1)</code></li></ul>

<p>This is very similar to the staircase problem from Daily Coding Problem #12.</p>

<p>The recursive solution would look like this:</p>

<pre><code>def num_ways(n, m):
    if n == 1 or m == 1:
        return 1
    return num_ways(n - 1, m) + num_ways(n, m - 1)
</code></pre>

<p>However, just like in the staircase problem (or fibonacci), we will have a lot of repeated
subcomputations. So, let's use bottom-up dynamic programming to store those results.</p>

<p>We'll initialize an N by M matrix A, and each entry <code>A[i][j]</code>, will contain the number of
ways we can get to that entry from the top-left. Then, once we've filled up the matrix
using our recurrence (by checking directly above or directly left), we can just look at
the bottom-right value to get our answer.</p>

<pre><code>def num_ways(n, m):
    A = [[0 for _ in range(m)] for _ in range(n)]
    for i in range(n):
        A[i][0] = 1
    for j in range(m):
        A[0][j] = 1
    for i in range(1, n):
        for j in range(1, m):
            A[i][j] = A[i - 1][j] + A[i][j - 1]
    return A[-1][-1]
</code></pre>

<p>This runs in O(N * M) time and space.</p>
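<p>There's also a closed form worth knowing: every path consists of exactly <code>n - 1</code> downs and <code>m - 1</code> rights in some order, so the count is the binomial coefficient <code>C(n + m - 2, n - 1)</code>. A quick sketch using the standard library (<code>math.comb</code> requires Python 3.8+):</p>

```python
from math import comb

def num_ways_closed_form(n, m):
    # Choose which n - 1 of the n + m - 2 total moves are "down".
    return comb(n + m - 2, n - 1)
```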

63
Q

<p>This problem was asked by Microsoft.</p>

<p>Given a 2D matrix of characters and a target word, write a function that returns
whether the word can be found in the matrix by going left-to-right, or up-to-down.</p>

<p>For example, given the following matrix:</p>

<pre><code>[['F', 'A', 'C', 'I'],
['O', 'B', 'Q', 'P'],
['A', 'N', 'O', 'B'],
['M', 'A', 'S', 'S']]
</code></pre>

<p>and the target word 'FOAM', you should return true, since it's the leftmost column.
Similarly, given the target word 'MASS', you should return true, since it's the last row.</p>

A

<p>This problem should be quite straightforward: we can go through each entry in the array,
try to create the word going right and down, and check if the word matches our word.
To make bounds checking simple, we'll just try to grab everything from where we're looking
at till the end, and then use the slice operator to just get what we want.</p>

<pre><code>def build_word_right(matrix, r, c, length):
    return ''.join([matrix[r][i] for i in range(c, len(matrix[0]))])[:length]

def build_word_down(matrix, r, c, length):
    return ''.join([matrix[i][c] for i in range(r, len(matrix))])[:length]

def word_search(matrix, word):
    for r in range(len(matrix)):
        for c in range(len(matrix[0])):
            word_right = build_word_right(matrix, r, c, len(word))
            word_down = build_word_down(matrix, r, c, len(word))
            if word in (word_right, word_down):
                return True
    return False
</code></pre>

<p>However, if the matrix was really big, then we would be grabbing the whole row or column
just to shorten it. We can improve our <code>build_word_right</code> and <code>build_word_down</code> functions
a bit by just taking what we need, which is whichever is shorter between the length of
the word and the end of the row or column:</p>

<pre><code>def build_word_right(matrix, r, c, length):
    row_len = len(matrix[0])
    return ''.join([matrix[r][i] for i in range(c, min(row_len, c + length))])

def build_word_down(matrix, r, c, length):
    col_len = len(matrix)
    return ''.join([matrix[i][c] for i in range(r, min(col_len, r + length))])
</code></pre>

<p>However, suppose the words themselves were really long. If, while checking the current
row or column, we notice that the first few letters already differ, we can quit that search early.</p>

<p>The Python built-in function <code>zip</code> is very handy:</p>

<pre><code>def check_word_right(matrix, r, c, word):
    word_len = len(word)
    row_len = len(matrix[0])
    if word_len > row_len - c:
        return False
    for c1, c2 in zip(word, (matrix[r][i] for i in range(c, row_len))):
        if c1 != c2:
            return False
    return True

def check_word_down(matrix, r, c, word):
    word_len = len(word)
    col_len = len(matrix)
    if word_len > col_len - r:
        return False
    for c1, c2 in zip(word, (matrix[i][c] for i in range(r, col_len))):
        if c1 != c2:
            return False
    return True

def word_search(matrix, word):
    for r, row in enumerate(matrix):
        for c, val in enumerate(row):
            if check_word_right(matrix, r, c, word):
                return True
            if check_word_down(matrix, r, c, word):
                return True
    return False
</code></pre>
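<p>For reference, a much shorter (if less instructive) formulation builds each row and each column as a single string and falls back on Python's substring check; <code>zip(*matrix)</code> transposes the matrix. This is a sketch of an alternative, not the approach above:</p>

```python
def word_search_substrings(matrix, word):
    # Rows read left-to-right and columns read top-to-bottom, each as one string.
    rows = [''.join(row) for row in matrix]
    cols = [''.join(col) for col in zip(*matrix)]
    return any(word in line for line in rows + cols)
```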

64
Q

<p>This problem was asked by Google.</p>

<p>A knight's tour is a sequence of moves by a knight on a chessboard such that all squares are visited once.</p>

<p>Given N, write a function to return the number of knight's tours on an N by N chessboard.</p>

A

<p>The brute force solution here would be to try every possible sequence of squares and check
whether each one is a valid tour. That would be computationally infeasible: with N * N squares,
there are (N^2)! possible orderings.</p>

<p>We can improve the performance on this using backtracking, similar to the N queen problem
(#38) or the flight itinerary problem (#41). The basic idea is this:</p>

<ul><li>For every possible square, initialize a knight there, and then:</li><li>Try every valid move from that square.</li><li>Once we've hit every single square, we can add to our count.</li></ul>

<p>We'll represent the tour as just a sequence of tuples <code>(r, c)</code>. To speed things up and
to avoid having to look at the whole tour to check whether a space has been used before,
we can create an N by N board to mark whether we've seen it already.</p>

<pre><code>def is_valid_move(board, move, n):
    r, c = move
    return 0 <= r < n and 0 <= c < n and board[r][c] is None

def valid_moves(board, r, c, n):
    deltas = [
        (2, 1),
        (1, 2),
        (1, -2),
        (-2, 1),
        (-1, 2),
        (2, -1),
        (-1, -2),
        (-2, -1),
    ]
    all_moves = [(r + r_delta, c + c_delta) for r_delta, c_delta in deltas]
    return [move for move in all_moves if is_valid_move(board, move, n)]

def knights_tours(n):
    count = 0
    for i in range(n):
        for j in range(n):
            board = [[None for _ in range(n)] for _ in range(n)]
            board[i][j] = 0
            count += knights_tours_helper(board, [(i, j)], n)
    return count

def knights_tours_helper(board, tour, n):
    if len(tour) == n * n:
        return 1
    else:
        count = 0
        last_r, last_c = tour[-1]
        for r, c in valid_moves(board, last_r, last_c, n):
            tour.append((r, c))
            board[r][c] = len(tour)
            count += knights_tours_helper(board, tour, n)
            tour.pop()
            board[r][c] = None
        return count

<p>This takes O(N^2) space and potentially O(8^(N^2)) time, since at each step we have
up to 8 moves to check, and we have to do this for each square.</p>

65
Q

<p>This problem was asked by Amazon.</p>

<p>Given a N by M matrix of numbers, print out the matrix in a clockwise spiral.</p>

<p>For example, given the following matrix:</p>

<pre><code>[[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20]]
</code></pre>

<p>You should print out the following:</p>

<pre><code>1
2
3
4
5
10
15
20
19
18
17
16
11
6
7
8
9
14
13
12
</code></pre>

A

<p>As you might imagine, there are many possible solutions for this problem. Ours involves
keeping track of our current position and direction. As we move along and print each
value, we set it to None. Then once we've either hit the edge or another None value
(indicating we've seen it before), we turn clockwise (right to down, down to left, and
so on) and keep going.</p>

<p>We use an enum to define the directions, and some helper functions <code>next_direction</code>,
<code>next_position</code>, and <code>should_change_direction</code> to help us lay out the code cleanly.</p>

<pre><code>UP = 0
RIGHT = 1
DOWN = 2
LEFT = 3

DIRECTIONS = [RIGHT, DOWN, LEFT, UP]

def next_direction(direction):
    if direction == RIGHT:
        return DOWN
    elif direction == DOWN:
        return LEFT
    elif direction == LEFT:
        return UP
    elif direction == UP:
        return RIGHT

def next_position(position, direction):
    if direction == RIGHT:
        return (position[0], position[1] + 1)
    elif direction == DOWN:
        return (position[0] + 1, position[1])
    elif direction == LEFT:
        return (position[0], position[1] - 1)
    elif direction == UP:
        return (position[0] - 1, position[1])

def should_change_direction(M, r, c):
    in_bounds_r = 0 <= r < len(M)
    in_bounds_c = 0 <= c < len(M[0])
    return not in_bounds_r or not in_bounds_c or M[r][c] is None

def matrix_spiral_print(M):
    remaining = len(M) * len(M[0])
    current_direction = RIGHT
    current_position = (0, 0)
    while remaining > 0:
        r, c = current_position
        print(M[r][c])
        M[r][c] = None
        remaining -= 1

        possible_next_position = next_position(current_position, current_direction)
        if should_change_direction(M, possible_next_position[0], possible_next_position[1]):
            current_direction = next_direction(current_direction)
            current_position = next_position(current_position, current_direction)
        else:
            current_position = possible_next_position

<p>This takes O(M * N) time.</p>
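<p>A neat alternative formulation, sketched here as a function that returns the values rather than printing them: peel off the top row, then rotate the rest of the matrix counterclockwise with <code>zip</code> so the next side becomes the new top row, and repeat. (Note this consumes the input matrix.)</p>

```python
def spiral_order(M):
    # Peel off the top row, then rotate the remainder counterclockwise
    # (transpose with zip, then reverse the rows) and repeat.
    out = []
    while M:
        out.extend(M.pop(0))
        M = [list(row) for row in zip(*M)][::-1]
    return out
```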

66
Q

<p>This problem was asked by Square.</p>

<p>Assume you have access to a function <code>toss_biased()</code> which returns
0 or 1 with a probability that's not 50-50 (but also not 0-100 or 100-0).
You do not know the bias of the coin.</p>

<p>Write a function to simulate an unbiased coin toss.</p>

A

<p>Since we don't know the bias of the coin, it sounds like we need to toss
the coin more than once and do some calculations to find something with
a 50-50 chance of occurring. Let's draw out the probability chart for tossing our coin
twice. Let's say the probability of getting heads is <code>p</code>, so tails is <code>1 - p</code>:</p>

<ul><li>HH: p * p</li><li>HT: p * (1 - p)</li><li>TH: (1 - p) * p</li><li>TT: (1 - p) * (1 - p)</li></ul>

<p>Since multiplication is commutative, flipping heads and then tails has exactly the
same probability as flipping tails and then heads! Our strategy then looks like this:</p>

<ul><li>Toss our coin twice.</li><li>If we get heads and then tails, return heads. (It doesn't really matter which, as long as the other ordering returns the opposite.)</li><li>If we get tails and then heads, return tails.</li><li>Otherwise, if both tosses come out the same, re-toss.</li></ul>

<pre><code>from random import random

BIAS = 0.66

def toss_biased():
    return random() > BIAS

def toss_fair():
    t1, t2 = toss_biased(), toss_biased()
    if t1 and not t2:
        return True
    elif not t1 and t2:
        return False
    else:
        return toss_fair()
</code></pre>

<p>Testing this seems to bear it out:</p>

<pre><code>from collections import defaultdict
c = defaultdict(int)
for i in range(1000000):
    c[toss_fair()] += 1
print(c)
</code></pre>

<pre><code>defaultdict(&lt;class 'int'&gt;, {False: 500104, True: 499896})
</code></pre>

<p>Because it's possible (though increasingly unlikely) to keep tossing the same
two values forever, this function is not guaranteed to terminate.</p>
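<p>We can quantify that: a pair of tosses produces HT or TH with probability <code>2 * p * (1 - p)</code>, so the number of pairs needed is geometrically distributed. A quick sketch:</p>

```python
def expected_pairs(p):
    # Each two-toss round "succeeds" (yields HT or TH) with probability
    # 2 * p * (1 - p); the expected number of rounds is its reciprocal.
    return 1 / (2 * p * (1 - p))
```

<p>For a fair coin that's 2 pairs (4 tosses) on average, and the more biased the coin, the more re-tosses we should expect.</p>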

67
Q

<p>This problem was asked by Google.</p>

<p>Implement an LFU (Least Frequently Used) cache. It should be able to be initialized with a cache size <code>n</code>,
and contain the following methods:</p>

<ul><li><code>set(key, value)</code>: sets <code>key</code> to <code>value</code>. If there are already <code>n</code> items in the cache and we are adding a new
item, then it should also remove the least frequently used item. If there is a tie, then the least recently
used key should be removed.</li><li><code>get(key)</code>: gets the value at <code>key</code>. If no such key exists, return null.</li></ul>

<p>Each operation should run in O(1) time.</p>

A

<p>This problem is similar to the LRU cache problem (Problem #52), but requires a different
perspective. In that problem, we used a doubly linked list of nodes and a hash table that
mapped keys to the nodes. When we evicted from the cache, we just had to look at the head
of the linked list.</p>

<p>In this solution, we keep two dictionaries: one mapping from keys to values (and their frequencies), and another
mapping from frequency counts to a deque of keys.</p>

<p>When we set a key, we first check if we need to evict another key. If we do,
then we'll look at the entry in our frequency map with the lowest frequency
and pop from the left (since we'll be appending, the left will be the least
recently used entry). Then we can add our mapping to the dicts: we'll add
our key and value (along with a frequency of one) to our value mapping,
and also to our frequency mapping at key 1.</p>

<p>If we're updating a key (the key already exists), then it's a different story.
Here, we will need to basically only update the value mapping by setting a new
value and increment the frequency. For the frequency mapping, we'll need to
move our key to the next frequency bucket, creating it if necessary via <code>defaultdict</code>.</p>

<p>Getting a key has similar logic to updating it, without actually updating it.</p>

<pre><code>from collections import defaultdict
from collections import deque

class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.val_map = {}
        self.freq_map = defaultdict(deque)
        self.min_freq = 0

    def get(self, key):
        # If key doesn't exist, return None.
        if key not in self.val_map:
            return None

        # First, we look up the val and frequency in our val_map.
        val, freq = self.val_map[key]

        # We need to then increment the frequency of our key,
        # so we'll take it out of the current bucket and put it
        # into the next frequency's bucket. If it was the last thing
        # in the current bucket and the lowest frequency (e.g. 1 to 2),
        # then we'll make sure to update our min_freq so we can keep
        # track of what to evict.
        self.freq_map[freq].remove(key)
        if not self.freq_map[freq]:
            del self.freq_map[freq]
            if self.min_freq == freq:
                self.min_freq += 1

        # Update our dicts as usual.
        self.val_map[key] = (val, freq + 1)
        self.freq_map[freq + 1].append(key)
        return val

    def set(self, key, val):
        if self.capacity == 0:
            return

        if key not in self.val_map:
            # Evict the least frequently used key by popping left
            # from the lowest-frequency bucket, since it's ordered by
            # time (because we use append).
            if len(self.val_map) >= self.capacity:
                to_evict = self.freq_map[self.min_freq].popleft()
                if not self.freq_map[self.min_freq]:
                    del self.freq_map[self.min_freq]
                del self.val_map[to_evict]

            # Add our key to val_map and freq_map
            self.val_map[key] = (val, 1)
            self.freq_map[1].append(key)
            self.min_freq = 1
        else:
            # Update the entry and increase the frequency of the key,
            # updating the minimum frequency if necessary.
            _, freq = self.val_map[key]
            self.freq_map[freq].remove(key)
            if not self.freq_map[freq]:
                del self.freq_map[freq]
                if freq == self.min_freq:
                    self.min_freq += 1
            self.val_map[key] = (val, freq + 1)
            self.freq_map[freq + 1].append(key)
</code></pre>

<p>These operations run in O(1) time, with one caveat: <code>deque.remove</code> is linear in the size of the bucket. A production implementation would use a doubly linked list (or an <code>OrderedDict</code>) per frequency bucket to make removal truly O(1).</p>

68
Q

<p>This problem was asked by Google.</p>

<p>On our special chessboard, two bishops attack each other if they share the same diagonal. This includes bishops that have another bishop located between them, i.e. bishops can attack through pieces.</p>

<p>You are given N bishops, represented as (row, column) tuples on a M by M chessboard.
Write a function to count the number of pairs of bishops that attack each other. The ordering of the pair doesn't matter: (1, 2) is considered the same as (2, 1).</p>

<p>For example, given M = 5 and the list of bishops:</p>

<ul><li>(0, 0)</li><li>(1, 2)</li><li>(2, 2)</li><li>(4, 0)</li></ul>

<p>The board would look like this:</p>

<pre><code>[b 0 0 0 0]
[0 0 b 0 0]
[0 0 b 0 0]
[0 0 0 0 0]
[b 0 0 0 0]
</code></pre>

<p>You should return 2, since bishops 1 and 3 attack each other, as well as bishops 3 and 4.</p>

A

<p>One approach would be to iterate through each bishop and find all the other
attacking bishops, incrementing the count when we find a pair.</p>

<p>We can define a helper function <code>is_attacking</code> that returns whether or not two
bishops are attacking each other:</p>

<pre><code>def is_attacking(bishop0, bishop1):
    r0, c0 = bishop0
    r1, c1 = bishop1
    return abs(r1 - r0) == abs(c1 - c0)

def pairs(bishops, m):
    count = 0
    for i, bishop0 in enumerate(bishops):
        for bishop1 in bishops[i + 1:]:
            count += is_attacking(bishop0, bishop1)
    return count
</code></pre>

<p>This would take O(N^2). Can we make this any faster?</p>

<p>If we know how many bishops are in each diagonal, then we can know how many
pairs are attacking: for each diagonal, it's the number of bishops choose 2,
since each bishop makes a pair with every other bishop.</p>

<p>So, if we go through each bishop and bucket them into each separate diagonal,
we can just run (b choose 2) on the number of bishops on each diagonal and
sum them up. Recall that (n choose 2) is equivalent to <code>n * (n - 1) / 2</code>.</p>

<p>Each <em>bucket</em> is represented by a tuple <code>top_left_row, top_left_column, direction</code>.
(Or right row if it's the other way.) Then we can quickly figure out which bucket
a bishop belongs to by moving up each diagonal until we hit a border.</p>

<pre><code>from collections import defaultdict

TOP_LEFT_TO_BOTTOM_RIGHT = 0
TOP_RIGHT_TO_BOTTOM_LEFT = 1

def combos(num):
    return num * (num - 1) // 2

def pairs(bishops, m):
    counts = defaultdict(int)
    for r, c in bishops:
        top_lr, top_lc = (r - min(r, c), c - min(r, c))
        top_rr, top_rc = (r - min(r, m - c), c + min(r, m - c))

        counts[top_lr, top_lc, TOP_LEFT_TO_BOTTOM_RIGHT] += 1
        counts[top_rr, top_rc, TOP_RIGHT_TO_BOTTOM_LEFT] += 1
    return sum(combos(c) for c in counts.values())
</code></pre>

<p>This runs in O(N) time and space.</p>
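<p>An equivalent and slightly simpler bucketing avoids computing diagonal endpoints entirely: cells on the same "/" diagonal share the value <code>r + c</code>, and cells on the same "\" diagonal share <code>r - c</code>. A sketch, with <code>attacking_pairs</code> as an illustrative name:</p>

```python
from collections import defaultdict

def attacking_pairs(bishops):
    diag1, diag2 = defaultdict(int), defaultdict(int)
    for r, c in bishops:
        diag1[r + c] += 1  # "/" diagonals
        diag2[r - c] += 1  # "\" diagonals

    def combos(n):
        # n choose 2: number of attacking pairs on one diagonal.
        return n * (n - 1) // 2

    return (sum(combos(n) for n in diag1.values()) +
            sum(combos(n) for n in diag2.values()))
```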

69
Q

<p>This problem was asked by Facebook.</p>

<p>Given a list of integers, return the largest product that can be made by multiplying any three integers.</p>

<p>For example, if the list is <code>[-10, -10, 5, 2]</code>, we should return <code>500</code>, since that's <code>-10 * -10 * 5</code>.</p>

<p>You can assume the list has at least three integers.</p>

A

<p>If all the integers were positive, then we would simply need to take the three largest numbers of the array.
Then, we can just sort it and return the last three elements.</p>

<p>However, we need to account for negative numbers in the array. If the largest product that can be made
includes a negative number, we would need to have two so as to cancel out the negatives. So, we can
take the larger of:</p>

<ul><li>The three largest numbers</li><li>The two smallest (most negative) numbers, and the largest number</li></ul>

<pre><code>def maximum_product_of_three(lst):
    lst.sort()
    third_largest, second_largest, first_largest = lst[-3], lst[-2], lst[-1]
    first_smallest, second_smallest = lst[0], lst[1]
    return max(third_largest * second_largest * first_largest,
               first_largest * first_smallest * second_smallest)
</code></pre>

<p>This runs in O(N log N) time since we have to sort the input array.</p>

<p>We could also do this in O(N) time by using select or looking for the largest elements manually.</p>

<pre><code>from math import inf

def maximum_product_of_three(lst):
    max1, max2, max3, min1, min2 = -inf, -inf, -inf, inf, inf

    for x in lst:
        if x > max1:
            max3 = max2
            max2 = max1
            max1 = x
        elif x > max2:
            max3 = max2
            max2 = x
        elif x > max3:
            max3 = x

        if x < min1:
            min2 = min1
            min1 = x
        elif x < min2:
            min2 = x

    return max(max1 * max2 * max3, max1 * min1 * min2)
</code></pre>
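<p>If we'd rather not hand-roll the bookkeeping, the standard library's <code>heapq.nlargest</code> and <code>heapq.nsmallest</code> give the same result in a few lines (they run in O(N log k), which for constant k is effectively linear). A sketch:</p>

```python
import heapq

def maximum_product_of_three(lst):
    largest = heapq.nlargest(3, lst)    # descending order
    smallest = heapq.nsmallest(2, lst)  # ascending order
    return max(largest[0] * largest[1] * largest[2],
               largest[0] * smallest[0] * smallest[1])
```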

70
Q

<p>This problem was asked by Microsoft.</p>

<p>A number is considered perfect if its digits sum up to exactly 10.</p>

<p>Given a positive integer <code>n</code>, return the <code>n</code>-th perfect number.</p>

<p>For example, given 1, you should return 19. Given 2, you should return 28.</p>

A

<p>A simple approach is to iterate over the integers, keeping count of how many
perfect numbers we've seen so far, until we reach the <code>n</code>-th one:</p>

<pre><code>def sum_of_digits(n):
    current_sum = 0
    while n > 0:
        current_sum += n % 10
        n = n // 10
    return current_sum

def perfect(n):
    i, current = 0, 0
    while current < n:
        i += 1
        if sum_of_digits(i) == 10:
            current += 1
    return i
</code></pre>

<p>This will run in O(N) time.</p>

71
Q

<p>This problem was asked by Two Sigma.</p>

<p>Using a function <code>rand7()</code> that returns an integer from 1 to 7 (inclusive) with uniform
probability, implement a function <code>rand5()</code> that returns an integer from 1 to 5 (inclusive).</p>

A

<p>This problem is simpler than going from <code>rand5()</code> to <code>rand7()</code>. We'll do something similar:</p>

<ul><li>Roll <code>rand7()</code>.</li><li>If the result is between 1 and 5, then return it.</li><li>If the result is 6 or 7, then reroll.</li></ul>

<p>Conditioned on not re-rolling, each value in <code>[1, 5]</code> is equally likely
(each has probability 1/5), so our <code>rand5()</code> is uniform.</p>

<pre><code>def rand5():
    r = rand7()
    if 1 <= r <= 5:
        return r
    return rand5()
</code></pre>

<p>This could potentially take forever (although with a very low probability), if we keep on getting 6 or 7s.</p>

72
Q

<p>This problem was asked by Google.</p>

<p>In a directed graph, each node is assigned an uppercase letter. We define a path's value as the number of occurrences of the most frequent letter
along that path. For example, if a path in the graph goes through "ABACA", the value of the path is 3, since there are 3 occurrences of
'A' on the path.</p>

<p>Given a graph with <code>n</code> nodes and <code>m</code> directed edges, return the largest value path of the graph. If the largest value is infinite, then
return null.</p>

<p>The graph is represented with a string and an edge list. The <code>i</code>-th character represents the uppercase letter of the <code>i</code>-th node.
Each tuple in the edge list <code>(i, j)</code> means there is a directed edge from the <code>i</code>-th node to the <code>j</code>-th node. Self-edges are possible,
as well as multi-edges.</p>

<p>For example, the following input graph:</p>

<pre><code>ABACA
</code></pre>

<pre><code>[(0, 1),
(0, 2),
(2, 3),
(3, 4)]
</code></pre>

<p>Would have maximum value 3 using the path of vertices <code>[0, 2, 3, 4]</code>, <code>(A, A, C, A)</code>.</p>

<p>The following input graph:</p>

<pre><code>A
</code></pre>

<pre><code>[(0, 0)]
</code></pre>

<p>Should return null, since we have an infinite loop.</p>

A

<p>The naive solution here would be to try every single path from every vertex,
count up each path's value and keep track of the maximum value we've seen.</p>

<p>To do this, we can use DFS to try every path as well as return null if we come across
a cycle. The Counter module in Python is quite handy in this case:</p>

<pre><code>from collections import Counter

def max_path(s, lst):
    # Build adjacency list
    adj = [[] for v in s]
    for u, v in lst:
        adj[u].append(v)

    maximum_path = 0
    # Try every path from node v.
    for v in range(len(s)):
        # Every item in the stack has form (path_string, visited, current_node)
        stack = [(s[v], set([v]), v)]
        while stack:
            path_string, visited, current_node = stack.pop()
            # Count value of current path and update maximum_path if necessary
            cnt = Counter(path_string)
            _, path_val = cnt.most_common(1)[0]
            maximum_path = max(maximum_path, path_val)
            for neighbour in adj[current_node]:
                if neighbour in visited:
                    # There is a cycle.
                    return None
                stack.append((path_string + s[neighbour],
                              visited.union([neighbour]),
                              neighbour))
    return maximum_path
</code></pre>

<p>However, this would be terribly slow. We enumerate every path from every vertex -- and the number of
paths can grow exponentially with the size of the graph -- and on top of that, we recount the entire
path string at every step. Let's try to improve this runtime.</p>

<p>Notice that we're recomputing the whole path on each iteration. This is inefficient since, for example, only one character
could change, so we should only need to increment that one character. This sounds like a good problem for dynamic programming.</p>

<p>Furthermore, notice that since we're using the alphabet of uppercase characters, we have a fixed number (26) of potential values
that contribute to the longest chain.</p>

<p>Let's keep a matrix of size N by 26. <code>A[i][j]</code> will contain the maximum number of times character <code>j</code> appears on any path starting at node <code>i</code> (where
<code>j</code> indexes into the alphabet, so A = 0, B = 1, etc.). Then we'll use the following recurrence to keep track of the path with the largest value:</p>

<ul><li>When we get to a node <code>v</code>, we'll do DFS on all its neighbours.</li><li>Then <code>A[v][j]</code> will be the maximum of all <code>A[neighbour][j]</code> for all its neighbours.</li><li>Then, we also need to count the current node too, so increment <code>A[v][current_char]</code> by one, where <code>current_char</code> is the current node's assigned letter.</li></ul>

<p>We will use DFS, as before, to search the graph and to determine whether there is a cycle.</p>

<pre><code>VISITED = 0
UNVISITED = 1
VISITING = 2

def max_path(s, lst):
    # Build adjacency list
    adj = [[] for v in s]
    for u, v in lst:
        adj[u].append(v)

    # Create matrix cache
    dp = [[0 for _ in range(26)] for _ in range(len(s))]
    state = {v: UNVISITED for v in range(len(s))}

    def dfs(v):
        state[v] = VISITING
        for neighbour in adj[v]:
            if state[neighbour] == VISITING:
                # We have a cycle
                return True
            if state[neighbour] == UNVISITED and dfs(neighbour):
                return True
            for i in range(26):
                dp[v][i] = max(dp[v][i], dp[neighbour][i])
        current_char = ord(s[v]) - ord('A')
        dp[v][current_char] += 1
        state[v] = VISITED
        return False

    # Run DFS on graph
    for v in range(len(s)):
        if state[v] == UNVISITED:
            has_cycle = dfs(v)
            if has_cycle:
                return None

    return max(max(node) for node in dp)
</code></pre>

<p>This will now just run in O(V + E) time, same as DFS.</p>

73
Q

<p>This problem was asked by Google.</p>

<p>Given the head of a singly linked list, reverse it in-place.</p>

A

<p>We can do this recursively and cleverly, using Python's default argument feature.
Basically, we call reverse on the node's next, but not before cleaning up some pointers first:</p>

<pre><code>def reverse(head, prev=None):
    if not head:
        return prev
    tmp = head.next
    head.next = prev
    return reverse(tmp, head)
</code></pre>

<p>This runs in O(N) time. But it also runs in O(N) space, since Python doesn't do tail-recursion elimination.</p>

<p>We can improve the space by doing this iteratively, keeping track of two things: a prev pointer and a current pointer. The current pointer
will iterate through the list and the prev pointer will follow, one node behind. Then, as we move along the list, we'll fix up the current node's
next to point to the previous node, and then update prev and current.</p>

<pre><code>def reverse(head):
    prev, current = None, head
    while current is not None:
        tmp = current.next
        current.next = prev
        prev = current
        current = tmp
    return prev
</code></pre>

<p>Now this only uses constant space!</p>
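<p>As a sanity check, here is the iterative version run on a small list. The <code>Node</code> class is an assumption for illustration, since the problem doesn't specify one:</p>

```python
class Node:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next

def reverse(head):
    # Iterative reversal, as above: redirect each node's next pointer in turn.
    prev, current = None, head
    while current is not None:
        tmp = current.next
        current.next = prev
        prev = current
        current = tmp
    return prev

# Build 1 -> 2 -> 3, reverse it, and read the values back out.
head = Node(1, Node(2, Node(3)))
new_head = reverse(head)
values = []
while new_head:
    values.append(new_head.val)
    new_head = new_head.next
print(values)  # [3, 2, 1]
```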

74
Q

<p>This problem was asked by Apple.</p>

<p>Suppose you have a multiplication table that is N by N. That is, a 2D array where the value at the <code>i</code>-th row
and <code>j</code>-th column is <code>(i + 1) * (j + 1)</code> (if 0-indexed) or <code>i * j</code> (if 1-indexed).</p>

<p>Given integers N and X, write a function that returns the number of times X appears as a value in an N by N multiplication table.</p>

<p>For example, given N = 6 and X = 12, you should return 4, since the multiplication table looks like this:</p>

<p>| 1 | 2 | 3 | 4 | 5 | 6 |</p>

<p>| 2 | 4 | 6 | 8 | 10 | 12 |</p>

<p>| 3 | 6 | 9 | 12 | 15 | 18 |</p>

<p>| 4 | 8 | 12 | 16 | 20 | 24 |</p>

<p>| 5 | 10 | 15 | 20 | 25 | 30 |</p>

<p>| 6 | 12 | 18 | 24 | 30 | 36 |</p>

<p>And there are 4 12's in the table.</p>

A

<p>We can do this naively in O(N^2) time by actually trying out all the possible combinations,
and incrementing a counter each time we see one:</p>

<pre><code>def multi_tables(n, x):
    count = 0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if i * j == x:
                count += 1
    return count
</code></pre>

<p>We can do this faster, though. Notice in our example that the rows and columns of the cells matching 12 are all
factors of 12, and that there can only be one matching cell per row. So, a particular row <code>i</code> will contain X if:</p>

<ol><li><code>i</code> is a factor of X</li><li>Its corresponding factor <code>X / i</code> is at most N (so it's still in the matrix).</li></ol>

<pre><code>def multi_tables(n, x):
    count = 0
    for i in range(1, n + 1):
        if x % i == 0 and x // i <= n:
            count += 1
    return count
</code></pre>

<p>This only takes O(N) time.</p>

75
Q

<p>This problem was asked by Microsoft.</p>

<p>Given an array of numbers, find the length of the longest increasing subsequence in the array. The subsequence
does not necessarily have to be contiguous.</p>

<p>For example, given the array [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15], the longest increasing subsequence
has length 6: it is 0, 2, 6, 9, 11, 15.</p>

A

<p>The naive, brute force way to solve this is to generate each possible subsequence,
test each one for monotonicity, and keep track of the longest one. That would
be prohibitively expensive: just generating each subsequence takes O(2^N)!</p>

<p>Instead, let's try to tackle this problem using recursion and then optimize it with
dynamic programming.</p>

<p>Assume that we already have a function that gives us the length of the longest increasing subsequence. Then we'll try to feed
some part of our input array back to it and try to extend the result. Our base cases are: the empty list, returning 0, and
an array with one element, returning 1.</p>

<p>Then,</p>

<ul><li>For every index <code>i</code> up until the second to last element, calculate <code>longest_increasing_subsequence</code> up to there.</li><li>We can only extend the result with the last element if our last element is greater than <code>arr[i]</code> (since otherwise,
it's not increasing).</li><li>Keep track of the largest result.</li></ul>

<pre><code>def longest_increasing_subsequence(arr):
    if not arr:
        return 0
    if len(arr) == 1:
        return 1

    # Longest increasing subsequence that ends with the last element.
    max_ending_here = 1
    for i in range(len(arr) - 1):
        ending_at_i = longest_increasing_subsequence(arr[:i + 1])
        if arr[-1] > arr[i] and ending_at_i + 1 > max_ending_here:
            max_ending_here = ending_at_i + 1
    return max_ending_here
</code></pre>

<p>This is really slow due to repeated subcomputations (exponential in time). So, let's use dynamic programming to
store values so we don't have to recompute them later.</p>

<p>We'll keep an array <code>A</code> of length N, and <code>A[i]</code> will contain the length of the longest increasing subsequence ending at <code>i</code>. We can then use the same recurrence but look it up in the array instead:</p>

<pre><code>def longest_increasing_subsequence(arr):
    if not arr:
        return 0
    cache = [1] * len(arr)
    for i in range(1, len(arr)):
        for j in range(i):
            if arr[i] > arr[j]:
                cache[i] = max(cache[i], cache[j] + 1)
    return max(cache)
</code></pre>

<p>This now runs in O(N^2) time and O(N) space.</p>
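<p>As a quick check, the dynamic programming version (re-declared here so the snippet stands alone) recovers the answer of 6 for the example array:</p>

```python
def longest_increasing_subsequence(arr):
    if not arr:
        return 0
    # cache[i] holds the length of the longest increasing subsequence ending at i.
    cache = [1] * len(arr)
    for i in range(1, len(arr)):
        for j in range(i):
            if arr[i] > arr[j]:
                cache[i] = max(cache[i], cache[j] + 1)
    return max(cache)

arr = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
print(longest_increasing_subsequence(arr))  # 6
```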

76
Q

<p>This problem was asked by Google.</p>

<p>You are given an N by M 2D matrix of lowercase letters. Determine the minimum number of columns that can be removed to ensure
that each row is ordered from top to bottom lexicographically. That is, the letter at each column is lexicographically later
as you go down each row. It does not matter whether each row itself is ordered lexicographically.</p>

<p>For example, given the following table:</p>

<pre><code>cba
daf
ghi
</code></pre>

<p>This is not ordered because of the a in the center. We can remove the second column to make it ordered:</p>

<pre><code>ca
df
gi
</code></pre>

<p>So your function should return 1, since we only needed to remove 1 column.</p>

<p>As another example, given the following table:</p>

<pre><code>abcdef
</code></pre>

<p>Your function should return 0, since the rows are already ordered (there's only one row).</p>

<p>As another example, given the following table:</p>

<pre><code>zyx
wvu
tsr
</code></pre>

<p>Your function should return 3, since we would need to remove all the columns to order it.</p>

A

<p>For this question, we can go over each column, check whether it is
ordered, and remove it if it is not:</p>

<pre><code>def bad_cols(board):
    num_bad_cols = 0
    num_cols = len(board[0])
    i = 0
    while i < num_cols:
        if is_sorted_up_to(board, i):
            i += 1
        else:
            remove_col(board, i)
            num_bad_cols += 1
            num_cols -= 1

    return num_bad_cols

def remove_col(board, i):
    for row in board:
        row.pop(i)

def is_sorted_up_to(board, i):
    '''Returns whether the table is sorted in lexicographic order up to column i.'''
    return all(board[r][:i + 1] <= board[r + 1][:i + 1] for r in range(len(board) - 1))
</code></pre>

<p>Recall that we have N rows and M columns. We're iterating over each column, and checking that all the rows
are sorted up to that column, so this runs in O(M^2 * N) time.</p>

77
Q

<p>This problem was asked by Snapchat.</p>

<p>Given a list of possibly overlapping intervals, return a new list of intervals where all overlapping intervals have been merged.</p>

<p>The input list is not necessarily ordered in any way.</p>

<p>For example, given [(1, 3), (5, 8), (4, 10), (20, 25)], you should return [(1, 3), (4, 10), (20, 25)].</p>

A

<p>We can do this by sorting all the intervals by their start time. This way, when looking at
the current interval, if it overlaps with the previous one we can just combine them.</p>

<pre><code>def merge(intervals):
    result = []
    for start, end in sorted(intervals, key=lambda i: i[0]):
        # If the current interval overlaps with the previous one, combine them
        if result and start <= result[-1][1]:
            prev_start, prev_end = result[-1]
            result[-1] = (prev_start, max(end, prev_end))
        else:
            result.append((start, end))
    return result
</code></pre>
</code></pre>

<p>Since we have to sort the intervals, this runs in O(N log N) time.</p>
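<p>Running the example from the prompt through this (with the function re-declared so the snippet is self-contained):</p>

```python
def merge(intervals):
    result = []
    for start, end in sorted(intervals, key=lambda i: i[0]):
        # If the current interval overlaps with the previous one, combine them.
        if result and start <= result[-1][1]:
            prev_start, prev_end = result[-1]
            result[-1] = (prev_start, max(end, prev_end))
        else:
            result.append((start, end))
    return result

print(merge([(1, 3), (5, 8), (4, 10), (20, 25)]))
# [(1, 3), (4, 10), (20, 25)]
```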

78
Q

<p>This problem was asked by Google.</p>

<p>Given k sorted singly linked lists, write a function to merge all the lists into one sorted singly linked list.</p>

A

<p>A brute force solution here might be to gather all the values of the linked lists
into one large array, sort the array, and then recreate a linked list with the
values from the array. That would look like this:</p>

<pre><code>def merge(lists):
    # Combine all nodes into an array
    arr = []
    for head in lists:
        current = head
        while current:
            arr.append(current.val)
            current = current.next

    new_head = current = Node(-1)  # dummy head
    for val in sorted(arr):
        current.next = Node(val)
        current = current.next

    return new_head.next
</code></pre>

<p>This would take O(KN log KN) time and O(KN) space, where K is the number of lists and N is the number of elements in the largest list.</p>

<p>A better way would be to take advantage of the inherent sortedness of the input lists. We can keep a pointer into
each list, pick the minimum value among all the pointers, and then advance the pointer we picked. This would run in
O(KN * K) time and O(K) space.</p>

<pre><code>def merge(lists):
    new_head = current = Node(-1)
    while any(lst is not None for lst in lists):
        # Get the min of all non-None lists
        current_min, i = min((lst.val, i) for i, lst in enumerate(lists) if lst is not None)
        lists[i] = lists[i].next
        current.next = Node(current_min)
        current = current.next
    return new_head.next
</code></pre>
</code></pre>

<p>An even faster way would be to use a heap to keep track of all the pointers instead. Then we can do this in O(KN * log K) time, since
the heap lets us retrieve the min in O(log K) time instead of O(K) time.</p>

<pre><code>import heapq

def merge(lists):
    new_head = current = Node(-1)
    heap = [(lst.val, i) for i, lst in enumerate(lists) if lst is not None]
    heapq.heapify(heap)
    while heap:
        current_min, i = heapq.heappop(heap)
        # Add the next min to the merged linked list.
        current.next = Node(current_min)
        current = current.next
        # Advance that list and add its next value to the heap.
        lists[i] = lists[i].next
        if lists[i] is not None:
            heapq.heappush(heap, (lists[i].val, i))
    return new_head.next
</code></pre>
</code></pre>

79
Q

<p>This problem was asked by Facebook.</p>

<p>Given an array of integers, write a function to determine whether the array could become non-decreasing by modifying at most 1 element.</p>

<p>For example, given the array [10, 5, 7], you should return true, since we can modify the 10 into a 1 to make the array non-decreasing.</p>

<p>Given the array [10, 5, 1], you should return false, since we can't modify any one element to get a non-decreasing array.</p>

A

<p>In this problem, we can count each time an element goes down. If the values go down more than once,
we can return False right away. And even when there is only one downtick, if it cannot be erased by modifying
only one endpoint of that downtick, then we should return False as well.</p>

<pre><code>def check(lst):
    count = 0
    for i in range(len(lst) - 1):
        if lst[i] > lst[i + 1]:
            if count > 0:
                return False
            if i - 1 >= 0 and i + 2 < len(lst) and lst[i] > lst[i + 2] and lst[i + 1] < lst[i - 1]:
                return False
            count += 1
    return True
</code></pre>
</code></pre>
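<p>Here is the same function, re-declared so it can run on its own, against the two examples from the prompt:</p>

```python
def check(lst):
    count = 0
    for i in range(len(lst) - 1):
        if lst[i] > lst[i + 1]:
            if count > 0:
                return False
            # A downtick that can't be fixed by changing either of its endpoints.
            if i - 1 >= 0 and i + 2 < len(lst) and lst[i] > lst[i + 2] and lst[i + 1] < lst[i - 1]:
                return False
            count += 1
    return True

print(check([10, 5, 7]))  # True
print(check([10, 5, 1]))  # False
```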

80
Q

<p>This problem was asked by Google.</p>

<p>Given the root of a binary tree, return a deepest node. For example, in the following tree, return d.</p>

<pre><code>    a
   / \
  b   c
 /
d
</code></pre>

A

<p>Base case for this question actually can’t be null, because it’s not a real result that can be combined (null is not a node). Here we should use the leaf node as the base case and return itself.</p>

<p>The recursive step for this problem is a little bit tricky because we can’t actually use the results of the left and right subtrees directly. So we need to ask, what other information do we need to solve this question? It turns out that if we tag each subresult node with its depth, we can get the final solution by picking the deeper leaf and then incrementing its depth:</p>

<pre><code>def deepest(node):
    if node and not node.left and not node.right:
        return (node, 1)  # Leaf and its depth

    if not node.left:  # Then the deepest node is in the right subtree
        return increment_depth(deepest(node.right))
    elif not node.right:  # Then the deepest node is in the left subtree
        return increment_depth(deepest(node.left))

    # Pick the higher-depth tuple and then increment its depth
    return increment_depth(
        max(deepest(node.left), deepest(node.right), key=lambda x: x[1]))

def increment_depth(node_depth_tuple):
    node, depth = node_depth_tuple
    return (node, depth + 1)
</code></pre>
</code></pre>
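<p>A quick sanity check on the example tree, with everything re-declared so the snippet stands alone. The <code>Node</code> class here is an assumption for illustration, since the problem doesn't fix a representation:</p>

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def deepest(node):
    if node and not node.left and not node.right:
        return (node, 1)  # Leaf and its depth
    if not node.left:   # The deepest node is in the right subtree
        return increment_depth(deepest(node.right))
    elif not node.right:  # The deepest node is in the left subtree
        return increment_depth(deepest(node.left))
    # Pick the deeper subresult, then increment its depth.
    return increment_depth(
        max(deepest(node.left), deepest(node.right), key=lambda x: x[1]))

def increment_depth(node_depth_tuple):
    node, depth = node_depth_tuple
    return (node, depth + 1)

# The example tree: a has children b and c; d hangs off b.
root = Node('a', Node('b', Node('d')), Node('c'))
node, depth = deepest(root)
print(node.val, depth)  # d 3
```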

81
Q

<p>This problem was asked by Yelp.</p>

<p>Given a mapping of digits to letters (as in a phone number), and a digit string, return all possible letters the number could represent.
You can assume each valid number in the mapping is a single digit.</p>

<p>For example if {“2”: [“a”, “b”, “c”], “3”: [“d”, “e”, “f”], …} then “23” should return [“ad”, “ae”, “af”, “bd”, “be”, “bf”, “cd”, “ce”, “cf”].</p>

A

<p>There is a relatively straightforward substructure to this problem.</p>

<p>Let's assume that we knew the result of the function with the same digits except the first character. Then, we
could reconstruct the final result by: for each character the first digit maps to, prepend that character to
each permutation from the recursive call.</p>

<p>For example, if the digits are '12', and the mapping is {'1': ['a', 'b', 'c'], '2': ['d', 'e', 'f']} then without
the first digit, the result would be ['d', 'e', 'f']. If we prepend 'a', 'b', and 'c', to each permutation,
we would get 'ad', 'ae', 'af', 'bd', 'be', 'bf', 'cd', 'ce', 'cf'.</p>

<pre><code>def get_permutations(digits, mapping):
    digit = digits[0]

    if len(digits) == 1:
        return mapping[digit]

    result = []
    for char in mapping[digit]:
        for perm in get_permutations(digits[1:], mapping):
            result.append(char + perm)
    return result
</code></pre>
</code></pre>
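<p>Running this on the prompt's example (the function is re-declared so the snippet is self-contained):</p>

```python
def get_permutations(digits, mapping):
    digit = digits[0]

    if len(digits) == 1:
        return mapping[digit]

    # Prepend each letter for the first digit to every permutation of the rest.
    result = []
    for char in mapping[digit]:
        for perm in get_permutations(digits[1:], mapping):
            result.append(char + perm)
    return result

mapping = {'2': ['a', 'b', 'c'], '3': ['d', 'e', 'f']}
print(get_permutations('23', mapping))
# ['ad', 'ae', 'af', 'bd', 'be', 'bf', 'cd', 'ce', 'cf']
```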

82
Q

<p>This problem was asked by Microsoft.</p>

<p>Using a read7() method that returns 7 characters from a file, implement readN(n) which reads n characters.</p>

<p>For example, given a file with the content “Hello world”, three successive calls to read7() return “Hello w”, “orld”, and then “”.</p>

A

<p>In this problem, it's easy to get tricked into using numbers to do accounting.</p>

<p>A simpler way to solve this is to keep track of a <code>remainder</code> string, representing the leftover
text that couldn't be returned by the previous <code>readN()</code> operation. Then all we need to do is call <code>read7()</code>
until we have the desired <code>n</code> characters. To handle the end-of-file edge case, we also
need to exit if a call to <code>read7()</code> returns fewer than seven characters:</p>

<pre><code>class Reader:
    def __init__(self):
        self.remainder = ''

    def readN(self, n):
        result = self.remainder
        text = None

        # Stop once we have n characters, or once read7() signals
        # end of file by returning fewer than 7 characters.
        while len(result) < n and (text is None or len(text) >= 7):
            text = read7()
            result += text

        self.remainder = result[n:]

        return result[:n]
</code></pre>
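<p>To see the bookkeeping in action, here is a sketch with <code>read7()</code> stubbed out to serve “Hello world” from a string, and the reader adapted to take the file as a parameter. The <code>FakeFile</code> stub is purely an assumption for testing, not part of the problem:</p>

```python
class FakeFile:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def read7(self):
        # Return up to 7 characters; fewer (or none) signals end of file.
        chunk = self.text[self.pos:self.pos + 7]
        self.pos += 7
        return chunk

class Reader:
    def __init__(self, f):
        self.f = f
        self.remainder = ''

    def readN(self, n):
        result = self.remainder
        text = None
        while len(result) < n and (text is None or len(text) >= 7):
            text = self.f.read7()
            result += text
        self.remainder = result[n:]
        return result[:n]

reader = Reader(FakeFile('Hello world'))
print(reader.readN(4))  # 'Hell'
print(reader.readN(4))  # 'o wo'
print(reader.readN(4))  # 'rld'
```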

83
Q

<p>This problem was asked by Google.</p>

<p>Invert a binary tree.</p>

<p>For example, given the following tree:</p>

<pre><code>     a
   /   \
  b     c
 / \   /
d   e f
</code></pre>

<p>should become:</p>

<pre><code>     a
   /   \
  c     b
   \   / \
    f e   d
</code></pre>

A

<p>Assuming we could invert the current node's left and right subtrees, all we'd need to do is then switch
the left to now become right, and right to become left. The base case is when the node is None and we
can just return None for that case. Then we know this works for the leaf node case since switching
left and right subtrees doesn't do anything (since they're both None).</p>

<pre><code>def invert(node):
    if not node:
        return node

    left = invert(node.left)
    right = invert(node.right)

    node.left, node.right = right, left
    return node
</code></pre>
</code></pre>

84
Q

<p>This problem was asked by Amazon.</p>

<p>Given a matrix of 1s and 0s, return the number of "islands" in the matrix. A 1 represents land and 0 represents water,
so an island is a group of 1s that are neighboring whose perimeter is surrounded by water.</p>

<p>For example, this matrix has 4 islands.</p>

<pre><code>1 0 0 0 0
0 0 1 1 0
0 1 1 0 0
0 0 0 0 0
1 1 0 0 1
1 1 0 0 1
</code></pre>

A

<p>This problem can be solved by maintaining a <code>visited</code> table that records which pieces
of land we've already explored. Then, every time we see a piece of land that hasn't been visited, we
can explore it with a flood fill.</p>

<p>This takes O(N) time (where N is the number of cells), since each cell is visited only a constant number
of times: once in our outer for loop and at most once per neighbour in <code>fill</code>.</p>

<pre><code>def num_islands(board):
    num_rows = len(board)
    num_cols = len(board[0])
    count = 0

    visited = [[False for _ in range(num_cols)] for _ in range(num_rows)]
    for row in range(num_rows):
        for col in range(num_cols):
            if board[row][col] == 1 and not visited[row][col]:
                fill(board, visited, row, col)
                count += 1
    return count

def fill(board, visited, row, col):
    moves = [(0, 1),
             (0, -1),
             (1, 0),
             (-1, 0)]
    visited[row][col] = True

    for move_row, move_col in moves:
        new_row, new_col = (row + move_row, col + move_col)
        if (inside_board(board, new_row, new_col) and
                board[new_row][new_col] == 1 and
                not visited[new_row][new_col]):
            fill(board, visited, new_row, new_col)

def inside_board(board, row, col):
    return 0 <= row < len(board) and 0 <= col < len(board[0])
</code></pre>

85
Q

<p>This problem was asked by Facebook.</p>

<p>Given three 32-bit integers x, y, and b, return x if b is 1 and y if b is 0, using only mathematical or bit operations. You can assume
b can only be 1 or 0.</p>

A

<p>We can solve this problem by seeing that multiplying <code>x</code> with <code>b</code> solves half the problem.
Since we want <code>y</code> to behave in the opposite way, we can get that behavior by multiplying <code>y</code> with <code>(1 - b)</code>.</p>

<p>Now, <code>(x * b)</code> gives <code>x</code> when <code>b</code> is <code>1</code> and <code>0</code> otherwise. Similarly, <code>(y * (1 - b))</code> gives <code>y</code> when <code>b</code>
is <code>0</code> and <code>0</code> otherwise. We can then combine the two terms with either <code>+</code> or <code>|</code>:</p>

<pre><code>def switch(x, y, b):
    return (x * b) | (y * (1 - b))
</code></pre>
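<p>A quick demonstration of both branches:</p>

```python
def switch(x, y, b):
    # b == 1 selects x; b == 0 selects y.
    return (x * b) | (y * (1 - b))

print(switch(7, 12, 1))  # 7
print(switch(7, 12, 0))  # 12
```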

86
Q

<p>This problem was asked by Google.</p>

<p>Given a string of parentheses, write a function to compute the minimum number of parentheses to be removed to make the string valid (i.e. each open parenthesis is eventually closed).</p>

<p>For example, given the string "()())()", you should return 1. Given the string ")(", you should return 2, since we must remove all of them.</p>

A

<p>For a string to be considered valid, each open parenthesis should eventually be closed. Other parentheses that don't satisfy this condition should be counted as invalid.</p>

<p>Whenever we encounter an unmatched closing parenthesis, we can count it as invalid. After that, we also add the number of unmatched opening parentheses to our invalid count. This runs in O(N).</p>

<pre><code>def count_invalid_parenthesis(string):
    opened = 0
    invalid = 0
    for c in string:
        if c == '(':
            opened += 1
        elif c == ')':
            if opened > 0:
                opened -= 1
            else:
                invalid += 1
    # Count all unclosed parentheses as invalid
    invalid += opened
    return invalid
</code></pre>
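<p>Checking the function (re-declared so the snippet stands alone) against both examples from the prompt:</p>

```python
def count_invalid_parenthesis(string):
    opened = 0
    invalid = 0
    for c in string:
        if c == '(':
            opened += 1
        elif c == ')':
            if opened > 0:
                opened -= 1
            else:
                invalid += 1
    # Every unclosed '(' is also invalid.
    invalid += opened
    return invalid

print(count_invalid_parenthesis("()())()"))  # 1
print(count_invalid_parenthesis(")("))       # 2
```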

87
Q

<p>This problem was asked by Uber.</p>

<p>A rule looks like this:</p>

<p><code>A NE B</code></p>

<p>This means point <code>A</code> is located northeast of point <code>B</code>.</p>

<p><code>A SW C</code></p>

<p>means that point <code>A</code> is southwest of <code>C</code>.</p>

<p>Given a list of rules, check if the sum of the rules validate. For example:</p>

<pre><code>A N B
B NE C
C N A
</code></pre>

<p>does not validate, since <code>A</code> cannot be both north and south of <code>C</code>.</p>

<pre><code>A NW B
A N B
</code></pre>

<p>is considered valid.</p>

A

<p>First, let's break down what it means for a list of rules to be invalid. Consider the following list of rules:</p>

<pre><code>A N B
B N A
</code></pre>

<p>The second rule is obviously invalid, since the first rule already stated that <code>B</code> is north of <code>A</code>. We can also see that the following list is equivalent and also invalid:</p>

<pre><code>A N B
A S B
</code></pre>

<p>So, we can see that two rules invalidate each other if they relate the same pair of points and are in opposite directions. However, two rules do <em>not</em> invalidate each other if they are in the same direction or are orthogonal to each other:</p>

<pre><code>A N B
A E B
</code></pre>

<p>In this case, we see that it is valid for A to be North of and East of B at the same time.</p>

<pre><code>A N B
C N B
</code></pre>

<p>In this case, the relative position of <code>A</code> and <code>C</code> is ambiguous, other than that they are both north of <code>B</code>.</p>

<p>Let's take a look at another example, similar to the first provided example:</p>

<pre><code>A N B
B N C
C N A
</code></pre>

<p>In this case, we see that <code>C</code> cannot be north of <code>A</code> because it is implied that <code>C</code> is south of <code>A</code> by the previous two rules. We could have re-written the first two rules into the following, so that the contradiction is obvious:</p>

<pre><code>A N B (original)
B S A
B N C (original)
A N C (through B)
C S B
C S A (through B)
</code></pre>

<p>Then, it is obvious that <code>C N A</code> and <code>C S A</code> are contradictory. We will perform this expansion and check for contradiction in our algorithm.</p>

<p>Next, we need to figure out how to deal with the diagonal cardinal directions (e.g. <code>NE</code>, <code>NW</code>, <code>SW</code>, <code>SE</code>). Let's take a look at the case where there are two rules relating the same two points, and the directions are orthogonal (perpendicular) to each other:</p>

<pre><code>A N B
A E B
</code></pre>

<p>We also notice that these two rules can be simplified into one: <code>A NE B</code>.
Similarly, we can break down any diagonal direction into the two simple directions (<code>N</code>, <code>E</code>, <code>S</code>, <code>W</code>) that it is composed of.</p>

<p>Now, we can model the relationships between points as a graph: each point corresponds to a vertex.
To represent the cardinal directions, each vertex will have a list of edge sets, one for each of the four directions. In our solution, we will use directed edges with the convention that an edge <code>fromVertex DIR toVertex</code> means <code>toVertex</code> is "<code>DIR</code> <em>of</em>" <code>fromVertex</code>. For example, the rule <code>A N B</code> will be parsed into an <code>N</code> edge from <code>B</code> pointing to <code>A</code>, meaning <code>A</code> is <code>North</code> <em>of</em> <code>B</code>.</p>

<p>When we add a new relationship, we should add a bi-directional edge between the two vertices -- one for the direction in the rule, and one for the opposite. For example, if the rule is <code>A N B</code>, we should add an <code>N</code> edge from <code>B</code> to <code>A</code>, and an <code>S</code> edge from <code>A</code> to <code>B</code>.</p>

<p>To add diagonal relationships, we simply parse the two directions into single directions, and treat them as two separate rules.</p>

<p>To validate a rule, we need to check if any existing edges conflict with the new edge(s) we are adding. We compute the relationships between all existing vertices and the new <code>toVertex</code>, and cache these within the graph.</p>

<p>Then, we simply check all the neighbors of the <code>fromVertex</code>, and return <code>false</code> if the neighbor's relationship with <code>toVertex</code> is contradictory to the new relationship (i.e. <code>N</code> vs <code>S</code>, <code>E</code> vs <code>W</code>).</p>

<p>When we add a new rule, we need to similarly add the relationship to all neighbors of the fromNode. For example, say <code>A</code> is already north of <code>B</code> (and <code>B</code> is already south of <code>A</code>). If we add the relationship <code>C</code> south of <code>B</code>, we also add the relationship <code>C</code> south of <code>A</code> (and <code>A</code> north of <code>C</code>). If we add the relationship <code>C</code> west of <code>B</code>, we also add the relationship <code>C</code> west of <code>A</code> (and <code>A</code> east of <code>C</code>). However, we do not add a relationship to the neighbors in the same direction as the new relationship, as mentioned in an example above.</p>

<p>Time complexity: <code>O(N * |V|) = O(N^2)</code>, where <code>N</code> is the number of rules.</p>

<p>Space complexity: <code>O(|V| + |E|) = O(|V| + |V|^2) = O(N^2)</code>, since we are creating a densely-connected graph.</p>

<pre><code>import java.util.*;

class Solution {
    public static void main(String[] args) {
        test1();
        test2();
        test3();
    }

    private static void test1() {
        String[] rules = {"A N B",
                          "C SE B",
                          "C N A"};
        System.out.println(validate(rules));  // false
    }

    private static void test2() {
        String[] rules = {"A NW B",
                          "A N B"};
        System.out.println(validate(rules));  // true
    }

    private static void test3() {
        String[] rules = {"A N B",
                          "C N B"};
        System.out.println(validate(rules));  // true
    }

    static class Node {
        List<Set<Node>> edges = new ArrayList<>();
        char val;

        public Node(char val) {
            this.val = val;
            for (int i = 0; i < 4; i++)
                edges.add(new HashSet<>());
        }
    }

    public static final int N = 0;
    public static final int E = 1;
    public static final int S = 2;
    public static final int W = 3;
    public static final int[] DIRS = {N, E, S, W};
    public static final Map<Character, Integer> charToDir = new HashMap<>();
    static {
        charToDir.put('N', N);
        charToDir.put('E', E);
        charToDir.put('S', S);
        charToDir.put('W', W);
    }

    public static boolean validate(String[] rules) {
        Map<Character, Node> map = new HashMap<>();

        for (String line : rules) {
            String[] rule = line.split(" ");
            char fromVal = rule[2].charAt(0);
            char toVal = rule[0].charAt(0);

            if (!map.containsKey(fromVal))
                map.put(fromVal, new Node(fromVal));

            if (!map.containsKey(toVal))
                map.put(toVal, new Node(toVal));

            Node from = map.get(fromVal);
            Node to = map.get(toVal);

            /* Decompose diagonal (two-char) directions into single directions */
            for (char dirChar : rule[1].toCharArray()) {
                int dir = charToDir.get(dirChar);
                if (!isValid(from, to, dir))
                    return false;
                addEdges(from, to, dir);
            }
        }

        return true;
    }

    private static int opposite(int dir) {
        return (dir + 2) % 4;
    }

    private static boolean isValid(Node from, Node to, int newDir) {
        /* The new edge conflicts if an edge already exists in the opposite direction. */
        int oppositeDir = opposite(newDir);
        return !from.edges.get(oppositeDir).contains(to);
    }

    private static void addEdges(Node from, Node to, int newDir) {
        /* Get the direct opposite direction, e.g. S from N */
        int oppositeDir = opposite(newDir);

        /* Add the immediate edge between the nodes, using bi-directional edges. */
        from.edges.get(newDir).add(to);
        to.edges.get(oppositeDir).add(from);

        for (int dir : DIRS) {
            /* Relationships in the same direction are ambiguous.
               For example, if A is north of B, and we are adding
               C north of B, we cannot say C is north of A. */
            if (dir == newDir)
                continue;

            for (Node neighbor : from.edges.get(dir)) {
                /* No need to add self-edges */
                if (neighbor == to)
                    continue;
                /* Add bi-directional edges */
                neighbor.edges.get(newDir).add(to);
                to.edges.get(oppositeDir).add(neighbor);
            }
        }
    }
}
</code></pre>

88
Q

<p>This question was asked by ContextLogic.</p>

<p>Implement division of two positive integers without using the division, multiplication, or modulus operators. Return the quotient as an integer, ignoring the remainder.</p>

A

<p>We can start by trying the simplest solution. Define <code>x</code> as the dividend and <code>y</code> as the divisor. To get the quotient, we need to ask how many times we can subtract <code>y</code> from <code>x</code> until the remainder is less than <code>y</code>. The number of times we subtract is the resulting quotient <code>x/y</code>. The time complexity of this brute force approach is on the order of <code>x / y</code>, which can be very high, for example if <code>x</code> is <code>2^31 - 1</code> and <code>y</code> is <code>1</code>.</p>
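<p>As a baseline, the repeated-subtraction approach might be sketched like this (for positive integers only, per the problem statement):</p>

```python
def divide_slow(x, y):
    # Count how many times y can be subtracted from x
    # before the remainder drops below y.
    quotient = 0
    while x >= y:
        x -= y
        quotient += 1
    return quotient

print(divide_slow(31, 3))  # 10
```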

<p>Let's instead think about how to perform division on paper. Recall grade-school long division, where we consider the left-most digits that can be divided by the divisor. At each step, the number of times the divisor fits becomes the next digit of the result, and we subtract the product from the dividend to get the remainder. The remainder is initially the value <code>x</code>. We can abstract this process into subtracting the largest multiple of <code>y * 10^d</code> from the remainder, where <code>d</code> is the place of the digit (<code>d=0</code> for the ones place). Then we add that multiple times <code>10^d</code> to our result.</p>

<p>This process would be straightforward if we had the modulus or multiplication operators. However, we can instead take advantage of the bit shift operators in order to multiply by powers of two, since <code>a << z</code> results in <code>a</code> multiplied by <code>2^z</code> (e.g. <code>3 << 2 = 12</code>). Now, we can find the largest <code>y * 2^d</code> that fits within the remainder. As we do in long division, we decrease the possible value of <code>d</code> in each iteration. We start by finding the largest value of <code>y * 2^d <= x</code>, then test <code>y * 2^d, y * 2^(d-1), ...</code> until the remainder is less than <code>y</code>.</p>

<p>For example:</p>

<pre><code>x = 31, y = 3 => x = 11111, y = 00011
11111 - (00011 << 3) = 00111, quotient = 1 << 3
00111 - (00011 << 1) = 00001, quotient = (1 << 3) + (1 << 1)
(1 << 3) + (1 << 1) = 1010 = 10
</code></pre>

<p>Here is the Python implementation:</p>

<pre><code>def divide(x, y):
    if y == 0:
        raise ZeroDivisionError('division by zero')

    quotient = 0
    power = 32            # Assume 32-bit integers
    y_power = y << power  # Start at y * 2^32
    remainder = x         # Initial remainder is x
    while remainder >= y:
        while y_power > remainder:
            y_power >>= 1
            power -= 1
        quotient += 1 << power
        remainder -= y_power

    return quotient
</code></pre>

<p>The time complexity of this solution is <code>O(N)</code>, where <code>N</code> is the number of bits used to represent <code>x/y</code>, assuming shift and add operations take <code>O(1)</code> time.</p>

89
Q

<p>This problem was asked by LinkedIn.</p>

<p>Determine whether a tree is a valid binary search tree.</p>

<p>A binary search tree is a tree with two children, <code>left</code> and <code>right</code>, and satisfies the constraint that
the key in the <code>left</code> child must be less than or equal to the root and the key in the <code>right</code> child
must be greater than or equal to the root.</p>

A

<p>To solve this problem, we need to recall the definition of a binary search tree. Each node of a BST has the following properties:</p>

<ul><li>A node's left subtree contains only nodes with keys less than the node's key.</li><li>A node's right subtree contains only nodes with keys greater than the node's key.</li><li>Both the left and right subtrees must be valid BSTs.</li></ul>

<p>From the properties above, we can construct a recursive solution. It's tempting to write a solution which checks whether the left node's key is less than the current node's key and the right node's key is greater than the current node's key. However, we have to make sure that the property holds for the entire subtree, not just the children. For example, the following binary tree would be considered valid if we only checked the children:</p>

<pre><code>    2
   / \
  1   3
   \
    4
</code></pre>

<p>We can iterate through the entire left and right subtrees to determine whether the keys are valid. However, the work we would be doing can be simplified into a single recursive method. </p>

<p>Let's call our recursive method <code>is_bst()</code>. At each call in our recursive method, we can maintain a range of valid values for the node's keys -- we'll call the lower bound <code>min_key</code> and upper bound <code>max_key</code>. If the current node's key is <em>outside</em> the range of <code>min_key</code> to <code>max_key</code>, then return <code>false</code>. Otherwise, we call the method on the left and right child nodes, returning <code>true</code> if both calls return <code>true</code>. If a node is <code>null</code>, we should return <code>true</code>.</p>

<p>When we call <code>is_bst()</code> on the children, we limit the range of valid keys based on our current key. If we call <code>is_bst()</code> on the <em>left</em> node, then <code>min_key</code> should remain the same, while <code>max_key</code> should be updated to the current node's key. Similarly, if we call <code>is_bst()</code> on the <em>right</em> node, then <code>max_key</code> should remain the same, while <code>min_key</code> should be updated to the current node's key. </p>

<pre><code>class TreeNode:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.key = key

def is_bst(root):
    def is_bst_helper(root, min_key, max_key):
        if root is None:
            return True
        if root.key <= min_key or root.key >= max_key:
            return False
        return is_bst_helper(root.left, min_key, root.key) and \
               is_bst_helper(root.right, root.key, max_key)

    return is_bst_helper(root, float('-inf'), float('inf'))
</code></pre>

<p>The time complexity of this solution is <code>O(N)</code>, as it requires visiting every node in the tree.</p>

90
Q

<p>This question was asked by Google.</p>

<p>Given an integer <code>n</code> and a list of integers <code>l</code>, write a function that randomly generates a number from <code>0</code> to <code>n-1</code> that isn't in <code>l</code> (uniform).</p>

A

<p>One way we can approach this problem is by using <a>rejection sampling</a>. First, we generate a (uniformly) random integer between <code>0</code> and <code>n-1</code> (inclusive). Then, we check whether the random integer is found within the list <code>l</code>. If it is found, we repeat the process; if it is not found, we return that number. To check for the presence of a number in <code>O(1)</code> time, we can put all integers in <code>l</code> into a set.</p>

<p>Since this solution involves repeatedly generating new random numbers, its worst-case runtime is unbounded. The initial call also incurs <code>O(N)</code> time to convert the list into a set. The expected number of attempts depends on the ratio of numbers in <code>l</code> that fall within the bounds <code>0</code> to <code>n-1</code>, relative to <code>n</code>. </p>
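
<p>The rejection-sampling approach can be sketched as follows (the function name is ours; it assumes at least one value in <code>0</code> to <code>n-1</code> is not excluded):</p>

```python
from random import randrange

def random_excluding(n, l):
    excluded = set(l)  # O(1) membership tests
    while True:
        candidate = randrange(n)  # uniform over 0..n-1
        if candidate not in excluded:
            return candidate
```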

<p>Another way we can approach this problem is by generating a random number strictly from the numbers available. We can construct the list of available numbers by subtracting the set of integers in <code>l</code> from the set of integers in the range <code>0</code> to <code>n-1</code>. Then, we can simply generate a random index between <code>0</code> and the length of this new list (exclusive), and return the integer at that index.</p>

<p>This solution takes <code>O(N)</code> time to pre-process the list, and <code>O(1)</code> time to generate a random integer.</p>

<pre><code>from random import randrange

def process_list(n, l):
    all_nums_set = set(range(n))
    l_set = set(l)
    return list(all_nums_set - l_set)

def random_number_excluding_list(n, l):
    nums_list = process_list(n, l)
    idx = randrange(0, len(nums_list))
    return nums_list[idx]

print(random_number_excluding_list(4, [1, 2, 5]))
</code></pre>

91
Q

<p>This problem was asked by Dropbox.</p>

<p>What does the below code snippet print out? How can we fix the anonymous functions to behave as we'd expect?</p>

<pre><code>functions = []
for i in range(10):
    functions.append(lambda : i)

for f in functions:
    print(f())
</code></pre>

A

<p>At first glance, it seems like the snippet should print the numbers 0 through 9. In fact, it prints <code>9</code> ten times.</p>

<p>The problem is that the functions are closures with access to the non-local variable <code>i</code>; by the time they are called, the loop has finished and <code>i</code> is <code>9</code>. We instead want the value of <code>i</code> at the time each function is declared.</p>

<p>In order to solve this issue, we should capture the value of <code>i</code> when each function is declared, for example by using it as a default argument. This makes <code>i</code> a local variable
inside the anonymous function.</p>

<pre><code>functions = []
for i in range(10):
    functions.append(lambda i=i: i)

for f in functions:
    print(f())
</code></pre>
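
<p>A quick way to see the difference between the two snippets, using three functions instead of ten:</p>

```python
# Each lambda looks up i when called, after the loop has finished.
late = [lambda: i for i in range(3)]
# The default argument captures the value of i at definition time.
early = [lambda i=i: i for i in range(3)]

print([f() for f in late])   # [2, 2, 2]
print([f() for f in early])  # [0, 1, 2]
```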

92
Q

<p>This problem was asked by Airbnb.</p>

<p>We're given a hashmap associating each <code>courseId</code> key with a list of <code>courseIds</code> values, which represents that the prerequisites of <code>courseId</code> are <code>courseIds</code>. Return a sorted ordering of courses such that we can finish all courses.</p>

<p>Return null if there is no such ordering.</p>

<p>For example, given <code>{'CSC300': ['CSC100', 'CSC200'], 'CSC200': ['CSC100'], 'CSC100': []}</code>, should return <code>['CSC100', 'CSC200', 'CSC300']</code>.</p>

A

<p>This is a classic topological sorting question. One way to approach it is to consider how you would solve it manually. We can divide the process into two steps:</p>

<ol><li>Put all courses with no prerequisites into our todo list.</li><li>For each course C in the todo list, find each course D that has C as a prerequisite and remove C from D's prerequisite list. If D has no remaining prerequisites, add D to the todo list.</li></ol>

<p>If in the end we couldn't take some courses, there was a circular dependency.</p>

<pre><code>def courses_to_take(course_to_prereqs):
    # Copy list values into a set for faster removal.
    course_to_prereqs = {c: set(p) for c, p in course_to_prereqs.items()}

    todo = [c for c, p in course_to_prereqs.items() if not p]

    # Used to find courses D which have C as a prerequisite
    prereq_to_courses = {}
    for course in course_to_prereqs:
        for prereq in course_to_prereqs[course]:
            if prereq not in prereq_to_courses:
                prereq_to_courses[prereq] = []
            prereq_to_courses[prereq].append(course)

    result = []  # courses we need to take, in order

    while todo:
        prereq = todo.pop()
        result.append(prereq)

        # Find which courses are now free to take
        for c in prereq_to_courses.get(prereq, []):
            course_to_prereqs[c].remove(prereq)
            if not course_to_prereqs[c]:
                todo.append(c)

    # Circular dependency
    if len(result) < len(course_to_prereqs):
        return None
    return result
</code></pre>

93
Q

<p>This problem was asked by Apple.</p>

<p>Given a tree, find the largest tree/subtree that is a BST.</p>

<p>Given a tree, return the size of the largest tree/subtree that is a BST.</p>

A

<p>One way we can solve this problem is by using the solution to the "determine whether a tree is a valid binary search tree" problem with modifications. </p>

<p>Let's recap the properties of a valid BST. Each node of a BST has the following properties:</p>

<ul><li>A node's left subtree contains only nodes with keys less than the node's key.</li><li>A node's right subtree contains only nodes with keys greater than the node's key.</li><li>Both the left and right subtrees must be valid BSTs.</li></ul>

<p>Our solution to <code>is_bst()</code> looked like this:</p>

<pre><code>class TreeNode:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.key = key

def is_bst(root):
    def is_bst_helper(root, min_key, max_key):
        if root is None:
            return True
        if root.key <= min_key or root.key >= max_key:
            return False
        return is_bst_helper(root.left, min_key, root.key) and \
               is_bst_helper(root.right, root.key, max_key)

    return is_bst_helper(root, float('-inf'), float('inf'))
</code></pre>

<p>We can use this method at each node in our tree, starting at the leaves and returning the size of the tree upwards:</p>

<pre><code>def size(root):
    if root is None:
        return 0
    return size(root.left) + size(root.right) + 1

def largest_bst_subtree(root):
    def helper(root):
        # Returns a tuple of (size, root) of the largest BST subtree.
        if is_bst(root):
            return (size(root), root)
        return max(helper(root.left), helper(root.right), key=lambda x: x[0])

    return helper(root)[1]
</code></pre>

<p>The time complexity of this solution is <code>O(N^2)</code> in the worst case, since we are doing an <code>O(N)</code> traversal for each of the nodes in the tree.</p>

<p>We can improve upon this solution by using a single method to check the validity and find the size of a subtree. To do this, we can revisit the definition of a BST in our <code>is_bst()</code> method. Instead of passing the range of valid keys down to the children of the current node, we can return the range of valid keys up to the parent. At the current node, we check whether the key is <em>less than</em> the <code>max_key</code> of the left subtree or <em>greater than</em> the <code>min_key</code> of the right subtree. In this way, we can determine both the size and validity in a bottom-up fashion.</p>

<pre><code>def largest_bst_subtree(root):
    max_size = 0
    max_root = None

    def helper(root):
        # Returns a tuple of (size, min_key, max_key) of the subtree.
        nonlocal max_size
        nonlocal max_root
        if root is None:
            return (0, float('inf'), float('-inf'))
        left = helper(root.left)
        right = helper(root.right)
        if root.key > left[2] and root.key < right[1]:
            size = left[0] + right[0] + 1
            if size > max_size:
                max_size = size
                max_root = root
            return (size, min(root.key, left[1]), max(root.key, right[2]))
        else:
            return (0, float('-inf'), float('inf'))

    helper(root)
    return max_root
</code></pre>

<p>Our solution now has a worst-case time complexity of <code>O(N)</code>, where <code>N</code> is the number of nodes in the tree.</p>

94
Q

<p>This problem was asked by Google.</p>

<p>Given a binary tree of integers, find the maximum path sum between two nodes. The path must go through at least one node, and does not need to go through the root.</p>

A

<p>We can solve this problem recursively. There are three cases that the max-sum path can fall under:</p>

<ol><li>The path includes the root value</li><li>The path is the max-sum path of the left subtree</li><li>The path is the max-sum path of the right subtree</li></ol>

<p>Our algorithm will return the maximum of these three values. However, solving for the maximum sum alone in cases 2 and 3 does not solve case 1. In order for the root to join the left subtree's path, that path must end at the root of the left subtree, and the same applies to the right subtree. Thus, each recursive call returns not only the maximum path sum within the subtree, but also the maximum sum of a path that ends at the subtree's root. The base case is an empty tree, whose maximum sum is negative infinity and whose best root-ending path sum is zero.</p>

<pre><code>def max_path_sum(root):
    def helper(root):
        if root is None:
            return (float('-inf'), 0)

        left_max_sum, left_path = helper(root.left)
        right_max_sum, right_path = helper(root.right)
        # Calculate the maximum path through the root
        root_max_sum = max(0, left_path) + root.val + max(0, right_path)
        # Find the maximum path, including or excluding the root
        max_sum = max(left_max_sum, root_max_sum, right_max_sum)
        # Find the maximum path including and ending at the root
        root_path = max(left_path, right_path, 0) + root.val

        return (max_sum, root_path)

    # Return only the maximum path sum
    return helper(root)[0]
</code></pre>

<p>Since our algorithm is similar to a DFS search on a binary tree, the solution has a time complexity of <code>O(N)</code>, and uses up to <code>O(N)</code> space on the call stack. </p>

95
Q

<p>This problem was asked by Palantir.</p>

<p>Given a number represented by a list of digits, find the next greater permutation of a number, in terms of lexicographic ordering. If there is no greater permutation possible, return the permutation with the lowest value/ordering.</p>

<p>For example, the list <code>[1,2,3]</code> should return <code>[1,3,2]</code>. The list <code>[1,3,2]</code> should return <code>[2,1,3]</code>. The list <code>[3,2,1]</code> should return <code>[1,2,3]</code>.</p>

<p>Can you perform the operation without allocating extra memory (disregarding the input memory)?</p>

A

<p>The brute force approach to this problem would be to generate all permutations of the number/list. Once we have all of the permutations, we choose the one that comes immediately after our input list in lexicographic order (wrapping around to the smallest if the input is the largest). This approach would take <code>O(N!)</code> time to generate the permutations and find the successor of our given number. </p>
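
<p>For completeness, here is a sketch of that brute-force approach using the standard library's <code>itertools.permutations</code> (the function name is ours; fine for tiny inputs, but <code>O(N!)</code>):</p>

```python
from itertools import permutations

def next_permutation_brute(digits):
    # All distinct orderings, in lexicographic order.
    perms = sorted(set(permutations(digits)))
    idx = perms.index(tuple(digits))
    # Wrap around to the smallest permutation after the largest.
    return list(perms[(idx + 1) % len(perms)])
```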

<p>We can instead observe a pattern in how the next permutation is obtained. First, consider the case where the entire sequence is decreasing (e.g. <code>[3,2,1]</code>). Regardless of the values, we cannot generate a permutation with a greater value: if swapping two values <code>a</code> and <code>b</code> (with <code>b</code> after <code>a</code>) produced a higher number, then <code>b</code> would have to be greater than <code>a</code>; but since the sequence is decreasing, <code>b</code> must be less than or equal to <code>a</code> -- a contradiction. Next, we have to generate the lowest possible value from the sequence. By similar logic, the smallest number we can generate comes from an increasing sequence -- the reverse of the decreasing sequence. </p>

<p>Now, we can break down other problems in terms of this subproblem. We will start from the right side of the list, and try to find the smallest digit that we can swap and obtain a higher number. While the sublist is a decreasing sequence, we cannot get a higher permutation. Instead, we try adding to the sublist by appending the digit to the left. Once we get a digit that is less than the one to the right, we swap it with the smallest digit that is higher than it. Then, we must make sure the remaining digits to the right are in their lowest-value permutation. We can do this by ordering the remaining digits on the right from least-to-greatest. Since our swap preserves the fact that the right sublist is decreasing, we simply reverse it.
For example:</p>

<pre><code>[1, 3, 5, 4, 2]
             ^ decreasing sublist
[1, 3, 5, 4, 2]
          ^  ^ decreasing sublist
[1, 3, 5, 4, 2]
       ^     ^ decreasing sublist
[1, 3, 5, 4, 2]
    ^     ^ non-decreasing, so swap with next-highest digit
[1, 4, 5, 3, 2]
       ^     ^ reverse digits to the right, sorting least-to-greatest
[1, 4, 2, 3, 5]
</code></pre>

<pre><code>def next_permutation(nums):
    def swap(nums, a, b):
        # Perform an in-place swap
        nums[a], nums[b] = nums[b], nums[a]

    def reverse(nums, a, b):
        # Reverse elements at index a to b (inclusive) in-place
        nums[a:b+1] = reversed(nums[a:b+1])

    # Find the last index where nums[idx] < nums[idx + 1]
    pivot = len(nums) - 2
    while pivot >= 0 and nums[pivot] >= nums[pivot + 1]:
        pivot -= 1

    if pivot >= 0:
        # Find the right-most digit larger than the pivot to swap with
        successor = len(nums) - 1
        while successor > 0 and nums[successor] <= nums[pivot]:
            successor -= 1
        swap(nums, pivot, successor)

    reverse(nums, pivot + 1, len(nums) - 1)
</code></pre>

96
Q

<p>This problem was asked by Microsoft.</p>

<p>Given a number in the form of a list of digits, return all possible permutations.</p>

<p>For example, given <code>[1,2,3]</code>, return <code>[[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]]</code>.</p>

A

<p>There are a few ways to do this, and most solutions will have the same run-time. We will need to generate all <code>N!</code> permutations, so our algorithm will have <code>O(N!)</code> run time. </p>

<p>The most straightforward method is to use recursion. We can think of the problem in terms of subproblems, where we generate permutations of a sublist. A permutation of a single digit (e.g. <code>[1]</code>) is simply that digit. To get permutations of size <code>n</code>, we take each permutation of size <code>n-1</code> and insert the remaining digit at every position (index <code>0</code> to <code>n-1</code>). For example, one permutation of the sublist <code>[2,3]</code> is <code>[2,3]</code>. We insert <code>1</code> at three positions to obtain <code>[1,2,3]</code>, <code>[2,1,3]</code>, and <code>[2,3,1]</code>.</p>

<pre><code>def permute(nums):
    if len(nums) == 1:
        return [nums]

    output = []
    for l in permute(nums[1:]):
        for idx in range(len(nums)):
            output.append(l[:idx] + [nums[0]] + l[idx:])
    return output
</code></pre>

<p>An alternative way we can formulate the recursion is by fixing one position at a time: for each index, we swap each remaining digit into that position, recursively permute the rest of the list, and then swap back to restore the original order. Once we reach the last index, the list holds a complete permutation and we record a copy of it.</p>

<pre><code>def permute(nums):
    def helper(nums, index, output):
        if index == len(nums) - 1:
            output.append(nums.copy())
            return
        for i in range(index, len(nums)):
            nums[index], nums[i] = nums[i], nums[index]
            helper(nums, index + 1, output)
            nums[index], nums[i] = nums[i], nums[index]

    output = []
    helper(nums, 0, output)
    return output
</code></pre>

<p>Both solutions run in <code>O(N!)</code> time and space, where <code>N</code> is the size of the input list.</p>
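
<p>As a sanity check, either solution should produce the same set of permutations as the standard library's <code>itertools.permutations</code>:</p>

```python
from itertools import permutations

# itertools yields permutations in lexicographic order for sorted input.
expected = [list(p) for p in permutations([1, 2, 3])]
print(expected)
# [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
```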

97
Q

<p>This problem was asked by Stripe.</p>

<p>Write a map implementation with a get function that lets you retrieve the value of a key at a particular time.</p>

<p>It should contain the following methods:</p>

<ul><li><code>set(key, value, time)</code>: sets key to value for t = time.</li><li><code>get(key, time)</code>: gets the key at t = time.</li></ul>

<p>The map should work like this. If we set a key at a particular time, it will maintain that value forever
or until it gets set at a later time. In other words, when we get a key at time <code>t</code>, it should return the value
that was set for that key at the most recent time at or before <code>t</code>.</p>

<p>Consider the following examples:</p>

<pre><code>d.set(1, 1, 0) # set key 1 to value 1 at time 0
d.set(1, 2, 2) # set key 1 to value 2 at time 2
d.get(1, 1) # get key 1 at time 1 should be 1
d.get(1, 3) # get key 1 at time 3 should be 2
</code></pre>

<pre><code>d.set(1, 1, 5) # set key 1 to value 1 at time 5
d.get(1, 0) # get key 1 at time 0 should be null
d.get(1, 10) # get key 1 at time 10 should be 1
</code></pre>

<pre><code>d.set(1, 1, 0) # set key 1 to value 1 at time 0
d.set(1, 2, 0) # set key 1 to value 2 at time 0
d.get(1, 0) # get key 1 at time 0 should be 2
</code></pre>

A

<p>One possible way to solve this question is using a map of maps, where each key has its own map of time-value pairs. That would mean something like:</p>

<pre><code>{
    key: {
        time: value,
        time: value,
        ...
    },
    key: {
        time: value,
        time: value,
        ...
    },
    ...
}
</code></pre>

<p>Also, if a particular time does not exist in the time-value map, we must be able to get the value at the nearest previous time (or null if there isn't one). A sorted map would fit the bill, but the Python standard library doesn't have one. So, let's see how this map would look:</p>

<pre><code>import bisect

class TimeMap:
    def __init__(self):
        self.map = dict()
        self.sorted_keys_cache = None

    def get(self, key):
        value = self.map.get(key)
        if value is not None:
            return value
        if self.sorted_keys_cache is None:
            self.sorted_keys_cache = sorted(self.map.keys())
        i = bisect.bisect_left(self.sorted_keys_cache, key)
        if i == 0:
            return None
        return self.map.get(self.sorted_keys_cache[i - 1])

    def set(self, key, value):
        self.sorted_keys_cache = None
        self.map[key] = value
</code></pre>

<p>This is a map paired with a cached list of its sorted keys. To find the nearest previous time, we use the binary search provided by the <a>bisect</a> module.</p>

<p>Any write operation on this map wipes the sorted-keys cache, causing a full sort of the keys on the next <code>get</code> call, which with Python's <a>TimSort</a> costs <code>O(n log n)</code>.</p>
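
<p>Both the cached-keys map above and the array-based variant below rely on the same bisect-based "floor" lookup; in isolation, that primitive looks like this (the helper name is ours):</p>

```python
import bisect

def floor_key(sorted_keys, key):
    # Right-most stored key <= the query, or None if none exists.
    i = bisect.bisect_right(sorted_keys, key)
    return sorted_keys[i - 1] if i else None
```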

<p>For mixed workloads, a more suitable approach is to use arrays under the hood. Something like this:</p>

<pre><code>import bisect

class TimeMap:
    def __init__(self):
        self.keys = []
        self.values = []

    def get(self, key):
        if not self.keys:
            return None
        i = bisect.bisect_left(self.keys, key)
        if i == len(self.keys):
            return self.values[i - 1]
        elif self.keys[i] == key:
            return self.values[i]
        elif i == 0:
            return None
        else:
            return self.values[i - 1]

    def set(self, key, value):
        i = bisect.bisect_left(self.keys, key)
        if i == len(self.keys):
            self.keys.append(key)
            self.values.append(value)
        elif self.keys[i] == key:
            self.values[i] = value
        else:
            self.keys.insert(i, key)
            self.values.insert(i, value)
</code></pre>

<p>In this way, both <code>get</code> and <code>set</code> behave more predictably from a performance standpoint: each performs a binary search, and <code>set</code> additionally does up to two <code>O(n)</code> array insertions in the worst case.</p>

<p>The last missing piece to solve this question is the first-level map from each key to its own <code>TimeMap</code>; the code would look like this:</p>

<pre><code>from collections import defaultdict

class MultiTimeMap:
    def __init__(self):
        self.map = defaultdict(TimeMap)

    def set(self, key, value, time):
        self.map[key].set(time, value)

    def get(self, key, time):
        time_map = self.map.get(key)
        if time_map is None:
            return None
        return time_map.get(time)
</code></pre>

<p>Now each key can have its own TimeMap, initialized by <a>defaultdict</a> when needed.</p>

98
Q

<p>This problem was asked by Coursera.</p>

<p>Given a 2D board of characters and a word, find if the word exists in the grid.</p>

<p>The word can be constructed from letters of sequentially adjacent cells, where "adjacent" cells are those horizontally or vertically neighboring. The same letter cell may not be used more than once.</p>

<p>For example, given the following board:</p>

<pre><code>[
['A','B','C','E'],
['S','F','C','S'],
['A','D','E','E']
]
</code></pre>

<p><code>exists(board, "ABCCED")</code> returns <code>true</code>,
<code>exists(board, "SEE")</code> returns <code>true</code>,
<code>exists(board, "ABCB")</code> returns <code>false</code>.</p>

A

<p>We can view the provided board as a graph, where the characters are nodes, whose edges point to adjacent characters. We have the choice of performing a depth-first or breadth-first search. DFS is usually simpler to implement, so we will make that our choice. Also, since we can exit the search once we've reached the length of the word, the recursive depth will be limited by the length.</p>

<pre><code>def search(board, row, col, word, index, visited):
    def is_valid(board, row, col):
        return 0 <= row < len(board) and 0 <= col < len(board[0])

    if not is_valid(board, row, col):
        return False
    if (row, col) in visited:
        return False
    if board[row][col] != word[index]:
        return False
    if index == len(word) - 1:
        return True

    visited.add((row, col))

    for d in ((0, -1), (0, 1), (-1, 0), (1, 0)):
        if search(board, row + d[0], col + d[1], word, index + 1, visited):
            return True

    visited.remove((row, col))  # Backtrack

    return False

def find_word(board, word):
    M = len(board)
    N = len(board[0])

    for row in range(M):
        for col in range(N):
            visited = set()
            if search(board, row, col, word, 0, visited):
                return True
    return False
</code></pre>

<p>The worst-case time complexity of this solution is <code>O(MN * 4^L)</code> where <code>L</code> is the length of the word and M and N are the dimensions of the board.</p>

99
Q

<p>This problem was asked by Microsoft.</p>

<p>Given an unsorted array of integers, find the length of the longest consecutive elements sequence.</p>

<p>For example, given <code>[100, 4, 200, 1, 3, 2]</code>, the longest consecutive element sequence is <code>[1, 2, 3, 4]</code>. Return its length: <code>4</code>.</p>

<p>Your algorithm should run in <code>O(n)</code> complexity.</p>

A

<p>We can see that if the array of integers was sorted, we'd be able to find the
longest consecutive sequence fairly easily. First, we sort the array in
ascending order. Then, we traverse the array, keeping the count of the current
sequence. If the next element is one higher than the current element, then we
increment the count. Otherwise, we start the count over at 1. We simply return
the maximum count we've seen overall. The overall run-time of this solution
would be <code>O(n log n)</code>, since we have to sort the input array. Depending on the
sorting implementation and whether or not we are able to modify the array, the
space complexity would be <code>O(n)</code> or <code>O(1)</code>.</p>
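
<p>A sketch of this sorting-based approach (the function name is ours; duplicates are dropped first, since they don't extend a sequence):</p>

```python
def longest_consecutive_sorted(nums):
    if not nums:
        return 0
    nums = sorted(set(nums))  # duplicates don't extend a run
    best = count = 1
    for prev, cur in zip(nums, nums[1:]):
        # Extend the run if consecutive, otherwise start over at 1.
        count = count + 1 if cur == prev + 1 else 1
        best = max(best, count)
    return best
```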

<p>We can improve our solution by using extra space to cache the bounds of
sequences we've seen so far. For each new number, we look up the sequences
ending at <code>num - 1</code> and starting at <code>num + 1</code>, merge them with
<code>num</code> into one sequence, and store the combined bounds at the endpoints.
If we get a number we've seen already, we can ignore it, since none of the
bounds should change.</p>

<pre><code>def longest_consecutive(nums):
    max_len = 0
    bounds = dict()
    for num in nums:
        if num in bounds:
            continue
        left_bound, right_bound = num, num
        if num - 1 in bounds:
            left_bound = bounds[num - 1][0]
        if num + 1 in bounds:
            right_bound = bounds[num + 1][1]
        bounds[num] = left_bound, right_bound
        bounds[left_bound] = left_bound, right_bound
        bounds[right_bound] = left_bound, right_bound
        max_len = max(right_bound - left_bound + 1, max_len)

    return max_len
</code></pre>

<p>This solution has a time complexity of <code>O(N)</code>, and a space complexity of <code>O(N)</code>.</p>

100
Q

<p>This problem was asked by Google.</p>

<p>You are in an infinite 2D grid where you can move in any of the 8 directions:</p>

<pre><code>(x, y) to
(x + 1, y),
(x - 1, y),
(x, y + 1),
(x, y - 1),
(x - 1, y - 1),
(x + 1, y + 1),
(x - 1, y + 1),
(x + 1, y - 1)
</code></pre>

<p>You are given a sequence of points and the order in which you need to cover the points. Give the minimum number of steps in which you can achieve it. You start from the first point.</p>

<p>Example:</p>

<pre><code>Input: [(0, 0), (1, 1), (1, 2)]
Output: 2
</code></pre>

<p>It takes 1 step to move from <code>(0, 0)</code> to <code>(1, 1)</code>. It takes one more step to move from <code>(1, 1)</code> to <code>(1, 2)</code>.</p>

A

<p>We can see that the minimum number of steps is achieved by walking as many diagonal steps as possible, and then walking directly to the next point. If we were to walk vertically and then horizontally, it would take more steps. We can break the move into its diagonal and vertical/horizontal components by adding the minimum of the vertical and horizontal differences (the diagonal part) to the remaining distance.</p>

<pre><code>// X and Y co-ordinates of the points in order.
// Each point is represented by (X.get(i), Y.get(i))
public int coverPoints(ArrayList X, ArrayList Y) {
    int totalDistance = 0;
    for (int i = 1; i < X.size(); i++) {
        totalDistance += getDistance(X.get(i - 1), Y.get(i - 1), X.get(i), Y.get(i));
    }
    return totalDistance;
}

private int getDistance(int x1, int y1, int x2, int y2) {
    /* Get diagonal distance component */
    int dist1 = (int) Math.min(Math.abs(x2 - x1), Math.abs(y2 - y1));
    /* Get horizontal/vertical distance component */
    int dist2 = (int) Math.max(Math.abs(x2 - x1), Math.abs(y2 - y1)) - dist1;
    return dist1 + dist2;
}
</code></pre>

<p>Or, we can simply take the maximum of the vertical and horizontal distances -- this is the total distance.</p>

<pre><code>// X and Y co-ordinates of the points in order.
// Each point is represented by (X.get(i), Y.get(i))
public int coverPoints(ArrayList X, ArrayList Y) {
    int totalDistance = 0;
    for (int i = 1; i < X.size(); i++) {
        totalDistance += getDistance(X.get(i - 1), Y.get(i - 1), X.get(i), Y.get(i));
    }
    return totalDistance;
}

private int getDistance(int x1, int y1, int x2, int y2) {
    return (int) Math.max(Math.abs(x2 - x1), Math.abs(y2 - y1));
}
</code></pre>
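
<p>Since the rest of this deck uses Python, the same Chebyshev-distance idea can be sketched as (the function name is ours):</p>

```python
def cover_points(points):
    # A diagonal move covers one unit on both axes at once, so the
    # step count between consecutive points is the Chebyshev distance.
    return sum(
        max(abs(x2 - x1), abs(y2 - y1))
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )
```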