Facebook Sample Questions Flashcards
This is a sample of questions based on real-life ones from Facebook. This is not intended to be comprehensive (after all, the company intentionally switches up their questions to avoid memorization), but it should give you a good sense for the kind of concepts that are important to keep in mind.
Word Cloud Data
You want to build a word cloud, an infographic where the size of a word corresponds to how often it appears in the body of text.
To do this, you’ll need data. Write code that takes a long string and builds its word cloud data in a dictionary, where the keys are words and the values are the number of times the words occurred.
Think about capitalized words. For example, look at these sentences:
'After beating the eggs, Dana read the next step:'
'Add milk and eggs, then add flour and sugar.'
What do we want to do with “After”, “Dana”, and “add”? In this example, your final dictionary should include one “Add” or “add” with a value of 2. Make reasonable (not necessarily perfect) decisions about cases like “After” and “Dana”.
Assume the input will only contain words and standard punctuation.
In our solution, we make three decisions:
- We use a class. This allows us to tie our methods together, calling them on instances of our class instead of passing references.
- To handle duplicate words with different cases, we choose to make a word uppercase in our
dictionary
only if it is always uppercase in the original string. While this is a reasonable approach, it is imperfect (consider proper nouns that are also lowercase words, like “Bill” and “bill”). -
We build our own
split_words()
method instead of using a built-in one. This allows us to pass each word to ouradd_word_to_dictionary()
method as it was split, and to split words and eliminate punctuation in one iteration.
To split the words in the input string and populate a dictionary of the unique words to the number of times they occurred, we:
- Split words by spaces, em dashes, and ellipses—making sure to include hyphens surrounded by characters. We also include all apostrophes (which will handle contractions nicely but will break possessives into separate words).
- Populate the words in our dictionary as they are identified, checking if the word is already in our dictionary in its current case or another case.
If the input word is uppercase and there’s a lowercase version in the dictionary, we increment the lowercase version’s count. If the input word is lowercase and there’s an uppercase version in the dictionary, we “demote” the uppercase version by adding the lowercase version and giving it the uppercase version’s count.
Runtime and memory cost are both O(n)
Merge Sorted Arrays
In order to win the prize for most cookies sold, my friend Alice and I are going to merge our Girl Scout Cookies orders and enter as one unit.
Each order is represented by an “order id” (an integer).
We have our lists of orders sorted numerically already, in lists. Write a function to merge our lists of orders into one sorted list.
For example:
my_list = [3, 4, 6, 10, 11, 15]
alices_list = [1, 5, 8, 12, 14, 19]
# Prints [1, 3, 4, 5, 6, 8, 10, 11, 12, 14, 15, 19]
print(merge_lists(my_list, alices_list))
First, we allocate our answer list, getting its size by adding the size of my_list
and alices_list
.
We keep track of a current index in my_list
, a current index in alices_list
, and a current index in merged_list
. So at each step, there’s a “current item” in alices_list
and in my_list
. The smaller of those is the next one we add to the merged_list
!
But careful: we also need to account for the case where we exhaust one of our lists and there are still elements in the other. To handle this, we say that the current item in my_list
is the next item to add to merged_list
only if my_list
is not exhausted AND, either:
-
alices_list
is exhausted, or - the current item in
my_list
is less than the current item inalices_list
Complexity: O(n) time and O(n) additional space, where n is the number of items in the merged list.
Compress URL List
I’m making a search engine called MillionGazillion™.
I wrote a crawler that visits web pages, stores a few keywords in a database, and follows links to other web pages. I noticed that my crawler was wasting a lot of time visiting the same pages over and over, so I made a set, visited
, where I’m storing URLs I’ve already visited. Now the crawler only visits a URL if it hasn’t already been visited.
Thing is, the crawler is running on my old desktop computer in my parents’ basement (where I totally don’t live anymore), and it keeps running out of memory because visited
is getting so huge.
How can I trim down the amount of space taken up by visited
?
We can use a trie. If you’ve never heard of a trie, think of it this way:
Let’s make visited a nested dictionary where each map has keys of just one character. So we would store ‘google.com’ as visited['g']['o']['o']['g']['l']['e']['.']['c']['o']['m']['*'] = True
.
The ‘*’ at the end means ‘this is the end of an entry’. Otherwise we wouldn’t know what parts of visited are real URLs and which parts are just prefixes. In the example above, ‘google.co’ is a prefix that we might think is a visited URL if we didn’t have some way to mark ‘this is the end of an entry.’
Now when we go to add ‘google.com/maps’ to visited, we only have to add the characters ‘/maps’, because the ‘google.com’ prefix is already there. Same with ‘google.com/about/jobs’.
We can visualize this as a tree, where each character in a string corresponds to a node.
A trie is a type of tree.
To check if a string is in the trie, we just descend from the root of the tree to a leaf, checking for a node in the tree corresponding to each character in the string.
How could we implement this structure? There are lots of ways! We could use nested dictionaries, nodes and pointers, or some combination of the two. Evaluating the pros and cons of different options and choosing one is a great thing to do in a programming interview.
In our implementation, we chose to use nested dictionaries. To determine if a given site has been visited, we just call add_word()
, which adds a word to the trie if it’s not already there.
If you used a ternary search tree, that’s a great answer too. A bloom filter also works—specially if you use run-length encoding.
Complexity: How much space does this save?
So for all URLs of length n or fewer, we have 26^n + 26^{(n-1)} + … + 26^1
This is O(26^n). We’ve shaved off a factor of n.
Bonus trivia: although the HTTP spec allows for unlimited URL length, in practice many web browsers won’t support URLs over 2,000 characters.
BST Checker
Write a function to check that a binary tree is a valid binary search tree.
Here’s a sample binary tree node class:
class BinaryTreeNode(object):
` def __init__(self, value):`
` self.value = value`
` self.left = None`
` self.right = None`
` def insert_left(self, value):`
` self.left = BinaryTreeNode(value)`
` return self.left`
` def insert_right(self, value):`
` self.right = BinaryTreeNode(value)`
` return self.right`
We do a depth-first walk through the tree, testing each node for validity as we go. If a node appears in the left subtree of an ancestor, it must be less than that ancestor. If a node appears in the right subtree of an ancestor, it must be greater than that ancestor.
Instead of keeping track of every ancestor to check these inequalities, we just check the largest number it must be greater than (its lower_bound
) and the smallest number it must be less than (its upper_bound
).
Complexity: O(n) time and O(n) space.
Linked List Cycles
You have a singly-linked list and want to check if it contains a cycle.
A singly-linked list is built with nodes, where each node has:
node.next
—the next node in the list.
node.value
—the data held in the node. For example, if our linked list stores people in line at the movies, node.value might be the person’s name.
For example:
class LinkedListNode(object):
` def __init__(self, value):`
` self.value = value`
` self.next = None`
A cycle occurs when a node’s next points back to a previous node in the list. The linked list is no longer linear with a beginning and end—instead, it cycles through a loop of nodes.
Write a function contains_cycle()
that takes the first node in a singly-linked list and returns a boolean indicating whether the list contains a cycle.
We keep two pointers to nodes (we’ll call these “runners”), both starting at the first node. Every time slow_runner
moves one node ahead, fast_runner
moves two nodes ahead.
If the linked list has a cycle, fast_runner
will “lap” (catch up with) slow_runner
, and they will momentarily equal each other.
If the list does not have a cycle, fast_runner
will reach the end.
Complexity: O(n) time and O(1) space.
Nth Fibonacci
Write a function fib()
that takes an integer n and returns the nth Fibonacci number.
Let’s say our Fibonacci series is 0-indexed and starts with 0. So:
fib(0) # => 0
fib(1) # => 1
fib(2) # => 1
fib(3) # => 2
fib(4) # => 3
We use a bottom-up approach, starting with the 0th Fibonacci number and iteratively computing subsequent numbers until we get to n.
Complexity: O(n) time and O(1) space.
Find Rotation Point
I want to learn some big words so people think I’m smart.
I opened up a dictionary to a page in the middle and started flipping through, looking for words I didn’t know. I put each word I didn’t know at increasing indices in a huge list I created in memory. When I reached the end of the dictionary, I started from the beginning and did the same thing until I reached the page I started at.
Now I have a list of words that are mostly alphabetical, except they start somewhere in the middle of the alphabet, reach the end, and then start from the beginning of the alphabet. In other words, this is an alphabetically ordered list that has been “rotated.”
For example:
words = [ ‘ptolemaic’,
‘retrograde’,
‘supplant’,
‘undulate’,
‘xenoepist’,
‘asymptote’, # <– rotates here!
‘babka’,
‘banoffee’,
‘engender’,
‘karpatka’,
‘othellolagkage’, ]
Write a function for finding the index of the “rotation point,” which is where I started working from the beginning of the dictionary. This list is huge (there are lots of words I don’t know) so we want to be efficient here.
This is a modified version of binary search. At each iteration, we go right if the item we’re looking at is greater than the first item and we go left if the item we’re looking at is less than the first item.
We keep track of the lower and upper bounds on the rotation point, calling them floor_index
and ceiling_index
(initially we called them “floor” and “ceiling,” but because we didn’t imply the type in the name we got confused and created bugs). When floor_index
and ceiling_index
are directly next to each other, we know the floor is the last item we added before starting from the beginning of the dictionary, and the ceiling is the first item we added after.
Complexity: Each time we go through the while loop, we cut our range of indices in half, just like binary search. So we have O(lg n) loop iterations.
Implement a Queue with Two Stacks
Implement a queue with 2 stacks. Your queue should have an enqueue and a dequeue method and it should be “first in first out” (FIFO).
Optimize for the time cost of mm calls on your queue. These can be any mix of enqueue and dequeue calls.
Assume you already have a stack implementation and it gives O(1) time push and pop.
Let’s call our stacks in_stack
and out_stack
.
For enqueue, we simply push the enqueued item onto in_stack
.
For dequeue on an empty out_stack
, the oldest item is at the bottom of in_stack
. So we dig to the bottom of in_stack
by pushing each item one-by-one onto out_stack
until we reach the bottom item, which we return.
After moving everything from in_stack
to out_stack
, the item that was enqueued the 2nd longest ago (after the item we just returned) is at the top of out_stack
, the item enqueued 3rd longest ago is just below it, etc. So to dequeue on a non-empty out_stack
, we simply return the top item from out_stack
.
Product of All other Numbers
You have a list of integers, and for each index you want to find the product of every integer except the integer at that index.
Write a function get_products_of_all_ints_except_at_index()
that takes a list of integers and returns a list of the products.
For example, given:
[1, 7, 3, 4]
your function would return:
[84, 12, 28, 21]
by calculating:
[7 * 3 * 4, 1 * 3 * 4, 1 * 7 * 4, 1 * 7 * 3]
Here’s the catch: You can’t use division in your solution!
To find the products of all the integers except the integer at each index, we’ll go through our list greedily twice. First we get the products of all the integers before each index, and then we go backwards to get the products of all the integers after each index.
When we multiply all the products before and after each index, we get our answer—the products of all the integers except the integer at each index!
Complexity: O(n) time and O(n) space. We make two passes through our input a list, and the list we build always has the same length as the input list.
Permutation Palindrome
Write an efficient function that checks whether any permutation of an input string is a palindrome.
You can assume the input string only contains lowercase letters.
Examples:
- “civic” should return True
- “ivicc” should return True
- “civil” should return False
- “livci” should return False
“But ‘ivicc’ isn’t a palindrome!” If you had this thought, read the question again carefully. We’re asking if any permutation of the string is a palindrome. Spend some extra time ensuring you fully understand the question before starting. Jumping in with a flawed understanding of the problem doesn’t look good in an interview.
Our approach is to check that each character appears an even number of times, allowing for only one character to appear an odd number of times (a middle character). This is enough to determine if a permutation of the input string is a palindrome.
We iterate through each character in the input string, keeping track of the characters we’ve seen an odd number of times using a set unpaired_characters
.
As we iterate through the characters in the input string:
- If the character is not in
unpaired_characters
, we add it. - If the character is already in
unpaired_characters
, we remove it.
Finally, we just need to check if less than two characters don’t have a pair.
Complexity: O(n) time, since we’re making one iteration through the nn characters in the string. Our unpaired_characters
set is the only thing taking up non-constant space.
Cafe Order Checker
My cake shop is so popular, I’m adding some tables and hiring wait staff so folks can have a cute sit-down cake-eating experience.
I have two registers: one for take-out orders, and the other for the other folks eating inside the cafe. All the customer orders get combined into one list for the kitchen, where they should be handled first-come, first-served.
Recently, some customers have been complaining that people who placed orders after them are getting their food first. Yikes—that’s not good for business!
To investigate their claims, one afternoon I sat behind the registers with my laptop and recorded:
- The take-out orders as they were entered into the system and given to the kitchen. (
take_out_orders
) - The dine-in orders as they were entered into the system and given to the kitchen. (
dine_in_orders
) - Each customer order (from either register) as it was finished by the kitchen. (
served_orders
)
Given all three lists, write a function to check that my service is first-come, first-served. All food should come out in the same order customers requested it.
We’ll represent each customer order as a unique integer.
As an example,
Take Out Orders: [1, 3, 5] Dine In Orders: [2, 4, 6] Served Orders: [1, 2, 4, 6, 5, 3]
would not be first-come, first-served, since order 3 was requested before order 5 but order 5 was served first.
But,
Take Out Orders: [1, 3, 5] Dine In Orders: [2, 4, 6] Served Orders: [1, 2, 3, 5, 4, 6]
would be first-come, first-served.
We walk through served_orders, seeing if each customer order so far matches a customer order from one of the two registers. To check this, we:
- Keep pointers to the current index in
take_out_orders
,dine_in_orders
, andserved_orders
. - Walk through
served_orders
from beginning to end. - If the current order in
served_orders
is the same as the current customer order intake_out_orders
, incrementtake_out_orders_index
and keep going. This can be thought of as “checking off” the current customer order intake_out_orders
andserved_orders
, reducing the problem to the remaining customer orders in the lists. - Same as above with
dine_in_orders
. - If the current order isn’t the same as the customer order at the front of
take_out_orders
ordine_in_orders
, we know something’s gone wrong and we’re not serving food first-come, first-served. - If we make it all the way to the end of
served_orders
, we’ll check that we’ve reached the end oftake_out_orders
anddine_in_orders
. If every customer order checks out, that means we’re serving food first-come, first-served.
Complexity: O(n) time and O(1) additional space.