Coding Interview Patterns Flashcards
Sliding Window
used to process sequential data, arrays, and strings
It involves maintaining a dynamic window that slides through the array or string, adjusting its boundaries as needed to track relevant elements or characters
examples of problems that can be solved with “Sliding Window”
- Maximum sum subarray of size K
- Longest substring without repeating characters
Does your problem match “Sliding Window”?
- Contiguous data
- Processing subsets of elements
- Efficient computation time complexity
problems in the real world that use the sliding window pattern
- Find the maximum number of users connected to a cellular network’s base station in every k-millisecond sliding window.
- Given a stream of numbers representing the number of buffering events in a given user session, calculate the median number of buffering events in each one-minute interval.
- Given the lists of topics that two users have posted about, find the shortest sequence of posts by one user that includes all the topics that the other user has posted about.
The DNA sequence
Given a string, s, that represents a DNA subsequence,
and a
number k, return all the contiguous subsequences (substrings) of
length k
that occur more than once in the string.
The order of the returned subsequences does not matter.
If no repeated substring is found, the function should return an empty set.
Repeated DNA Sequences
(Select all that apply.) What is the output if the following string and value of 𝑘 are given as input?
s = “AGAGCTCCAGAGCTTG”
k = 6
AGAGCT
Solution: Repeated DNA Sequences
Statement#
Given a string, s, that represents a DNA subsequence, and a number k, return all the contiguous subsequences (substrings) of length k
that occur more than once in the string. The order of the returned subsequences does not matter. If no repeated substring is found, the function should return an empty set.
The DNA sequence is composed of a series of nucleotides abbreviated as 𝐴, C , G , and T
Constraints: 1 1 ≤ s.length ≤ 1 0 3 10 3 s[i] is either A, C, G, or T. 1 ≤ 𝑘 ≤ 10
To recap, the solution to this problem can be divided into the following six main parts:
Iterate over all
𝑘
k
-length substrings.
Compute the hash value for the contents of the window.
Add this hash value to the set that keeps track of the hashes of all substrings of the given length.
Move the window one step forward and compute the hash of the new window using the rolling hash method.
If the hash value of a window has already been seen, the sequence in this window is repeated, so we add it to the output set.
Once all substrings have been evaluated, return the output set.
Average case
The average case time complexity of this solution is O(n), where n
is the length of the input string. It is calculated as follows:
Time taken to populate the numbers array: O(n)
.
Time taken to traverse all the k
-length substrings: O(n−k+1)
.
Time taken to calculate the hash value of a k -length substring: O(1)
.
So, the dominating time omplexity becomes O(n)
.
Worst case
To understand the worst case time complexity of this solution, consider the input string “AAAAAAAA” with
k=2
This combination of inputs ensures that a repeated sequence “AA” is detected and added to the output each time the window slides forward. Therefore, we must generate a k-length substring on each (n−k+1) iteration of the loop. The time to generate a k -length substring is O(k)
. Therefore, the overall time complexity becomes O((n−k)×k)
.
Space complexity
The space complexity of this solution is O(n). It is calculated as follows:
Space occupied by the mapping hash map: O(1)
.
Space occupied by the numbers array: O(n)
.
Space occupied by the hashSet set: O(n−k+1)
.
So, the dominating space complexity becomes O(n)
.