Gullu - Skip Lists Flashcards
Skip Lists
Probabilistic Data Structure. Based on linked lists but fixing no random access problem.
In expectation, the skip list performance is O(log2n). Not as strong as a normal bound but good certainty
Overview of search in skip list
- Let L = Topmost list
- Move right until we find the value or a greater value than we are looking for
- If value is greater, move to lower list and continue traversing there.
Maximum steps of 2 lists L1 and L2
- Worst Case: Go through L1, Go down to L2 and go through L1/L2 steps, in total we have L1+L2/L1 steps
- L2 is n = number of elements in data structure.
- To minimise this we can choose the size of L1 since n is not a choice
- This formula can be smaller: L1+n/L1 either when L1 is smaller or larger, we need balance. L1 = sqrt(n)
Maintain Ratio in Skip List
Use coin toss 50/50 to promote elements. Keeps time O(log2n)
First element in list (coin toss)
Given a value of negative infinity so it starts from the top and list always has a point to start from
Deletion
Trivial. If you want to delete something just find it and remove it
Theorem Skip List
With high probability, every search element in a n element skip list costs O(log2n).
If we have a constant c then no matter what c is; c*O(log2n) is still O(log2n)
WHP
An event occurs whp if for any number a>=1, the event occurs with a probability at least 1-O(1/n^a)
Search Backwards
- Start from result
- If we move up we got heads, if not we move left meaning tails
- We finish at top left negative infinity
- Think of any search string as “HTTHHTTHTHTTTH”
Statistics
Recall that object notation (n choose r) denotes number of ways we can choose r objects from a set containing n distinct elements. Also choose r = n!/r!(n-r)!
So if we flip a coin 4 times, in how many ways can we get 1 heads = HTTT, THTT, TTHT, TTTH.
Important Info
- Search is performed from top left to bottom right
- Can visualize search in reverse
Indexable Skip Lists
Skip Lists do insertions and deletions in O(log2n) time but traversals still take O(n) time, this can be fixed by storing the width of a link for every link. This way we can get O(log2n) time performance to look up an index.