Data structures Flashcards
List
ordered collection of nodes/items/elements
key property: ordering
operations:
push: adds element to end of list
pop: removes last element of the list
shift: removes first element of the list
unshift: adds an element to the front of the list
insert: adds an element at a given position
remove: removes the element at a given position
iterate: returns the items in order
also: sorting, searching, copying, joining, splitting
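The operations on this card can be sketched with Python's built-in list (the card's names push/pop/shift/unshift are mapped onto Python equivalents; the values are illustrative):

```python
# Sketch: the card's list operations mapped onto Python's built-in list.
items = [10, 20, 30]

items.append(40)        # push: add to end of list
last = items.pop()      # pop: remove last element (last == 40)
first = items.pop(0)    # shift: remove first element (first == 10)
items.insert(0, 5)      # unshift: add element to the front
items.insert(2, 25)     # insert: add element at a given position
del items[1]            # remove: remove element at a given position

for x in items:         # iterate: returns the items in order
    print(x)            # prints 5, 25, 30
```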
Stack
LIFO (last-in first-out) list
objects added (pushed) and removed (popped) from the same end of the list
use array or linked-list
linked list: add and remove in constant time
array: use int sp to track index at top of stack
resizing array: double the capacity of the stack when it is full / halve the capacity when it is only a quarter full
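A minimal sketch of the resizing-array stack described above (class and field names are illustrative; `sp` tracks the top of the stack as on the card):

```python
# Sketch of a stack on a resizing array: double when full, halve at quarter full.
class ArrayStack:
    def __init__(self):
        self.capacity = 1
        self.array = [None] * self.capacity
        self.sp = 0                        # index of the next free slot (top of stack)

    def push(self, item):
        if self.sp == self.capacity:       # full: double the capacity
            self._resize(2 * self.capacity)
        self.array[self.sp] = item
        self.sp += 1

    def pop(self):
        self.sp -= 1
        item = self.array[self.sp]
        self.array[self.sp] = None
        if 0 < self.sp <= self.capacity // 4:   # only a quarter full: halve
            self._resize(self.capacity // 2)
        return item

    def _resize(self, new_capacity):
        new_array = [None] * new_capacity  # copy all existing elements across
        new_array[:self.sp] = self.array[:self.sp]
        self.array = new_array
        self.capacity = new_capacity
```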
Amortisation / Performance of push (stack)
geometric series:
T(N) = Nc + (4c + 8c + ··· + ((N − 1)/2)c + (N − 1)c)
first Nc is the cost of writing N elements
rest is for copying to new arrays
time for copying is 2Nc - 6c (geometric series)
T(N) = 3Nc - 6c for this N
T(N) ≤ 3Nc - 6c for all N, so
T(N) = O(N) for N pushes
a push takes O(N)/N = Θ(1) time, on average
push time is AMORTISED Θ(1) (NOT same as Θ(1))
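The bound above can be checked with a small simulation, assuming unit cost c per element written (the function name is illustrative):

```python
# Sketch: count element writes (cost c each) for N pushes into a doubling array.
def total_writes(n):
    writes = 0
    capacity = 1
    size = 0
    for _ in range(n):
        if size == capacity:   # full: double, copying `size` elements across
            writes += size
            capacity *= 2
        writes += 1            # write the newly pushed element
        size += 1
    return writes

# the total stays below 3N, so each push is amortised Theta(1)
for n in (10, 100, 1000):
    assert total_writes(n) <= 3 * n
```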
Amortisation
amortized analysis considers a sequence of operations
cost of expensive operations is ‘amortized’ across the sequence…
can never “be in debt”
pick a value to ‘pay’ per operation - show that this value covers all costs (never in debt)
Resizable array
compared with normal bounded-size array:
average insert time Θ(1) for both
actual time for inserting objects is higher (3c v c)
some very expensive inserts as N increases
keep benefits over linked list: random access, easy implementation
Queue
a list
next object removed
- earliest one added (FIFO)
- one with highest priority (Priority Queue)
does not support indexed access
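A FIFO queue sketch using Python's `collections.deque`, which gives constant-time adds and removes at both ends:

```python
from collections import deque

# Sketch: FIFO queue; the next object removed is the earliest one added.
q = deque()
q.append("a")               # enqueue at the back
q.append("b")
q.append("c")
assert q.popleft() == "a"   # dequeue from the front
assert q.popleft() == "b"
```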
Priority Queue
objects have keys
highest/lowest key is always at front
does not support indexed access
key property: maintain order within each branch
highest or lowest key at the root
add: O(N)
choose a short branch to keep the longest branch as short as possible; start at the bottom of the branch and work up towards the root
search: O(N)
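In Python a priority queue is usually built on `heapq`, a binary min-heap, so the lowest key is always at the front (the keys/labels below are illustrative):

```python
import heapq

# Sketch: priority queue via heapq (min-heap: lowest key always at the front).
pq = []
heapq.heappush(pq, (3, "low"))
heapq.heappush(pq, (1, "urgent"))
heapq.heappush(pq, (2, "medium"))

# objects come out in key order, not insertion order
assert heapq.heappop(pq) == (1, "urgent")
assert heapq.heappop(pq) == (2, "medium")
```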
Heap
a tree in an array
when adding: choose shortest branch
restore the shape then restore the order
always remove object at the root
restore shape then restore order
Binary heap
O(log2 N) for add and remove one object
height is Θ(log2 N)
time limited by the height of the heap
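A sketch of the binary heap described on the last two cards, stored in an array (0-based, so the children of index i are 2i+1 and 2i+2; class and method names are illustrative). Add restores shape first (append at the next free position) then order (sift up); remove takes the root, restores shape with the last element, then sifts down:

```python
# Sketch of a binary min-heap in an array (children of i at 2i+1 and 2i+2).
class MinHeap:
    def __init__(self):
        self.a = []

    def add(self, key):
        self.a.append(key)            # restore shape: next free position
        i = len(self.a) - 1           # restore order: sift up towards the root
        while i > 0 and self.a[(i - 1) // 2] > self.a[i]:
            parent = (i - 1) // 2
            self.a[i], self.a[parent] = self.a[parent], self.a[i]
            i = parent

    def remove(self):
        root = self.a[0]              # always remove the object at the root
        last = self.a.pop()
        if self.a:
            self.a[0] = last          # restore shape with the last element
            i, n = 0, len(self.a)     # restore order: sift down
            while True:
                smallest = i
                for child in (2 * i + 1, 2 * i + 2):
                    if child < n and self.a[child] < self.a[smallest]:
                        smallest = child
                if smallest == i:
                    break
                self.a[i], self.a[smallest] = self.a[smallest], self.a[i]
                i = smallest
        return root
```

Both loops move along one branch, so the work is bounded by the height, Θ(log2 N).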
Set
unordered collection of objects each having a unique key
put (Add)
retrieve (Get)
by unique key
key is the id for the data; keys must be comparable, so < and = must be defined
Binary Search Tree
T.root is a pointer to the root node
at every node n in the tree..
n.left points to the root node of n’s left subtree
n.right points to the root node of n’s right subtree
can access key found at n, e.g. n.data.key
add: new key is always added as a leaf node - start at the root, move down, comparing against each key
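The add operation above can be sketched as follows (node fields match the card; duplicate handling is an assumption of this sketch):

```python
# Sketch of a BST: the new key is always added as a leaf.
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None     # root of the left subtree
        self.right = None    # root of the right subtree

def add(root, key):
    if root is None:                 # reached a leaf position: add here
        return Node(key)
    if key < root.key:               # compare against each key, move down
        root.left = add(root.left, key)
    elif key > root.key:
        root.right = add(root.right, key)
    return root                      # duplicate keys are ignored in this sketch

def in_order(root):
    # left subtree, node, right subtree: yields the keys in sorted order
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)
```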
Red-black tree
extension of binary search trees
RBTs are self-balancing BSTs
(BST search is O(N) in the worst case)
red-black tree, 5 properties
- all nodes (including NILs) are either red or black
- the root node is black
- every leaf (all NIL) is black
- both children of a red node are black
- all paths from a node to a descendant leaf contain the same number of black nodes
have a limited height: O(log2 N)
put is much more complicated
new node is always coloured red; recolouring moves the problem closer to the root
restoration procedure works up from the leaf towards the root
at each node it performs a constant, Θ(1), amount of work
get: O(log2 N)
put: O(log2 N) time
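The full put fixup is too long to show here, but the five properties can be checked directly. A sketch, assuming NIL leaves are represented as None and counted as black (names are illustrative): it verifies property 4 (both children of a red node are black) and property 5 (equal black counts on every root-to-leaf path) in one pass:

```python
# Sketch: check red-black properties 4 and 5; NIL leaves are None (black).
RED, BLACK = "red", "black"

class RBNode:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color, self.left, self.right = key, color, left, right

def black_height(node):
    # returns the black-height of the subtree, or -1 if a property is violated
    if node is None:
        return 1                       # NIL leaf counts as one black node
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                return -1              # property 4 violated: red node, red child
    lh, rh = black_height(node.left), black_height(node.right)
    if lh == -1 or rh == -1 or lh != rh:
        return -1                      # property 5 violated: unequal black counts
    return lh + (1 if node.color == BLACK else 0)

# a small valid tree: black 2 with red children 1 and 3
tree = RBNode(2, BLACK, RBNode(1, RED), RBNode(3, RED))
assert black_height(tree) == 2
```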
Hash table
index data by key (indirectly)
like an array with m positions
hash function h to turn key into index
deterministic: key k always has same result
an object with key k is stored at T[h(k)]
time taken by h depends only on k
general set of possible keys is large if not unlimited
limitations, hash table does not support:
in order iteration
next key/object
minimum key
maximum key
because objects are stored in random order (by design)
Perfect hashing
every key is mapped to a different index, i.e. no collisions
Uniform distribution (hashing)
if keys are not uniformly distributed, hashing can be very poor
Collisions
hashing: all objects whose key hashes to h(k) are placed into a linked list
table contains a pointer to the list
Chaining
resolve collisions by keeping a chain (linked list) of objects at each location
put: Θ(1) time
unaffected by uniformity of hashing
get: O(N) worst case (all elements in one chain)
expected time is Θ(N/m)
N/m is called the load factor
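A sketch of chaining, using Python lists as the chains (class and method names are illustrative; a fixed m is assumed, with no resizing):

```python
# Sketch of a hash table with chaining: m buckets, one chain per bucket.
class ChainedHashTable:
    def __init__(self, m=8):
        self.m = m
        self.buckets = [[] for _ in range(m)]

    def _index(self, key):
        return hash(key) % self.m               # deterministic: same key, same bucket

    def put(self, key, value):
        chain = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)         # replace existing key
                return
        chain.append((key, value))              # Theta(1) when the key is new

    def get(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:                        # O(chain length), expected Theta(N/m)
                return v
        return None
```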
SUHA
expected time for an unsuccessful search for key k in a hash table with m slots, containing N objects
expected length of chain: N/m
probability of needing to search chain at T[i] is 1/m
expected number of comparisons (including loop termination) is 1 + N/m
expected time complexity: Θ(N/m)
reasoning similar for successful get Θ(N/m)
if m is proportional to N, expected time for GET is amortized Θ(1)
design of table needs to ensure N/m is Θ(1)
as N increases, table needs to be resized (geometrically) - number of buckets is increased to keep N/m below some constant
all existing data has to be rehashed
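The geometric resizing and rehashing described above can be sketched as follows (the growth policy N/m ≤ 1 and the helper name are assumptions of this sketch):

```python
# Sketch: geometric resizing of a chained table; all existing data is rehashed
# into the new buckets, keeping the load factor N/m below a constant (here 1).
def rehash(buckets, new_m):
    new_buckets = [[] for _ in range(new_m)]
    for chain in buckets:                     # every stored object moves
        for key, value in chain:
            new_buckets[hash(key) % new_m].append((key, value))
    return new_buckets

m, n = 4, 0
buckets = [[] for _ in range(m)]
for key in range(20):
    if n + 1 > m:                             # keep N/m <= 1: double m and rehash
        m *= 2
        buckets = rehash(buckets, m)
    buckets[hash(key) % m].append((key, key * key))
    n += 1
```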
Probing
probe until find an empty position
simplest is linear probing: consecutive slots are probed beginning at h(k), wrapping round to h(k) − 1
Uniform Hashing: implies every permutation is possible
a hash function h produces uniform hashing for a table with m buckets if, for each possible key k, the probability that ⟨h(k, 0), . . . , h(k, m − 1)⟩ is some permutation p of ⟨0,…,m−1⟩ is the same for all such p
linear probing DOES NOT produce uniform hashing
assuming uniform hashing, the expected number of keys compared when inserting an object depends on the load factor N/m
each probe is to a random bucket, which is occupied with probability N/m
if m is proportional to N, expected time for Put and Get is amortized constant
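A sketch of open addressing with linear probing (class and method names are illustrative; deletion, which needs tombstones, is omitted):

```python
# Sketch of open addressing with linear probing (no deletion handling).
class ProbingHashTable:
    def __init__(self, m=8):
        self.m = m
        self.slots = [None] * m               # each slot holds one (key, value)

    def put(self, key, value):
        for i in range(self.m):
            j = (hash(key) + i) % self.m      # probe h(k), h(k)+1, ... wrapping round
            if self.slots[j] is None or self.slots[j][0] == key:
                self.slots[j] = (key, value)
                return
        raise RuntimeError("table full")      # a real table would resize first

    def get(self, key):
        for i in range(self.m):
            j = (hash(key) + i) % self.m
            if self.slots[j] is None:
                return None                   # empty slot: key cannot be further on
            if self.slots[j][0] == key:
                return self.slots[j][1]
        return None
```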
Simple Uniform Hashing Assumption
With reference to chaining
Given a hash table T with m buckets, using hash function h, the simple uniform hashing assumption (SUHA) states that each different key k is equally likely to hash into any of the buckets. So the probability that h(k) = i, for all 1 ≤ i ≤ m, is 1/m
most probable / expected outcome: same number in each bucket