Data Structures 1 Flashcards
What is the goal of Hashing?
Do faster than O(LogN) time complexity for: lookup, insert, and remove operations. To achieve O(1)
What kind of Collection is Hashing?
value-orientated.
How does Hashing work?
- you have a key for the item.2. the item’s key gets churned within the hash function to form the Hash index.3. The hash index can be applied to the data array, and so, the specific data is found.
hash Table:
An array that stores a collection of items.
Table Size(TS)
The Array’s Length
Load Factor (LF)
Number of items/Table size. For instance, a load factor of 1 = 100% of the items are used.
Key
Information in items that is used to determine where the item goes into the table.
Hash Function
A function that takes in the key to compute a specific Hash Index.
What would the Perfect Hash Function be?
Each Key maps to an unique Hash Index.
How do you delete a value within the hash table?
You just set Table[hash(Key)] = null
How do you look up a value within the hash table?
return Table[Hash(key)];
How do you insert a value within the hash table?
Table[Hash(key)]=data;
What is the worst case time complexity for: Insert, lookup, and delete, for hash functions?
O(1)
If we have UW-Madison student ID’s, and we wanted the ideal hash functions, how would we do it, and why would there be a problem
-> We’d simply count each one as an index-> Hash table would be huge.
Collisions:
When the Hash Function returns the same index for different keys.
Good Hash Function qualities:
- Must be deterministic:-> Key must ALWAYS generate the same Hash Index (excluding rehashing).2. Must achieve uniformity-> Keys should be distributed evenly across hash table.3. FAST/EASY to compute-> only use parts of the key that DISTINGUISH THE ITEMS FROM EACH OTHER4. Minimize collisions:
Extraction:
Breaking keys into parts and using the parts that uniquely identify with the item. 379452 = 394121267 = 112
Folding:
Combining parts of the key using operations like + and bitwise operations such as exclusive or.Key: 123456789 123 456 789 — 1368 ( 1 is discarded)
Weighting:
Emphasizing some parts of the key over another.
Compressing:
Ensuring the hash code is a valid index for the table size.
The more items a table can hold, the () likely a collision will happen.
less
Load Factor
Approximately how it’s full… 0.7-0.8.
Why do we use prime numbers for table size?
We mod often, and prime numbers give us the most unique numbers. (2*ts+1)
Steps to resizing:
- Double table size to nearest prime number2. Re-hash items from old table into the new table.
Rehashing Complexity:
O(N)– costly. Carefully select initial TS to avoid re-hashing.
Collesion handeling:
How you handle the collisions so each element in the hittable stores only one item.
Idea of probing:
If you have a collision, search somewhere else on the table.
Linear Probing:
Step size is 1. Find the index, and keep incrementing by one until you find a free space.
Quadratic Probing:
Probe Sequence is (Hk+1)^2.Minimizes clusteringbetter at distinguishing items across table.
Probe Hashing:
-> Hash it, and if it leads to a collision, use a separate equation to determine the step size and use that step size to find a new site.
Collision Hashing using Buckets
-Each element can solre than one item.-throw collisions into a bucket.-buckets aren’t sorted.
Array Bucket
– a bucket of arrays. -Fixed in size.-size of about 3 work usually well.
Chained bucket:
+–easy to implement+– buckets can’t overfill+– buckets won’t waste time.+– buckets are dynamically sized.
Tree Buckets
+–WC = O(logN)+–no wasting space+–dynamically sized– more complicated than what’s needed. –> insert with dups= O(1)–> W/o dups = O(N)
HashCode Method:
-method of OBJECT class-Returns an int-default hash code is BAD– computed from Object’s memory address. –> must override
HashTable & HashMap class
java.utilimplements map interfaceK– type paramater for key and v– type parameter for associated valueOperations: lookup, insert, delete.Constructor lets you set init capacity and load factorhandles collisions with chained bucketshash map only allows null for keys and values
TreeMap underlying Structure:
RBT
Treemap complexity of basic operations:
O(logN)
TreeMap complexity for iterating over associated values:
O(N)
HashMap underlying structure:
HashTable with chained buckets
HashMap complexity of basic operations:
O(1)
Complexity for iterating over associated values:
O(T.S + N) –> worst case.
data structure
a particular scheme organizing related data items.
linked list
A sequence of zero or more nodes containing some data and pointers to other nodes of the list.
list
a collection of data items arranged in a certain linear order
stack
LIFO list in which insertions/deletions are only done at one end.
queue
FIFO list in which elements are added from one end of the structure and deleted from the other end.
priority queue
collection of data items from a totally ordered universe
graph (formal)
G = (V,E). V: finite, nonempty set of vertices. E: set of pairs of V, called edges.
If a graph’s edges are unordered [ (u,v) == (v,u)], then the vertices u and v are
adjacent
If a graph’s edges are unordered [ (u,v) == (v,u)], then the vertices u and v are connected by
an undirected edge (u,v).
The vertices u and v of the undirected edge(u,v) are the _ of the edge
endpoints
The vertices u and v of the undirected edge(u,v) are _ to the edge
incident
A graph is undirected if
every edge in it is undirected.
If a graph’s edges are ordered [ (u,v) != (v,u)], then the edge (u,v) is _ from _ to _
directed from u to v
digraph
A graph whose every edge is directed
for the directed edge (u,v), u is the _ and v is the _
u is the tail, v is the head.
complete graph
a graph with every pair of its vertices connected by an edge
dense graph
A graph with very few possible edges missing
sparse graph
A graph with few edges relative to the number of vertices
weighted graph
A graph with numbers assigned to its edges.
weight/cost matrix
Adjacency matrix of a weighted graph.
A path from vertex u to vertex v
a sequence of adjacent vertices that starts with u and ends with v
simple path
A path in which all vertices are distinct
length of a path
The total number of vertices in the vertex sequence defining the path - 1.
directed path
A sequence of vertices (v1,v2,v3,…) such that v1->v2, v2->v3,v3->… for a directed graph.
connected graph
A graph such that for all vertices u and v, there exists a path from u to v.
cycle
A path of positive length that starts and ends at the same vertex and does not traverse the same edge more than once.
acyclic graph
A graph with no cycles.
tree
A connected, acyclic graph
forest
A graph that has no cycles, but not necessarily connected
The number of edges in a tree
One less than the number of vertices
Q: For every vertices u, v in a tree, there exists:
A: Exactly one simple path from u to v.
tree root
top level vertex
Ancestors of a vertex v in a tree
All vertices on the simple path from the root to v
Proper ancestors of a vertex v in a tree
All vertices on the simple path from the root to v, but excluding v itself.
if (u,v) is the last edge of the simple path from the root to vertex v, u is the _ of v
parent
if (u,v) is the last edge of the simple path from the root to vertex v, v is the _ of u
child
siblings
Vertices of a tree that have the same parent.
leaf
A vertex with no children
parental
A vertex with at least one child
descendants of v
All vertices for which v is an ancestor in a tree
subtree of T rooted at v
All descendants of a vertex v
proper descendants
All vertices for which v is an ancestor in a tree, excluding v itself.
depth of a vertex v
The length of the simple path from the root to v
height of a tree
the length of the longest simple path from the root to a leaf
ordered tree
A rooted tree in which all children of each vertex are ordered. (Usually left to right)
binary tree
An ordered tree in which every vertex has no more than two children, with each child designated as a left or right child. Potentially empty.
binary search tree
A binary tree with the property that for all parent nodes, the left subtree contains only values less than the parent, and the right subtree contains only values greater than the parent.
first child-next sibling representation
Representation used for ordered trees with a potentially varying amount of children per parent node.
set
An unordered collection (possibly empty) of distinct items called elements of the set.