Data structures Flashcards
def Variable
a Variable is a named reference to a memory location that stores data that can change during the runtime of the program
why is declaring variable type important
So that the computer knows how much memory space it has to allocate for the data
def Constant
named reference to a memory location where the data inside the constant cannot change during the runtime of the program
whats an ADT(abstract data type)
ADT is logical description of how we view data and the possible operations
e.g queue of print jobs, stack of books, list of tasks to do
Only concerned w/ what data is representing, not how its constructed
2d array: array[x,y]
which is column which is row
row = x
column = y
Remember matrices (3x1)
What are the pointers used in a queue
and the important variables to use in queue
Front
Rear
maxSize
currentSize
In memory, after the first elements memory location, where are the other elements stored
in the next consecutive memory location
In low level langs, amount of data needed 4 array is reserved, starting from 1st reserved mem location, it checks the next mem location, until end of reserved mem locations 4 array.
where do dynamic data structures allocate and deallocate memory from
the heap - an area of memory used for ____________________
3 types of queues
linear
circular
priority
in what manner do queues operate?
FIFO
advantages priority queue
gives preference to more important items
how do priority queues work
Items qiven priority
If item has higher priority than item in front of it, it can jump in front of it
If same or greater, stay put
in what manner do stacks operate
FILO
4 operations of a queue
enQueue - add an item to the REAR of the queue
deQueue - remove and return an item from the FRONT of the queue
isFull - check if queue is full
isEmpty - check if queue is empty
Circular queues
Reuse the empty spaces freed by dequeueing from front of the array
(rear + 1) MOD max
excessive amount of allocation without deallocation leads to
overflow
example dynamic data structre
example static data structure
dynamic: list
static: array
advantage using queues in a list
list is dynamic data structure, will use enough memory for number of elements inside the list
disadvantage using queues as a list
can keep adding items to the list - eventually all the memory will be exhausted and cant keep adding more
To solve this, make it so list has maxsize
Can change the maxsize when needed if need to add more items
advantages of linear queues (created using arrays)
simple to program
predictable memory usage
what to do w/ pointers when dequeue item
move the front pointer (and rear if needed)
You dont actually delete item
When need to add to slot where item is but not in queue, you overwrite it
Saves processing time
advantages circular queue
can reuse free spaces
What is an array
An array is a data structure that holds a number of data items or elements of the same data type.
disvantages linear queue
fixed length
single pass
cant reuse spaces
disadvantages circular queue
harder to program
disadvantages priority queue
additional processing required to keep order
features of array
Indexed - each element has its own index
Mutable - contents of array can change during runtime of program
list of list operations
isempty()
append(item)
remove(item)
count(item)
len(item)
index(item)
insert(pos,item)
pop()
pop(pos)
remove(item) does
remves first occurance of item from the list
count(item)
returns number of occurances of item from list
len(item)
returns num of items in list
insert(pos,item)
insert item at position pos
Implementing a queue as a list
No need for size variable as you have len function
maxSize needed to prevent exhauston of memory
isFull function needed
Common operations on lists
merging, sorting, searching, comparing lists
Linked Lists
dynamic ADT which can be implemented as array and pointers
Composed of nodes
what are nodes in linked lists composed of
the data
pointer of the next node (and of node pointing to it if doubly linked list)
Pointers in linked lists
Start pointer
Nextfree pointer
Pointer in each node pointing to nxt (or prev) node
Adding node to linked list
Store data in the node pointed to by the nextfree pointer
Adjust pointer of the previous item to point to the added item
Added item assumes the value of the pointer of the previous item before the item was added
deleting node in linked list
pointer of node before the deleted item points to node after the deleted item
deleted node points to nextfree index position after itself (might not be same value as nextfree pointer)
an application of stacks
reversing lists
any other good examples add them here
what pointers used in stacks + what variables
one pointer - the top of the stack
size, maxSize variables can be used
operations on stacks
push(item) - add item to top of the stack
pop() - removes and returns the item on the top of the stack
isFull() - check if stack is full
isEmpty() - check if stack is empty
peek() - return top item w/o removing it
size() - return num of items in stack
def overflow
attemping to push into a full stack
def underflow
attempting to pop from an empty stack
stack overflow def
a computer program tries to use more memory space in the call stack than has been allocated to that stack.
Call stack
system level data structure
provides mechanism for passing parameters + return addresses to subroutines
IN OTHER WORDS:
REMEMBERS THE PARAMETERS PASSED INTO SUBROUTINE + WHAT LINE THE SUBROUTINE WAS CALLED SO PROGRAM CAN REMEMBER WHERE TO GO BACK TO
Subroutine calls
calls to subroutine executed as follows:
1st step - parameters saved onto the stack
2nd - address to which program returns to after subroutine is executed saved onto the stack
3rd - execution transferred to subroutine code
Subroutine execution
subroutine executed as follows:
stack space allocated for local variables
subroutine code executes
return address retrieved
parameters popped
execution trasnferred back to return address
hashing is
Hashing is the practice of transforming a given key or string of characters into another value
how is the hashed item found in thedatabase
calculated from the data istelf, using hashing algo
hashing algo use cases
encryption
checking for errors in downloaded code
searching datasets
verifying integrity of data
what is used to ensure records are assigned a unique address and how
hashing is used
some part of the record eg primary key is hashed to give a unique destination address
what is the data structure that stores the results of hashing algorithms
hash table
primary key def
unique identifier of a record
whats a collision
When hashing algo returns same address for different primary keys
Table of items to be inpuuted into hash algo size compared to hash table size
Hash table size must be bigger
load factor def
The number of elements in a hash table divided by the number of slots.
simplest method w/ dealing w/ collisions
put item in next available slot
disvantage of put item in next available slot method hashing
leads to clustering of items
how to make the method of puting an item in next slot hashing better
increment the skip e.g 1st skip is by 1, next is by 3, next is by 5 etc
how big should the hash table be
prime number size
good load factor size
0.7 or below
Mid square method
square the item
take middle digits
perform mod step to get the address
if it has letters or symbols, take denary value of them
folding method
divide item into equal isze pieces (last piece may be unequal size)
add pieces together (e.g 13+12 = 25)
perform mod step
Why do you mod the key by number of slots - hashing
So the result can always fit in the table
Searching for items in hash table if collision occured
From the slot it would have been placed in, check each next slot until the next empty space
Dealing w/ deletion in hash table
When delete an item, put a dummy value in it e.g ‘Deleted’
That way when searching through, it is not treated as an empty space
hashing characters + symbols
take their denary values from ascii table
what is dictionary
ADT made up of key:value pairs
value is accessed via the key
can you have dictionary in dictionary
ya
uses of dictionaries
ascii/unicode table
in translation program - foreign lang dictionary
trees + graphs can be implemented w/ dicts
can multiple items have same key dictionary
ya
Dictionary operations
add new key:value pair to dict
delete a pair from dict
amend value in key:value pair
return value associated w/ a key
return true/false depending on whether key is in dict
return len of dict
when hashing: are key:value pairs held in any particular order
no
chaining
instead of storing actual data in hash table, you store pointer that point to where that data is stored
3 attributes that data in linked list in hash table (chaining) has
key value
data
pointer to next node
if there are multiple collisions in one slot, how does the chaining work
pointer pointing to next collided item (node a) is put in hash table
node a points to next collided item (node b)
node b point to node c etc
When could chaining become a problem
If there is lots of clustering
will take time going thru linked list , too long of a list
what direction undirected graphs
bidriectional
what do edge weights represent
cost of travel
graph def
set of vertices/nodes connected by edges/arcs
directed graph aka
digraph
2 ways a graph can be represented
adjacency list + matrix
adjacency matrix
each row and column represent a node
item at [row,column] indicate connection
in which graph will matrix be syymetric
undirected
what could entries in adjacency matrix represent
edge weights
advantages adjacency matrices
adding an edge/node is simple
disavantage ajaceny matrix
sparse graph w/ little connections will leave most cells empty wasting mem space
ajaceny list
list of nodes created
each node points to list of ajacent nodes
representing a wegihted graph
can be represented as dict of dicts
each key in dict is the node
the value is a dict of adjacent edges + edge weights
avantae ajaceny list
only uses storage for existing connections
good way to represent large sparse graph
graph applications
computer netowrks
states in finite state machine
social networks
web pages + links
chemistry
project management
maps
search engines
trees
an adt
has a root, branches and leaves
can trees be weighted
No
tree def
connected undirected graph with no cycles (unique path from each node to any other node)
rooted tree
one node identified as root
each node except root has 1 unique parent
terminology for trees
root
node
edge
child
parent
subtree
leaf
nodes w/ no children
leaf
binary tree def
a rooted tree with at max 2 children
can you represent trees as matrices
ya
binary tree node - 3 parts
data
left pointer
right pointer
binary search tree
nodes added in order given, first item as root
if item larger, put to right
if less, put to left
2 types of traversal
depth first tree traversal (DFT)
breadth first tree traversal
3 types of DFT
pre order - NLR
in order - LNR
post order - LRN
where are subroutines for tree traversals put in when carried out
in a stack
in case backtracking needs to occur
backtrackin def
Going back a step after there are no more options to explore down the route you just went
avantage storing data in binary search tree
large chunks of data can be eliminated while searching for data
balanced binary trees
nodes distributed in a way height kept to a minimum
leads to efficient searching
deleting leaf node
pointer of parent pointing to leaf changes value to -1
deleting node w/ one child
child takes place of deleted node
parent of deleted node points to child of deleted node
deleting node w/ 2 children
next largest number replaces the deleted node
parent of deleted node has its pointers changed
deleting nodes - why may they be bad
whole tree may have to be rebuilt - time consuming
trees used for searching, shouldnt be used as a generalised ADT