final Flashcards
SQL DDL
SQL Data Definition Language: defines relations, integrity constraints, domains, and
types in SQL
char(n) fixed-length string
varchar(n) variable-length string
int integer
float(n) floating-point number with at least n digits of precision
create table example
create table instructor (ID char(5), name varchar(20), salary numeric(8,2))
integrity constraints
ensure that invalid data cannot enter the database, ie not null, primary key, foreign key
integrity constraints example
create table instructor (ID char(5), name varchar(20) not null, dept_name varchar(20), salary numeric(8,2), primary key (ID), foreign key (dept_name) references department)
not null
can be a constraint: used in create table to ensure a value is never null
can be a test: ie select * from r where x is not null
insert example
insert into instructor values ('10211', 'Smith', 'Bio', 66000);
delete example
delete from r where conditions
drop table
drop table r
alter example
alter table: add/remove attributes. Structure:
alter table r add A D
alter table r drop A
basic query
select x
from y
where z
search for a substring
select name
from instructor
where name like '%dar%';
more substring
'intro%' (intro followed by anything)
'%mid%' (contains mid anywhere)
'___' exactly 3 characters
'___%' 3 or more characters
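The like patterns above can be tried with Python's built-in sqlite3 module; the table and sample names here are invented for illustration (note SQLite's like is case-insensitive for ASCII, unlike standard SQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (name varchar(20))")
conn.executemany("insert into instructor values (?)",
                 [("Eldar",), ("Darwin",), ("Kim",)])

# '%dar%' matches any name containing the substring "dar"
rows = conn.execute(
    "select name from instructor where name like '%dar%'").fetchall()
print(rows)
```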
list people alphabetically
select distinct name
from instructor
order by name asc/desc
between exmp
… where att between 100 and 200
union
combine the results of 2 queries, removing duplicates
select …
union
select …
intersect
keep only the tuples the 2 results have in common
see union
except
keep tuples of the first result that do not appear in the second
aggregate functions
select x, count(y)
from z
group by x
conditions with aggregate functions
select ... group by x having count(y) > a
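A runnable sketch of group by with having, again via sqlite3; the table, columns, and data are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table takes (student_id int, course_id varchar(8))")
conn.executemany("insert into takes values (?, ?)",
                 [(1, "CS101"), (2, "CS101"), (3, "CS101"), (1, "CS202")])

# having filters groups after aggregation: courses with more than 2 students
rows = conn.execute(
    "select course_id, count(student_id) from takes "
    "group by course_id having count(student_id) > 2").fetchall()
print(rows)
```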
nested subqueries
queries where a select-from-where exists in another query
used for membership, comparisons, cardinality
NS for membership
select ... from ...
where att in (select ... from ... where ...)
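A membership subquery can be sketched like so (tables and data invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table instructor (name varchar(20), dept_name varchar(20));
create table department (dept_name varchar(20), building varchar(20));
insert into instructor values ('Smith', 'Bio'), ('Wu', 'Finance');
insert into department values ('Bio', 'Watson'), ('Physics', 'Taylor');
""")

# instructors whose department is housed in Watson
rows = conn.execute(
    "select name from instructor "
    "where dept_name in (select dept_name from department "
    "                    where building = 'Watson')").fetchall()
print(rows)
```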
rename
x as y
some
true if the comparison holds for at least one tuple returned by the NS
select ... from ...
where x > some (select ...)
all
like some, but the comparison must hold for every tuple returned by the query
unique
true if the result of an NS contains no duplicate tuples
delete
delete tuples from relation
delete
from r
where …
update
change values of atts on a where cond
update instructor
set att = att + 4
where att > x
update in one statement
update r set att = case when att > z then x else y end
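A concrete case-update in one statement, with sqlite3; the table and the salary figures are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (name varchar(20), salary numeric(8,2))")
conn.executemany("insert into instructor values (?, ?)",
                 [("Smith", 90000), ("Wu", 60000)])

# one statement: 3% raise over 80000, 5% raise otherwise
conn.execute("""
update instructor set salary = case
    when salary > 80000 then salary * 1.03
    else salary * 1.05
end""")
rows = conn.execute("select name, salary from instructor order by name").fetchall()
```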
join
takes 2 relations; where the join attributes hold the same value, the matching tuples' atts are appended and returned
r join s on r.att = s.att
views
virtual relations, basically macros for expressions: create view v as select ... from ... where ...
check
added to create table statements:
create table ... (
...
check (x in ('x', 'y')));
create type
same utility as classes
create type binary as numeric(1,0)
domain
a type with constraints attached, eg not null: create domain x <base type> constraint x_check check (value in (...))
create index
create index i on r(att)
large object types
blobs (binary) and clobs (character); queries return pointers to the object rather than the object itself
functions
doubt it
triggers
given an event, can insert, delete, or update:
create trigger setnull_trigger before update of takes
referencing new row as nrow
for each row
when (nrow.grade = ' ')
begin atomic
set nrow.grade = null;
end;
go back to this maybe
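A runnable trigger via sqlite3; SQLite's dialect differs from the standard syntax on this card (it uses NEW/OLD directly instead of "referencing new row as", has no "begin atomic", and a before trigger can't modify NEW, so this sketch logs with an after trigger instead; tables invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table takes (student_id int, grade varchar(2));
create table audit (student_id int, old_grade varchar(2), new_grade varchar(2));

create trigger grade_audit after update of grade on takes
for each row
begin
    insert into audit values (old.student_id, old.grade, new.grade);
end;

insert into takes values (1, 'B');
update takes set grade = 'A' where student_id = 1;
""")
rows = conn.execute("select * from audit").fetchall()
```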
types of NoSQL dbs
key value
graph database
document oriented
column family
sharding
distribute a database across many machines, as opposed to a single server
replica sets
other machines ensure redundancy
master slave replication
operations on a master server are automatically replicated to slaves; scalable redundancy, quicker to access a nearby slave
noSQL vs relational database management systems (RDBMS)
loose schema definition, distributed, less organization and vendor support
benefits of noSQL
elastic scaling: easily distributed
big data: handles high data volumes
DBA spec: designed to automate the DBA out
flexible model: not very strict with schema
economics: scaling out is cheaper
drawbacks noSQL
minimal support
mongoDB
document-oriented database storing JSON-like documents, looked up by key/index
mongoDB why use it
is simple, flexible, small, lightweight, complete, very useful
MDB hierarchy of objects
databases > collections > documents
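MongoDB itself isn't needed to picture the hierarchy; plain Python dicts model it (all names here are invented):

```python
import json

deployment = {
    "university_db": {                                   # database
        "instructors": [                                 # collection
            {"_id": 1, "name": "Smith", "dept": "Bio"},  # documents
            {"_id": 2, "name": "Wu", "dept": "Finance"},
        ],
    },
}

docs = deployment["university_db"]["instructors"]
print(json.dumps(docs[0]))
```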
mongoDB is for
small transactions that are light on joins
mongoDB data definition
none given; no nulls because no fields are expected, missing fields are simply absent
index
small pointer telling you where to go in the database to find something
kinds of indices
ordered: stored in sorted order
hash: use a hash function
indexing serves to
speed up operations
dense index
record appears for every search-key value
sparse index
index records for only some search-key values
sparse indexes
slower lookups, but less work + overhead; entries act as entry points into blocks
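The block-entry-point idea can be sketched as follows (block size and keys invented): the sparse index holds one entry per block, so a lookup finds the last index entry <= the search key and then scans that block.

```python
import bisect

# records sorted on the search key, grouped into blocks of 3
blocks = [[(10, "a"), (20, "b"), (30, "c")],
          [(40, "d"), (50, "e"), (60, "f")],
          [(70, "g"), (80, "h"), (90, "i")]]

# sparse index: the first key of each block only
index_keys = [blk[0][0] for blk in blocks]   # [10, 40, 70]

def sparse_lookup(key):
    # last index entry whose key <= search key points at the right block
    b = bisect.bisect_right(index_keys, key) - 1
    if b < 0:
        return None
    for k, value in blocks[b]:   # scan within the block
        if k == key:
            return value
    return None
```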
secondary indices
index on a non-ordering attribute; entries point to buckets of pointers to records, and the index must be dense
nodes composed of
search-key values, and pointers to children
find record with KS value V
set C = root
while C is not a leaf:
find the smallest i such that V <= Ki
if there is no such i, follow the last pointer in C
else if V = Ki, follow pointer Pi+1
else follow pointer Pi
(now C is a leaf)
if some Ki = V, follow Pi to the record
else the find fails: no record exists
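The search procedure above, sketched over a tiny hand-built tree (the node layout and keys are invented for illustration):

```python
import bisect

# internal nodes: {"keys": [...], "children": [...]}; leaves hold values
leaf1 = {"leaf": True, "keys": [10, 20], "values": ["a", "b"]}
leaf2 = {"leaf": True, "keys": [30, 40], "values": ["c", "d"]}
leaf3 = {"leaf": True, "keys": [50, 60], "values": ["e", "f"]}
root = {"leaf": False, "keys": [30, 50], "children": [leaf1, leaf2, leaf3]}

def bplus_find(node, v):
    # descend: at each internal node, V = Ki goes right (Pi+1), V < Ki goes left
    while not node["leaf"]:
        i = bisect.bisect_right(node["keys"], v)
        node = node["children"][i]
    # at a leaf: exact match or the find fails
    if v in node["keys"]:
        return node["values"][node["keys"].index(v)]
    return None
```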
insert record into B+ tree
find the leaf where the SK value would appear
if the SK value is present:
add the record to the file
add a pointer to the bucket
else:
add the record to the file
add a pointer to the bucket
if there's room, insert the SK value into the leaf
if no room, split the node
splitting nodes
take the first ceiling(n/2) entries and put them in the first node, rest go in the second
insert the lowest key of the second node (with its pointer) into the parent, split that too if needed
can propagate upwards
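A leaf split following the ceiling(n/2) rule can be sketched like this (assuming keys and values kept in simple parallel lists):

```python
import math

def split_leaf(keys, values):
    # keep the first ceil(n/2) entries in the old node; the rest move to a
    # new node, whose lowest key is copied up into the parent as separator
    mid = math.ceil(len(keys) / 2)
    left = (keys[:mid], values[:mid])
    right = (keys[mid:], values[mid:])
    separator = right[0][0]
    return left, right, separator
```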
remove nodes
find node, remove from bucket if present
if the node now has too few elements, merge
take our node and add its members to the left node
delete the pointer to our old node, merge up if needed
issues with B+ indexing
if a record moves, all pointers have to be updated
efficient entry
bulk entry: adding several entries to a B+ tree at once
sorted entry: sort items and enter them in sorted order
bottom up: sort, then add them layer by layer with prefab parts
B- tree
search-key values appear only once (no duplicate storage of keys in nonleaf nodes)
B- comp with B+
very minor speed advantages, harder to maintain
hashing
the bucket for an SK value is computed by a hash function
ie (number of characters) mod 10 returns one of 10 buckets
overflow bucket
when a bucket fills, you add a pointer chaining an overflow bucket onto the end
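Static hashing with chained overflow can be sketched using the flashcard's length-mod-10 hash (the bucket count and data are assumptions):

```python
NBUCKETS = 10
buckets = [[] for _ in range(NBUCKETS)]   # each bucket is a chain of entries

def h(key):
    # the flashcard's example hash: number of characters mod 10
    return len(key) % NBUCKETS

def insert(key, value):
    # appending to the list models chaining an overflow bucket when full
    buckets[h(key)].append((key, value))

def lookup(key):
    for k, v in buckets[h(key)]:   # scan the bucket and its overflow chain
        if k == key:
            return v
    return None
```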
problems with static hashing
- too many buckets
- space is allocated but may never be filled
- periodic reorganization requires new hash function
dynamic hashing
grows and shrinks in size
delete from extendable hash
locate KV and remove it
remove bucket if empty
coalesce buckets if possible
benefits of extendable hash
performance doesn't degrade as the file grows; buckets are reached through a directory, though that directory's overhead can become problematic