Storing Data: Disks and Files Flashcards
Why can’t you store everything in main memory
- buying enough main memory to hold everything costs too much
- main memory is volatile: its contents are lost at power-off (i.e., not persistent)
Typical storage hierarchy has 3 storage types. Name them and what they are used for
Main memory: for currently used data
Disk: for main database (frequently used)
Tapes: for archiving older versions (infrequently used)
DBMS stores DB where?
what are the major implications for DBMS designs?
A DBMS stores the DB on hard disks
The major implication is that reads and writes (I/O operations) must be planned carefully: they are high-cost operations compared to CPU operations
what is the main advantage disks have over tapes
disks are random access and tapes are sequential access, so a disk can reach arbitrary data much faster
true or false: pages are the smallest unit of data retrieval you can read and write to
true: you must read and write an entire page, even if you make a minor change to a file/record
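A minimal sketch of what this means in practice, assuming a 4 KB page size and using a bytearray as a stand-in for the disk. The names PAGE_SIZE, read_page, write_page and update_byte are hypothetical, chosen only for illustration:

```python
PAGE_SIZE = 4096  # assumed page size in bytes

# A bytearray stands in for the disk: 4 pages of zero bytes.
disk = bytearray(PAGE_SIZE * 4)

def read_page(page_no: int) -> bytearray:
    """Read one whole page from the simulated disk."""
    start = page_no * PAGE_SIZE
    return bytearray(disk[start:start + PAGE_SIZE])

def write_page(page_no: int, page: bytearray) -> None:
    """Write one whole page back to the simulated disk."""
    assert len(page) == PAGE_SIZE
    start = page_no * PAGE_SIZE
    disk[start:start + PAGE_SIZE] = page

def update_byte(offset: int, value: int) -> int:
    """Change a single byte; return how many bytes were transferred."""
    page_no, within = divmod(offset, PAGE_SIZE)
    page = read_page(page_no)    # read the entire page
    page[within] = value         # change one byte in memory
    write_page(page_no, page)    # write the entire page back
    return 2 * PAGE_SIZE

moved = update_byte(5000, 7)  # a 1-byte change moves 8192 bytes
```

Changing one byte still costs a full page read plus a full page write.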
True or false: a platter has two surfaces
true
how does the arm assembly move to position
it moves in and out to position the disk heads on a desired track
what is a cylinder in a disk structure
The set of tracks at the same position on all surfaces (i.e., top and bottom of each platter)
true or false: only one head can read/write at one time
true, can’t have more than one head working at a time
what is a track in a disk structure
A disk drive track is a circular path on the surface of a platter where information is read and written.
A track is a physical division of data in a disk drive
what is a sector in a disk structure
A disk sector is the smallest storage unit on a disk
It is typically a subdivision of a track, acting as a division of storage on the disk
what is a page in a disk structure
how do you locate a page on a disk?
a contiguous set of sectors on a track.
locate by: page id, where
page id = (b, s, c, d)
block b of surface s of cylinder c of disk d
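A page id like this can be sketched as a small record type. The class name PageId and its field names are hypothetical, picked to mirror (b, s, c, d):

```python
from typing import NamedTuple

class PageId(NamedTuple):
    """Hypothetical page id mirroring (b, s, c, d)."""
    block: int     # b: block within the track
    surface: int   # s: platter surface
    cylinder: int  # c: cylinder (same track position on every surface)
    disk: int      # d: which disk

pid = PageId(block=3, surface=1, cylinder=120, disk=0)
b, s, c, d = pid  # unpacks in the same order as (b, s, c, d)
```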
the time to access (read/write) a disk page is determined by 3 actions. What are they?
If you want to speed up the access time, which of the three actions can you affect to actually speed it up?
seek time (moving arm to position disk head on track)
rotational delay (waiting for page to rotate under head)
transfer time (actually moving data between disk and main memory)
if you want to speed things up, you can only reduce the first two (seek time and rotational delay). transfer time cannot be sped up further.
- Seek time varies from about 1 to 20 msec
- Rotational delay varies from 0 to 10 msec
- Transfer rate is about 1 msec per 4 KB page
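As a worked example, the three components simply add up. The sketch below plugs in the figures from this card; the function name access_time_ms is hypothetical:

```python
def access_time_ms(seek_ms: float, rotational_ms: float,
                   transfer_ms: float = 1.0) -> float:
    """Time to access one 4 KB page: seek + rotational delay + transfer."""
    return seek_ms + rotational_ms + transfer_ms

# Best case from the ranges above: 1 msec seek, no rotational wait.
best = access_time_ms(seek_ms=1, rotational_ms=0)     # 2.0 msec
# Worst case from the ranges above: 20 msec seek, 10 msec rotational wait.
worst = access_time_ms(seek_ms=20, rotational_ms=10)  # 31.0 msec
```

Transfer is the same 1 msec in both cases, so the whole gap between 2 and 31 msec comes from seek time and rotational delay.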
access time varies so much, why is this? How can we reduce it
it varies so much b/c seek time and rotational delay depend on where the disk head is currently positioned.
Best: the disk head is already above the track you want
Worst: the head must move very far to reach the needed track
can reduce the time by reducing seek time and rotational delay through careful arrangement of data pages on the disk
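A rough sketch of why arrangement matters, comparing 100 page reads scattered across the disk with 100 page reads laid out one after another. The average figures (10 msec seek, 5 msec rotational delay) are assumptions for illustration, and the sequential case is idealized as a single seek and a single rotational wait:

```python
SEEK_MS = 10.0      # assumed average seek time
ROTATION_MS = 5.0   # assumed average rotational delay
TRANSFER_MS = 1.0   # transfer time per 4 KB page

def random_read_ms(pages: int) -> float:
    """Every page pays its own seek and rotational delay."""
    return pages * (SEEK_MS + ROTATION_MS + TRANSFER_MS)

def sequential_read_ms(pages: int) -> float:
    """Idealized: one seek and one rotational wait, then pure transfer."""
    return SEEK_MS + ROTATION_MS + pages * TRANSFER_MS

scattered = random_read_ms(100)       # 1600.0 msec
contiguous = sequential_read_ms(100)  # 115.0 msec
```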
describe what the ‘next page’ concept is
have the data organized in a way where you always have the next page ready/prepped/next in line. Pages should be stored sequentially to minimize seek/rotation delay.
the order: pages on the same track, followed by pages on the same cylinder, followed by pages on an adjacent cylinder
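This ordering can be sketched as a sort over page ids of the form (b, s, c, d), meaning block b of surface s of cylinder c of disk d. The helper name next_page_key and the sample ids are hypothetical:

```python
# Page ids as (b, s, c, d), deliberately out of order.
pages = [
    (0, 0, 6, 0),
    (1, 1, 5, 0),
    (0, 1, 5, 0),
    (1, 0, 5, 0),
    (0, 0, 5, 0),
]

def next_page_key(page_id):
    b, s, c, d = page_id
    # Compare disk first, then cylinder, then surface (track), then block.
    return (d, c, s, b)

ordered = sorted(pages, key=next_page_key)
# ordered == [(0, 0, 5, 0), (1, 0, 5, 0),   same track
#             (0, 1, 5, 0), (1, 1, 5, 0),   same cylinder, other surface
#             (0, 0, 6, 0)]                 adjacent cylinder
```

Sorting by cylinder before surface keeps the arm still while every track of one cylinder is read, so the arm only moves when the cylinder changes.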
In practice, data pages are not always sequentially stored on the disk even though it will minimize seek/rotational delays, why?
If we delete data, it leaves a hole in the sequence, so over time pages end up scattered rather than stored sequentially
There are layers to the calls a DBMS will make on a DB. What is the hierarchy?
- Application, such as SQL (highest)
- Memory Management
- Disk Management
- DB on Disk (lowest)
files mean two different things depending on whether you are a user or DBMS. What is the difference?
user: a file is a collection of logical records
DBMS: a file is a collection of disk pages