Data Storage Flashcards
Access ZTime
Time required to access the first byte in a read or write operation
Transfer Rate
speed at which data is read or written. following initial access.
Volatile memory
Memory that is lost when disconnected from power.
Non-volatile memory
Memory is retained when power is lost
Main Memory/RAM
Primary memory used when computer programs execute
Flash Memory/SSD
less expensive and higher capacity than main memory
Magnetic Disk/HDD
used to store large amounts of data
Sectors
Magnetic disk groups data in sectors, traditionally 512 bytes per sector but 4 kilobytes with newer disk formats
Pages
Flash memory groups data in pages, usually between 2 kilobytes and 16 kilobytes per page
Block
Databases and file systems use a uniform size, called a block, when transferring data between main memory and storage media
Row-oriented storage
Relational databases usually store an entire row within one block, which is called row-oriented storage
Column-oriented/Columnar Storage
Each block stores values for a single column only.
Table Structure
Scheme for organizing rows in blocks on storage media
heap table
no order is imposed on rows
Sorted Table/Sort Column
Database designer identifies a sort column that determines physical row order
Hash Table
Rows are assigned to buckets
Bucket
Block or group of blocks containing rows
Hash Key
Column or group of columns, usually the primary key
hash Function
Computes the bucket containing the row from the hash key
Modulo Function
Simple has function with four steps
Dynamic hash function
Automatically allocates more blocks to the table, creates additional buckets, and distributes rows across al buckets. With more buckers, fewer rows are assigned to each bucket and, on average, buckets contain fewer linked blocks.
Table clusts/Multi-Tables
Interleave rows of two or mote tables in the same storage area.
Cluster key
Column that is available in all interleaved tables
Single-level index
File containing column values, along with pointers to rows containing the column value
Multi-level index
Each index entry is a composite of values from all indexed columns.
Table scan
Database operation that reads table blocks directly, without access an index
Index Scan
Database operation that reads index blocks sequentially, in order to locate the needed table blocks
Hit Ratio/Filter facotr/Selectivity
Percentage of table rows selected by a query
Binary Search
Database repeatedly splits the indexed in two until it finds the entry containing the search value
Primary Index/Clustering index
Index on a sort column
Dense index
Contains an entry for every table rows
Sparse index
Contains an entry for every table block
multi-level indexd
Stores column values and rows pointer s in a hierarchy
Fan-out
Number of index entries per block is called the fan-out of a multi-level index
Branch
Top-level block to a bottom-level block is called a branch.
balanced/imbalanced
when all branches are the same length and imbalanced when branches are different lengths
B+tree
All indexed values appear in the bottom level. Pointers to table blocks appear only in the bottom level. Values are occasionally repeated in the index
B-Tree
Indexed values appear in a higher level, the value is not repeated at lower levels. Instead, a pointer to the corresponding table block appears in the higher level along with the value
Hash index
INdex entries are assigned to buckets
BUcket
Block or group of blocks containing index entries
hash Function
Index entry is determined by a hash function, which computes a bucket number from the value of the indexed column
Bitmap Index
Grid of Bits
Physical INdexd
Single or multi-level index normally contains pointers to table blocks
Logical index
Single or multi-level index in which pointers to table blocks are replaced with primary key values.
Function index
Database designer specifies a function on the column value. Index entries contain the result of the function applied to column values, rather than the column values
Tablespace
Database object that maps one or more tables to a single file
fragmented
Files are updated, blocks become scattered, or fragmented, across many tracks
partition
subset of table data. One table has many partitions that do not overlap, and together, contain all table data
Horizontal Partition
Subset of table rows
Vertical partition
Subset of table columns
Shard
Subset of table data, usually a subset of rows rather than columns. Stored on different computers in a distributed database