Data Storage Flashcards
Access ZTime
Time required to access the first byte in a read or write operation
Transfer Rate
speed at which data is read or written. following initial access.
Volatile memory
Memory that is lost when disconnected from power.
Non-volatile memory
Memory is retained when power is lost
Main Memory/RAM
Primary memory used when computer programs execute
Flash Memory/SSD
less expensive and higher capacity than main memory
Magnetic Disk/HDD
used to store large amounts of data
Sectors
Magnetic disk groups data in sectors, traditionally 512 bytes per sector but 4 kilobytes with newer disk formats
Pages
Flash memory groups data in pages, usually between 2 kilobytes and 16 kilobytes per page
Block
Databases and file systems use a uniform size, called a block, when transferring data between main memory and storage media
Row-oriented storage
Relational databases usually store an entire row within one block, which is called row-oriented storage
Column-oriented/Columnar Storage
Each block stores values for a single column only.
Table Structure
Scheme for organizing rows in blocks on storage media
heap table
no order is imposed on rows
Sorted Table/Sort Column
Database designer identifies a sort column that determines physical row order
Hash Table
Rows are assigned to buckets
Bucket
Block or group of blocks containing rows
Hash Key
Column or group of columns, usually the primary key
hash Function
Computes the bucket containing the row from the hash key
Modulo Function
Simple has function with four steps
Dynamic hash function
Automatically allocates more blocks to the table, creates additional buckets, and distributes rows across al buckets. With more buckers, fewer rows are assigned to each bucket and, on average, buckets contain fewer linked blocks.
Table clusts/Multi-Tables
Interleave rows of two or mote tables in the same storage area.
Cluster key
Column that is available in all interleaved tables
Single-level index
File containing column values, along with pointers to rows containing the column value
Multi-level index
Each index entry is a composite of values from all indexed columns.
Table scan
Database operation that reads table blocks directly, without access an index
Index Scan
Database operation that reads index blocks sequentially, in order to locate the needed table blocks
Hit Ratio/Filter facotr/Selectivity
Percentage of table rows selected by a query
Binary Search
Database repeatedly splits the indexed in two until it finds the entry containing the search value
Primary Index/Clustering index
Index on a sort column
Dense index
Contains an entry for every table rows
Sparse index
Contains an entry for every table block
multi-level indexd
Stores column values and rows pointer s in a hierarchy
Fan-out
Number of index entries per block is called the fan-out of a multi-level index
Branch
Top-level block to a bottom-level block is called a branch.
balanced/imbalanced
when all branches are the same length and imbalanced when branches are different lengths
B+tree
All indexed values appear in the bottom level. Pointers to table blocks appear only in the bottom level. Values are occasionally repeated in the index
B-Tree
Indexed values appear in a higher level, the value is not repeated at lower levels. Instead, a pointer to the corresponding table block appears in the higher level along with the value
Hash index
INdex entries are assigned to buckets
BUcket
Block or group of blocks containing index entries
hash Function
Index entry is determined by a hash function, which computes a bucket number from the value of the indexed column
Bitmap Index
Grid of Bits
Physical INdexd
Single or multi-level index normally contains pointers to table blocks
Logical index
Single or multi-level index in which pointers to table blocks are replaced with primary key values.
Function index
Database designer specifies a function on the column value. Index entries contain the result of the function applied to column values, rather than the column values
Tablespace
Database object that maps one or more tables to a single file
fragmented
Files are updated, blocks become scattered, or fragmented, across many tracks
partition
subset of table data. One table has many partitions that do not overlap, and together, contain all table data
Horizontal Partition
Subset of table rows
Vertical partition
Subset of table columns
Shard
Subset of table data, usually a subset of rows rather than columns. Stored on different computers in a distributed database
Partition Expression/Partition Columns
Specifies a partition expression based on one or more partition columns. Partition expression may be simple, such as the value of a single partition column, or a complex based on several partition columns. Rows are assigned to partitions in different ways.
Range partition
Associates each partition with a range of parittion expression values. VALUES LESS THAN keywords specify the upper bound of each range. MAXVALUE keyword represents the highest column value, and VALUES LESS THAN MAXVALUE specifies the highest range.
List Partition
Associates each partition with an explicit list of partition expression values using the VALUES IN keywords.
Hash Parititon
A partition expression with positive integer values. Database administrator specifies the number of partitions, N, and partitions are automatically named p0 through p(n-1). Partition number for each row is computed as: (Partition expression value) modulo N
Key Parition
Partition expression is determined automatically by the database
Logical Design
Specifies tables, columns, and keys.
Physical Design
Specifies indexes, table structures, and partitions. Physical design affects query performance but never affects query results
Storage Engine/Storage Manager
Translates instructions generated by a query processor into low-level commands that access data on storage media. Storage engines support different index and table structures, so physical design is dependent on a specific storage engine.
CREATE INDEX
statement creates an index by specifying the index name and table columns that compose the index
DROP INDEX
Statement deletes a table’s index
SHOW INDEX
Statement displays a table’s index
EXPLAIN
Statement generates a result table that describes how a statement is executed by the storage engine
slow query low
File that records all long-running queries submitted to the database
Time required to access the first byte in a read or write operation
Access ZTime
speed at which data is read or written. following initial access.
Transfer Rate
Memory that is lost when disconnected from power.
Volatile memory
Memory is retained when power is lost
Non-volatile memory
Primary memory used when computer programs execute
Main Memory/RAM
less expensive and higher capacity than main memory
Flash Memory/SSD
used to store large amounts of data
Magnetic Disk/HDD
Magnetic disk groups data in sectors, traditionally 512 bytes per sector but 4 kilobytes with newer disk formats
Sectors
Flash memory groups data in pages, usually between 2 kilobytes and 16 kilobytes per page
Pages
Databases and file systems use a uniform size, called a block, when transferring data between main memory and storage media
Block
Relational databases usually store an entire row within one block, which is called row-oriented storage
Row-oriented storage
Each block stores values for a single column only.
Column-oriented/Columnar Storage
Scheme for organizing rows in blocks on storage media
Table Structure
no order is imposed on rows
heap table
Database designer identifies a sort column that determines physical row order
Sorted Table/Sort Column
Rows are assigned to buckets
Hash Table
Block or group of blocks containing rows
Bucket
Column or group of columns, usually the primary key
Hash Key
Computes the bucket containing the row from the hash key
hash Function
Simple has function with four steps
Modulo Function
Automatically allocates more blocks to the table, creates additional buckets, and distributes rows across al buckets. With more buckers, fewer rows are assigned to each bucket and, on average, buckets contain fewer linked blocks.
Dynamic hash function
Interleave rows of two or mote tables in the same storage area.
Table clusts/Multi-Tables
Column that is available in all interleaved tables
Cluster key
File containing column values, along with pointers to rows containing the column value
Single-level index
Each index entry is a composite of values from all indexed columns.
Multi-level index
Database operation that reads table blocks directly, without access an index
Table scan
Database operation that reads index blocks sequentially, in order to locate the needed table blocks
Index Scan
Percentage of table rows selected by a query
Hit Ratio/Filter facotr/Selectivity
Database repeatedly splits the indexed in two until it finds the entry containing the search value
Binary Search
Index on a sort column
Primary Index/Clustering index
Contains an entry for every table rows
Dense index
Contains an entry for every table block
Sparse index
Stores column values and rows pointer s in a hierarchy
multi-level indexd
Number of index entries per block is called the fan-out of a multi-level index
Fan-out
Top-level block to a bottom-level block is called a branch.
Branch
when all branches are the same length and imbalanced when branches are different lengths
balanced/imbalanced
All indexed values appear in the bottom level. Pointers to table blocks appear only in the bottom level. Values are occasionally repeated in the index
B+tree
Indexed values appear in a higher level, the value is not repeated at lower levels. Instead, a pointer to the corresponding table block appears in the higher level along with the value
B-Tree
INdex entries are assigned to buckets
Hash index
Block or group of blocks containing index entries
BUcket
Index entry is determined by a hash function, which computes a bucket number from the value of the indexed column
hash Function
Grid of Bits
Bitmap Index
Single or multi-level index normally contains pointers to table blocks
Physical INdexd
Single or multi-level index in which pointers to table blocks are replaced with primary key values.
Logical index
Database designer specifies a function on the column value. Index entries contain the result of the function applied to column values, rather than the column values
Function index
Database object that maps one or more tables to a single file
Tablespace
Files are updated, blocks become scattered, or fragmented, across many tracks
fragmented
subset of table data. One table has many partitions that do not overlap, and together, contain all table data
partition
Subset of table rows
Horizontal Partition
Subset of table columns
Vertical partition
Subset of table data, usually a subset of rows rather than columns. Stored on different computers in a distributed database
Shard
Specifies a partition expression based on one or more partition columns. Partition expression may be simple, such as the value of a single partition column, or a complex based on several partition columns. Rows are assigned to partitions in different ways.
Partition Expression/Partition Columns
Associates each partition with a range of parittion expression values. VALUES LESS THAN keywords specify the upper bound of each range. MAXVALUE keyword represents the highest column value, and VALUES LESS THAN MAXVALUE specifies the highest range.
Range partition
Associates each partition with an explicit list of partition expression values using the VALUES IN keywords.
List Partition
A partition expression with positive integer values. Database administrator specifies the number of partitions, N, and partitions are automatically named p0 through p(n-1). Partition number for each row is computed as: (Partition expression value) modulo N
Hash Parititon
Partition expression is determined automatically by the database
Key Parition
Specifies tables, columns, and keys.
Logical Design
Specifies indexes, table structures, and partitions. Physical design affects query performance but never affects query results
Physical Design
Translates instructions generated by a query processor into low-level commands that access data on storage media. Storage engines support different index and table structures, so physical design is dependent on a specific storage engine.
Storage Engine/Storage Manager
statement creates an index by specifying the index name and table columns that compose the index
CREATE INDEX
Statement deletes a table’s index
DROP INDEX
Statement displays a table’s index
SHOW INDEX
Statement generates a result table that describes how a statement is executed by the storage engine
EXPLAIN
File that records all long-running queries submitted to the database
slow query low