Chapter 11 - Data Storage Design Flashcards
What is data storage design?
Data storage design is how data is stored and handled by programs that run the system.
What are the main steps of data storage design?
The main steps of data storage design are to:
- Select the data storage format
- Convert the previously made logical data model into a physical data model
- Ensure that the ERDs and DFDs balance
- Design the selected data storage format to optimize its processing efficiency
What are the main types of data storage formats?
There are two main types of data storage formats:
- Files
- Database
What is a linked list?
A linked list is a set of files that are linked together using pointers.
What are some types of files that are commonly used?
There are several different types of files:
- Master Files - Store application-critical information
- Look-up files - Contain static values (like variables)
- Transaction files - Store information that can be used to update a master file
- Audit files - Records “before” and “after” states of data as the data is altered
- History files - store past transactions
What are some examples of databases?
Some examples of databases are:
- Legacy Database
- Relational Database
- Object Database
- Multidimensional Database
What is a Legacy database?
A legacy database is a database that is based on older technology, and is rarely used to develop new applications.
What are the two types of legacy database?
There are two forms of legacy databases:
- Hierarchical database: Uses hierarchies or inverted trees to represent relationships.
- Network databases: Created to address M:N or nonhierarchical associations
What is a relational database?
A relational database is the most popular kind of database currently, based on collections of tables, each of which have a primary key. Tables are linked/related to one another by creation of a foreign key.
What is referential integrity?
Referential integrity is the ability of a relational database to ensure values that are linked between tables stay in sync.
What is an object database?
An object database is a database that is based around the principles of object orientation. All things are objects with attributes and behavior.
What is a multidimentional database?
A multidimensional database is a form of relational database that is used extensively in data warehousing.
What is an aggregation?
An aggregation is a collection of data stored by a data warehouse in multiple dimensions.
What are the main factors to consider when choosing a storage format?
The main factors to consider when choosing a data format are:
- Data Types
- Types of Application Storage
- Existing Storage Formats
- Future Needs
What is the difference between a logical ERD and a physical ERD?
A physical ERD contains references to how much data will be stored, and has considerably more metadata defined.
What are the steps involved in converting from a logical ERD to a physical ERD?
There are five steps involved in converting from a logical ERD to a physical ERD:
- Change entities to tables or files
- Change attributes to fields
- Add primary keys
- Add foreign keys
- Add system-related components
What are the main dimensions in which a relational database is optimized?
There are two main dimensions that a relational database is optimized in:
- Storage efficiency
- Speed of access
What are the best ways of optimizing storage efficiency?
The best ways to optimize storage efficiency are to:
- Ensure there is no redundant data and few null values
- Normalize properly
What are the best ways of optimizing storage access speed?
The best ways to optimize storage access speed are to:
- Denormalize
- Use clustering
- Use indexing
- Properly estimate the size of data for hardware planning
What is denormalization?
Denormalization is the act of adding in redundant data in order to reduce the amount of time required to find data that is commonly retrieved.
What is clustering?
Clustering is when records are placed together physically so that like records are close to one another.
What is intrafile clustering?
Intrafile clustering is a form of clustering where similar records in a table are stored close together.
What is interfile clustering?
Interfile clustering is when multiple records are combined from more than one table that are generally retrieved together.
What is indexing?
Indexing is the use of an index in order to improve retrieval times from a database.