Chapter 11 - Data Storage Design Flashcards

1
Q

What is data storage design?

A

Data storage design is the process of deciding how data is stored and handled by the programs that run the system.

2
Q

What are the main steps of data storage design?

A

The main steps of data storage design are to:

  • Select the data storage format
  • Convert the previously made logical data model into a physical data model
  • Ensure that the ERDs and DFDs balance
  • Design the selected data storage format to optimize its processing efficiency
3
Q

What are the main types of data storage formats?

A

There are two main types of data storage formats:

  • Files
  • Databases
4
Q

What is a linked list?

A

A linked list is a file whose records are linked together using pointers.
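
As a rough illustration, records linked by pointers can be sketched in Python. The `Record` class, its fields, and the sample data are hypothetical, and in-memory references stand in for the pointers a file would store.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Record:
    """One record; `next` plays the role of the pointer to the next record."""
    data: str
    next: Optional["Record"] = None


def traverse(head: Optional[Record]) -> Iterator[str]:
    """Follow the pointers from the first record to the last."""
    current = head
    while current is not None:
        yield current.data
        current = current.next


# Build a three-record list: each record points to the one after it.
third = Record("order 3")
second = Record("order 2", next=third)
first = Record("order 1", next=second)

records_in_order = list(traverse(first))
```

Reading the list always starts at the first record and follows one pointer at a time, which is why sequential access is natural for this structure.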

5
Q

What are some types of files that are commonly used?

A

There are several different types of files:

  • Master files - Store application-critical information
  • Look-up files - Contain static values used for validation (like a list of valid codes)
  • Transaction files - Store information that can be used to update a master file
  • Audit files - Record “before” and “after” states of data as the data is altered
  • History files - Store past transactions
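
The way these files interact can be sketched in Python. This is a minimal illustration, not a real file-processing system: dictionaries and lists stand in for files, and the customer record, field names, and amounts are invented for the example.

```python
import copy

# Master file: the application-critical records, keyed by customer ID.
master = {101: {"name": "Ada", "balance": 50.0}}

# Transaction file: changes waiting to be applied to the master file.
transactions = [
    {"customer_id": 101, "amount": 25.0},
    {"customer_id": 101, "amount": -10.0},
]

audit = []    # "before" and "after" images of each changed record
history = []  # transactions that have already been processed

for txn in transactions:
    record = master[txn["customer_id"]]
    before = copy.deepcopy(record)
    record["balance"] += txn["amount"]
    audit.append({"before": before, "after": copy.deepcopy(record)})
    history.append(txn)
```

After the loop, the master file holds the current balance, the audit file shows how each change altered the record, and the history file keeps the processed transactions.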
6
Q

What are some examples of databases?

A

Some examples of databases are:

  • Legacy databases
  • Relational databases
  • Object databases
  • Multidimensional databases
7
Q

What is a legacy database?

A

A legacy database is a database built on older technology that is rarely used to develop new applications.

8
Q

What are the two types of legacy databases?

A

There are two types of legacy databases:

  • Hierarchical databases: Use hierarchies, or inverted trees, to represent relationships
  • Network databases: Created to address M:N (many-to-many) or nonhierarchical associations
9
Q

What is a relational database?

A

A relational database is currently the most popular kind of database. It is based on collections of tables, each of which has a primary key. Tables are related to one another by placing the primary key of one table in the related table as a foreign key.
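
A small sketch using Python's built-in `sqlite3` module shows two tables related through a foreign key. The `customer` and `orders` tables and their columns are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        total       REAL NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 19.99);
""")

# The foreign key lets the two tables be joined back together.
rows = conn.execute("""
    SELECT c.name, o.order_id, o.total
    FROM orders AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
""").fetchall()
```

Here `orders.customer_id` is the foreign key: it repeats the primary key of `customer` so that each order can be matched to its customer.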

10
Q

What is referential integrity?

A

Referential integrity is the ability of a relational database to ensure that the values linking tables together through primary and foreign keys are valid and stay in sync.
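
As a hedged example, the sketch below uses Python's built-in `sqlite3` module to show a database rejecting a row whose foreign key points at nothing. The table and column names are invented, and other database systems enforce the same idea with their own syntax and defaults.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)
conn.execute("INSERT INTO customer VALUES (1)")
conn.execute("INSERT INTO orders VALUES (10, 1)")  # valid: customer 1 exists

try:
    conn.execute("INSERT INTO orders VALUES (11, 99)")  # no customer 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The second insert fails because no customer has the ID 99, so the orders table never holds a reference to a customer that does not exist.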

11
Q

What is an object database?

A

An object database is a database built on the principles of object orientation: everything is treated as an object that has both attributes (data) and behaviors (processes).

12
Q

What is a multidimensional database?

A

A multidimensional database is a form of relational database that is used extensively in data warehousing.

13
Q

What is an aggregation?

A

An aggregation is a summary of data, such as a total, that a data warehouse calculates and stores across multiple dimensions.
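
One way to picture this is a set of totals computed once along each dimension. The sketch below assumes two dimensions, region and quarter, and uses made-up sales figures.

```python
from collections import defaultdict

# Detailed sales facts: (region, quarter, amount).
sales = [
    ("East", "Q1", 100),
    ("East", "Q2", 150),
    ("West", "Q1", 200),
    ("West", "Q2", 50),
]

# Totals along each dimension, computed once and then reused.
by_region = defaultdict(int)
by_quarter = defaultdict(int)
for region, quarter, amount in sales:
    by_region[region] += amount
    by_quarter[quarter] += amount
```

A question such as "total sales for the East region" can then be answered from `by_region` without rereading the detailed rows.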

14
Q

What are the main factors to consider when choosing a storage format?

A

The main factors to consider when choosing a data storage format are:

  • Data Types
  • Type of Application System
  • Existing Storage Formats
  • Future Needs
15
Q

What is the difference between a logical ERD and a physical ERD?

A

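A logical ERD describes the data without regard to how it will be implemented. A physical ERD contains references to how the data will be stored in files or database tables, and has considerably more metadata defined.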

16
Q

What are the steps involved in converting from a logical ERD to a physical ERD?

A

There are five steps involved in converting from a logical ERD to a physical ERD:

  • Change entities to tables or files
  • Change attributes to fields
  • Add primary keys
  • Add foreign keys
  • Add system-related components
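
The five steps above can be sketched as SQL table definitions, run here through Python's built-in `sqlite3` module. The Customer and Order entities, their attributes, and the `last_updated` field are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Step 1: the Customer entity becomes a table.
    CREATE TABLE customer (
        customer_id  INTEGER PRIMARY KEY,   -- Step 3: primary key
        name         TEXT NOT NULL,         -- Step 2: attribute becomes a typed field
        last_updated TEXT                   -- Step 5: system-related field (assumed)
    );
    -- Step 1 again: the Order entity becomes a second table.
    CREATE TABLE orders (
        order_id     INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL
                     REFERENCES customer(customer_id),  -- Step 4: foreign key
        order_date   TEXT NOT NULL
    );
""")

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
```

Listing the columns afterward confirms that each attribute of the entity ended up as a field in the table.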
17
Q

What are the main dimensions in which a relational database is optimized?

A

There are two main dimensions in which a relational database is optimized:

  • Storage efficiency
  • Speed of access
18
Q

What are the best ways of optimizing storage efficiency?

A

The best ways to optimize storage efficiency are to:

  • Ensure there is no redundant data and few null values
  • Normalize properly
19
Q

What are the best ways of optimizing storage access speed?

A

The best ways to optimize storage access speed are to:

  • Denormalize
  • Use clustering
  • Use indexing
  • Properly estimate the size of data for hardware planning
20
Q

What is denormalization?

A

Denormalization is the act of adding redundant data back into a design in order to reduce the time required to find data that is commonly retrieved.
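
A minimal sketch of the idea, using Python's built-in `sqlite3` module: the customer's name is copied into the orders table so that a common lookup needs no join. The tables and the redundant `customer_name` column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    -- Denormalized: customer_name repeats data already held in customer.
    CREATE TABLE orders (
        order_id      INTEGER PRIMARY KEY,
        customer_id   INTEGER REFERENCES customer(customer_id),
        customer_name TEXT
    );
    INSERT INTO customer VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 'Ada');
""")

# The commonly retrieved name comes back without a join.
row = conn.execute(
    "SELECT order_id, customer_name FROM orders WHERE order_id = 10"
).fetchone()
```

The trade-off is that the redundant copy takes extra space and must be updated whenever the customer's name changes.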

21
Q

What is clustering?

A

Clustering is the physical placement of records so that like records are stored close to one another.

22
Q

What is intrafile clustering?

A

Intrafile clustering is a form of clustering where similar records in a table are stored close together.

23
Q

What is interfile clustering?

A

Interfile clustering is a form of clustering where records from more than one table that are typically retrieved together are stored close together.

24
Q

What is indexing?

A

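Indexing is the use of an index, a small table that holds values from one or more columns along with the location of those values in the main table, to improve retrieval times from a database.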
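
As an illustration, the sketch below creates an index with Python's built-in `sqlite3` module and asks SQLite how it plans to run a query. The table, the index name `idx_orders_customer`, and the data are invented, and the wording of the plan differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

# Index the column that queries filter on most often.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# The last column of each plan row describes the chosen strategy.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42"
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
```

When the index is used, `plan_text` mentions `idx_orders_customer`, meaning the database looks up matching rows through the index instead of scanning every row.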

25
Q

What are the steps involved for calculating estimated storage size?

A

The steps to calculate estimated storage size are:

  • Calculate the amount of raw data
  • Calculate the overhead
  • Record the number of initial records that will be loaded into the table
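
The steps above can be sketched as a short calculation. Every figure here is made up, and the sketch assumes that raw data is the sum of the average field widths and that overhead is expressed as a fraction of raw data.

```python
# All figures below are made-up illustrative values.
average_field_widths = {"customer_id": 4, "name": 30, "email": 40}  # bytes
overhead_rate = 0.30       # overhead as a fraction of raw data (assumed)
initial_records = 50_000   # records loaded when the table is first created

# Step 1: raw data per record is the sum of the average field widths.
raw_bytes = sum(average_field_widths.values())  # 74

# Step 2: add the overhead on top of the raw data.
record_bytes = raw_bytes * (1 + overhead_rate)

# Step 3: multiply by the number of initial records.
initial_table_bytes = record_bytes * initial_records
```

With these numbers the table starts at roughly 4.8 million bytes, a figure that can then feed into hardware planning.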