Chapter 8: Data Structures and CAATTs for Data Extraction Flashcards
Data structures have two fundamental components:
organization and access method
_______________ refers to the way records are physically arranged on the secondary storage device. This may be either sequential or random.
Organization
The _______________ is the technique used to locate records and to navigate through the database or file.
access method
Under this arrangement, for example, the record with key value 1875 is placed in the physical storage space immediately following the record with key value 1874. Thus, all records in the file lie in contiguous storage spaces in a specified sequence (ascending or descending) arranged by their primary key.
sequential structure
An ________________ is so named because, in addition to the actual data file, there exists a separate index that is itself a file of record addresses. This index contains the numeric value of the physical disk storage location (cylinder, surface, and record block) for each record in the associated data file.
indexed structure
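To make the index-then-data lookup concrete, here is a minimal sketch. The dictionaries `index` and `disk`, the addresses, and the record contents are all hypothetical stand-ins for a real index file and disk; they are not taken from the textbook.

```python
# Hypothetical index file: primary key -> (cylinder, surface, record block).
index = {
    1874: (12, 3, 40),
    1875: (87, 1, 5),
    1876: (2, 0, 19),
}

# Hypothetical data file, modelled as storage keyed by physical address.
# Note that neighbouring keys need not sit at neighbouring addresses.
disk = {
    (12, 3, 40): {"key": 1874, "name": "Record A"},
    (87, 1, 5): {"key": 1875, "name": "Record B"},
    (2, 0, 19): {"key": 1876, "name": "Record C"},
}

def read_record(key):
    """Search the index for the key's address, then fetch the record at that address."""
    address = index[key]
    return disk[address]

record = read_record(1875)
```

The point of the sketch is the two-step access: the index is searched first, and only then is the data file read at the address the index supplies.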
Records in an _________________ are dispersed throughout a disk without regard for their physical proximity to other related records.
indexed random file
The ___________________________ structure is used for very large files that require routine batch processing and a moderate degree of individual record processing. For instance, the customer file of a public utility company will be processed in batch mode for billing purposes and directly accessed in response to individual customer queries.
virtual storage access method (VSAM)
A VSAM file has three physical components:
the indexes
the prime data storage area
the overflow area.
A ______________ employs an algorithm that converts the primary key of a record directly into a storage address.
hashing structure
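As an illustration of converting a key directly into an address, the sketch below assumes a simple division-remainder hash. The flashcard does not say which algorithm is used, so the method, the slot count `NUM_SLOTS`, and the sample key are assumptions chosen for illustration only.

```python
NUM_SLOTS = 997  # assumed number of storage locations (a prime, picked for illustration)

def hash_address(primary_key: int) -> int:
    """Convert a primary key directly into a storage slot number."""
    return primary_key % NUM_SLOTS

# Hypothetical storage area, keyed by slot number.
storage = {}

def store(primary_key, record):
    # This sketch ignores collisions: a second key that hashes to the
    # same slot would simply overwrite the first record.
    storage[hash_address(primary_key)] = record

def fetch(primary_key):
    # No index search is needed; the address is computed from the key itself.
    return storage.get(hash_address(primary_key))

store(15943, {"key": 15943, "item": "widget"})
```

Because the address is computed rather than looked up, retrieval takes a single calculation. Different keys can produce the same slot number, which a real implementation would have to handle.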
The principal advantage of hashing is _____________________.
access speed
A ________________ is used to create a linked-list file.
pointer structure
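A linked-list file can be sketched as records that each carry a pointer to the next record. The addresses, keys, and the `next` field name below are hypothetical.

```python
# Hypothetical file: storage address -> record. Each record holds its data
# plus a pointer field containing the address of the next record.
records = {
    10: {"key": 124, "next": 42},
    42: {"key": 125, "next": 7},
    7: {"key": 126, "next": None},  # None marks the end of the list
}

def traverse(first_address):
    """Follow the pointers from the first record and collect the keys in list order."""
    keys = []
    address = first_address
    while address is not None:
        record = records[address]
        keys.append(record["key"])
        address = record["next"]
    return keys

chain = traverse(10)
```

The records are scattered across addresses 10, 42, and 7, yet following the pointers visits them in key order.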
A ___________________ contains the actual disk storage location (cylinder, surface, and record number) needed by the disk controller. This physical address allows the system to access the record directly without obtaining further information. This method has the advantage of speed, since it does not need to be manipulated further to determine a record’s location.
physical address pointer
A _____________ contains the relative position of a record in the file. For example, the pointer could specify the 135th record in the file. This must be further manipulated to convert it to the actual physical address. The conversion software calculates this by using the physical address of the beginning of the file, the length of each record in the file, and the relative address of the record being sought.
relative address pointer
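The conversion described above can be sketched as simple arithmetic. This version assumes fixed-length records, a 1-based record position, and a flat byte offset standing in for the real cylinder/surface/record address. The function name and the sample numbers are hypothetical.

```python
def relative_to_physical(file_start: int, record_length: int, position: int) -> int:
    """Convert a 1-based relative record position into a byte address.

    file_start    -- address of the beginning of the file
    record_length -- length of each (fixed-length) record
    position      -- relative position of the record being sought
    """
    return file_start + (position - 1) * record_length

# The 135th record of a file that starts at address 4096 with 200-byte records.
address = relative_to_physical(file_start=4096, record_length=200, position=135)
```

The three inputs match the three items the conversion software uses: the file's starting address, the record length, and the relative address of the record.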
A _________________ contains the primary key of the related record. This key value is then converted into the record’s physical address by a hashing algorithm.
logical key pointer
This structure uses an index in conjunction with a sequential file organization. It facilitates both direct access to individual records and batch processing of the entire file. Multiple indexes can be used to create a cross-reference, called an inverted list, which allows even more flexible access to data.
indexed sequential file structure
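The combination of a sequential file with more than one index can be sketched as follows. The customer records, the `city` field, and the helper names are hypothetical, and in-memory dictionaries stand in for index files.

```python
from collections import defaultdict

# Hypothetical sequential file, stored in primary key order.
records = [
    {"id": 101, "city": "Austin"},
    {"id": 102, "city": "Boston"},
    {"id": 103, "city": "Austin"},
]

# Primary index: key -> position in the sequential file (direct access).
primary_index = {rec["id"]: pos for pos, rec in enumerate(records)}

# Second index on a non-key field: city -> list of primary keys.
city_index = defaultdict(list)
for rec in records:
    city_index[rec["city"]].append(rec["id"])

def by_id(record_id):
    """Direct access to one record through the primary index."""
    return records[primary_index[record_id]]

def by_city(city):
    """Cross-reference lookup: every record for a city, via the second index."""
    return [by_id(record_id) for record_id in city_index.get(city, [])]

# Batch processing reads the whole file in its stored sequence.
batch_order = [rec["id"] for rec in records]
```

One file serves all three uses here: a sequential pass for batch work, a keyed lookup for a single record, and a lookup by a second attribute through the extra index.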
An _______ is anything about which the organization wishes to capture data. These may be physical, such as inventories, customers, or employees. They may also be conceptual, such as sales (to a customer), accounts receivable (AR), or accounts payable (AP).
entity
The term _____________ is used to describe the number of instances or records that pertain to a specific entity.
occurrence