DBMS - File Structures Flashcards by Greg Pavia

File

A named collection of stored records, usually resident on secondary storage units, e.g., a disk.

File Processing

Management of files, e.g., creation, insertion, updating, and deletion.

Two main modes of processing

Batch (as a whole)
Online (one record at a time)

What to provide for in File Structures and Processing

Design requirements
Understanding a DBMS
Comparison and comprehension of software
Data Reorganization
Query optimization purposes

File Processing trains designers with

analytical and engineering techniques

Need for file processing

A computation's input and output can be larger than any main memory can hold
A considerable amount of data sharing is required
Persistent storage is needed between runs of file processing

Mechanism for data access from secondary devices

Different from main memory

Granularity of disk access

Cluster/Sector/Block based
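
As a rough illustration of block-based access, the sketch below works out which blocks must be transferred to read a byte range. The 4096-byte block size and the name `blocks_touched` are assumptions for illustration, not values from the cards.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes; real devices vary


def blocks_touched(offset: int, length: int, block_size: int = BLOCK_SIZE) -> range:
    """Return the block numbers covering bytes [offset, offset + length)."""
    if length <= 0:
        return range(0)
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return range(first, last + 1)


# Reading 100 bytes that straddle a block boundary costs two block transfers.
print(list(blocks_touched(4090, 100)))  # [0, 1]
```

Even a small read is paid for in whole blocks, which is why record placement within blocks matters.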

Data on storage devices can get

Lost/Damaged from physical or logical events

Files placed on

Persistent storage devices. Secondary storage is not directly addressable for computation.

IO instructions between main and secondary storage

Specialized

Access time and magnitude

Secondary storage access times are orders of magnitude larger than main memory access times.

Background insights

The cost of secondary storage is decreasing
Disk performance is improving (good cost-performance ratio)
One can easily replace a disk unit with another

Batch

Data is processed in spans, e.g., at the end of the day or week.

Online/Transaction Model

Data is processed as soon as it arrives. The activity load is spread out and users are served at the same time.

Access Method

A program that implements selection criteria and navigation to manage data in a file on behalf of an application program.

Types of Files

Master
Reference
Transaction
Temporary
Archive
Log Files

Master File

Used as an authority in a given job and is relatively permanent but can change.

Reference File

Usually holds an array of data-value pairs. A reference file is intended as a more static master file with less data content, and access to its records must be the fastest.

Transaction File

Used for a specific part of the business process. Processed with master and reference file(s).

Temporary File

Acts as a dynamic, structured, or transient space. Data is usually created and purged within the span of the same procedure.

Archive Files

Files that record user and system actions and hold historically relevant but now inert data.

Log Files

Support transaction processing requirements

File protection

Protection against loss of data due to programming errors or physical damage (but delimiting unlawful access).

File security

Protection of data files from system operational hazards. Involves standard and severe crashes

Standard crash

Coined for situations where data is lost while in main memory

Severe crash

A crash in which the secondary storage system's integrity or activity is affected.

Records

Details that concern a single item present in a data file are grouped into a record. A data file can have a large number of records.

Record formats

- Structured (known set of attributes/fields)
- Semi-structured (flexible set of attributes)

Structured record size requirement

May be fixed or variable depending on fields.

Fields/Attributes

- Each data detail in a record is a field. Each field has a name, datatype, and size
- Some fields are special, e.g., key fields
- Size depends on medium and file organization

Organizing fields in record

Fixed length, Length+data pair, Field and record delimiters, Field name plus data pair
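
The four layouts on this card can be sketched as byte encodings of one record. The sample record, field widths, and delimiter characters below are assumptions chosen for illustration only.

```python
import struct

record = {"id": "42", "name": "Ada", "city": "Valletta"}  # hypothetical sample record

# 1. Fixed length: every field is space-padded to an assumed fixed width.
WIDTHS = {"id": 6, "name": 10, "city": 12}
fixed = b"".join(record[f].encode().ljust(w) for f, w in WIDTHS.items())

# 2. Length + data pair: a 1-byte length precedes each field value.
length_prefixed = b"".join(
    struct.pack("B", len(v.encode())) + v.encode() for v in record.values()
)

# 3. Field and record delimiters: '|' separates fields, newline ends the record.
delimited = "|".join(record.values()).encode() + b"\n"

# 4. Field name plus data pair: self-describing "name=value" entries.
named = ";".join(f"{k}={v}" for k, v in record.items()).encode() + b"\n"


def parse_length_prefixed(buf: bytes) -> list[str]:
    """Decode layout 2 by reading each length byte, then that many bytes."""
    fields, pos = [], 0
    while pos < len(buf):
        n = buf[pos]
        fields.append(buf[pos + 1 : pos + 1 + n].decode())
        pos += 1 + n
    return fields
```

Fixed-length fields allow offsets to be computed directly, while the other three trade that for compactness or self-description.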

File Organization (serial)

- Records have no value-based ordering
- Can have any size; most serial files have an intrinsic temporal order
- Usually an input into some processing operation, e.g., sorting
Ex. Serial search
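
A serial search, as named on this card, might look like the sketch below: because the records have no value-based order, every record may need to be inspected. The in-memory list and the field names are hypothetical stand-ins for a real file.

```python
from typing import Iterable, Optional


def serial_search(records: Iterable[dict], field: str, value) -> Optional[dict]:
    """Scan records in arrival order; return the first match or None."""
    for rec in records:
        if rec.get(field) == value:
            return rec
    return None


# Records in arrival (temporal) order, not ordered by any field value.
log = [
    {"id": 7, "item": "pen"},
    {"id": 3, "item": "ink"},
    {"id": 9, "item": "pad"},
]
hit = serial_search(log, "id", 3)
```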

File Operations

Open file, read next char, read line, open file for output, write line
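
These operations map onto most languages' standard file APIs. A minimal Python sketch, using a hypothetical scratch file created in a temporary directory:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical scratch file

# Open file for output, then write lines.
with open(path, "w", encoding="utf-8") as out:
    out.write("first line\n")
    out.write("second line\n")

# Open file for input, read the next character, then read lines.
with open(path, "r", encoding="utf-8") as src:
    ch = src.read(1)         # read next char
    rest = src.readline()    # read the remainder of the first line
    second = src.readline()  # read the following line
    at_end = src.readline()  # an empty string signals end of file
```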

File Organization (Sequential)

- Records ordered on one of their fields (the key), ascending or descending
- Can have any size
- Uses its sorted nature to drive another procedure
Ex. Performance just the same as a serial file
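
One procedure that the sorted order can drive is a single-pass merge of a master file with a transaction file, both ordered on the same key. The sketch below is an assumed example of such a procedure, with in-memory lists standing in for the files; the function name and data are hypothetical.

```python
def apply_transactions(master, transactions):
    """Merge two key-ordered lists of (key, amount) pairs in one pass.

    master: (key, balance) pairs sorted by key.
    transactions: (key, delta) pairs sorted by key; keys may repeat.
    Transactions whose key is absent from master are skipped here.
    """
    result, t = [], 0
    for key, balance in master:
        # Skip transactions for keys smaller than the current master key.
        while t < len(transactions) and transactions[t][0] < key:
            t += 1
        # Apply every transaction that matches the current master key.
        while t < len(transactions) and transactions[t][0] == key:
            balance += transactions[t][1]
            t += 1
        result.append((key, balance))
    return result


master = [(1, 100), (2, 50), (4, 75)]
transactions = [(2, -20), (2, 5), (3, 10), (4, 25)]
updated = apply_transactions(master, transactions)
```

Each file is read once from start to end, which only works because both are in key order.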

File Organization (Direct)

- Given a record's key value, translate it to the address of the record in the data file
- Record access performance is probably the best
- Record placement in the data file is independent of the record's order by a key field
- Accommodates free lists
- Deletion: slots freed by deletion are linked into the free list (a linked list) and reused
Ex. Direct RRN access
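
Direct access by relative record number (RRN) with a free list can be sketched as follows. The 16-byte record size, the `*` deletion marker, and the `DirectFile` class are assumptions for illustration; an in-memory buffer stands in for the disk file.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length in bytes
NO_FREE = -1      # marks an empty free list


class DirectFile:
    def __init__(self):
        self.f = io.BytesIO()     # in-memory stand-in for a disk file
        self.count = 0            # number of slots allocated so far
        self.free_head = NO_FREE  # RRN of the first reusable slot

    def read(self, rrn: int) -> bytes:
        self.f.seek(rrn * RECORD_SIZE)  # address = RRN * record size
        return self.f.read(RECORD_SIZE)

    def _write(self, rrn: int, data: bytes) -> None:
        self.f.seek(rrn * RECORD_SIZE)
        self.f.write(data.ljust(RECORD_SIZE, b"\x00")[:RECORD_SIZE])

    def insert(self, data: bytes) -> int:
        if self.free_head != NO_FREE:
            rrn = self.free_head
            # A freed slot stores the RRN of the next free slot after the marker.
            self.free_head = int(self.read(rrn)[1:].rstrip(b"\x00").decode())
        else:
            rrn = self.count
            self.count += 1
        self._write(rrn, data)
        return rrn

    def delete(self, rrn: int) -> None:
        # Mark the slot with '*' and link it to the front of the free list.
        self._write(rrn, b"*" + str(self.free_head).encode())
        self.free_head = rrn


db = DirectFile()
a = db.insert(b"alpha")
b = db.insert(b"beta")
db.delete(a)
c = db.insert(b"gamma")  # reuses the slot freed by the deletion
```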

File Organization (Indexed Sequential)

- Data file records are maintained in key sequence, but direct record access is also supported
- Hybrid between sequential and direct access files
- Slower than sequential files for reading consecutive records
- Slower than direct files for direct access of records
- Best for handling both direct and sequential access of records
Ex. Direct (by key), Sequential (on key)
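
A minimal sketch of the two access paths, assuming a dense in-memory index with one entry per record; real indexed sequential files typically use sparser, multi-level indexes. The data and function names are hypothetical.

```python
import bisect

# Data file: records kept in key sequence (a list stands in for the file).
data = [(10, "ann"), (20, "bob"), (30, "cy"), (40, "dee"), (50, "eve")]

# Index: sorted keys, where position i points at data[i] (dense index, assumed).
index_keys = [key for key, _ in data]


def get(key):
    """Direct access: binary-search the index, then fetch the record."""
    i = bisect.bisect_left(index_keys, key)
    if i < len(index_keys) and index_keys[i] == key:
        return data[i]
    return None


def scan_from(key, limit):
    """Sequential access: locate the start via the index, then read onward."""
    i = bisect.bisect_left(index_keys, key)
    return data[i : i + limit]
```

Here `get(30)` follows the direct path, while `scan_from(25, 2)` starts at the first key not below 25 and reads consecutive records.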

Back of the envelope calculation

Estimate of size, effort, etc.
1. Determine the main parameters that describe and affect the problem
2. Derive a relationship that, given known set-up values and parameters of the required process, describes the scenario
3. Tune values for data volumes and the most typical access methods, and apply the relationship above to yield an estimated solution
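
The three steps can be walked through with a small worked estimate. Every number below is an assumed value for illustration, and the model deliberately treats each block fetch as costing the same, which overstates the cost of reading consecutive blocks.

```python
import math

# Step 1: main parameters (all values are assumed for illustration).
num_records = 1_000_000
record_size = 200   # bytes
block_size = 4096   # bytes
ms_per_block = 10   # assumed average time to fetch one block

# Step 2: relationship between the parameters.
records_per_block = block_size // record_size  # whole records only
num_blocks = math.ceil(num_records / records_per_block)

# Step 3: apply it to typical access patterns.
full_scan_s = num_blocks * ms_per_block / 1000  # read every block once
avg_serial_search_s = full_scan_s / 2           # half the file on average

print(records_per_block, num_blocks, full_scan_s, avg_serial_search_s)
```

With these assumptions the file occupies 50,000 blocks, a full scan takes about 500 seconds, and an average serial search about 250 seconds.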
