DBMS - File Structures Flashcards

1
Q

File

A

A named collection of stored records, usually resident on secondary storage units, e.g. disk.

2
Q

File Processing

A

Management of these files, e.g. creation, insertion, updating and deletion.

3
Q

Two main modes of processing

A
  • Batch (as a whole)
  • Online (one record at a time)
4
Q

What to provide for in File Structures and Processing

A
  • Design requirements
  • Understanding a DBMS
  • Comparison and comprehension of software
  • Data Reorganization
  • Query optimization purposes
5
Q

File Processing trains designers with

A

analytical and engineering techniques

6
Q

Need for file processing

A
  • A computation’s input and output can be larger than main memory can hold
  • A considerable amount of data sharing is required
  • Persistent storage is needed between bouts of file processing
7
Q

Mechanism for data access from secondary devices

A

Different from main memory

8
Q

Granularity of disk access

A

Cluster/Sector/Block based

9
Q

Data on storage devices can get

A

Lost/Damaged from physical or logical events

10
Q

Files placed on

A

Persistent storage devices. Secondary storage is not directly addressable for computation.

11
Q

IO instructions between main and secondary storage

A

Specialized

12
Q

Access time and magnitude

A

Has much larger magnitude than main memory accesses

13
Q

Background insights

A
  • The cost of secondary storage is decreasing
  • The disk performances are improving (good cost performance ratio)
  • One can easily replace a disk unit with another
14
Q

Batch

A

Data is processed in spans, e.g. at the end of the day or week.

15
Q

Online/Transaction Model

A

Data is processed as soon as it arrives. The activity load is spread out and users are served at the same time.

16
Q

Access Method

A

Program that implements selection criteria and navigation to manage data in a file on behalf of an application program.

17
Q

Types of Files

A

Master
Reference
Transaction
Temporary
Archive
Log Files

18
Q

Master File

A

Used as an authority in a given job and is relatively permanent but can change.

19
Q

Reference File

A

Usually an array of data-value pairs. A reference file is intended as a more static master file with less data content, and access to its records must be the fastest.

20
Q

Transaction File

A

Used for a specific part of the business process. Processed with master and reference file(s).

21
Q

Temporary File

A

Acts as a dynamic, structured, or transient space. Data usually created and purged within the span of the same procedure

22
Q

Archive Files

A

Files that record user and system actions, as well as historically relevant but now inert data.

23
Q

Log Files

A

Support transaction processing requirements

24
Q

File protection

A

Guarding against loss of data due to programming errors or physical damage, while also limiting unlawful access.

25
Q

File security

A

Protection of data files from system operational hazards. Involves standard and severe crashes

26
Q

Standard crash

A

Coined for situations where data is lost while in main memory

27
Q

Severe crash

A

Occurs when the secondary storage system’s integrity or activity is affected.

28
Q

Records

A

Details that concern a single item present in a data file are grouped into a record. Data file can have a large number of records.

29
Q

Record formats

A
  • Structured (Known set of attributes/fields)
  • Semi-structured (flexible set of attributes)
30
Q

Structured record size requirement

A

May be fixed or variable depending on fields.

31
Q

Fields/Attributes

A
  • Each data detail in record is a field. Each field has a name, datatype and size
  • Some fields are special ex. key fields
  • Size depends on medium and file organization
32
Q

Organizing fields in record

A

Fixed length, Length+data pair, Field and record delimiters, Field name plus data pair
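
The four layouts named above can be compared on one small record. This is only a sketch: the sample record, the 8-byte field width, the one-byte length prefix, and the `|` and newline delimiters are all assumptions chosen for illustration.

```python
import struct

# Hypothetical sample record; field names and values are illustrative only.
record = {"id": "42", "name": "Ada", "city": "Paris"}

def fixed_length(rec, width=8):
    # Every field is padded with spaces to the same width.
    return b"".join(v.encode().ljust(width, b" ") for v in rec.values())

def length_prefixed(rec):
    # Each field is stored as a one-byte length followed by its data.
    out = b""
    for v in rec.values():
        data = v.encode()
        out += struct.pack("B", len(data)) + data
    return out

def delimited(rec, field_sep=b"|", record_sep=b"\n"):
    # Fields are separated by a field delimiter; the record ends with a record delimiter.
    return field_sep.join(v.encode() for v in rec.values()) + record_sep

def name_value(rec, field_sep=b"|", record_sep=b"\n"):
    # Each field carries its own name, so fields may be omitted or reordered.
    return field_sep.join(f"{k}={v}".encode() for k, v in rec.items()) + record_sep
```

Fixed-length fields waste space on padding but make every record the same size. The other three layouts save space at the cost of having to parse the record to locate a field.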

33
Q

File Organization (serial)

A
  • Records have no value based ordering
  • Can have any size, most serial files have intrinsic temporal order
  • Usually is an input into some processing operation ex. sorting

ex. Serial Search
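
A serial search can be sketched as a plain scan in arrival order. The `serial_search` helper and the sample `arrivals` list are hypothetical names used only for this illustration.

```python
def serial_search(records, predicate):
    # Records have no value-based order, so every record may need to be inspected.
    for position, record in enumerate(records):
        if predicate(record):
            return position, record
    return None  # not found after scanning the whole file

# Sample data in arrival (temporal) order, not ordered by the second field.
arrivals = [("10:02", "C"), ("10:05", "A"), ("10:09", "B")]
hit = serial_search(arrivals, lambda r: r[1] == "A")
```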

34
Q

File Operations

A

Open file, read next char, read line, open file for output, write line
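
As a rough illustration, the listed operations map onto Python's built-in file API as follows. The temporary path and the sample lines are assumptions made for this sketch.

```python
import os
import tempfile

def demo_file_operations(path):
    # Open file for output, then write two lines.
    with open(path, "w", encoding="utf-8") as out:
        out.write("first line\n")
        out.write("second line\n")
    # Open file for input.
    with open(path, "r", encoding="utf-8") as f:
        first_char = f.read(1)        # read next char
        rest_of_line = f.readline()   # read the remainder of the current line
        next_line = f.readline()      # read the following line
    return first_char, rest_of_line, next_line

path = os.path.join(tempfile.mkdtemp(), "demo.txt")
result = demo_file_operations(path)
```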

35
Q

File Organization (Sequential)

A
  • Records are ordered on one of their fields (the key), ascending or descending
  • Can have any size
  • Uses sorted nature to drive another procedure

Ex. Performance just the same as serial file
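
One way the sorted order can drive another procedure is a single co-sequential pass that applies a sorted transaction file to a sorted master file. The sketch below assumes both inputs are lists of `(key, amount)` pairs in ascending key order; the function name and sample data are hypothetical.

```python
def apply_transactions(master, transactions):
    """Apply sorted transactions to a sorted master in one forward pass."""
    updated = []
    t = 0
    for key, balance in master:
        # Skip transactions whose key has no matching master record (ignored in this sketch).
        while t < len(transactions) and transactions[t][0] < key:
            t += 1
        # Apply every transaction that matches the current master key.
        while t < len(transactions) and transactions[t][0] == key:
            balance += transactions[t][1]
            t += 1
        updated.append((key, balance))
    return updated

master = [(1, 100), (2, 50), (4, 0)]
transactions = [(1, -30), (2, 5), (2, 5), (3, 99), (4, 10)]
new_master = apply_transactions(master, transactions)
```

Because both files are in key order, neither is ever read backwards, which is what makes this style of processing suit batch runs.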

36
Q

File Organization (Direct)

A
  • A record’s key value is translated to the address of the record in the data file
  • Record access performance is probably the best
  • Record placement in the data file is independent of the records’ order by a key field
  • Accommodates free lists

Deletion: slots of deleted records are linked into a free list and reused by later insertions.

Ex. Direct RRN access
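
Direct access by relative record number (RRN) can be sketched with fixed-length slots, where the byte address is `rrn * RECORD_SIZE`. Several things here are assumptions for illustration: the 16-byte record size, the in-memory `io.BytesIO` standing in for a disk file, and a plain Python list used as the free list instead of a linked list stored inside the file.

```python
import io

RECORD_SIZE = 16  # assumed fixed slot size in bytes

class DirectFile:
    def __init__(self):
        self.f = io.BytesIO()   # stands in for a file on secondary storage
        self.free = []          # RRNs of deleted slots available for reuse
        self.count = 0          # number of slots allocated so far

    def write(self, rrn, data):
        self.f.seek(rrn * RECORD_SIZE)  # address computed directly from the RRN
        self.f.write(data.ljust(RECORD_SIZE, b" ")[:RECORD_SIZE])

    def read(self, rrn):
        self.f.seek(rrn * RECORD_SIZE)
        return self.f.read(RECORD_SIZE).rstrip(b" ")

    def insert(self, data):
        # Reuse a freed slot if one exists, otherwise extend the file.
        if self.free:
            rrn = self.free.pop()
        else:
            rrn = self.count
            self.count += 1
        self.write(rrn, data)
        return rrn

    def delete(self, rrn):
        self.free.append(rrn)  # the slot stays in place and is marked reusable

df = DirectFile()
rrns = [df.insert(b"alpha"), df.insert(b"beta"), df.insert(b"gamma")]
df.delete(1)
reused = df.insert(b"delta")
```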

37
Q

File Organization (Indexed Sequential)

A
  • Data file records are maintained in key sequence, but direct record access is also supported.
  • Hybrid between sequential and direct access files
  • Slower than sequential to read consecutive records
  • Slower than direct files for direct access of records
  • Best to handle both direct and sequential access of records

Ex. Direct by key, Sequential (on key)
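
Both access paths can be sketched with records kept in key order plus an index searched by bisection. The class name, the dense in-memory index, and the sample records are assumptions for illustration. Real indexed sequential files typically keep a sparser, multi-level index on disk.

```python
import bisect

class IndexedSequential:
    def __init__(self, records):
        # Data records are kept in key sequence.
        self.records = sorted(records, key=lambda r: r[0])
        # Dense index: one key entry per record (an assumption of this sketch).
        self.keys = [k for k, _ in self.records]

    def get(self, key):
        # Direct access by key through the index.
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.records[i][1]
        return None

    def scan(self, start_key):
        # Sequential access in key order, starting at the first key >= start_key.
        i = bisect.bisect_left(self.keys, start_key)
        return self.records[i:]

isf = IndexedSequential([(30, "c"), (10, "a"), (20, "b")])
```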

38
Q

Back of the envelope calculation

A

Estimate of size, effort etc.

  1. Determine main parameters that describe and affect the problem
  2. Derive a relationship that, given known set-up values and the parameters of the required process, describes the scenario
  3. Tune values for data volumes and the most typical access methods, and apply the relationship above to yield an estimated solution.
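
The three steps can be illustrated with a rough estimate of how long a full serial scan takes. Every figure below is an assumption chosen for the example, not a measured value: one million 100-byte records, 4096-byte blocks, 0.1 ms per block read, and records that never span blocks.

```python
import math

def estimate_scan(num_records, record_bytes, block_bytes, ms_per_block):
    # Step 1: parameters are the record count, record size, block size and block read time.
    # Step 2: relationship: blocks = ceil(records / records_per_block),
    #         time = blocks * time_per_block.
    records_per_block = block_bytes // record_bytes  # records do not span blocks
    blocks = math.ceil(num_records / records_per_block)
    seconds = blocks * ms_per_block / 1000.0
    return blocks, seconds

# Step 3: plug in assumed values for a typical workload.
blocks, seconds = estimate_scan(1_000_000, 100, 4096, 0.1)
```

With these assumed values the file occupies 25,000 blocks and a full scan takes roughly 2.5 seconds. Changing any parameter shows immediately which one dominates the estimate.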