lec exam Flashcards

1
Q

What are the most important files (file types) in an Oracle database?

A

parameter file, password file, backup files, archive log files.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Give the most important memory structures of an Oracle instance.

A

? data dictionary cache, server result cache, response queue, request queue

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Give the most important processes of an Oracle instance

A

? PMON, RMON, RECO, MMON etc.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

List 5 data dictionary views in an Oracle database.

A

dba views dba objects dba tables dba synonyms dba indexes dba_tab_columns

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

List 10 different schema objects in an Oracle database?

A

Clusters
Database links
Database triggers
Dimensions
External procedure libraries
Indexes and index types
Java classes, Java resources, and Java sources
Materialized views and materialized view logs
Object tables, object types, and object views
Operators
Sequences
Stored functions, procedures, and packages
Synonyms
Tables and index-organized tables
Views

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is a sequence in an Oracle database? Give SQL examples for the creation and usage.

A

A sequence is a mechanism for automatically generating integers that follow a pattern.

creation:
CREATE SEQUENCE seq1
MINVALUE 1 MAXVALUE 100 INCREMENT BY 5
START WITH 50 CYCLE;

SELECT * FROM DBA_SEQUENCES
WHERE sequence_name=’SEQ1’;

usage:
INSERT INTO dept VALUES(seq1.NEXTVAL, ‘IT’, ‘Budapest’);

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Give the data storage concepts (segment, extent etc.) in an Oracle database, and draw the relationships among them.

A

https://imgur.com/a/g7b6E9K

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Describe RAID level 0, 1, 2, 3, 4, 5, 6 technology.

A

RAID 0 (also known as a stripe set or striped volume) splits (“stripes”) data evenly across two or more disks, without parity information, redundancy, or fault tolerance.

RAID 1 consists of an exact copy (or mirror) of a set of data on two or more disks; a classic RAID 1 mirrored pair contains two disks.

RAID 2, which is rarely used in practice, stripes data at the bit (rather than block) level, and uses a Hamming code for error correction.

RAID 3, which is rarely used in practice, consists of byte-level striping with a dedicated parity disk.

RAID 4 consists of block-level striping with a dedicated parity disk.

RAID 5 consists of block-level striping with distributed parity.

RAID 6 extends RAID 5 by adding another parity block; thus, it uses block-level striping with two parity blocks distributed across all member disks.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What does it mean: spanned vs unspanned record? Draw example data blocks for both.

A

?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Give 3 sequencing options for records.

A

Next record physically contiguous, linked, overflow area

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Describe the difference between row store and column store. Give example records for both.

A

In row storing fields of a record are stored contiguously but in column storing fields are stored together(in a column)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is the difference between a sparse index and a dense index?

A

A dense index has an index entry for every search key value in the data file. A sparse index, on the other hand, has index entries for only some of the search values

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is the difference between a primary index and secondary index?

A

Primary index is defined on an ordered data file. The data file is ordered on a key field. The key field is generally the primary key of the relation.
Secondary index may be generated from a field which is a candidate key and has a unique value in every record, or a non-key with duplicate values.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is a clustering index?

A

Clustering index is defined on an ordered data file. The data file is ordered on a non-key field. It includes one index entry for each distinct value of the field; the index entry points to the first data block that contains records with that field value. It is an example of non-dense index where Insertion and Deletion is relatively straightforward with a clustering index.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is the difference between a B-tree and a B+ tree?

A

A B+ tree only stores data in the leaf nodes but a B- tree stores data in the interior nodes as well

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is a bitmap index? What are there in the leaf nodes?

A

A bitmap index is a special kind of database index that uses bitmaps. A leaf node is made up of the following components:
An entry header, which stores the number of columns and locking information
Key column length-value pairs, which define the size of a column in the key followed by the value for the column (The number of such pairs is a maximum of the number of columns in the index.)
ROWID of a row that contains the key values

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What is dynamic hashing?

A

Dynamic hashing is a method of hashing, or shortening a string of characters in computer programming, where the set of shortened characters grows, shrinks, and reorganizes to fit the way the data is being accessed.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What is the most important cost factor in query execution?

A

? Disk I/Os

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Give the meaning of the following notations that we use in cost estimation: T(R), B(R), bf(R), V(R,A), SC(R,A).

A

T(R): number of records in R
B(R): number of pages to store relation R
bf(R): blocking factor
V(R,A): number of distinct values of attribute A in R
SC(R,A): selection cardinality of A in R

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What is the average cost of a selection operation (“sigma”A=xR) if we use a clustered B+ tree index?
in case of single record/multiple record

A

single record: HTi + 1

multiple records: HTi + ceil( SC(R,A)/bfR )

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What is the average cost of a selection operation (“sigma”A=xR) if we use a secondary B+ tree index?
in case of key field/nonkey filed

A

key field: HTi + 1

nonkey field: HTi + SC(A,R)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Describe the external Sort-Merge algorithm. What is the cost of it?

A

The external merge sort is a technique in which the data is stored in intermediate files and then each intermediate files are sorted independently and then combined or merged to get a sorted data.
https://imgur.com/a/Ek4Deb3

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What is the cost of a Nested Loop join algorithm? (best case/worst case)

A

best case: B(R)+B(S)

worst case: T(R) * B(S) + B(R)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What is the cost of a Block Nested Loop join algorithm? (best case/worst case)

A

best case: B(R)+B(S)

worst case: B(R) * B(S) + B(R)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

What is the cost of the improved Block Nested Loop join algorithm?

A
best case: B(R)+B(S)
general case
(B(R) / (M-1)) * B(S) + B(R)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

Describe the Indexed Nested Loop join algorithm? What is the cost of it?

A

B(R) + T(R) * c

c~=T(s)/V(s,A)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

Describe the Sort-merge join algorithm. What is the cost of it?

A

cost of sorting + B(S) + B(R)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

Describe the Hash-join algorithm? What is the cost of it?

A

2*(B(R)+B(S)) + (BR()+B(S))

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

What is materialization and pipelining?

A

Materialization: one operation at a time, materialize intermediate results for subsequent use
Pipeline: evaluate several operations simultaneously, no need to store
temporary results

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q
Give some basic relational algebra expression equivalence rules
Conjunctive selection decomposition
Distribution of selection over join
Distribution of projection over join
Associativity of joins, products, union
A

https://imgur.com/a/uFVDigk

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

Give the meaning of the following index options:
composite index
function-based index
compressed index

A

A composite index is an index on multiple columns.
?
?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

What is a partitioned table?

A

Table partitioning is a way to divide a large table into smaller, more manageable parts without having to create separate tables for each part.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

List the partitioning types an Oracle database supports

A

Range, hash, list partitioning.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

Give the properties of an Index-Organized Table

A

An IOT(index organized table) must contain a primary key.
Rows are accessed via a logical rowid and not a physical rowid like in heap-organized tables.
An IOT cannot be in a cluster.
An IOT cannot contain a column of LONG data type.
You cannot modify an IOT index property using ALTER INDEX (error ORA-25176), you must use an ALTER TABLE instead.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

Give a diagram about two clustered tables

A

?

36
Q

Give estimation for the number of blocks of a product operation: B(R x S)

A

?

37
Q

What is Selection Cardinality?

A

It specifies how many records that can be selected from a node.

38
Q

Give the main steps of query optimization (diagram)

A

?

39
Q

Give two conventional wisdom rules about query optimization

A
Relational algebra level:
transformations
good transformations
Detailed query plan level:
estimate costs
generate and compare plans
40
Q

What is the difference between ALL_ROWS and FIRST_ROWS optimization modes?

A

all_rows: first compute, then return
first_rows: return while computing(if possible)

41
Q

What does an EXPLAIN PLAN statement do?

A

Stores plan in Plan_Table

42
Q

What is stored in PLAN_TABLE?

A

row-sources and operations

43
Q

What is a Full Table Scan operation

A
  • All blocks read sequentially into buffer cache

- Per block: extract + return all rows

44
Q

What is an Index Unique Scan operation

A

Traverses the node blocks to locate correct leaf block
Searches value in leaf block (if not found => done)
Returns rowid to parent row-source

45
Q

What is an Index Range Scan operation

A

Traverses the node blocks to locate most left leaf block
Searches 1st occurrence of value in leaf block
Returns rowid to parent row-source
Continues on to next occurrence of value in leaf block

46
Q

What is Clustering Factor?

A

Average number of blocks to access a single value

Used to rank multiple available range scans

47
Q

What is the difference between Explain Plan and Tracing?

A

Explain-plan: give insight before execution

Tracing: give insight in actual execution

48
Q

What can we do with hints?

A

Force optimizer to pick specific alternative

49
Q

What can we do with ANALYZE command?

A

We can periodically generate statistics

50
Q

What does it mean: a consistent database?

A

Consistent state: satisfies all constraints

Consistent DB: DB in consistent state

51
Q

What is a transaction?

A

collection of actions that preserve consistency

52
Q

How can constraints be violated?

A

Transaction bug (incomplete transaction)
DBMS bug (some process)
Hardware failure
Data sharing

53
Q

What are undesired expected events?

A

System crash

  • memory lost
  • cpu halts, resets
54
Q

Give the 3 important address spaces of a DBMS.

A
  1. The disk blocks
  2. The shared main memory
  3. The local address space of a Transaction
55
Q

Describe the following operations: read, write, input, output.

A

Input (x): block containing x -> memory
Output (x): block containing x-> disk
Read (x,t): do input(x) if necessary t

56
Q

What does “atomicity” property of a transaction mean?

A

execute all actions of a transaction or none at all

57
Q

Describe UNDO logging rules.

A

1) For every action generate undo log record (containing old value)
2) Before x is modified on disk, log records belonging to x must be on disk (write ahead logging: WAL)
3) Before commit is flushed to log, all writes of transaction must be reflected on disk

58
Q

Give the write order to disk in case of UNDO logging

A
  1. The log records indicating changed database elements.
  2. The changed database elements themselves.
  3. The COMMIT log record.
59
Q

Give the recovery rules in case of UNDO logging.

A
1) Let S = set of transactions with			 in log, but no
		 (or ) record in log
(2) For each  in log,
	  in reverse order (latest  earliest) do:
		- if Ti  S then    - write (X, v)
				            - output (X)
(3) For each Ti  S do
		- write  to log (plus FLUSH LOG)
60
Q

What happens when a failure occurs during recovery from UNDO log?

A

We just have to undo idempotent

61
Q

Give the steps of a simple checkpoint in UNDO logging.

A

(1) Do not accept new transactions
(2) Wait until all transactions finish
(3) Flush all log records to disk (log)
(4) Flush all buffers to disk (DB) (do not discard buffers)
(5) Write “checkpoint” record on disk (log)
(6) Resume transaction processing

62
Q

Give the steps of a non-quiescent checkpoint in UNDO logging.

A
  1. Write log record
    T1 … Tk are active transactions, and flush log.
  2. Wait until all Ti-s commit or abort, but don’t prohibit other transactions from starting.
  3. When all Ti-s have completed, write a log record and flush the log.
63
Q

To which point do we have to scan backwards in UNDO log if we use checkpoint?

A

If we first meet an record, then we know that all incomplete transactions began after the previous
record.
We may thus scan backwards as far as the next , and then stop; previous log is useless and may as well have been discarded.

If we first meet a record < START CKPT (T1, … , Tk)>, then the crash occurred during the checkpoint. We need scan no further back than the start of the earliest of these incomplete transactions.

64
Q

Describe REDO logging rules.

A

(1) For every action, generate redo log
record (containing new value)
(2) Before X is modified on disk (DB), all log records for transaction that modified X (including commit) must be on disk
(3) Flush log at commit
(4) Write END record after DB updates flushed to disk

65
Q

Give the write order to disk in case of REDO logging

A
  1. The log records indicating changed database elements.
  2. The COMMIT log record.
  3. The changed database elements themselves.
66
Q

Give the recovery rules in case of REDO logging

A

1) Let S = set of transactions with (and no ) in log
(2) For each in log, in forward
order (earliest  latest) do:
- if Ti  S then Write(X, v)
Output(X)
(3) For each Ti  S, write
to Log (plus FLUSH LOG)

67
Q

Give the steps of a non-quiescent checkpoint in REDO logging.

A
  1. Write log record
    T1 … Tk are active (uncommitted) transactions, and flush log.
  2. Write to disk all database elements that were written to buffers but not yet to disk by transactions that had already committed when the record was written to the log. (dirty buffers)
  3. When all Ti-s have completed, write a log record and flush the log.
68
Q

To which point do we have to scan backwards in REDO log if we use checkpoint?

A

If we first meet an record, we need to scan no further back than the earliest of records (among corresponding .

If we first meet a record < START CKPT (T1, … , Tk)>, then the crash occurred during the checkpoint.
We cannot be sure that committed transactions prior to the start of this checkpoint had their changes written to disk.
Thus, we must search back to the previous record, find its matching < START CKPT (T1, … , Tk)>.

69
Q

What are the key drawbacks of UNDO log and REDO log?

A

Undo logging: need frequent disk writes

Redo logging: need to keep all modified blocks in memory until commit

70
Q

Describe UNDO/REDO logging rules.

A

Page X can be flushed before or after Ti commit
Log record flushed before corresponding updated page (WAL)
Flush at commit (log only)

71
Q

Give the recovery rules in case of UNDO/REDO logging.

A
  1. Redo all the committed transactions in the order earliest-first, and
  2. Undo all the incomplete transactions in the order latest-first.
72
Q

Give the steps of a non-quiescent checkpoint in UNDO/REDO logging

A
  1. Write log record
    T1 … Tk are active (uncommitted) transactions, and flush log.
  2. Write to disk all the buffers that are dirty; i.e., they contain one or more changed database elements. Unlike redo logging, we flush all dirty buffers, not just those written by committed transactions.
  3. When all Ti-s have completed, write a log record and flush the log.
73
Q

What is concurrency control

A

General process of assuring that transactions preserve consistency when executing simultaneously is called concurrency control.

74
Q

What is a schedule?

A

A schedule is a sequence of the important actions taken by one or more transactions.

75
Q

What is a serial schedule

A

A schedule is serial if its actions consist of all the actions of one transaction, then all the actions of another transaction, and so on.

76
Q

What is a serializable schedule?

A

A schedule S is serializable if there is a serial

schedule S’ such that for every initial database state, the effects of S and S‘ are the same.

77
Q

What does it mean: “conflict equivalent”?

A

S1, S2 are conflict equivalent schedules

if S1 can be transformed into S2 by a series of non-conflicting swaps of adjacent actions.

78
Q

What does it mean: “conflict serializable”?

A

A schedule is conflict serializable if it is conflict equivalent to some serial schedule.

79
Q

What is a precedence graph?

A

?

80
Q

What can we say if two schedules are conflict equivalent?

A

S1, S2 conflict equivalent -> P(S1)=P(S2)

81
Q

Give two schedules whose precedence graphs are the same, but not confl. equivalent

A

P(S1)=P(S2) !-> S1, S2 conflict equivalent

82
Q

What can we state about precedence graphs

A

?

83
Q

What does “consistency of a transaction” mean?

A
  1. A transaction can only read or write an element if it previously was granted a lock on that element and hasn’t yet released the lock.
  2. If a transaction locks an element, it must later unlock that element.
84
Q

What does “legality of schedules” mean?

A

Locks must have their intended meaning: no two transactions may have locked the same element without one having first released the lock.

85
Q

What does “two phase locking” mean?

A

It means a transaction hasn’t been unlocked before its locked or locked after it is unlocked