Database Flashcards by H Li

SQL Pipeline from query submit to result returned

[User] 
|
Query
|
[Query Parser] - Parser, Preprocessor/Validation, Optimizer
|
Query Plan
|
[Execution Engine]
|
Result

How well did you know this?

Not at all

Perfectly

ACID Properties of Database System

Atomicity
- All or nothing
Consistency
- Strong (always consistent)
- Eventual (May have stale data for short-while)
Isolation
- each Transaction is executed in isolation (like in its own VM)
Durability
- Result/effect of transaction, if completed, is permanent

How well did you know this?

Not at all

Perfectly

Data Model (or java Class, or Data Structure)

Data Model (is similar to a Java Class) that it describes a “Table”

1) Structure of Data/Table (what fields/data members)
2) Constraints on Data (value type/range/format of each filed)
3) Operations on Data (what data operations insert/join/append/getter/setter)

Object instances to Java class is Tables to DBMS Relational Data Models

Inside Each Table, then the “Records” can be represented by Object instances. That is a DBMS Relational Data Models stores objects

How well did you know this?

Not at all

Perfectly

General Data Model vs DBMS Data Model

General Data Model is like a java class, that it extends fro Java Object class.

The java Object class does NOT describe relationships between objects. Thus overall in general, java objects do not carry Relationship with other objects.

DBMS Relational Data Model (A “Table” class is a like a specialized java class, that all Data Models in DBMS extends from.

Each DBMS Relational Data Model Instance defines a “Table” or “Relation”

DBMS Relational Data Model class defines these members and behaviors:

1) Schema (attributes, and validators0
2) Tuples (records of the table)
3) Getter and Setter (of tuples)

// standard ops on Relational Data Model, such as
// join, select, ...

// Schema of the Model can be used to generate attributes
// and their validators.

class DBMS_Relational_DataModel {
    public relation_name;
    public List attributes;
    public List attributes_types;
    public String primary_key_column;
    public String Schema

public Set tuples; //records of the table

DBMS_Relational_DataModel 
join(DBMS_Relational_DataModel that) {
    return new DBMS_Relational_DataModel(
        this.relation_name.join(that.relation_name, "\_\_"),
        this.attributes.append(that.attributes),
        this.attributes_types.append(that.attributes_types)
    )
} };

// Car table inherits ops such as join, select, …

Class Car extends DBMS_Relational_DataModel {
/* Creating a Car instance == Instantiate a Table /
Car() {
DBMS_Relational_DataModel(
“Cars”,
{“Index”, “Model”, “Make”, “MFR”, “Year”, “MRP”}
{Integer.class , String.class, String.class, String.class, Date.class, Integer.class}
),
“Index” / as primary key */
};
}

Representation of a DBMS Data Model Instance is
R( such as
R(“Index”, “Model”, “Make”,…“MRP”);

How well did you know this?

Not at all

Perfectly

Relation, Instance, Table, Record, Keys

A Table is a Relation (among its attributes)
An Record/tuple of the table is an Instance of Relation

The set of attributes that uniquely identify a Record/Instance is “Keys Of Relationship”
(I.e. the constraints on the values of the “Keys” attributes is that they cannot have same value between two records/instances/tuples)

How well did you know this?

Not at all

Perfectly

3 Data Representations in SQL

Table, View and (Transient Temporary Tables)

Stored Relations (Tables)
Views (Seen by clients, Computation Results, Temporary Tables in memory to present computation results) E.g. the “Join” view of two Tables.
Temporary Tables (not seen by clients…intermediate tables created by DBMS implementation)

SQL user only can create TABLE or VIEW

CREATE TABLE employee …
or
CREATE VIEW employee2 AS SELECT ….

How well did you know this?

Not at all

Perfectly

Why SQL is a high-level language

In DBMS impl, there is the class DBMS_Relational_DataModel, and can be used to instantiate each DBMS Relation class/Table. But from user’s perspective, we are not using java or C++ code to instantiate DBMS Relation Data Model instance. Instead, we use high-level language like SQL to “instantiate” a Table, and manipulate. This way it also does not matter if backend is implemented with JAVA or C or NodeJS etc.

E..g

SQL> CREATE table “Movies” (
“name”, CHAR(100),
“length”, INT)
)

will trigger DBMS to instance a new class

      new DBMS_Relational_DataModel(
        "Movies", 
        {"Index", "name", "length"}
        {Integer.class, String.class , Integer.class}
      ),

How well did you know this?

Not at all

Perfectly

Algebra vs (Relational Algebra) vs SQL

Algi-brei-ik Query Language is the center of SQL

SQL built on Relational Algebra

**Algebra: Relationship between variables x, y z such as x^2 + y^2 = z^2

**Relational Algebra: Relationship between Variables that are Sets/Relation/tables (operator Overloading)
===== SET OPERATIONS====(union/intersection/difference)
“+” in Algebra means “join” In Relational Algebra

**SQL: High-level language that express Relational Algebra
Instead of saying X+Y we say X OUTTER JOIN Y

How well did you know this?

Not at all

Perfectly

Relational Algebra = Set/Select/Project Operations

UNION = Set operations
SELECT = Projection
WHERE = Selection
AS = Rename

Relational Algebra between two Table Variables R, S,
1) union (R u S) , intersection (R n S), difference (R - S), of tuples of Relation R, S
(‘UNION’ is not JOIN)

2) Selection (row elimination) return a new Relation with selected all cols and selected rows

3) Projection (col elimination)
Return a new Relation with all rows and selected cols (new duplicate tuples may result)

4) Renaming (meta data change only)

Statement:

SELECT last, first, address FROM employee AS ‘short_form’ WHERE last == ‘john’

SELECT – Projection
WHERE – Selection
AS – Rename

How well did you know this?

Not at all

Perfectly

Relational Algebra = Set/Select/Project Operations

UNION | EXCEPT = Set operations (union, diff)
SELECT = Projection
WHERE = Selection
AS = Rename

Relational Algebra between two Table Variables R, S,
1) union (R u S) , intersection (R n S), difference (R - S), of tuples of Relation R, S
(‘UNION’ is not JOIN)

2) Selection (row elimination) return a new Relation with selected all cols and selected rows

3) Projection (col elimination)
Return a new Relation with all rows and selected cols (new duplicate tuples may result)

4) Renaming (meta data change only)

Statement:

SELECT last, first, address FROM employee AS ‘short_form’ WHERE last == ‘john’

SELECT – Projection
WHERE – Selection
AS – Rename

How well did you know this?

Not at all

Perfectly

Cartesian Product (Cross Product)

Ka-‘ti-sian Product between R, S

R: m cols, u rows
S: n cols, v rows

R, S shares at least 1 column, let’s say it is B

Cartesian(R, S) on B = (m+n) cols, u * v rows
for every row/tuple in R, append each row from S to form a new tuple

r0 r0.B, s0.B, s0
r0 r0.B, s1.B, s1
…
r0 r0.B, svB, sv

r1 r1.B, s0B, s0
r1 r1.B, s1B, s1
…
r1 r1.B, svB, sv

   ...
   ru ru.B, s0B,  s0
   ru ru.B, s1B,  s1
   ...
   ru ru.B, svB,  sv

Then different JOINs apply SELECTION and PROJECTION on the Cartesian (R, S) result

How well did you know this?

Not at all

Perfectly

Natural Join R, S on attributes {A1, A2…}

Apply __________ on Cartesian Product

Apply ___SELECTION____ on Cartesian Product where WHERE condition

T = Cartesian Product (R, S)

T1 = SELEC * FROM T where R.A1 == S.A1 AND R.A2 == S.A2 AND R.A3 == S.A3

How well did you know this?

Not at all

Perfectly

Theta Join R, S on any Condition C

Apply __________ on Cartesian Product

Apply ___SELECTION____ on Cartesian Product with WHERE condition

T = Cartesian Product (R, S)

SELEC * FROM T WHERE C

How well did you know this?

Not at all

Perfectly

SQL Query–> Albegra Relation Operators:

Selection, Projection, Rename,
Union, Intersection, Difference
Join

SQL: 1 statement => Multiple Operations

SQL parse statement into primitive operations (Query Plan), then submit for Optimized Execution.

SELECT titles, years FROM movies AS fox_movies
where
length > 100
AND studioName == “Fox”

T = Cartesian Product (R, S) for tables in FROM
T1 = SELECT * FROM movies WHERE length > 100
T2 = SELECT * FROM movies WHERE studioName==”fox”

return Project (titles, years) from (T1 n T2)

As such, the query is broken into

1) Cartesian product (FROM)
2) Selection (WHERE)
3) Intersection/Union/Difference (AND|OR)
4) Projection (SELECT)
5) Rename.(AS)

How well did you know this?

Not at all

Perfectly

Map SQL to Operations
SELECT-->?
FROM --> ?
WHERE (AND|OR) --> ?
AS --> ?

SELECT–> Projection
FROM –> Cartesian Product
WHERE (AND|OR) –> Selection|Intersection/Union/Diff
AS –> Rename

How well did you know this?

Not at all

Perfectly

Evaluation Order: SELECT FROM WHERE

Study These Flashcards

Order is

FROM
WHERE
SELECT

Because
FROM Is before operations (Source)
WHERE is during operation (Union/Intersection)
SELECT is after operations (Projection on results)

null in SQL

Study These Flashcards

null in SQL is not compared against any object/values.

It can be only checked by “is null” or “NOT is null”

how to check for nullness in SQL

Study These Flashcards

null in SQL is not compared against any object/values.
It can be only checked by “is null” or “NOT is null”

where x is not null
where x = 3
where NOT x=3

“in” operator

Study These Flashcards

WHERE x in (1, 2, 3)

This can be used check for Referential Integrity constraints

“like” operator

Study These Flashcards

in SQL, “like” operator does regular expression matching.

wildcard operator “%” is “.*” in regular expression (0, 1 or more chars)
wildcard operator “_” is “.{1}’ in regular expression (1 char)

Underlying Operations for JOIN (R, S)
determined by how result Key set is generated

SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate
FROM Orders
INNER JOIN Customers ON Orders.CustomerID=Customers.CustomerID;

Study These Flashcards

Assuming R, S can be joined on shared attributes (e.g. {A, B}

T = JOIN(R, S)

The Inner, Left, Right, Full join basically determines HOW the result Key set is generated from R’s Key set and S’s Key set.

INNER Join (Intersection)

1) T = Cartesian Product of R * S
2) select * from T WHERE R.A=S.A AND R.B=S.B

LEFT join = R + INNER(R, S)
RIGHT join = S + INNER(R, S)
FULL JOIN = LEFT JOIN + RIGHT JOIN

how to achieve Cartesian Product with SELECT statement

Study These Flashcards

select * from R, S

is the result of Cartesian Producgt (R, S)

Cartesian Product in SELECT statement

Study These Flashcards

select * from R, S

is the result of Cartesian Producgt (R, S)

Set operations (diff, intersection, union) in SQL

R n S
R u S
R - (R n S)

Study These Flashcards

DIFF (
* Difference between R, 
* (R - S) 
* what is in R but not in S?
)
select ... from R
where ... not in (select .... from S)

INTERSECT )
 * what does both R and S have
 * what is in R is also in S
)
select ... from R
where ... in (select .... from S)

UNION (
* combine both R and S
)
select ... from R
union
select ... from S

will perform union/diff on the two results

eqiv-join

inner join

anti-join

What does R - S = 0 mean?

It means for every tuple in R, it is in S I.e R is a subset of S or R is in S R - S = R - (R n S) = R - R = 0 = Empty set R belongs to S ==> R - S = 0 ==> every tuple in R is also in S If Every tuple in R is also in S ==> R belong to S ==> R - (R n S) = R - R = 0 If R and S are have identical and different tuples, then R - (R n S) = R - S != 0 (What is left is tuple in R but not in S) I.e. (R - S) not in S

What does R - (R n S) = R - R = 0 mean?

Mean R is contained inside S | (R - (R n S) is not in S

Referential Integrity Constraint

Referential Integrity Constraint is like: T1.A uses Enum as attribute T2.A defines Enum values (Enum value must from Enum class) Each T1.A must be from a value in T2.A, I.e. {T1.A} is in {T2.A} E.g Employee : empno, deptno Dept: depno, loc Every depno values in Employee must be from Dept. I.e. select deptno from emp where deptno not in (select deptno from dept) should be empty if RIC is expected.

Key Constraint | If Key is same ==> tuple is same

A Key Constraint means no two tuples in the table/Relation has a same value for the Key attribute Or if two tuples have same value for the Key attribute, they must also have same value for all other attributes. To confirm, get the Cartesian product of the relation itself. select * from emp as R1, emp as R2 where R1.empno=R2.empno AND R1.ename != R2.ename should result in empty set

Data Type constraint

contains on type/length etc of values | such as varchar(128) or int ...

What is a constraint?

A constraint is a condition when it is false, the resulting set should be an empty set. E.g. employee gender should only have 'M' or 'F' as value, thus select * from employee where gender != 'M' ANd gender != 'F' should result in an empty set. We would say the constraint is satisfied. The DBMS should always be in a a condition where all constraints are met after each CRUD transaction. E.g. the DBMS can run above select statement after each INSERT operation to make sure the new entries are also only have "F" or "M" values in gender attribute.

What are the operators in Relational Algebra

union, intersection, difference, selection, projection, Cartesian product, natural join, theta-join, and renaming.

Database Flashcards

(33 cards)