L3: SQL Basics Flashcards
How to avoid
Tuple Variable Ambiguity (same name)
in Multi-table Operations?
Explicitly state the table name followed by the variable name:
table.var
Select Distinct fields in the table,
use:
SELECT DISTINCT t1.var1 t2.var3, etc
What is Tuple Variable Ambiguity?
When two different tables (tuples) have the same variable name.
SQL:
Joining Two Tables
- Returns all unique combinations of their tuples which meet some specified join condition
Two Ways:
- Declare both tables in the “FROM” section:
SELECT *
FROM Product, Company
WHERE Manufacturer = CName
- Explicitly declare a JOIN between the tables on a condition:
SELECT *
FROM
Product JOIN Company ON Manufacturer=CName
- Both ways return a new tuple with the combinations
SQL Query:
Basic Form
(SFW Query)
“Select From Where” Query:
SELECT <attributes></attributes>
FROM <one></one>
WHERE <conditions></conditions>
Table Schemas:
Schema
The Schema of a table includes:
- Table Name
- Attributes, one or more underlined as key
- Attribute types
Format:
RelationName( attr1: type, attr2: type, …);
Table Schemas:
Key
The Key
of a Table Schema
is an attribute whose values are unique for each entry.
Indicated by underlining the attribute name:
Table( myKey: int, attr2: string, …)
SQL Operations:
Creating a Table
CREATE TABLE Name(
param1 CHAR(20),
param2 CHAR(10),
PRIMARY KEY(param1),
FOREIGN KEY(param2) REFERENCES Table2(key)
)
- Each parameter may be given a maximum size
- Can define one or more parameters as a primary key
- Declare foreign keys that reference other tables
Simple SQL Query:
Eliminating Duplicate Tuples
Use the DISTINCT command:
If there are more than one of the same value for a parameter, only the first tuple containing it is presented.
SELECT DISTINCT Category
FROM Product
SQL Queries:
Selection
Overview
Selection is the process of
filtering a relation’s tuples
on some condition
Get all the tuples that meet the condition(s)
specified by the WHERE command:
SELECT *
FROM myTable
WHERE
SQL Queries:
Projection
Projection is the operation of
producing an output table
with tuples that have a subset
of their prior attibutes
The returned set of tuples only has
the attributes specified by SELECT.
Can use SELECT * to get all attributes
Order of Operations (Semantics)
in a Multitable Operation
- Put tables together:
- FROM table1, table2…
- Apply Selections/Conditions
- Filters out unwanted tuples
- WHERE ….
- Apply Projection
- Only get required attributes
- SELECT …
Simple SQL Query:
Sorting the Results
Use the ORDER BY command
at the end of the query
- Orders the tuples in ascending order
- Can specify descending order with DESC keyword
- Ties are broken by the second attribute on the ORDERBY list
Example:
SELECT Pname, Price, Manufacturer
FROM Product
WHERE Category=’gizmo’ AND Price>50
ORDER BY Price, PName
Simple SQL Query:
Matching String Patterns:
- Keyword
- Associated Symbols
Use the LIKE keyword
in the Condition Statement
- Finds strings that match a given pattern
Symbols:
- % - matches any sequence of characters
- _ - matches any single character
Usage:
SELECT *
FROM Product
WHERE Pname LIKE ‘%gizmo%’
Multi-Table SQL Queries:
Foreign Keys
A Foreign Key is a parameter of a tuple in a relation table.
It corresponds to the Primary Keys of another table.
Essentially, it is a reference to a tuple in another table.
True or False:
SQL commands and values are
NOT case sensitive
FALSE!
- SQL Commands are NOT case sensitive:
- SELECT = Select = select
However,
- SQL Values ARE case sensitive:
- ‘Seattle’ ≠ ‘seattle’
Key Constraints
- A key is a minimal subset of attibutes
- It acts as a unique identifier for tuples in a relation
- A key is an Implicit Constraint on the tuples that are allowed to be in the relation
- No two tuples can have the same key
- If two tuples have the same key, they must be the same tuple
Tables in SQL
Relations are represented by tables.
A relation is a multiset of tuples,
having the attributes specified by the schema.
- The Columns of the table represent the attributes
- Each row represents one tuple
Data Types
in SQL
- All attributes must have an atomic type, the tables are “flat”
- Character Types:
- CHAR - set number of characters
- CHAR(20) or CHAR(30)
- VARCHAR - variable number of characters, up to some maximum :
- VARCHAR(50)
- CHAR - set number of characters
- Numerical Values:
- INT
- SMALLINT
- BIGINT
- FLOAT
- Other Common types:
- MONEY
- DATETIME
What happens if
the Entry that corresponds to a Foreign Key
is deleted?
There are multiple options,
implementation is up to the Database Administrator
- Disallow the delete operation
- This is pretty restrictive
- Remove all tuples that reference that entry
- Probably a lot of unnecessary deletions
- Replace the Foreign Key with NULL
- Most likely
Foreign Keys
as Constraints
Foreign Keys can act as constraints:
- Cannot insert a tuple who’s foreign key doesn’t exist as a primary key in the referenced table
Example:
A Student with sid=”123” must first exist in the “Students” table
before attempting to insert a tuple into the “Enrolled” table using sid=”123”
Tables in SQL:
Cardinality
Arity
- Cardinality
- The number of tuples/rows of the relation
- How many entries are there?
- Arity
- The number of attributes/columns of the relation
- How big are the entries?
What is
SQL?
Structured Query Language
- A standard language for querying and manipulating data
- A very High Level programming language
- Probably the world’s most successful Parallel Programming Language
- Is both:
- Data Definition Language( DDL )
- Data Manipulation Language( DML )
SQL
Data Definition Language (DDL)
Features
- Defines relational schema
- Create/alter/delete tables and attributes
SQL
Data Manipulation Language (DML)
Features
- Insert/Delete/Modify tuples in tables
- Query one or more tables
General Points about
Constraints
- In SQL, arbitrary constraints can be specified
- ex: “There cannot be 25 people in DB class”
- In practice, the fewer constraints the better
- More constraints lead to worse performance
- Lack of constraints may lead to
- more ugly, less convenient techniques
- but improves performance
Multiset:
Definition
Multiset
- An unordered list
- A set with multiple duplicate instances allowed
NULL
and
NOT NULL
NULL
Used when a value is unknown or does not exist
Placed just like a value for some attribute in a tuple
NOT NULL
In SQL, an attribute/column can be constrained to be NOT NULL.
This prohibits NULL values from being added to the table in that column.
Set Operations:
Top 3 Explicit Set Operations in SQL
and their Multiset Variations
- INTERSECT
- Entries existing in both tables
- Removes duplicate entries
- Multiset: INTERSECT ALL
- This does NOT remove duplicates
- UNION
- All entries in both tables
- Removes duplicates
- Multiset: UNION ALL
- Does NOT remove duplicates
- EXCEPT
- Entries that are in one table, not the other
- Multiset: EXCEPT ALL
- Does NOT remove duplicates
SQL Aggregation Operations
SUM
COUNT
MIN
MAX
AVG
- Except for COUNT, all aggregations apply to a single attribute
- Aggregation operations are applied in PROJECTION (meaning after SELECT):
SELECT COUNT(category)
SQL Command Structure
for a Nested Query
- Use the IN keyword, followed by a subquery in parenthesis
- May name the table returned by a subquery which can then be referenced by the base query
SELECT *
FROM s
WHERE s.a IN ( <subquery></subquery>
SELECT t.a
FROM t
WHERE … )
SQL Keywords:
Common Conditional Operation Keywords
ALL
ANY
EXISTS
SQL Keywords:
Conditional Operation Keywords:
ALL
ALL
Use: S > ALL R
Used to check against ALL tuples in a table
The example above checks that S is larger than all tuples in R
SQL Keywords:
Conditional Operation Keywords:
ANY
ANY
Used to check if any one tuple meets the condition.
Use:
S < ANY R
is true if any tuple in R is larger than S
SQL Keywords:
Conditional Operation Keywords:
EXISTS
EXISTS
Used to check that something exists
Use:
EXISTS R
checks that tuples exist in R
SQL Command:
Intersection of Table R and S
Perform 2 Queries that get all entries in both tables, with an INTERSECT command in between the queries
SELECT *
FROM R
INTERSECT
SELECT *
FROM S
Usually, it should be more specific, such as checking only a single attribute
Database Set Notation:
λ(X)
λ(X)
“Count of the tuple in multiset X”
The number of instances of a specific tuple.
Multiset Operations:
Count Relationship
for
Intersection of Tables
For the intersection of tables X and Y:
Count: λ(Z) = min( λ(X) , λ(Y) )
Multiset Operations:
Count Relationship
for
Union of Tables
For the intersection of tables X and Y:
Count: λ(Z) = λ(X) + λ(Y)
SQL Command:
Union of Table R and S
SELECT *
FROM R
UNION
SELECT *
FROM S
Difference between the
HAVING
and
WHERE
conditional clauses
HAVING
clauses contain conditions on aggregates
ex: HAVING SUM(quantity) > 100
WHERE
clauses condition on individual tuples
ex: WHERE quantity > 100
SQL is Compositional.
What does that mean?
Why is it important?
Meaning:
Everything(inputs and outputs) is represented as multisets
Importance:
This allows the output of one query to be used as the input of another.
Meaning we can perform Nested Queries
SQL Command:
EXCEPT
Remove attribute “A” of table R that are also in table S:
SELECT R.A
FROM R
EXCEPT
SELECT R.A
FROM R, S
WHERE R.A = S.A