Exam 3 Flashcards
SQL original name
SEQUEL: Structured English Query Language: Part of System R in 1973
SQL stands for
Structured Query Language
SQL is based on
Based on relational tuple calculus and some algebra
ANSI & ISO Standards for SQL
SQL/86, SQL 89, 92, SQL2, SQL3, Revisions in 2003, 2006, 2008, 2011…
SQL is supported by
IBM DB2, ORACLE, SYSBASE, SqlServer, MySQL, many more (sort of all support it)
INSERT statement for SQL
Insert into UserInterests(Email, Interest, SinceAge) Values(‘user12@gt.edu’, ‘Reading’, 5);
only will insert a single row at a time
can be rejected if you try to insert a row if the primary key already exists in the table
Delete statement in SQL
delete from UserInterests where Interest=”Swimming”;
deletes multiple rows on matching rows
Update statement in SQL
update UserInterests set Interest=’Rock Music’ where Email=”user3@gt.edu’ and Interest=’Music’;
can affect multiple rows if the condition succeeds
General SQL Query Syntax
Select column1, column2, … columnN from table1, table2, … tableM where condition;
column is the name of a column in some table like BirthYear
table is the name of a table like RegularUser
condition may compare values of columns to constraints or to each other like BirthYear > 1985 or CurrentCity = HomeTown
conditions can be combined with ANT, OR, Not, and ()
Selection and the * wildcard
Find all RegularUser’s
SELECT Email, BirthYear, Sex, CurrentCity, HomeTown from RegularUser;
or
Select * from RegularUser;
Selection with a WHERE clause
Find all RegularUser’s with HomeTown Atlanta
Select * from RegularUser where HomeTown = ‘Atlanta’;
returns all rows that matches “Atlanta” for hometown
Selection with composite WHERE clause
Find all RegularUser’s who have the same CurrentCity and HomeTown or who live in Atlanta
Select * from RegularUser WHERE CurrentCity = HomeTown OR HomeTown=’Atlanta’;
Returns rows that matches any of the two conditions
Projection in SQL
Find Email, BirthYear, and Sex for RegularUser’s in Atlanta
Select Email, BirthYear, Sex from RegularUser where HomeTown=’Atlanta’;
includes only the rows that you select. But does include duplicates by default!
Distinct - tables may have duplicate rows
Select Distinct(Sex) from RegularUser where HomeTown=’Atlanta’;
Eliminates duplicates like in relational algebra and calculus for all the columns that you mark as distinct
Natural Inner Join – and dot notation
Find Email, BirthYear, and Salary for RegularUser’s who have a salary by combining RegularUser data with YearSalary data
SELECT Email, RegularUser.BirthYear, Salary from RegularUser, YearSalary WHERE RegularUser.BirthYear = YearSalary.BirthYear
remember this only returns rows where the join condition is true
the dot notation is only necessary when there’s ambiguity
Remember we can also do NATURAL JOIN!
Natural Join in SQL!
SELECT Email, RegularUser.BirthYear, Salary from RegularUser NATURAL JOIN YearSalary;
This works just like using =
Natural Inner Join - Aliases
Select Email, R.BirthYear, Salary From RegularUser as R, YearSalary as Y Where R.BirthYear = Y.BirthYear;
Aliases save typing
Aliases are used to disambiguate table references
Aliases must be used when joining a table to itself!
Left Outer Join
Find Email, BirthYear, and Salary for RegularUser’s who have a Salary by combining RegularUser data with YearSalary data. Return Email and BirthYear even when the RegularUser has no Salary.
Select Email, RegularUser.BirthYear, Salary From RegularUser LEFT OUTER JOIN YearSalary;
All rows from the first table have NULL for the extra columns
String Matching
Find data about RegularUser’s who live in a CurrentCity that starts with “San”
SELECT Email, Sex, Current City FROM RegularUser WHERE CurrentCity LIKE ‘San%’;
% matches any string, including the empty string
Wildcard %
matches any string, including the empty string
Wildcard _
matches any single character. ‘A____’ A with 4 characters
Sorting! (oh no!)
Find data about RegularUser’s who are Males. Sort the data by ascending CurrentCity
Select Email, Sex, CurrentCity From RegularUser Where Sex=’M’ Order By CurrentCity Asc;
sorts it in alphabetical order
Can sort on multiple columns (one by asc, another with desc)
Set Operations - Union
Find all CurrentCity’s and HomeTowns without duplicates
Select CurrentCity from RegularUser UNION Select HomeTown From RegularUser;
NO DUPLICATES!
What Union command returns duplicates?
Select CurrentCity from RegularUser Union All Select HomeTown from RegularUser;
Set Operations - Intersect
Find all cities that are CurrentCity’s for someone and HomeTown’s for someone. Do not include duplicates.
SELECT CurrentCity FROM RegularUser INTERSECT Select HomeTown FROM RegularUser;
What Intersection command returns duplicates?
SELECT CurrentCity FROM RegularUser INTERSECT ALL Select HomeTown FROM RegularUser;
If a value intersects and appears twice in the same column, it will be listed twice with ALL
Set Operations - Except
Find all cities that are CurrentCity’s but exclude those that are HomeTown’s. Do not include duplicates.
SELECT CurrentCity FROM RegularUser EXCEPT SELECT HomeTown FROM RegularUser;
Except with duplicates?
SELECT CurrentCity FROM RegularUser EXCEPT ALL SELECT HomeTown FROM RegularUser;
If a value intersects is in the first column and appears twice in the that column, it will be listed twice with ALL
List some built in functions
count, sum, avg, min, max
Count the number of RegularUser’s
Select count(*) FROM RegularUser;
Find Email and BirthYear for the Youngest Female RegularUser
Select Email, max(BirthYear) from RegularUser where Sex=’F’;
Group By
Group UserInterests on Email.
Select Email, UserInterests Group By Email
Group By (2)
For each group, return the email, the number of interests, and the average sinceage for each group
group userinterests on email
sort the result by ascending number of interests
SELECT Email, count(*) AS NumInt, avg(SinceAge), AvgAge FROM UserInterests GROUP BY Email ORDER BY NumInt ASC;
Having - condition on the group
having is like a “where” but for Group By
Nested Queries - In/Not In
IN means in the result rows of the subquery
Find email and interest for RegularUser’s in Atlanta using nested queries
Select Email, Interest from UserInterests Where Email IN (Select Email FROM RegularUser WHERE HomeTown = ‘Atlanta’);
As an alternative for this, we can just use a natural join
Nested Queries: =, , etc
Find CurentCity’s with at least one RegularUser with a Salary that’s higher than all Salaries of RegularUsers with Hometown Austin
SELECT CurrentCity FROM User R, YearSalary Y WHERE R.BirthYear = Y.BirthYear AND Salary > ALL (SELECT Salary FROM RegularUser R, YearSalary Y Where R.BirthYear = Y.BirthYear AND HomeTown = ‘Austin’);
Nested Queries - Correlated
Find Email and BirthYear of RegularUsers who have no Interests
Select R.Email, BirthYear from RegularUser R Where not exist (Select * From UserInterests U Where U.Email = R.Email)
R does not exist in inner query. It is a “reference out of scope”
think of it as a sub-query evaluated once for each row of the outer query