Lecture 7- Aggregation and Grouping Flashcards
What is an aggregate?
Aggregate function that take a collection (a set or multiset) of values as input and return a single value
What are the five standard built in aggregate functions?
Describe these operations on sets:
- COUNT ([DISTINCT] A)
- SUM ([DISTINCT] A)
- AVG ([DISTINCT] A)
- MAX (A)
- MIN (A)
- The number of (unique) values in the A column
- The sum of all (unique) values in the A column
- The average of all (unique) values in the A column
- The maximum value in the A column
- The minimum value in the A column
What is basic aggregation?
- Count is frequently used to count the number of tuples in a relation
- The notation for this function in SQL is count(* )
Describe the use of basic aggregation and the use of DISTINCT
- It is not allowed in SQL to use DISTINCT with COUNT(asterix)
- If count does not include DISTINCT then COUNT(asterix) = COUNT(x), where x is any set of attributes
- Retaining duplicates is important in computing an average
What is this query doing?
customer (cname, street, city)
Find the number of customers in Edmonton
What is this query doing?
branch (bname, address, city, assets)
Find the total assets of branches in Edmonton
What is this query doing?
deposit (accno, cname, bname, balance)
Find the number of different branches where John Doe has a deposit account
What is this query doing?
loan (accno, cname, bname, amount)
Find the name of the customer with the higest loan amount
What is this query doing?
customer (cname, street, city)
deposit (accno, cname, bname, balance)
Find the number of customers who have deposit accounts in at least 3 different branches
What is this query doing?
branch (bname, address, city, assets)
Find the names of branches which have assets greater than the average assets of all branches
How to compute the number of customers per city?
- Fire off a seperate query for each city
- Define a special grouping operator
SELECT COUNT(asterix)
FROM customer
WHERE city = ‘Edmonton’
SELECT city, COUNT(cname)
FROM customer
GROUP BY city
What is GROUP BY used for?
- The GROUP BY clause is used in SQL to group rows in a relation based on give attributes
- The attributes given in the GROUP BY clause are used to form groups called subgroups
If there is a ________ in the SELECT statement, then the GROUP BY clause must be placed after it
WHERE clause
What type of query is this?
GROUP BY with aggragate functions
What is the rule of GROUP BY with aggregate functions?
Every column name in the SELECT clause must appear in the GROUP BY clause (this is not required if the column name is only used in aggregate function)
What is the HAVING clause used for?
- Used to specify a filter condition for the group generated by the GROUP BY clause
- It is used in SQL statements to specify a filter conditionfor groups of rows or aggregates
What does HAVING clause do?
HAVING is applied to each group seperately to determine if the group as a whole qualifies
What is this query doing?
branch (bname, address, city, assets)
- Find cities with more than two bank branches
- GROUP BY city partitions tuples into groups where all tuples in a group have the same values in city column
- HAVING is applied to each group separately to determine if the group as a whole qualifies
What is this query doing?
loan (accno, cname, bname, amount)
For every branch, list the branch name and the name of every customer who has more than 3 loans each over $100,000 in that branch
Describe the evaluation of GROUP BY with HAVING