Writing complex query Flashcards
In this complex query which will sql process first, the first or second SELECT query?
- Find products that are more
- expensive than lettuce (id = 3)
SELECT *
FROM products
WHERE unit_price > (
SELECT unit_price
FROM products
WHERE product_id = 3
)
parenthesis will process first, the subquery in this query
- Find products that are more
- expensive than lettuce (id = 3)
SELECT *
FROM products
WHERE unit_price > (
SELECT unit_price
FROM products
WHERE product_id = 3
)
True or False
A subquery is limited within a WHERE clause
False
- exercise
- In sql_hr database
- Find employees who earn more than average
SELECT *
FROM employees
WHERE salary > (
SELECT AVG(salary)
FROM employees
)
notice the avg calculation is performed first in the parenthesis within a sub query
This query is returning what information
– Find products that have never been ordered
SELECT DISTINCT product_id
FROM order_items

A list of items that have been ordered by product_id
This query is returning what?
– Find products that have never been ordered
SELECT *
FROM products
WHERE product_id NOT IN (
SELECT DISTINCT product_id
FROM order_items
)

All the items in NOT IN the subquery list of distinct items ordered, so it’s excluding product_id sweet pea sprouts not being ordered.

Explain the steps in completing this exercise
– Find clients without invoices
SELECT *
FROM clients
WHERE client_id NOT IN (
SELECT distinct client_id
FROM invoices
)
- get all the distinct client_id in the invocies table
SELECT distinct client_id
FROM invoices
- then exclude the clients from the client table NOT IN this list created by the subquery
SELECT *
FROM clients
WHERE client_id NOT IN
True or False
Many subqueries and joins can be performed interchangeablely
True
Cite the two types of joins
inner and outter
Same results or not?
– SELECT *
– FROM clients
– WHERE client_id NOT IN (
– SELECT distinct client_id
– FROM invoices
– )
SELECT *
FROM clients
LEFT JOIN invoices USING (client_id)
WHERE invoice_id IS NULL
Same results
Given the tables involved explain this query

– Find customers who have ordered lettuce (id=3)
– Select customer_id, first_name, last_name
select *
from customers
where customer_id in (
select o.customer_id
from order_items oi
join orders o using (order_id)
where product_id = 3
)
subquery with a join on customer_id and order_id on the different tables

Given the same two table explain this query

select distinct customer_id,
first_name,
last_name
from customers c
join orders o using (customer_id)
join order_items oi using (order_id)
where oi.product_id = 3
The two joins here are clear and filtered result from the where clause, easier reading than a subquery route and distinct being used cause a customer appeared more than once in the query.

Memorize the steps here and understand the question being asked.
- select invoices larger than all invoices of
- client 3

Returning all the clients that have a invoice_total > than client #3
select *
from invoices
where invoice_total > (
select max(invoice_total)
from invoices
where client_id = 3
)
Will this query work without the ALL clause?
– select invoices larger than all invoices of
– client 3
select *
from invoices
where invoice_total > all (
select invoice_total
from invoices
where client_id = 3
)
No, the ALL clause is calculating all the values instead of just returning a single value in a where clause

True or False
the MAX and ALL sql clauses are comprable
True
What is the count(*) clause doing here?
– select clients with at least two invoices
select *
from clients
where client_id IN (
select client_id
from invoices
group by client_id
having count(*) >= 2
)
count everything, can see the direct results by including it in the select clause
select client_id, count(*)
then remove the count(*) column with the not needed tally

True or False, this query is equalivalent
WHERE client_id IN (subquery)
WHERE client_id = ANY (subquery)
True same results
This is an example of what type of sql query?
select *
from employees e
where salary > (
select avg(salary)
from employees
where office_id = e.office_id
)
Correlated query, meaning that the outter sub query will be executed based on the number of records being returned. This not returning a single avg across the entire table but avg based on office_id.
Exercise

- Get invoices that are larger than the
- clients average invoice amount
select *
from invoices i
where invoice_total >(
select avg(invoice_total)
from invoices
where client_id = i.client_id
)
True or False
In the following query the bold portion is considered the inner join
select *
from invoices i
where invoice_total >(
select avg(invoice_total)
from invoices
where client_id = i.client_id
)
False
True or False
In the following query the bold portion is considered the outter join
select *
from invoices i
where invoice_total >(
select avg(invoice_total)
from invoices
where client_id = i.client_id
)
False
Refactor this query using the EXISTS clause.
– select clients that have an invoice
select *
from clients
where client_id IN (
select distinct client_id
from invoices
)
select *
from clients c
where exists (
select client_id
from invoices
where client_id = c.client_id
)
True or False
The EXISTS clause is much more efficient than the IN clause
True
Explain why the EXISTS clause is more efficient than the IN clause?
select *
from clients c
where exists (
select client_id
from invoices
where client_id = c.client_id
)
the boolean is first returned to the EXISTS clause, then the subquery where clause is performed, unlike the
where client_id IN
would return all the records no matter how many rows are affected
Exercise
- find the products that have never been ordered
- use the sql_store database
select *
from products p
where not exists(
select product_id
from order_items
where product_id = p.product_id
)
Write this SELECT subquery to return this table
select
invoice_id,
invoice_total,
(select avg(invoice_total)
from invoices) as invoice_average,
invoice_total - (select invoice_average) as difference
from invoices
Write this SELECT subquery to return this table
ERROR!
select
client_id,
name,
(select sum(invoice_total)
from invoices
where client_id = c.client_id) as total_sales
(select avg(invoice_total) from invoices) as average
(select total_sales - average) as difference
from clients c
True or False
You can use an alias in a sql function
False
Write a select subquery that would return this result

select
invoice_id,
invoice_total,
(select avg(invoice_total)
from invoices) as invoice_average,
invoice_total - (select invoice_average) as difference
from invoices
Exercise
Write a query that would return this result. Total sales are calculated per client.

select
client_id,
name,
(select sum(invoice_total)
from invoices
where client_id = c.client_id) as total_sales
from clients c
Just study how to do this

select
client_id,
name,
(select sum(invoice_total)
from invoices
where client_id = c.client_id) as total_sales,
(select avg(invoice_total) from invoices) as average,
(select total_sales - average) as difference
from clients c
notice no aggregate funcs is being performed on the alias’s
select
client_id,
name,
(select sum(invoice_total)
from invoices
where client_id = c.client_id) as total_sales,
(select avg(invoice_total) from invoices) as average,
(select total_sales - average) as difference
from clients c
True or False
When using a subquery in the FROM clause an alias is always required?
True, even if the alias isn’t being used
select *
from (
select
client_id,
name,
(select sum(invoice_total)
from invoices
where client_id = c.client_id) as total_sales,
(select avg(invoice_total) from invoices) as average,
(select total_sales - average) as difference
from clients c
) as what_the_fuck
Just study this from subquery, it’s best to keep sales_summary as a view to use on other virtual tables.

select *
from (
select
client_id,
name,
(select sum(invoice_total)
from invoices
where client_id = c.client_id) as total_sales,
(select avg(invoice_total) from invoices) as average,
(select total_sales - average) as difference
from clients c
) as sales_summary
where total_sales is not null