SQL 2 Flashcards
SELECT trips.date,
trips.dropoff,
riders.username
FROM trips
LEFT JOIN riders
ON trips.rider_id = riders.id;
this LEFT JOIN only returns columns you want
In our airplanes database, you’d like to know which flights had an origin airport with an elevation greater than 2000 feet. We can do this with a subquery.
SELECT *
FROM flights
WHERE origin in (
SELECT code
FROM airports
WHERE elevation > 2000);
non-correlated subquery
a subquery that can be run independently of the outer query and as we saw, can be used to complete a multi-step transformation.
but we can also perform transformations on a single table. For instance, sometimes we need to aggregate in multiple steps - like taking an average of a count.
in the e.g. The inner query provides the distance flown by day, and the outer query uses the inner query’s result set to compute the average by day of week of a given month.
SELECT a.dep_month,
a.dep_day_of_week,
AVG(a.flight_distance) AS average_distance
FROM (
SELECT dep_month,
dep_day_of_week,
dep_date,
sum(distance) AS flight_distance
FROM flights
GROUP BY 1,2,3
) a
GROUP BY 1,2
ORDER BY 1,2;
correlated subquery
In a correlated subquery, the subquery can not be run independently of the outer query. The order of operations is important in a correlated subquery:
A row is processed in the outer query.
Then, for that particular row in the outer query, the subquery is executed.
This means that for each row processed by the outer query, the subquery will also be processed for that row. In this example, we will find the list of all flights whose distance is above average for their carrier.
SELECT id
FROM flights AS f
WHERE distance > (
SELECT AVG(distance)
FROM flights
WHERE carrier = f.carrier);
By default, the UNION operator selects only distinct values.What if we wanted to allow duplicate values?
We can do this by using the ALL keyword with UNION, with the following syntax:
SELECT column_name(s) FROM table1
UNION ALL
SELECT column_name(s) FROM table2;
utilize a subquery to find the average sale price over both order_items and order_items_historic tables.
SELECT id, avg(a.sale_price) FROM (
SELECT id, sale_price FROM order_items
UNION ALL
SELECT id, sale_price FROM order_items_historic) AS a
GROUP BY 1;
Pay attention to “AS a “ here. “a” is an alias assigned to the new table created by the Union. the letter does not have to be “a”, it could be anything you want to use to name the new table
INTERSECT
INTERSECT is used to combine two SELECTstatements, but returns only common rows returned by the two SELECT statements.
For instance, we might want to know what brands in our newly acquired store are also in our legacy store. We can do so using the following query:
SELECT brand FROM new_products
INTERSECT
SELECT brand FROM legacy_products;
EXCEPT
EXCEPT is constructed in the same way, but returns distinct rows from the first SELECTstatement that aren’t output by the second SELECT statement.
Suppose we want to see if there are any categories that are in the new_products table that aren’t in the legacy_products table. We can use an EXCEPTquery to perform this analysis:
SELECT category FROM new_products
EXCEPT
SELECT category FROM legacy_products;
in table airports, there’s a column elevation, but no elevation_tier. what code returns the table below?
elevation_tier COUNT(*)
High 103
Low 243
Medium 298
SELECT
CASE
WHEN elevation < 500 THEN ‘Low’
WHEN elevation BETWEEN 500 AND 1999 THEN ‘Medium’
WHEN elevation >= 2000 THEN ‘High’
ELSE ‘Unknown’
END AS elevation_tier
, COUNT(*)
FROM airports
GROUP BY 1;
In the above statement, END is required to terminate the statement, but ELSE is optional. If ELSE is not included, the result will be NULL.
explain what the code does
SELECT state,
COUNT(CASE WHEN elevation >= 2000 THEN 1 ELSE NULL END) as count_high_elevation_aiports
FROM airports
GROUP BY state;
identify the total amount of airports with high elevation by state
if we wanted to sum the total flight distance and compare that to the sum of flight distance from a particular airline (in this case, United Airlines) by origin airport, we could run the following query:
SELECT origin, sum(distance) as total_flight_distance, sum(CASE WHEN carrier = ‘UA’ THEN distance ELSE 0 END) as total_united_flight_distance
FROM flights
GROUP BY origin;
Oftentimes we’d like to combine aggregates, to create percentages or ratios.
In the instance of the last query, we might want to find out the percent of flight distance that is from United by origin airport. We can do this simply by using the mathematical operators we need in SQL:
SELECT origin,
100.0*(sum(CASE WHEN carrier = ‘UN’ THEN distance ELSE 0 END)/sum(distance)) as percentage_flight_distance_from_united FROM flights
GROUP BY origin;
Imagine that each dessert in our baked_goods table is inspected 2 hours, 30 minutes, and 1 day after the manufacture time. To derive the inspection date for each baked good, we can use the following query
SELECT DATETIME(manufacture\_time, '+2 hours', '30 minutes', '1 day') as inspection\_time FROM baked\_goods;
Numeric functions can be used to transform numbers. Some common SQLite mathematical functions are included below that take numeric data types as inputs:
SELECT (number1 + number2);: Returns the sum of two numbers. Similar, SQL can be used for subtraction, multiplication, and division.
SELECT CAST(number1 AS REAL) / number3;: Returns the result as a real number by casting one of the values as a real number, rather than an integer.
SELECT ROUND(number, precision);: Returns the numeric value rounded off to the next value specified. e.g. SELECT ROUND(ingredients_cost, 4) as rounded_cost FROM baked_goods;
A couple more useful numeric SQL functions are included below: MAXand MIN.
`
MAX(n1,n2,n3,…): MIN(n1,n2,n3,…)