INNER JOIN (chinook.db) Flashcards
SPECIAL ANNOUNCEMENT
The “chinook.db” is very well-known because it is the open source DB that is part of SQLite. Lots of tables are linked together via PK/FK, so this makes an ideal way to practice JOINs.
VERY IMPORTANT: In order to carry out these exercises, it will be VERY important that you have a copy of the physical data model for chinook.db, so you can see how all the tables are interconnected.
Please use this deck in good health and good happiness!
Print out a list of all the customers, first name, last name, and country
SELECT FirstName, LastName, Country FROM customers;
How many customers are in India?
SELECT * FROM customers WHERE Country = “India”;
Print out a list of all the countries and how many customer are in each country.
Hopefully you will remember this is a very, very easy example of the paradigm with COUNT() and GROUP BY:
SELECT Country, COUNT ( Country )
FROM customers
GROUP BY Country;
Print out a list of all the countries, how many customers are in each country, and order the list from low to high
Hopefully you will remember this is a very, very easy example of the paradigm with COUNT() and GROUP BY:
SELECT Country, COUNT ( Country )
FROM customers
GROUP BY Country
ORDER BY COUNT ( Country ) ASC;
Print out a list of all the cities in the U.S. that have customers
SELECT City FROM customers WHERE
Country = “USA”;
Print out a list of all the cities in the US and how many customers are in each city
Again, if you remember the paradigm COUNT () and GROUP BY, this is very very easy:
SELECT City, COUNT ( City)
FROM customers WHERE
Country = “USA”
GROUP BY City;
Print out a list of all customers and the date that they made a purchase
Here you will need 2 tables: “customers” for the customer information, and “invoices” for the invoice information.
You’ll want a list of ALL the customers, so this means to do a LEFT JOIN. The attribute (column) CustomerID is what links the two tables together, so that will be what we use for ON.
SELECT customers.FirstName, customers.LastName, invoices.InvoiceDate
FROM customers
LEFT JOIN invoices
ON
customers.CustomerID = invoices.CustomerID;
The table “tracks” has a nice attribute “Name” which is the name of the song. It would be nice to know, which people bought which songs.
To do this, we need a number of LEFT JOINS: one LJ to link “customers” with “invoices,” then another LJ to link “invoices” with “invoice_items,” then another LJ to link “invoice_items” with “tracks.”
But . . . it is easy! Just one LEFT JOIN after the other.
Write out the SQL query that prints a list of all customers (first name, last name) and song that they purchased.
Let’s take this one step at a time. To print out the names, it looks like this:
SELECT FirstName, LastName FROM customers;
Next, we can LEFT JOIN with “invoices,” like this:
SELECT FirstName, LastName
FROM customers
LEFT JOIN invoices
ON invoices.CustomerID = customers.CustomerID;
We’re one step closer, but not yet there. Now we need a LEFT JOIN with “invoice_items” using the attribute “InvoiceID” that links the two tables:
SELECT FirstName, LastName
FROM customers
LEFT JOIN invoices
ON invoices.CustomerID = customers.CustomerID
LEFT JOIN invoice_items
ON invoices.InvoiceID = invoice_items.InvoiceID;
OK, we are almost there. Now we need the final LEFT JOIN, with “tracks” using “TrackID” to connect “tracks” with “invoice_items”, like this:
SELECT FirstName, LastName
FROM customers
LEFT JOIN invoices
ON invoices.CustomerID = customers.CustomerID
LEFT JOIN invoice_items
ON invoices.InvoiceID = invoice_items.InvoiceID
LEFT JOIN tracks
ON tracks.TrackID = invoice_items.TrackID;
This is EXACTLY what we want . . . but we need to print out the track name, like this:
SELECT FirstName, LastName, Name
FROM customers
LEFT JOIN invoices
ON invoices.CustomerID = customers.CustomerID
LEFT JOIN invoice_items
ON invoices.InvoiceID = invoice_items.InvoiceID
LEFT JOIN tracks
ON tracks.TrackID = invoice_items.TrackID;
This only works because “FirstName,” and “LastName,” and “Name” are unique. It would be better to write this:
SELECT
customers.FirstName,
customers.LastName,
tracks.Name
FROM customers
LEFT JOIN invoices
ON invoices.CustomerID = customers.CustomerID
LEFT JOIN invoice_items
ON invoices.InvoiceID = invoice_items.InvoiceID
LEFT JOIN tracks
ON tracks.TrackID = invoice_items.TrackID;
In the previous query we printd out (LastName, FirstName, Name of song).
It would be nice to know, how many songs were purchased? So print out a table of the song names and how many times they were purchased.
You have to admit, this is a very reasonable request. Just about everyone wants to know, how many of each song was purchased? What were the most popular songs?
This sounds horrible - and maybe it is? But it is simple enough. We just take the query we wrote before, then we apply out COUNT () and GROUP BY paradigm.
Here is the original query:
SELECT
customers.FirstName,
customers.LastName,
tracks.Name
FROM customers
LEFT JOIN invoices
ON invoices.CustomerID = customers.CustomerID
LEFT JOIN invoice_items
ON invoices.InvoiceID = invoice_items.InvoiceID
LEFT JOIN tracks
ON tracks.TrackID = invoice_items.TrackID;
Now let’s tweak this with COUNT() and GROUP BY, like this:
SELECT
tracks.Name,
COUNT ( tracks.Name )
FROM customers
LEFT JOIN invoices
ON invoices.CustomerID = customers.CustomerID
LEFT JOIN invoice_items
ON invoices.InvoiceID = invoice_items.InvoiceID
LEFT JOIN tracks
ON tracks.TrackID = invoice_items.TrackID
GROUP BY tracks.Name LIMIT 50;
DID YOU KNOW?
The Boston Consulting Group is one of the most prestigious management consulting groups in the world? They only take very incredible people. The prime minister of israel Benjamin Netanyaho worked for them. The singer John Legend worked for them.
People train for months to pass the BCG job interview, and there are many books about how to pass the interview.
Which artists have the most tracks?
Don’t be scared - let’s take this one step at a time. First, we could print out a table (artist, tracks). It’s not exactly what we need, but it is a step in the right direction.
Then we can use our COUNT() and GROUP BY trick to get what we want.
So, first the table, which requires a LEFT JOIN between “artists” and “albums” and “tracks,”
SELECT artists.Name, tracks.Name FROM artists LEFT JOIN albums ON artists.ArtistID = albums.ArtistID LEFT JOIN tracks ON tracks.AlbumID = albums.AlbumID;
At this point, it is recommended you look at the data, and here it can be helpful to use LIMIT if you need to, like this:
SELECT artists.Name, tracks.Name FROM artists LEFT JOIN albums ON artists.ArtistID = albums.ArtistID LEFT JOIN tracks ON tracks.AlbumID = albums.AlbumID LIMIT 20;
Now we do our trick with COUNT() and GROUP BY:
SELECT artists.Name, COUNT ( tracks.Name ) FROM artists LEFT JOIN albums ON artists.ArtistID = albums.ArtistID LEFT JOIN tracks ON tracks.AlbumID = albums.AlbumID GROUP BY tracks.Name;
It is a pretty long table, so scrolling is not very helpful. So it makes sense to ORDER BY, like this:
SELECT artists.Name, COUNT ( tracks.Name ) FROM artists LEFT JOIN albums ON artists.ArtistID = albums.ArtistID LEFT JOIN tracks ON tracks.AlbumID = albums.AlbumID GROUP BY tracks.Name ORDER BY COUNT ( tracks.Name ) ASC;
We can see there are a HUGE number of artists that have sold a single track, but only just a handful of artists have sold more than 1 track.
But that’s a great question! How many artists have sold 2 tracks or more? You might think about WHERE COUNT ( tracks.Name) > 1 . . . and you would be going down the right street! But remember with GROUP BY we need to using HAVING, so it looks like this:
SELECT artists.Name, COUNT ( tracks.Name ) FROM artists LEFT JOIN albums ON artists.ArtistID = albums.ArtistID LEFT JOIN tracks ON tracks.AlbumID = albums.AlbumID GROUP BY tracks.Name HAVING COUNT ( tracks.Name ) > 1 ORDER BY COUNT ( tracks.Name ) ASC;
NOW . . . not done yet. It might be nice to know “what are the artists” but our table has duplicates, so we can filter them out by eliminating the COUNT and using DISTINCT, like this:
SELECT DISTINCT artists.Name FROM artists LEFT JOIN albums ON artists.ArtistID = albums.ArtistID LEFT JOIN tracks ON tracks.AlbumID = albums.AlbumID GROUP BY tracks.Name HAVING COUNT ( tracks.Name ) > 1 ORDER BY COUNT ( tracks.Name ) ASC;
This is fun and you can see it gets easier by building on SQL queries you already know!
Playlists . . . that sounds like something fun. It would be interesting to know their names. Print them out, figure out how many there are:
To print their names,
SELECT Name FROM playlists;
OK, that’s a bit boring to be honest. Just labels like “classical” - not even the fun stuff like glam rock, punk rock, alternative rock, progressive rock. But ok, not our job.
We don’t even need COUNT in this case but if we did
SELECT COUNT ( Name ) FROM playlists;
Here’s something interesting. The table “tracks” has “composer”, what are the composers?
SELECT Composer FROM tracks;
Print out artists and albums, that’s pretty interesting
SELECT artists.Name, albums.Title FROM artists LEFT JOIN albums ON artists.ArtistID = albums.AlbumID;
I am totally confused. It shows “Alanis Morisette” with “Let there be rock”, which is actually an album from AC/DC in the year 1977. Alanis was born in 1974 so she would have been three years old when that album came out.
Does anyone understand this???