Kaggle SQL Flashcards

1
Q

What is one thing to note about GROUP BY

A

One thing to note is that if you SELECT a column that you don’t pass to 1) GROUP BY or 2) use as input to an aggregate function, you’ll get an error. So this query won’t work, because the Name column isn’t either passed to either an aggregate function or a GROUP BY clause:

NOT A VALID QUERY! “Name” isn’t passed to GROUP BY
# or an aggregate function
SELECT Name, Animal, COUNT(ID)
FROM bigquery-public-data.pet_records.pets
GROUP BY Animal

If you make this error, you’ll get the error message SELECT list expression references column (column’s name) which is neither grouped nor aggregated at.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What does HAVING do?

A

Another option you have when using GROUP BY is to specify that you want to ignore groups that don’t meet certain criteria. So this query, for example, will only include groups that have more than one ID in them:

SELECT Animal, COUNT(ID)
FROM bigquery-public-data.pet_records.pets
GROUP BY Animal
HAVING COUNT(ID) > 1

As a result, this query will return a table with only one row, since this there only one group remaining. It will have two columns: one for “Animal”, which will have “Cat” in it, and one for COUNT(ID), which will have 2 in it.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly