Intermediate SQL Flashcards by Mahsa Zamanifard

What are the aggregation functions used in SQL?

COUNT counts how many rows are in a particular column.
SUM adds together all the values in a particular column.
MIN and MAX return the lowest and highest values in a particular column, respectively.
AVG calculates the average of a group of selected values.

How well did you know this?

Not at all

Perfectly

What does COUNT command do?

COUNT is a SQL aggregate function for counting the number of rows in a particular column.

How well did you know this?

Not at all

Perfectly

Typing COUNT(1) has the same effect as COUNT(*). Which one you use is a matter of personal preference. True/False?

True

How well did you know this?

Not at all

Perfectly

What does below code do?
SELECT COUNT(high)
FROM tutorial.aapl_historical_stock_price

The code will provide a count of all rows in which the high column is NOT NULL.

How well did you know this?

Not at all

Perfectly

What does below code do?
SELECT COUNT(date) AS count_of_date
FROM tutorial.aapl_historical_stock_price

The column header for count(date) is changed from “count” to “count_of_date”.

How well did you know this?

Not at all

Perfectly

Does COUNT command, count null values?

How well did you know this?

Not at all

Perfectly

What does SUM do?

SUM is a SQL aggregate function. that totals the values in a given column.

How well did you know this?

Not at all

Perfectly

An important thing to remember: aggregators only aggregate vertically. If you want to perform a calculation across rows, you would do this with simple arithmetic. True/False?

True

How well did you know this?

Not at all

Perfectly

How does SUM treat nulls?

SUM treats nulls as 0.

How well did you know this?

Not at all

Perfectly

What do MIN / MAX commands do?

MIN and MAX are SQL aggregation functions that return the lowest and highest values in a particular column.

How well did you know this?

Not at all

Perfectly

MIN/MAX can be used on non-numerical columns. True/False, if True, how do they do it?

True.
MIN will return the lowest number, earliest date, or non-numerical value as close alphabetically to “A” as possible.
As you might suspect, MAX does the opposite—it returns the highest number, the latest date, or the non-numerical value closest alphabetically to “Z.”

How well did you know this?

Not at all

Perfectly

What does AVG command do?

AVG is a SQL aggregate function that calculates the average of a selected group of values.

How well did you know this?

Not at all

Perfectly

What does GROUP BY do?

GROUP BY allows you to separate data into groups, which can be aggregated independently of one another.

How well did you know this?

Not at all

Perfectly

When is it recommended to use numbers instead of column names for group by?

It’s generally recommended to do this only when you’re grouping many columns, or if something else is causing the text in the GROUP BY clause to be excessively long

How well did you know this?

Not at all

Perfectly

We use HAVING clause with aggregation, because WHERE doesn’t allow you to filter on aggregate columns. True/False?

True

How well did you know this?

Not at all

Perfectly

What is SQL’s way of handling if/then logic?

CASE

How well did you know this?

Not at all

Perfectly

The CASE statement is followed by at least one pair of WHEN and THEN statements. True/False?

True

How well did you know this?

Not at all

Perfectly

How do we end a CASE statement?

Every CASE statement MUST end with the END statement.

How well did you know this?

Not at all

Perfectly

The ELSE statement is optional, and provides a way to capture values not specified in the WHEN/THEN statements. True/False?

True

How well did you know this?

Not at all

Perfectly

What does below code do?
SELECT CASE WHEN year = ‘FR’ THEN ‘FR’
WHEN year = ‘SO’ THEN ‘SO’
WHEN year = ‘JR’ THEN ‘JR’
WHEN year = ‘SR’ THEN ‘SR’
ELSE ‘No Year Data’ END AS year_group,
COUNT(1) AS count
FROM benn.college_football_players
GROUP BY 1

It counts the number of players in each year_group

How well did you know this?

Not at all

Perfectly

What does below code do?
SELECT state,
COUNT(CASE WHEN year = ‘FR’ THEN 1 ELSE NULL END) AS fr_count,
COUNT(CASE WHEN year = ‘SO’ THEN 1 ELSE NULL END) AS so_count,
COUNT(CASE WHEN year = ‘JR’ THEN 1 ELSE NULL END) AS jr_count,
COUNT(CASE WHEN year = ‘SR’ THEN 1 ELSE NULL END) AS sr_count,
COUNT(1) AS total_players
FROM benn.college_football_players
GROUP BY state
ORDER BY total_players DESC

displays the number of players in each state, with FR, SO, JR, and SR players in separate columns and another column for the total number of players. Order results such that states with the most players come first.

How well did you know this?

Not at all

Perfectly

What does below code do?
SELECT CASE WHEN state IN (‘CA’, ‘OR’, ‘WA’) THEN ‘West Coast’
WHEN state = ‘TX’ THEN ‘Texas’
ELSE ‘Other’ END AS arbitrary_regional_designation,
COUNT(1) AS players
FROM benn.college_football_players
WHERE weight >= 300
GROUP BY 1

Counts the number of 300lb+ players for each of the following regions: West Coast (CA, OR, WA), Texas, and Other (everywhere else).

How well did you know this?

Not at all

Perfectly

When do we use SELECT DISTINCT?

Study These Flashcards

You’ll occasionally want to look at only the unique values in a particular column. You can do this using SELECT DISTINCT syntax

What will happen if you include two (or more) columns in a SELECT DISTINCT clause?

Study These Flashcards

Your results will contain all the unique pairs of those two columns

You only need to include DISTINCT once in your SELECT clause—you do not need to add it for each column name. True/False?

True

Which aggregation function is used with DISTINCT more commonly?

You can use DISTINCT when performing an aggregation. You'll probably use it most commonly with the COUNT function.

DISTINCT goes outside the aggregate function rather than at the beginning of the SELECT clause. True/False?

False, DISTINCT goes INSIDE the aggregate function rather than at the beginning of the SELECT clause.

SUM, AVG, MIN, MAX have very little practical use with DISTINCT, True/False?

True, SUM or AVG the distinct values in a column, but there are fewer practical applications for them. For MAX and MIN, you probably shouldn't ever use DISTINCT because the results will be the same as without DISTINCT, and the DISTINCT function will make your query substantially slower to return results.

It's worth noting that using DISTINCT, particularly in aggregations, can slow your queries down quite a bit. True/False

True

How can we give a table an alias?

benn.college_football_teams teams

Once you've given a table an alias, you can refer to columns in that table in the SELECT clause using the alias name. True/False?

True

How can we select columns using table aliases? Code

SELECT teams.conference AS conference, AVG(players.weight) AS average_weight

What should we do when 2 tables have columns with the same name and we want to join these tables and see these 2 columns?

The results can only support one column with a given name—when you include 2 columns of the same name, the results will simply show the exact same result set for both columns even if the two columns should contain different data. You can avoid this by naming the columns individually. SELECT players.school_name AS players_school_name, teams.school_name AS teams_school_name FROM benn.college_football_players players JOIN benn.college_football_teams teams ON teams.school_name = players.school_name

Why is right join rarely used?

RIGHT JOIN is rarely used because you can achieve the results of a RIGHT JOIN by simply switching the two joined table names in a LEFT JOIN

Why do we use COUNT(DISTINCT companies.permalink) to count distinct company names instead of using COUNT(DISTINCT companies.company_name) on MODE website?

Because it returned results much faster

What’s the difference between below two codes? FROM tutorial.crunchbase_companies companies LEFT JOIN tutorial.crunchbase_acquisitions acquisitions ON companies.permalink = acquisitions.company_permalink AND acquisitions.company_permalink != '/company/1000memories' FROM tutorial.crunchbase_companies companies LEFT JOIN tutorial.crunchbase_acquisitions acquisitions ON companies.permalink = acquisitions.company_permalink WHERE acquisitions.company_permalink != '/company/1000memories' OR acquisitions.company_permalink IS NULL .

In the first one the conditional statement AND... is evaluated before the join occurs. You can think of it as a WHERE clause that only applies to one of the tables. In the second one, the filter happens after the tables are joined

How can we count NULL values in a table in SQL? What should we write in the SELECT section?

By using COUNT(CASE...) For example: COUNT(CASE WHEN column IS NULL THEN ‘It’s empty’)

What's the difference between join and union?

SQL joins allow you to combine two datasets side-by-side, but UNION allows you to stack one dataset on top of the other. Put differently, UNION allows you to write two separate SELECT statements, and to have the results of one statement display in the same table as the results from the other statement.

"UNION only appends distinct values." What does this mean?

When you use UNION, the dataset is appended, and any rows in the appended table that are exactly identical to rows in the first table are dropped.

What can we do when we want all the values (even duplicates) in the second table, to be appended to the first table?

If you'd like to append all the values from the second table, use UNION ALL.

You'll likely use UNION ALL far more often than UNION. True/False?

True

What are the rules of appending datasets in SQL?

SQL has strict rules for appending data: 1- Both tables must have the same number of columns 2- The columns must have the same data types in the same order as the first table

How can we create a column and fill it with a fixed value?

SELECT 'investments_part1' AS dataset_name, FROM table

What are 2 reasons for using multiple keys for joining datasets?

There are couple reasons you might want to join tables on multiple foreign keys. The first has to do with accuracy. The second reason has to do with performance.

What does below code do? SELECT DISTINCT japan_investments.company_name, japan_investments.company_permalink FROM tutorial.crunchbase_investments_part1 japan_investments JOIN tutorial.crunchbase_investments_part1 gb_investments ON japan_investments.company_name = gb_investments.company_name AND gb_investments.investor_country_code = 'GBR' AND gb_investments.funded_at > japan_investments.funded_at WHERE japan_investments.investor_country_code = 'JPN' ORDER BY 1

to identify companies that received an investment from Great Britain following an investment from Japan.

You only need to include DISTINCT once in your SELECT clause—you do not need to add it for each column name. True/False?

True

What will happen If you include two (or more) columns in a SELECT DISTINCT clause?

Your results will contain all the unique pairs of those two columns

When do we use SELECT DISTINCT?

You'll occasionally want to look at only the unique values in a particular column. You can do this using SELECT DISTINCT syntax

What does below code do? SELECT CASE WHEN state IN ('CA', 'OR', 'WA') THEN 'West Coast' WHEN state = 'TX' THEN 'Texas' ELSE 'Other' END AS arbitrary_regional_designation, COUNT(1) AS players FROM benn.college_football_players WHERE weight >= 300 GROUP BY 1

Counts the number of 300lb+ players for each of the following regions: West Coast (CA, OR, WA), Texas, and Other (everywhere else).

What does below code do? SELECT state, COUNT(CASE WHEN year = 'FR' THEN 1 ELSE NULL END) AS fr_count, COUNT(CASE WHEN year = 'SO' THEN 1 ELSE NULL END) AS so_count, COUNT(CASE WHEN year = 'JR' THEN 1 ELSE NULL END) AS jr_count, COUNT(CASE WHEN year = 'SR' THEN 1 ELSE NULL END) AS sr_count, COUNT(1) AS total_players FROM benn.college_football_players GROUP BY state ORDER BY total_players DESC

What does below code do? SELECT CASE WHEN year = 'FR' THEN 'FR' WHEN year = 'SO' THEN 'SO' WHEN year = 'JR' THEN 'JR' WHEN year = 'SR' THEN 'SR' ELSE 'No Year Data' END AS year_group, COUNT(1) AS count FROM benn.college_football_players GROUP BY 1

It counts the number of players in each year_group

The ELSE statement is optional, and provides a way to capture values not specified in the WHEN/THEN statements. True/False?

True

How do we end a CASE statement?

Every CASE statement MUST end with the END statement.

The CASE statement is followed by at least one pair of WHEN and THEN statements. True/False?

True

What is SQL’s way of handling if/then logic?

CASE

We use HAVING clause with aggregation, because WHERE doesn't allow you to filter on aggregate columns. True/False?

True

Intermediate SQL Flashcards

(56 cards)