SQL Intermediate Flashcards

1
Q

COUNT

A

Returns the count of only non-null rows

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Write a query to count the number of non-null rows in the low column.

A

SELECT COUNT(low) AS low FROM tutorial.aapl_historical_stock_price

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Write a query that determines counts of every single column.

A

SELECT COUNT(year) AS year,

COUNT(month) AS month,

COUNT(open) AS open,

COUNT(high) AS high,

COUNT(low) AS low,

COUNT(close) AS close,

COUNT(volume) AS volume

FROM tutorial.aapl_historical_stock_price

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Write a query to calculate the average opening price

A

SELECT (SUM(open)/COUNT(open)) as avg_open
FROM tutorial.aapl_historical_stock_price

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

MIN and MAX

A

Depending on the column type, MIN will return the lowest number, earliest date, or non-numerical value as close alphabetically to “A” as possible. As you might suspect, MAX does the opposite—it returns the highest number, the latest date, or the non-numerical value closest alphabetically to “Z.”

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

select the minimum and maximum volumes of stock

A

SELECT MIN(volume) AS min_volume, MAX(volume) AS max_volume FROM tutorial.aapl_historical_stock_price

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What was Apple’s lowest stock price (at the time of this data collection)?

A
SELECT MIN(low)
FROM tutorial.aapl\_historical\_stock\_price
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What was the highest single-day increase in Apple’s share value?

A
SELECT MAX(close-open)
FROM tutorial.aapl\_historical\_stock\_price
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

AVG

A

calculates the average of a selected group of values. It’s very useful, but has some limitations. First, it can only be used on numerical columns. Second, it ignores nulls completely.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

SELECT AVG(high) FROM tutorial.aapl_historical_stock_price WHERE high IS NOT NULL

VS

SELECT AVG(high) FROM tutorial.aapl_historical_stock_price

A

No difference as AVG ignores nulls

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Write a query that calculates the average daily trade volume for Apple stock.

A
SELECT AVG(volume)
FROM tutorial.aapl\_historical\_stock\_price
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

GROUP BY

A

GROUP BY allows you to separate data into groups, which can be aggregated independently of one another.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Calculate the total number of shares traded each month for each year. Order your results chronologically.

A

SELECT year, month, SUM(volume) as total_num_shares
FROM tutorial.aapl_historical_stock_price
GROUP BY 1, 2
ORDER BY 1, 2

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Using GROUP BY with ORDER BY

A

The order of column names in your GROUP BY clause doesn’t matter—the results will be the same regardless. If you want to control how the aggregations are grouped together, use ORDER BY.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

SELECT year, month, COUNT(*) AS count FROM tutorial.aapl_historical_stock_price GROUP BY year, month ORDER BY month, year

VS

SELECT year, month, COUNT(*) AS count FROM tutorial.aapl_historical_stock_price GROUP BY year, month ORDER BY year, month

A

2000 1 20
2001 1 21
2002 1 21
2003 1 21
2004 1 20

vs

2000 1 20
2000 2 20
2000 3 23
2000 4 19

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Write a query to calculate the average daily price change in Apple stock, grouped by year.

A

SELECT year,
AVG(close-open) AS avg_change
FROM tutorial.aapl_historical_stock_price
GROUP BY 1
ORDER BY 1

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Write a query that calculates the lowest and highest prices that Apple stock achieved each month.

A

SELECT year, month,
MAX(high) AS highest,
MIN(low) AS lowest
FROM tutorial.aapl_historical_stock_price
GROUP BY 1, 2
ORDER BY 1, 2

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

find every month during which AAPL stock where the max was over $400/share.

A

SELECT year, month, MAX(high) AS month_high FROM tutorial.aapl_historical_stock_price GROUP BY year, month HAVING MAX(high) > 400 ORDER BY year, month

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

ORDER of SQL operations

A

SELECT

FROM

WHERE

GROUP BY

HAVING

ORDER BY

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

CASE

A

The CASE statement is SQL’s way of handling if/then logic. The CASE statement is followed by at least one pair of WHEN and THEN statements—SQL’s equivalent of IF/THEN in Excel. Because of this pairing, you might be tempted to call this SQL CASE WHEN, but CASE is the accepted term.

Every CASE statement must end with the END statement. The ELSE statement is optional, and provides a way to capture values not specified in the WHEN/THEN statements.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

write a query that adds a new column called is_a_senior that returns yes if the player is a senior and no if not

A

SELECT player_name, year,

CASE

WHEN year = ‘SR’ THEN ‘yes’

ELSE

‘no’

END AS is_a_senior

FROM benn.college_football_players

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Write a query that includes a column that is flagged “yes” when a player is from California, and sort the results with those players first.

A

SELECT player_name,
CASE
WHEN state = ‘CA’ THEN ‘yes’
ELSE ‘no’
END AS from_ca
FROM benn.college_football_players
ORDER BY 2 DESC

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

sort players into three weight classes:

over 250

201 - 250

176 - 200

175 and under

A

SELECT player_name, weight,

CASE WHEN weight > 250 THEN ‘over 250’

WHEN weight > 200 AND weight <= 250 THEN ‘201-250’

WHEN weight > 175 AND weight <= 200 THEN ‘176-200’

ELSE ‘175 or under’ END AS weight_group

FROM benn.college_football_players

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Write a query that includes players’ names and a column that classifies them into four categories based on height. Keep in mind that the answer we provide is only one of many possible answers, since you could divide players’ heights in many ways.

A

SELECT player_name, height,

CASE

WHEN height > 74 THEN ‘over 74’

WHEN height > 72 AND height <= 74 THEN ‘73-74’

WHEN height > 70 AND height <= 72 THEN ‘71-72’

ELSE ‘under 70’

END AS height_group

FROM benn.college_football_players

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

Write a query that selects all columns from benn.college_football_players and adds an additional column that displays the player’s name if that player is a junior or senior.

A

SELECT *,
CASE
WHEN year = ‘JR’ OR year = ‘SR’ THEN player_name
ELSE ‘not_jr_sr’
END AS jr_sr
FROM benn.college_football_players

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

count the number of students by their year, INCLUDING null values

“hint: case”

A

SELECT

CASE

WHEN year = ‘FR’ THEN ‘FR’

WHEN year = ‘SO’ THEN ‘SO’

WHEN year = ‘JR’ THEN ‘JR’

WHEN year = ‘SR’ THEN ‘SR’

ELSE ‘No Year Data’ END AS year_group,

COUNT(1) AS count

FROM benn.college_football_players

GROUP BY 1

normally count ignores nulls, by making them a case, you can count them

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

Write a query that counts the number of 300lb+ players for each of the following regions: West Coast (CA, OR, WA), Texas, and Other (Everywhere else).

A

SELECT
CASE
WHEN state IN (‘CA’, ‘OR’, ‘WA’) THEN ‘west_coast’
WHEN state = ‘TX’ THEN ‘texas’
ELSE ‘other’
END AS region,
COUNT(1) as players –count the case to get all the nulls
FROM benn.college_football_players
WHERE weight >= 300
GROUP BY 1

28
Q

Write a query that calculates the combined weight of all underclass players (FR/SO) in California as well as the combined weight of all upperclass players (JR/SR) in California.

A

SELECT
CASE
WHEN year IN (‘FR’, ‘SO’) THEN ‘fr_so’
WHEN year IN (‘JR’, ‘SR’) THEN ‘jr_sr’
ELSE NULL
END AS year_in_school,
SUM(weight) as total_weight –count the case to get all the nulls
FROM benn.college_football_players
WHERE state = ‘CA’
GROUP BY 1

29
Q

SELECT

CASE

WHEN year = ‘FR’ THEN ‘FR’

WHEN year = ‘SO’ THEN ‘SO’

WHEN year = ‘JR’ THEN ‘JR’

WHEN year = ‘SR’ THEN ‘SR’

ELSE ‘No Year Data’

END AS year_group,

COUNT(1) AS count

FROM benn.college_football_players

GROUP BY 1

pivot this not using a pivot command

A
SELECT COUNT(CASE WHEN year = 'FR' THEN 1 ELSE NULL END) AS fr\_count,
 COUNT(CASE WHEN year = 'SO' THEN 1 ELSE NULL END) AS so\_count,
 COUNT(CASE WHEN year = 'JR' THEN 1 ELSE NULL END) AS jr\_count,
 COUNT(CASE WHEN year = 'SR' THEN 1 ELSE NULL END) AS sr\_count
 FROM benn.college\_football\_players

Output:

fr_count so_count jr_count sr_count
9665 5881 5665 5087

30
Q

Write a query that displays the number of players in each state, with FR, SO, JR, and SR players in separate columns and another column for the total number of players. Order results such that states with the most players come first.

A

SELECT state,
COUNT(CASE WHEN year = ‘FR’ THEN 1 ELSE NULL END) AS fr_count,
COUNT(CASE WHEN year = ‘SO’ THEN 1 ELSE NULL END) AS so_count,
COUNT(CASE WHEN year = ‘JR’ THEN 1 ELSE NULL END) AS jr_count,
COUNT(CASE WHEN year = ‘SR’ THEN 1 ELSE NULL END) AS sr_count,
count(*) as total_players
FROM benn.college_football_players
GROUP BY state
ORDER BY total_players DESC

31
Q

Write a query that shows the number of players at schools with school names that start with A through M, and the number at schools with names starting with N - Z.

A

SELECT

CASE

WHEN school_name < ‘n’ THEN ‘A-M’

WHEN school_name >= ‘n’ THEN ‘N-Z’

ELSE NULL

END AS school_name_group,

COUNT(1) AS players

FROM benn.college_football_players

GROUP BY 1

32
Q

For each school give the number of students whose names start with A-m and n-z

A

SELECT full_school_name,
COUNT(CASE WHEN player_name BETWEEN ‘A’ AND ‘M’ THEN 1 ELSE NULL END) AS a_m,
COUNT(CASE WHEN player_name BETWEEN ‘N’ AND ‘Z’ THEN 1 ELSE NULL END) AS n_z
FROM benn.college_football_players
GROUP BY 1

33
Q

SELECT DISTINCT year, month FROM tutorial.aapl_historical_stock_price

which values will this return unique ones for

A

both year and month

34
Q

count the unique values in the month column.

A

SELECT COUNT(DISTINCT month) AS unique_months FROM tutorial.aapl_historical_stock_price

35
Q

taking average trade volumes by month

A

SELECT month, AVG(volume) AS avg_trade_volume

FROM tutorial.aapl_historical_stock_price

GROUP BY month

ORDER BY 2 DESC

36
Q

Write a query that counts the number of unique values in the month column for each year.

A

SELECT year,
COUNT(DISTINCT month) AS unique_months
FROM tutorial.aapl_historical_stock_price
GROUP BY 1
ORDER BY 1

37
Q

Write a query that separately counts the number of unique values in the month column and the number of unique values in the year column.

A

SELECT COUNT(DISTINCT year) AS unique_years,
COUNT(DISTINCT month) AS unique_months
FROM tutorial.aapl_historical_stock_price

38
Q

Write a query that selects the school name, player name, position, and weight for every player in Georgia, ordered by weight (heaviest to lightest). Be sure to make an alias for the table, and to reference all column names in relation to the alias.

get school_name from teams

A

SELECT teams.school_name, players.player_name, players.position, players.weight
FROM benn.college_football_players players
JOIN benn.college_football_teams teams
ON players.school_name = teams.school_name
WHERE players.state = ‘GA’
ORDER BY players.weight DESC

39
Q

INNER JOIN

A

nner joins eliminate rows from both tables that do not satisfy the join condition set forth in the ON statement. In mathematical terms, an inner join is the intersection of the two tables.

Therefore, if a player goes to a school that isn’t in the teams table, that player won’t be included in the result from an inner join. Similarly, if there are schools in the teams table that don’t match to any schools in the players table, those rows won’t be included in the results either.

40
Q

Inner join with two columns of the same name

A

The results can only support one column with a given name—when you include 2 columns of the same name, the results will simply show the exact same result set for both columns even if the two columns should contain different data. You can avoid this by naming the columns individually.

41
Q

Write a query that displays player names, school names and conferences for schools in the “FBS (Division I-A Teams)” division.

A

SELECT players.player_name, players.school_name, teams.conference
FROM benn.college_football_players players
INNER JOIN benn.college_football_teams teams
ON teams.school_name = players.school_name
WHERE teams.division = ‘FBS (Division I-A Teams)’

42
Q

Outer joins

A

unmatched rows in one or both tables can be returned.

43
Q

Left Join

A

Keep everything from the left database, the first one chosen in the on statement

44
Q

Right Join

A
45
Q

get schema from postgresql

A

\d table_name or \d+ table_name

SELECT table_name, column_name, data_type FROM information_schema.columns WHERE table_name = ‘city’;

46
Q

loginto postgre sgl

A

psql -U postgres -W

postgres is the name of the user

for a specific database

psql -d database -U user -W

47
Q

switch to the table you want

A

postgres=# \c dvdrental Password for user postgres: You are now connected to database “dvdrental” as user “postgres”.

48
Q

list all available databases in postgre

A

\l

49
Q

list all available tables

A

\dt

50
Q

Write a query that performs an inner join between the tutorial.crunchbase_acquisitions table and the tutorial.crunchbase_companies table, but instead of listing individual rows, count the number of non-null permalink rows in each table.

A

SELECT COUNT(companies.permalink) AS companies_rowcount, COUNT(acquisitions.company_permalink) AS acquisitions_rowcount FROM tutorial.crunchbase_companies companies JOIN tutorial.crunchbase_acquisitions acquisitions ON companies.permalink = acquisitions.company_permalink

51
Q

SELECT COUNT(companies.permalink), COUNT(acquisitions.company_permalink)
FROM tutorial.crunchbase_companies companies
LEFT JOIN tutorial.crunchbase_acquisitions acquisitions
ON companies.permalink = acquisitions.company_permalink

VS

– Returns first 100 rows from tutorial.crunchbase_companies
SELECT COUNT(companies.permalink), COUNT(acquisitions.company_permalink)
FROM tutorial.crunchbase_companies companies
JOIN tutorial.crunchbase_acquisitions acquisitions
ON companies.permalink = acquisitions.company_permalink

A

The first will return counts of all values in the left table and only counts of rows in the right table that are equal to the left.

The second will return the counts of only rows that are in both table

52
Q

Count the number of unique companies (don’t double-count companies) and unique acquired companies by state. Do not include results for which there is no state data, and order by the number of acquired companies from highest to lowest.

A

SELECT companies.state_code, COUNT(DISTINCT companies.permalink) AS unique_companies, COUNT(DISTINCT acquisitions.company_permalink) AS unique_companies_acquired

FROM tutorial.crunchbase_companies companies

LEFT JOIN tutorial.crunchbase_acquisitions acquisitions

ON companies.permalink = acquisitions.company_permalink

WHERE companies.state_code IS NOT NULL GROUP BY 1 ORDER BY 3 DESC

53
Q

SELECT companies.permalink AS companies_permalink, companies.name AS companies_name, acquisitions.company_permalink AS acquisitions_permalink, acquisitions.acquired_at AS acquired_date FROM tutorial.crunchbase_companies companies LEFT JOIN tutorial.crunchbase_acquisitions acquisitions ON companies.permalink = acquisitions.company_permalink AND acquisitions.company_permalink != ‘/company/1000memories’ ORDER BY 1

VS

SELECT companies.permalink AS companies_permalink, companies.name AS companies_name, acquisitions.company_permalink AS acquisitions_permalink, acquisitions.acquired_at AS acquired_date FROM tutorial.crunchbase_companies companies LEFT JOIN tutorial.crunchbase_acquisitions acquisitions ON companies.permalink = acquisitions.company_permalink WHERE acquisitions.company_permalink != ‘/company/1000memories’ OR acquisitions.company_permalink IS NULL ORDER BY 1

A

Compare the following query to the previous one and you will see that everything in the tutorial.crunchbase_acquisitions table was joined on except for the row for which company_permalink is ‘/company/1000memories’. It does return the row from the companies table with that permalink

You can see that the 1000memories line is not returned (it would have been between the two highlighted lines below). Also note that filtering in the WHERE clause can also filter null values, so we added an extra line to make sure to include the nulls.

54
Q

Write a query that shows a company’s name, “status” (found in the Companies table), and the number of unique investors in that company. Order by the number of investors from most to fewest. Limit to only companies in the state of New York.

A

SELECT companies.name, companies.status, COUNT(DISTINCT investments.investor_name)
FROM tutorial.crunchbase_companies companies
LEFT JOIN tutorial.crunchbase_investments investments
ON companies.permalink = investments.company_permalink
WHERE companies.state_code = ‘NY’
GROUP BY 1,2
ORDER BY 3 DESC

55
Q

Write a query that lists investors based on the number of companies in which they are invested. Include a row for companies with no investor, and order from most companies to least.

A
56
Q

write a query using a full join that counts the number of rows in the acquisitions table, the companies table and both tables.

A

SELECT

COUNT(CASE WHEN companies.permalink IS NOT NULL AND acquisitions.company_permalink IS NULL THEN companies.permalink ELSE NULL END) AS companies_only,

COUNT(CASE WHEN companies.permalink IS NOT NULL AND acquisitions.company_permalink IS NOT NULL THEN companies.permalink ELSE NULL END) AS both_tables,

COUNT(CASE WHEN companies.permalink IS NULL AND acquisitions.company_permalink IS NOT NULL THEN acquisitions.company_permalink ELSE NULL END) AS acquisitions_only

FROM tutorial.crunchbase_companies companies

FULL JOIN tutorial.crunchbase_acquisitions acquisitions ON companies.permalink = acquisitions.company_permalink

57
Q

Write a query that joins tutorial.crunchbase_companies and tutorial.crunchbase_investments_part1 using a FULL JOIN. Count up the number of rows that are matched/unmatched as in the example above.

A

SELECT
COUNT(CASE WHEN companies.permalink IS NOT NULL AND investments.company_permalink ISNULL THEN 1 ELSE NULL END)AS company_only,
COUNT(CASE WHEN companies.permalink ISNULL AND investments.company_permalink IS NOT NULL THEN 1 ELSE NULL END)AS investment_only,
COUNT(CASE WHEN companies.permalink IS NOT NULL AND investments.company_permalink IS NOT NULL THEN 1 ELSE NULL END)AS both_tables
FROM tutorial.crunchbase_companies companies
FULL JOIN tutorial.crunchbase_investments_part1 investments
ON companies.permalink = investments.company_permalink

58
Q
A
59
Q

What would this result in:

SELECT * FROM tutorial.crunchbase_investments_part1

UNION

SELECT * FROM tutorial.crunchbase_investments_part2

A

This would stack the two together ontop of each other.

More specifically, when you use UNION, the dataset is appended, and any rows in the appended table that are exactly identical to rows in the first table are dropped. If you’d like to append all the values from the second table, use UNION ALL.

Both tables must have the same number of columns

The columns must have the same data types in the same order as the first table

60
Q

Write a query that appends the two crunchbase_investments datasets above (including duplicate values). Filter the first dataset to only companies with names that start with the letter “T”, and filter the second to companies with names starting with “M” (both not case-sensitive). Only include the company_permalink, company_name, and investor_name columns.

A

SELECT company_permalink, company_name, investor_name
FROM tutorial.crunchbase_investments_part1
WHERE company_name ILIKE ‘t%’

UNION ALL

SELECT company_permalink, company_name, investor_name
FROM tutorial.crunchbase_investments_part2
WHERE company_name ILIKE ‘m%’

61
Q

Write a query that shows 3 columns. The first indicates which dataset (part 1 or 2) the data comes from, the second shows company status, and the third is a count of the number of investors.

Hint: you will have to use the tutorial.crunchbase_companies table as well as the investments tables. And you’ll want to group by status and dataset.

A

SELECT ‘investments_part1’ AS dataset_name, companies.status,

COUNT(DISTINCT investments.investor_permalink) AS investors

FROM tutorial.crunchbase_companies companies

LEFT JOIN tutorial.crunchbase_investments_part1 investments

ON companies.permalink = investments.company_permalink

GROUP BY 1,2

UNION ALL

SELECT ‘investments_part2’ AS dataset_name, companies.status,

COUNT(DISTINCT investments.investor_permalink) AS investors

FROM tutorial.crunchbase_companies companies

LEFT JOIN tutorial.crunchbase_investments_part2 investments

ON companies.permalink = investments.company_permalink

GROUP BY 1,2

62
Q

SELECT companies.permalink, companies.name, companies.status, COUNT(investments.investor_permalink) AS investors FROM tutorial.crunchbase_companies companies LEFT JOIN tutorial.crunchbase_investments_part1 investments ON companies.permalink = investments.company_permalink AND investments.funded_year > companies.founded_year + 5 GROUP BY 1,2, 3

A

Here’s an example using > to join only investments that occurred more than 5 years after each company’s founding year:

63
Q

couple reasons you might want to join tables on multiple foreign keys

A

accuracy

speed

joins with two keys can increase the speed due to the the way that SQL indexes

64
Q

Identify companies that received investments from great Britain following an investment from japan, using the investments database

A

SELECT DISTINCT japan_investments.company_name, japan_investments.company_permalink
FROM tutorial.crunchbase_investments_part1 as japan_investments
JOIN tutorial.crunchbase_investments_part1 as gb_investments
ON japan_investments.company_permalink = gb_investments.company_permalink
AND gb_investments.investor_country_code = ‘GBR’
AND gb_investments.funded_at > japan_investments.funded_at
WHERE japan_investments.investor_country_code = ‘JPN’
ORDER BY 1

65
Q

Convert the funding_total_usd and founded_at_clean columns in the tutorial.crunchbase_companies_clean_date table to strings (varchar format) using a different formatting function for each one.

A
SELECT CAST(funding\_total\_usd AS varchar),
founded\_at\_clean::varchar
FROM tutorial.crunchbase\_companies\_clean\_date
66
Q
A