Class 4 - Preparing Data 2 Queries Flashcards

1
Q

Explain the steps that go into Power Query 1? What is it?

A

Power Query 1:
Can use Power Query to combine data and export data into a worksheet that allows merging by multiple variables
Practice using “storesales” and “promotion” tables in “promotion.xlsx” excel file
Want to merge on multiple variables, would like to know which year each product had a promotion (merge on product and year)

Step 1: Open file, make data into excel tables
Step 2: Create a query connection to connect the two for each table. Data - Get Data - From other sources - From Table/Range, Power Query pops up. Under name tabe, select ‘close and load to”, import data pops up, only create connection (do this for 2 tables)
Step 3: Excel now recognizes it. Merge tables: data-get data- combine queries - merge
Step 4: Select fields that two datasets are merged by
Step 5: Select “Join Kind”
Step 6: Power Query Editor pops up, scroll to RHS of worksheet
Step 7: Click the arrow icon to expand the merged column, select the columns to keep
Step 8: Click “Close and Load to” - import data pops up, select only create connection
Step 9: Insert pivot-table, select from external data source, then choose connection, select the merged on

(have pivot table: filter, columns, rows, values
Year: colum, product(row) and promotion(yes/now), Quantity: Value

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
1
Q

Explain the steps that go into Power Query 2? What is it?

A

When files or databases are to large for you to load into an excel spreadsheet(limited to 1 mill rows) you will connect to the files or database instead of loading into excel

Step 1: Open a blank excel worksheet
Step 2: Create query connection (different). Data - Get Data - From file. Select file to connect, select table/worksheet to connect. Select “Only Create Connection”. Repeat to connect more datasets or files.
Step 3 - 9: Exactly as before
Step 10: Save Worksheet
Step 11: Data - Refresh all

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
1
Q

What is SQL? What are the SQL Extraction Queries and the tables in SQL?

A

Standard language for querying and manipulating data. Can handle big data sets (excel limited to 1 mil row)

SQL Extraction Queries: Select(Distinct), From, Inner Join, Left Join, On, Where, Group by, having, Order by

Tables in Sql: table name, attribute names, tuples/row

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is Distinct in Select (SELECT DISTINCT)

A

Removes duplicate values from results of query
Ensure unique values are returned

SELECT DISTINCT Category
FROM Product

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is the WHERE Clause in Select?

A

Filter rows based on certain conditions
Filter for a single condition, combine conditions with AND, filter in a range with BETWEEN

SELECT Price, Category
FROM Product
WHERE Price > 100

SELECT *
FROM Product
WHERE category = “Gadgets”

WHERE Price BETWEEN 20 AND 200
WHERE Price > 100 AND Category = “Gadgets”

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Explain SQL Syntax

A

SQL not case sensitive Select and SELECT is the same
Where = filter rows, BETWEEN = filter ranges (dates)
Is Null and Is Not Null handles missing values
Use # for dates (#01/29/2021#)
Use [] for names with space

SELECT [store id], price, store_city
FROM [store sales]
WHERE Price BETWEEN 20 AND 200
AND order_date BETWEEN #06/01/2020 AND #12/31/2020#
AND Price IS NOT NULL;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Explain Order By - Sorting Query Results

A

Must be the last line in a query ASC is default, unless its DSC

When sorting by first column if two rows tie (same value) SQL uses second column in order by list to sort

SELECT category, price, manufacturer
FROM Product
Order By category, price

SELECT category price, manufacturer
FROM Product
ORDER BY category DESC, price

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Explain Aggregation in SELECT

A

Aggregation function: Sum(), Count(), Min(), Max(), AVG()
Alias (AS): Used to rename result column for clarity
Count can be applied to multiple attributes, applies to duplicates

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Explain Group By - Aggregation in grouping

A

Group rows that have the same value in a column using Agg

SELECT product, sum(quantity) AS TotalSold
FROM Purchase
WHERE data > #10/05/2020
GROUP BY product

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Explain Having - Aggregation and filtering

A

Filters rows after aggregation (sum, avg)
Diff: WHERE filters before aggregation, HAVING filters results after aggregation

SELECT product, sum(quantity) AS TotalSold
FROM Purchase
WHERE data > #10/05/2020
GROUP BY product
HAVING Sum(quantity)>20

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is Inner Join: Show the Diagram and Code

A

Only matching rows in both tables.

SELECT *
FROM Students
INNER JOIN Classes
ON Students.ID= Classes.ID

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is Left Outer Join: Show the Diagram and Code

A

All rows from the left table + matches

SELECT *
FROM Students
LEFT OUTER JOIN Classes
ON Students.ID= Classes.ID

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is Right Outer Join: Show the Diagram and Code

A

All rows from the right table + matches.

SELECT *
FROM Students
RIGHT OUTER JOIN Classes
ON Students.ID= Classes.ID

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is Full Outer Join: Show the Diagram and Code

A

All rows from both tables, match or not.

SELECT *
FROM Students
FULL OUTER JOIN Classes
ON Students.ID= Classes.ID

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

LEFT ANTI JOIN

A

Rows in left table with no match.

SELECT *
FROM Students
FULL OUTER JOIN Classes
ON Students.ID= Classes.ID
WHERE Classes.ID IS NULL

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Explain Table Aliases

A

Shortens table names to make the query easier to read. Use AS to rename tables temporarily.

SELECT a.category, SUM(b.units) AS Total Units

FROM Product AS a
LEFT JOIN Purchase As b
ON a.name = b.prodName
Where b.month = ‘september’
GROUP BY a.category

11
Q

RIGHT ANTI JOIN

A

Rows in right table with no match.

SELECT *
FROM Students
FULL OUTER JOIN Classes
ON Students.ID= Classes.ID
WHERE Students.ID IS NULL

12
Q

Explain Merging on Multiple Variables

A

Joins two tables using more than one condition (e.g., matching on two columns).

SELECT Product.category, sum(Purchase.units) AS TotalUnits
FROM Product
LEFT JOIN Purchase
ON Product.name = Purchase.prodName
AND Product.month = Purchase.month

13
Q

Explain Merging Three or More Tables

A

Combines three or more tables using JOINs. Conceptually merges two tables first, then merges the result with another table.

SELECT Location.state, product.category, SUM(Purchase.units)

FROM (Product
LEFT JOIN Purchase
ON Product.name = Purchase.prodName)

LEFT JOIN Location
ON Purhcase.store = Location.store

14
Q

Explain the Application -Merging Three or More Tables (GROUP BY)

A

Calculates the sum of units grouped by state and category. Example of combining tables with conditions and grouping.

SELECT Location.state, product.category, SUM(Purchase.units)

FROM (Product
LEFT JOIN Purchase
ON Product.name = Purchase.prodName)

LEFT JOIN Location
ON Purhcase.store = Location.store
GROUP BY Location.state, Product.Category