Create Transact-SQL SELECT queries Flashcards
Ways to assign alias in SELECT statement
- column AS alias
- column alias
- alias = column
When do you need to delimit identifiers?
When it doesn’t comply with the rules:
- First character must be a letter, underscore, @, or #.
- Subsequent characters can include letters, decimal numbers, @, #, $, or underscore.
- Cannot be a reserved keyword in T-SQL
- Cannot have embedded spaces
- Must not include supplementary characters
Two ways to delimit identifiers
- “2017”
2. [2017]
Why shouldn’t you use = when looking for a NULL? What should you use instead?
Because nothing is considered equal to a NULL–not even another NULL.
You should use IS NULL or IS NOT NULL operators
e.g.
WHERE region IS NULL;
What is the order of operations for AND, OR, and NOT?
NOT, AND, OR
Note: can use parentheses to specify order
How should you identify Unicode character strings?
N’string’
Syntax for pattern matching
What do these wildcards mean?
%, _, [characters], [character-range], [^characters]
column LIKE pattern
% : matches any string including empty one
_ : a single character
[characters] : a single character from a list of characters
[character-range] : a single character from a range
[^characters] : a single character that is not in the list or range
How to look for a character that is considered a wildcard?
- Use escape character: col1 LIKE ‘!_%’ ESCAPE ‘!’ looks for strings that start with an underscore
- Use square brackets: col1 LIKE ‘[_]%’
Which date form is considered language-neutral for all date and time types?
‘YYYYMMDD’
You’re trying to find orders placed in April. What is wrong with the following and what should you do instead?
SELECT orderid, orderdate, empid
FROM Sales.Orders2
WHERE orderdate BETWEEN ‘20160401’ AND ‘20160430 23:59:59.999’
The value is rounded up to the next millisecond. This also returns orders placed in May 1, 2016.
SELECT orderid, orderdate, empid
FROM Sales.Orders2
WHERE orderdate >= ‘20160401’ AND orderdate < ‘20160501’
Can you order by columns that are not in the SELECT statement?
When would this fail?
Yes
It can fail when DISTINCT is used because since duplicates are removed, result rows don’t necessarily map to source rows in a one-to-one manner
In what order does SQL process the components of a query?
FROM WHERE GROUP BY HAVING SELECT ORDER BY
Syntax for TOP for both number of rows and percent
Does TOP PERCENT round up or down the number of rows?
SELECT TOP (4) col1, col2
FROM Sales.Orders
ORDER BY orderdate DESC;
Note: Can also specify number of rows without parentheses, but correct syntax is with
Note: Can use expressions as inputs
SELECT TOP (4) PERCENT col1, col2
FROM Sales.Orders
ORDER BY orderdate DESC;
Note: TOP PERCENT rounds up for number of rows
What does OFFSET do?
Syntax for OFFSET-FETCH
Which clause is required for OFFSET-FETCH?
Do you always need to use OFFSET-FETCH together?
What if want to filter a certain number of rows in arbitrary order?
After which statement is OFFSET-FETCH processed in a query?
It skips rows
SELECT col1, col2
FROM Sales.Orders
ORDER BY orderdate
OFFSET 50 ROWS FETCH NEXT 25 ROWS ONLY;
ORDER BY is required to use OFFSET-FETCH
FETCH clause required OFFSET, but OFFSET doesn’t require FETCH
Note: If skipping 0 rows, use FETCH FIRST instead of NEXT
Note: can use expressions as inputs
For arbitrary order use ORDER BY (SELECT NULL)
OFFSET-FETCH is processed after SELECT. Think of it as an extension of the ORDER BY clause
Between TOP and OFFSET-FETCH, which one is standard and which one is not?
OFFSET-FETCH is standard
Guidelines for set operators: UNION, UNION ALL, INTERSECT, and EXCEPT
- About number of columns and column types
- Set operators use distinctness-based comparison rather than equality-based. What does this mean?
- These operators are set operators and not cursor operators. What restriction does this impose on the individual queries?
- Where can you have an ORDER BY clause?
- How are column names of the result columns determined?
- Number of columns between the queries has to be the same and the column types of corresponding columns need to be compatible (implicitly convertible)
- A comparison between two NULLs yields true. Comparison between NULL and non-NULL yields false.
- Individual queries cannot have an ORDER BY clause
- You can add an ORDER BY clause at the end of the query so that it acts on the result of the set operator
- Columns names are determined by the first query.
What does UNION do?
Syntax for UNION
When do you use UNION ALL?
If the sets you’re unifying don’t have duplicate rows, should you use UNION or UNION ALL?
Combines results of queries vertically
SELECT country, region
FROM HR.Employees
UNION
SELECT country, region
FROM Sales.Customers;
Use UNION ALL when you want to keep duplicate rows (They are excluded if just use UNION)
If no duplicates, use UNION ALL so that SQL doesn’t have to check for duplicates and thus save computation cost
What does INTERSECT do?
Syntax for INTERSECT
Returns distinct rows that are in both queries
SELECT country, region, city
FROM HR.Employees
INTERSECT
SELECT country, region, city
FROM Sales.Customers;
What does EXCEPT do?
Syntax for EXCEPT
Returns rows that are in the first query but not in the second.
SELECT country, region, city
FROM HR.Employees
EXCEPT
SELECT country, region, city
FROM Sales.Customers;
What is the order of operations for the set operators?
INTERSECT precedes UNION and EXCEPT
UNION and EXCEPT are evaluated left to right based on their position in the expression.
Note: can use parenthesis to specify order
What is a CROSS JOIN?
CROSS JOIN syntax
It produces the Cartesian product of the two tables
SELECT D.n AS theday, S.n AS shiftno
FROM dbo.Nums AS D
CROSS JOIN dbo.Nums AS S
Consider this query:
SELECT D.n AS theday, S.n AS shiftno FROM dbo.Nums AS D CROSS JOIN dbo.Nums AS S WHERE D.n <= 7 AND S.n <= 3 ORDER BY theday, shiftno;
SQL Server performs optimization technique called predicate pushdown. What does this mean?
SQL Server knows that with a cross join followed by a filter it can evaluate the filters first.
What is an INNER JOIN?
INNER JOIN syntax
What’s equivalent to INNER JOIN?
Combines tables horizontally based on matching values in both tables.
SELECT column_name(s)
FROM table1
INNER JOIN table2
ON table1.column_name = table2.column_name;
JOIN is equivalent to INNER JOIN
For inner joins, what’s the difference between ON and WHERE clauses?
What about outer joins?
There’s no difference. Instead of using a WHERE clause, you can simply put the expression on the ON clause
In outer joins, ON and WHERE play different roles. ON matches while WHERE filters.
What is a left join?
What is a right join?
Syntax
Left join returns all records from the left table (table1), and the matched records from the right table (table2).
Right join returns all records from the right table (table2), and the matched records from the left table (table1)
SELECT column_name(s)
FROM table1
LEFT JOIN table2
ON table1.column_name = table2.column_name;
Note: can also say LEFT OUTER JOIN
What is a full join?
Syntax
A full join returns matched rows plus all the unmatched rows from the left and right tables.
SELECT column_name(s) FROM table1 FULL OUTER JOIN table2 ON table1.column_name = table2.column_name WHERE condition;
Note: can also just say FULL JOIN
Between a unique constraint and a primary key constraint, which allows NULLs?
Only the unique constraint allows NULLs
What does the ISNULL function do?
Syntax
The ISNULL function checks whether an expression is NULL and if so, replaces it by the value specified. Otherwise, it returns the expression
ISNULL(expression, value)
e.g.
ISNULL(col1, N’N/A’)
This table has a clustered index. It also has NULLs in the ‘region’ column. What’s the problem with the following query (Two cases: when SQL Server uses nested loops and when it uses merge for join)? What should you do instead?
SELECT EL.country, EL.region, EL.city, EL.numemps, CL.numcusts
FROM dbo.EmpLocations AS EL
INNER JOIN dbo.CustLocations AS CL
ON EL.country = CL.country
AND ISNULL(EL.region, N’N/A’) = ISNULL(CL.region, N’N/A’)
AND EL.city = CL.city;
The problem is that once you manipulate a column with the ISNULL function, SQL Server cannot trust that the result values preserve the same ordering as the original values.
For nested loops, it will make the region and city predicates appear as residual predicates rather than as seek predicates. This increases computational cost.
For merge joins, after scanning the clustered index, SQL Server has to sort the results before merging. Also increases computational cost.
The following query does the same but is order-preserving:
SELECT EL.country, EL.region, EL.city, EL.numemps, CL.numcusts
FROM dbo.EmpLocations AS EL
INNER JOIN dbo.CustLocations AS CL
ON EL.country = CL.country
AND(EL.region = CL.region OR (EL.region IS NULL AND CL.region IS NULL))
AND EL.city = CL.city;
Then all predicates will appear under seek predicates.
Consider the following query. Is it order-preserving or not?
Explain what happens in the ON clause
What is the advantage of using INTERSECT instead of = here?
SELECT EL.country, EL.region, EL.city, EL.numemps, CL.numcusts
FROM dbo.EmpLocations AS EL
INNER JOIN dbo.CustLocations AS CL
ON EXISTS (SELECT EL.country, EL.region, EL.city
INTERSECT
SELECT CL.country, CL.region, CL.city);
Yes, it is order-preserving.
For each row evaluated by the join, the set operator performs an intersection of the employee location attributes and the customer location attributes using FROM-less SELECT statements, each producing one row. If the locations intersect, the result is one row, in which case the EXISTS predicate returns true. If the locations don’t intersect, the result is an empty set, in which case the EXISTS predicate returns false.
This take advantage of the distinctness-based comparison of set operators. This query will return rows where NULLs match between the two tables.
In what order does SQL Server evaluate multi-join queries?
From left to right, so the result of one join is used as the left input for the next join
Assume that not all suppliers have products and therefore no categoryid. What is the problem with this query?
SELECT companyname AS supplier, S.country, P.productid, P.productname, P.unitprice, C.categoryname
FROM Production.Suppliers AS S
LEFT OUTER JOIN Production.Products AS P
ON S.supplierid = P.supplierid
INNER JOIN Production.Categories AS C
ON C.categoryid = P.categoryid
WHERE S.country = N’Japan’;
Provide two solutions to the problem. One using outer joins and one separating joins into their own independent logical phase.
The inner join that followed the outer join nullified the outer part of the join. The first join returns all suppliers, including NULLs for those without products. Then the inner join compared the NULLs in the categoryid column in the outer rows to categoryid values in Production.Categories and discarded those rows.
Solution 1: outer joins
SELECT companyname AS supplier, S.country, P.productid, P.productname, P.unitprice, C.categoryname
FROM Production.Suppliers AS S
LEFT OUTER JOIN Production.Products AS P
ON S.supplierid = P.supplierid
LEFT OUTER JOIN Production.Categories AS C
ON C.categoryid = P.categoryid
WHERE S.country = N’Japan’;
Solution 2: separating into own phase
SELECT companyname AS supplier, S.country, P.productid, P.productname, P.unitprice, C.categoryname
FROM Production.Suppliers AS S
LEFT OUTER JOIN
(Production.Products as P
INNER JOIN Production.Categories as C
ON C.categoryid = P.categoryid)
ON S.supplierid = P.supplierid
WHERE S.country = N’Japan’;
CAST function syntax
CONVERT function syntax
Which one is standard and which one is proprietary in T-SQL?
When is it useful to use CONVERT over CAST?
CAST(source_expression AS target_type)
e.g. CAST(‘100’ AS INT)
CONVERT(target_type, source_expression [,style_number])
e.g. CONVERT(DATE, ‘01/02/2018’, 101)
CAST is standard and CONVERT is proprietary in T-SQL.
It’s useful to use CONVERT when you need to specify a style for the conversion.
PARSE function syntax
How is it different from CONVERT?
What is one drawback of this function?
PARSE(string_input AS target_type [USING style]
Instead of style numbers, it uses the more user-friendly .NET culture names.
e.g. PARSE(‘01/02/2019’ AS DATE USING ‘en-US’)
A drawback is that it’s significantly slower than CONVERT.
What is a problem with CAST, CONVERT, and PARSE?
What can you do about it?
If the function fails to convert a value within a query, the whole query fails and stops processing.
Instead, can use TRY_CAST, TRY_CONVERT, and TRY_PARSE, which return a NULL when the conversion isn’t valid.
FORMAT function syntax
FORMAT(value, format)
e.g. FORMAT(SYSDATETIME(), ‘yyyy-MM-dd’)
Uses a .NET format string
Note: this function is pretty slow
Which functions return the current date and time?
In what format do they give you the date and time?
Which one is T-SQL and which one is standard?
GETDATE() and CURRENT_TIMESTAMP
Note: no parenthesis for CURRENT_TIMESTAMP
DATETIME format
GETDATE() is T-SQL and CURRENT_TIMESTAMP is standard.
Which date functions return the current date and time in the more precise DATETIME2 and DATETIMEOFFSET data types?
SYSDATETIME()
SYSDATETIMEOFFSET()
Which functions return the current date and time in UTC terms?
In what format do they give you the date and time?
GETUTCDATE() and SYSUTCDATETIME()
DATETIME and DATETIME2, respectively.