SQL optimizations Flashcards

Question 1

Q

Which is better to use: joins or subqueries? Why

Answer

A

Prefer joins over subqueries and correlated subqueries, especially if the tables involved have indexes that can be leveraged by the join

Question 2

Q

Why should select specific columns?

Answer

A

reduce the amount of data that needs to be processed and transferred.

Question 3

Q

How do you exploit and improve where clause performance?

Answer

A

Conditions in WHERE clauses should be written to exploit indexes and to minimize the number of rows processed

Question 4

Q

How do you optimize aggregate functions and group by?

Answer

A

When using functions like COUNT, SUM, and AVG, or when grouping data, make sure that the fields involved are indexed if possible

Question 5

Q

Should you use wildcards?

Answer

A

If using LIKE, place wildcards (%) judiciously. A leading wildcard prevents index usage and can slow down searches

Question 6

Q

T or F. You can use functions on where columns that are indexed.

Answer

A

False. Applying functions on indexed columns within the WHERE clause can lead to full table scans because the index cannot be used directly.

Question 7

Q

What is the proper way to use order by and having clauses?

Answer

A

ORDER BY and HAVING clauses can be expensive, use them only when necessary, and ensure there’s an index that can support these operations.

Question 8

Q

T or F. Batch updates and inserts. Why?

Answer

A

Instead of single row inserts or updates, batch operations can reduce the transaction overhead and increase performance

Question 9

Q

What are some advanced indexing techniques?

Answer

A

Partial indexes: Create indexes on a subset of the data that meets certain conditions, reducing index size and maintenance overhead.

Composite indexes: Use multiple columns in an index to improve the performance of queries that filter or sort on these columns.

Index-organized tables: Store table data in an index-based structure to improve access time, useful for OLTP systems with frequent short queries.

Question 10

Q

Explain the differences between the joins:

Answer

A

INNER JOIN (a.k.a. “simple join”): Returns all rows for which there is at least one match in BOTH tables. This is the default type of join if no specific JOIN type is specified.

LEFT JOIN (or LEFT OUTER JOIN): Returns all rows from the left table, and the matched rows from the right table; i.e., the results will contain all records from the left table, even if the JOIN condition doesn’t find any matching records in the right table. This means that if the ON clause doesn’t match any records in the right table, the JOIN will still return a row in the result for that record in the left table, but with NULL in each column from the right table.

RIGHT JOIN (or RIGHT OUTER JOIN): Returns all rows from the right table, and the matched rows from the left table. This is the exact opposite of a LEFT JOIN; i.e., the results will contain all records from the right table, even if the JOIN condition doesn’t find any matching records in the left table. This means that if the ON clause doesn’t match any records in the left table, the JOIN will still return a row in the result for that record in the right table, but with NULL in each column from the left table.

FULL JOIN (or FULL OUTER JOIN): Returns all rows for which there is a match in EITHER of the tables. Conceptually, a FULL JOIN combines the effect of applying both a LEFT JOIN and a RIGHT JOIN; i.e., its result set is equivalent to performing a UNION of the results of left and right outer queries.

CROSS JOIN: Returns all records where each row from the first table is combined with each row from the second table (i.e., returns the Cartesian product of the sets of rows from the joined tables). Note that a CROSS JOIN can either be specified using the CROSS JOIN syntax (“explicit join notation”) or (b) listing the tables in the FROM clause separated by commas without using a WHERE clause to supply join criteria (“implicit join notation”).

Question 11

Q

Why is data partitioning helpful?

Answer

A

Data partitioning splits tables into smaller, more manageable pieces, improving query performance by limiting the number of rows to scan. Partitioning can be based on range, list, or hash of a column.

Question 12

Q

What are materialized views?

Answer

A

Precompute and store the result of a complex query, and then query this precomputed result. It’s highly effective for data that does not change frequently but is read often.

Question 13

Q

What is an execution plan? When would you use it? How would you view the execution plan?

Answer

A

An execution plan is basically a road map that graphically or textually shows the data retrieval methods chosen by the SQL server’s query optimizer for a stored procedure or ad hoc query. Execution plans are very useful for helping a developer understand and analyze the performance characteristics of a query or stored procedure, since the plan is used to execute the query or stored procedure.

In many SQL systems, a textual execution plan can be obtained using a keyword such as EXPLAIN, and visual representations can often be obtained as well

Question 14

Q

What are the ACID properties that effectively guarentee that database transactions are processed reliably?

Answer

A

ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably. They are defined as follows:

Atomicity. Atomicity requires that each transaction be “all or nothing”: if one part of the transaction fails, the entire transaction fails, and the database state is left unchanged. An atomic system must guarantee atomicity in each and every situation, including power failures, errors, and crashes.

Consistency. The consistency property ensures that any transaction will bring the database from one valid state to another. Any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof.

Isolation. The isolation property ensures that the concurrent execution of transactions results in a system state that would be obtained if transactions were executed serially, i.e., one after the other. Providing isolation is the main goal of concurrency control. Depending on concurrency control method (i.e. if it uses strict - as opposed to relaxed - serializability), the effects of an incomplete transaction might not even be visible to another transaction.

Durability. Durability means that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors. In a relational database, for instance, once a group of SQL statements execute, the results need to be stored permanently (even if the database crashes immediately thereafter). To defend against power loss, transactions (or their effects) must be recorded in a non-volatile memory.

Question 15

Q

What does UNION do? What is the difference between UNION and UNION ALL?

Answer

A

UNION merges the contents of two structurally-compatible tables into a single combined table. The difference between UNION and UNION ALL is that UNION will omit duplicate records whereas UNION ALL will include duplicate records.

It is important to note that the performance of UNION ALL will typically be better than UNION, since UNION requires the server to do the additional work of removing any duplicates. So, in cases where is is certain that there will not be any duplicates, or where having duplicates is not a problem, use of UNION ALL would be recommended for performance reasons.

Question 16

Q

Assume a schema of Emp ( Id, Name, DeptId ) , Dept ( Id, Name).

If there are 10 records in the Emp table and 5 records in the Dept table, how many rows will be displayed in the result of the following SQL query:

Select * From Emp, Dept

Answer

Study These Flashcards

A

The query will result in 50 rows as a “cartesian product” or “cross join”, which is the default whenever the ‘where’ clause is omitted.

Question 17

Q

Give an example of using the except clause

Answer

Study These Flashcards

A

select * from test_a
except
select * from test_b;

Question 18

Q

How can you select all the even number records from a table? All the odd number records?

Answer

Study These Flashcards

A

Select * from table where id % 2 = 0

Select * from table where id % 2 != 0

Question 19

Q

What is the difference between the WHERE and HAVING clauses?

Answer

Study These Flashcards

A

When GROUP BY is not used, the WHERE and HAVING clauses are essentially equivalent.

However, when GROUP BY is used:

The WHERE clause is used to filter records from a result. The filtering occurs before any groupings are made.

The HAVING clause is used to filter values from a group (i.e., to check conditions after aggregation into groups has been performed).

Question 20

Q

What is the difference between char and varchar2?

Answer

Study These Flashcards

A

When stored in a database, varchar2 uses only the allocated space. E.g. if you have a varchar2(1999) and put 50 bytes in the table, it will use 52 bytes.

But when stored in a database, char always uses the maximum length and is blank-padded. E.g. if you have char(1999) and put 50 bytes in the table, it will consume 2000 bytes.

Question 21

Q

What is the difference between IN and EXISTS?

Answer

Study These Flashcards

A

IN:

Works on List result set
Doesn’t work on subqueries resulting in Virtual tables with multiple columns
Compares every value in the result list
Performance is comparatively SLOW for larger resultset of subquery
EXISTS:

Works on Virtual tables
Is used with co-related queries
Exits comparison when match is found
Performance is comparatively FAST for larger resultset of subquery

SQL optimizations Flashcards

(21 cards)