SQL Flashcards

Question

What is the difference between an index and a key in SQL?

Answer 1

**1. Index**

- An [**index**](https://www.geeksforgeeks.org/sql-indexes/) is a database object created to **speed up data retrieval**. It stores a sorted reference to table data, which helps the database engine find rows more quickly than scanning the entire table.
- **Example:** A non-unique index on a column like `LastName` allows quick lookups of rows where the last name matches a specific value.

**2. Key**

- A key is a logical concept that enforces rules for uniqueness or relationships in the data.
- A **PRIMARY KEY** uniquely identifies each row in a table and ensures that no duplicate or NULL values exist in the key column(s).
- A **FOREIGN KEY** maintains referential integrity by linking rows in one table to rows in another.

Most database engines enforce PRIMARY KEY and UNIQUE constraints by creating an index behind the scenes, so a key usually implies an index, but an index does not imply a key.
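The distinction can be tried out in a few lines. The sketch below is only an illustration: it uses Python's built-in `sqlite3` module with an in-memory SQLite database as a stand-in for whatever engine you use, and the `Employees` table and the variable names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,   -- key: uniqueness is enforced
        LastName   TEXT NOT NULL
    );
    -- Plain (non-unique) index: speeds up lookups, enforces nothing.
    CREATE INDEX idx_lastname ON Employees(LastName);
    INSERT INTO Employees VALUES (1, 'Smith');
    INSERT INTO Employees VALUES (2, 'Smith');  -- duplicate LastName is allowed
""")

# Reusing EmployeeID 1 violates the PRIMARY KEY and is refused.
try:
    conn.execute("INSERT INTO Employees VALUES (1, 'Jones')")
    duplicate_key_rejected = False
except sqlite3.IntegrityError:
    duplicate_key_rejected = True

smith_count = conn.execute(
    "SELECT COUNT(*) FROM Employees WHERE LastName = 'Smith'"
).fetchone()[0]
```

The indexed column happily holds two `'Smith'` rows, while the key column refuses a repeated `EmployeeID`.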

Answer 2

Indexing allows the [**database**](https://www.geeksforgeeks.org/what-is-database/) to locate and access the rows corresponding to a **query condition** much faster than scanning the entire table. Instead of reading each row sequentially, the database uses the index to **jump directly** to the relevant data pages. This reduces the number of disk **I/O operations** and speeds up query execution, especially for large tables.

**Example:**

```
CREATE INDEX idx_lastname ON Employees(LastName);

SELECT * FROM Employees WHERE LastName = 'Smith';
```

The index on `LastName` lets the database quickly find all rows matching 'Smith' without scanning every record.
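One way to see the effect is to ask the engine for its query plan before and after the index exists. This is a rough sketch that assumes SQLite (through Python's `sqlite3` module) as the engine; other databases expose plans through their own commands, and the `plan` helper is a name made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, LastName TEXT)")
conn.executemany(
    "INSERT INTO Employees (LastName) VALUES (?)",
    [("Smith",), ("Jones",), ("Lee",)],
)

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

query = "SELECT * FROM Employees WHERE LastName = 'Smith'"

plan_before = plan(query)   # without an index the plan is a full table scan
conn.execute("CREATE INDEX idx_lastname ON Employees(LastName)")
plan_after = plan(query)    # with the index the plan searches idx_lastname
```

The exact wording of the plan text varies between SQLite versions, so treat it as a diagnostic rather than a stable format.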

Answer 3

**Advantages:**

- Faster query performance, especially for SELECT queries with [**WHERE**](https://www.geeksforgeeks.org/sql-where-clause/) clauses, JOIN conditions, or ORDER BY clauses.
- Improved sorting and filtering efficiency.

**Disadvantages:**

- Increased storage space for the index structures.
- Additional overhead for write operations (INSERT, UPDATE, DELETE), as indexes must be updated whenever the underlying data changes.
- Potentially **slower bulk data loads** or batch inserts due to the need to maintain index integrity.

In short, indexes make read operations faster but can slow down write operations and increase storage requirements.

Answer 4

**1. Clustered Index:**

- Organizes the physical data in the table itself in the order of the indexed column(s).
- A table can have only one [**clustered index**](https://www.geeksforgeeks.org/difference-between-clustered-and-non-clustered-index/).
- Improves range queries and queries that sort data.
- Example: If `EmployeeID` is the clustered index, the rows in the table are stored physically sorted by `EmployeeID`.

**2. Non-Clustered Index:**

- Maintains a separate structure that contains a reference (or pointer) to the physical data in the table.
- A table can have multiple non-clustered indexes.
- Useful for specific query conditions that aren't related to the primary ordering of the data.
- Example: A non-clustered index on `LastName` allows fast lookups by last name even if the table is sorted by another column.

Answer 5

[**Temporary tables**](https://www.geeksforgeeks.org/what-is-temporary-table-in-sql/) are tables that exist only for the duration of a **session** or a **transaction**. They are useful for storing intermediate results, simplifying complex queries, or performing operations on subsets of data without modifying the main tables. The `#` and `##` prefixes below are SQL Server syntax.

**1. Local Temporary Tables:**

- Prefixed with `#` (e.g., `#TempTable`).
- Only visible to the session that created them.
- Automatically dropped when the session ends.

**2. Global Temporary Tables:**

- Prefixed with `##` (e.g., `##GlobalTempTable`).
- Visible to all sessions.
- Dropped when the session that created them ends and no other session is still referencing them.

**Example:**

```
CREATE TABLE #TempResults (ID INT, Value VARCHAR(50));

INSERT INTO #TempResults VALUES (1, 'Test');

SELECT * FROM #TempResults;
```

Answer 6

- **Standard View:**
  - A virtual table defined by a query.
  - Does not store data; the underlying query is executed each time the view is referenced.
  - Always shows the current data in the underlying tables.
- **Materialized View:**
  - A stored copy of the query result.
  - Data is precomputed and stored, making reads faster.
  - Requires periodic refreshes to keep data up to date.
  - Example: a materialized view can store aggregated sales data, refreshed nightly, for fast reporting.
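The difference in freshness is easy to demonstrate. SQLite has no materialized views, so the sketch below imitates one with a plain table filled from the view and refreshed by hand; that substitution, the `Sales` schema, and the helper names are all assumptions for illustration, run through Python's `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (Region TEXT, Amount INTEGER);
    INSERT INTO Sales VALUES ('East', 100), ('West', 50);

    -- Standard view: just a stored query, re-evaluated on every read.
    CREATE VIEW SalesTotals AS
        SELECT Region, SUM(Amount) AS Total FROM Sales GROUP BY Region;

    -- Stand-in for a materialized view: the result is stored as a table.
    CREATE TABLE SalesTotalsMat AS SELECT * FROM SalesTotals;
""")

def east_total(source):
    """Read the East total from either the view or the stored copy."""
    return conn.execute(
        f"SELECT Total FROM {source} WHERE Region = 'East'"
    ).fetchone()[0]

conn.execute("INSERT INTO Sales VALUES ('East', 25)")
view_total = east_total("SalesTotals")       # reflects the new row immediately
stale_total = east_total("SalesTotalsMat")   # still the old stored value

# "Refresh": recompute the query and replace the stored result.
conn.executescript("""
    DELETE FROM SalesTotalsMat;
    INSERT INTO SalesTotalsMat SELECT * FROM SalesTotals;
""")
refreshed_total = east_total("SalesTotalsMat")
```

Engines with real materialized views provide their own refresh command instead of the manual delete-and-reload used here.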

Answer 7

Constraints enforce rules that the data must follow, preventing invalid or inconsistent data from being entered:

- **NOT NULL:** Ensures that a column cannot contain NULL values.
- **UNIQUE:** Ensures that all values in a column are distinct.
- **PRIMARY KEY:** Combines NOT NULL and UNIQUE, guaranteeing that each row is uniquely identifiable.
- **FOREIGN KEY:** Ensures referential integrity by requiring values in one table to match primary key (or unique key) values in another.
- **CHECK:** Validates that values meet specific criteria (e.g., `CHECK (Salary > 0)`).

By automatically enforcing these rules, constraints maintain data reliability and consistency.
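A small experiment shows each constraint rejecting one bad row. This is a sketch under assumptions: it runs on SQLite via Python's `sqlite3` module (where foreign keys must be switched on explicitly), and the `Departments`/`Employees` schema and sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE Departments (DeptID INTEGER PRIMARY KEY);
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        Email      TEXT NOT NULL UNIQUE,
        Salary     INTEGER CHECK (Salary > 0),
        DeptID     INTEGER REFERENCES Departments(DeptID)
    );
    INSERT INTO Departments VALUES (10);
    INSERT INTO Employees VALUES (1, 'a@example.com', 500, 10);
""")

# One deliberately invalid row per constraint.
bad_rows = {
    "NOT NULL":    (2, None, 500, 10),             # missing Email
    "UNIQUE":      (3, "a@example.com", 500, 10),  # Email already used
    "PRIMARY KEY": (1, "b@example.com", 500, 10),  # EmployeeID already used
    "FOREIGN KEY": (4, "c@example.com", 500, 99),  # no such department
    "CHECK":       (5, "d@example.com", -1, 10),   # Salary must be > 0
}

rejected = []
for name, row in bad_rows.items():
    try:
        conn.execute("INSERT INTO Employees VALUES (?, ?, ?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(name)

row_count = conn.execute("SELECT COUNT(*) FROM Employees").fetchone()[0]
```

Every bad row raises an integrity error, so only the original valid employee remains in the table.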

Answer 8

- **Local Temporary Table:**
  - Prefixed with `#` (e.g., `#TempTable`).
  - Exists only within the session that created it.
  - Automatically dropped when the session ends.
- **Global Temporary Table:**
  - Prefixed with `##` (e.g., `##GlobalTempTable`).
  - Visible to all sessions.
  - Dropped only when the session that created it ends and no other session is still referencing it.

**Example:**

```
CREATE TABLE #LocalTemp (ID INT);

CREATE TABLE ##GlobalTemp (ID INT);
```

Answer 9

The **`MERGE` statement** combines multiple operations (INSERT, UPDATE, and DELETE) into one. It is used to synchronize two tables by:

- Inserting rows that don't exist in the target table.
- Updating rows that already exist.
- Deleting rows from the target table based on conditions.

**Example:**

```
MERGE INTO TargetTable T
USING SourceTable S
ON T.ID = S.ID
WHEN MATCHED THEN
    UPDATE SET T.Value = S.Value
WHEN NOT MATCHED THEN
    INSERT (ID, Value) VALUES (S.ID, S.Value);
```
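To make the three branches concrete, the sketch below spells out the same synchronization as separate UPDATE, INSERT, and DELETE statements inside one transaction. It is not `MERGE` itself: SQLite (used here through Python's `sqlite3` module) has no `MERGE`, and the sample rows are assumptions chosen so that each branch does something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TargetTable (ID INTEGER PRIMARY KEY, Value TEXT);
    CREATE TABLE SourceTable (ID INTEGER PRIMARY KEY, Value TEXT);
    INSERT INTO TargetTable VALUES (1, 'old'), (3, 'orphan');
    INSERT INTO SourceTable VALUES (1, 'new'), (2, 'added');
""")

with conn:  # apply all three steps as one transaction
    # Equivalent of WHEN MATCHED THEN UPDATE
    conn.execute("""
        UPDATE TargetTable
        SET Value = (SELECT S.Value FROM SourceTable S WHERE S.ID = TargetTable.ID)
        WHERE ID IN (SELECT ID FROM SourceTable)
    """)
    # Equivalent of WHEN NOT MATCHED THEN INSERT
    conn.execute("""
        INSERT INTO TargetTable (ID, Value)
        SELECT S.ID, S.Value FROM SourceTable S
        WHERE S.ID NOT IN (SELECT ID FROM TargetTable)
    """)
    # Delete case: remove target rows that have no matching source row
    conn.execute(
        "DELETE FROM TargetTable WHERE ID NOT IN (SELECT ID FROM SourceTable)"
    )

merged = conn.execute("SELECT ID, Value FROM TargetTable ORDER BY ID").fetchall()
```

Row 1 is updated, row 2 is inserted, and row 3 is deleted, which is the outcome a single `MERGE` with all three branches would aim for.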

Answer 10

**1. GROUP BY:** Aggregate rows to eliminate duplicates.

```
SELECT Column1, MAX(Column2)
FROM TableName
GROUP BY Column1;
```

**2. ROW_NUMBER():** Assign a unique number to each row and filter by that.

```
WITH CTE AS (
    SELECT Column1, Column2,
           ROW_NUMBER() OVER (PARTITION BY Column1 ORDER BY Column2) AS RowNum
    FROM TableName
)
SELECT * FROM CTE WHERE RowNum = 1;
```
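Running both queries on a tiny table shows that they do not always keep the same row. The sketch assumes SQLite 3.25 or newer (needed for window functions), accessed through Python's `sqlite3` module, and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableName (Column1 TEXT, Column2 INTEGER);
    INSERT INTO TableName VALUES ('a', 1), ('a', 3), ('b', 2), ('b', 2);
""")

# GROUP BY keeps one row per Column1, carrying the largest Column2.
group_by_rows = conn.execute("""
    SELECT Column1, MAX(Column2)
    FROM TableName
    GROUP BY Column1
    ORDER BY Column1
""").fetchall()

# ROW_NUMBER keeps the first row per Column1 in ORDER BY Column2 order,
# which here is the smallest Column2.
row_number_rows = conn.execute("""
    WITH CTE AS (
        SELECT Column1, Column2,
               ROW_NUMBER() OVER (PARTITION BY Column1 ORDER BY Column2) AS RowNum
        FROM TableName
    )
    SELECT Column1, Column2 FROM CTE WHERE RowNum = 1 ORDER BY Column1
""").fetchall()
```

For group `'a'` the first query returns 3 and the second returns 1, so the choice of aggregate or `ORDER BY` decides which duplicate survives.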

Answer 11

**Partitioned tables** divide data into **smaller**, more **manageable segments** based on a column's value (e.g., date or region). Each partition is stored separately, making queries that target a specific partition more efficient.

Partitioning is useful for:

- Large tables with millions or billions of rows.
- Scenarios where queries frequently filter on the partitioning columns (e.g., year, region).
- Maintenance operations, such as archiving older partitions without affecting the rest of the table.

Answer 12

[**ACID**](https://www.geeksforgeeks.org/acid-properties-in-dbms/) is an acronym that stands for Atomicity, Consistency, Isolation, and Durability: four key properties that ensure database transactions are processed reliably.

1. **Atomicity:**
   - A transaction is treated as a single unit of work, meaning all operations must succeed or fail as a whole.
   - If any part of the transaction fails, the entire transaction is rolled back.
2. **Consistency:**
   - A transaction must take the database from one valid state to another, maintaining all defined rules and constraints.
   - This ensures data integrity is preserved throughout the transaction process.
3. **Isolation:**
   - Transactions should not interfere with each other.
   - Even if multiple transactions occur simultaneously, each must operate as if it were the only one in the system until it is complete.
4. **Durability:**
   - Once a transaction is committed, its changes must persist, even in the event of a system failure.
   - This ensures the data remains stable after the transaction is successfully completed.
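Atomicity is the easiest of the four to observe directly. The sketch below is an illustration only: it assumes an in-memory SQLite database driven by Python's `sqlite3` module, and the `Accounts` table and `transfer` helper are hypothetical names invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Accounts (
        Name    TEXT PRIMARY KEY,
        Balance INTEGER CHECK (Balance >= 0)   -- consistency rule
    );
    INSERT INTO Accounts VALUES ('alice', 100), ('bob', 0);
""")

def transfer(amount):
    """Move money from alice to bob as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE Accounts SET Balance = Balance + ? WHERE Name = 'bob'",
                (amount,),
            )
            conn.execute(
                "UPDATE Accounts SET Balance = Balance - ? WHERE Name = 'alice'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT Name, Balance FROM Accounts"))

ok = transfer(30)        # both updates apply
failed = transfer(500)   # second update violates CHECK, so the first is undone
final = balances()
```

The failed transfer leaves no trace: bob's credit is rolled back together with alice's rejected debit.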

Answer 13

- The **WITH (NOLOCK)** hint allows a query to read data without acquiring shared locks, effectively reading uncommitted data.
- It can improve performance by **reducing contention for locks**, especially on large tables that are frequently updated.
- Results may be inconsistent or unreliable, as the data read might change or be rolled back.

**Example:**

```
SELECT * FROM Orders WITH (NOLOCK);
```

This query fetches data from the `Orders` table without waiting for other transactions to release their locks.

Answer 14

[**Deadlocks**](https://www.geeksforgeeks.org/deadlock-in-dbms/) occur when two or more transactions hold resources that the other transactions need, resulting in a cycle of dependency that prevents progress. Strategies to handle deadlocks include:

1. **Deadlock detection and retry:**
   - Many database systems have mechanisms to detect deadlocks and terminate one of the transactions to break the cycle.
   - The terminated transaction can be retried after the other transactions complete.
2. **Reducing lock contention:**
   - Use indexes and optimized queries to minimize the duration and scope of locks.
   - Break transactions into smaller steps to reduce the likelihood of conflicts.
3. **Using proper isolation levels:**
   - In some cases, lower isolation levels can help reduce locking.
   - Higher isolation levels (like Serializable) give stronger guarantees but typically hold more locks for longer, which can increase deadlock risk, so choose the lowest level that still meets the correctness requirements.
4. **Consistent ordering of resource access:**
   - Ensure that transactions acquire resources in the same order to prevent cyclical dependencies.
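The detect-and-retry strategy usually ends up as a small wrapper in application code. The following is a minimal sketch, not any driver's real API: `DeadlockError` is a hypothetical stand-in for whatever exception your database driver raises when a transaction is chosen as the deadlock victim, and `flaky_transaction` merely simulates that situation.

```python
import time

class DeadlockError(Exception):
    """Hypothetical stand-in for a driver's 'chosen as deadlock victim' error."""

def run_with_retry(transaction, max_attempts=3, base_delay=0.01):
    """Run `transaction`; retry with a growing pause if it is a deadlock victim."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            time.sleep(base_delay * attempt)  # back off before retrying

# Simulated transaction that is picked as the victim twice, then succeeds.
calls = {"count": 0}

def flaky_transaction():
    calls["count"] += 1
    if calls["count"] < 3:
        raise DeadlockError("simulated deadlock")
    return "committed"

result = run_with_retry(flaky_transaction)
```

In real code the retried callable should restart the whole transaction from the beginning, since the database has already rolled back the victim's work.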

Answer 15

A [**database snapshot**](https://www.geeksforgeeks.org/sql-snapshots/) is a read-only, static view of a database at a specific point in time. Common uses include:

- **Reporting:** Allowing users to query a consistent dataset without affecting live operations.
- **Backup and recovery:** Snapshots can serve as a point-in-time recovery source if changes need to be reversed.
- **Testing:** Providing a stable dataset for testing purposes without the risk of modifying the original data.

**Example:**

```
CREATE DATABASE MySnapshot ON (
    NAME = MyDatabase_Data,
    FILENAME = 'C:\Snapshots\MyDatabase_Snapshot.ss'
) AS SNAPSHOT OF MyDatabase;
```

Answer 16

**1. Livelock**

- Occurs when two or more transactions keep responding to each other's changes, but no progress is made.
- Unlike a deadlock, the transactions are not blocked; they are actively running, but they cannot complete.

**2. Deadlock**

- Occurs when transactions are blocked waiting for each other to release locks.
- No progress can be made unless one of the transactions is terminated.

Answer 17

**1. Indexing Strategy:**

- Focus on the most frequently queried columns or those involved in [**JOIN**](https://www.geeksforgeeks.org/sql-join-set-1-inner-left-right-and-full-joins/) and WHERE conditions.
- Avoid indexing every column, as it increases storage and maintenance costs.

**2. Index Types:**

- Use clustered indexes for primary key lookups and range queries.
- Use non-clustered indexes for filtering, ordering, and covering specific queries.

**3. Partitioned Indexes:**

- If the table is partitioned, consider creating **local indexes** for each partition. This improves manageability and can speed up queries targeting specific partitions.

**4. Maintenance Overhead:**

- Index rebuilding and updating can be resource-intensive. Plan for regular index maintenance during off-peak hours.
- Monitor index fragmentation and rebuild indexes as necessary to maintain performance.

**5. Monitoring and Tuning:**

- Continuously evaluate query performance using execution plans and statistics.
- Remove unused or rarely accessed indexes to reduce maintenance costs.

Indexing large tables requires a careful approach to ensure that performance gains from faster queries outweigh the costs of increased storage and maintenance effort.

Answer 18

**1. Write Simple, Clear Queries:**

- Avoid overly complex joins and [**subqueries**](https://www.geeksforgeeks.org/sql-subquery/).
- Use straightforward, well-structured SQL that is easy to read and maintain.

**2. Filter Data Early:**

- Apply WHERE clauses as early as possible to reduce the amount of data processed.
- Consider using indexed columns in WHERE clauses for faster lookups.

**3. Avoid `SELECT *`:**

- Retrieve only the columns needed. This reduces I/O and improves performance.

**4. Use Indexes Wisely:**

- Create indexes on columns that are frequently used in WHERE clauses, JOIN conditions, and ORDER BY clauses.
- Regularly review index usage and remove unused indexes.

**5. Leverage Query Execution Plans:**

- Use execution plans to identify bottlenecks, missing indexes, or inefficient query patterns.

**6. Use Appropriate Join Types:**

- Choose INNER JOIN, LEFT JOIN, or FULL OUTER JOIN based on the data relationships and performance requirements.

**7. Break Down Complex Queries:**

- Instead of a single monolithic query, use temporary tables or CTEs to process data in stages.

**8. Optimize Aggregations:**

- Use GROUP BY and aggregate functions efficiently.
- Consider pre-aggregating data if queries frequently require the same computations.

**9. Monitor Performance Regularly:**

- Continuously analyze query performance and fine-tune as data volumes grow or usage patterns change.

Answer 19

The [**PIVOT operator**](https://www.geeksforgeeks.org/pivot-and-unpivot-in-sql/) transforms rows into columns, making it easier to summarize or rearrange data for reporting.

**Example:** Converting a dataset that lists individual sales by date into a format that displays each year's total as a separate column.

```
SELECT ProductID, [2021], [2022]
FROM (
    SELECT ProductID, YEAR(SaleDate) AS SaleYear, Amount
    FROM Sales
) AS Source
PIVOT (
    SUM(Amount) FOR SaleYear IN ([2021], [2022])
) AS PivotTable;
```
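Engines without a `PIVOT` operator can produce the same shape with conditional aggregation. The sketch below swaps in that technique (`SUM` over `CASE` expressions) rather than `PIVOT` itself; it assumes SQLite through Python's `sqlite3` module, dates stored as ISO text, and sample rows invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (ProductID INTEGER, SaleDate TEXT, Amount INTEGER);
    INSERT INTO Sales VALUES
        (1, '2021-03-01', 10), (1, '2021-07-15', 5),
        (1, '2022-01-20', 7),  (2, '2022-05-05', 4);
""")

# One SUM(CASE ...) per output column plays the role of PIVOT's IN list.
pivoted = conn.execute("""
    SELECT ProductID,
           SUM(CASE WHEN strftime('%Y', SaleDate) = '2021' THEN Amount END) AS "2021",
           SUM(CASE WHEN strftime('%Y', SaleDate) = '2022' THEN Amount END) AS "2022"
    FROM Sales
    GROUP BY ProductID
    ORDER BY ProductID
""").fetchall()
```

Product 2 has no 2021 sales, so its 2021 column comes back as NULL (`None` in Python) rather than 0.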


(43 cards)