2. SQL and Databases Flashcards

Question 1

Q

What are the differences between OLTP and OLAP systems?

Answer

A

OLTP (Online Transaction Processing) systems are optimized for managing transaction-oriented applications, while OLAP (Online Analytical Processing) systems are designed for complex queries and data analysis.

Question 2

Q

What is indexing in databases, and how does it improve performance?

Answer

A

Indexing is a data structure technique that improves the speed of data retrieval operations on a database table. It allows the database to find data faster without scanning the entire table.

Indexing in databases is a performance optimization technique that creates a data structure to quickly locate and access specific rows in a table. Here’s how it works:

Structure: An index is like a lookup table that contains a copy of selected columns and a pointer to the full row data.
Performance Improvement: Instead of scanning the entire table (a full table scan), the database can use the index to quickly find the exact rows matching a query, significantly reducing search time.

Example: In a table with millions of user records, an index on the email column allows near-instant retrieval of a specific user’s record without checking every single row.
Trade-offs:

Pros: Faster read operations
Cons: Slower write operations (as indexes must be updated), additional storage space required

Common Index Types:

B-Tree Index: Most common, good for equality and range queries
Hash Index: Excellent for exact match lookups
Bitmap Index: Efficient for columns with low cardinality

Typical performance improvement: From O(n) (linear search) to O(log n) query time for indexed columns.

Question 3

Q

Explain the concept of normalization and denormalization.

Answer

A

Normalization is the process of organizing data to reduce redundancy, while denormalization is the process of combining tables to improve read performance at the cost of increased redundancy.

Question 4

Q

How do you optimize complex SQL queries?

Answer

A

Optimizing complex SQL queries can involve indexing, rewriting queries for efficiency, reducing the number of joins, and using appropriate filtering and aggregation.

Question 5

Q

What is a clustered vs. non-clustered index?

Answer

A

A clustered index determines the physical order of data in a table, while a non-clustered index is a separate structure that points to the physical data.

Clustered Index:

Defines the physical order of table data.
Only one per table.
Faster for range queries & sorting.

Non-Clustered Index:
Does not change physical order.
Multiple indexes allowed.
Stores pointers to actual data.
Faster for search queries on specific columns.

Question 6

Q

What is a materialized view, and how does it differ from a regular view?

Answer

A

A materialized view is a database object that contains the results of a query and is stored on disk, whereas a regular view is a virtual table that does not store data.

Question 7

Q

How do you handle duplicates in SQL data?

Answer

A

Duplicates in SQL data can be handled using the DISTINCT keyword, GROUP BY clause, or by using the ROW_NUMBER() function to identify and remove duplicates.

Question 8

Q

Explain ACID properties in a relational database.

Answer

A

ACID properties ensure that database transactions are processed reliably: Atomicity, Consistency, Isolation, and Durability.

Question 9

Q

What is the difference between UNION and UNION ALL?

Answer

A

UNION combines the results of two queries and removes duplicates, while UNION ALL combines the results and includes all duplicates.

Question 10

Q

How does partitioning improve query performance?

Answer

A

Partitioning improves query performance by dividing a large table into smaller, more manageable pieces, allowing for faster data access and maintenance.