SQL: from 0 to hero Flashcards
Learn active concepts about SQL to succeed in product data science interviews.
What is the difference between create table
and create view
?
Table 2: departments (dept_id, de
Table 1: employees (id, name, dept_id)
easy
create table
: Stores data physically; faster for repeated access but uses storage.
- Use: Static data or precomputed results.
- Avoid: Dynamic data needing frequent updates.
create view
: Virtual table; reflects live data but slower for complex queries.
Use: Dynamic, query-based data.
Avoid: Performance-intensive queries.
What is the difference between INNER JOIN
and OUTER JOIN
?
T2: departments (dept_id, dept_name)
T1: employees (id, name, dept_id), T2: departments (dept_id, dept_name)
easy
INNER JOIN
returns matching rows from both tables, while OUTER JOIN
includes unmatched rows from one or both tables.
– INNER JOIN: Only matching rows
SELECT e.name, d.dept_name FROM employees e INNER JOIN departments d ON e.dept_id = d.dept_id;
– OUTER JOIN: Includes unmatched rows
SELECT e.name, d.dept_name FROM employees e LEFT OUTER JOIN departments d ON e.dept_id = d.dept_id;
How can you **find duplicate records **in a table?
Table: employees (id, name, dept_id)
difficulty: easy
Use GROUP BY on columns and HAVING COUNT(*) > 1
SELECT name, COUNT(*) FROM employees GROUP BY name HAVING COUNT(*) > 1;
How do window functions differ from aggregate functions?
employees (id, name, dept_id, salary)
**difficulty: ** medium
Window functions operate over a subset of rows defined by OVER() without collapsing them into one row, unlike aggregate functions.
-- Window Function SELECT name, salary, SUM(salary) OVER (PARTITION BY dept_id) AS dept_salary FROM employees;
-- Aggregate Function SELECT dept_id, SUM(salary) AS dept_salary FROM employees GROUP BY dept_id;
What is a Common Table Expression (CTE), and when would you use it?
Table: sales (region, sales)
**difficulty: ** medium
A CTE is a temporary result set used for modular, readable queries (e.g., recursive operations).
WITH SalesCTE AS ( SELECT region, SUM(sales) AS total_sales FROM sales GROUP BY region ) SELECT region, total_sales FROM SalesCTE;
How can you calculate the **cumulative sum **of a column?
sales (name, sales, sales_date)
**difficulty: ** medium
Use SUM(column) OVER (ORDER BY column) in a window function.
SELECT name, sales, SUM(sales) OVER (ORDER BY sales_date) AS cumulative_sales FROM sales;
What is the purpose of the EXPLAIN
statement?
Table: employees (id, name, dept_id)
**difficulty: ** medium
It shows the query execution plan to optimize performance.
How do you remove duplicate rows from a table?
employees (id, name, dept_id)
**difficulty: ** medium
Use DELETE with ROW_NUMBER() or DISTINCT.
DELETE FROM employees WHERE ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) > 1
Write a query to get the second highest salary from an employees table.
Table: employees (id, name, salary)
**difficulty: ** medium
SELECT MAX(salary) FROM employees WHERE salary < (SELECT MAX(salary) FROM employees);
What is the difference between HAVING
and WHERE
?
employees (id, name, dept_id, salary)
**difficulty: ** easy
WHERE
filters rows before grouping and HAVING
filters groups after aggregation.
-- WHERE SELECT * FROM employees WHERE salary > 50000;
-- HAVING SELECT dept_id, AVG(salary) FROM employees GROUP BY dept_id HAVING AVG(salary) > 60000;
How do you handle nulls in SQL when performing calculations?
Table: employees (id, name, salary)
**difficulty: ** easy
Use COALESCE()
or ISNULL()
to replace nulls with default values.
SELECT name, COALESCE(salary, 0) AS salary FROM employees; -- COALESCE replaces NULL with a value
How do you find the 90th percentile in a dataset?
Table: employees (id, name, salary)
**difficulty: ** hard
With PERCENTILE_CONT
SELECT PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY salary) AS percentile_salary FROM employees;
How do you **rank rows **in a table?
Table: employees (id, name, salary)
**difficulty: ** medium
SELECT name, salary, RANK() OVER (ORDER BY salary DESC) AS rank FROM employees;
What is the difference between TRUNCATE
and DELETE
?
Table: employees (id, name, dept_id)
**difficulty: ** easy
TRUNCATE TABLE employees; -- Deletes all rows
DELETE FROM employees WHERE dept_id = 101; -- Deletes specific rows
Table: employees (id, name, dept_id)
What is a primary key, and why is it important? Create a table with a primary key.
Table: employees (id, name)
**difficulty: ** easy
The primary key is a column in a relational database table that’s distinctive for each record. Primary keys must contain UNIQUE values, and cannot contain NULL values.
CREATE TABLE employees ( id INT PRIMARY KEY, name VARCHAR(50) )
Learn more about primary keys.
How do you pivot data in SQL?
Table: employees (id, dept_id, gender)
**difficulty: ** medium
SELECT dept_id, SUM(CASE WHEN gender = 'Male' THEN 1 ELSE 0 END) AS male_count, SUM(CASE WHEN gender = 'Female' THEN 1 ELSE 0 END) AS female_count FROM employees GROUP BY dept_id
Write a query to calculate the percentage of total sales per region?
Table: sales (region, sales)
**difficulty: ** hard
SELECT region, SUM(sales) / SUM(SUM(sales)) OVER () * 100 AS percentage FROM sales GROUP BY region;
What is the purpose of normalization in SQL?
employees (id, name, dept_id)
Tables: departments (dept_id, dept_name),
**difficulty: ** easy
To reduce redundancy and dependency by organizing data into logical tables.
CREATE TABLE departments (dept_id INT PRIMARY KEY, dept_name VARCHAR(50)); CREATE TABLE employees (id INT PRIMARY KEY, name VARCHAR(50), dept_id INT);
How do you get the total row count of a table or grouped result set?
Table: employees (id, name, dept_id)
**difficulty: ** easy
Use COUNT(*)
for the table or COUNT(column)
for grouped subsets.
-- Total row count of the table SELECT COUNT(*) AS total_rows FROM employees;
-- Row count grouped by department SELECT dept_id, COUNT(*) AS dept_row_count FROM employees GROUP BY dept_id;
What is the difference between RANK()
, DENSE_RANK()
, and ROW_NUMBER()
?
Table: employees (id, name, salary)
**difficulty: ** medium
- RANK() assigns ranks with gaps for tied values.
- DENSE_RANK() assigns consecutive ranks without gaps.
- ROW_NUMBER() assigns unique row numbers regardless of ties.
SELECT name, salary, RANK() OVER (ORDER BY salary DESC) AS rank, DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank, ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_number FROM employees;
How can you replace rows based on a condition while updating a table?
Table: employees (id, name, salary, dept_id)
**difficulty: ** medium
Use the UPDATE
statement with a CASE
expression in the SET
clause to selectively update rows based on conditions.
UPDATE employees SET salary = CASE WHEN dept_id = 101 THEN salary * 1.1 ELSE salary END WHERE dept_id IN (101, 102);
What are common SQL data types, and how are they used?
sales (id INT, product STRING, amount DOUBLE, sale_date TIMESTAMP
**difficulty: ** easy
-
INT
: Whole numbers -
STRING
: Text -
DOUBLE
: Decimal numbers -
TIMESTAMP
: Date and time
CREATE TABLE sales ( id INT, product STRING, amount DOUBLE, sale_date TIMESTAMP );
How do you query across multiple databases in Databricks?
Tables: sales, campaigns
Databases: sales_db, marketing_db
**difficulty: ** medium
-- Joining tables across databases SELECT s.id, s.amount, c.campaign_name FROM sales_db.sales s JOIN marketing_db.campaigns c ON s.id = c.sale_id;
What is advanced SQL functions like NTILE
?
Table: sales (id, amount, sale_date)
**difficulty: ** medium
NTILE
: Divide rows into quartiles
SELECT id, amount, NTILE(4) OVER (ORDER BY amount) AS quartile FROM sales;
What is the advanced SQL function LEAD
?
Table: sales (id, amount, sale_date)
**difficulty: ** medium
LEAD
: Access the next row’s value
SELECT id, amount, LEAD(amount) OVER (ORDER BY sale_date) AS next_amount FROM sales;
How can you use CASE
for advanced query logic?
Table: sales (id, amount)
**difficulty: ** easy
Use CASE
to create conditional columns or filters in queries.
SELECT id, amount, CASE WHEN amount > 100 THEN 'High' WHEN amount > 50 THEN 'Medium' ELSE 'Low' END AS sales_category FROM sales;
What does the LAG()
function do in SQL?
Table: sales (id, amount, sale_date)
**difficulty: ** medium
LAG()
retrieves the value of a column from the previous row in a window.
SELECT id, amount, sale_date, LAG(amount) OVER (ORDER BY sale_date) AS prev_sale FROM sales;
What does UNBOUNDED PRECEDING
mean in window functions?
sales (id, amount, sale_date)
**difficulty: ** hard
It refers to the start of the window, including all rows before the current one.
To create a cumulative sum with UNBOUNDED PRECEDING
:
SELECT id, amount, SUM(amount) OVER (ORDER BY sale_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cumulative_sales FROM sales;
How does PARTITION BY
work in SQL window functions?
Table: sales (id, amount, sale_date, region)
**difficulty: ** medium
PARTITION BY
divides data into partitions to apply window functions separately to each group.
SELECT region, id, amount, RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS region_rank FROM sales;
What is the difference between RANGE
and ROWS
in windowing?
Table: sales (id, amount, sale_date)
**difficulty: ** hard
ROWS
operates on a physical set of rows (e.g., current and previous).RANGE
operates on a logical range of values (e.g., same amount).
– Using ROWS
SELECT id, amount, SUM(amount) OVER (ORDER BY sale_date ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS moving_sum FROM sales;
– Using RANGE
SELECT id, amount, SUM(amount) OVER (ORDER BY sale_date RANGE BETWEEN INTERVAL 1 DAY PRECEDING AND CURRENT ROW) AS moving_sum FROM sales;
How do FIRST_VALUE()
and LAST_VALUE()
work in SQL?
Table: sales (id, amount, sale_date)
**difficulty: ** medium
FIRST_VALUE()
returns the first value in a window; LAST_VALUE()
returns the last value.
SELECT id, amount, sale_date, FIRST_VALUE(amount) OVER (ORDER BY sale_date) AS first_sale, LAST_VALUE(amount) OVER (ORDER BY sale_date ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS last_sale FROM sales;
How do you calculate a moving average in SQL using window functions?
Table: sales (id, amount, sale_date)
**difficulty: ** hard
Use AVG()
over a window frame to calculate a moving average.
– 3-day moving average
SELECT id, amount, sale_date, AVG(amount) OVER (ORDER BY sale_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg FROM sales;
How can you calculate percentiles with SQL window functions?
**difficulty: ** medium
Use PERCENT_RANK()
or CUME_DIST()
to calculate percentile rankings.
SELECT id, amount, PERCENT_RANK() OVER (ORDER BY amount) AS percent_rank, CUME_DIST() OVER (ORDER BY amount) AS cumulative_dist FROM sales;
What is the QUALIFY
statement in SQL, and how is it used?
sales (id, amount, sale_date)
**difficulty: ** medium
The QUALIFY
statement is used to filter rows after applying window functions, simplifying complex filtering logic.
In the example below, we only keep the row with the highest rank:
SELECT id, amount, RANK() OVER (ORDER BY amount DESC) AS rank FROM sales QUALIFY rank = 1;
Exercise
How would you calculate the total revenue for each product for the past 7 days?
Table Name: orders
- order_id
INT
- product_id
INT
- order_date
TIMESTAMP
- status
STRING (values: ‘completed’, ‘pending’, ‘cancelled’)
- total_amount
DOUBLE
Table Name: products
- product_id
INT
- product_name
STRING
- category
STRING
- price
DOUBLE
medium
SELECT p.product_name, SUM(o.total_amount) AS total_revenue FROM orders o JOIN products p ON o.product_id = p.product_id WHERE o.status = 'completed' AND o.order_date >= CURRENT_DATE - INTERVAL 7 DAY GROUP BY p.product_name;
How would you calculate the average order value for each product category in the last 60 days?
Table Name: orders
- order_id
INT
- product_id
INT
- order_date
TIMESTAMP
- status
STRING (values: ‘completed’, ‘pending’, ‘cancelled’)
- total_amount
DOUBLE
Table Name: products
- product_id
INT
- category
STRING
medium
SELECT p.category, AVG(o.total_amount) AS avg_order_value FROM orders o JOIN products p ON o.product_id = p.product_id WHERE o.status = 'completed' AND o.order_date >= CURRENT_DATE - INTERVAL 60 DAY GROUP BY p.category
How would you find the most recent order placed by each customer?
Table Name: orders
- order_id
INT
- customer_id
INT
- order_date
TIMESTAMP
- total_amount
DOUBLE
medium
SELECT customer_id, MAX(order_date) AS latest_order FROM orders GROUP BY customer_id;
How would you count the total number of orders by order status for the last 90 days?
Table Name: orders
- order_id
INT
- order_date
TIMESTAMP
- status
STRING (values: ‘completed’, ‘pending’, ‘cancelled’)
SELECT status, COUNT(order_id) AS total_orders FROM orders WHERE order_date >= CURRENT_DATE - INTERVAL 90 DAY GROUP BY status
How would you calculate the 7-day rolling average of total sales for each product category?
Table Name: orders
- order_id
INT
- product_id
INT
- order_date
TIMESTAMP
- total_amount
DOUBLE
Table Name: products
- product_id
INT
- category
STRING
advanced
SELECT p.category, o.order_date, AVG(o.total_amount) OVER (PARTITION BY p.category ORDER BY o.order_date RANGE BETWEEN INTERVAL 6 DAY PRECEDING AND CURRENT ROW) AS rolling_avg FROM orders o JOIN products p ON o.product_id = p.product_id;
How would you compare the current order total with the previous order total for each customer and calculate the difference?
Table Name: orders
- order_id
INT
- customer_id
INT
- order_date
TIMESTAMP
- total_amount
DOUBLE
advanced
SELECT customer_id, order_id, total_amount, LAG(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS previous_order, total_amount - LAG(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS difference FROM orders;
How would you calculate the cumulative sales total for each product category?
Table Name: orders
- order_id
INT
- product_id
INT
- order_date
TIMESTAMP
- total_amount
DOUBLE
Table Name: products
- product_id
INT
- category
STRING
advanced
SELECT p.category, o.order_date, SUM(o.total_amount) OVER (PARTITION BY p.category ORDER BY o.order_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cumulative_sales FROM orders o JOIN products p ON o.product_id = p.product_id
How would you retrieve the first order placed by each customer?
Table Name: orders
- order_id
INT
- customer_id
INT
- order_date
TIMESTAMP
- total_amount
DOUBLE
advanced
WITH ranked_orders AS ( SELECT customer_id, order_id, order_date, total_amount, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS row_num FROM orders ) SELECT customer_id, order_id, order_date, total_amount FROM ranked_orders WHERE row_num = 1
How would you rank products by total sales within each category for the last 30 days?
Table Name: orders
- order_id
INT
- product_id
INT
- order_date
TIMESTAMP
- total_amount
DOUBLE
Table Name: products
- product_id
INT
- category
STRING
advanced
SELECT p.category, p.product_id, SUM(o.total_amount) AS total_sales, RANK() OVER (PARTITION BY p.category ORDER BY SUM(o.total_amount) DESC) AS sales_rank FROM orders o JOIN products p ON o.product_id = p.product_id WHERE o.order_date >= CURRENT_DATE - INTERVAL 30 DAY GROUP BY p.category, p.product_id
How would you rank products by their sales total in each category, ensuring no gaps in ranking?
Table Name: orders
- order_id
INT
- product_id
INT
- order_date
TIMESTAMP
- total_amount
DOUBLE
Table Name: products
- product_id
INT
- category
STRING
advanced
SELECT p.category, p.product_id, SUM(o.total_amount) AS total_sales, DENSE_RANK() OVER (PARTITION BY p.category ORDER BY SUM(o.total_amount) DESC) AS sales_rank FROM orders o JOIN products p ON o.product_id = p.product_id GROUP BY p.category, p.product_id
How would you divide customers into 4 quartiles based on their total order value over the last 90 days?
Table Name: orders
- order_id
INT
- customer_id
INT
- order_date
TIMESTAMP
- total_amount
DOUBLE
advanced
SELECT customer_id, SUM(total_amount) AS total_spent, NTILE(4) OVER (ORDER BY SUM(total_amount) DESC) AS quartile FROM orders WHERE order_date >= CURRENT_DATE - INTERVAL 90 DAY GROUP BY customer_id
How would you calculate the total sales by category and overall sales using GROUPING SETS?
Table Name: orders
- order_id
INT
- product_id
INT
- order_date
TIMESTAMP
- total_amount
DOUBLE
Table Name: products
- product_id
INT
- category
STRING
advanced
SELECT p.category, SUM(o.total_amount) AS total_sales FROM orders o JOIN products p ON o.product_id = p.product_id GROUP BY GROUPING SETS (p.category, ())
How would you generate a list of all possible combinations of products and customers?
Table Name: orders
- order_id
INT
- customer_id
INT
- product_id
INT
Table Name: products
- product_id
INT
- category
STRING
Table Name: customers
- customer_id
INT
- customer_name
STRING
advanced
SELECT c.customer_id, p.product_id FROM customers c CROSS JOIN products p
How would you get a list of all customers with their order details (including those who didn’t place any order)?
Table Name: orders
- order_id
INT
- customer_id
INT
- order_date
TIMESTAMP
- total_amount
DOUBLE
Table Name: customers
- customer_id
INT
- customer_name
STRING
advanced
SELECT c.customer_name, o.order_id, o.total_amount FROM customers c FULL OUTER JOIN orders o ON c.customer_id = o.customer_id
How would you extract values from a JSON column to find the product name from a JSON structure in the orders table?
Table Name: orders
- order_id
INT
- order_date
TIMESTAMP
- product_info
STRING (JSON format: {"product_id": 1, "product_name": "Widget"}
)
advanced
SELECT order_id, JSON_EXTRACT(product_info, '$.product_name') AS product_name FROM orders
How would you retrieve only the latest order for each customer using the QUALIFY statement?
Table Name: orders
- order_id
INT
- customer_id
INT
- order_date
TIMESTAMP
- total_amount
DOUBLE
advanced
SELECT customer_id, order_id, order_date, total_amount FROM ( SELECT customer_id, order_id, order_date, total_amount, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS row_num FROM orders ) QUALIFY row_num = 1
How would you flatten an array of product IDs from the orders
table to see each individual product in the order?
Table Name: orders
- order_id
INT
- customer_id
INT
- order_date
TIMESTAMP
- product_ids
ARRAY<INT> (example: `[1, 2, 3]`)</INT>
medium
SELECT order_id, customer_id, order_date, EXplode(product_ids) AS product_id FROM orders
How would you join orders with a list of products using an exploded field for product_ids
in the orders
table?
Table Name: orders
- order_id
INT
- customer_id
INT
- order_date
TIMESTAMP
- product_ids
ARRAY<INT> (example: `[1, 2, 3]`)</INT>
Table Name: products
- product_id
INT
- product_name
STRING
advanced
SELECT o.order_id, o.customer_id, o.order_date, p.product_name FROM orders o LATERAL VIEW EXPLODE(o.product_ids) exploded AS product_id JOIN products p ON o.product_id = p.product_id
How would you join orders with products and filter only the orders containing a specific product (e.g., product_id = 2) from the exploded product_ids
field?
Table Name: orders
- order_id
INT
- customer_id
INT
- order_date
TIMESTAMP
- product_ids
ARRAY<INT> (example: `[1, 2, 3]`)</INT>
Table Name: products
- product_id
INT
- product_name
STRING
advanced
SELECT o.order_id, o.customer_id, o.order_date, p.product_name FROM orders o LATERAL VIEW EXPLODE(o.product_ids) exploded AS product_id JOIN products p ON exploded.product_id = p.product_id WHERE exploded.product_id = 2
How would you check the number of NULL values in the email
column of the users
table?
Table Name: users
- user_id
INT
- email
STRING
medium
SELECT COUNT(*) AS null_count FROM users WHERE email IS NULL
How would you identify rows with missing values in the first_name
or last_name
columns?
Table Name: users
- user_id
INT
- first_name
STRING
- last_name
STRING
medium
SELECT user_id, first_name, last_name FROM users WHERE first_name IS NULL OR last_name IS NULL;
How would you check for duplicate entries in the email
column of the users
table?
Table Name: users
- user_id
INT
- email
STRING
easy
SELECT email, COUNT(*) AS duplicate_count FROM users GROUP BY email HAVING COUNT(*) > 1
How would you find non-numeric values in a column amount
which should contain numeric data?
Table Name: transactions
- transaction_id
INT
- amount
STRING
advanced
SELECT transaction_id, amount FROM transactions WHERE NOT amount REGEXP '^[0-9]+(\.[0-9]+)?$'
How would you identify if there are duplicate rows based on the combination of email
and user_id
in the users
table?
Table Name: users
- user_id
INT
- email
STRING
medium
SELECT email, user_id, COUNT(*) AS duplicate_count FROM users GROUP BY email, user_id HAVING COUNT(*) > 1
How would you detect invalid date formats in the birth_date
column (which should contain valid DATE
values)?
Table Name: users
- user_id
INT
- birth_date
STRING
SELECT user_id, birth_date FROM users WHERE NOT ISDATE(birth_date)
How would you check if the values in the product_id
column in the orders
table match the valid IDs in the products
table?
Table Name: orders
- order_id
INT
- product_id
INT
Table Name: products
- product_id
INT
- product_name
STRING
SELECT o.order_id, o.product_id FROM orders o LEFT JOIN products p ON o.product_id = p.product_id WHERE p.product_id IS NULL
How would you identify orders with unusually high amounts greater than 1000 in the orders
table?
Table Name: orders
- order_id
INT
- total_amount
DOUBLE
SELECT order_id, total_amount FROM orders WHERE total_amount > 1000
Handling Missing Values (Imputation Techniques)
How would you fill in missing total_amount
values in the orders
table with the average total_amount
?
Table Name: orders
- order_id
INT
- total_amount
DOUBLE
SELECT order_id, COALESCE(total_amount, (SELECT AVG(total_amount) FROM orders)) AS filled_total_amount FROM orders
Handling Outliers in Numerical Data
How would you identify outliers in the total_amount
column of the orders
table using the Interquartile Range (IQR) method?
Table Name: orders
- order_id
INT
- total_amount
DOUBLE
advanced
WITH percentiles AS ( SELECT PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY total_amount) AS Q1, PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY total_amount) AS Q3 FROM orders ) SELECT order_id, total_amount FROM orders, percentiles WHERE total_amount < Q1 - 1.5 * (Q3 - Q1) OR total_amount > Q3 + 1.5 * (Q3 - Q1)
Removing Duplicates in a Large Dataset
How would you remove duplicate entries in the users
table, keeping only the first occurrence based on email
?
Table Name: users
- user_id
INT
- email
STRING
advanced
WITH ranked_users AS ( SELECT user_id, email, ROW_NUMBER() OVER (PARTITION BY email ORDER BY user_id) AS row_num FROM users ) DELETE FROM ranked_users WHERE row_num > 1;
Data Type Conversions and Implicit Casting
How would you convert the amount
column from STRING to FLOAT in the transactions
table and check for any conversion errors?
Table Name: transactions
- transaction_id
INT
- amount
STRING
SELECT transaction_id, CAST(amount AS FLOAT) AS amount_float FROM transactions WHERE TRY_CAST(amount AS FLOAT) IS NULL
Empty Strings vs NULL Values
How would you find rows where the email
is either NULL or an empty string in the users
table?
Table Name: users
- user_id
INT
- email
STRING
SELECT user_id, email FROM users WHERE email IS NULL OR email = ''
Data Redundancy
How would you identify redundant rows where the same email
appears multiple times in the users
table?
Table Name: users
- user_id
INT
- email
STRING
SELECT email, COUNT(*) AS duplicate_count FROM users GROUP BY email HAVING COUNT(*) > 1
A product manager needs to update the price of a product, insert a new product, and delete an outdated product in the products
table.
-
Insert a new product with
product_id
= 101,product_name
= ‘Wireless Mouse’, andprice
= 29.99. -
Update the price of the product with
product_id
= 101 to 24.99. -
Delete the product with
product_id
= 101 as it’s outdated.
Table Name: products
- product_id
INT
- product_name
STRING
- price
DOUBLE
easy
insert
INSERT INTO products (product_id, product_name, price) VALUES (101, 'Wireless Mouse', 29.99);
** update**
UPDATE products SET price = 24.99 WHERE product_id = 101;
delete
DELETE FROM products WHERE product_id = 101;