Intermediate SQL Flashcards

1
Q

If I wanted to get only a certain number of records back, such as the top 5 salaries in my company, what command could I use?

A

To get a certain number of records back, such as the top 5 salaries in your company, you could use the LIMIT keyword in a SQL query.

Assuming you have a table called employees with columns name and salary, you could use the following query to retrieve the top 5 salaries:

SELECT name, salary FROM employees ORDER BY salary DESC LIMIT 5;

In this query, the ORDER BY clause orders the results by the salary column in descending order (highest salary first), and the LIMIT keyword limits the results to the top 5 records.

The syntax for limiting rows varies by database system: MySQL, PostgreSQL, and SQLite support LIMIT, SQL Server uses SELECT TOP (n), and Oracle and the SQL standard use FETCH FIRST n ROWS ONLY.
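
As a quick way to try the query above, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The employee names and salary figures are made-up sample data, not part of the original card.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")

# Hypothetical sample data: seven employees
rows = [("Ana", 90000), ("Ben", 72000), ("Cy", 120000), ("Dee", 65000),
        ("Eli", 101000), ("Fay", 88000), ("Gus", 54000)]
conn.executemany("INSERT INTO employees VALUES (?, ?)", rows)

# Sort highest salary first, then keep only the first five rows
top_five = conn.execute(
    "SELECT name, salary FROM employees ORDER BY salary DESC LIMIT 5"
).fetchall()
print(top_five)
```

Without the ORDER BY clause, LIMIT 5 would still return five rows, but which five is not guaranteed.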

2
Q

What is the purpose of a Foreign Key?

A

A foreign key is a database constraint that is used to enforce referential integrity between two tables in a relational database.

The purpose of a foreign key is to ensure that data in one table (the “child” table) corresponds to data in another table (the “parent” table) by establishing a link between the two tables.

When a foreign key is defined in a table, it references the primary key (or another unique key) of another table, and it ensures that only valid data can be inserted into the child table. Specifically, the foreign key ensures that any value inserted into the child table’s foreign key column matches a value that already exists in the referenced column of the parent table.

For example, if you have a table of orders and a table of customers, you might create a foreign key on the orders table that references the customer_id column in the customers table. This ensures that any order placed in the orders table must correspond to a valid customer in the customers table.

By enforcing referential integrity through foreign keys, you can ensure that your database contains accurate and consistent data, which is essential for data integrity and the reliability of your applications.
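
The orders/customers example can be sketched with Python's sqlite3 module. The column layout and sample values are assumptions for illustration. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON; most other database systems enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement

conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT)"
)
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
    )
""")

conn.execute("INSERT INTO customers VALUES (1, 'Acme')")
conn.execute("INSERT INTO orders VALUES (100, 1)")  # valid: customer 1 exists

try:
    conn.execute("INSERT INTO orders VALUES (101, 999)")  # no customer 999
    rejected = False
except sqlite3.IntegrityError as exc:
    rejected = True
    print("Rejected:", exc)
```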

3
Q

How would you write out a JOIN statement?

A

To write a JOIN statement in SQL, you would use the following syntax:

SELECT column_name(s)
FROM table1
JOIN table2
ON table1.column_name = table2.column_name;

In this syntax, JOIN is used to combine rows from two or more tables based on a related column between them. The ON keyword is used to specify the condition that determines how the tables are joined.

For example, let’s say you have two tables: orders and customers. If you wanted to join these tables based on the customer_id column, you could use the following query:

SELECT orders.order_id, orders.order_date, customers.customer_name
FROM orders
JOIN customers
ON orders.customer_id = customers.customer_id;

This query selects order_id and order_date from the orders table and customer_name from the customers table, combining the rows whose customer_id values match. A plain JOIN is an INNER JOIN, so orders without a matching customer are left out of the result.
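
To see the orders/customers join run end to end, here is a small sketch using Python's sqlite3 module. The table contents are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES
        (100, '2024-01-05', 1),
        (101, '2024-01-07', 2),
        (102, '2024-02-01', 1);
""")

# Each order row is paired with the customer row that has the same customer_id
joined = conn.execute("""
    SELECT orders.order_id, orders.order_date, customers.customer_name
    FROM orders
    JOIN customers
    ON orders.customer_id = customers.customer_id
    ORDER BY orders.order_id
""").fetchall()
print(joined)
```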

4
Q

What are a few of the different JOIN clauses and their differences?

A

In SQL, there are several types of JOIN clauses that can be used to combine rows from two or more tables. Here are a few of the most common JOIN clauses and their differences:

Inner Join: This type of join returns only the matching rows from both tables based on a specified join condition.

SELECT *
FROM table1
INNER JOIN table2
ON table1.column_name = table2.column_name;

Left Join: This type of join returns all the rows from the left table and matching rows from the right table based on a specified join condition. If there is no match in the right table, NULL values are returned.

SELECT *
FROM table1
LEFT JOIN table2
ON table1.column_name = table2.column_name;

Right Join: This type of join returns all the rows from the right table and matching rows from the left table based on a specified join condition. If there is no match in the left table, NULL values are returned.

SELECT *
FROM table1
RIGHT JOIN table2
ON table1.column_name = table2.column_name;

Full Outer Join: This type of join returns all the rows from both tables, pairing them up where the join condition matches. Where a row from either table has no match in the other, the columns from the other table are filled with NULL values.

SELECT *
FROM table1
FULL OUTER JOIN table2
ON table1.column_name = table2.column_name;

Each type of JOIN clause has its own specific use case, depending on the data you are working with and the results you want to achieve. It’s important to choose the right type of JOIN clause to get the desired results.
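
The difference between an inner and a left join is easiest to see on data where one row has no match. The sketch below uses Python's sqlite3 module with invented customers, one of whom has no orders. It covers INNER and LEFT only, because RIGHT and FULL OUTER JOIN are supported only in newer SQLite versions (3.39 and later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO orders VALUES (100, 1), (101, 2);  -- Initech has no orders
""")

query = """
    SELECT c.customer_name, o.order_id
    FROM customers c
    {join_type} JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id
"""
inner_rows = conn.execute(query.format(join_type="INNER")).fetchall()
left_rows = conn.execute(query.format(join_type="LEFT")).fetchall()

print(inner_rows)  # only customers with a matching order
print(left_rows)   # every customer; order_id is None where there is no match
```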

5
Q

A self join requires what to be able to join on itself?

A

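To perform a self join in SQL, you need a table with a column whose values can be matched against another column in the same table, and you must give the table two different aliases so the two copies can be told apart. Typically the matching column is a foreign key that refers to another row in the same table.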

For example, let’s say you have a table called employees that contains information about employees in a company. If each employee in the table has a manager who is also an employee, you could use a self join to connect each employee to their manager by joining the table to itself using the manager_id column:

SELECT e.employee_name, m.employee_name AS manager_name
FROM employees e
JOIN employees m
ON e.manager_id = m.employee_id;

In this query, the employees table is joined to itself using the manager_id column, which contains the employee_id of the employee’s manager. By aliasing one copy of the employees table as e (the employee) and the other as m (the manager), we can select both the employee’s name and their manager’s name in the same query.

So in summary, a self join requires a column that relates rows of a table to other rows of the same table (usually a self-referencing foreign key) and table aliases to distinguish the two copies of the table.
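
Here is the employee/manager query as a runnable sketch using Python's sqlite3 module. The four employees are hypothetical sample data; Rosa has no manager, so her manager_id is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        employee_id   INTEGER PRIMARY KEY,
        employee_name TEXT,
        manager_id    INTEGER REFERENCES employees(employee_id)
    );
    INSERT INTO employees VALUES
        (1, 'Rosa', NULL),
        (2, 'Sam', 1),
        (3, 'Tia', 1),
        (4, 'Uma', 2);
""")

# e and m are two aliases for the same table
pairs = conn.execute("""
    SELECT e.employee_name, m.employee_name AS manager_name
    FROM employees e
    JOIN employees m
    ON e.manager_id = m.employee_id
    ORDER BY e.employee_id
""").fetchall()
print(pairs)
```

Rosa does not appear in the result because an inner join drops rows whose manager_id matches nothing; a LEFT JOIN would keep her with a NULL manager name.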

6
Q

What is a cross join?

A

A cross join, also known as a Cartesian join, is a type of join operation in SQL that returns the Cartesian product of the two tables being joined. In other words, it returns all possible combinations of rows from each table.

To perform a cross join, you can use the CROSS JOIN keyword in your SQL query, like this:

SELECT *
FROM table1
CROSS JOIN table2;

In this example, the CROSS JOIN operation combines every row from table1 with every row from table2. The resulting table will have a number of rows equal to the number of rows in table1 multiplied by the number of rows in table2.

Cross joins are not commonly used in practice, as they tend to produce very large result sets and can be slow to execute on large tables. However, they can be useful in certain situations, such as when you need to generate all possible combinations of data for testing or analysis purposes.
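
A small sketch of the row-count rule, using Python's sqlite3 module. The sizes and colors tables are invented for illustration: three sizes crossed with two colors should give six combinations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sizes (size TEXT);
    CREATE TABLE colors (color TEXT);
    INSERT INTO sizes VALUES ('S'), ('M'), ('L');
    INSERT INTO colors VALUES ('red'), ('blue');
""")

# Every size is paired with every color: 3 x 2 rows
combos = conn.execute(
    "SELECT size, color FROM sizes CROSS JOIN colors"
).fetchall()
print(len(combos), combos)
```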

7
Q

What is normalization?

A

Normalization is the process of organizing data in a database to reduce redundancy and undesirable dependencies between columns. It is part of the database design process and aims to eliminate data anomalies and inconsistencies and to improve data integrity.

Normalization involves breaking down a table into smaller, more manageable tables, and defining relationships between them. There are several normal forms in database normalization, each with a set of rules that need to be followed to achieve the desired level of normalization.

The most commonly used normal forms are:

First Normal Form (1NF): Each column holds a single, atomic value, with no repeating groups or multi-valued columns, and each row can be uniquely identified.

Second Normal Form (2NF): This involves ensuring that each non-key column in a table is dependent on the entire primary key, rather than just part of it.

Third Normal Form (3NF): This involves removing columns that depend on another non-key column rather than directly on the primary key (transitive dependencies), and placing them in separate tables.

The goal of normalization is to improve data integrity and reduce the likelihood of data anomalies and inconsistencies. By organizing data in a logical, consistent manner, it becomes easier to manage and maintain, and also provides a more efficient way of retrieving data. However, normalization can also increase the complexity of a database schema, which can make it more difficult to write complex queries and maintain over time.

8
Q

What is multiplicity/cardinality?

A

In the context of data modeling and database design, multiplicity (also called cardinality) refers to the number of instances of one entity that can be related to another entity. It describes the relationship between two entities and the number of occurrences of one entity that can be associated with one occurrence of another entity.

There are several types of multiplicity:

One-to-one (1:1): In this type of relationship, one instance of an entity is associated with one instance of another entity.

One-to-many (1:N): In this type of relationship, one instance of an entity can be associated with multiple instances of another entity, but each instance of the second entity can only be associated with one instance of the first entity.

Many-to-one (N:1): In this type of relationship, multiple instances of an entity can be associated with one instance of another entity.

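Many-to-many (M:N): In this type of relationship, multiple instances of one entity can be associated with multiple instances of another entity. In a relational database this is usually implemented with a third, junction table that holds a foreign key to each side.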

Multiplicity is important in database design because it helps to define the rules for how data can be related between different entities. It helps to ensure data integrity by preventing data inconsistencies and anomalies, and it also helps to optimize the database schema for performance and scalability.
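
As one illustration of how cardinality shapes a schema, the sketch below models a many-to-many relationship with a junction table, using Python's sqlite3 module. The students, courses, and enrollments tables and their contents are hypothetical examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses (course_id INTEGER PRIMARY KEY, title TEXT);
    -- Junction table: one row per (student, course) pairing
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(student_id),
        course_id  INTEGER REFERENCES courses(course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO courses VALUES (10, 'SQL'), (20, 'Python');
    INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
""")

# A student can take many courses, and a course can have many students
pairs = conn.execute("""
    SELECT s.name, c.title
    FROM enrollments e
    JOIN students s ON s.student_id = e.student_id
    JOIN courses c ON c.course_id = e.course_id
    ORDER BY s.name, c.title
""").fetchall()
print(pairs)
```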

9
Q

What is the purpose of using an alias in SQL?

A

A SQL alias is a temporary name assigned to a table or a column in a SQL query. The primary purpose of using aliases in SQL is to provide a shorter or more readable name for a table or a column. By using aliases, you can make your SQL queries more concise and easier to read, especially when dealing with long or complex queries.

But aliases also have other important uses. For example, when joining tables, aliases can help to differentiate between columns with the same name in different tables. Aliases can also be used to rename columns or to create calculated columns, which can be useful for data manipulation and analysis.

In summary, using aliases in SQL can improve the readability, clarity, and organization of your SQL queries, as well as provide additional functionality and flexibility for data manipulation and analysis.
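
A short sketch of both kinds of alias, using Python's sqlite3 module: a table alias (e) and two column aliases, one of which names a calculated column. The employees table and its single row are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employees VALUES ('Ana', 60000)")

cur = conn.execute(
    "SELECT e.name AS employee, e.salary / 12.0 AS monthly_salary "
    "FROM employees AS e"
)
# The aliases become the column names of the result set
column_names = [col[0] for col in cur.description]
row = cur.fetchone()
print(column_names, row)
```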

10
Q

What is a view?

A

In SQL, a view is a virtual table based on the result of a SELECT query. It is a named query that is stored in the database and can be used like a table in SQL queries, but does not actually store any data.

The purpose of a view is to provide a way to simplify complex SQL queries, by allowing the user to define a named query that can be reused in multiple SQL statements. A view can also be used to restrict access to sensitive data, by providing a subset of data from a table and hiding certain columns or rows.

A view is created by defining a SELECT statement, and the resulting virtual table can be queried like any other table in the database. Views can also be used to encapsulate complex business logic, calculations, and data transformations, making them a powerful tool for data modeling and analysis.

Views are commonly used in database design and development, especially in scenarios where complex queries are required or where security and data access control are a concern.

The basic syntax for creating a view is:

CREATE VIEW view_name AS
SELECT column1, column2, …
FROM table_name
WHERE condition;
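
The following sketch creates and queries a view with Python's sqlite3 module. The employees table, its rows, and the view name sales_directory are hypothetical; the view exposes only Sales staff and hides the salary column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 'Sales', 50000),
        ('Ben', 'Sales', 70000),
        ('Cy', 'Engineering', 100000);

    -- The view stores the query, not the data
    CREATE VIEW sales_directory AS
        SELECT name, department
        FROM employees
        WHERE department = 'Sales';
""")

# The view is queried like an ordinary table
directory = conn.execute(
    "SELECT * FROM sales_directory ORDER BY name"
).fetchall()
print(directory)
```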

11
Q

What is the purpose of constraints?

A

In the context of database design and development, constraints are rules that are enforced on the data in a database to ensure data integrity and consistency. Constraints define limitations and requirements that the data must meet, and prevent the database from accepting data that does not meet those requirements.

The purpose of constraints is to ensure that the data in a database is accurate, consistent, and valid. Constraints can help to prevent data entry errors, protect the integrity of the data, and enforce business rules and logic.

There are several types of constraints that can be applied to a database table, including:

Primary key constraint - ensures that each record in a table has a unique identifier.
Foreign key constraint - ensures that values in a column match values in another table’s primary key column.
Unique constraint - ensures that each value in a column is unique.
Check constraint - ensures that the data in a column meets a specified condition or range of values.
Not null constraint - ensures that a column cannot contain null values.

Constraints play a critical role in ensuring the accuracy, consistency, and reliability of data in a database, and are an essential part of database design and development.
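
To show several of these constraints rejecting bad data, here is a minimal sketch using Python's sqlite3 module. The products table and its values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        sku        TEXT NOT NULL UNIQUE,
        price      REAL CHECK (price >= 0)
    )
""")
conn.execute("INSERT INTO products VALUES (1, 'A-100', 9.99)")  # valid row

bad_rows = [
    (2, "A-100", 5.0),   # duplicate sku: violates UNIQUE
    (3, None, 5.0),      # missing sku: violates NOT NULL
    (4, "B-200", -1.0),  # negative price: violates CHECK
    (1, "C-300", 5.0),   # duplicate id: violates PRIMARY KEY
]
rejected = []
for row in bad_rows:
    try:
        conn.execute("INSERT INTO products VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError as exc:
        rejected.append(str(exc))

print(rejected)
```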

12
Q

When would we use a nested query?

A

A nested query, also known as a subquery, is a query within another query. It is used to retrieve data from one or more tables, and then use that data in a subsequent query to further filter, sort, or group the results.

There are several situations where a nested query might be useful:

To filter data based on the results of another query - for example, to find all employees who have a higher salary than the average salary for their department.

To compare data across different tables - for example, to find all customers who have placed an order in the last month, but who have not previously placed an order.

To perform aggregate functions on a subset of data - for example, to find the total sales for a particular product, but only for orders placed in a certain region.

To retrieve data that cannot be obtained with a simple query - for example, to find the second highest salary in a department, which requires sorting the data and then selecting the second row.

Nested queries are a powerful tool for retrieving and manipulating data in a database, and are commonly used in complex data analysis and reporting scenarios. They allow you to combine multiple queries into a single statement and perform multi-step operations on the data.
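
The first situation in the list, employees earning more than their department's average, can be sketched with a correlated subquery. The example uses Python's sqlite3 module, and the employees table and salary figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 'Sales', 50000),
        ('Ben', 'Sales', 70000),
        ('Cy', 'Engineering', 100000),
        ('Dee', 'Engineering', 120000),
        ('Eli', 'Engineering', 95000);
""")

# The inner query runs once per outer row, averaging that row's department
above_average = conn.execute("""
    SELECT name
    FROM employees e
    WHERE salary > (
        SELECT AVG(salary)
        FROM employees
        WHERE department = e.department
    )
    ORDER BY name
""").fetchall()
print(above_average)
```

With this data the Sales average is 60000 and the Engineering average is 105000, so only Ben and Dee qualify.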

13
Q

What is the purpose of Auto-Increment / Serial?

A

Auto-increment/serial is a feature in a database that automatically generates a unique, incremental value for a specific column in a table. This is typically used to provide a primary key value for each row in the table, ensuring that each record has a unique identifier.

The purpose of using an auto-increment/serial column is to simplify data management, improve data consistency, and ensure that there are no duplicate values in the primary key column. This is especially useful when working with large databases that contain many records, where it can be difficult to manually manage primary key values.

When a new row is inserted into a table with an auto-increment/serial column, the database automatically generates a new value for that column, starting from a specified initial value and typically incrementing by one for each subsequent row. This ensures that each new record has a unique primary key value, even if multiple records are inserted at the same time. The values are not guaranteed to be gap-free, since deleted rows or rolled-back inserts can leave numbers unused.

Overall, auto-increment/serial is a simple and effective way to manage primary key values in a database, making it easier to maintain data integrity and consistency.
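
Here is a minimal sketch using Python's sqlite3 module. It relies on SQLite's spelling, INTEGER PRIMARY KEY AUTOINCREMENT; other systems use different keywords, such as AUTO_INCREMENT in MySQL or SERIAL in PostgreSQL. The customers table and names are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY AUTOINCREMENT,
        name        TEXT NOT NULL
    )
""")

new_ids = []
for name in ("Ana", "Ben", "Cy"):
    # customer_id is omitted, so the database generates it
    cur = conn.execute("INSERT INTO customers (name) VALUES (?)", (name,))
    new_ids.append(cur.lastrowid)

print(new_ids)
```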

14
Q

What is referential integrity?

A

Referential integrity is a concept in database design that ensures that relationships between tables remain consistent and valid. It is based on the idea that if one table contains a foreign key that references a primary key in another table, the values of the foreign key must correspond to the values of the primary key in the other table.

In practice, referential integrity is enforced through the use of constraints, such as foreign key constraints, which ensure that data in related tables remains consistent. For example, if a foreign key in one table references a primary key in another table, by default the database rejects any update or deletion in the parent table that would leave rows in the child table pointing at a key that no longer exists. Options such as ON DELETE CASCADE or ON DELETE SET NULL tell the database to delete or null out the child rows instead.

The purpose of referential integrity is to maintain data consistency and prevent errors in the database. By ensuring that relationships between tables are valid and consistent, referential integrity helps to prevent data corruption, inconsistencies, and other issues that can lead to errors and inaccuracies in the database.

Overall, referential integrity is an important concept in database design and plays a critical role in ensuring the accuracy and consistency of data in a database.
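
The sketch below shows the database refusing to delete a parent row that is still referenced, using Python's sqlite3 module. The customers and orders tables are sample data, and the foreign key uses the default behavior with no ON DELETE option. SQLite needs PRAGMA foreign_keys = ON for this check to run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
    );
    INSERT INTO customers VALUES (1, 'Acme');
    INSERT INTO orders VALUES (100, 1);
""")

try:
    # Order 100 still points at customer 1, so this would orphan it
    conn.execute("DELETE FROM customers WHERE customer_id = 1")
    blocked = False
except sqlite3.IntegrityError:
    blocked = True

# Removing the child row first makes the parent deletable
conn.execute("DELETE FROM orders WHERE customer_id = 1")
conn.execute("DELETE FROM customers WHERE customer_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(blocked, remaining)
```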

15
Q

What does it mean for something to be atomic?

A

In the context of databases, atomic refers to the indivisible and irreducible nature of an operation. An atomic operation is one that is either executed in full or not at all.

In databases, atomicity is often associated with transactions, which are a series of database operations that are executed as a single unit of work. This means that either all operations within a transaction are executed successfully, or none of them are. If any part of a transaction fails, all changes made during the transaction are rolled back and the database is returned to its previous state.

The concept of atomicity is crucial in ensuring data consistency and reliability. If a failure occurs partway through a transaction, such as a crash between debiting one account and crediting another, the partial changes are undone rather than left in place, which prevents corrupted or inconsistent data.

Overall, atomicity is a fundamental property of database transactions, and is an important aspect of ensuring data integrity and reliability in database systems.
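
A minimal sketch of all-or-nothing behavior, using Python's sqlite3 module. The accounts table, the balances, and the CHECK constraint that forbids negative balances are assumptions made for the example. The second UPDATE fails, so the first one is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        name    TEXT PRIMARY KEY,
        balance INTEGER CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE accounts SET balance = balance + 150 WHERE name = 'bob'")
        # Alice only has 100, so this violates the CHECK constraint
        conn.execute("UPDATE accounts SET balance = balance - 150 WHERE name = 'alice'")
except sqlite3.IntegrityError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
```

Bob's credit succeeded on its own, but because it was part of the same transaction as the failed debit, neither change is kept.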

16
Q

What does transitive dependency mean?

A

Transitive dependency is a term used in database normalization to describe a type of relationship between columns in a table. It occurs when a non-key column is functionally dependent on another non-key column, which is itself functionally dependent on the primary key column of the same table.

To put it simply, a transitive dependency occurs when a column value can be determined by another non-key column value, which in turn can be determined by the primary key value. This type of dependency can lead to data redundancy and inconsistency, and can make it difficult to update or delete records without affecting other records in the table.

The solution to a transitive dependency is to split the table into multiple tables, each with a single purpose and containing only columns that are functionally dependent on that table’s primary key. This process is part of normalization and helps to reduce data redundancy, improve data consistency, and simplify data management.

Overall, understanding transitive dependency and normalization is an important aspect of database design and can help to ensure efficient and effective use of data.
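
As an illustration, suppose an employees table stored both department_id and department_name. The name depends on department_id, which in turn depends on employee_id, so it is a transitive dependency. The sketch below shows the split version using Python's sqlite3 module; all table names, columns, and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Before the split: employees(employee_id, name, department_id, department_name)
# After the split: department_name lives only in departments
conn.executescript("""
    CREATE TABLE departments (
        department_id   INTEGER PRIMARY KEY,
        department_name TEXT NOT NULL
    );
    CREATE TABLE employees (
        employee_id   INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        department_id INTEGER REFERENCES departments(department_id)
    );
    INSERT INTO departments VALUES (10, 'Sales');
    INSERT INTO employees VALUES (1, 'Ana', 10), (2, 'Ben', 10);
""")

# Renaming the department is now a single-row update
conn.execute(
    "UPDATE departments SET department_name = 'Revenue' WHERE department_id = 10"
)
names = conn.execute("""
    SELECT e.name, d.department_name
    FROM employees e
    JOIN departments d ON d.department_id = e.department_id
    ORDER BY e.employee_id
""").fetchall()
print(names)
```

In the unsplit table the same rename would have to touch every employee row in the department, and missing one would leave the data inconsistent.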

17
Q

What is a Scalar function? Give some examples of Scalar functions, and how you could use them.

A

A scalar function is a type of function in SQL that returns a single value based on the input arguments. Scalar functions are commonly used to perform calculations, manipulate data, or validate input parameters. They can be called from within SELECT statements, WHERE clauses, and other SQL statements where a single value is required.

Here are a few examples of scalar functions and how they can be used:

ABS(): returns the absolute value of a given numeric expression. This can be useful when you need to calculate the distance between two points or the difference between two values regardless of their sign.

UPPER()/LOWER(): converts a string to all uppercase or lowercase letters, respectively. This can be useful for formatting or standardizing data, such as when you need to compare or sort strings that are in different cases.

LEN(): returns the length of a given string expression. This can be useful when you need to validate input parameters or ensure that data fits within a certain field size or data type. (LEN() is the SQL Server name; MySQL, PostgreSQL, Oracle, and SQLite use LENGTH().)

DATEPART(): returns a specified part of a date or time value, such as year, month, day, hour, minute, or second. This can be useful when you need to extract or manipulate specific parts of a datetime value. (DATEPART() is specific to SQL Server; standard SQL uses EXTRACT(), as in EXTRACT(YEAR FROM order_date).)

Overall, scalar functions are a powerful tool in SQL programming and can help to simplify complex calculations or data manipulations. Used well, they reduce code complexity and make statements easier to read. Note that wrapping an indexed column in a function inside a WHERE clause can prevent the database from using that index, so apply them with care in filters.
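
Here is a quick sketch of scalar functions in action, using Python's sqlite3 module. Function names vary by database: this example uses SQLite's LENGTH() and strftime(), which play the roles that LEN() and DATEPART() play in SQL Server. The literal values are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each function takes one input value and returns one output value
result = conn.execute("""
    SELECT ABS(-7),
           UPPER('sql'),
           LOWER('SQL'),
           LENGTH('flashcard'),
           strftime('%Y', '2024-03-15')
""").fetchone()
print(result)
```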

18
Q

What is an Aggregate function? Give some examples of Aggregate functions, and how you could use them.

A

An Aggregate function is a type of SQL function that operates on a set of values to return a single value. It is often used in conjunction with the GROUP BY clause to perform calculations on groups of rows in a table.

Examples of Aggregate functions include:

COUNT - returns the number of rows in a table or the number of non-null values in a specific column.
SUM - calculates the sum of all the values in a specified column.
AVG - calculates the average of all the values in a specified column.
MIN - returns the smallest value in a specified column.
MAX - returns the largest value in a specified column.

For example, if you have a table of sales data, you could use the SUM function to calculate the total sales for each region by grouping the data by region:

SELECT region, SUM(sales)
FROM sales_data
GROUP BY region;

This would return a result set with two columns: region and the sum of sales for each region.

Aggregate functions can also be applied to expressions rather than plain columns. For example, assuming each row also has a days column holding the number of days the sales figure covers, you could use the AVG function to calculate the average sales per day for each region:

SELECT region, AVG(sales / days)
FROM sales_data
GROUP BY region;

This would return a result set with two columns: region and the average sales per day for each region. If sales and days are both integers, some databases perform integer division, so cast one of them to a decimal type first.
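
The grouped query can be tried with Python's sqlite3 module. The sketch below uses a sales_data table with invented figures and computes COUNT, SUM, and MAX per region in one pass.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_data (region TEXT, sales INTEGER);
    INSERT INTO sales_data VALUES
        ('East', 100), ('East', 250),
        ('West', 300), ('West', 50), ('West', 150);
""")

# One output row per region; each aggregate collapses that region's rows
summary = conn.execute("""
    SELECT region, COUNT(*), SUM(sales), MAX(sales)
    FROM sales_data
    GROUP BY region
    ORDER BY region
""").fetchall()
print(summary)
```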

19
Q

What is an Index?

A

An index is a database object that speeds up data retrieval on a table. It works like the index in a book: the book’s index lists the page numbers for each term so you can jump straight to the right page, and a database index stores the location of rows by column value so a query can find matching records without scanning the whole table.

Indexes are created on one or more columns of a table, and they can significantly speed up queries that filter (WHERE), join (JOIN), or sort (ORDER BY) on those columns. The trade-off is extra storage and slower INSERT, UPDATE, and DELETE operations, because every index must be updated whenever the underlying data changes.
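
One way to check whether a query uses an index is to ask the database for its query plan. The sketch below does this with Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN; other systems have their own EXPLAIN variants. The employees table, the generated rows, and the index name idx_employees_department are hypothetical, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO employees (name, department) VALUES (?, ?)",
    [(f"emp{i}", f"dept{i % 10}") for i in range(1000)],
)

# Index the column used in the WHERE clause
conn.execute("CREATE INDEX idx_employees_department ON employees (department)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM employees WHERE department = 'dept3'"
).fetchall()
# The last column of each plan row is a human-readable description
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)
```

If the plan mentions the index name, the lookup goes through the index instead of scanning every row.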

20
Q

What are Set Operations? How do they differ from Join statements?

A

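Set operations in SQL are used to combine the results of two or more SELECT statements into a single result set. The set operations include UNION, UNION ALL, INTERSECT, and EXCEPT (Oracle has traditionally used MINUS in place of EXCEPT). Each SELECT must return the same number of columns with compatible data types.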

UNION: combines the results of two or more SELECT statements into a single result set that includes all the rows from each SELECT statement, eliminating duplicates.

UNION ALL: combines the results of two or more SELECT statements into a single result set that includes all the rows from each SELECT statement, including duplicates.

INTERSECT: returns only the rows that appear in both SELECT statements.

EXCEPT: returns only the rows that appear in the first SELECT statement but not in the second SELECT statement.

Join statements, on the other hand, are used to combine rows from two or more tables based on a related column between them. The join types include INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN. The key difference is direction: a join places columns from different tables side by side in each result row, whereas a set operation stacks the rows of one SELECT result on top of (or compares them against) the rows of another.

In short: set operations (UNION, UNION ALL, INTERSECT, EXCEPT) combine the rows of two or more SELECT results, with UNION removing duplicates and UNION ALL keeping them, while joins (INNER, LEFT, RIGHT, FULL OUTER) match rows from different tables on a related column and place their columns side by side.
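
The four set operations can be compared on the same pair of inputs. The sketch below uses Python's sqlite3 module; the customers and subscribers tables, their contents, and the names() helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (name TEXT);
    CREATE TABLE subscribers (name TEXT);
    INSERT INTO customers VALUES ('ana'), ('ben'), ('cy');
    INSERT INTO subscribers VALUES ('ben'), ('cy'), ('dee');
""")

def names(operator):
    """Run the two SELECTs combined with the given set operator."""
    sql = (
        f"SELECT name FROM customers {operator} "
        "SELECT name FROM subscribers ORDER BY name"
    )
    return [row[0] for row in conn.execute(sql)]

union_rows = names("UNION")          # all names, duplicates removed
union_all_rows = names("UNION ALL")  # all names, duplicates kept
intersect_rows = names("INTERSECT")  # names in both tables
except_rows = names("EXCEPT")        # customers who are not subscribers

print(union_rows, union_all_rows, intersect_rows, except_rows)
```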