SQL & ORM Flashcards

1
Q

What is SQL?

A

Structured Query Language, is a language designed to allow both technical and non-technical users query, manipulate, and transform data from a relational database. And due to its simplicity, SQL databases provide safe and scalable storage for millions of websites and mobile applications.

!!As you might have noticed by now, SQL doesn’t require you to write the keywords all capitalized, but as a convention, it helps people distinguish SQL keywords from column and tables names, and makes the query easier to read.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is a relational DB?

A

A relational database represents a collection of related (two-dimensional) tables. Each of the tables are similar to an Excel spreadsheet, with a fixed number of named columns (the attributes or properties of the table) and any number of rows of data.

For example, if the Department of Motor Vehicles had a database, you might find a table containing all the known vehicles that people in the state are driving. This table might need to store the model name, type, number of wheels, and number of doors of each vehicle for example.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is a SQL query?

A

A query in itself is just a statement which declares what data we are looking for, where to find it in the database, and optionally, how to transform it before it is returned.

To retrieve data from a SQL database, we need to write SELECT statements

As we mentioned in the introduction, you can think of a table in SQL as a type of an entity (ie. Dogs), and each row in that table as a specific instance of that type (ie. A pug, a beagle, a different colored pug, etc). This means that the columns would then represent the common properties shared by all instances of that entity (ie. Color of fur, length of tail, etc).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What does it mean to query with a constraint?

A

to filter certain results from being returned, we need to use a WHERE clause in the query. The clause is applied to each row of data by checking specific column values to determine whether it should be included in the results or not.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

How do we filter duplicates from a row in SQL?

A

Use DISTINCT after SELECT query

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

How do we sort in SQL?

A

Use ORDER BY (col) ASC/DESC;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

How do we limit results to a Subset in SQL?

A

Using LIMIT (reduce num of rows) and OFFSET (specify where to begin)

the LIMIT and OFFSET are applied relative to the other parts of a query, they are generally done last after the other clauses have been applied

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is database Normalization?

A

Database normalization is useful because it minimizes duplicate data in any single table, and allows for data in the database to grow independently of each other (ie. Types of car engines can grow independent of each type of car). As a trade-off, queries get slightly more complex since they have to be able to find data from different parts of the database, and performance issues can arise when working with many large tables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Describe Multi-table queries with JOINs

A

Tables that share information about a single entity need to have a primary key that identifies that entity uniquely across the database. One common primary key type is an auto-incrementing integer (because they are space efficient), but it can also be a string, hashed value, so long as it is unique.

Using the JOIN clause in a query, we can combine row data across two separate tables using this unique key.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is an inner join?

A

Inner join produces only the set of records that match in both Table A and Table B.

The INNER JOIN is a process that matches rows from the first table and the second table which have the same key (as defined by the ON constraint) to create a result row with the combined columns from both tables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is the essence of joining a table?

A

Combining all pertinent columns into one big super table.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is an outer join?

A

Full outer join produces the set of all records in Table A and Table B, with matching records from both sides where available. If there is no match, the missing side will contain null.

When joining table A to table B, a LEFT JOIN simply includes rows from A regardless of whether a matching row is found in B. The RIGHT JOIN is the same, but reversed, keeping rows in B regardless of whether a match is found in A. Finally, a FULL JOIN simply means that rows from both tables are kept, regardless of whether a matching row exists in the other table.

When using any of these new joins, you will likely have to write additional logic to deal with NULLs in the result and constraints (more on this in the next lesson).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Why use Inner VS outer join

A

the INNER JOIN we used last lesson might not be sufficient because the resulting table only contains data that belongs in both of the tables.

If the two tables have asymmetric data, which can easily happen when data is entered in different stages, then we would have to use a LEFT JOIN, RIGHT JOIN or FULL JOIN instead to ensure that the data you need is not left out of the results.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Describe outer join

A

LEFT outer join includes rows from table A regardless of matching tables from B.
Left outer join produces a complete set of records from Table A, with the matching records (where available) in Table B. If there is no match, the right side will contain null.

RIGHT outer join does the opposite. It includes rows from table B regardless if it matches with table A.

FULL combines the entirety of both tables, but there may be a bunch of nulls.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

When to use NULL

A

Generally we want to reduce the possibility of null because of the attention they brgin when constructing queries and constraints. We normally want data-appropriate values like 0 or “”.
If we absolutely need to use NULL, IS/IS NOT NULL

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What are queries with Aggregates?

A

functions) that allow you to summarize information about a group of rows of data

AVG(), SUM(), etc…
each aggregate function is going to run on the whole set of result rows and return a single value. And like normal expressions, giving your aggregate functions an alias ensures that the results will be easier to read and process.

In addition to aggregating across all the rows, you can instead apply the aggregate functions to individual groups of data within that group (ie. box office sales for Comedies vs Action movies).

17
Q

What is the HAVING clause

A

The HAVING clause constraints are written the same way as the WHERE clause constraints, and are applied to the grouped rows. With our examples, this might not seem like a particularly useful construct, but if you imagine data with millions of rows with different properties, being able to apply additional constraints is often necessary to quickly make sense of the data.

18
Q

Describe the query order of execution

A
  1. FROM and JOINs
  2. WHERE
  3. GROUP BY
  4. HAVING
  5. SELECT
  6. DISTINCT
  7. ORDER BY
  8. LIMIT / OFFSET
19
Q

What is a Schema?

A

In SQL, the database schema is what describes the structure of each table, and the datatypes that each column of the table can contain.

20
Q

What is a self-join and how do I perform one?

A

A self join is exactly what it sounds like: an instance of a table joining with itself. The way you should visualize a self join for a given table is by imagining a join performed between two identical copies of that table.

Since we only deal with one table in a self join, we have to use aliases.

An alias is essentially a nickname for a table (or, in some cases, a column). This is necessary because the query processor needs to make a distinction between the duplicates of the same table to JOIN them. Keep in mind, the keyword AS is not necessary to alias tables or columns.

21
Q

What are subqueries? How and when do I use them?

A

A subquery can be referenced anywhere a normal table can be referenced. Inside a FROM clause, you can JOIN subqueries with other tables, inside a WHERE or HAVING constraint, you can test expressions against the results of the subquery, and even in expressions in the SELECT clause, which allow you to return data directly from the subquery. They are generally executed in the same logical order as the part of the query that they appear in, as described in the last lesson.

Because subqueries can be nested, each subquery must be fully enclosed in parentheses in order to establish proper hierarchy. Subqueries can otherwise reference any tables in the database, and make use of the constructs of a normal query (though some implementations don’t allow subqueries to use LIMIT or OFFSET).

22
Q

What is a correlated subquery?

A

A more powerful type of subquery is the correlated subquery in which the inner query references, and is dependent on, a column or alias from the outer query. Unlike the subqueries above, each of these inner queries need to be run for each of the rows in the outer query, since the inner query is dependent on the current outer query row.

23
Q

How is the IN operator related to subqueries?

A

the IN operator was used to test whether the column value in the current row existed in a fixed list of values. In complex queries, this can be extended using subqueries to test whether a column value exists in a dynamic list of values.

Select query with subquery constraint
SELECT *, …
FROM mytable
WHERE column
    IN/NOT IN (SELECT another_column
               FROM another_table);
When doing this, notice that the inner subquery must select for a column value or expression to produce a list that the outer column value can be tested against. This type of constraint is powerful when the constraints are based on current data.
24
Q

Describe SQL and its NULL and Ternary Logic

A

SQL uses ternary logic. This means that a conditional statement can evaluate to TRUE, FALSE or NULL (unknown). Whaaaa? 😮 And somehow NULL is still ‘falsy’? Unfortunately, this won’t be the only time you run into logic that defies intuition. Stay tuned for Javascript quirks.

If we ask if a NULL value == NULL, we will always get false. This is because NULL was derived to represent an unknown value. How can we know if two unknowns are the same? We can’t. Given that this sort of comparison doesn’t yield any useful information, always use IS NULL or IS NOT NULL in place of the traditional (== or !=) comparisons.

25
Q

What is COALESCE?

A

PostgreSQL COALESCE function syntax
The syntax of the COALESCE function is as follows:

1
COALESCE (argument_1, argument_2, …);
The COALESCE function accepts an unlimited number of arguments. It returns the first argument that is not null. If all arguments are null, the COALESCE function will return null.

The COALESCE function evaluates arguments from left to right until it finds the first non-null argument. All the remaining arguments from the first non-null argument are not evaluated.

The COALESCE function provides the same functionality as NVL or IFNULL function provided by SQL-standard. MySQL has IFNULL function, while Oracle provides NVL function.

We often use the COLAESCE function to substitute a default value for null values when we querying the data. For example, we want to display the excerpt from a blog post, if the excerpt is not provided, we can use the first 150 characters of the of the content of the post. To achieve this, we can use the COALESCE function as follows:

26
Q

What is an ORM?

A

An object relational mapping is the system that translates between SQL records and Ruby (or Java, or Lisp…) objects. The ActiveRecord ORM translates rows from your SQL tables into Ruby objects on fetch, and translates your Ruby objects back to rows on save. The ORM also empowers your Ruby classes with convenient methods to perform common SQL operations: for instance, if the table physicians contains a foreign key referring to offices, ActiveRecord will be able to provide your Physician class a method, #office, which will fetch the associated record. Using ORM, the properties and relationships of the objects in an application can be easily stored and retrieved from a database without writing SQL statements directly and with less overall database access code.

27
Q

What is RDBMS? And what are its advantages?

A

Relational database management System.

1) Data Structure - search columns for matching entries
2) Multi-User Access - smultaneous access
3) Priviledges - admin can restrict users via auth
4) Network Access - listens to requests on a network. Users do not need to use the physical computer and it allows devs to create tools to interact with the databases.
5) Speed - not the fastest (this is the tradeoff for simplicity). Optimizations make it palatable.
6) Maintenance - Built in automation or OS automation tools make testing and repairing easy for admins
7) Language - RDBMS support SQL. Many RDBMS add new methods and keyworrds to SQL. An example is Active Record.

28
Q

What is SQL lite?

A

A server-less RDBMS. Not sufficient for large scale apps.

29
Q

What is DDL?

A

Data Definition language. Defines the structure of our DB tables & where we build our tables.
CREATE table
DROP table

30
Q

What is DML?

A
Data manipulation Language
SELECT
INSERT
UPDATE
DELETE
31
Q

SQL Injection attack

A

One of the most common type of internet attack. The

“?” for our values and sanitize our query.

32
Q

Why is it recommended to use Indexing? How do we do it?

A

by definition, it indexes the primary key by id

it’s important to index columns that are heavily used for lookups in queries. When you index a column, it creates a sorted data structure with pointers to the actual table. Since it’s sorted, lookups can use binary search, which as you recall runs in O(log n) time. Log base 2 of 10 million is about 23, so as you can imagine, this improves database performance (and our career prospects) dramatically.

Remember though, before you go all index happy, that indices do have a cost. They make writes (INSERTs, DELETEs, and UPDATEs) a little more taxing because the index must be updated. Furthermore, optimizations made outside of the bottleneck are no optimizations at all. So it’s important to index the right things, or you might as well have indexed nothing.

On that note, foreign keys are pretty much always a good choice for indexing because they’re frequently used in both WHERE clauses and in JOIN conditions, both of which can be incredibly taxing when not indexed.

To index, we use an Active Record migration and the add index method (eg. AddUserIdIndexToModel)