SQL/Databases Flashcards
creating tables
- Use CREATE TABLE statement
- First column is usually the primary key
Syntax
CREATE TABLE table_name (
id INTEGER PRIMARY KEY,
col1 TYPE,
col2 TYPE
)
primary key
- Unique identifier for a table record
- Added to column during creation of table
Syntax [2]:
- id INTEGER PRIMARY KEY
- id INTEGER PRIMARY KEY AUTOINCREMENT
- This syntax will leave id creation to the database
INSERT INTO
- Adds data to database
Syntax [2]:
INSERT INTO table_name
VALUES (datum1, datum2, datum3);
INSERT INTO table_name (col2, col3)
VALUES (datum2, datum3);
- With this syntax, can exclude certain values when inserting data, in particular the id
- Must have set PRIMARY KEY to AUTOINCREMENT during table creation
SELECT
- Used to get data from table
- add DISTINCT to return only unique results
Syntax:
SELECT col1, col2, col3 FROM table_name
clauses
- basically sql commands
4 clauses:
- DISTINCT
- Add to SELECT to remove duplicates
- SELECT DISTINCT city FROM country
- Add to COUNT clause to only count values once
- SELECT COUNT (DISTICT city) FROM country
- Add to SELECT to remove duplicates
- ORDER BY col_name
- sorts results
- sort in reverse order by adding negative
- WHERE col_name ><= value
- filters out results
- HAVING agg_col ><= value
- just like where, but used on alias of column created by aggregate function (agg_col)
- ex: SELECT SUM(sales) as total_sales FROM store HAVING total_sales>1000
aggregate functions
- functions that combine values from multiple rows
- Functions
- SUM
- MAX
- MIN
- COUNT
- AVG
SELECT FUNC(col_name) FROM table_name
GROUP BY
- used in conjuctino with aggregate functions
- basically, creates groups on which aggregate functions are performed
- added to the end of a SELECT statement
ex: SELECT SUM(quantity) FROM groceries GROUP BY aisle;
* instead of providing the sum over entire column, will generate sum for each aisle
operators
- Use with WHERE to add additional conditions to filter by
- Can include parantheses to control how logic is evaluated
5 operations
- AND
- Returns rows if all conditions are met
- WHERE col1>5 AND col2<5
- OR
- Returns rows if any of conditions are met
- WHERE col1>5 OR col2=10
- IN
- Shorthand for multiple or operators
- Returns rows if the column matches any of the values in the list
- WHERE genre IN (“disco”, “jazz”, “funk”)
- Values inside parentheses can also be a subquery
- NOT IN
- Like IN, but returns row if none of conditions are met
- WHERE genre NOT IN (“electronic”, “new age”)
- LIKE
- Used for inexact matching
- SELECT * FROM genre WHERE genre LIKE “%hip%”
- Often used with percentage signs (which act as wildcards)
subqueries
- using the result of on query inside another query
- the queries can be from different tables
- often used with IN operator or FROM
- ex: SELECT * FROM music WHERE artist IN (SELECT stars FROM billboards);
AS
- Assigns an alias to a column, usually an aggregate function column
- Used to rename columns on the fly
- SELECT col1 AS new_name
CASE
- Adds a label to a record based on some condition
- SQL’s way of handling if/then logic
- creates a new column, like an aggregate function
Syntax:
CASE
WHEN col1>‘val1’ THEN ‘resultA’
ELSE ‘resultB’
END AS col_output
- 4 parts: WHEN, THEN, ELSE, END AS
- WHEN: evaluate the row
- THEN: assigns a value if condition is met
- ELSE: assign as value if condition isn’t met
- END AS: assigns an alias to the new column
cross join
- naively matches every row with every other row
- put both table names after FROM
SELECT * FROM table_1, table_2
order of execution
- Order in which sql actions are performed
- FROM
- ON
- OUTER
- WHERE
- GROUP BY
- HAVING
- SELECT
- DISTINCT
- ORDER BY
inner join
- finds records that have matching values in both tables
- and combines them in one row in output
- query will only return records from either table when match is found
2 approaches:
1. Implicit inner join
SELECT *
FROM table_1, table_2
WHERE table_1.col = table_2.col
- Explicit inner join*
SELECT *
FROM table1
INNER JOIN table2
ON table_1.col = table2.col
- Explicit inner join is best practice
outer join
- find records that have matching values in both tables
- and combines them in one row in output
- but keeps all rows from one table
- and populates missing field in the other table with null
2 flavors:
- left outer join
- returns left table, and matches in right table
- right outer join
- returns right table, and matches in left table
Syntax [very similar to inner join]:
SELECT *
FROM table1
LEFT OUTER JOIN table2
ON table_1.col = table2.col
SQL query lifecycle
3 steps
- Parse
- makes sure that query is syntactically correct and that tables exist
- turns query into algebraic expression
- Optimize
- optimizes command (ex, turns 5*10 into 50)
- estimates cost (CPU and time) for each potential approach to fulfilling query
- chooses optimal approach
- Execute
- takes plan and turns it into operations on database
query tuning
- process of optimizing queries
Steps
- Use an SQL profiler to identity which queries are taking the longest amount of time
- Get execution plan for a specific command
- syntax varies based on environment
- ex: EXPLAIN QUERY PLAN select * from movies
- Optimize query manually
changing records
- Use UPDATE
- changes data in database
- tell databse what content to change (with SET) and which row to update (with WHERE)
Syntax
UPDATE table_name
SET co1_1 = val_1, col_2 = val_2,
WHERE condition;
deleting records
- Use DELETE
- remove records from database
- if WHERE is omitted, will remove all records
Syntax:
DELETE FROM table_name
WHERE condition;
changing schema
- ALTER TABLE
- changes the composition of the table after creation
- adds a new column
Syntax:
ALTER TABLE table_name
ADD col_name TYPE;
deleteing table
- Use DROP TABLE
Syntax
DROP TABLE table_name
get string length
- Use LENGTH function (LEN in some variations)
SELECT LENGTH(col_name) FROM table
NoSQL
- a variety of technologies that don’t use sql or tables
- one type is a document database
- ex:
- MongoDB
Mongoose
- NPM package for connecting to a MongoDB database
- It’s a node thing (whereas MondoDB is just a database in general)
ACID properties
- Atomicity
- either all operatins in transactions occur, or none
- no half transactions
- Consistency
- correctness
- any transaction will bring the db to a valid state
- Isolation
- if a database is queried concurrently, you get the same result as if you’d quereid sequentially
- Durability
- persistence
- changes are permanent
- once a record is created/updated, it will remain, even if there power loss, crash, etc.