SQLMastery Flashcards
What is a database?
database -
Collection of data stored in a format that can be easily accesssed
How do we manage our databases?
Using a software system -
Database Management System (DBMS)
How?
We connect to a DBMS, give it instructions like query / modify, DBMS executes instructions and sends back results

What two types of Database Management Systems (DBMS) do we have?
Relational and Non-Relational (noSQL)
How is data stored in relational databases?
In tables that are linked to eachother using relationships
Each table stores data about a specific object
SQL is language we use to work with these

What are the popular relational database management systems?
MySQL
SQL Server
Oracle
**all based on standard SQL specification
What is the most popular, open source database management system in the world?
MySQL
Used in this course!
Do noSQL (non relational database management systems) understand SQL?
NO!
NoSQL systems don’t understand SQL
They have their own query language!!
Where was SQL invented?
1970 at IBM
Structured English Query Language
What will we be learning in this course?
SQL (Query language) with MySQL (DB management)
What is covered in this course?
Essential Skills -
Retrieve, insert, update, delete data
(90% will apply to other DBMS)
Core skills -
summarize data, complex queries and subqueries, built in MySQL functions, views, stored procedues
(will apply to other DBMS)
Advanced skills -
triggers, events, transactions, concurrency
Top performer skills -
designing databases (systematic approach), indexing for high performance (speed queries), securing databases (user accounts, privledges)
**if you follow and complete excercise - can use SQL in any project and pass any interview questions
What is the first step to getting data from a database?
Select the database
query will be executed against database

Is SQL a case sensitive language?
No.
However, by convention:
SQL statments are uppercase,
lowercase for everything else

What is the SELECT clause?
Allows us to choose the columns
Select
* - all columns in a table
column_names - certain columns

How do we put a comment in SQL syntax?
double hyphen “ – “

Does order matter when using the SELECT statement (SELECT, FROM, WHERE, ORDER BY)?
Yes, it does:
SELECT – column or *
FROM – table name
WHERE –filter data
ORDER BY – column to order entries
**cannot change the order of these clauses or get syntax error
**FROM, WHERE, ORDER BY ARE OPTIONAL

Can we use arithmetic operators with SQL?
Yes.

How is math executed in SQL?

Using order of operations
If you want to change order, can use parentheses

How can we change the name of a result column?

AS keyword
can give alias
can use ’ ‘ around word if has parentheses
Ex. ‘discount factor’

What is the DISTINCT keyword?

Removes duplicates

What is a SQL excercise?


What is the WHERE clause?

Used to filter data
Conditional Operaters:
>
<
<=
=
!= or <>

What is an excercise using the WHERE statement?


What is the AND operator?
Used to filter data based on multiple conditions
WHERE (condition) AND (condition)
**BOTH Conditions are TRUE

What is the OR operator?
WHERE (condition) OR (condition)
satisfy one condition only!

How can we combine AND, OR and WHERE clauses?

Combining logical operators!
Query execution engine evaluates in order:
AND operator evaluated first!!
Can use ( ) to change order and make code cleaner!

What is the NOT operator?
Changes expression:
WHERE NOT (birth_date >=‘1990-01-01’ OR points > 1000)
Becomes (flips signs):
WHERE (birth_date <=‘1990-01-01’ AND points < 1000)

What is an excercise using logical operators?

Can use an arithmetic expression in logical operators

What is the IN operator?

like chaining OR statements
Compare an attribute to a list of values

What is an excercise using the IN operator?


What is the BETWEEN operator?


What is an excercise using the BETWEEN operator?
Can use BETWEEN operator with dates, not only numbers!

What is the LIKE operator?

Allows us to search for string patterns
% any number of characters
_ single character

How can we get all the customers whose last name ends in “y”?
LIKE %y

What is an excercise using LIKE?


What is another excercise using the LIKE keyword?
Get all the customers whose phone numbers end in 9

What is the REGEX keyword?

regular expression
more powerful than LIKE statement
^ represents beginning of a string
$ represents end of a string
[a-h] multiple characters
represents two search criteria

What are excercises using the REGEX syntax?


What is the NULL keyword?

allows us to search for null values

What is an excercise using the NULL keyword?
get all orders that have not been shipped

How can we sort data in our SQL queries?

What is an excercise using ORDER BY?
Write a query produces result:
item #2, sorted by total price of item descending order

What is the LIMIT clause?

Allows us to limit number of entries retrieved

LIMIT clause should always come at end!
ORDER matters!
SELECT
FROM
WHERE
ORDER BY
LIMIT
How can we offset using LIMIT clause?

What is an excercise using the LIMIT clause?
Get top 3 customers (highest points)!

What is the JOIN clause?
Let’s us join tables
JOIN (table) ON (condition)

How do we select specific columns across multiple tables using the JOIN clause?
qualify them with specific table

How can we simplify our code using alias?

What is an excercise using the JOIN clause?
join the order_items table with the products table
display order ID, product ID, quantity and price

How can we join across databases?
Real world db admin must do this!
prefix with the database name

What is a self JOIN?

Joining a table with itself!
HOW?
pre-fix each column with an alias
use a different alias for self join table
Ex. join employee table with itself to get the name of the manager
JOIN reports_to column with employee_id column in employees table

How do we join multiple tables?
In real world, you’ll join 10 tables!
Ex. Join
orders table
w/ customers table
w/ order_status table

What’s an excercise using multiple table JOINS?
Show payments with name of client, payment method
JOIN
payments table
w/ clients table
w/ payment_methods table

What are compound JOIN conditions?

When we have a composite primary key
cannot uniquely identify an order by one column, can use two combined to uniquely identify
Ex. Order_Id and product_ID are duplicated

What is another way to write this query using implicit JOIN syntax?

Better to use explicit JOIN syntax!
Two queries are equivalent (explicit and implicit JOIN syntax)
HOWEVER:
Mosh, doesn’t recommend using implicity JOIN syntax
WHY?
if forget WHERE clause, get a cross join

When do we use an outer JOIN?

OUTER keyword is optional
Ex. When we want to see all customers, not only customers who have an order (JOIN condition is based on Order table)

What two types of outer JOIN do we have?

As a best practice use LEFT joins
avoid RIGHT joins
WHY?
makes code cleaner, easier to read
WHAT?
OUTER keyword is optional
LEFT - all records from left table returned whether JOIN ON condition is true or not (null if not)
RIGHT - all records from right table

What’s an excercise using outer JOIN?
JOIN products table with order_items table

How can we solve this issue?

Replace inner JOIN with left JOIN

When do we use self outer JOIN?

When we are doing a self join
and want to see all entries (even those that do not match our criteria)

What is the USING clause?

replaces ON clause
ONLY WORKS when column name is exactly the same across different tables!
EX. Cannot use Orders.status = Order_Statuses.order_status_id

How can the USING clause replace a compound JOIN?


Excercise: use the USING clause to write a query that produces order_date, client_name, amount and payment_method.


What are NATURAL JOINS?
ALT. Syntax for JOINS
Replaces USING and ON clause
HOWEVER,
not recommended!
WHY?
Dangerous because letting DB engine guess the JOIN
Can produce unexpected RESULTS!
DISCOURAGED FROM USING!!

What is a CROSS JOIN?

Combine every record from first table with second table

Explicit Synax is more clear!!
Excercise: DO a CROSS JOIN between shippers and products

What is the UNION operator?

Can combine results from multiple queries
Across same table or different tables
CAUTION:
Queries must return same # of columns or you’ll get an error

Excercise: Write a Query to product this report (UNION).


How do we INSERT a row into a table?


What is another way to write this INSERT statement?

Don’t need to explicitly state DEFAULT or NULL

How can we INSERT multiple rows in a table?

EXCERCISE: INSERT three rows into the products table

How do we insert data into multiple tables?

How can we quickly copy data from one table into another?

How do we select a subset from a table to copy into another table?


What is a subquery?
a SELECT statement used in another query (ex. INSERT)

What’s a really cool excercise using INSERT?
EXCERCISE: Create a copy of the invoices with a date in a new table, in new table, display client name.
JOIN with clients table to display client_name in new table

How do we update a single row (record) in a table?

Undo:
reset payment_total = DEFAULT, payment_date = NULL
*DEFAULT value is set to 0, MySQL will set
*can insert NULL into columns that accept NULL values

Can we use an expression to update a value in a row?
Yes

How can we update multiple rows in a table?

What’s an excercise with updating multiple rows?
WRITE SQL statement:
Give any customer born befor 1990 an extra 50 points


How can we use subqueries in an UPDATE statement?

subquery = SELECT statement in another query

What is an excercise using a subquery?
Ex. set the status of a customer to gold if they have over 3000 points.

How can we delete data in SQL?
DELETE FROM table
WHERE condition

How do we recreate our databases?
execute .sql for creating databases

What is a function?
A piece of pre-built code we can re-use!
MySQL comes with a bunch of pre-built functions.
Ex. Aggregate Functions - take a series of values and aggregate to produce a single value
MAX ( )
MIN ( )
AVG ( )
SUM ( )
What are aggregate functions?
Pre-built code we can use!
Aggregate Functions -
take a series of values and aggregate to produce a single value
MAX ( )
MIN ( )
AVG ( )
SUM ( )
How can we find the MAX, MIN and AVG invoice total?
Use pre-built aggregate functions!

MAX ( )
MIN ( )
AVG ( )
Can we use MAX ( ) on non-numeric values?
Yes. Can use on dates and strings!

What is the COUNT function?
Returns count of items with non NULL values
Count(*) give all items count irrespective of NULL values

What is the SUM function?
give total
can use an expression
Ex. (invoice_total * 1.1) will multiple each row (value) by 1.1 then add them!
Can we use aggregate functions on filtered data?
Yes, using WHERE clause

How can we exclude duplicate data?
DISTINCT keyword

What is an excercise using aggregate functions?
Generate this table


What is the GROUP BY keyword using aggregate functions?

can see total sales per client
group by one or more columns
to group data
Ex. by client_id, total sales for each client

How can we get the total invoices per client for the first half of the year?

How can we see the total sales for each state and city?

What is an excercise using the GROUP BY clause?

When do we use the HAVING clause?
to filter data after we group our rows

What is the difference between the HAVING clause and the WHERE clause?
- the HAVING clause is used to filter data AFTER grouping
the WHERE clause is used to filter data BEFORE grouping
- HAVING clause the column must be included in SELECT clause
using a WHERE clause can select any column

How can we write a compound search condition in the HAVING clause?
can use one or more conditions
HAVING column must be included in SELECT clause
using a WHERE

How can we get the customers located in Virginia who have spent more than $100? (use HAVING clause)
As a rule of thumb:
whenever you have an aggregate function in a SELECT statement, grouping data, group by all columns in SELECT clause
This excercise very similar to REAL WORLD queries!!

What is the with ROLLUP operator?

powerful operator for summarizing data!!
Calculates summary for each group and entire result set
ONLY AVAILABLE IN MYSQL (NOT IN ORACLE, other DBMS!!)
Ex. sum for San Francisco, sum for all cities in CA, finally sum for all states (entire result set)

Write a query to produce this result (ROLLUP operator):

Total payments received for each payment method (cash, credit card, wire transfer, PayPal)

What is a subquery?
Can write subqueries in SELECT, FROM or WHERE clauses!
A statement within another SQL query
writing complex queries involves subqueries!!
Ex. get products more expensive than lettuce

How can we get all the products that are more expensive than lettuce?
subquery!
MySQL will evaluate subquery first, then return result to our main query!

How can we find the employees who earn more than average? (HINT: subquery)

How can we find the products that have never been ordered?
- Get unique list of products ordered from the order_items table (subquery)
- Get all products NOT IN subquery
**In this example, subquery is returning a list of objects!
(vs. single value)

How can we write a query to find the clients without invoices?

How can we re-write this subquery as a JOIN?


When should we use a JOIN vs. a subquery?
Performance vs. readability
Some queries get executed faster! (execution order)
query that is most readable!
Ex. Clients without invoices, first approach is more readable!
How can we find the customers who have ordered lettuce using both a subquery and a JOIN?

Second query is more readable
Why?
There is a natural relationship between customers, orders and order_items!
Filter clearly explains what customers!

How can we select all invoices larger than client 3’s max invoice? (HINT: ALL keyword)

subqueries!
ALL keyword
Subquery returns list of values

What is the ANY keyword?

= ANY is equivalent to IN
Ex. Select clients with at least two invoices

How can we write a query to return the employees whose salary is above average in their office?
for each employee, will execute subquery (returns value)

Correlated subquery!
What is a correlated subquery?
a subquery that references the outer query!
It’s executed multiple times! (for each row)
can be slow!
But very powerful!
Vs.
uncorrelated subquery - executed only once! value passed to outter query

How can we get the invoices that are larger than the client’s average invoice amount? (HINT: Correlated subquery)

What is the EXISTS operator?

Ex. select clients that have an invoice
For each client in client table, will check to see if there is a record that matches the subquery
Efficiency?
using the IN operator returns a list (could be millions long)
more efficient to use the EXISTS operator - doesn’t return a list, just a true/false, if true returns the query data

When is it more efficient to use the EXISTS operator vs. IN operator?

If the subquery after IN operator produces a large results set
more efficient to use EXISTS operator
WHY?
Using the EXISTS operator, subquery doesn’t return a result set to the outter query to the main query

How can we find the products that have never been ordered?
(HINT: use EXISTS operator)

For each product, checks if condition is true
if we have a product in order items table, subquery returns true, NOT operator becomes FALSE, item not in result set!

How can we use subqueries in the SELECT clause?

How can we get the total sales, average and difference for each customer?
SELECT clause subqueries
Virtual table!
Can store in memory and JOIN with other tables using client_id!

How do we use a subquery in the FROM clause?
Reserve SELECT clause in FROM clause for simple queries
Why?
Makes our main query complex
Views can replace this!

What is the ROUND function?
A built in function for working with numeric values

What is the TRUNCATE function?

What is the CEILING function?
Returns the smallest integer greater than or equal to a number

What is the FLOOR function?
returns largest integer less than or equal to number

What is the ABS function?
calculates absolute value of a number, always returning positive

What is the RAND ( ) function?
returns a random number between 0 - 1

What is the LENGTH ( ) function?
returns length of a string

What is the UPPER ( ) function?
also have LOWER ( ) convert to lowercase!

What is the LTRIM ( ) function?
useful for trimming data from the user!
removes whitespace on left

What is the RTRIM ( ) function?
removes any whitespaces from right side

What is the TRIM ( ) function?
removes whitespaces from left or right

What is the LEFT ( ) function?
returns the specified number or characters from left

What is the RIGHT ( ) function?
returns specified characters from the right

What is the SUBSTRING ( ) function?
returns a portion of a string given a starting index and char length

What is the LOCATE ( ) function?

returns position of first occurance of char or sequence of chars
if char doesn’t exist, get 0

What is the REPLACE ( ) function?

What is the CONCAT ( ) function?
combines two strings

What are the NOW( ), CURDATE ( ) , and CURTIME ( ) functions?
functions for working with date/time in MySQL!

How do we get the current DAY, MONTH or YEAR, HOUR, MINUTE?

YEAR ( NOW ( ) )
MONTH ( NOW( ) )
HOURS ( NOW ( ) )

What is the DAYNAME ( ) function?
returns the string day of week

What is the MONTHNAME ( ) function?

What is the EXTRACT keyword?
part of standard SQL language

How can we get all the orders placed before the current year?

What is the DATE_FORMAT operator?

Used for displaying dates to users!!
Takes two values (Date, format string)
%Y four digit year
%d day
%M month name
MySQL has more!!
EX. %H:%i for hour and minutes!

What is the TIME_FORMAT operator?
used to format dates and times!
Ex. DATE_FORMAT | TIME_FORMAT

What’s an example of performing a calculation on a date or time?
add a day or hour to a date
calculate difference between two dates
How?
What is the DATE_ADD operator?

Used to add a day or time to a date
Ex. add one day to current date time

What is the DATE_SUB function?
subtract date from a date time value

How do we calculate the difference between two dates?
DATEDIFF

What is TIME_TO_SEC function?

returns time in seconds since midnight
used to calc difference between two times

What is the IFNULL function?
Allows us to substitue a value IF NULL

What is the COALESCE function?
Allows us to return one value or another value or IF both NULL return a custom value
Supply a list of values, returns the first NOTNULL value in the list

How do we produce this table?

IFNULL

What is the IF function?
IF (expression, first, second)
evaluates an expression, if TRUE, returns first value
IF false, returns second value

How do we put a label on a product that has been ordered many times (more than once)?
HINT: Use IF function

What is the CASE operator?

Used to test multiple expressions (like IF, ELSE IF, ELSE)
Syntax:
CASE
WHEN Expression
WHEN Expression
ELSE Expression
END

How do we produce this table?
HINT: CASE operator


What are views?

Complex queries in SELECT, WHERE, etc.
Can save queries in a view, reuse in many places
HOW?
creates a view object (doesn’t return a table)
can select view to see table data!
Can use view like a table!

What can we do with views?
Use a view just like a table!
Can JOIN with any other tables with client_id
Ex.
SELECT *
FROM sales_by_client
WHERE total_sales > 500
EX.
SELECT *
FROM sales_by_client
JOIN clients USING (client_id)

Why are views so powerful?
They can greatly simplify our future queries

Don’t have to write a SELECT statement from scratch!!
Can re-use complex queries
How are views like virtual tables?

They don’t store data!
Data is actually stored in tables
views only provide a view to underlying table
Simplifying our future queries!!

How can we create a view to display a client’s balance?
(HINT: CREATE VIEW)


What if our view has an error?
We can delete a view if we realize our query has a problem!

Drop it. Recreate it
Use REPLACE keyword (prefered way)
What is a common practice for SQL queries behind views?

Save queries in SQL file (view)!
Check these SQL files into a GIT repository
Share repository with others
So, everyone can recreate databases on their machines!

Where else can we use views?
(HINT: INSERT, SELECT, DELETE)
Can use them in DELETE, INSERT, SELECT
only if….updatable view! can update data in it
- No DISTINCT keyword is used in the view
- The view has no Aggregate Functions (MIN, MAX, AVERAGE)
- Doesn’t use GROUP BY / HAVING
- Doesn’t use the UNION operator
What is an updatable view
INSERT, UPDATE, DELETE
Can use the view
- No DISTINCT keyword is used in the view
- The view has no Aggregate Functions (MIN, MAX, AVERAGE)
- Doesn’t use GROUP BY / HAVING
- Doesn’t use the UNION operator
Ex. Invoice total and payment total but we don’t have a column for balance! Let’s make a view : )
Can we use this view in an INSERT, UPDATE, DELETE operation?

Yes! Its an updatable view!
It doesn’t have:
DISTINCT keyword
Aggregate Functions (AVG, MIN, MAX, etc)
UNION OPERATOR
Therefore:
We can update data in this table
Can delete data from the view

When do we use views to insert, update or delete data?
When we don’t have access to underlying tables (due to permissions)
But views must be updatable!!
What is the WITH CHECK OPTION?

Allows us to prevent an INSERT or UPDATE statement from excluding rows from our view
HOW?
if you try to modify data that would update the view to exclude the data, you will get an error!
What are the other benefits to using views?
- Simplify queries - can save for future
- Reduce impact of changes - views provide an abstraction over our tables. We can bring a column or column name that no longer exists in a table back using views so that keeps our queries against the view from breaking!
- Restrict access to data in underlying table - can provide data security! You may exclude some columns in underlying table in view, or exclude some rows from the view. This makes user not allowed to modify underlying table columns!
Should we write SQL code inside our application code (Java, C#, Python)?
No!
Don’t write SQL in application code!
Why?
- It makes your code messy and hard to maintain - don’t mix SQL with java or python!
- Compilation Steps - if you write SQL queries in application code, will have to recompile application code everytime you make a change in your SQL query
What is a stored procedure?

A database object that contains a block of SQL code
in application code, call these procedures to get or store data!
Why?
Executed faster sometimes due to optimization
Enforce data security (remove direct access to tables or stored procedures)
Where we should write our SQL queries!
How do we create stored procedures in MySQL?

in SQL server, don’t have to change default delimiter

What do we normally call in our application code (Java, C#, python)?
Stored procedures!
But sometimes you can call it from SQL code

How can we return all the clients with balance greater than zero?
(HINT: Use Stored Procedure)

How can we create stored procedures in an easier way?

use MySQL Workbench
Right click stored procedures
Don’t have to worry about changing the delimiter!
Click Apply!
Now we have a new stored procedure

How can we drop a stored procedure if we made a mistake?
Save .sql file with dropped procedure and commit to GIT

DROP PROCEDURE IF EXISTS safe way!
How can we add parameters to stored procedures?

all parameters are required in SQL
How can we create a stored procedure for getting the invoices by client?

How do we return the customers in california by default if no state is specified in our stored procedures?


How do we return all the customers if no state is passed to our stored procedure?

How can we make this code look more professional?


How can we give default values if one or both parameters in our stored procedure are null?

How can we write a stored procedure to allow customers to make a payment?
(HINT: invoice id, payment amount, payment date params)

How can we make a payment amount for $100 on Jan 1. 2019?


How can we validate data so that we prevent a stored procedure from putting invalid data in our database?

keep validation logic to absolute min
Why?
Too much of a good thing is a bad thing
it will end up making our stored procedures bloated!

When should we do parameter validation?
As a last resort in case someone calls stored procedure without validating input
Much easier to validate input in application when user inputting data
before, hitting the database

How can we output data from a stored procedure?
avoid these as a recommendation

What types of variables do we have in SQL?
User (session) variables - used when calling stored procedures with output parameters, in memory during entire client session, SET @variable
local variable - variables defined inside a stored procedure or function, as soon as stored procedure finishes execute, they’re freed up, used in calculations
Ex. using local variables in stored procedures - as soon as finish executing stored procedure, they leave memory

What cannot a function return?
A result set with multiple columns! (like stored procedures can)
Only can return a single value!
How?
right click on functions
Ex. Calculate risk factor per client
How do we create a function?


How can we use our function get_risk_factor_for_client in a SELECT statement?

good practice to save functions .sql and push to GIT repo
DROP FUNCTION IF EXISTS to drop functions!

What should we be aware of regarding conventions?
Different organizations adopt different conventions
Don’t sweat it, adopt convention they’re already using
Don’t argue with colleagues on conventions already there!!
Just use them!
Whatever convention is there, stick to it!
Ex. procGetRiskFactor vs. getRiskFactor vs. get_risk_factor
Ex. DELIMITER $$ vs. //
What is a trigger?

A block of SQL code that automatically gets executed before or after an insert, update or delete statement
ie.
a trigger that updates a table row if another table’s value is updated (observer pattern)
Uses?
Enforce data consistency
Ex. fire trigger after Insert

How can we create a trigger that gets fired after we make a payment that reduces the total payment amount?

When a row is deleted from payments table, invoices table payment_total is automatically updated (subtract value of deleted payment)

How can we view our triggers in MySQL?

SHOW TRIGGERS LIKE ‘table_after_etc%’

How can we drop a trigger?
DROP TRIGGER IF EXISTS

What are other uses for triggers?
(HINT: logging changes to database)

logging changes in data as audits
Ex. Whenever someone inserts/deletes can log it somewhere to see who made changes when
In real world,
you’ll log changes in many tables,
you’ll learn how to design general audit tables in database design!

What is a task?
A block of SQL code that gets executed on a schedule
Very powerful!
Ex. once or a regular basis (1x per month, 1x per day at 10am), etc.
Uses?
automated database maintenance tasks such as copying data from table to archive, aggregating data for reports
What is MySQL event scheduler?
A process that runs in the background
Constantly looks for events to execute
Ex. SET GLOBAL event_schedule = ON

How can we create an event using MySQL event scheduler?

How can we view, alter or drop events?

What is a transaction?
A group of SQL statements that represent a single unit of work
all statements complete successfully or transaction fails!
Use?
make multiple changes to a database and want all to succeed or fail together as a single unit
Ex. store order with item - if order record is inserted, server crashes, end with incomplete order, database not consistent
What are the key properties of transactions?

1. Atomicity - unbreakable, each transation is a single unit or work, all statements successful and commit or it’s rolled back and undone
2. Consistency - database always remains in consistent state
3. Isolation - transations are protected from eachother, cannot interfere with eachother, if multiple transactions try to update, others have to wait for transaction to complete
4. Durability - changes made by transactions are permenant
How do we create a transaction?

In the real world, how many users try to access data in the database?
Many users at the same time
Soln?
Concurrency
Why?
if a user is trying to access data that another user is updating, could cause problems
How does MySql handle concurrency problems?
It locks the data
Other users have to wait until transaction is completed before accessing the data
What are some of the problems with concurrency?
- Lost updates - two transactions try to update the same data and no locks are used, later transaction overrides changes from previous
Ex. two transactions try to update a customer concurrently, one updates points, other tries to update state of customer, second transaction commits first and the points update is lost
Soln - locks
- Dirty reads - transaction reads data that hasn’t been committed yet
Ex. Transaction A updates points but before a commit, transaction B reads the data that is later rolled back by Transaction A
Soln - Transaction level isolation
Ex. read committed, only read data that has been committed
- Non-repeating reads - reading data twice - once initially then again after data is changed by another transaction before a commit, causing inconsistent reads
Ex. transaction A reads points, before completed, transaction B updates points, Transaction A reads points again but they’re different, data is inconsistent
Soln - repeatable reads - data read in a transaction is repeatable even if a change is made so data changes are not visible during a transaction (creates a snapshot)
- Phantom Reads - transaction A queries a list of customers, Transaction B updates a customer that now meets Transaction A’s criteria, but the data is not included
Soln - Serializable - transaction will be aware of changes made to data by other transaction and wait for them to complete first, more users, more waits, can hurt performance and scaleability (reserve for only a critical need)

What happens as we increase the level of isolation?

The more performance and scaleability issues we have
More locks involved in isolation causing transaction to wait
the lower the level of isolation the higher the level of concurrency and higher performance level BUT higher concurrency problems (more users can access data at the same time)
Default level in MySQL is repetable read
Stick to default for most situations!
Read Uncommited / Read Commited - data rarely updated

How do we set the transaction isolation level?

What is a deadlock?
A classical problem in database
different transactions cannot complete
each hold a lock the other needs
How can we minimize deadlocks?
- Look at your transactions - look at the order of your updates, follow same order when updating multiple records
- Keep transactions small and short so they’re unlikely to collide with other transactions,
- Schedule transactions to run during non peak hours when there are not a lot of users
Where are we going to use datatypes?
In next section - designing databases
String types
numeric types - numbers
date and time types - dates
blob types - storing binary
spatial types - geometric values
What are the string types?
Char(x) - fixed length like abbreviate of states
VARCHAR(x) - variable length, username, passwords, emails, addresses (max 65, 535 characters 64kb)
*can be indexed
MEDIUMTEXT (x) - JSON, .CSV, short/medium books (Max 16MB)
LONGTEXT(x) - textbooks or years or log files (Max 4GB)
TINYTEXT (x) - 255 bytes max
TEXT (x) - 64KB max but cannot be indexed
**can store zip codes and phone numbers
WHY?
Because not used in mathematical calculations
store hyphens and other delimiters
Conventions?
consistency helps maintain database
VARCHAR (50) - usernames, passwords (short strings)
VARCHAR (255) - addresses (medium strings)
Bytes?
English - 1 byte per char
European / middle eastern - 2 bytes
Asian - 3 bytes
What are the Integer types?
used to store whole numbers
TINYINT = 1 byte (-128, 127)
UNSIGNED TINYINT = (0, 255) prevent negative value from being stored in database (ex. age)
SMALLINT = 2bytes (-32k, 32k)
MEDIUMINT = 3bytes (-8M, 8M)
INT = 4bytes (-2B, 2B)
BIGINT = 8bytes (-9z, 9z)
*if data type is too small, MySQL will throw an error
*ZEROFILL - numbers will be padded with zeros, not how it’s stored only displayed
Recommendation?
Try to use the smallest data type that suits your needs
consciously choose your bytes
WHY?
keep database small and keep queries fast
What three data types do we use for storing decimal numbers?
DECIMAL (p,s) - monetary values
p - max digits including decimal (1-65)
s - max digits to right of decimal
Synonyms
DEC
NUMERIC
FIXED
**don’t use, stick to DECIMAL (p,s) but be aware of these
Others -
used in scientific calculations
FLOAT 4bytes
DOUBLE 8bytes
*store approximations, not actual numbers, low precision
How do we store “yes” or “no” values?
BOOL
BOOLEAN
*synonym for TINYINT
TRUE = 1
FALSE = 0
What are ENUMS?
Restrict values of a column to specific list of strings
ENUM(‘small’, ‘medium’, ‘large’)
SET(…)
Ex. only allow values “small” “medium” or “large” to be inserted in a column
Recommendations?
Generally, bad practice, should avoid
Why?
Changing members of ENUM is expensive - MySQL will rebuild entire table if updated, can’t get additional information, tricky to work with in application drop down, cannot reuse values in other tables causing duplications and less maintainable code
Soln?
Create a lookup table - separate table with values, can reuse table, can get all sizes with single SELECT statement, can feed drop down lists
Ex. Payment_method

What are the four data types for storing date and time values?
DATE - store date without time
TIME - time value
DATETIME - 8 bytes (beyond 2038)
TIMESTAMP - 4 bytes (only store up to 2038)
Ex. show when row was last inserted
YEAR
What are blob types?

Store large binary data
Images, videos, PDFs
TINYBLOB - binary data up to 255bytes
BLOB - binary data up to 65KB
MEDIUMBLOB - binary data up to 16MB
LONGBLOB - binary data up to 4GB
**generally better NOT to store binary data in database
Why?
- Database size increases - large binary data in the database
- Slower backups - larger database size, slower to backup
- Performance issues - reading from database is slower than reading from file system
- More code to read/store images in database

What is JSON?
a lightweight format for storing and transferring data over the internet
Javascript object notation (JSON)
used heavily in mobile and web applications!
Data is sent to backend in JSON format!
used in MySQL to store multiple key-value pairs about a product
How do we create a JSON object in SQL?

- Standard JSON format
- Internal JSON functions of MySQL
JSON_OBJECT function
JSON_ARRAY( )

How can we access individual properties from JSON objects?


How do we update JSON objects in SQL?

update existing properties
add new properties
How do we remove properties from JSON objects?
Returns a new JSON object with modified properties

What must we do as software engineers or database administrators?
Design databases
Requires a bit of planning up front
Going to learn a process for designing databases step by step!
What plays a critical role in the success of an application?
The database design!
If database is designed properly
- Can easily extend to support new biz requirements
- Can easily query with fast query results
What does a poorly designed database require?
A lot of maintenance
cost of maintining increases over time
will prevent you from supporting new features
Extracting data is difficult!
queries take several minutes to execute!
What are we learning?
A systematic approach to designing well structured databases
What is data modeling?
A process for creating a model of the data we want to store in our database
What are the four steps to data modeling?

- Understand and analyze the business requirements - solution to problem is a database, critical step many developers skip, takes time to fully understand business requirements (end users, forms, documents, databases, spreadsheets)
*better you understand the business problem, better you can find a solution (database)
*must fully understand the business requirements
- Build a conceptual model of the business - identify entities, things, concepts in business and relationships, visual model used to communicate with stakeholders to ensure both on same page
- Build a logical model - datastructure for storing data, abstract data model that is independent of database technology (shows tables, columns needed)
- Build a physical model - implementation of a logical model for a particular database technology (exact datatypes, default values for columns, primary keys, views, stored procedures, triggers)
*very specific to a database technology

What is a conceptual model?
A UML or Entity relationship model
Representing entities (people, events, locations, etc.) and their relationships
Ex. Student and Course (online learning diagram)
Modeling Tools - MSvisio, Draw.io, LucidCharts
It’s an iterative process, may add new attributes, etc. later on
constantly going back to stakeholders to add/rename/remove attributes from diagram
Process:
- Define entity and attributes
Ex.
Student - name, email, dateRegistered
Course - title, price, instructor, tages
- Define relationship between entitites
Ex.
Many to many
one to one
At this step…
No information on data types or database technology
Just a conceptual model - used to communicate with business stakeholders

What is a logical model?

Refinement of conceptual model

independent of database technology
a more details
-data types (independent of db technology)
ex. string vs. VARCHAR (255)
-relationships (one to many, one to one, etc.)
*We almost know what tables to use to structure our data
Ex. entities in logical model eventually become tables in database
What should we always do with names and addresses?
break them into smaller parts
Why?
so we can query faster
Ex. name -> firstName, lastName
What are the three types of entity relationships?

one-to-one
one-to-many
many-to-many
How can we change this relationship to contain an attribute?

day/time a student enrolls in a course is an attribute of the relationship between the student and the course!
Ex. doesn’t make sense to put in course or student, it’s an attribute of the relationship
Soln?
add new entity that represents relationship between student and course
Ex. student can have many enrollments (1 to many relationship)
Ex. each enrollment is for a particular course

What is a physical model?

implementation of a logical model for a specific database technology
includes…
-exact types in MySQL (database technology)
-nullable or not, default values set
Ex. MySQl Workbench -> file -> new model
Conventions?
Use plural names in physical models for tables
Why?
Tables are containers storing multiple entities of that type (ex. students)

When do we introduce a new column (id column) as the primary key?
If no other candidates
Ex. Email not a good primary key as primary key should never change AND it’s several characters (don’t want to duplicate data) in other tables!

What is a primary key?
column that uniquely identifies each record in a given table
Ex. student_id int
Ex. first_name & last_name = composite primary key (has multiple columns)
BUT…
first_name & last_name are not good primary keys!
Why?
They’re multiple chars, takes up space if referenced by other tables

What is the relationship between primary key and foreign keys?

When we add a relationship to a table, one end is the parent (primary key) and the other is the child (secondary key)
*primary key - parent
Ex. students table is parent (primary key)
*foreign key - child
Ex. Enrollment table (foreign key)
**Select foreign key first, then primary key when adding a relationship!

What is a foreign key?

a column in one table that references the primary key in another table
denoted with a red diamond
Ex. student_id in Enrollment references students primary key!
Ex. composite primary key from foreign keys

What is normalization?
The process of reviewing our design
to ensure we do not have data duplication!
by ensuring it meets 7 rules (7 normalizations)
almost 99% of applications..
only need to apply first 3 normal forms!

What does the first normal form say?
Each cell should have a single value
AND we cannot have repeated columns
Ex. Tags column in courses table - storing multiple tags, separated by a “,”
Soln?
Take tags column of out table and model it as a separate table called tabs
Then, add a many to many relationship between tags and courses

What are link tables?
a new table that allows us to have many to many relationships
Why?
in relational databases cannot have many to many relationships (only one to many)

What does the second normal form (2NF) say?
Every table should describe one entity,
AND every column in that table should describe the entity that the table represents
Ex. courses table - single purpose, store course records, every column is an attribute of a course vs. adding enrollment date (not an attribute of a course, rather of an enrollment)
Soln?
If an attribute doesn’t describe the entity, take it out into a separate table
What is the problem with this implementation?
(Hint: not in 2NF)

customer_name desribes a customer not an order, it is not an attribute of an order, only order_id and date are!
It’s going to cause duplication, wasting space
updates to a customer name will need to update multiple places
Soln?
put customer name in the customers table, then it’s stored in a single place, only need to update a single record
use customer_id in orders table - duplicating a 4 byte int vs. VARCHAR(50)
In what ways does this design violate the second normal form (2NF)?

courses table violates 2NF
Why?
Instructor name doesn’t belong!
Where is the single place we need to store the name of the instructor? Instructors table!
Soln?
instructors table w/ one to many relationship with courses

What does the third normal form say (3NF)?

A column in a table should not be derived from other columns
Ex. invoice_total, payment_total, balance vs. invoice_total - payment_total = balance
*balance shouldn’t be a column, it can be derived from invoice_total and payment_total
Ex. Balance column should be dropped
Ex. full_name column should be dropped
Why?
Reduce data duplication
Increase data integrity

What is Mosh’s pragmatic advice for normalization?
start with conceptual and logical models!
Don’t jump into creating tables immediately or you’ll end up with a bad design!
in real world, focus on reducing redundency
not applying 2nd, 3rd normalizations
How?
If you see duplication and it’s not foreign keys (ints) your design is not in a normalized form
What is a mistake many data modelers make?
Trying to generalize models to support future requirements!
Arguing that the model is so flexible don’t need to change in future!
Most of time, future requirements are only in their head, never happen!
These models are useless!
Don’t forget about the scope, context and requirements!
What is Mosh’s data modeling advice?

Solve today’s problems,
not future problems that may never happen!!
Build a model for your problem domain,
not the real world!
Why?
Changes will happen, whether you predict them or not!
cannot predict all future changes
can always write a script to change database model and migrate data if necessary
What if…
What if…
leads to a design with a lot of baggage you have to carry for a long time!

How do we convert or physical model to a real physical database?

MySQL wizard (forward engineering)
will create a script for creating our database!
Can save file + check into source control repository!
Copy to clipboard, MySQL workbench query window

What can forward engineering do for us?
Take our physical model (buildt in my SQL)
generate a script for creating the database
Can save the file in repository (recommended)
Can execute in a query window

How do we make changes to our tables?
(Hint: must be consistant across envrionments)
Go to physical model in MySQL
Make changes in physical model, synchronize with database
Select synchronize model
New script will be generated!
Save file, check in GIT or source control system (execute script on other db servers to update changes)

How can we make changes to a database that doesn’t have a physical model?

First, we need to reverse engineer the database to create a model
Use MySQL database -> “reverse engineer”
Ex. sql_store database doesn’t have a model
These models have benefits:
1. Help us understand the design
2. Help us identify problems with the design
3. Allow us to make changes and create a script to execute on other MySQL databases

Design a database for a system to record the necessary travel information for flight booking!
(Hint: This is an excercise in database modeling)

What is STEP ONE for creating a database to MODEL a FLIGHT BOOKING service?

Conceptual model
Don’t worry about normalization
Don’t worry about additional use cases (only business case)
Ex. DateTime booking in real world (not in requirements), leave out
Ex. Airline could be entity (normalization), do later
Ex. Many to many relationships for all (will refine later)

What is STEP TWO for creating a database to model a FLIGHT BOOKING service?

logical model
refine relationships
set type for each attribute
talk to domain experts for wording
as we learn more about the problem domain, model changes (future)
Ex. What they call “flight class”
Ex. Ticket to passenger (one passenger many tickets)
Ex. Remove airline atribute from Flight into separate entity

When should we denormalize our tables?

Tables that are frequently joined together should be combined into a table
Why?
So we don’t have to continually join the tables together
Also?
Duplication is not a big deal (don’t have a million airports)
Ex. City and State will get duplicated in Airport, proper normal form would be to separate out into table
BUT
our queries will join these tables often to find airports, this will cause performance issues!
What is STEP THREE for creating a database to model flight booking?

Create a physical model

camel_Case notation
database -> forward engineer -> continue -> save .sql script in GIT
updates?
database -> physical model update -> synchronize model -> save script in GIT
How can we design a database model for vidly?
(Hint: Step One)

Conceptual model
like a brainstorm
bare minimum (entitites + fields, basic relationship)
Next, refine in logical model

How can we design a database model for vidly?

(Hint: Step Two)
Logical Model
NumberInStock can be a tinyInt (physical model) bcuz don’t need a billion stock
coupon can be nullable (zero to one)

When engineering software what should we consider?
don’t over engineer
manage complexity (keep it simple)
don’t bring extra complexity
Why?
extra complexity will stay for lifetime of application (costly)
every update will have to deal with unnecessary complexity
How do we design a database model for vidly?
Hint: Step Three (Physical Model)

If you want to become a database administrator what must you understand?
The script that the MySQL Workbench wizard generates
Abile to create databases or modify these scripts by hand
without relying on these visual tools
Why?
Need to understand script tool generates
review it to ensure it doesn’t have any bad impacts on your database

How do we create a database by hand?
CREATE DATABASE IF NOT EXISTS ‘nameOfDb’

How do we delete a database by hand?

DROP DATABASE IF EXISTS ‘dbName’

What are the four constraints we can apply on columns when creating a table?

AUTO_INCREMENT -MySQL handles incrementing
NOT NULL -cannot be null
DEFAULT 0 -if null, gives a default value
UNIQUE - ensures every entry in column is unique

How can we alter or modify columns in our tables?
can make multiple changes
ADD column
CHANGE existing
UPDATE datatype

What should we always remember about updating databases?
Never update a database in a production environment
Why?
It can have really bad consequences
Try on a test database first, make sure the changes don’t have any bad impacts
Then, execute on a production database
How do we create a relationship between tables by hand?
Ex. add a foreign key relationship to orders table from customers table (orders is child of customer table)
convention:
fk_childTable_parentPrimaryKey (addForeignKeyOnColumn)

How do we drop or alter foreign keys on tables by hand?

What is a character set?
a table that maps each character to a number
Why?
When we store a string, MySQL converts each character to its numerical representation using a character set
Default
utf8 - 3 bytes per char (supports international language)

How can we change the character set of our database or table by hand?

Can set at db level, table level or column level!
utf8 requires 3 bytes!
latin language only 1 byte!
Scenario?
Have a table with CHAR (10) = 10 x 3 bytes = 30 bytes per entry x 1M = 30M bytes
vs.
CHAR (10) = 10 x 1 byte = 10 bytes x 1M = 10M bytes

What is a storage engine?

Determine how data is stored and what features are available
InnoDB supports transactions, foreign keys (superior)
MyISAM is old, doesn’t support transactions, foreign keys (don’t use)
We can use multiple storage engines for the same database
Caution?
MySQL has to rebuild the table if you change the storage engine, can be an expensive process
don’t do in production unless you have scheduled maintanence

Why are indexes extremely important in large databases and high traffic websites?
Because they can improve performance of queries dramatically!
know how they work, how to create them, how they speed queries
every developer, db admin must learn and understand!
Without an index, how does MySQL find records?

It has to scan every record in the table!
Ex. find customers in CA without index
scans ALL 1010 rows
Soln?
Create an index on state (like grouping by state)
way faster than reading every record!
Why?
can often fit into memory vs. disk!

Why is it faster to use indexes to find data?

They often fit into memory!
reading data from memory is always faster than reading from disk!

What are the costs of using indexes?
- Increase database size - are permentatly stored next to tables
- Slow down writes - everytime ADD, UPDATE, DELETE a record, have to update cooresponding indexes

For what should we reserve the use of indexes?
performance critical queries
Why?
indexes increase database size, slow down writes!
What is a common mistake developers make?
They add indexes at time of designing tables
Why wrong?
design indexes based on queries NOT tables!
Don’t try to solve problems that don’t exist!
Why?
whole point of using indexes is to speed up slow queries!
What do we create indexes based on?
queries
speeding up slow queries
design indexes based on queries, not your tables!
don’t blindly add indexes on tables
Why?
increases db size and slows writes!

How do we create an index?

Ex. reduce query from 1010 records to 112 rows (far faster)
Convention?
(prefix) idx_state (columnName)
Ex. idx_state

Write a query to find customers with more than 1000 points
(Hint: use Indexes)

Ex. 1010 rows searched (without index) vs. 529 rows (w/ index)!

When do we use Prefix indexes?
Problem:
If column you want to create an index on is a string
CHAR
VARCHAR
TEXT
BLOB
Why?
Disc space won’t perform well consuming all this data from the disk
small indexes are better because they fit in memory, making queries perform faster
Soln?
When indexing string columns, don’t want to include entire column, just the first few characters (prefix)
Why?
So queries will be smaller and fit in memory vs. disk space

How do we create a prefix index?
specify number of characters to include in index:
optional for char, varchar
manditory blob, text

What does collation mean?
How data is sorted in index:
A means ascending
B means descending

What is cardinality?
represents an estimated number of unique values in the index
Ex. PRIMARY has an estimated 1010 unique values
What are secondary indexes?
Indexes other than PRIMARY index
each secondary index entry contains two values (references) to the primary index

How are foreign keys indexed?
a foreign key represents
- *an index for another table’s PRIMARY key column**
- *Why?**
So we can quickly join tables
How?
MySQL creates an index on PRIMARY key and stores it as the foreign key

How do we find the optimal number of characters to include in our pre-fix indexes?
No best practice
Have to look at your data
must include enough characters to uniquely identify each entry
Ex. A one character index on last_name, MySQL might return 100k last names that start with “A”
Then have to go through one by one to find the last name
This doesn’t allow MySQL to quickly find data
many last names will begin with same one character
Procedure?
Use SELECT
COUNT (DISTINCT LEFT (index_column, #chars)
FROM indexed_table
Result?
Look for largest cardinality value with smallest number of chars
Ex. one Char = 25 cardinality
fiv Char = 966 cardinality
ten Char = 996 cardinality (30 more)
*small improvement from when doubled Chars, lower bound is optimal (5 Char)

When do we use full-text indexes?
(Hint: to do what?)
to build fast and flexible search engines within our applications
to implement a fast and powerful search engine in our application
Ex. Blog db -> allow users to search for blog posts

What is the challenge with this code?
(Hint: does posts table have an index? does a prefix index work?)

It doesn’t have an index
a pre-fix index would only contain the first few characters of the title or the body but our search query could be anywhere in the body so MySQL would have to search the entire body
Query will only return posts that have ‘react redux’ and not posts that only have ‘react’ or only have ‘redux’ or ‘redux react’ out of order

How do full text indexes work?
very different from regular indexes:
include entire string column (not just prefix)
ignore stop words
store a list of words and for each word, stores a list of rows the word appears in
Have a relevance score

How do we create a full text index?

How do we use full text indexes?
beauty is that they contain a relevancy score
it’s a floating point number between 0 - 1 (no relevance to high relevance)

What two modes do full text indexes have?
Natural language (default)
Boolean - can include/exclude certain words

What problem do composite indexes solve?
Part of a query is handled by an index, other part requires a table scan (can get slow if a lot of entries)
Ex. State is customers table has a fast index lookup, points has no index requires a table scan (could be slow if 1M customers in a state)!

What can we do with a composite index?

Index multiple columns
Should use composite indexes (in reality)
Why?
A query can have multiple filters!
Help us get optimal performance
Indexes can help us sort data faster w/ multiple indexes
Single column indexes waste a lot of space
Ex. can create a composite index with state and points

How many columns should we include in a composite index?
In MySQL an index can have a max of 16 columns
somewhere between 4-6 columns performs well
should experiment with queries and amount of data!
What should we favor?
(HINT: single column indexes or composite indexes)

Composite indexes
Why?
Many queries involve two columns in the same table

How should we pick the order of columns in a composite index?

Basic Rule:
- Put the most frequently queried column first to narrow search
Ex. 5 queries use “state” column, 1 query uses “last_name”
choose column involved in 5 queries
2. Put first column with highest cardinality (greatest # of unique values)
Ex. state column (cardinality = 48)
1M records /48 = 20k records to search
vs.
gender column (cardinality = 2)
1M records /2 = 500k records to search
*not always true (take data / queries into account!)
Why does highest cardinality first not apply here?

Always check how MySQL scans records using different indexes
Try to understand how MySQL would execute your queries with different indexes
How?
USE INDEX (alternative_index)
Ex. reduced rows from 40 to 7
Change order of index from (last_name, state) to (state, last_name)
if last name column first -> MySQL will go through each customer with ‘a’ last name, has to go through each to find customers in ‘state’
vs.
‘state’ first -> go to California, in segment, select customers with ‘a’

How can we re-write this query to utilize the index in the best possible way?
break into smaller queries
Use composite index for first query
Use a single index for second query
Ex.
create a second index on points
execute first query using state index
union
execut second query using points index
Scan 1010 records vs. 600 (composite index, single index)

How can we re-write this query to optimize the index performance?

Ex. Full index scan (1000 rows vs. 3 rows)
Why?
Used an expression in the WHERE clause
MySQL couldn’t use index in best possible way
Soln?
always isolate columns in order to have MySQL use indexes

How can we use indexes for sorting data?

When an index is placed, MySQL automatically sorts data in index
Type ALL = full table scan
Extra = using FileSort (algorithm for data sorting in table)
*expensive operation
Using index prevents MySQL from performing FileSort operation
Rule of thumb?
Columns in the ORDER BY should be same as columns in index

How should we design our order by clause to maximize MySQL sorting performance?

Using columns that are indexed
Why?
If column is not indexed, MySQL will use fileSort to sort (order)
if column is indexed, MySQL has sorted using binary tree (index)
Ex. Cost of 1000 vs. 100 (sorting cost)

How should we design our indexes?
(Hint: ORDER BY clause)

for both filtering and sorting data
ORDER BY clause using an index is 10x faster
Why?
fileSort is very expensive
it’s 10x more expensive than using index sort

What should we consider when using an index and the ORDER BY DESC?

MySQL cannot use index in ORDER BY clause DESC
so it uses expensive fileSort operation
Why?
index is sorted in ASCENDING order
Soln?
sort both columns in DESC order
so that MySQL can use index in backward index scan

Can we sort using the second column in a composite index?

(Hint: how expensive?)
expensive!
Why?
cannot use index
Soln?
WHERE state
*can narrow our search to a state or group of states to filter first, then use points (uses state_points index)

What does a composite index on state, points in the customers table contain?
customer_id, state, points
NOT all columns
Why?
MySQL includes the PRIMARY key in each index
so it can access data without touching table
fastest query we can get!
What should we consider when designing our indexes?
(Hint: How do we get a covering index?)
The columns included in the WHERE clause
the columns included in ORDER BY clause
try to include these columns in index
Finally, try to include columns in SELECT clause
*This will give you a covering index!
MySQL can use index to satisfy your query needs!
What should we watch for when working with indexes?

Duplicate indexes - my SQL will not stop you from duplicate indexes with columns in same order, will maintain them separately
Tip: SHOW INDEXES IN table before creating index
Redundant indexes -

How should we view index maintenance?
before creating new indexes,
check the existing ones!
always DROP duplicate, redundent or unused indexes!
Why?
They’re costly, slow down writes and can cause performance issues

How can we secure our databases?
Create accounts and assign privledges
Why?
If you don’t, people can access and misuse your data
Application access data on a server
must create accounts to allow users to read/write data to the database
BUT
not allow them to change database structure (add tables, delete rows, etc.)
How do we create a user that allows someone to connect to our database?
(Hint: from an IP address or domain that has MySQL installed on their machine? )
CREATE USER
john@’%.dollarlitterclub.com’ (domain)
john@127.0.0.1 (IP address)
john (no restrictions)

How can we view our users?


How can we drop a user account?

Always drop user accounts that are no longer in the organization

How can we change a user’s password?

Two ways:
- Set Password - SET PASSWORD = ‘string’
- Navigation Panel - users and privileges

How do we allow a web/desktop application permissions to read/write data to our database?

Nothing more
Cannot create new table, cannot modify tables

How do we give an administrator (admin) application permissions to read/write data to our database?

Admin Privledges:
Over one db
more than on db
an entire MySQL

How can we see the privledges assigned to a user account?

Two ways:
- SHOW GRANTS FOR user - see permissions, access
- Navigation pannel - users/privledges and schema privleges

How can we revoke a user’s privledges?
Do not grant more privledges to a user than necessary.
Take security seriously.
Always grant minimum permissions
Why?
Otherwise opening so many security holes!
