Week 3 Flashcards
SQL Sublanguages (5)
- DDL - data definition language
- DML - data manipulation language
- DQL - data query language
- DCL - data control language
- TCL - transaction control language
Define DDL
- data definition language - statements used to create tables and databases as well as defined properties
- create, alter, drop, truncate
Define DML
- data manipulation language - statements used to insert or remove data from tables
- insert, update, delete
Define DQL
- data query language - statements used to query data from a table
- select
Define DCL
- data control language - statements used to control who can access data
- grant, revoke
Define TCL
- transaction control language - statements used to commit and restore data through transaction. Transactions group a set of tasks into a single execution unit.
- commit, rollback, savepoint
Numeric SQL Datatypes (7)
- INT
- TINYINT
- BIGINT
- DECIMAL
- NUMERIC
- FLOAT
- REAL
Date/Time SQL Datatypes (5)
- DATE
- TIMESTAMP
- DATETIME
- TIME
- YEAR
Character/String SQL Datatypes (5)
- CHAR
- NCHAR
- VARCHAR
- NVARCHAR
- NTEXT
Binary SQL Datatypes (3)
- BINARY
- VARBINARY
- IMAGE
Miscellaneous SQL Datatypes (4)
- CLOB
- BLOB
- XML
- JSON
“Industry etiquette” for writing SQL
- use UPPERCASE to refer to keywords, and lowercase for non-SQL specific entities (like tables or column names)
Define database schema
- Refers to the structure of the database - the columns of each table, their data types, and any constraints on them
Define constraints and list (7)
- allow us to enforce the schema by ensuring consistency and integrity of the data in the table
1. PRIMARY KEY
2. FOREIGN KEY
3. NOT NULL
4. UNIQUE
5. CHECK
6. DEFAULT
7. AUTO INCREMENT
Explain PRIMARY KEY constraint
- uniquely identifies a record in a table
- inherently composed of two other constraints - unique and not null
What are the properties a transaction must follow (4)?
ACID
- Atomicity - all or nothing (transaction must fully complete or rollback)
- Consistency - data remains in a consistent state after the transaction
- Isolation - transactions do not interfere with one another when running
- Durability - data will persist even in case of catastrophic failure
Explain the different isolation levels (4).
What read phenomena do each prevent?
- Serializable - slowest, prevents phantom reads
- Repeatable Reads - prevents non-repeatable reads, allows phantom reads
- Read Committed - prevents dirty reads, allows non-repeatable reads
- Read Uncommitted - fastest, allows dirty reads
What is RDBMS?
- relation database management system
- A database is a collection of data, and the management system is a set of programs to store and access that data in a quick and effective manner
- Developed to handle large amounts of data
What vendors are supported by RDS?
- MySQL
- Oracle
- SQLServer
- Postgres
- Amazon Aurora
- MariaDB
Relational vs non-relational databases
- non-relational does no store data within tables that relate to each other, thus does not use SQL to interact with the database
Explain FOREIGN KEY constraint
- signifies that a column represents a reference to the primary key of another table. This allows us to create relationships between tables
Explain NOT NULL constraint
- enforces that all records must have a field for the column on which this constraint is applied
Explain UNIQUE constraint
- records cannot be inserted if another record already has the same value for the column on which this is declared (similar to NOT NULL)
Explain CHECK constraint
- provides a way of performing validation on values before records are entered into the table
Explain DEFAULT constraint
- allows setting default values on columns for records that are inserted into the table
Explain AUTO INCREMENT constraint
- allows a unique number to be generated automatically when a new record is inserted into a table
What is a candidate key?
What is a composite key?
What is a surrogate key?
candidate - multiple columns that together create a primary key to uniquely identify rows
composite - the actual combination of columns to use as the primary keys
surrogate - a column that can uniquely identify a record (can act as Primary Key)
Multiplicity relationship: one to one
- each entity in a table only relates to a single entity in another table
- to enforce the one to one aspect, use a foreign key to relate to the primary key on another table; putUNIQUE constraint on the foreign key column
Multiplicity relationship: one to many / many to one
- one entity can belong to, own, or otherwise relate to multiple other entities
- to create this in the database, we add the foreign key only on the many side of the relationship
Multiplicity relationship: many to many
- implies a one-to-many relationship in both directions on the entities
- we cannot provide a direct link between the tables in the database - instead, use a 3rd table, called a junction table, to connect them. the junctiontable will have 2 foreign keys and will be the many side of both relations
what are junction tables?
- direct link between tables in a many to many relationship
- also known as bridge table
what is normalization? normal forms (3)?
- The process of reducing redundancy in a database
1. 1NF
2. 2NF
3. 3NF
“swearing to tell the truth, the whole truth, and nothing but the truth”
How would you describe AWS? What is “the cloud” or “cloud computing” and why is it so popular now?
- AWS is leading cloud platform
- Provides resources and services
- Business benefits: no need to manage infrastructure; useful tooling; worldwide distribution/deployment
- Downsides: locked into cloud provider
Define Infrastructure, Platform, and Software as a Service
- IaaS: direct access to hardware; most control
- PaaS: hardware abstracted; developer provides the software to run
- SaaS: software abstracted
What’s the difference between a Region and an Availability Zone (AZ)?
Regions cover a geographic area, all over the world
AZs are individual data centers; 3 AZs per region for redundancy and fault tolerance
How are you charged for using AWS services? Does it vary by service?
By uptime, data stored, data exchanged
Different ways to interact with AWS services?
- Web console
- AWS CLI / REST API
- AWS SDK (software development kit)
What is RDS?
relational database service - a PaaS offering that loads RDBMS software based on your configuration
Explain the AWS service - Route 53
- DNS (domain name service) for registering domain names like ‘.com’ or ‘.net’ addresses
- Works to resolve ‘somesite.come’ into an IP address where your server is located
Explain the AWS service - VPN
- Virtual Private Network - can place your cloud resources in a VPN for network security purposes
Explain the AWS service - IAM
- Identity and Access Management - manage users, roles, and policies to provide access to your cloud resources
Explain the AWS service - ELB
- Elastic load balancer - distribute incoming requests between different services
What is JDBC?
- Java Database Connectivity
- A Java API used to execute queries on various databases. Uses JDBC drivers to connect with the database
What are the core interfaces / classes in JDBC (8)?
- DriverManager class - to make a connection with a database driver
- DataSource interface - for retrieving connections, an alternative to DriverManager
- Connection interface - represents a physical connection with a database
- SQLException class - a general exception thrown when something goes wrong when accessing the database
- Statement interface - used for executing static SQL statements
- PreparedStatement interface - represents pre-compiled SQL statements
- CallableStatement interface - used to execute stored procedures
- ResultSet interface - represents data returned from the database
What is a stored procedure and how would you call it in Java?
A stored procedure is an executable block of code that is written in PL/SQL and stored in the Oracle database. A stored procedure is called from a Java class using a CallableStatement object. When the procedure is called, its name and any relevant parameters are sent over the JDBC connection to the DBMS, which executes the procedure and returns the results (if applicable) via the connection.
What is the difference between Statement and PreparedStatement?
PreparedStatements are pre-compiled by the JVM. The database doesn’t have to compile the SQL each and every time it is executed. PreparedStatement can be parameterized, which can make the SQL more readable. Furthermore, PreparedStatement will properly escape reserved characters to prevent SQL injection attacks
Steps to executing an SQL query using JDBC (6)?
- Register the driver using
.forName()
(or letDriverManager
detect and load automatically from classpath) - Create the connection (
DriverManager.getConnection(url,username,password)
) - Create a statement for executing the SQL query (
Statement st = conn.createStatement()
); - Execute the SQL query (
ResultSet rs = st.executeQuery(String sql)
) - Use
ResultSet
to get values returned (rs.getInt(1)
, etc) - Close the connection (
conn.close()
)
How to execute stored procedures using JDBC?
Use the Callable
statement interface
Which interface is responsible for transaction management?
The Connection
interface - can commit
, rollback
, and savepoint
What is Maven?
A build automation and dependcy management tool for Java applications
What is the Maven lifecycle (8)?
Validate => project is correct and all necessary information is available
Compile => compiles project source code
Test => tests all compiled code
Package => packages all compiled code to WAR/JAR file
Integration => performs all integration tests on WAR/JAR
Verify => runs checks on the results of integration tests
Install => installs WAR/JAR to local repository
Deploy => copies final WAR/JAR to the remote repository
Where / when does Maven retrieve dependencies from? Where are they stored locally?
Maven first looks to see if the dependency is in the local repo under .m2
directory. If not, it will download the necessary .jar file(s) from the remote central Maven repository into the .m2
directory
What is the POM and what is the pom.xml?
POM stands for project object model and is the model used by Maven to understand project attributes and dependencies. The pom.xml is the xml document which lists those attributes and dependencies
The first normal form (1NF) enforces that a table must:
- Have a primary key
- Each column should be as granular as possible (e.g. “Name” column should be broken up into: “First Name”, “Last Name”, “Middle Name”, etc..)
To be in second normal form (2NF), a table must:
- must already be in 1NF
- Cannot have columns that are dependent on only one part of the key
- If there are no composite primary keys, you are automatically in 2NF
To get to third normal form (3NF), a table must:
- must already be in 2NF
- Not have transitive dependencies
- This means that if column C relates to column B which relates to column A which is the primary key, this is not in 3NF because C is related to the primary key but indirectly (it is a transitive dependency)
Explain Query Clause WHERE
operates on individual records
Explain Query Clause HAVING
used with GROUP BY and aggregate functions to operate on groups ofrecords
added to SQL because the WHERE keyword cannot be used with aggregate functions.
Explain Query Clause ORDER BY
sorts the results set on a particular column (ascending or descending)
Explain Query Clause GROUP BY
used with an aggregate function to combine records together
what is an alias in SQL?
- used to give a table, or a column in a table, a temporary name.
- often used to make column names more readable.
- only exists for the duration of that query.
- created with the
AS
keyword
What are the different joins (8)?
- INNER JOIN
- OUTER JOIN
- LEFT (OUTER) JOIN
- RIGHT (OUTER) JOIN
- FULL (OUTER) JOIN
- CROSS JOIN
- SELF JOIN
- NATURAL JOIN
Describe INNER JOIN
only joins records that match based on join predicate
Describe OUTER JOIN
joins all records in both tables, will have null as value if doesn’texist
Describe LEFT [OUTER] JOIN
joins all records in left table, plus any records that match joinpredicate on right
Describe RIGHT [OUTER] JOIN
joins all records in right table, plus any records that match joinpredicate on left
Describe FULL [OUTER] JOIN
Returns all rows from both tables specified including the ones which had null values on either side
Describe NATURAL JOIN
joins tables based on columns that have the same name in both tables
Describe CROSS JOIN
cartesian product of records in both tables, i.e. each record in one table paired with each in another (results in M x N records)
Describe SELF JOIN
joins a table with itself, FK points to PK of same table - e.g. employees table has reportsTo column for referencing manager in same table
What are joins?
- Used to combine two or more tables, joins are a database technique used in SELECT statements.
- Joins are normally performed comparing primary keys to foreign keys, however, they can be performed with any type of column, as long as the types match.
- Joins are a way of denormalizing a set of tables in order to aggregate or process some query.
What is the difference between joins and set operators?
Joins combine tables horizontally
Set operators combine results sets vertically
What are the 4 set operators?
UNION (ALL) - combines results sets, removes duplicates; like UNION but allows duplicates
INTERSECT - returns the records that are in common between the two results sets
MINUS - Removes from the first result set any rows that appear in the second result set and returns what remains
EXCEPT - Same as MINUS, but for SQLServer instead of Oracle
What is SQL?
- Structured Query Language
- allows us to interact with and manipulate relational databases
How is data structured in a SQL database?
Data is inserted as records, or rows, in a table. A database can have many tables that are related to each other. Each table has different columns, each with its own data type and possibly constraints.
What is an ERD? How is it useful?
Entity relationship diagram - used to model and show relationships between different tables in the database
Explain the concept of referential integrity
- Every foreign key in the database points to a valid, existing record
- If this is broken, we have what is called an “orphan record”
What is a cascade delete?
- Setting that allows referenced records to be deleted as well, if record being deleted has a foreign key
- Ensures referential integrity, prevents orphaned records
What is the difference between DELETE, DROP, and TRUNCATE commands?
DROP removes a table entirely; DELETE will remove specific rows (unless no WHERE clause); TRUNCATE will empty a table of all data but leave the schema intact
DROP and DELETE can be rolled back; TRUNCATE cannot
What are some SQL clauses you can use with SELECT statements (8)?
- WHERE
- HAVING
- AND
- OR
- GROUP BY
- ORDER BY
- LIMIT
- LIKE
What is an index?
allow databases to speed up the process of retrieving data (like an index of a book, know where to go for info)
What is the purpose of a view? What about an index?
View is like a stored/cached query, useful for removing sensitive fields when giving access to a table
Index increases performance of queries when done on the column with the index
What’s the difference between a clustered and non-clustered index?
Clustered - written to disk, only one allowed per table
Non-clustered - uses references in memory, multiple allowed
What is a trigger? Give the syntax for creating a trigger.
Trigger is a special type of stored procedure that runs when an event happens in the database
How would you setup a primary key that automatically increments with every INSERT statement?
- Postgres >10: use the IDENTITY keyword
- Oracle 11g: use trigger and sequence together so that sequence increments on insert
- Oracle 12c: IDENTITY keyword now exists
What is the difference between scalar and aggregate functions? Give examples of each
Scalar - operates on a single value, e.g. UPPER, LOWER, TRIM, CONCAT, DATE
Aggregate - operates on multiple values, e.g. MIN, MAX, AVG, SUM
What’s the difference between implicit and explicit cursors?
- Implicit cursor created automatically for you when you run a query
- Cursor initially points to before the first record in a resultset
- Explicit cursors must be created manually but give more control
Write a query to find all data in the table Employees
SELECT * FROM employees
Write a query to find employees with a salary over 75
SELECT * FROM employees WHERE salary > 75
Write a query to find employees whose first name contains an ‘e’ or whose last name begins with ‘S’\
SELECT * FROM employees WHERE firstName LIKE ‘e’ OR lastName LIKE ‘S’
Write a query to find the first name of all employees who do not work in accounting
SELECT firstName FROM employees WHERE dept != ‘Accounting’
Write a query to find the highest paid salesperson
SELECT * FROM employees ORDER BY salary DESC LIMIT 1
Write a query to combine the resultsets of any two previous queries
Must use UNION or UNION ALL
Steps for creating a database connection
- Register the JDBC driver
- Open a connection using:
- Database URL
- Username
- Password - Execute some SQL statement using either:
- Statement
- PreparedStatement
- CallableStatement - Retrieve the results that are returned in a ResultSet object
Explain JDBC String
The database URL is an address pointing to the database to be used
The String is made up of the JDBC URL, your username, and password
Connection conn = DriverManager.getConnection(DB_URL,USERNAME,PASSWORD)
Explain Autocommit Mode
By default, when a connection is created it is in auto-commit mode, so every SQL statement acts as a transaction and is committed immediately after execution. In order to manually group statements into a transaction, simply call:
Connection conn = DriverManager.getConnection(DB_URL,USERNAME,PASSWORD);conn.setAutoCommit(false);
Explain Data Access Objects (design pattern)
- logically separate the code that accesses the database into Data Access Objects
- to use: define an interface which declares methods through which the database will be queried. Then, concrete implementation classes can implement the interface and contain the data access logic to return the required data
Explain PL/SQL - Triggers
stored programs, which are automatically executed or fired when some events occur
Explain PL/SQL - Cursors
holds the rows (one or more) returned by a SQL statement
Explain PL/SQL - Procedures
do not return a value directly; mainly used to perform an action
Explain PL/SQL - Sequences
an object that is used to generate a number sequence
this can be useful when you need to create a unique number to act as a primary key