Terms To Remember Flashcards
What is a row called in a database?
A Record
What is a column called in a database?
Field
Databases: what function do they provide?
They organize the information we receive so that we can use it
Tables
A collection of rows and columns. A database can contain more than one table.
The structure of tables in the database?
Schema. The database’s schema includes the information about the layout of tables and other information about the database itself.
Relational Database
Tables are made up of rows and columns. Columns are the relation.
What are common column data types?
Number, text, date, true/false value, binary data
Composite Key
Two or more fields taken together to act as a unique identifier
Foreign Key
Primary key from one table when referencing another table
What are the three types of Database Relationships?
One-to-many
Many-to-many
One-to-one
Transaction
A set of operations that all must be completed
What does ACID stand for in relation to transactions?
Atomic
Consistent
Isolated
Durable
What does Atomic mean?
Means that the transaction is indivisible, that pieces of it can’t be separated out.
What does consistent mean?
Whatever the transaction does, it needs to leave the database in a valid or consistent state. Can’t violate integrity.
What does Isolated mean?
While the activities in the transaction are being completed, nothing else can make changes to the data involved
What does Durable mean?
Information actually gets written to the database.
SQL
Structured Query Language
What does SQL do?
Allows statements to be written for DBMS to interpret how to interact with data (DML)
Offers feature to manage the database itself, such as creating or modifying tables and controlling access to tables. (DDL and DCL)
What does CRUD stand for?
Create
Read
Update
Delete
A relationship connects two pieces of data in different _____ in the same _____.
tables; database
Which is a good example of a candidate key?
an employee ID number. Any piece of data that uniquely represents a row is a candidate key, and if you have a value that occurs in the data naturally, that’s a natural key.
How many SQL clauses are in this query?
SELECT Width,Height FROM Shapes;
two. Each keyword, SELECT and FROM, defines a single clause.
In a database, what is a relation?
a set of attributes (columns) that describe information about specific instances (rows) of an entity
This was the correct answer
When is an associative table useful?
when records need to be related in a many-to-many relationship. An associative (or linking) table relates foreign keys from different tables to associate their records.
What does the term transaction mean?
A transaction is a collection of steps that must all be completed in order for a change to be made to the database.
If you reference a key from Table A in Table B, what is that value in Table B?
a foreign key. Because it refers to a key in another table, in this context, the value is called a foreign key.
In a database where you keep track of records for a school, what tables should you expect to find?
Each of these tables holds a different kind of information, and because the tables can store more than one record, the table names are pluralized.
When talking about data types, what do you call the group of types that represent text?
string types. There are various kinds of string types to accommodate text of different lengths.
To store the value 4:32PM, December 27, 2019, which data type would you use?
DATETIME. This is the correct type, because it includes both a date component and a time component. The other options here do not.
Which condition represents a NULL value?
a date cell containing no data. A cell, regardless of its data type, is NULL when it has no value.
If you don’t use a number type to store numeric data, _____.
you need to take additional steps to process the data as a number whenever you use it. Storing numeric data in numeric data types gives you the ability to work with numbers directly, as you might do when you use mathematical operations in queries.
When planning a database, what do you start with?
an Entity Relationship (ER) diagram. Using an ER diagram, you can plan out what fields will appear on which tables, and how they’re related.
Three Relationships?
one to one
one to many
many to many
Referential Integrity
Databases are aware of relationships and won’t allow a user to modify data in a way that violates those relationships.
Which is an example of referential integrity?
preventing the user from entering a record that refers to nonexistent data
Defining relationships helps you to do what?
reduce the repetition of data across tables
model real-world scenarios and requirements
understand how your data should be stored
What is it called if you delete a record and the database goes on and deletes other records associated with that record?
a cascading delete
In a one-to-many relationship, the value representing the ‘many’ side is what?
foreign key
When modeling a many-to-many relationship, how should you name the linking table?
with a combination of the names of the tables it’s linking
When you need to create a many-to-many relationship, what do you need to generate?
a linking table that has a one-to-many relationship with two or more tables
Which scenario represents a one-to-many relationship?
bank customers linked to their bank accounts. A certain customer can have multiple accounts, but a certain account can belong to only one customer.
A one-to-one relationship _____.
allows only one record to be connected to only one other record
First Normal Form (1NF) Normalization Rule?
Values in each cell should be atomic and tables should have no repeating groups.
Second Normal Form (2NF) Normalization Rule?
No value in a table should depend on only part of a key that can be used to unique identify a row
Third Normal Form (3NF) Normalization Rule?
Values should not be stored if they can be calculated from another non-key field
Denormalization
The process of intentionally duplicating information in a table, in violation of normalization rules
If you can figure out the value of one non-key field in a row by looking at another non-key field in that same row, what do you violate?
Third Normal Form
In order to put a database into Third Normal Form, _____.
it must also be in First and Second Normal Form
First Normal Form tells you to do what?
remove repeating groups
When might you choose to denormalize a table?
Retrieving the data upon request would be slow or burdensome, and you are able to pre-calculate or store a copy of the data somewhere it can be retrieved faster.
A table has two rows with the same values in all columns. Which step can you take to have this table meet the first normal form (1NF) requirements?
Add a primary key to the table. The primary key will add a unique value for each row, and thus eliminate the repeating duplicate rows issue.
What is the foreign key in the table created after this command?
CREATE TABLE Models (
ModelID INT(6) NOT NULL AUTO_INCREMENT,
Color INT(6) REFERENCES Colors(ColorID),
PRIMARY KEY(ModelID)
);
Color
Which SQL command will you use to create a new database called “mydb”?
CREATE DATABASE mydb;
Which of these is not an example of a time when you would use an aggregate function?
looking up customers with the name Rafael Montresso. Aggregate functions are used to tell you about certain characteristics of multiple records. Requesting a record by specific field values is not an aggregate operation.
Which WHERE condition can you use to find all records containing a first name starting with the letter “A”?
WHERE FirstName LIKE “A%”;
When using an aggregate function, how many results do you expect?
Aggregate functions return one value that describes a set of data.
What should come instead of the ??? placeholder for this query to return all fields and records in the table?
SELECT ??? FROM mytable;
*
The asterisk is used as a wildcard to retrieve everything from the given table.
True or False? In any given query, you can only join together a maximum of two tables.
FALSE. You can join together many tables as long as you tell the database which pairs of values on the tables are intended to match.
In order to use records from more than one table in a query, you need to _____ the tables based on some matching criteria.
join
Joining tables allows you to match rows from one table with rows on another table.
If a table is set to auto-increment the primary key, you’ll need to know the next value and set it manually when you enter a record. True or false?
FALSE.
When the database automatically increments a key field, you don’t need to worry about setting the value. The database will provide the next value in the sequence automatically.
When modifying a record, it’s a good idea to specify the record _____.
as precisely as possible, ideally using the primary key
True or false? In order to sort results based on a field, that field needs to appear in the final output.
FALSE
A field used for sorting doesn’t need to appear in the ‘SELECT’ clause of the query.
A SQL statement that returns requested records from the database is called:
a SQL query
Where can you write SQL?
in database management software
in an app’s source code
at a command-line console
You can narrow down the results that a query returns by only asking for results where a _____ matches a given value.
field
Which of these tasks can you accomplish using SQL as a DML?
inserting a record into a table
When first defining a table, what should you specify?
A. the table’s name
C. the fields and type of data they contain
E. the primary key and any referential constraints
Which sorting option shows dates from latest to earliest?
ORDER BY Date
DESC
When you use SQL statements to create or modify the structure of a database, what is SQL being used as?
a Data Definition Language (DDL)
As a Data Definition Language, SQL can be used to create and modify the structure of database tables.
What is the correct SQL syntax to use when joining tables A and B on their “ID” field?
SELECT * FROM A
JOIN B ON A.ID=B.ID
When telling the database that a certain field must not contain an empty value, you say that it is:
not null
For a table that holds the purchase amounts in a grocery store over time, which query will likely return the highest value?
SELECT SUM(amount) FROM purchases;
What is the possible issue with this query?
UPDATE mytable SET price=5;
It may update undesired records.
Without a WHERE condition, this query will update all records including possibly undesired ones.
Which database is often used in a big data context?
Hadoop and Spark are often used for big data applications.
Microsoft Access is generally considered a(n) _____ database platform.
desktop
Desktop databases are typically hosted on a workstation rather than a dedicated server, and they’re designed to support a few to a few hundred users.
Relational databases can store all of these except what?
graph data
Some NoSQL databases are designed to store data arranged as graphs rather than in relations.
A stored procedure is _____.
a predefined query or statement
Stored procedures are queries that are stored on the server, and they can be called by developers or users in their queries.
An index _____.
helps to increase the speed of lookups using a particular column at the cost of speed while modifying records
As with denormalization, indexes offer a trade off.
What is it called when a malicious user tries to change the way a SQL statement works by entering their own SQL?
injection
This kind of attack involves someone injecting their own code into an application.
If you store certain kinds of information, your database may be subject to certain compliance regulations. True or false?
TRUE
If you store personally identifiable information (PII), health information, or some other kinds of information, your database may be subject to various regulations. Be sure to do your research and stay in compliance!
What is a database transaction?
a group of statements that runs or fails as a whole
Transactions combine multiple statements into a single logical block, where one failing statement rolls back the entire action.
What is a table?
A table stores and displays data in a structured format consisting of columns and rows that are similar to those seen in Excel spreadsheets.
What does SQL do?
SQL is used to access and manipulate a database.
It can be used to select & filter, add and remove data and tables, and much more.
The SELECT statement
used to select data from a table.
*
selects all columns of the table.
in order to select everything from the Customers table, we use the following:
SELECT * FROM Customers
Instead of *, we can specify a column that we want to select.
For example, let’s select only the City column:
SELECT City FROM Customers
We can specify multiple columns in the SELECT statement by separating them with commas.
For example:
SELECT firstname, lastname FROM Customers
You can run multiple SQL queries, by separating them with semicolons.
For example:
SELECT firstname, lastname
FROM Customers;
SELECT firstname, city, age
FROM Customers;
SQL is case insensitive.
The following statements are equivalent and will produce the same result:
select city from Customers;
SELECT City FROM Customers;
sElEct ciTY From Customers;
Common practice is all upper case
The ORDER BY command
used to sort the data in a SELECT statement.
By default, ORDER BY sorts the results in
ascending order. (A to Z)
We can specify the ordering using the
ASC and DESC keywords.
SELECT * FROM Customers
ORDER BY firstname DESC
lets sort our Customers data using the city column, followed by the lastname column:
SELECT * FROM Customers
ORDER BY city ASC, lastname DESC
The SELECT statement returns all the rows in the table.
In case we only need a specific number of rows from the table, we can use
the LIMIT keyword.
let’s select only the top 3 rows from our table:
SELECT * FROM Customers
LIMIT 3
We can combine the LIMIT with ORDER BY.
For example:
SELECT * FROM Customers
ORDER BY city DESC
LIMIT 3
Remember, the LIMIT keyword has to come after the ORDER BY in the SELECT statement.
We can also select rows starting from an offset.
For example, let’s select 3 rows starting from the 3rd:
SELECT * FROM Customers
ORDER BY city DESC
LIMIT 3 OFFSET 2
The first row has the offset 0, which is why OFFSET 2 will start from the 3rd row in the result.
We can remove the duplicates using the
DISTINCT keyword.
To select only specific rows that satisfy a condition
WHERE command can be used.
For example, let’s select the rows that have the age column greater than 30:
SELECT * FROM Customers
WHERE age > 30
The following comparison operators can be used in the WHERE condition:
> greater than
< less than
= greater or equal
<= less or equal
o check for equality, the
equal to operator = can be used:
WHERE age = 42
To compare with a text value, put the
text value in single quotes.
WHERE firstname = ‘James’
The not equal operator is
<>
The BETWEEN operator can be used to
Select rows that are in a range of values.
SELECT * FROM Customers
WHERE age
BETWEEN 30 AND 50
The AND operator
combines two conditions and is satisfied if both conditions are satisfied.
SELECT * FROM Customers
WHERE age >= 42 AND age <= 60
the OR operator
satisfied if any one of its conditions holds.
SELECT * FROM Customers
WHERE city = ‘New York’ OR city = ‘Chicago’
the IN operator
Instead of multiple OR conditions for a single column, the IN operator can be used.
SELECT * FROM Customers
WHERE city IN (‘New York’, ‘Chicago’)
The NOT IN operator
allows you to exclude a list of specific values from the result set.
SELECT * FROM Customers
WHERE city NOT IN (‘New York’, ‘Chicago’)
When combining conditions, it is important to
use parentheses, so that the order to evaluate each condition is known.
Any text values in SQL statements need to be enclosed in
single quotes.
SQL supports filtering text based on a search condition.
This can be done using
the LIKE keyword, by specifying a pattern for the text.
SELECT * FROM Customers
WHERE lastname LIKE ‘%s’
For example, let’s select all Customers where the lastname column value ends with an ‘s’:
To match only a single character
the _ underscore symbol can be used.
SELECT * FROM Customers
WHERE firstname LIKE ‘_ames’
This will match all the Customer firstnames that start with any letter, then end with ‘ames’.
NULL is
the absence of value.
This means that any field in our table with no value with have the value NULL.
To check for NULL values
the IS NULL operator can be used.
SELECT * FROM Customers
WHERE age IS NULL
Similarly, we can check for non-null values using
IS NOT NULL
SELECT * FROM Customers
WHERE age IS NOT NULL
the COUNT() function
is used to return the number of records of a select statement
SELECT COUNT(*)
FROM Customers
This will return the number of rows in the table.
SUM()
It returns the sum of the values in a column.
SELECT SUM(salary)
FROM Customers
The AVG() function
returns the average value of a numeric column.
SELECT AVG(salary)
FROM Customers
MIN() and MAX() are used to
return the minimum and maximum value of a column
SELECT MIN(salary)
FROM Customers;
SELECT MAX(salary)
FROM Customers;
the UPPER() function
can be used to convert text to uppercase.
SELECT UPPER(firstname)
FROM Customers
We can provide a new name for the column with the
AS keyword
SELECT UPPER(firstname) AS Name
FROM Customers
SQL allows to group records using the
GROUP BY clause
SELECT city, COUNT(*)
FROM Customers
GROUP BY city
This query will return the number of customers in each city, as well as the city name.
We can filter groups using the
HAVING clause
SELECT city, COUNT() AS c
FROM Customers
GROUP BY city
HAVING COUNT() > 1
ORDER BY c DESC
Creating a basic table involves
naming the table and defining its columns and each column’s data type.
INT
whole number
FLOAT
floating point number
DOUBLE
A double precision floating-point number
DATE
A date in YYYY-MM-DD format
DATETIME
A date and time combination in YYYY-MM-DD HH:MM:SS format
TIMESTAMP
A timestamp, calculated from midnight, January 1, 1970
TIME
Stores the time in HH:MM:SS format
VARCHAR(M)
Variable-length character string. Max size is specified in parenthesis
TEXT
Large amount of text data
The CREATE TABLE command
defined the table name and the columns list in parentheses.
A column can also have a default value, which is applied to the rows when no value is provided for it. It is defined using the
DEFAULT keyword
CREATE TABLE Customers (
id INT,
firstname VARCHAR(128),
lastname VARCHAR(128),
salary INT DEFAULT 0,
city VARCHAR(128)
);
We can also specify if a column can or cannot have NULL values.
For example
CREATE TABLE Customers (
id INT,
firstname VARCHAR(128) NOT NULL,
lastname VARCHAR(128),
salary INT DEFAULT 0,
city VARCHAR(128)
);
Now, when inserting data into the table, the firstname column has to have a value.
When the table is created, it is empty.
We can insert data using the
INSERT command
INSERT INTO Customers VALUES
(1, ‘John’, ‘Smith’, ‘New York’, 5000)
The UPDATE statement allows you to
change values in a table.
UPDATE Customers
SET salary = 9900
WHERE id = 2
It is also possible to UPDATE multiple columns at the same time by comma-separating them:
For example:
UPDATE Customers
SET salary = 9900,
city = ‘New York’
WHERE ID = 2
Similar to UPDATE, the DELETE command is used to
delete rows from a table
DELETE FROM Customers
WHERE ID = 4
The ALTER TABLE command is used to
add, delete, or modify columns in an existing table
For example, let’s add a new column to our Customers table:
ALTER TABLE Customers
ADD age INT
we can delete an entire column with
DROP COLUMN
The ALTER TABLE command can also be used to rename a column
ALTER TABLE Customers
RENAME city TO location
We can rename an existing table using the following
ALTER TABLE Customers
RENAME TO People
To delete an entire table, the following command is used
DROP TABLE Customers
What is a Relational Database?
Stores data in tables
Normalization
The process of building the best database design
Keys
How to make everything unique in a table
Record, Rows, Tuple
All talking about the rows in a table
Columns, Field, Attributes
All talking about the columns in a table
Frontend
What the user sees
Backend
What’s going on behind the scenes
Joins
Connects data from multiple tables
Atomic Value
The value stores one thing
Entity
This would be like a column header “name”
Attributes
This is the stuff that is under the heading, like the actual names themselves; this is the actual information
Foreign Key
The child in the parent to child relationship. It is a reference to a primary key.
Explain the Child and Parent relationship concept in Database Design
Parent would be something like the user name
Child would be the comments that this user makes.
Children inherit the values from their parents
What rules are followed with Keys?
Unique, never changing, never null
Superkey
Any number of columns that create unique rows
Candidate Key
The least number of columns needed to force every row to be unique
Natural Key
Something that you naturally want to store (username and password, for example)
Surrogate Key
Giving an ID to every row in every table
Simple Key
Key consists of one column
Composite Key
Key consists of two or more columns. Most common with natural keys.
Compound Key
Key that has multiple columns, and they’re all keys.
Entity Relationship Modeling
It’s when you draw out your database before you make it
Cardinality
Relationship type between rows (ex: Cardholder and card)
Modality
Whether or not the relationship is required. (ex: not null)
Normalization
When we go through our database and start correcting things that may case problems (integrity, repeating, etc.)
What are the three steps in normalization?
1st Normal Form, 2nd Normal Form, 3rd Normal Form. Can’t do the next one until you’ve done the one before it.
What is second normal form?
Deals with a partial dependency. That’s when a column only depends on part of the primary key. In order for that to happen, you need to have a compound or composite key.
What is third normal form?
Deals with what is called a transitive dependency. Which is when a column depends on a column that depends upon the primary key.
What is an index?
It’s a list of where certain data points are.
What is a nonclustered index?
Just a separate reference that points to the data
What is a clustered index?
organizes the actual data. You can only have one of these. (like a phonebook)
Three main categories of Data Types?
Date, Numeric, and String
What is a string?
it is a combination of characters
What is numeric in data types?
Numbers, no quotes
What is date in data types?
stores a date. Think DATE, TIME, or DATETIME. Usually, colons separate.
Join?
Taking something complex in a database and presenting it in a user friendly way.
What is an inner join?
Going to take rows that are the same from one table and rows that are the same from another table and join them together to make a new table
What is an outer join?
Returning all values; left outer join, right outer join and full outer join
What is an alias?
It’s when you rename something to make it easier to understand using the “AS” command
What are the four components of a database system?
Users
Database Applications
Database Management System (DBMS)
Database
What is a database?
A self-describing collection of related records
What is metadata?
Data describing the structure of the data in a database
What is a DBMS?
Serves as an intermediary between database applications and the database
What is normalization?
It’s a process where we go through our table and try to eliminate the null values
What is a functional dependency?
ex: box price and cookie price; they are dependent upon one another
What is first normal form?
each cell within the table has one value (atomic)
What is second normal form?
Beyond all requirements of the first normal form, it must be fully functionally dependent on the entire primary key. No partial dependencies.
What is the most important concept to remember when it comes to normalization?
All your tables should have to do with one business concept.
Who originally developed SQL?
IBM
SQL is comprised of three components. What are they?
data definition language (DDL), data manipulation language (DML), and data control language (DCL)
What is DDL?
used to define and manage database structures
What is DML?
data definition and updating, and data retrieval (queries)
What is DCL?
used for creating user accounts, managing permissions, etc.
what do you end all SQL statements with?
semicolon
What does the INSERT INTO statement do?
Will add a new row to a table
What does the UPDATE statement do?
Will update the rows in a table which match the specified criteria
What does the DELETE FROM statement do?
Will delete the rows in a table which match the specified criteria
What does the SELECT statement do?
Extracts information you specify from the database.
The DISTINCT keyword may be added to the SELECT statement to
Suppress the display of duplicate rows
The WHERE clause
specifies the matching or filtering criteria for the records (rows) that are to be displayed
The multiple character wildcard character is a
percent sign
The single character wildcard character is
an underscore
Queried data can be categorized by what clause?
GROUP BY
HAVING
filters the categorized results that you get from the select statement
OUTER JOIN
Can be used to obtain data that exists in one table without matching data in the other table
What are the three stages of database development?
Requirements analysis stage, component design stage, implementation stage
Entity Class
this is like the recipe that someone follows for a cake, while the entity instance is actual cake that comes about as a result
What is an attribute?
they describe a specific entity. example are projectName, startDate, projectType, projectDescription
What is an identifier or key?
It identifies a specific instance in the entity class, for example, socialSecurityNumver, studentID, employeeID, emailAddress, departmentId
Unary Relationship
an entity is related to itself (Marriage)
Binary Relationship
one entity is related to another entity (most common) (employee and parking space)
ternary relationship
three entities involved in the relationship (doctor, patient, drug; intersection of those would be a prescription)
Relationships are named and classified by their cardinalities, which is a word that means
Count (as in the number of items in a set)
Minimum cardinality
minimum number of entities that need to participate in a relationship
What is concurrency?
People or applications may try to update the same information at the same time
What is interdependency?
Changes requested by one user may impact others
Atomic Transaction
all of the database actions occur or none of them does. Each step must be successful for the transaction to be saved
Dynamic Cursor
All changes made to the rows in the result set are visible when scrolling through the cursor
What do processing rights define?
Who is permitted to perform certain actions
When certain actions are allowed to be performed
Index
a data structure that contains a copy of some of the data from one or more existing data tables
Indexes are created on one or more ______ in a table
columns
B-Tree Index
Use pointers and several layers of nodes in order to quickly locate desired data
Bitmap Index
A table is created with the values of one attribute along the horizontal axis and the values of another attribute along the vertical axis. Can be really fast.
Hashed Index
a hashing algorithm is used in order to convert an input value into a location within an index (such as a B-tree index), which in turn contains or points to the actual data row.
What are the two categories of Business Intelligence systems?
Reporting systems
data mining applications