Quiz 2 Flashcards
Physical Modeling Purpose
Translating the logical description of data into the technical specifications for storing and retrieving data
Physical Modeling Goal
Creating a design for storing data that will provide adequate performance and insuring database integrity, security and recoverability
Field
Smallest unit of data in a database
Field design
choose data type; coding, compression, and encryption; controlling data integrity
Four objectives when choosing data types
represent all possible values, improve data integrity, support all data manipulations, minimize storage space
Numeric data types
INT, SMALLINT, DECIMAL, NUMERIC(precision,scale)
Approximate numeric
FLOAT, REAL, DOUBLE
Data and Date time
DATE, TIME, DATETIME, TIMESTAMP, YEAR
String data
CHAR, VARCHAR(length), BINARY, VARBINARY, BLOB, TEXT, ENUM, SET
Integrity controls
data type, range control, default values, null value control, referential integrity
Range Control
data integrity constraint which indicates the upper and lower bounds for the field.
Default values
where one specifies the value that the field will take if no data has been entered.
Null value control
allows one to specify whether null values are allowed or not in the field.
Referential integrity
ensures that any value entered into a foreign key must have a corresponding value in the primary key of the related table. This ensures that there are no misplaced values in the foreign key.
Denormalization
Transforming normalizedrelations into unnormalizedphysical record specifications
Benefits of denormalization
Can improve performance (speed) by reducing the number of table lookups (i.e., reduce number of necessary join queries)
Costs of denormalization
wasted storage space, data integrity/consistency threats
Situations where we would denormalize
two entities with a one-to-one relationship and high access frequency between them;
associative entity with nonkey attributes. If there is a high frequency of join operations between the entities, it might be more efficient to combine the associative entity with one of the other entities;
reference data. If we have a 1:M relationship and the entity on
the one side does not participate in any other relationships
Horizontal partitioning
Distributing the rows of a table into several separate files; Useful for situations where different users need access to different row
Vertical partitioning
Distributing the columns of a table into several separate files; Useful for situations where different users need access to different columns; The primary key must be repeated in each file
Advantages of partitioning
Efficiency, local optimization, security, recovery and uptime, load balancing
Disadvantages of partitioning
Inconsistent access speed, additional complexity for the programmer, extra space or update time
Many points about data replication
Improves performance by allowing multiple users to access the same data at the same time with minimum contention
sacrifices data integrity due to data duplication
best for data that is not updated often
sequential file organization
Records are stored sequentially according to a primary key value
Indexed file organization
Records are stored sequentially or nonsequentially, and an index is created that allows the application software to locate individual records.
Hashed file organization
In a hashed file organization, the address of each record is determined using a hashing algorithm.
SQL
Structured Query Language, developed by IBM
DDL
Data definition language. Commands that define a database, including creating, altering, and dropping tables and establishing constraints
DML
Data Manipulation Language. Commands that maintain and query a database
DCL
Data Control Language. Commands that control a database, including administering privileges and committing data
DDL Commands
Create, Alter, Drop, Rename
DML Commands
Select, Update, Insert, Delete
DCL Commands
Grant, Revoke
DDL is used for
physical design and maintenance
DML is used for
implementation and maintenance
DCL is used for
implementation and maintenance
T/F SQL statements are case sensitive
FALSE
T/F SQL statements can be on more than one line
TRUE
Each SQL statement ends in a
semicolon
T/F keywords cannot be split across lines
TRUE
Table commands and SQL type
DDL, create drop alter
Index, View, and Schema commands and SQL type
DDL, create and drop
Write a Table Syntax
CREATE TABLE table_name (
column1 datatype [NOT NULL] [UNIQUE] [DEFAULT default-value], columns 2 datatype [NOT NULL] [UNIQUE],
PRIMARY KEY (column_name),
FOREIGN KEY (column_name) REFERENCES table_name(column_name)
);
How would you do a range check?
At the end of the statement CHECK (AGE >0 and AGE <150)
What are all of the alter funtions
add a new column, drop an existing column, rename a column, set a default for an existing column, change starting value of auto increment
Define a view
A virtual table based on the result-set of an SQL statement. Instead of sending the complex query to the database all the time, you can save the query as a view and then use SELECT * FROM view_name to get its rows
Write a view statement
CREATE VIEW view_name AS
SELECT column1, column2, …
FROM table_name
WHERE condition;
Add a new column
ALTER TABLE table_name
ADD column_name datatype;
Drop a column
ALTER TABLE table_name
DROP COLUMN column_name, DROP COLUMN column_name;
Rename a column
ALTER TABLE table_name
RENAME COLUMN old_column_name TO new_column_name;
Set a default value for a column
ALTER TABLE table_name
ALTER COLUMN column_name SET DEFAULT default-value;
Change the starting value of auto-increment
ALTER TABLE table_name AUTO_INCREMENT=5;
Insert a record/row into a table
INSERT INTO tb_Student (StudentName, StudentAddress, Age)
VALUES
(‘Jerry’, ‘UC Dorm’, 18),
(‘Alice’, ‘UC Dorm’, 20);
Update data in a table
UPDATE tb_Student
SET SeniorMentorID = 1
WHERE StudentName = ‘Jack’;
Delete data based on a condition
DELETE FROM tb_Student
WHERE StudentName = ‘Alice’;
Select statement order
FROM, WHERE, GROUP BY, HAVING, ORDER BY
SELECT definition
identifies what columns
FROM definition
identifies what table
WHERE definition
imposes constraints for which rows show up
ORDER BY definition
sort results rows in ascending (default) or descending (desc) order
Write a simple select statement example
SELECT CustomerName, Phone
FROM CUSTOMER
WHERE CITY = ‘Atlanta’ AND STATE = ‘GA’
ORDER BY CustomerName;
=
equal to
<
less than
<=
less than or equal to
>
greater than
> =
greater than or equal to
<>
not equal to
!=
not equal to
List the logical operators
NOT, AND, OR
List the processing order of logical operators
NOT, AND, OR
list the special operators
between, is null, like, in, exists
Write an IN statement
SELECT PID, Name, State
FROM NBAPlayer
WHERE State IN (‘TX’, ‘GA’, ‘VA’);
Is equivalent to multiple OR statements
write a not in statement
SELECT PID, Name, State
FROM NBAPlayer
WHERE State NOT IN (‘TX’, ‘GA’, ‘VA’);
equivalent to multiple != and statements
write a between statement
SELECT PID, Name, State
FROM NBAPlayer
Where PID BETWEEN 3 and 5;
equivalent to Where PID >=3 and PID<=5;
IS NULL
list of records with missing values
IS NOT NULL
list of records with no missing values
LIKE
records matching a string pattern
like statement for beginning with a letter
LIKE ‘A%’
like statement for ending with a letter
LIKE ‘%A’
LIKE statement for containing a letter
LIKE ‘%A%’
like statement for record containing one letter before the A
LIKE ‘_A’)
List the aggregate functions
count, sum, avg, max, min
COUNT
the number of rows containing the specified column (attribute)
SUM
the sum of all values for a selected column
AVG
the arithmetic mean (average) for the specified column (attribute)
MAX
the largest value in the column
MIN
the smallest value in the column
What is the average age and maximum age of students
SELECT AVG(Age) AS AverageAge, MAX(Age) AS MaxAge
FROM tb_Student;
How many of the students live in UC Dorm
SELECT COUNT(*)
FROM tb_Student
WHERE StudentAddress = ‘UC Dorm’
Show each student’s name and age difference from the average
SELECT
StudentName,
Age - (SELECT AVG(Age) FROM tb_Student) AS AgeDiff
FROM tb_Student;
DISTINCT
designed to produce a list of only those values that are different from one another
List distinct student addresses
SELECT DISTINCT StudentAddress
FROM tb_Student;
T/F SQL does not allow the use of DISTINCT with COUNT(*) together
TRUE
GROUP BY
There are circumstances where we would like to apply the aggregate
function not only to a single set of rows, but also to a group sets of rows.
GROUP BY example
SELECT State, AVG(PerHour)
FROM NBAPlayer
GROUP BY State;
HAVING definition
like where but used with group by
HAVING example
SELECT City, State, Avg(PerHour)
FROM NBAPlayer
GROUP BY City, State
HAVING COUNT( * ) >= 3;
Example of a full select with where, group by, and having
SELECT City, COUNT()
FROM NBAPlayer
WHERE State = ‘TX’
GROUP BY City
HAVING COUNT() >= 3;