midterm Flashcards
math operators
*
/
**
logical operators
AND
OR
NOT
between
min and max value
WHERE colName BETWEEN val1 AND val2
in clause
compare a value to a list of literal values that have been specified
WHERE colName IN (val1, val2, val3, val4)
LIKE clause
wildcards
% represents zero or more characters
_ represents a single character
WHERE FIrst_Name LIKE ‘Herma%’
DISTINCT clause
eliminates _____ records, fetching only ___ unique records when using a select statement
duplicate, unique
subquery syntax
SELECT emp_ID, name
FROM employees
WHERE salary =
(SELECT MAX(salary) FROM employees);
let you use an aggregate function in the where clause
types of joins (6)
inner join
left join
right join
full join
cross join
self join
input
V_name := ‘&name’;
NCHAR vs VARCHAR
when you insert a string shorter than the defined length into an NCHAR column, the database pads the string with spaces to math the defined length
varchar does not do this
NUMBER vs pls_int
number stores more than int
if statements syntax
IF condition THEN
statements;
ELSIF condition THEN
statements;
ELSE
statements;
END IF;
case statement syntax
CASE l_num
WHEN 11 THEN
statements;
WHEN 10 THEN
statements;
ELSE
statements
END CASE;
_________________________-
CASE
WHEN l_num > 11 THEN
statements;
WHEN l_num =10 THEN
statements;
ELSE
statements;
END CASE
pls_integer vs simple_integer
pls_intger can store null values, simple_integer cannot store null values, it does not raise an overflow exception and it wraps around to zero
while loop syntax
WHILE condition LOOP
statements
END LOOP;
IN vs OUT parameters
IN - read only
OUT - can change the value (write)
EXCEPTIONS
syntax
zero_divide
case_not_found
no_data_found
rowtype_mismatch
too_many_rows
value_error
EXCEPTION
WHEN ZERO_DIVIDE THEN
DBMS_OUTPUT>PUT(‘error message’);
END:
how to define an exception
DECLARE
varName EXCEPTION;
EXCEPTION
WHEN varName THEN
print statement
END;
for loop syntax
FOR i IN 1 .. 20 LOOP
statments;
END LOOP;
cursor declare syntax
declare
CURSOR cr IS SELECT cust_ID, cust_first, cust_last FROM SH.customers
cursor attributes
%ISOPEN
%NOTFOUND
%FOUND
%ROWCOUNT
4 steps in using an explicit cursor
declare
open
fetch
close
when fetching to a record or list of variables, the variables should be ___________
in the same order as the variables in the cursor
declare record
TYPE rec_cust IS RECORD(l_id NUMBER, l_first VARCHAR(20));
– instance of record
Cust rec_cust
create a table of records
create table using existing table structure
TYPE arr_cust IS TABLE OF rec_cust;
–instance of arr_cust
l_cust arr_cust
TYPE arr_cust IS TABLE OF sh.customers%ROWTYPE;
l_cust arr_cust;
bulk update syntax
OPEN cr;
FETCH cr BULK COLLECT INTO l_cust;
CLOSE cr;
FORALL i IN 1 .. l_cust.LAST
UPDATE MyCustomer
SET Cust_city = ‘Toronto’
WHERE cust_ID = l_cust(i).cust_id;
stored procedures, functions and triggers ___ ___ more efficiently and effectively than SQL
manipulate data more efficiently and effectively than SQL
functions and procedures do not remain in the database after it is created (t/f)
f, remains in the database until it is dropped
stored procedures syntax
CREATE OR REPLACE PROCEDURE hello(l_name IN VARCHAR2)
IS
l_greeting VARCHAR2(24);
BEGIN
statement
END;
how to execute a procedure
EXECUTE hello(‘myName)
how to call a procedure (3)
CALL procedureName(params);
call from another script
BEGIN
procedureName(params);
END;
association operator
BEGIn
procedureName(param2 => val2, param1 => val1);
procedure notations
positional notation(the same order)
named notation
greeting(name => ‘John’, text => ‘dear’)
mixed notation
positional first, named second
INOUT meaning
both readable and writable
select and bulk collect syntax
SELECT * BULK COLLECT INTO l_tbl
FROM sh.customers;
why do we use cursors
steps
lets you manipulate the data before inserting into a table
use cursor to collect data
put data into created table
put data into database table from created table
function vs procedure
A function ALWAYS returns 1 value
A function cannot make changes to the database (INSERT, UPDATE, DELETE)
a procedure does not have to return a value
It rarely makes sense to use ____ ____ for a function
why
output parameters
cannot edit the data
function syntax
CREATE OR REPLACE FUNCTION functionName(varName IN NUMBER)
RETURN NUMBER
IS
varName2 NUMBER(12,2)
BEGIN
statements
END;
A ___ is a named block of PL/SQL code that is executed or fired automatically when a particular type of SQL is executed
mostly used with ____, ___, ___
can be used with DDL statements such as ____, ___, ___
trigger
insert, update, delete
cerate, alter, drop
trigger syntax
CREATE OR REPLACE TRIGGER triggerName
BEFORE/AFTER/INSTEAD OF
INSERT/UPDATE/DELETE
OF colName ON tableName
FOR EACH ROW
BEGIN
statements
END;
:NEW.colName to reference new value
Dynamic SQL composes a DML or DDL statement at ___ ___
useful when:
you need to run a ___
your sql can only be formed at __ __
run time
ddl
run time
dynamic sql
CREATE OR REPLACE PROCEDURE example(employee_id NUMBER) IS
sqlStatement VARCHAR2(100);
empName VARCHAR(20);
BEGIN
sqlStatement := ‘SELECT first_name FROM EMPLOYEES WHERE emp_ID = :id’;
EXECUTE IMEEDIATE sqlStatement INTO empName USING employee_id
END;
dynamic sql
alter table, add constraint syntax
sqlStatement := ‘ALTER TABLE ‘ || tableName || ‘ ADD CONSTRAINT PK_’ || tableName ||
‘ PRIMARY KEY (‘ || colName || ‘)’;
EXECUTE IMEEDIATE sqlStatement;
make sure to add appropriate spaces to the string
package syntax
CREATE OR REPLACE PACKAGE pkg AS
PROCEDURE procedureName1(params);
PROCEDURE procedureName2(params);
END pkg;
CREATE OR REPLACE PACKAGE BODY pkg AS
PROCEDURE procedureName1(params) IS
BEGIN
statements
END procedureName1);
PROCEDURE procedureName2(params) IS
BEGIN
statements
END procedureName2);
END pkg;
how to use a procedure in a package
pkgName.procedureName(params)
Integration is the process of ____ data from different __ __ to support business needs
consolidating
data sources
There are three techniques that form the building blocks of
any integration approach:
* Data ___
* Data ____
* Data __
Consolidation
Federation
Propagation
data consolidation
Usually done for _____ ___ using Extract Transform Load (ETL) techniques
___-___ databases
flexibility: ____
requires constant __ ___ to refresh the warehouse
Data Warehousing
read only databases
not flexible (has a fixed schema)
batch processes
Data warehousing is the process of constructing and using a data warehouse.
A data warehouse is constructed by ___ data from multiple ___ __that support analytical reporting, structured and/or ad hoc queries, and decision making
Warehousing is to ___ data in one database that is optimized
for __
integrating
heterogeneous sources
consolidate, reads
Data federation is the creation of a ___ database that ___ data from distributed sources giving them a ___ ___ ___.
It is an approach to data integration that provides a single source of data for front end applications
flexibility: __
___ on demand
provides ___ views to consolidated data
only copies or transforms data when need
can affect the performance of the transactional ___ databases
virtual, aggregates, common data model
flexible
consolidation
virtual
OLTP
ETL
what is it
Is a process that is used to integrate data or move data from
one source to another
Data propagation
Is the mechanism of copying data within the ___ __. By sharing data among cases, you save time and provide relevant information to caseworkers.
Data propagation is not limited to subcases. Data can also be propagated when creating spin-
off cases. Data propagation happens on __ __
data is ___ across the database
can be ___ (data is consistent) or ___
too many ___
case hierarchy
case creation
replicated
synchronous or asynchronous
duplicates
ETL extract
Process of getting data from the data sources
Usually the data will be extracted to a _____ repository (staging
database). It is a common practice to first load the extracted data into an intermediate storage area. This intermediate storage is often referred to as a “staging database” or “temporary repository
___ extract is when the whole data is copied
temporary
Static
Incremental extract is when the deltas(mean new or modified records)- the changes are copied
ETL - cleanse extracted data
Integrating different sources of data exposes ___ issues in
the original data sets
Extract process should have processes that expose and reject
___ ___
Extract process must not __ the errors. Instead it will send an
error report to the owners of the data source to fix the data at
the source
quality
erroneous data
fix
ETL - Transform - record level
- ____: Restricting rows using the WHERE clause in the select statement)
- ___: Projecting columns from more than one table)
- ___: think of it as the opposite of joining. Convert tables that are not in the third normal form to smaller tables that do not have redundancies
- ___: converting data from more detailed to summary statistics on the data
Selection
Joining
Normalization
Aggregation
ETL - TRANSFORM – FIELD LEVEL
___ ___ change
Use functions and algorithms to ___ data from source to
destination
data type change
change
(for example change US dollars to Canadian dollars)
ETL - load
___ the transformed data to its final destination (usually data warehouse or a data mart)
build ___ if required
Insert
Data mart is a subset of __ __
uses ___ ___
centered around ___ ___
data warehouse
use dimensional models
one subject (sale, inventory)
data warehouse
___ Oriented: is focused on business entities
___: Comes from multiple systems and need to fit one structure
____-___: show change over time
____ Only: can not be updated. Has to be reloaded or refreshed
Subject
Integrated
Time-Variant
Read
need for data warehouse
when we need to integrate data from multiple systems
Separating reporting databases from operational databases
(OLTP) enhances the performance
NEED FOR DATA WAREHOUSE
star schema
facts table: has ___ values, contains ___ ___ of dimension tables
dimension tables: has ___ and their descriptions
all primary keys of the dimension tables should be ___ keys (system generated
keys)
quantitative, foreign keys
categories
surrogate
star schema
grain of the fact table
The minimum unit of reporting should be an ________________
The grain is a crucial concept in data warehousing, influencing how data is ___ and ____ in conjunction with dimension attributes
Think of the fact table as an associative entity.
This means that its primary key is a composite key that contains all the foreign keys of the entities that associates it which is the dimension
tables
intersection of all the dimensions
aggregated and analyzed
how to create and use a star schema syntax
–create tables
CREATE TABLE Fact_Emp(
J_ID NUMBER,
P_ID NUMBER,
D_ID NUMBER,
SALARY NUMBER(10,2),
Commission_PCT(4,2)):
CREATE TABLE Dim_Job(
J_ID NUMBER,
Job_ID VARCHAR2(30),
Job_title VARCHAR2(40));
CREATE TABLE Dim_Period(
P_ID NUMBER,
P_Year NUMBER);
CREATE TABLE Dim_Department(
D_ID NUMBER
Department_ID NUMBER,
Department_name (VARCHAR2(40));
CREATE SEQUENCE Dim_ID;
INSERT INTO Dim_Job
SELECT Dim_ID.NEXTVAL, Job_ID, Job_title
FROM hr.jobs;
– in the select, it is all the fields other than the first ID
create table of fact_table using %rowtype, create instance
create a cursor than selects all the fields that the fact table has, inner join all tables, group by the dimension ids
FROM hr.employees e INNER JOIN Dim_job j
ON e.jobID = j.JobID
INNER JOIN Dim_department d ON e.departmentID = d.DepartmentID
group by J-ID, P_ID, D_ID;
open cr, bulk collect into created table, clsoe cr.
use forall to insert the data in the created table into the fact table