Combining, Creating and Managing Tables with PROC SQL Flashcards
In all types of joins, what does PROC SQL generate first?
a Cartesian product (i.e. all possible combinations of values between the two tables)
What happens when you specify multiple tables in the FROM clause but don’t include a WHERE statement?
PROC SQL returns the Cartesian product of the tables (i.e. all possible combinations of the values in the two tables)
How to do an inner join?
proc sql;
select *
from table 1, table 2
where table1.column = table2.column;
quit;
What happens if you do an inner join with tables in which multiple rows have duplicate values of the column on which the tables are being joined?
e.g.
X A X B
2 b1 2 x1
2 b2 2 x2
All possible combinations of the duplicate rows are displayed:
X A X B
2 b1 2 x1
2 b1 2 x2
2 b2 2 x1
2 b2 2x2
How do you assign a table alias?
What are two situations that require a table alias?
proc sql;
select t1.column1 column2
from table1 as t1, table2 as t2
where t1.column1 = t2.column1;
Table aliases are required when
- a table is joined to itself: e.g. from airline.staffmaster as s1, airline.staffmaster as s2
- you need to reference columns from same-named tables in different libraries: e.g. from airline.flightdelays as af, work.flightdelays as wf
What are the types of outer joins?
- Left: all matching rows plus nonmatching rows from the first table specified in the FROM clause (the left table)
- Right: all matching rows plus nonmatching rows from the second table specified in the FROM clause (the right table)
- Full: All matching rows plus nonmatching rows in both tables
Outer join syntax
Note you can do an outer join on only two tables or views at a time.
proc sql;
select column1 colum2..
from table1
left join/right join/full join
table2
on join-conditions;
e.g.
proc sql;
select *
from one
left join
two
on one.x=two.x;
Syntax for creating an inner join with outer join syntax
proc sql;
select column1 column2..
from table1
inner join
table2
on join-conditions;
What are 3 advantages of PROC SQL joins over DATA step match-merges?
- PROC SQL joins don’t require sorted or indexed tables
- PROC SQL joins do not require that the columns in join expressions have the same name
- PROC SQL joins can use comparison operators other than the equal sign, e.g.
proc sql;
selec a.itemnumber, cost, price
from table 1 as a, table 2 as bb
where a.itemnumber = b.itemnumber and a.cost>b.price;
What is the difference between DATA step match-merge and PROC SQL joins when all the values of the merging variable match?
when only some of the values match?
When only some of the values match, how can you make a PROC SQL full outer join similar to a DATA step match-merge?
- No difference
- The DATA step overlays the two common columns by default, PROC SQL does not (so there will be some missing values in the column) see pg. 103
- Use COALESCE function, eg.
proc sql;
select coalesce(three.x, four.x) as X, a , b
from three
full join
four
on three.x = four.x;
What is an in-line view?
An in-line view exists only during query execution
An in-line view is a nested query that is specified in the outer query’s FROM clause. It selects data from one or more tables in order to produce a temporary table that the outer query then uses to select data for output. e.g.
from (select flighnumber, date
from sasuser.marchflights;)
Two potential advantages of using an in-line view?
- The complexity of the code is usually reduced, so that the code is easier to write and understand
- In some cases, PROC SQL might be able to process the code more efficiently
Create a FROM clause with a table and an in-line view and assign as aliases to both of them. Don’t forget the where statement.
from table1 as one,
(select column1, column2,
from table2) as two
where one.column1 = two.column2
Syntax to combine tables vertically
proc sql;
select *
from table1
except/intersect/union/outer union
select *
from table2;
Which set operator is evaluated first by default?
intersect
Syntax to use ALL and CORR
proc sql;
select *
from table1
set-operator all/corr
select *
from table2;
When do you use the ALL keyword?
cannot be used with OUTER UNION
When you want it to make only one pass through the data and not remove duplicate rows
When do you use the CORR keyword?
When you want it to compare and overlay columns by name instead of position
When used with EXCEPT, INTERSECT, and UNION, removes any columns that do not have the same name in both tables
When used with OUTER UNION, overlays same-named columns and displays columns that have nonmatching names without overlaying.
What set-operator is represented by the picture?
What two things does it do?
EXCEPT
Selects unique rows from the first table that are not found in the second table
overlays columns
What type of join is represented by the picture?
inner join
What type of join is represented by the picture?
full outer join
What type of join is represented by the picture?
left outer join