SAS SQL queries Flashcards

1
Q

PROC SQL syntax

A

PROC SQL;

SELECT column1, column2

FROM table1, table2

WHERE expression

GROUP BY column1, column2

ORDER BY column1, column2 DESC;

QUIT;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

How do you create a new column in a SELECT clause?

A

This will create a new column named ‘newcolumn’ that calculates column2*0.5.

proc sql;

select column1, column2, column2*0.5 as newcolumn

from table1

quit;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q
A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Do the columns specified in a WHERE clause have to be specified in the SELECT clause?

A

no

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Two ways to write an ORDER BY clause

A

1 is the column’s position in the SELECT clause

ORDER BY column1

or

ORDER BY 1

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

How do you join tables?

How do you select a column that appears on both tables?

A

Remember to set the matching columns in the WHERE clause

proc sql;

select table1.column1, column2, column3

from sasuser.table1, sasuser.table2

where table1.column1 = table2.column1

order by column3;

quit;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

When do you use a GROUP BY clause?

A

If you use a GROUP BY clause in a PROC SQL step without a summary function, the GROUP BY clause is changed to an ORDER BY clause.

When your query contains one or more summary functions and you want to group the results

e.g.

proc sql;

select membertype, sum(milestraveled) as TotalMiles

from sasuser.frequentflyers

group by membertype;

quit;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

How do you create a new table from the results of a query?

A

PROC SQL;

CREATE TABLE table-name AS

SELECT column1, column2

FROM table1, table2

WHERE expression

GROUP BY column1, column2

ORDER BY column1, column2 DESC;

QUIT;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

How do you select all columns of a table?

A

use *

proc sql;

select *

from table1;

quit;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Which option lets you view the list of columns from a SELECT * statement in the SAS log?

A

This will list out all the columns to the SAS. Also resolves macro variables and places parentheses around expressions to show their order of evaluation.

FEEDBACK option

proc sql feedback;

select *

from table1;

quit;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

How do you limit the number of rows displayed in the output?

A

proc sql outobs= n;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

How do you eliminate duplicate rows from the output?

A

use DISTINCT keyword following SELECT

proc sql;

select distinct column1, column2

from table1;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What happens if you use a column you created in a SELECT clause in a WHERE clause?

How do you fix this?

A

Note can also rewrite the calculation in the where clause but this is inefficient

ERROR - the WHERE clause is processed before the SELECT clause, so the column isn’t found in the original table

To fix it, use the CALCULATED keyword

proc sql;

select flightnumber, boarded + transferred + nonrevenue as Total

from sasusesr.marchflights

where calculated total < 100;

quit;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

How do you make a calculation based on a column you created in the SELECT clause?

A

Use CALCULATED keywork

proc sql;

select flightnumber, boarded + transferred + nonrevenue as Total, calculated total/2 as Half

from sasusesr.marchflights;

quit;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What 3 things end a PROC SQL step?

A

QUIT;

Submitting another PROC step or a DATA step

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

How do you add a label to a column in a PROC SQL step?

A

proc sql;

select column1 label =‘labelname’, colum2

from table1;

17
Q

How do you add a format to a column in a PROC SQL step?

A

proc sql;

select column1, column2 format = format

from table1;

18
Q

How do you get PROC SQL to ignore permantent labels in a table?

A

use the NOLABEL system option

19
Q

Where do you put titles and footnotes relative to a PROC SQL step?

A

Either before the PROC SQL statement or between the PROC SQL statement and the SELECT statement

20
Q

How do you add a column that contains a character or numeric constant?

A

Include the text string or number in the SELECT clause

proc sql;

select column1, column2, ‘myconstantstring’, column3

from table1;

21
Q

The ANSI-standard sumary functions, such as AVG and COUNT, can be used with ____ argument(s). The SAS summary functions, such as MEAN and N, can be used with ____ argument(s).

A

ANSI - a single argument

SAS - either single or multiple arguments

22
Q

If the summary function specifies one column as argument, then the calculation is….

If the summary function specifies multiple columns as arguments, then the calculation is….

A

If the summary function specifies one column as argument, then the calculation is performed down the column.

If the summary function specifies multiple columns as arguments, then the calculation is performed across columns for each row.

23
Q

What do the following clauses do?

select count(*) as Count

select count(jobcode) as Count

select count(distinct jobcode) as Count

A

select count(*) as Count

Returns the total number of rows in a group or in a table

select count(jobcode) as Count

Returns the total number of rows in a group or in a table for which there is a nonmissing value in the selected column

select count(distinct jobcode) as Count

Returns the total number of unique values in a column

24
Q

Use COUNT(*) to count rows within groups of data

A

This will give you the counts for each category of column1

proc sql;

select column1, count(*) as Count

from table1

group by 1;

25
Q

How do you use a HAVING clause?

What are two differences between a HAVING clause and a WHERE clause?

A

You can have summary functions in a HAVING clause but not in a WHERE clause

Use it after a GROUP BY clause to select groups based on a condition

proc sql;

select column1, avg(column2) as AvgCol

from table1

group by column1

having condition;

26
Q

What happens if you have a HAVING clause with no GROUP BY clause?

A

The HAVING clause treats the whole table as one group

27
Q

When does data remerging occur? (3)

A
  1. The values returned by a summary function are used in a calculation
  2. the SELECT clause specifies a column that contains a summary function and other columns(s) that are not listed in a GROUP BY clause
  3. The HAVING clause specifies one or more columns or column expressions that are not included in a subquery or a GROUP BY clause
28
Q

What happens during data remerging? (2 things)

A

PROC SQL makes two passes through the table:

  1. PROC SQL calculates and returns the value of summary functions, and groups data according to the GROUP BY clause
  2. PROC SQL retrieves any additional columns and rows that it needs to display ini the output, and uses the result from the summary function to calculate any arithmetic expressions in which the summary function participates
29
Q

What is a subquery?

A

This will return the groups for which AvgSalary is greater than the average salary across the entire payroll

A query that is nested in, and is a part of, another query.

It can return single or multiple values to an outer query.

It can return values for multiple rows but only for a single column.

e.g.

proc sql;

select jobcode, avg(Salary) as AvgSalary

from sasuser.payrollmaster

group by jobcode

having avg(salary) >

(select avg(salary) from sauser.payrollmaster);

30
Q

What are the types of subqueries?

A
  1. Noncorrelated: a self-contained subquery that executes independently of the outer query
  2. Correlated: a dependent subquery that requires one or more values to be passed to it by the outer query before the subquery can return a value to the outer query
31
Q

When a noncorrelated subquery returns multiple values, use the following operators in a WHERE or HAVING clause:

A

IN

ANY or ALL

EXISTS

32
Q

Explain the effect of the ANY operator:

where dateofbirth < ANY

(subquery)

A

This returns the values that are less than any value returned by the subquery. eg. if the subquery returns 20, 30, 40, then the outer query will select all values < 40.

eg.

proc sql;

select empid, jobcode, dateofbirth

from sasuser.payrollmaster

where jobcode in (‘FA1’, ‘FA2’)

and dateofbirth < any

(select dateofbirth

from sasuser.payrollmaster

where jobcode=’FA3’)

In this code, we’re trying to identify any flight attendants at level 1 or 2, who are older than any of the flight attendants at level 3.

33
Q

Explain the effect of the ALL operator:

where dateofbirth < ALL

(subquery)

A

Returns values that are less than all the values returned by the subquery. e.g. if subquery returns 20, 30, 40, then the outer query will select all values < 20.

34
Q

EXISTS is true if…

NOT EXISTS is true if…

A

EXISTS is true if the subquery returns at least one row

NOT EXISTS is true if the subquery returns no data

35
Q

What type of subquery is this?

proq sql;

select lastname, firstname

from sasuser.flightattendants

where not exists

(select *

from sasuser.flightschedule

where flightattendants.empid = flightschedule.empid)

A

Correlated subquery

36
Q

What does the NOEXEC option do?

A

SAS checks the syntax of all queries in that PROC SQL step for accuracy but does not execute them.

If query is valid and all referenced columns and tables exists, the SAS log displays:

Statement not executed due to NOEXEC option

proc sql noexec;

select…

37
Q

What does the VALIDATE keyword do? Where do you put it?

A

Checks for errors in the query. If not errors, SAS log displays:

PROC SQL statement has valid syntax

Place before SELECT statement

proc sql;

validate

select…..;