Indexes and Views in PROC SQL Flashcards

1
Q

What is an index?

A

An index allows you to access observations directly. The index stores values in ascending value order for a specific variable or variables and includes information as to the location of those values within observations in the data file.

For example, suppose you want the observation with SSN (Social Security number) equal to 123-45-6789:

Without an index, SAS accesses observations sequentially in the order in which they are stored in the data file. SAS reads each observation, looking for SSN=123-45-6789 until all observations are read.

With an index on variable SSN, SAS accesses the observation directly. SAS satisfies the condition using the index and goes straight to the observation that contains the value without having to read each observation.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What are the two types of indexes?

A
  • Simple: based on only one column
  • Composite: based on two or more columns. In the index, the values of the key columns are concatenated to form a single value. Needs a new name that is not the name of any existing columns.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are 4 cases where creating an index would improve efficiency?

A
  1. a WHERE clause that contains a comparison operator, TRIM or SUBSTR, CONTAINS, or LIKE
  2. a subquery returning values to the IN operator
  3. a correlated subquery in which the column being compared with the correlated reference is indexed
  4. a join in which the join expression contains the = operator, or a join in which all the columns in the join expression are indexed in one of the tables being joined.

pg. 229-230

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

3 benefits of using an index

A
  1. A small subset of data (<15%) can be accessed more quickly (The bigger the subset, the less advantageous it is to use an index)
  2. equijoins can be performed without internal sorts
  3. uniqueness can be enforced by creating a unique index
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

4 costs of using an index

A
  1. Needs more CPU time
  2. Needs additional input/output requests when compared to reading the table sequentially
  3. Uses additional memory for buffers into which the index pages and code are loaded for processing
  4. Requires additional disk space
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Create index syntax

A

Unique specifies that all the values in the column are unique

proc sql;

create (unique) index index-name

on table1 (columns);

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

How do you find out whether a table has any indexes?

A

Use DESCRIBE TABLE

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

How do you find out whether SAS is actually using the index you created?

A

Note the default option = N

Use OPTIONS MSGLEVEL = I before PROC SQL

It will tell you about index usage

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What are the two options to control index usage and what do they do?

A

IDXWHERE= and IDXNAME = in parenthesis in the from clause after the table name

IDXWHERE = YES : tells SAS to choose the best index to optimize a where expression, and to disregard the possibility that a sequential search of the table might be more resource-efficient

IDXWHERE = NO : tells SAS to ignore all indexes and satisfy the conditions of a where expression with sequential search of the table

IDXNAME = index name: tells SAS to use an index that you specify

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How do you delete an index?

A

DROP INDEX statement

proc sql;

drop index index-name

from tablename;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Can an index be created on a view?

A

NO

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is a VIEW?

A

A PROC SQL view is a stored query expression that reads data values from its underlying files. The view contains only the logic for accessing the data, not the data itself.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Create a view syntax

A

The rest is normal proc sql query

proc sql;

create view viewname as

select ….

;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Get information about a view

A

proc sql;

describe view viewname;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

How do you delete a view?

A

proc sql;

drop view view-name;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

When you specify SQL options how long do they stay?

A

It remains in effect until you change it or you re-invoke PROC SQL.

17
Q

Name 2 SQL options to control execution and what they do

A

proc sql inobs = number;

proc sql outobs = number;

INOBS = restrict the number of input rows (rows that are read)

OUTOBS = restrict the number of output rows

18
Q

Name 3 SQL options to control output and what they do

A

Double-spaces the output. Default is NODOUBLE

proc sql double/nodouble;

proc sql flow/noflow/flow=n/flow=nm;

proc sql number/nonumber;

19
Q

Name a SQL option used for evaluating performance and what it does

A

Specify whether PROC SQL writes timing information for each statement to the SAS log. NOSTIMER is default.

proc sql stimer/nostimer;

20
Q

What can you use to add, drop, or change PROC SQL options without re-invoking the SQL procedure?

A

RESET options;

21
Q

What is a Dictionary Table?

A

Read-only SAS tables that have info about SAS libraries, SAS macros, SAS system options, and titles and footnotes currently in effect.

You use it like any other SAS table. The libref is Dictionary.

22
Q

Write the code to list all the tables in the Sasuser library that contain a column naed EmpID

A

proc sql;

select memname

from dictionary.columns

where libname = ‘SASUSER’

and name = ‘EmpID’;