Advanced SAS Programming Techniques Flashcards
What is an index?
A file that stores values in ascending order for a variable or variables and includes information about the location of those values in the data file. It helps you access values directly rather than having SAS go down the rows until it finds a value.
Syntax to create an INDEX in a DATA statement (for a simple and composite index)
This creates two simple indexes
For a simple index:
data datasetname (index=(column1 column2/unique));
set datasetname;
run;
For a composite index:
data datasetname (index=(indexname=(column1 column2)));
set datasetname
run;
What option can you use to see information in the SAS log about index creation or index usage?
Default is n
options msglevel = i;
When is an index not used? (3)
With a subsetting IF statement in a DATA step
With particular WHERE expressions
If SAS thinks it is more efficient to read the data sequentially
Syntax to create or delete indexes on existing datasets with PROC DATASETS
NOLIST is optional. It suppresses the printing of the directory of SAS files in the SAS log and as ODS output.
proc datasets library = libref NOLIST;
modify datasetname;
index delete index-name;
index create index-specification;
quit;
Syntax to create or delete indexes on an existing dataset with PROC SQL
proc sql;
create unique index index-name
on table-name(column1, column2);
drop index index-name from table-name;
quit;
What information is printed out by the CONTENTS procedure or the CONTENTS statement in PROC DATASETS? (5)
- general and summary info
- engine/host dependent info
- alphabetic list of variables and attributes
- alphabetic list of integrity constraints
- alphabetic list of indexes and attributes
Syntax for PROC CONTENTS
Syntax for PROC DATASETS with CONTENTS statement (2)
nolist is optional and suppresses the printing of the directory of SAS files in the SAS log and as ODS output.
proc contents data=libref.dataset;
run;
proc datasets library=libref nolist;
contents data=dataset;
quit;
proc datasets nolist;
contents data=libref.dataset;
quit;
How do you list the contents of all files in a SAS library with PROC CONTENTS and with PROC DATASETS?
proc contents data=libref._all_;
run;
proc datasets library=libref nolist;
contents data=_all_;
quit;
What happens to an index if you sort the data in place with the FORCE option in PROC SORT?
The index file is deleted
How do you copy a data set to a new location using PROC DATASETS?
Using PROC COPY?
What happens if the data set contains an index?
A new index file is automatically created for the new data file
proc datasets library = old-libref nolist;
copy out = new-libref;
select datasetname;
quit;
proc copy out = new-libref in = old-libref
[move];
select datasetname;
run;
quit;
How do you rename an indexed dataset so as to preserve the index?
Note the index is automatically renamed as well
proc datasets library=libref nolist;
change old-datasetname = new-datasetname;
quit;
How do you rename variables within an indexed data set so as to preserve the index?
If you rename a variable for which there is a simple index, the index is also renamed. If the variable is used in a composite index, the composite index automatically references the new variable name.
proc datasets library=libref nolist;
modify datasetname;
rename oldvarname1 = newvarname1
oldvarname2 = newvarname2;
quit;
How to combine raw data files vertically using FILENAME?
filename fileref (‘file-path1’ ‘file-path2’);
COMPRESS function
default characters-to-remove is blank spaces
Allows you to remove characters from a string
compress(source-string, characters-to-remove);