SAS P1 L6 Flashcards
What can a DATA step read? (3)
DATA step can read:
- SAS Data sets
- Excel worksheets
- Raw data files
How create new data set from existing data set?
What statements needed?
Syntax?
Use DATA step:
DATA output-SAS-data-set; {this is DATA statement}
SET input-SAS-data-set; {this is SET statement}
RUN; {Run statement}
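A minimal sketch of that syntax, assuming the sample data set sashelp.class that ships with SAS is available; the output name work.class_copy is made up for illustration:

```sas
/* Copy every observation and every variable from sashelp.class
   into a new temporary data set. work.class_copy is a hypothetical name. */
data work.class_copy;
   set sashelp.class;
run;
```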
What does a DATA step do?
How select only particular observations?
- DATA step reads all observations and all variables from input data set sequentially
- Use WHERE statement to subset (select only certain) observations that meet particular condition
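As an illustration of subsetting, here is a small sketch that again assumes sashelp.class (which has a numeric Age variable); the cutoff of 13 and the name work.teens are arbitrary choices, not from the notes:

```sas
/* Only observations that satisfy the WHERE condition are read
   from the input data set and written to the output data set. */
data work.teens;            /* hypothetical output name */
   set sashelp.class;
   where Age >= 13;         /* arbitrary example condition */
run;
```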
How write SAS DATE CONSTANT? What must be used? What happens? How many digit year? Examples: how write 1/1/2000? 31/12/11? 1/4/04? When can be used?
SAS DATE CONSTANT written as 'ddmmm<yy>yy'D
- must use quotes
- will be converted to SAS date value
- year can be 2 or 4 digits; use 4 digits to avoid ambiguity about the century
- D can be upper or lower case
Examples: '01JAN2000'D '31Dec11'D '1apr04'd
Can be used in any SAS expression, including WHERE
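A hedged example of a date constant inside WHERE, assuming the sample data set sashelp.stocks (which has a SAS date variable named Date) is installed; the output name is hypothetical:

```sas
/* The constant '01JAN2000'd is converted to a SAS date value,
   so it can be compared directly with the numeric Date variable. */
data work.stocks_from_2000;   /* hypothetical output name */
   set sashelp.stocks;
   where Date >= '01JAN2000'd;
run;
```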
How assign value to variable? Syntax? Does var have to be old or new? What is expression? Keyword? Possible operands? Possible operators? Operator hierarchy? How change? What if operand has missing value?
Assign value to variable via:
variable = expression;
- variable can be old or new
- expression is a set of instructions, a series of operands/operators that create a value
- NO keyword
- operands: char. const, num. const, date const, char. variable, num. variable
- operators: arithmetic calculations or SAS functions
- operators follow usual arithmetic hierarchy, can use parens to change order
- if any operand in an arithmetic expression has missing value, result will be missing value
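The points above could look like this in practice. This is only a sketch: it assumes sashelp.class (Height in inches, Weight in pounds), and Height_cm, BMI and the output name are invented for the example:

```sas
data work.class_metric;                    /* hypothetical output name */
   set sashelp.class;
   Height_cm = Height * 2.54;              /* new variable, numeric constant as operand */
   Weight = round(Weight * 0.4536, 0.1);   /* existing variable overwritten, SAS function used */
   /* ** is evaluated before /, and the parentheses force the
      division by 100 to happen before the exponentiation */
   BMI = Weight / (Height_cm / 100) ** 2;
run;
```

If Height were missing for an observation, Height_cm and BMI would be missing for that observation too.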
What does a SET statement do? (2)
How exclude/include variables?
SET statement:
READS all variables and WRITES them to output data set
Exclude or include variables using DROP, KEEP (=keywords)
What does DROP do?
Syntax?
How many variables per DROP statement?
How separate variables?
DROP specifies variables to EXCLUDE from output data set:
DROP variable1 variable2 ….;
(one or more variable names, separated by spaces)
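For example, a sketch assuming sashelp.class, with a made-up output name:

```sas
/* Height and Weight are still read into the step,
   but they are not written to the output data set. */
data work.class_nosize;     /* hypothetical output name */
   set sashelp.class;
   drop Height Weight;
run;
```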
What does KEEP do?
Syntax?
How separate variables?
What must be included?
KEEP specifies variables to INCLUDE in output data set:
KEEP variable1 variable2 ….;
(variable names separated by spaces)
must include every variable to be written - new variables too!
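A short sketch of the "new variables too" rule, assuming sashelp.class; Height_cm and the output name are illustrative only:

```sas
data work.class_keep;       /* hypothetical output name */
   set sashelp.class;
   Height_cm = Height * 2.54;
   /* Height_cm is created in this step, so it must be listed in KEEP
      or it will not appear in the output data set. */
   keep Name Age Height_cm;
run;
```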
How decide whether to use DROP or KEEP?
Any effect on input data set?
Doesn’t really matter whether use DROP or KEEP.
Use the one that means you specify the fewest var.
DROP, KEEP have no effect on input data set.
How does SAS process DATA step? Phases?
SAS processes DATA step in two phases:
Compilation phase and execution phase
What happens during Compilation? (4 steps)
What does descriptor portion include?
Compilation:
- SAS SCANS each DATA step for syntax errors
- COMPILES program – converts to machine code if no errors found
- CREATES PDV (program data vector) to hold current obs
- When compilation complete, RECORDS (makes) descriptor portion of new data set
Descriptor portion includes data set name, variable names and their attributes (type, length)
What is a PDV? Where is it? What is automatically included? (2) What are these used for? Default value? How much space is used? What is PDV used for?
PDV = program data vector
- area of memory where SAS builds one observation
- contains 2 automatic variables that can be used as part of processing but that are not written to data set:
_N_ iteration # of DATA step (starts at 1)
_ERROR_ signals appearance of error caused by data during execution. Default = 0 = no error
- a slot is added for each variable in input data set
What are the automatic variables in a PDV?
What are they used for?
Default values?
PDV has 2 automatic variables that can be used as part of processing but that are not written to data set:
_N_ iteration # of DATA step (starts at 1)
_ERROR_ signals appearance of error caused by data during execution. Default = 0 = no error
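One way to see the automatic variables is to write them to the log. The sketch below assumes sashelp.class; it uses DATA _NULL_ (a step that creates no output data set), the OBS= data set option and the PUTLOG statement, none of which are covered in these cards:

```sas
/* Read only the first three observations and print the
   automatic variables plus Name to the log on each iteration. */
data _null_;
   set sashelp.class(obs=3);
   putlog _N_= _ERROR_= Name=;
run;
```

With clean input data, _N_ should count 1, 2, 3 and _ERROR_ should stay 0 on every line.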
What supplies variable name, type, length to PDV?
Descriptor portion of data set supplies attributes
var name, type, length
How create new variables in PDV? (5 step process)
What if var being dropped?
Where does info about var attributes come from?
- ADD SLOT for each variable in input data set
- GET ATTRIBUTES from descriptor portion of data set:
var name, type, length
- PUT NEW variable in PDV
- [In compilation phase, SAS FLAGS any var to be dropped from output]
- BOTTOM of DATA step: compilation phase is complete, and descriptor portion of new data set is RECORDED
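The steps above can be traced on a small DATA step. This is an annotated sketch, assuming sashelp.class; Height_cm and the output name are invented, and the comments describe what happens at compile time rather than at execution:

```sas
data work.class_small;          /* hypothetical output name */
   set sashelp.class;           /* slots added to PDV for the input variables;
                                   name, type, length come from its descriptor portion */
   Height_cm = Height * 2.54;   /* new numeric variable Height_cm put in PDV */
   drop Height;                 /* Height stays in PDV but is flagged to be dropped */
run;                            /* bottom of step: descriptor portion recorded */

/* PROC CONTENTS lists the descriptor portion of the new data set. */
proc contents data=work.class_small;
run;
```

The PROC CONTENTS listing should show Height_cm but not Height, since dropped variables are left out of the descriptor portion of the output data set.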