Midterm 1 Flashcards

Question

Which step displays the directory of the project library and suppresses printing the contents of individual data sets? A. proc contents data=project; run; B. proc contents data=project.all; C. proc contents data=project nocontents; run; D. proc contents data=project._all_ nods; run;

Answer 1

D. proc contents data=project._all_ nods; run;
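
The same pattern can be tried against a library that ships with SAS. This is a minimal sketch; using the `sashelp` libref in place of `project` is an assumption made only so that the step runs without a LIBNAME statement.

```sas
/* _ALL_ requests every member of the library;      */
/* NODS suppresses the per-data-set contents output */
proc contents data=sashelp._all_ nods;
run;
```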

Answer 2

The print procedure can show the data portion of a SAS data set. ex. proc print data=project.enroll; run;

Answer 3

Two ways to add comments: * comment; and /* comment */

Answer 4

golf.supplies

Answer 5

work.newprice

Answer 6

The SET statement reads an observation from one or more SAS data sets for further processing in the DATA step. By default, the SET statement reads all variables and all observations from the input data sets. The set statement can read temporary or permanent data sets.

Answer 7

During the compilation phase, SAS does the following: checks the syntax of the SAS statements, translates the statements into machine code, and identifies the name, type, and length of each variable. The following three items are potentially created: the input buffer, the program data vector, and the descriptor information.

Answer 8

The input buffer is a logical area in memory into which SAS reads each record of a raw data file when SAS executes an INPUT statement. This buffer is created only when the DATA step reads raw data. When the DATA step reads a SAS data set, SAS reads the data directly into the program data vector.

Answer 9

A logical area in memory where SAS builds a data set, one observation at a time. Along with data set variables and computed variables, the PDV contains the following two automatic variables: - the _N_ variable, which counts the number of times the DATA step begins to iterate. - the _ERROR_ variable, which signals the occurrence of an error caused by the data during execution. Either 0 (no error) or 1 (one or more errors occurred).

Answer 10

D. SAS sets the _ERROR_ variable equal to the total number of errors caused by the data during execution

Answer 11

B. Initial value of the variable.

Answer 12

Information that SAS creates and maintains about each SAS data set, including data set attributes and variable attributes, i.e. the name of the data set, the date and time that the data set was created, and the names, data types, and lengths of the variables.

Answer 13

During the execution phase, SAS does the following: - Initializes the PDV to missing and sets the initial values of _N_ and _ERROR_ - Reads data values into the PDV - Executes any subsequent programming statements - Outputs the observation to a SAS data set - Returns to the top of the DATA step - Resets the PDV to missing for any variables not read directly from a data set and increments _N_ by 1 - Repeats the process until the end of file is detected.
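
The iteration cycle can be watched in the log. This is a small sketch, assuming the `sashelp.class` sample data set is available (it is not part of the original cards); PUTLOG writes the automatic variables once per iteration.

```sas
data work.demo;
    set sashelp.class;
    /* _N_ counts iterations; _ERROR_ stays 0 unless the data cause an error */
    putlog 'Iteration: ' _n_= _error_= name=;
run;
```

The automatic variables appear in the log only. They are not written to `work.demo`.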

Answer 14

Nine times

Answer 15

The DROP statement specifies the names of the variables to omit from the output data set. Use the DROP= data set option after an output data set name to omit variables from that specific output data set. ex. data work.total(drop=name total test1 test2);

Answer 16

The KEEP statement specifies the names of the variables to write to the output data set. Use the KEEP= data set option after an output data set name to specify the variables to write to that specific output data set. ex. data work.total(keep=name total test1 test2);
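
A short sketch of the two data set options side by side, assuming the `sashelp.class` sample data set (variables name, sex, age, height, weight); the output data set names are hypothetical.

```sas
/* KEEP= and DROP= apply only to the data set they follow */
data work.heights(keep=name height)
     work.weights(drop=height);
    set sashelp.class;
run;
```

Here `work.heights` contains only name and height, while `work.weights` contains every variable except height.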

Answer 17

The FORMAT statement associates formats with variables.
```
data work.newprice;
    set golf.supplies;
    saleprice=price*0.75;
    format saleprice dollar18.2;
run;
```
Formats assigned in a DATA step are considered permanent attributes (stored in the descriptor portion).

Answer 18

The LABEL statement assigns descriptive labels to variables. ex. data work.newprice; set golf.supplies; saleprice=price*0.75; label type='Type of Ball' saleprice='Sale Price'; run; Labels assigned in a DATA step are considered permanent attributes (stored in the descriptor portion).

Answer 19

A. adding a /DEBUG option to the DATA statement

Answer 20

The DATA step debugger consists of windows and a group of commands that provide an interactive way to identify logic and data errors in DATA steps.

Answer 21

The PUTLOG statement can be used to write messages to the SAS log to help identify logic errors.

Answer 22

By default, at the end of each iteration, every DATA step contains an implicit OUTPUT statement that tells SAS to write observations to the data set or data sets that are being created.

Answer 23

The OUTPUT statement without arguments causes the current observation to be written to all data sets that are named in the DATA statement. Multiple output statements can be used in a data step. Placing an explicit OUTPUT statement in a DATA step overrides the implicit output, and SAS adds an observation to a data set only when an explicit OUTPUT statement is executed.
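
As an illustration of explicit output, the sketch below writes two observations for every observation it reads. It assumes the `sashelp.class` sample data set; the names `work.long`, `measure`, and `value` are hypothetical.

```sas
data work.long;
    set sashelp.class(keep=name height weight);
    /* Each explicit OUTPUT writes the current PDV contents */
    measure='height'; value=height; output;
    measure='weight'; value=weight; output;
    drop height weight;
run;
```

Because explicit OUTPUT statements are present, there is no implicit output at the end of the step.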

Answer 24

The DATA statement can specify multiple output data sets. The OUTPUT statement can specify the data set names.
```
data work.first work.second;
    set work.scores;
    test=test1;
    output work.first;
    test=test2;
    output work.second;
    drop test1 test2;
run;
```
Using the OUTPUT statement without arguments causes the current observation to be written to all data sets that are named in the DATA statement. The DROP and KEEP statements apply to all output data sets.

Answer 25

D. work.total, work.first, and work.second

Answer 26

Ex. if sex='F' then output female; else if sex='M' then output male;

Answer 27

The FIRSTOBS= and OBS= data set options can be used to control which observations are read from the input data set. ex. set sashelp.retail (obs=10); FIRSTOBS= and OBS= are valid for input processing only. They are not valid for output processing.

Answer 28

The FIRSTOBS= data set option specifies a starting point for processing an input data set.

Answer 29

The OBS= data set option specifies an ending point for processing an input data set. The OBS= option specifies the number of the last observation to process, not how many observations to process.
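
A minimal sketch of the two options used together, assuming the `sashelp.class` sample data set; `work.middle` is a hypothetical name.

```sas
/* Reads observations 5 through 10: six observations, not ten */
data work.middle;
    set sashelp.class(firstobs=5 obs=10);
run;
```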

Answer 30

A. data shoes (firstobs=101 obs=200); set sashelp.shoes; run;

Answer 31

An expression is a sequence of operands and operators that forms a set of instructions that define a condition for selecting observations. Operands are constants (character or numeric), variables (character or numeric), and SAS functions. Operators are symbols that request a comparison, logical operation, or arithmetic calculation.

Answer 32

Comparison operators compare a variable with a value or with another variable.
```
EQ or =      : equal to
NE or ^= ~=  : not equal to
GT or >      : greater than
GE or >=     : greater than or equal to
LT or <      : less than
LE or <=     : less than or equal to
```

Answer 33

D. name ne Mary Ann

Answer 34

Logical operators combine or modify expressions. AND or &: logical and. OR or |: logical or. NOT or ^: logical not.

Answer 35

Arithmetic operators indicate that an arithmetic calculation is performed. If a missing value is an operand for an arithmetic operator, the result is a missing value. **: exponentiation *: multiplication /: division +: addition -: subtraction
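
The missing-value rule can be checked in the log with a step that creates no data set. This is a sketch with hypothetical variable names.

```sas
data _null_;
    price = .;                 /* missing numeric value */
    saleprice = price * 0.75;  /* missing operand, so the result is missing */
    cube = 2 ** 3;             /* exponentiation */
    putlog saleprice= cube=;   /* writes saleprice=. cube=8 */
run;
```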

Answer 36

The WHERE statement can use special WHERE operators:
```
BETWEEN-AND        : an inclusive range
CONTAINS or ?      : a character string
LIKE               : a character pattern
SOUNDS LIKE or =*  : spelling variation
IS NULL            : missing value
IS MISSING         : missing value
SAME AND or ALSO   : augments an expression
```
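
A brief sketch combining two of these operators, assuming the `sashelp.class` sample data set; the age range and name pattern are arbitrary choices for illustration.

```sas
proc print data=sashelp.class;
    /* BETWEEN-AND is inclusive; % in a LIKE pattern matches any run of characters */
    where (age between 12 and 14) and name like 'J%';
run;
```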

Answer 37

1. Mark | 3. Mickey

Answer 38

C. Either statement will work

Answer 39

B. if saleprice>10;

Answer 40

The WHERE statement causes the DATA step to process only those observations from a data set that meet the condition of the expression. The expression in the WHERE statement can reference variables that are from the input data set. It cannot reference variables created by an assignment statement or automatic variables (_N_ or _ERROR_).

Answer 41

B. where difference ge 1000;

Answer 42

The subsetting IF statement causes the DATA step to continue processing only those observations in the program data vector that meet the condition of the expression. data work.newprice; set golf.supplies; saleprice=price*0.75; if saleprice>10; run; If the expression is true for the observation, SAS continues to execute the remaining statements in the DATA step, including the implicit OUTPUT statement at the end of the DATA step. The resulting SAS data set (or data sets) contains a subset of the original SAS data set. If the expression is false, no further statements are processed for that observation: the current observation is not written to the data set, the remaining statements in the DATA step are not executed, and SAS immediately returns to the beginning of the DATA step.

Answer 43

``` B. data subset; set sales; differences=actual-predict; if difference between 500 and 1000; run; ```

Answer 44

The WHERE statement selects observations before they are brought into the program data vector. The subsetting IF statement selects observations that were read into the program data vector.
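
The difference can be sketched with two steps, assuming the `sashelp.class` sample data set; `ratio` and the cutoff values are hypothetical.

```sas
/* WHERE filters on an input variable before the observation enters the PDV */
data work.where_demo;
    set sashelp.class;
    where age > 12;
    ratio = weight / height;
run;

/* A subsetting IF can test a variable computed in the same step */
data work.if_demo;
    set sashelp.class;
    ratio = weight / height;
    if ratio > 1.5;
run;
```

Writing `where ratio > 1.5;` in the second step would fail, because `ratio` does not exist in the input data set.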

Answer 45

The IF-THEN DELETE statement causes the DATA step to stop processing those observations in the program data vector that meet the condition of the expression. ex. if saleprice<= 10 then delete; If the expression is true for the observation, the current observation is not written to a data set, and SAS returns immediately to the beginning of the DATA step for the next iteration.

Answer 46

Orders SAS data set observations by the values of one or more character or numeric variables. Either replaces the original data set or creates a new data set. Produces only an output data set, but no report. Arranges the data set by the values in ascending order by default. The DATA= option identifies the input SAS data set. The OUT= option names the output data set. Without the OUT= option, the SORT procedure overwrites the original data set. ex. proc sort data=sashelp.shoes out=shoes; by descending region product; run;

Answer 47

The BY statement specifies the sorting variables. PROC SORT first arranges the data set by the values of the first BY variable. PROC SORT then arranges any observations that have the same value of the first BY variable by the values of the second BY variable. This sorting continues for every specified BY variable. By default, the SORT procedure orders the values in ascending order. The DESCENDING option reverses the sort order for the variable that immediately follows it in the statement. In addition to the SORT procedure, a BY statement can be used in the DATA step and other PROC steps. The data sets used in the DATA step and other PROC steps must be sorted by the values of the variables that are listed in the BY statement or have an appropriate index.
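
A short sketch of sorting and then reusing the sort order in another PROC step, assuming the `sashelp.class` sample data set; `work.class_sorted` is a hypothetical name.

```sas
/* Ascending by sex, then descending by age within each sex */
proc sort data=sashelp.class out=work.class_sorted;
    by sex descending age;
run;

/* BY-group processing relies on the data already being sorted by sex */
proc print data=work.class_sorted;
    by sex;
run;
```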

Answer 48

If more than one data set name appears in the SET statement, the resulting output data set is a concatenation of all the data sets that are listed. SAS reads all observations from the first data set, then all from the second data set, and so on, until all observations from all the data sets are read.
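
Concatenation can be sketched by splitting a data set and stacking the pieces again. The example assumes the `sashelp.class` sample data set; the `work` data set names are hypothetical.

```sas
data work.girls;
    set sashelp.class;
    where sex='F';
run;

data work.boys;
    set sashelp.class;
    where sex='M';
run;

/* All observations from work.girls are read first, then all from work.boys */
data work.everyone;
    set work.girls work.boys;
run;
```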

Answer 49

``` The LENGTH statement specifies the number of bytes for storing variables. EX. data company; length name $ 15; set divisionA divisionB; run; ```

Answer 50

The RENAME= data set option changes the names of variables. The RENAME= option specifies the variable that you want to rename, followed by an equal sign and the new name of the variable. The list of variables to rename must be enclosed in parentheses. Ex. set divisionA (rename=(state=location)) divisionB;

Answer 51

B. set divisionA (rename=(name=first state=location)) | divisionB (rename=(name=first);

Answer 52

Use a single SET statement with multiple data sets and a BY statement to interleave the specified data sets. The observations in the new data set are arranged by the values of the BY variable or variables. Then, within each BY group, they are arranged by the order of the data sets in which they occur. The data sets that are listed in the SET statement must be sorted by the values of the variables that are listed in the BY statement, or they must have an appropriate index. ``` data company; length name $ 15; set divisionA (rename=(state=location)) divisionB; by name; run; ```

Answer 53

The merge statement joins observations from two or more SAS data sets into single observations. The BY statement specifies the common variables to match-merge observations. The variables in the BY statement must be common to all data sets. The data sets listed in the MERGE statement must be sorted in the order of the values of the variables that are listed in the BY statement, or they must have an appropriate index. ``` Ex. data combine; merge revenue expense; by name; profit=revenue-expense; run; ```

Answer 54

The IN= option creates a variable that indicates whether the data set contributed data to the current observation. Within the DATA step, the value of the variable is 1 if the data set contributed to the current observation, and 0 if the data set did not contribute to the current observation. ``` Ex. data combine1; merge revenue1 (in=rev) expense1 (in=exp); by name; profit=revenue-expense; run; ```
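
One common use of the IN= variables is keeping only the matches. This is a self-contained sketch; the data values and the `work` data set names are made up for illustration.

```sas
data work.revenue;
    input name $ revenue;
    datalines;
Ann 500
Bob 300
;
run;

data work.expense;
    input name $ expense;
    datalines;
Ann 200
Cal 100
;
run;

/* Both inputs are already in order by name, as match-merging requires */
data work.matched;
    merge work.revenue(in=rev) work.expense(in=exp);
    by name;
    if rev and exp;   /* keep only names present in both data sets */
    profit = revenue - expense;
run;
```

Only Ann appears in both inputs, so `work.matched` holds a single observation.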

Answer 55

A. The IN= variables are included in the SAS data set that is being created.

Answer 56

A. if rev=1;

Answer 57

With an INPUT statement, the INFILE statement identifies the physical name of the external file to read. The physical name is the name that the operating environment uses to access the file.
```
data work.kids;
    infile 'kids.dat';
    input name $ 1-8 siblings 10 @12 bdate mmddyy10. @23 allowance comma2. hobby1 $ hobby2 $ hobby3 $;
run;
```

Answer 58

The input statement describes the arrangement of values in the input data record and assigns input values to the corresponding SAS variables. ``` EX. data work.kids; infile 'kids.dat'; input name $ 1-8 siblings 10 @12 bdate mmddyy10. @23 allowance comma2. hobby1 $ hobby2 $ hobby3 $; run; ```

Answer 59

C. delimited input

Answer 60

With column input, the column numbers that contain the value follow a variable in the INPUT statement. To read with column input, data values: - must be in the same columns in all the input data records. - must be in standard form. A column input statement can contain: variable: names a variable that is assigned input values. $: indicates to store a variable value as a character value rather than as a numeric value. start-column: specifies the first column of the input record that contains the value to read. -end-column: specifies the last column of the input record that contains the value to read. Ex. input name $ 1-8 siblings 10;

Answer 61

With formatted input, an informat follows a variable name and defines how SAS reads the value of this variable. An informat gives the data type and the field width of an input value. To read with formatted input, data values: - must be in the same columns in all the input data records. - can be in standard or nonstandard form. A formatted input statement can contain the following: pointer-control: moves the input pointer to a specified column in the input buffer. @n moves the pointer to column n. +n moves the pointer n columns. variable: names a variable that is assigned input values. informat: specifies a SAS informat to use to read the variable values. ex. input @12 bdate mmddyy10. @23 allowance comma2.;

Answer 62

With list input, variable names in the INPUT statement are specified in the same order that the fields appear in the input data records. To read with list input, data values: - must be separated with a delimiter - can be in standard or nonstandard form. You must specify the variables in the order that they appear in the raw data file, left to right. The default length for variables is 8 bytes. A space (blank) is the default delimiter. A list input statement can contain the following: pointer control: moves the input pointer to a specified column in the input buffer. variable: names a variable that is assigned input values. $: indicates to store a variable value as a character value rather than as a numeric value. : (colon): reads data values that need the additional instructions that informats can provide but that are not aligned in columns. informat: specifies an informat to use to read the variable values. ex. input hobby1 $ siblings bdate : mmddyy10.;
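
A self-contained sketch of list input with the colon modifier, reading in-stream data instead of an external file. The two records are adapted from the deck's DATALINES card; `work.kids_list` and the DATE9. display format are assumptions.

```sas
data work.kids_list;
    /* name and siblings use plain list input; bdate needs an informat, */
    /* so the colon modifier reads up to the next blank with MMDDYY10.  */
    input name $ siblings bdate : mmddyy10.;
    format bdate date9.;
    datalines;
Chloe 2 11/10/1995
Travis 2 1/30/1998
;
run;
```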

Answer 63

Formatted input

Answer 64

D. descriptor information

Answer 65

A data error occurs when the INPUT statement encounters invalid data in a field. When SAS encounters a data error, these events occur: - A note that describes the error is printed in the SAS log. - The contents of the input buffer (the input record being read) are displayed in the SAS log. - The values in the SAS observation being created (the contents of the PDV) are displayed in the SAS log. - A missing value is assigned to the appropriate SAS variable. - Execution continues.

Answer 66

The DATALINES statement can be used with an INPUT statement to read data directly from the program, rather than data stored in a raw data file. datalines; Chloe 2 11/10/1995 $5Running Music Gymnastics Travis 2 1/30/1998 $2Baseball Nintendo Reading ; Run;

Answer 67

A. Multiple DATALINES statements can be used in a DATA step.

Answer 68

Standard data is any data that SAS can read without any special instructions.

Answer 69

1 numeric variable and 3 variables should be numeric

Answer 70

Nonstandard data is any data that SAS cannot read without a special instruction.

Answer 71

An informat is an instruction that SAS uses to read data values into a variable. SAS uses the informat to determine the following: - whether the variable is numeric or character - the length of character variables.

Answer 72

A. 123456.78 and 1234.567

Answer 73

D. when a problem occurs with a valid informat, SAS writes a note to the SAS log, assigns a missing value to the variable, and terminates the DATA step.

Answer 74

The DLM= option specifies a delimiter to be used for list input. Blank is the default delimiter. Ex. infile 'kids4.dat' dlm=',';

Answer 75

By default SAS treats two consecutive delimiters as one, not as a missing value between the delimiters.

Answer 76

The DSD option can do the following: - Treat two consecutive delimiters as a missing value. - Remove quotation marks from strings and treat any delimiter inside the quotation marks as a valid character. - Set the default delimiter to a comma. Ex. infile 'kids5.dat' dsd;

Answer 77

The MISSOVER option prevents an INPUT statement from reading a new input data record if it does not find values in the current input line for all the variables in the statement. With the MISSOVER option, when an INPUT statement reaches the end of the current input data record, variables without any values assigned are set to missing.
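
DSD and MISSOVER can be tried together on in-stream data. This is a sketch with made-up records; `infile datalines` stands in for the external files named on the cards.

```sas
data work.kids_csv;
    infile datalines dsd missover;
    input name $ siblings hobby1 $ hobby2 $;
    datalines;
Chloe,2,Running,Music
Travis,,Baseball
;
run;
```

For the second record, DSD treats the two consecutive commas as a missing value for siblings, and MISSOVER sets hobby2 to missing instead of moving on to the next line.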

(104 cards)