Ch 14: Using Functions to Manipulate Data Flashcards
Select the best answer for each question. Check your answers using the answer key in the appendix.
1.
Within the data set Cert.Temp, PayRate is a character variable and Hours is a numeric variable. What happens when the following program is run?
data work.temp;
set cert.temp;
Salary=payrate*hours;
run;
a.
SAS converts the values of PayRate to numeric values. No message is written to the log.
b.
SAS converts the values of PayRate to numeric values. A message is written to the log.
c.
SAS converts the values of Hours to character values. No message is written to the log.
d.
SAS converts the values of Hours to character values. A message is written to the log.
Correct answer: b
When this DATA step is executed, SAS automatically converts the character values of PayRate to numeric values so that the calculation can occur. Whenever data is automatically converted, a message is written to the SAS log stating that the conversion has occurred.
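The automatic conversion can be reproduced with a small sketch. The input data below is hypothetical (the actual contents of Cert.Temp are not shown here), and the data set name work.temp_demo is an assumption made for illustration.

```sas
/* Hypothetical stand-in for Cert.Temp: PayRate is character, Hours is numeric */
data work.temp_demo;
   length PayRate $ 8;
   input PayRate $ Hours;
   datalines;
10.50 40
12.25 35
;
run;

data work.temp;
   set work.temp_demo;
   /* PayRate is character, so SAS converts it to numeric before multiplying */
   Salary = payrate * hours;
run;
```

After the second step runs, the log should contain a note that character values have been converted to numeric values, along with the line and column where the conversion took place.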
2.
A typical value for the character variable Target is 123,456. Which statement correctly converts the values of Target to numeric values when creating the variable TargetNo?
a.
TargetNo=input(target,comma6.);
b.
TargetNo=input(target,comma7.);
c.
TargetNo=put(target,comma6.);
d.
TargetNo=put(target,comma7.);
Correct answer: b
You explicitly convert character values to numeric values by using the INPUT function. Be sure to select an informat that can read the form of the values: 123,456 is seven characters wide, including the comma, so the COMMA7. informat is needed.
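A minimal sketch of the explicit conversion, using the sample value from the question. The variable TooNarrow is hypothetical and is included only to show the effect of an informat width that is one character short.

```sas
data _null_;
   Target = '123,456';                  /* character value, 7 characters wide */
   TargetNo  = input(target, comma7.);  /* reads all 7 characters, ignores the comma */
   TooNarrow = input(target, comma6.);  /* reads only the first 6 characters */
   put TargetNo= TooNarrow=;
run;
```

Comparing the two values written to the log shows why the informat width has to cover the whole value, punctuation included.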
3.
A typical value for the numeric variable SiteNum is 12.3. Which statement correctly converts the values of SiteNum to character values when creating the variable Location?
a.
Location=dept||'/'||input(sitenum,3.1);
b.
Location=dept||'/'||input(sitenum,4.1);
c.
Location=dept||'/'||put(sitenum,3.1);
d.
Location=dept||'/'||put(sitenum,4.1);
Correct answer: d
You explicitly convert numeric values to character values by using the PUT function. Be sure to select a format that can write the form of the values: 12.3 occupies four positions, including the decimal point, so the 4.1 format is needed.
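The PUT conversion can be sketched as follows. The value assigned to Dept is an assumption made for illustration; only SiteNum and the 4.1 format come from the question.

```sas
data _null_;
   Dept = 'PROD';        /* hypothetical department code */
   SiteNum = 12.3;       /* numeric value from the question */
   /* PUT returns a character string, so it can be concatenated directly */
   Location = dept || '/' || put(sitenum, 4.1);
   put Location=;        /* writes Location=PROD/12.3 */
run;
```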
4.
The variable Address2 contains values such as Piscataway, NJ. How do you assign the two-letter state abbreviations to a new variable named State?
a.
State=scan(address2,2);
b.
State=scan(address2,13,2);
c.
State=substr(address2,2);
d.
State=substr(address2,13,2);
Correct answer: a
The SCAN function extracts words from a character value when you know the order of the words, when their position varies, and when the words are marked by some delimiter. In this case, you do not need to specify delimiters, because the blank and the comma are default delimiters.
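A short sketch of SCAN with the default delimiters, using the sample value from the question. The LENGTH statement is an addition for illustration so that State is stored as a two-character variable.

```sas
data _null_;
   length State $ 2;               /* keep State at two characters */
   Address2 = 'Piscataway, NJ';
   /* No delimiter argument: the comma and the blank are default delimiters */
   State = scan(address2, 2);
   put State=;                     /* writes State=NJ */
run;
```

Because SCAN counts words rather than columns, the same statement still works when the city name has a different length.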
5.
The variable IDCode contains values such as 123FA and 321MB. The fourth character identifies sex. How do you assign these character codes to a new variable named Sex?
a.
Sex=scan(idcode,4);
b.
Sex=scan(idcode,4,1);
c.
Sex=substr(idcode,4);
d.
Sex=substr(idcode,4,1);
Correct answer: d
The SUBSTR function is best used when you know the exact position of the substring to extract from the character value. You specify the position to start from and the number of characters to extract.
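As a minimal sketch, SUBSTR on the right side of an assignment extracts one character starting at position 4 of the sample value from the question:

```sas
data _null_;
   IDCode = '123FA';
   /* Start at position 4, extract 1 character */
   Sex = substr(idcode, 4, 1);
   put Sex=;                  /* writes Sex=F */
run;
```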
6.
Because of the growth within the 919 area code, the telephone exchange 555 is being reassigned to the 920 area code. The data set Cert.Piedmont includes the variable Phone, which contains telephone numbers in the form 919-555-1234. Which of the following programs correctly changes the values of Phone?
a.
data work.piedmont(drop=areacode exchange);
set cert.piedmont;
Areacode=substr(phone,1,3);
Exchange=substr(phone,5,3);
if areacode='919' and exchange='555' then scan(phone,1,3)='920';
run;
b.
data work.piedmont(drop=areacode exchange);
set cert.piedmont;
Areacode=substr(phone,1,3);
Exchange=substr(phone,5,3);
if areacode='919' and exchange='555' then phone=scan('920',1,3);
run;
c.
data work.piedmont(drop=areacode exchange);
set cert.piedmont;
Areacode=substr(phone,1,3);
Exchange=substr(phone,5,3);
if areacode='919' and exchange='555' then substr(phone,1,3)='920';
run;
d.
data work.piedmont(drop=areacode exchange);
set cert.piedmont;
Areacode=substr(phone,1,3);
Exchange=substr(phone,5,3);
if areacode='919' and exchange='555' then phone=substr('920',1,3);
run;
Correct answer: c
The SUBSTR function replaces variable values if it is placed on the left side of an assignment statement. When placed on the right side (as in Question 5), the function extracts a substring.
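The replacement form can be sketched on a single value. This example uses the sample phone number from the question and tests the substrings inline rather than through the intermediate variables in the answer choices.

```sas
data _null_;
   Phone = '919-555-1234';
   /* SUBSTR on the left side overwrites positions 1-3 of Phone in place */
   if substr(phone, 1, 3) = '919' and substr(phone, 5, 3) = '555'
      then substr(phone, 1, 3) = '920';
   put Phone=;                /* writes Phone=920-555-1234 */
run;
```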
7.
Suppose you need to create the variable FullName by concatenating the values of FirstName, which contains first names, and LastName, which contains last names. What is the best way to remove extra blanks between first names and last names?
a.
data work.maillist;
set cert.maillist;
length FullName $ 40;
fullname=trim firstname||' '||lastname;
run;
b.
data work.maillist;
set cert.maillist;
length FullName $ 40;
fullname=trim(firstname)||' '||lastname;
run;
c.
data work.maillist;
set cert.maillist;
length FullName $ 40;
fullname=trim(firstname)||' '||trim(lastname);
run;
d.
data work.maillist;
set cert.maillist;
length FullName $ 40;
fullname=trim(firstname||' '||lastname);
run;
Correct answer: b
The TRIM function removes trailing blanks from character values. In this case, extra blanks must be removed from the values of FirstName. Although answer c also works, the extra TRIM function for the variable LastName is unnecessary. Because of the LENGTH statement, all values of FullName are padded to 40 characters.
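The effect of TRIM on the first operand can be seen in a small sketch. The names, the 15-character lengths, and the comparison variable Untrimmed are assumptions made for illustration.

```sas
data _null_;
   length FirstName LastName $ 15 FullName Untrimmed $ 40;
   FirstName = 'Ada';          /* hypothetical values; stored padded to 15 characters */
   LastName  = 'Lovelace';
   /* Without TRIM, the padding after FirstName ends up inside the result */
   Untrimmed = firstname || ' ' || lastname;
   /* TRIM removes the trailing blanks of FirstName before concatenation */
   FullName  = trim(firstname) || ' ' || lastname;
   put Untrimmed= / FullName=;
run;
```

In the log, Untrimmed shows a run of blanks between the two names, while FullName has a single blank.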
8.
Within the data set Cert.Bookcase, the variable Finish contains values such as ash, cherry, teak, matte-black. Which of the following creates a subset of the data in which the values of Finish contain the string walnut? Make the search for the string case-insensitive.
a.
data work.bookcase;
set cert.bookcase;
if index(finish,walnut) = 0;
run;
b.
data work.bookcase;
set cert.bookcase;
if index(finish,'walnut') > 0;
run;
c.
data work.bookcase;
set cert.bookcase;
if index(lowcase(finish),walnut) = 0;
run;
d.
data work.bookcase;
set cert.bookcase;
if index(lowcase(finish),'walnut') > 0;
run;
Correct answer: d
Use the INDEX function in a subsetting IF statement, enclosing the character string in quotation marks. Applying LOWCASE to Finish before the search makes the comparison case-insensitive. Only those observations in which the function locates the string and returns a value greater than 0 are written to the data set.
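A minimal sketch of the case-insensitive subset. The data set work.finish_demo and its values are hypothetical stand-ins for Cert.Bookcase, chosen so that the string appears in mixed case.

```sas
/* Hypothetical stand-in for Cert.Bookcase */
data work.finish_demo;
   length Finish $ 20;
   input Finish $;
   datalines;
Walnut
cherry
dark-WALNUT
teak
;
run;

data work.bookcase;
   set work.finish_demo;
   /* LOWCASE normalizes the value; INDEX returns 0 when the string is absent */
   if index(lowcase(finish), 'walnut') > 0;
run;
```

Under these assumptions, only the Walnut and dark-WALNUT observations are kept in work.bookcase.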