Reading Other Data File Types Flashcards

1
Q

How do you access an excel file in SAS?

A

Use SAS/ACCESS LIBNAME statement. Then SAS treats each worksheet in the workbook as a SAS data set.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

SAS/ACCESS LIBNAME statment

A

LIBNAME libref (engine) (PATH=) “workbook file path” (options);

Example:

LIBNAME orionx pcfiles path=”&path/sales.xls”;

(PATH= is used with pcfiles engine)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

How do you determine which engine to use in SAS/ACCESS LIBNAME statement?

A

SAS and MS Office offer 32-bit and 64-bit.

If bitness is the same, use excel engine.

If bitness is not the same, use pcfiles engine.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is the SAS name literal?

A

libref.’worksheetname$‘n

It’s a way to allow special characters or blanks in data set names. Useful since excel files are displayed in SAS with special character $.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Why is it important to dissociate a libref given to an excel file?

A

SAS locks the excel document while libref is assigned to it, and you won’t be able to open in in excel.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

How do you access data from a database?

A

Use SAS/ACCESS LIBNAME statement:

LIBNAME libref engine (SAS/ACCESS options);

engine is the database (e.g. oracle)

SAS/ACCESS options:

USER=

PASSWORD=

PATH=

SCHEMA=

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What are 3 types of raw data files?

A

text file, CSV file, and ASCII file

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What are 2 ways data can be presented in a raw data file?

A

Delimited file: Data values arranged sequentially. Data values separated by spaces or other special characters

Fixed column file: arranged in columns. Each field has a set width in every record.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What important information about each record do you need to specify to read a delimited raw data file in SAS?

A

Location of data value in the record, name of SAS variable in which to store the data, and the type of SAS variable

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What are the 3 different techniques to read raw data files in SAS?

A
  1. List input: reads standard and nonstandard data separated by delimiter
  2. Column input: reads standard data arranged in columns
  3. Formatted input: reads standard and nonstandard data arranged in colums
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is the difference between standard and nonstandard data?

A

SAS can read standard data without any special intructions.

Nonstandard data can include dates or values with special characters. SAS needs extra intructions to read this data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

How do you use the DATA Step to read raw data?

A

DATA new-data-set;

INFILE “file path” dlm=’,’;

INPUT variable1 variable2 $ variable3 $;

RUN;

dlm = what separates the data values. The default is a space or blank

INPUT specifies variables in order in which they appear in the raw data file. Character variables are followed by $

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

During compilation phase, what does SAS create that is particular to raw data files?

A

SAS creates an input buffer before the PDV. SAS reads data from the raw data file into the input buffer and the PDV reads from the input buffer.

It looks like a ruled grid where each character in the data value is placed in a different square

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

How do you specify the length of certain variables?

A

Use the LENGTH statement

LENGTH variable(s) ($) length;

Note: This needs to go before the input statement.

Note: The variables will be listed in the order and case used in the length statement.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is the difference between formats and informats?

A

Formats tell SAS how to write a value.

Informats tell SAS how to read a valued.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

INPUT statement for a modified list input

A

INPUT variable ($) variable (:informat.);

: tells SAS to read up to the delimiter

17
Q

Informat statement

A

($)informat(width).

18
Q

Colon format modifier, e.g. :date.

A

Tells SAS to read up to delimiter. Ignores width.

19
Q

How to add own data straight in the program

A

DATALINES statement

DATA work.mydata;

input variable1 variable2;

datalines;

(my data)

;

Note: datalines and the data should be the last thing on the DATA step

Note2: this doesn’t need to end in “run”

20
Q

If using DATALINES statement, where do you add the delimiter statement?

A

data work.mydata;

infile datalines dlm=’,’;

input variable1 variable2;

datalines;

(my data)

;

21
Q

DSD option - what is it and when to use

A

DSD = delimiter sensitive data

INFILE ‘file path’ (DLM=) DSD;

  • Sets default delimiter to a comma
  • treats consecutive delimiters as missing values
  • enables SAS to read values with embedded delimiters if the value is surrounded by quotation marks

Use it when have missing values in raw data file in list style

22
Q

MISSOVER Option - what is it and when to use it

A

INFILE ‘file path’ DLM= ‘,’ MISSOVER;

Use when there’s a missing value at the end of a record. This option sets the value to missing. Note that DSD would not work if missing value is at end of the line because it would not be between delimiters.

23
Q

True or False:

You can use the WHERE statement to subset observations read from a raw data file

A

FALSE

24
Q

When using list input, what is the default length for all variables?

A

8 bytes

25
Q

What do you use to read nonstandard data from a raw file?

A

Modified list input

26
Q

How does list input treat two or more consecutive delimiters (i.e. when there’s a missing value)?

A

It treats two or more delimiters as a single delimiter and not as a missing value. In this case, need DSD option.

27
Q

General form of Formatted Input

A

INPUT (pointer-control) variable1 informat.;

pointer-controls:

  1. @n moves the input pointer to a specific column number
  2. +n moves the input pointer forward to a column number that is relative to the current position
28
Q

How to read variable-length records that contain fixed-column data?

A

Use the PAD options in INFILE statement. It pads each record with blanks so that all data lines have the same length

29
Q

In modified list input, how do you read character values that contain embedded blanks? e.g Los Angeles

A

Use & modifier

data dataset;

infile filename;

length City $12;

input Rank City &;

run;