SAS P1 L8a Flashcards
What is another name for a raw data file?
Raw data file is also called a flat file
What are characteristics of raw data files? types? (3) where located? what per line? fields? what software needed?
Raw data files:
- can be text, CSV, or ASCII files
- external text file
- one record per line, usually multiple fields per record
- NOT software specific
How are raw data files arranged?
Raw data files: fields are delimited or arranged in columns
Fields in delimited raw data file: order? separated by? widths? column headings? documentation?
Fields in delimited raw data file:
- in sequential order
- separated by spaces or special char (ex: comma)
- fields can be varying widths
- no column headings
- will have external documentation = record layout, explains values
Fixed column file:
how fields ID’d?
field width?
Fixed column file:
- fields are ID’d by starting and ending column
- given field will begin in same column, have same width in every record
(not learning these here)
What info needs to be specified for SAS to read raw data file? (3)
For SAS to read raw data file, for each field, must specify:
- location of data value in record
- name of SAS variable in which to store data
- type of SAS variable
Name 3 techniques to read raw data files.
What data can they read?
How is data arranged?
Techniques to read raw data files:
- list input: standard and/or non-standard data, separated by delimiter
- column input: only standard data, arranged in columns
- formatted input: standard and/or non-standard data, arranged in columns
I think we only use list input in this course.
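A minimal sketch of how the three techniques look in an INPUT statement. The data set names, variable names, and in-stream DATALINES records are made up for illustration, and only list input is used in this course.

```sas
/* List input: values separated by a delimiter (space by default) */
data work.list_in;
    input name $ age;
    datalines;
Ann 34
Raj 41
;
run;

/* Column input: each field identified by its starting and ending column */
data work.column_in;
    input name $ 1-5 age 7-8;
    datalines;
Ann   34
Raj   41
;
run;

/* Formatted input: column pointer (@n) plus an informat for each field */
data work.formatted_in;
    input @1 name $5. @7 age 2.;
    datalines;
Ann   34
Raj   41
;
run;
```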
What is standard data?
Standard data: data SAS can read without any special instructions
What is non-standard data?
Non-standard data:
- includes data like dates, or special characters (ex: $)
- SAS needs special instructions to read non-standard data
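One way the "special instructions" can look: an informat attached to a variable in list input. This is a sketch that goes slightly beyond these cards. The colon modifier, the informats MMDDYY10. and DOLLAR10., and the sample record are assumptions chosen for illustration.

```sas
data work.hires;
    /* name is standard character data; the date and the dollar amount
       are non-standard, so each gets an informat after a colon */
    input name $ hire_date :mmddyy10. salary :dollar10.;
    format hire_date date9.;   /* display the stored date value readably */
    datalines;
Ann 01/15/2020 $52,000
;
run;

proc print data=work.hires;
run;
```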
How do you use data step to read raw data?
Commands / key words?
Use SET statement?
Example?
Using data step to read raw data file:
- use INFILE, INPUT
- DON’T use SET statement
Ex:
DATA output-SAS-data-set-name;
    INFILE 'raw-data-file-name';
    INPUT specifications;
RUN;
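A filled-in version of the template, as a sketch. The file name sales.txt, its contents, and the data set and variable names are hypothetical. The file is assumed to be space-delimited and to sit in the current SAS working directory.

```sas
/* Assumed contents of sales.txt, one record per line:
   Ann Smith 52000
   Raj Patel 61000 */
data work.sales;                            /* output SAS data set */
    infile 'sales.txt';                     /* physical name and location of the raw file */
    input first_name $ last_name $ salary;  /* one variable per field, in file order */
run;

proc print data=work.sales;                 /* check the result */
run;
```

There is no SET statement: SET reads an existing SAS data set, while INFILE and INPUT read the external raw file.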
INFILE statement:
identifies?
INFILE statement identifies physical name and location of raw data file
INPUT statement:
describes?
assigns?
INPUT statement:
- describes arrangement of values in raw data file
- assigns input values to corresponding SAS variables
INFILE syntax:
if referencing macro variable?
if delimiter is comma?
what is default delim?
- use double quotes if referencing macro variable (ex: "&path/sales.csv")
- use dlm = ',' if delimiter is comma, or change to appropriate char
- default delimiter is space
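A sketch combining both INFILE points. The macro variable value, the file sales.csv, and the variable names are hypothetical.

```sas
%let path = /folders/myfolders;         /* hypothetical directory holding sales.csv */

data work.sales;
    /* double quotes let SAS resolve &path; single quotes would not */
    infile "&path/sales.csv" dlm=',';   /* comma replaces the default space delimiter */
    input first_name $ last_name $ salary;
run;
```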
INPUT syntax: what if variable is string? what order? names must? include which variables? skip any fields?
INPUT variable1 {$} variable2 {$} …. ;
- add {$} if variable is string
- specify variables in the order in which their fields appear in the raw data file
- must follow SAS naming conventions for variables
- must include variables for ALL data up to and including last field you want
- any remaining fields will be ignored
- can’t skip over fields
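A sketch of the "up to and including the last field you want" rule. The file emps.csv, its layout, and the variable names are hypothetical. Only the last name is wanted, but the two fields before it still have to be read.

```sas
/* Assumed record layout of emps.csv: id,first name,last name,salary,city
   ex: 101,Ann,Smith,52000,Sydney */
data work.names;
    infile 'emps.csv' dlm=',';
    /* emp_id and first_name cannot be skipped; $ marks the string fields;
       salary and city come after the last field read, so they are ignored */
    input emp_id first_name $ last_name $;
run;
```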