Lesson 2: Accessing Data Flashcards
Syntax to view / readtable and column a
attributes
PROC CONTENTS DATA=data-set;
RUN;
Syntax to create a Library
LIBNAME - libref - engine - “path”;
LIBNAME - key word
libref - name of your library
engine - (base is default, can be omitted)
path - found in Properties
Syntax to delete library Ref
LIBNAME libref CLEAR;
libref if your LIBRARYNAME
Syntax to use PROC step with Library Ref
PROC CONTENTS DATA = LIBRARYNAME.FILENAME;
RUN;
What kind of statement is LIBNAME? Does it need a RUN statement?
LIBNAME statement is a global statement and does not need a RUN statement at the end.
What are LIBNAME name attributes?
- The libref must be 8 characters or less
- must start with either a letter or underscore
- includes only letters, numbers, and underscores
- the path must also be in quotation marks
How to access tables (data) in your library?
What is the syntax?
PROC CONTENTS DATA = libref.table-name
You use the library to access tables in your program by specifying
- the libref (or library name)
- a period and
- the table name (mylib.class).
You don’t have to type the file extension because SAS already knows you’re accessing SAS tables from that location.
Work Library
WORK library is default, automatic and temporary; all tables deleted at the end of the session.
Tables stored in the Work library are deleted at the end of each SAS session.
Work is the default library, so if a table name is provided in the program without a libref, the table will be read from or written to the Work library.
DATA=WORK.TEST
DATA=TEST
SASHELP Library
SASHELP is automatic and contains collection of sample tables and other data.
The library is automatically available when SAS starts.
DATA=SASHELP.CARS
Syntax to use library to READ data directly from Excel workbooks?
LIBNAME libref XLSX “path/file-name.xlsx”;
LIBNAME - keyword
libref - name of your library
XLSX - engine to READ Excel
The full path to the Excel file, including the .xlsx file extension, must be provided in DOUBLE quotation marks.
Importing unstructured data
You must import the data into a structured format, such as a SAS table, before you can use the data in a program. SAS can import delimited files with any character acting as the delimiter. By default, SAS assumes column names are found in the first row of the file.
PROC IMPORT
DATAFILE=”path/filename” DBMS=filetype
OUT=output-table
REPLACE
GUESSINGROWS=n|MAX;
RUN;
PROC IMPORT reads data from an external data source and writes it to a SAS data set.
DATAFILE= path and complete file name
DBMS= defines the file type (e.g. cvs, txt)
OUT= library and name of the SAS output table
REPLACE overwrites the SAS output table if it already exists
GUESSINGROWS=provides a number of rows to examine. By default it scans the first 20 rows
Importing Excel Files
Imports a copy of Excel data as a SAS table and uses that SAS table in your program.
PROC IMPORT can read only one spreadsheet at a time, and by default it reads the first worksheet.
PROC IMPORT DATAFILE=”path/file-name.xlsx”
DBMS=XLSX
OUT=output-table <REPLACE>;
SHEET=sheet-name;
RUN;</REPLACE>
What portions do SAS data sets have?
DATA and DESCRIPTOR
What are the three types of SAS columns?
Name, Type, and Length.
Columns are either
character or numeric
Character columns can have a length
between 1 and 32,767 bytes (1 byte = 1 character).
Numeric columns are stored with a length of
8 bytes.
Character columns can consist
letters, numbers, special characters, or blanks.
Numeric columns can consist of
digits 0-9, minus sign, decimal point, and E for scientific notation.
SAS date values are a type of numeric value and represent the number of days between
January 1, 1960, and a specified date
What is LIBREF
is the name of the library that can be used in a SAS program to read data files.
What does ENGINE do?
provides instructions for reading SAS files and other types of files.
What does PATH do?
provides the directory where the collection of tables is located.
The LIBREF remains active until
you clear it, delete it, or shut down SAS.
LIBNAME libref engine “path”;
LIBNAME libref CLEAR;
The XLSX engine enables
to read data directly from Excel workbooks. The XLSX engine requires the SAS/ACCESS to PC Files license.
The VALIDVARNAME=V7 system option forces
table and column names read from Excel to adhere to recommended SAS naming conventions.
Spaces and special symbols are replaced with underscores, and names greater than 32 characters are truncated.
Date values are automatically converted
to numeric SAS date values and formatted for easy interpretation
Worksheets or named ranges from the Excel workbook can be referenced in a SAS program as
libref.spreadsheet-name.
other databases, it’s a good practice to delete the libref at the end of your program with WHAT option?
CLEAR
OPTIONS VALIDVARNAME=V7;
LIBNAME libref XLSX “path/file.xlsx”;
The DBMS option identifies
the file type.
The CSV value is included with Base SAS.
The OUT= option provides
the library and name of the SAS output table
The REPLACE option is necessary to
overwrite the SAS output table if it exists.
SAS assumes that column names are in the
1st line of the text file and data begins on the second line
Date values are automatically converted to
numeric SAS date values and formatted for easy interpretation
The GUESSINGROWS= option indicates the
number of rows the IMPORT procedure scans in the input file to determine the appropriate data type and length of columns. The default value is 20 and the allowed range is 1 to 2147483647 (or MAX).
Syntax for importing CSV file is
PROC IMPORT DATAFILE=”file.csv”
DBMS=CSV
OUT=output-table REPLACE;
GUESSINGROWS=n;
RUN;
Syntax for Importing an Excel (XLSX) File
PROC IMPORT DATAFILE=”file.xlsx” DBMS=XLSX
OUT=output-table REPLACE;
SHEET=sheet-name;
RUN;
Missing numeric values are represented
with a period