Ch 15: Producing Descriptive Statistics Flashcards
The default statistics produced by the MEANS procedure are n-count, mean, minimum, maximum, and which one of the following statistics?
a. median
b. range
c. standard deviation
d. standard error of the mean
Correct answer: c
By default, the MEANS procedure produces the n-count (N), mean, minimum, maximum, and standard deviation for every numeric variable in the data set.
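A minimal sketch of the default behavior, assuming the SASHELP.CLASS sample data set that ships with SAS (it is not the data used in these questions). The first step requests no statistics, so the five defaults are printed. The second step names keywords explicitly, which replaces the default set.

```sas
/* No statistic keywords: prints N, Mean, Std Dev, Minimum, Maximum */
proc means data=sashelp.class;
run;

/* Naming keywords replaces the defaults with only what is listed */
proc means data=sashelp.class median range stderr;
run;
```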
Which statement limits a PROC MEANS analysis to the variables Boarded, Transfer, and Deplane?
a. by boarded transfer deplane;
b. class boarded transfer deplane;
c. output boarded transfer deplane;
d. var boarded transfer deplane;
Correct answer: d
To specify the variables that PROC MEANS analyzes, add a VAR statement and list the variable names.
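As an illustration of the VAR statement, the sketch below again assumes SASHELP.CLASS, because the data set containing Boarded, Transfer, and Deplane is not named in the question. Only the variables listed in VAR are analyzed.

```sas
/* SASHELP.CLASS has the numeric variables Age, Height, and Weight. */
/* VAR limits the analysis to Height and Weight; Age is left out.   */
proc means data=sashelp.class;
   var height weight;
run;
```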
The data set Cert.Health includes the following numeric variables. Which is a poor candidate for PROC MEANS analysis?
a. IDnum
b. Age
c. Height
d. Weight
Correct answer: a
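IDnum is an identifier, not a measurement. Unlike Age, Height, or Weight, its values are unlikely to yield any useful statistics: a mean or standard deviation of ID numbers has no meaningful interpretation.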
Which of the following statements is true regarding BY-group processing?
a. BY variables must be either indexed or sorted.
b. Summary statistics are computed for BY variables.
c. BY-group processing is preferred when you are categorizing data that contains few variables.
d. BY-group processing overwrites your data set with the newly grouped observations.
Correct answer: a
Unlike CLASS processing, BY-group processing requires that your data already be indexed or sorted in the order of the BY variables. You might need to run the SORT procedure before using PROC MEANS with a BY group.
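The sort-then-summarize pattern can be sketched as follows. This assumes SASHELP.CLASS as stand-in data, and work.class_sorted is an illustrative name. Writing the sorted copy with OUT= leaves the original data set untouched.

```sas
/* Sort a copy of the data by the BY variable */
proc sort data=sashelp.class out=work.class_sorted;
   by sex;
run;

/* BY-group processing: one separate table per value of Sex */
proc means data=work.class_sorted;
   by sex;
   var height weight;
run;
```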
Which group processing statement produced the PROC MEANS output shown below?
a. class sex survive;
b. class survive sex;
c. by sex survive;
d. by survive sex;
Correct answer: b
A CLASS statement produces a single large table, whereas BY-group processing creates a series of small tables. The order of the variables in the CLASS statement determines their order in the output table.
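For comparison, here is a hedged sketch of CLASS processing on SASHELP.CLASS (a stand-in; the Survive and Sex data from the question is not available here). No sorting is needed, and the result is a single table whose grouping columns follow the order given in the CLASS statement.

```sas
/* One table; Sex appears as the first grouping column, Age as the second */
proc means data=sashelp.class;
   class sex age;
   var height weight;
run;
```

Swapping the order to `class age sex;` would list Age first in the table.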
Which program can be used to create the following output?
a.
proc means data=cert.diabetes;
var age height weight;
class sex;
output out=work.sum_gender
mean=AvgAge AvgHeight AvgWeight;
run;
b.
proc freq data=cert.diabetes;
tables height weight sex;
run;
c.
proc means data=cert.diabetes noprint;
var age height weight;
class sex;
output out=work.sum_gender
mean=AvgAge AvgHeight AvgWeight;
run;
d.
Both a and b.
Correct answer: a
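You can use PROC MEANS to create the table. The MEANS procedure computes descriptive statistics for the variables Age, Height, and Weight for each Sex group and, by default, prints them as a report. Option c adds NOPRINT, which suppresses that report and creates only the output data set. PROC FREQ (option b) produces frequency tables, not descriptive statistics.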
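To see what the OUTPUT statement and NOPRINT do together, the sketch below reuses the statements from the question but swaps in SASHELP.CLASS for cert.diabetes (an assumption made so the step can run without the Cert library). NOPRINT suppresses the report, so PROC PRINT is used to view the data set that was created.

```sas
/* NOPRINT: no report is displayed, only work.sum_gender is created */
proc means data=sashelp.class noprint;
   var age height weight;
   class sex;
   output out=work.sum_gender
          mean=AvgAge AvgHeight AvgWeight;
run;

/* The data set also holds the automatic variables _TYPE_ and _FREQ_; */
/* _TYPE_=0 is the overall row, _TYPE_=1 rows are the Sex groups.     */
proc print data=work.sum_gender;
run;
```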
By default, PROC FREQ creates a table of frequencies and percentages for which data set variables?
a. character variables
b. numeric variables
c. both character and numeric variables
d. none: variables must always be specified
Correct answer: c
By default, when no TABLES statement is specified, PROC FREQ creates a one-way frequency table for every variable in the data set, both character and numeric.
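A short sketch of the difference, assuming the SASHELP.CLASS sample data set. The first step has no TABLES statement, so every variable gets a table. The second restricts the output to the variables named.

```sas
/* No TABLES statement: one-way tables for Name, Sex, Age, Height, Weight */
proc freq data=sashelp.class;
run;

/* TABLES limits the output to the listed variables */
proc freq data=sashelp.class;
   tables sex age;
run;
```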
Frequency distributions work best with variables that contain which types of values?
a. continuous values
b. numeric values
c. categorical values
d. unique values
Correct answer: c
Both continuous values and unique values can result in lengthy, meaningless tables. Frequency distributions work best with categorical values.
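One common way to make a continuous variable usable in a frequency table is to group its values with a user-defined format. The sketch below is only an illustration: the format name wtgrp, its cutoff of 100, and the use of SASHELP.CLASS are all assumptions, not part of the questions.

```sas
/* Hypothetical format that collapses Weight into two categories */
proc format;
   value wtgrp low-<100 = 'Under 100'
               100-high = '100 or more';
run;

/* PROC FREQ counts the formatted values instead of each raw weight */
proc freq data=sashelp.class;
   tables weight;
   format weight wtgrp.;
run;
```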
Which PROC FREQ step produced this two-way table?
a.
proc freq data=cert.diabetes;
tables height weight;
format height htfmt. weight wtfmt.;
run;
b.
proc freq data=cert.diabetes;
tables weight height;
format weight wtfmt. height htfmt.;
run;
c.
proc freq data=cert.diabetes;
tables height*weight;
format height htfmt. weight wtfmt.;
run;
d.
proc freq data=cert.diabetes;
tables weight*height;
format weight wtfmt. height htfmt.;
run;
Correct answer: d
An asterisk is used to join the variables in a two-way TABLES statement. The first variable forms the table rows. The second variable forms the table columns. In answer d, Weight forms the rows and Height forms the columns.
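The row and column roles can be checked with a small sketch on SASHELP.CLASS (stand-in data; the htfmt. and wtfmt. formats from the question are not defined here, so none are applied).

```sas
/* Sex is listed first, so it forms the rows; Age forms the columns */
proc freq data=sashelp.class;
   tables sex*age;
run;
```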
Which PROC FREQ step produced this table?
a.
proc freq data=cert.diabetes;
tables sex weight / list;
format weight wtfmt.;
run;
b.
proc freq data=cert.diabetes;
tables sex*weight / nocol;
format weight wtfmt.;
run;
c.
proc freq data=cert.diabetes;
tables sex weight / norow nocol;
format weight wtfmt.;
run;
d.
proc freq data=cert.diabetes;
tables sex*weight / nofreq norow nocol;
format weight wtfmt.;
run;
Correct answer: d
An asterisk is used to join the variables in crosstabulation tables. The only results shown in this table are cell percentages. The NOFREQ option suppresses cell frequencies, the NOROW option suppresses row percentages, and the NOCOL option suppresses column percentages.
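A minimal sketch of these suppression options, again assuming SASHELP.CLASS in place of cert.diabetes. With all three options in effect, each cell of the crosstabulation shows only its percentage of the overall total.

```sas
/* NOFREQ drops cell counts, NOROW drops row percentages, */
/* NOCOL drops column percentages                         */
proc freq data=sashelp.class;
   tables sex*age / nofreq norow nocol;
run;
```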