*Copyright @ www.mycsg.in;
Create sample datasets
`class01` is a direct copy of `sashelp.class` and is used in most examples
`class02` introduces a missing value in `sex` so that the learner can observe how PROC FREQ handles missing category values
data class01;
    set sashelp.class;
run;

data class02;
    set sashelp.class;
    if age=16 then sex="";
run;
SAS Log
data class01;
    set sashelp.class;
run;
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.CLASS01 has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

data class02;
    set sashelp.class;
    if age=16 then sex="";
run;
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.CLASS02 has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
PROC FREQ with all defaults
If `data=` is not used, PROC FREQ analyses the most recently created dataset in the session
PROC FREQ creates one-way frequency tables for all variables by default
No output dataset is created unless additional options request one
The default statistics shown are frequency, percent, cumulative frequency, and cumulative percent
proc freq;
run;
SAS Log
proc freq;
ERROR: File WORK.TEMP.DATA does not exist.
run;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
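In the log shown here the step failed, apparently because the most recent dataset reference in that session pointed to WORK.TEMP, which did not exist
When the step runs successfully (for example, immediately after `class02` is created), review the output window and notice that a separate frequency table is produced for each variable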
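To check which dataset a `proc freq;` step without `data=` would read, the automatic macro variable `SYSLAST` can be inspected first. The following is a minimal sketch, assuming the sample datasets above were just created in the current session; the `%put` message text is only illustrative.

```sas
/* Write the name of the most recently created dataset to the log */
%put Most recent dataset: &syslast;

/* Same behaviour as omitting data=, but the default is spelled out */
proc freq data=_last_;
run;
```

Naming the dataset explicitly with `data=` avoids depending on whatever happened to be created last.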
Using the DATA= option to specify the input dataset
proc freq data=class01;
run;
SAS Log
proc freq data=class01;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS01.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds
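As noted earlier, PROC FREQ writes no output dataset unless one is requested. One way to request it is the `out=` option on the TABLES statement. The sketch below is illustrative only; the dataset name `sex_counts` is a hypothetical choice, not part of the original example.

```sas
/* OUT= on the TABLES statement saves the one-way counts to a dataset */
proc freq data=class01;
    tables sex / out=sex_counts;   /* sex_counts is a hypothetical name */
run;

/* The saved dataset has one row per level of sex, with COUNT and PERCENT */
proc print data=sex_counts;
run;
```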
Using the TABLES statement to request frequencies for a specific variable
The `tables` statement limits the report to the variables listed
Here we request a one-way table for `sex` only
proc freq data=class01;
    tables sex;
run;
SAS Log
proc freq data=class01;
    tables sex;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS01.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
Using the TABLES statement to request frequencies for multiple individual variables
List multiple variables separated by spaces on the same TABLES statement
PROC FREQ creates one separate one-way table for each variable listed
proc freq data=class01;
    tables sex age;
run;
SAS Log
proc freq data=class01;
    tables sex age;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS01.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
Using multiple TABLES statements
Multiple TABLES statements can also be used instead of listing all variables in one statement
The same tables are produced, but separate statements can be easier to read and allow different options for each table
proc freq data=class01;
    tables sex;
    tables age;
run;
SAS Log
proc freq data=class01;
    tables sex;
    tables age;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS01.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
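One practical reason to split the request across several TABLES statements is that options after the slash apply only to the statement they appear on. A minimal sketch, using the `nocum` option purely as an illustration:

```sas
proc freq data=class01;
    tables sex / nocum;   /* cumulative columns are dropped for sex only */
    tables age;           /* age keeps the default cumulative columns */
run;
```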
Using BY group processing to obtain frequencies of one variable grouped by another variable
Use a BY statement for the grouping variable
Before using BY processing, the dataset must be sorted by the same BY variable
PROC FREQ then creates a separate frequency table for each BY group
proc sort data=class01 out=class_sort;
    by sex;
run;

proc freq data=class_sort;
    by sex;
    tables age;
run;
SAS Log
proc sort data=class01 out=class_sort;
    by sex;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS01.
NOTE: The data set WORK.CLASS_SORT has 19 observations and 5 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

proc freq data=class_sort;
    by sex;
    tables age;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS_SORT.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
Cross tabulation to obtain frequencies of one variable grouped by another variable
List the variables separated by an asterisk
The unique values of the first listed variable appear as rows
The unique values of the second listed variable appear as columns
Frequency, percent, row percent, and column percent are shown in the cells by default
proc freq data=class01;
    tables sex*age;
run;
SAS Log
proc freq data=class01;
    tables sex*age;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS01.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
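Because the order of the variables decides which one forms the rows and which the columns, reversing the request transposes the layout. A small sketch of the reversed request, for comparison with the table above:

```sas
proc freq data=class01;
    tables age*sex;   /* age values form the rows, sex values the columns */
run;
```

The cell counts are the same as for `sex*age`; only the orientation, and therefore the meaning of row percent and column percent, changes.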
Cross tabulation of more than two variables
For more than two variables, PROC FREQ creates layered tables
In a three-way request, a separate two-way table of the second variable (rows) by the third variable (columns) is created for each unique value of the first listed variable
data class_agegroup;
    set class01;
    length age_group $10;
    if age lt 13 then age_group="Pre-teen";
    else if age lt 19 then age_group="Teen";
run;

proc freq data=class_agegroup;
    tables age_group*age*sex;
run;
SAS Log
data class_agegroup;
    set class01;
    length age_group $10;
    if age lt 13 then age_group="Pre-teen";
    else if age lt 19 then age_group="Teen";
run;
NOTE: There were 19 observations read from the data set WORK.CLASS01.
NOTE: The data set WORK.CLASS_AGEGROUP has 19 observations and 6 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

proc freq data=class_agegroup;
    tables age_group*age*sex;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS_AGEGROUP.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
Options to control the statistics displayed in the output window
One-way tables: default output
proc freq data=class01;
    tables sex;
run;
SAS Log
proc freq data=class01;
    tables sex;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS01.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
One-way tables: suppress cumulative statistics
proc freq data=class01;
    tables sex / nocum;
run;
SAS Log
proc freq data=class01;
    tables sex / nocum;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS01.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
One-way tables: suppress percent
proc freq data=class01;
    tables sex / nopercent;
run;
SAS Log
proc freq data=class01;
    tables sex / nopercent;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS01.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
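One-way tables: attempting to suppress the count
`nocount` is not a PROC FREQ option; as the log for this step shows, SAS assumes it is a misspelling of `nocol`, which has no effect on a one-way table
The frequency column of a one-way table is therefore still displayed; the related `nofreq` option suppresses cell frequencies only in crosstabulation tables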
proc freq data=class01;
    tables sex / nocount;
run;
SAS Log
proc freq data=class01;
    tables sex / nocount;
                 -------
                 1
WARNING 1-322: Assuming the symbol NOCOL was misspelled as nocount.
run;
NOTE: There were 19 observations read from the data set WORK.CLASS01.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
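The warning in this log indicates that `nocount` is not recognised by PROC FREQ. The documented option for hiding counts is `nofreq`, and it acts on the cells of crosstabulation tables rather than on one-way tables. A minimal sketch of where it does take effect, reusing the `sex*age` request from the earlier cross tabulation example:

```sas
proc freq data=class01;
    tables sex*age / nofreq;   /* cells keep the percentages but omit the counts */
run;
```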
One-way tables: suppress multiple elements
Multiple table options can be listed together after the slash, separated by spaces
proc freq data=class01;
    tables sex / nopercent nocum;
run;
SAS Log
proc freq data=class01;
    tables sex / nopercent nocum;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS01.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
N-way tables: default output
For two-way tables, the default statistics include frequency, percent, row percent, and column percent
proc freq data=class01;
    tables sex*age;
run;
SAS Log
proc freq data=class01;
    tables sex*age;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS01.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
N-way tables: suppress percent values
proc freq data=class01;
    tables sex*age / nopercent norow nocol;
run;
SAS Log
proc freq data=class01;
    tables sex*age / nopercent norow nocol;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS01.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
Display missing values as a valid category
By default, PROC FREQ leaves missing values out of the table and out of the percentage calculations, and only reports their count as "Frequency Missing" below the table
The `missing` option tells PROC FREQ to treat missing values as a valid category level
We use `class02` here because one value of `sex` was intentionally set to missing
proc freq data=class02;
    tables sex / missing;
run;
SAS Log
proc freq data=class02;
    tables sex / missing;
run;
NOTE: There were 19 observations read from the data set WORK.CLASS02.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
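The effect of `missing` is easiest to see by requesting the same table with and without it. A minimal side-by-side sketch on `class02`:

```sas
proc freq data=class02;
    tables sex;             /* missing sex is left out of the rows and percentages */
    tables sex / missing;   /* missing sex appears as its own category level */
run;
```

In the second table the percentages are computed over all observations, including the one with missing `sex`.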
Key points to remember
`proc freq` is used for one-way and multiway frequency tables
The `tables` statement controls which variables are tabulated
BY processing requires prior sorting and creates separate tables for each BY group
Table options such as `nocum`, `nopercent`, `norow`, `nocol`, and `nofreq` control what is displayed; `nocount` is not a valid PROC FREQ option
The `missing` option lets missing values appear as a valid category in the output