*Copyright @ www.mycsg.in;
Create sample datasets
proc freq with all defaults
- uses the most recently created dataset because the data= option is not specified
- gives one way frequencies of unique values in all variables - both numeric and character variables
- output can be seen in output/results window - no dataset is created
- default stats provided in output window are: frequency, percentage, cumulative frequency and cumulative percentage
- a separate table is created for each variable in the output window
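A minimal sketch of the default call described above. The dataset name work.class and the use of sashelp.class as its source are assumptions made for illustration; they are not taken from the original material.

```sas
/* Assumption: sashelp.class is available; work.class is a hypothetical sample dataset */
data work.class;
    set sashelp.class;
run;

/* No data= option: PROC FREQ reads the most recently created dataset (work.class). */
/* No tables statement: a one-way table is produced for every variable.             */
proc freq;
run;
```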
using data= option to mention the input dataset
using tables statement to request frequencies for a specific variable
using tables statement to request frequencies for multiple individual variables
- list the variables separated by a space in the tables statement
using multiple tables statements
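The requests above could be written roughly as follows. sashelp.class and its variables sex and age are assumed here as a stand-in for the sample dataset.

```sas
/* Assumption: sashelp.class stands in for the sample dataset */

/* data= names the input dataset; a single variable on the tables statement */
proc freq data=sashelp.class;
    tables sex;
run;

/* several variables separated by spaces: one table per variable */
proc freq data=sashelp.class;
    tables sex age;
run;

/* the same request written with multiple tables statements */
proc freq data=sashelp.class;
    tables sex;
    tables age;
run;
```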
using by group processing to obtain frequencies of a variable grouped by another variable
- use the by statement for the grouping variable
- pre-sort the dataset by the grouping variable using the sort procedure
- separate table is created for each unique value in the by group variable
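A short sketch of by-group processing, assuming sashelp.class as input and a hypothetical sorted copy named work.class_sorted.

```sas
/* Assumption: sashelp.class is the input; work.class_sorted is a hypothetical name */

/* BY-group processing needs the data sorted by the grouping variable */
proc sort data=sashelp.class out=work.class_sorted;
    by sex;
run;

/* one frequency table of age for each value of sex */
proc freq data=work.class_sorted;
    by sex;
    tables age;
run;
```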
cross-tabulation to obtain frequencies of a variable grouped by another variable
- list the variables separated by an asterisk
- the unique values of the first listed variable appear as rows
- the unique values of the second listed variable appear as columns
- A row for row totals and column for column totals will also be displayed
- frequency, percent, row percent and column percent appear in the cells at the intersection of row and column values
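A two-way request along these lines, again assuming sashelp.class as the input dataset:

```sas
/* Assumption: sashelp.class stands in for the sample dataset */

/* sex values form the rows, age values form the columns */
proc freq data=sashelp.class;
    tables sex*age;
run;
```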
cross tabulation of more than two variables
- a separate table is created for each unique value in the first listed variable
- the unique values of the second listed variable appear as rows
- the unique values of the third listed variable appear as columns
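A three-way request might look like this. sashelp.cars is assumed here only because it has three categorical variables (origin, type, drivetrain); it is not part of the original material.

```sas
/* Assumption: sashelp.cars is available */

/* one table per value of origin; type as rows, drivetrain as columns */
proc freq data=sashelp.cars;
    tables origin*type*drivetrain;
run;
```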
options to control the stats displayed in output window
one way tables - default
one way tables - suppress cumulative stats
one way tables - suppress percent
one way tables - suppress count
- there is no option to suppress the count in one-way tables (nofreq applies only to crosstabulation tables)
one way tables - suppress multiple possible elements
- list the options seen above separated by a space
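The one-way display options discussed above can be sketched as follows, assuming sashelp.class as input.

```sas
/* Assumption: sashelp.class stands in for the sample dataset */
proc freq data=sashelp.class;
    tables sex;                     /* default: all four statistics       */
    tables sex / nocum;             /* drop the cumulative statistics     */
    tables sex / nopercent;         /* drop the percentages               */
    tables sex / nocum nopercent;   /* options combined, space separated  */
run;
```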
n-way tables - default
- default stats: frequency, percentage, column percentage, row percentage
n-way tables - suppress percentage
n-way tables - suppress row percentage
n-way tables - suppress column percentage
n-way tables - suppress both row and column percentage
- list the options used above, separated by a space
n-way tables - suppress both row and column percentage and also suppress (total) percentage
- list the options used above, separated by a space
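The n-way display options above, sketched on an assumed sashelp.class input:

```sas
/* Assumption: sashelp.class stands in for the sample dataset */
proc freq data=sashelp.class;
    tables sex*age;                            /* default cell statistics        */
    tables sex*age / nopercent;                /* no overall percentage          */
    tables sex*age / norow;                    /* no row percentage              */
    tables sex*age / nocol;                    /* no column percentage           */
    tables sex*age / norow nocol;              /* neither row nor column percent */
    tables sex*age / norow nocol nopercent;    /* frequency only                 */
run;
```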
option to change the tabular view of n-way tables to list view (view all the levels across variables as columns)
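A minimal sketch of the list view, assuming sashelp.class as input:

```sas
/* Assumption: sashelp.class stands in for the sample dataset */

/* list: one row per sex/age combination instead of a grid */
proc freq data=sashelp.class;
    tables sex*age / list;
run;
```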
option to save the frequencies and percentages into a dataset
- we need to use out= option on table(s) statement to save the stats into a dataset
- by default, the dataset contains all the variables listed on the tables statement plus COUNT and PERCENT
- the default results continue to be displayed in the output/results window in addition to the dataset being created
- the options used to suppress stats or change the layout (nopercent, nocum, nocol, norow, list) apply only to the results displayed in the output window
out= option on n-way table
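Saving the statistics could look like this. The input sashelp.class and the output names work.sex_counts and work.sex_age_counts are hypothetical choices for illustration.

```sas
/* Assumption: sashelp.class is the input; output dataset names are hypothetical */

/* one-way: work.sex_counts holds sex, COUNT and PERCENT */
proc freq data=sashelp.class;
    tables sex / out=work.sex_counts;
run;

/* n-way: work.sex_age_counts holds sex, age, COUNT and PERCENT */
proc freq data=sashelp.class;
    tables sex*age / out=work.sex_age_counts;
run;
```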
using noprint option on proc freq statement
- when creating a dataset for stats we may not be interested in seeing the results in output window
- in such cases, we can use noprint option on proc freq statement to suppress the results in output window
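A sketch of the same idea with the displayed results switched off, assuming sashelp.class as input and a hypothetical output name:

```sas
/* Assumption: sashelp.class is the input; work.sex_counts is a hypothetical name */

/* noprint: nothing is written to the output window, only the dataset is created */
proc freq data=sashelp.class noprint;
    tables sex / out=work.sex_counts;
run;
```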
multiple one-way tables on tables statement with out= option
- output dataset will be created for the last table requested - sex in the first example
- output dataset will be created for the last table requested - age in the second example
multiple n-way tables on tables statement with out= option
- output dataset will be created for the last table requested
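The behaviour described above can be checked with a sketch like the following. The inputs sashelp.class and sashelp.cars and the output names are assumptions for illustration.

```sas
/* Assumption: sashelp.class and sashelp.cars are available; output names are hypothetical */

/* two one-way requests: work.one_way_last holds only the sex table */
proc freq data=sashelp.class noprint;
    tables age sex / out=work.one_way_last;
run;

/* two n-way requests: work.n_way_last holds only origin*drivetrain */
proc freq data=sashelp.cars noprint;
    tables origin*type origin*drivetrain / out=work.n_way_last;
run;
```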
create multiple datasets within the same procedure invocation
- we need to use multiple table statements
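One way to get a dataset per table in a single invocation is sketched here, with sashelp.class assumed as input and hypothetical output names.

```sas
/* Assumption: sashelp.class is the input; output dataset names are hypothetical */

/* each tables statement carries its own out= option */
proc freq data=sashelp.class noprint;
    tables sex     / out=work.sex_counts;
    tables age     / out=work.age_counts;
    tables sex*age / out=work.sex_age_counts;
run;
```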
missing values in variables for which frequency tables are requested
missing values - results in output window
- displays the total number of records with missing levels at the bottom of the table (Frequency Missing = n)
include missing value as level in the table
- missing option on tables statement
missing levels: impact on output dataset
- missing values are treated as a level, and a separate row is created for that level
- note that in the dataset, percentages are calculated from the non-missing level counts only, and no percentage is calculated for the missing level
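The missing-value behaviour above can be tried on a small dataset. The dataset work.visits, its variables, and the output name work.sex_counts are hypothetical and exist only for this sketch.

```sas
/* Assumption: work.visits is a made-up dataset; a period is read as a missing value */
data work.visits;
    input id sex $;
    datalines;
1 M
2 F
3 .
4 M
5 .
;
run;

proc freq data=work.visits;
    /* default: missing values are left out of the table and reported below it */
    tables sex;
    /* missing: the missing value is shown as a level of its own */
    tables sex / missing;
    /* out= without missing: the dataset gets a row for the missing level */
    tables sex / out=work.sex_counts;
run;

/* inspect COUNT and PERCENT for each level, including the missing one */
proc print data=work.sex_counts;
run;
```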