Common scenarios where appending datasets is required in SAS
Combine the records of students of different classes when a separate dataset is present for each class
Combine the records of male and female students when male and female records are present in different datasets
Combine the records of screen failures and treated subjects when they are present in different datasets
Combine the records of inclusion criteria and exclusion criteria not met by subjects when they are present in different datasets
Combine the prior medications data and concomitant medications data of subjects when they are present in different datasets
Combine all the input datasets available to check the latest available date for a subject across all datasets
What does the SET statement of a SAS data step do?
By definition, the SET statement reads observations from one or more datasets
When only one dataset name is provided on the SET statement, the data step reads all the observations from that input
dataset, unless otherwise specified, and writes them to the dataset named on the DATA statement, creating a new dataset
(or overwriting an existing dataset when the same dataset name is provided on the DATA statement)
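
The single-dataset case can be sketched as follows. This is a minimal illustration, not code from the original material: it assumes the SASHELP.CLASS sample dataset shipped with SAS is available, and the output names class_copy and class_teens are hypothetical.

```sas
/* Copy every observation of SASHELP.CLASS into a new WORK dataset */
data class_copy;            /* hypothetical output dataset name */
    set sashelp.class;      /* one input dataset on the SET statement */
run;

/* "Unless otherwise specified": a WHERE statement limits which observations are read */
data class_teens;           /* hypothetical output dataset name */
    set sashelp.class;
    where age >= 13;        /* only students aged 13 or older are read */
run;
```

Running the first step again simply overwrites class_copy, because the same name appears on the DATA statement.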
As noted in the first point, the SET statement can read observations from more than one dataset in a single data step
We use this ability to read observations from multiple datasets for appending observations from one dataset to another dataset
To append datasets, we need to specify the names of the input datasets, separated by spaces, on the SET statement
When appending, SAS reads all the observations from the first input dataset and then starts reading the observations from the next input dataset.
That is, in the output dataset we will see the observations from the first dataset first, followed by the observations from the second input dataset, and so on
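
The read order described above can be checked with a small sketch. The dataset names first_ds, second_ds and appended, and the variables id and value, are hypothetical and chosen only for illustration.

```sas
/* Two tiny input datasets with identical variables */
data first_ds;
    input id $ value;
    datalines;
A 1
B 2
;
run;

data second_ds;
    input id $ value;
    datalines;
C 3
D 4
;
run;

/* Append: all observations of first_ds are read before any from second_ds */
data appended;
    set first_ds second_ds;
run;

/* Print the result to inspect the order of the observations */
proc print data=appended;
run;
```

The printed dataset should list ids A and B (from first_ds) before C and D (from second_ds).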
Create some example datasets
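
The code that builds the example datasets is not shown here, so the following is only one possible sketch. It assumes the inputs are derived from the SASHELP.CLASS and SASHELP.CARS sample datasets; SASHELP.CLASS does contain 10 male and 9 female students, which matches the counts mentioned later, but the way the cars are split into sedan, sports and othercars is an assumption.

```sas
/* Student datasets: split SASHELP.CLASS by sex (assumed source) */
data males;
    set sashelp.class;
    where sex = 'M';
run;

data females;
    set sashelp.class;
    where sex = 'F';
run;

/* Car datasets: split SASHELP.CARS by vehicle type (assumed source and split) */
data sedan sports othercars;
    set sashelp.cars;
    if type = 'Sedan' then output sedan;
    else if type = 'Sports' then output sports;
    else output othercars;   /* every remaining vehicle type */
run;
```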
Combining input datasets that have the same attributes using the SET statement
When appending datasets with set statement, SAS reads all variables and observations from the input datasets
Create a dataset named m_and_f by appending males and females datasets using set statement
We need to specify the name of the output dataset 'm_and_f' on the DATA statement, to hold the combined variables and observations
We need to specify the names of the input datasets (males and females) on the SET statement, separated by a space.
Notice that in the output dataset, all 10 male student records (observations from the males dataset) appear first, followed by the
9 observations from the females dataset
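
A minimal sketch of this step, assuming the males and females datasets already exist in the WORK library with the same variables:

```sas
/* Output dataset on the DATA statement, input datasets on the SET statement */
data m_and_f;
    set males females;   /* males are read first, then females */
run;

/* Optional check of the order of the observations */
proc print data=m_and_f;
run;
```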
Create a dataset named cars1 by appending sedan and sports datasets using SET statement
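
This step might look as follows, assuming the sedan and sports datasets already exist in the WORK library:

```sas
/* cars1 holds all sedan observations followed by all sports observations */
data cars1;
    set sedan sports;
run;
```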
Create a dataset named allcars by appending cars1 and othercars datasets using SET statement
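
A sketch of this second append, assuming cars1 was created in the previous step and othercars already exists in the WORK library:

```sas
/* allcars holds the cars1 observations followed by the othercars observations */
data allcars;
    set cars1 othercars;
run;
```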
Create a dataset named allcars2 by appending sedan, sports and othercars datasets using SET statement
Recreating input datasets
Notice that we are appending three datasets at a time
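
Appending three datasets in a single step can be sketched like this, assuming sedan, sports and othercars exist in the WORK library:

```sas
/* Three input datasets on one SET statement, read in the order listed */
data allcars2;
    set sedan sports othercars;
run;
```

Because the read order is sedan, sports, othercars, allcars2 should hold the same observations in the same order as allcars, which was built in two steps.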