This lesson covers appending datasets using PROC APPEND
For an introduction to appending concepts, common scenarios, and appending using the SET statement, refer to lesson SAS_APPENDING_L101
What is PROC APPEND?
PROC APPEND is a SAS procedure specifically designed to append observations from one dataset to another
It adds observations from the DATA dataset to the end of the BASE dataset
Unlike the SET statement, PROC APPEND does not read the BASE dataset - it only reads the DATA dataset
This makes PROC APPEND significantly more efficient than the SET statement when the BASE dataset is large
A common use case is appending a new batch of records to a large accumulating dataset
Syntax of PROC APPEND
The basic syntax for PROC APPEND takes two arguments: BASE= (required) specifies the target dataset that will receive the appended observations, and DATA= specifies the source dataset whose observations will be added (if DATA= is omitted, SAS uses the most recently created dataset)
Optional arguments such as FORCE can follow the DATA= option when needed
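A minimal sketch of the syntax in use. The dataset names work.base_ds and work.new_ds and their contents are hypothetical, made up here only so the step can be run on its own.

```sas
/* Hypothetical target dataset (BASE) with two observations */
data work.base_ds;
    input id score;
    datalines;
1 90
2 85
;
run;

/* Hypothetical source dataset (DATA) with one new observation */
data work.new_ds;
    input id score;
    datalines;
3 78
;
run;

/* General form: PROC APPEND BASE=target DATA=source <FORCE>; */
proc append base=work.base_ds data=work.new_ds;
run;
```

After this step work.base_ds holds three observations and work.new_ds is unchanged.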
BASE= specifies the dataset to which observations are to be appended (the target dataset)
DATA= specifies the dataset whose observations are to be added to the BASE dataset
The BASE dataset is modified directly - no separate output dataset is created
Key difference between SET statement and PROC APPEND
The SET statement reads every observation from ALL input datasets, including the BASE dataset, and writes a new output dataset
PROC APPEND reads only the DATA dataset and appends it to the BASE dataset without reading the BASE dataset
When the BASE dataset has millions of records and DATA has a small batch of new records, PROC APPEND is much faster
With the SET statement, the time to append grows as the BASE dataset grows - with PROC APPEND it depends mainly on the size of the DATA dataset
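The two approaches can be put side by side as a rough sketch. The datasets work.history and work.batch are hypothetical, generated here only to stand in for a large accumulating dataset and a small batch of new records.

```sas
/* Hypothetical large BASE dataset: one million observations */
data work.history;
    do id = 1 to 1000000;
        amount = id * 2;
        output;
    end;
run;

/* Hypothetical small batch of ten new observations */
data work.batch;
    do id = 1000001 to 1000010;
        amount = id * 2;
        output;
    end;
run;

/* Approach 1: SET reads every observation of both inputs
   and writes a complete new dataset */
data work.history_set;
    set work.history work.batch;
run;

/* Approach 2: PROC APPEND reads only work.batch and adds its
   observations to the end of work.history in place */
proc append base=work.history data=work.batch;
run;
```

Both approaches end with 1,000,010 observations. Comparing the real time and CPU time that the log reports for each step is one way to see the difference on your own system.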
Create example datasets
Recreating the same example datasets used in SAS_APPENDING_L101 for consistency
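A sketch of how males and females could be recreated. It assumes both are subsets of SASHELP.CLASS split on the Sex variable, which matches the 10 and 9 observations quoted later in this lesson. The definitions of sedan and sports are not shown in this lesson, so they are not reproduced here.

```sas
/* Assumption: males = rows of SASHELP.CLASS where Sex is 'M' */
data males;
    set sashelp.class;
    where sex = 'M';
run;

/* Assumption: females = rows of SASHELP.CLASS where Sex is 'F' */
data females;
    set sashelp.class;
    where sex = 'F';
run;
```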
Appending datasets with same attributes using PROC APPEND
When BASE and DATA datasets have the same variables with the same attributes, PROC APPEND works without any additional options
Observations from the DATA dataset are added to the end of the BASE dataset
The BASE dataset is updated in place - it will contain observations from both the original BASE and the DATA dataset after the procedure runs
Append females dataset to males dataset using PROC APPEND
Before PROC APPEND, males has 10 observations and females has 9 observations
After PROC APPEND, males will contain 19 observations - original 10 male records followed by 9 female records
The females dataset remains unchanged
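The append described above might look like the sketch below. It assumes males and females are subsets of SASHELP.CLASS, and it rebuilds both first: PROC APPEND changes males in place, so rerunning the append alone would add the female records a second time.

```sas
/* Rebuild both datasets so the step can be rerun safely
   (assumed definitions: subsets of SASHELP.CLASS by Sex) */
data males;
    set sashelp.class(where=(sex = 'M'));
run;

data females;
    set sashelp.class(where=(sex = 'F'));
run;

/* Add the observations in females to the end of males */
proc append base=males data=females;
run;

/* Count the observations now in males */
proc sql;
    select count(*) as n_obs from males;
quit;
```

Under that assumption the count should be 19, and the log note for the PROC APPEND step reports how many observations were added.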
Append sports dataset to sedan dataset using PROC APPEND
Before PROC APPEND, sedan has 262 observations and sports has 120 observations
After PROC APPEND, sedan will contain 382 observations
PROC APPEND with FORCE option
By default, PROC APPEND requires the BASE and DATA datasets to have the same variables with compatible attributes
If the DATA dataset has variables not present in the BASE dataset, PROC APPEND writes an error to the log and appends nothing
If a variable has a longer length in DATA than in BASE, PROC APPEND writes an error to the log and appends nothing
The FORCE option instructs PROC APPEND to proceed despite these differences
When FORCE is used and DATA has extra variables not in BASE, those variables are dropped during the append
When FORCE is used and a variable is longer in DATA than in BASE, values may be truncated - always check the log for warnings
Appending datasets with different variables using FORCE option
Creating two datasets where the DATA dataset has an extra variable not present in the BASE dataset
class_with_bmi contains all class variables plus BMI
class_no_bmi contains only the standard class variables
Appending class_with_bmi to class_no_bmi without FORCE would produce an error because BMI is not in the BASE dataset
With FORCE, the append proceeds and the BMI variable is dropped from the appended records
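A sketch of this example, assuming both datasets are built from SASHELP.CLASS. The BMI formula (703 * weight in pounds / height in inches squared) is an illustrative choice and is not taken from the lesson.

```sas
/* BASE dataset: the standard class variables only */
data class_no_bmi;
    set sashelp.class;
run;

/* DATA dataset: the same variables plus BMI (illustrative formula) */
data class_with_bmi;
    set sashelp.class;
    bmi = 703 * weight / height**2;
run;

/* FORCE lets the append proceed; BMI is not added to the BASE dataset */
proc append base=class_no_bmi data=class_with_bmi force;
run;

/* Confirm that BMI is absent from the BASE dataset's variable list */
proc contents data=class_no_bmi;
run;
```

Removing the FORCE keyword and rerunning from the top shows the error case described above.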
Appending datasets where DATA has a longer variable length than BASE
Creating two datasets where the same variable has different lengths
name_short has NAME with default length of 8 characters
name_long has NAME with length of 20 characters
Without FORCE, PROC APPEND would stop with an error because NAME is longer in the DATA dataset than in the BASE dataset
With FORCE, the append proceeds but NAME values from name_long are truncated to 8 characters
Always check the SAS log for truncation warnings when using FORCE
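A sketch of the length mismatch. It assumes name_short is taken from SASHELP.CLASS, where NAME has a length of 8. The two rows in name_long are made-up values chosen to be longer than 8 characters.

```sas
/* BASE dataset: NAME has length 8 (as in SASHELP.CLASS) */
data name_short;
    set sashelp.class(keep=name age);
run;

/* DATA dataset: NAME declared with length 20; the rows are hypothetical */
data name_long;
    length name $20;
    input name $ age;
    datalines;
Christopher 14
Alexandria 13
;
run;

/* FORCE lets the append proceed; NAME values are cut to 8 characters */
proc append base=name_short data=name_long force;
run;

/* Print only the appended rows (observations 20 and 21) */
proc print data=name_short(firstobs=20);
run;
```

The appended names are stored as Christop and Alexandr, because the BASE dataset keeps its original length of 8 for NAME.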