Announcement Icon Online training class for Clinical SAS programming starting soon. Click here for details.

Transpose data - long to wide


SAS code


data LONG;
infile datalines dlm='|' dsd missover;
input usubjid : $8. lbtestcd : $8. lbstresn : BEST6.;
label ;
format ;
datalines4;
1001|HGB|13
1001|ALT|30
1001|AST|23
1002|HGB|12
1002|ALT|28
1002|AST|15
;;;;
run;

proc sort data=long ;
   by usubjid;
run;
 
proc transpose data=long   out=wide  ;
   by usubjid;
   var lbstresn;
   id lbtestcd;
run;

 

SAS code description

The PROC TRANSPOSE procedure is used to transpose or reshape a dataset from long format to wide format. The dataset "long" is transposed into a new dataset named "wide" using the OUT= option. The transposition is performed by grouping the data by the variable "usubjid" with the BY statement.

BY usubjid;: This statement specifies the variable "usubjid" as the grouping variable. The transposition will be done separately for each unique value of "usubjid".

VAR lbstresn;: This statement specifies the variable "lbstresn" to be transposed. The values of this variable will become the values of the new variables in the transposed dataset.

ID lbtestcd;: This statement specifies the variable "lbtestcd" as the identifier variable. The unique values of "lbtestcd" will become the variable names in the transposed dataset.

After executing the PROC TRANSPOSE procedure, the dataset "long" is transformed into the "wide" dataset, where each unique value of "usubjid" becomes a separate observation, and the values of "lbstresn" are transposed into new variables named according to the unique values of "lbtestcd".

R code

long<-tribble(
~usubjid,~lbtestcd,~lbstresn,
"1001","HGB",13,
"1001","ALT",30,
"1001","AST",23,
"1002","HGB",12,
"1002","ALT",28,
"1002","AST",15,
)


wide<-pivot_wider(long,
                  id_cols=usubjid,
                  values_from=lbstresn,
                  names_from = lbtestcd)

R code description

The pivot_wider() function is a part of the Tidyverse package in R, specifically the tidyr package. It is used to reshape a dataset from long format to wide format. In this code snippet, the dataset "long" is transformed into a new dataset named "wide" using the pivot_wider() function.

id_cols = usubjid: This argument specifies the variable "usubjid" as the identifier column. Each unique value of "usubjid" will become a separate observation in the wide dataset.

values_from = lbstresn: This argument specifies the variable "lbstresn" from which the values will be extracted and spread across the wide dataset.

names_from = lbtestcd: This argument specifies the variable "lbtestcd" whose unique values will become the column names in the wide dataset.

By calling pivot_wider() with the specified arguments, the "long" dataset is reshaped into the "wide" dataset. Each unique value of "usubjid" becomes a separate observation, and the values from "lbstresn" are spread across the wide dataset with new columns created based on the unique values of "lbtestcd".