*Copyright @ www.mycsg.in;
What does PROC TRANSPOSE do
`proc transpose` reshapes data by converting values from rows into columns or from columns into rows
It is commonly used when the structure of a dataset must be changed before reporting, summarisation, or merging
A frequent use case is converting repeated values within a subject into separate variables on one row
Another common use case is turning multiple variables into one stacked variable for later analysis
Create a simple example dataset
We create a long-style dataset named `scores_long` where each student has multiple rows
The variable `test` identifies the type of score, and `score` contains the value to be transposed
This layout is a good starting point for understanding how rows can be converted into columns
data scores_long; input student $ test $ score; datalines; Alice Math 88 Alice Science 91 Alice English 84 John Math 79 John Science 85 John English 82 Jane Math 93 Jane Science 89 Jane English 95 ; run;
Copy Code
View Log
SAS Log
data scores_long; input student $ test $ score; datalines; NOTE: The data set WORK.SCORES_LONG has 9 observations and 3 variables. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.01 seconds ; run;
`scores_long` contains one row per student per test
Each student therefore appears multiple times in the dataset
Inspect the dataset and confirm that the same student name repeats across different test values
View Data
Dataset View
Basic transpose without BY grouping
In the simplest case, `proc transpose` takes values from one variable and writes them as separate observations in the transposed output
This example transposes the `age` and `height` variables from a single-observation dataset so the learner can see the default output structure
The output dataset usually contains `_NAME_` to store the original variable name and `COL1` to store the transposed value
data one_student; set sashelp.class(obs=1 keep=name age height); run; proc transpose data=one_student out=one_student_t; var age height; run;
Copy Code
View Log
SAS Log
data one_student; set sashelp.class(obs=1 keep=name age height); run; NOTE: There were 1 observations read from the data set SASHELP.CLASS. NOTE: The data set WORK.ONE_STUDENT has 1 observations and 3 variables. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds proc transpose data=one_student out=one_student_t; var age height; run; NOTE: There were 1 observations read from the data set WORK.ONE_STUDENT. NOTE: The data set WORK.ONE_STUDENT_T has 2 observations and 2 variables. NOTE: PROCEDURE TRANSPOSE used (Total process time): real time 0.00 seconds cpu time 0.00 seconds
The variables listed on the `var` statement become separate observations in `one_student_t`
`_NAME_` identifies whether the row came from `AGE` or `HEIGHT`
`COL1` stores the corresponding value
View Data
Dataset View
Transpose rows to columns within each student
Most practical transpose tasks use a `BY` variable so each group is transposed separately
We sort the data by `student` before using `by student;` because BY group processing requires matching sort order
`id test;` tells SAS to use the values of `test` as output column names
`var score;` tells SAS which values should populate those new columns
proc sort data=scores_long out=scores_long_sort; by student; run; proc transpose data=scores_long_sort out=scores_wide; by student; id test; var score; run;
Copy Code
View Log
SAS Log
proc sort data=scores_long out=scores_long_sort; by student; run; NOTE: There were 9 observations read from the data set WORK.SCORES_LONG. NOTE: The data set WORK.SCORES_LONG_SORT has 9 observations and 3 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.01 seconds cpu time 0.00 seconds proc transpose data=scores_long_sort out=scores_wide; by student; id test; var score; run; NOTE: There were 9 observations read from the data set WORK.SCORES_LONG_SORT. NOTE: The data set WORK.SCORES_WIDE has 3 observations and 5 variables. NOTE: PROCEDURE TRANSPOSE used (Total process time): real time 0.01 seconds cpu time 0.01 seconds
`scores_wide` now contains one row per student
The values of `test` such as `Math`, `Science`, and `English` become output columns
The corresponding values from `score` fill those columns for each student
Verify that each student now appears only once in the transposed dataset
View Data
Dataset View
Transpose columns to rows for all students
`proc transpose` can also be used to stack multiple variables into one value column
In this example, `age`, `height`, and `weight` are converted into repeated rows per student
This is useful when wide data must be converted into a long structure
proc transpose data=sashelp.class out=class_long name=source_variable; by name; var age height weight; run;
Copy Code
View Log
SAS Log
proc transpose data=sashelp.class out=class_long name=source_variable; by name; var age height weight; run; NOTE: There were 19 observations read from the data set SASHELP.CLASS. NOTE: The data set WORK.CLASS_LONG has 57 observations and 3 variables. NOTE: PROCEDURE TRANSPOSE used (Total process time): real time 0.00 seconds cpu time 0.01 seconds
The output dataset contains one row per student per original analysis variable
`source_variable` identifies whether the value came from `age`, `height`, or `weight`
`COL1` stores the corresponding value
View Data
Dataset View
Understand the NAME option
The `name=` option lets us rename the default `_NAME_` variable created by `proc transpose`
This is useful when the default variable name is not descriptive enough for the output dataset
proc transpose data=one_student out=one_student_t2 name=source_variable; var age height; run;
Copy Code
View Log
SAS Log
proc transpose data=one_student out=one_student_t2 name=source_variable; var age height; run; NOTE: There were 1 observations read from the data set WORK.ONE_STUDENT. NOTE: The data set WORK.ONE_STUDENT_T2 has 2 observations and 2 variables. NOTE: PROCEDURE TRANSPOSE used (Total process time): real time 0.01 seconds cpu time 0.00 seconds
The output now uses `source_variable` instead of `_NAME_`
This makes the meaning of that column easier to understand when reviewing the dataset
View Data
Dataset View
Key points to remember
`proc transpose` reshapes datasets by converting rows to columns or columns to rows
`var` identifies the values to transpose
`by` groups observations and produces one transposed result per group
`id` uses the values of a variable as output column names
When a BY statement is used, the input dataset should be sorted by the same BY variables first