When working with data, we often need only a selected set of variables. For this, we need a way to drop the variables (columns) that are not needed for the analysis.
Both SAS and the R tidyverse offer several ways to drop variables/columns that are not required. Below is one basic approach in both SAS and R.
Let us assume that we have a dataset named "class" with 5 variables named Name, Sex, Age, Height, Weight.
| Name | Sex | Age | Height | Weight |
|---|---|---|---|---|
| Alfred | M | 14 | 69 | 112.5 |
| Alice | F | 13 | 56.5 | 84 |
| Barbara | F | 13 | 65.3 | 98 |
| Carol | F | 14 | 62.8 | 102.5 |
| Henry | M | 14 | 63.5 | 102.5 |
| James | M | 12 | 57.3 | 83 |
Let us assume that we do not need Height and Weight for analysis.
We can create a new dataset (tibble/data frame) without these two variables using the code below.
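# tidyverse provides dplyr::select(); haven reads SAS datasets
library(tidyverse)
library(haven)
# Folder that holds class.sas7bdat
setwd(dir = "D:/SAS/Home/dev/clinical_sas_samples/mycsg/SAS/SASnR/")
# Read the SAS dataset into a tibble
class <- haven::read_sas("class.sas7bdat")
# Drop Height and Weight; every other column is kept
class_selvars <- select(class, -Height, -Weight)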
The example class dataset (SAS dataset) can be downloaded from here.
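For readers who do not have the SAS file at hand, the same step can be tried on an inline copy of the data. The sketch below is only an illustration: it assumes the six rows listed earlier make up the whole dataset, builds them with `tribble()` instead of reading `class.sas7bdat`, and uses illustrative names (`drop_vars`, `class_selvars2`, `class_selvars3`) that are not part of the original code. It also shows two equivalent ways of writing the same `select()` call.

```r
library(dplyr)   # select(), all_of()
library(tibble)  # tribble()

# Inline stand-in for class.sas7bdat (assumption: only the six rows shown above)
class <- tribble(
  ~Name,     ~Sex, ~Age, ~Height, ~Weight,
  "Alfred",  "M",  14,   69,      112.5,
  "Alice",   "F",  13,   56.5,    84,
  "Barbara", "F",  13,   65.3,    98,
  "Carol",   "F",  14,   62.8,    102.5,
  "Henry",   "M",  14,   63.5,    102.5,
  "James",   "M",  12,   57.3,    83
)

# Same approach as above: a minus sign in front of each column to drop
class_selvars <- select(class, -Height, -Weight)

# Equivalent: drop the columns named in a character vector
drop_vars <- c("Height", "Weight")  # illustrative name
class_selvars2 <- select(class, -all_of(drop_vars))

# Equivalent: list the columns to keep instead of the ones to drop
class_selvars3 <- select(class, Name, Sex, Age)

# Inspect the remaining column names
names(class_selvars)
```

The character-vector form can be handy when the list of variables to drop is built elsewhere in a program, while listing the columns to keep may read more clearly when only a few variables remain.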