# How to select/subset required rows in SAS and R tidyverse

This post is part of 'SASnR | Subset observations (rows)' series

When working with data, we frequently require only a subset of observations/rows for a specific analysis.

So, we need programming features to select the specific subset of rows meeting a filter condition.

There are multiple ways of achieving this result in both SAS and R. Below is one basic approach in SAS and R.

Let us assume that we have the following input data with 19 observations and five variables capturing some basic information about the students of a class.

 Name Sex Age Height Weight Alfred M 14 69 112.5 Alice F 13 56.5 84 Barbara F 13 65.3 98 Carol F 14 62.8 102.5 Henry M 14 63.5 102.5 James M 12 57.3 83 Jane F 12 59.8 84.5 Janet F 15 62.5 112.5 Jeffrey M 13 62.5 84 John M 12 59 99.5 Joyce F 11 51.3 50.5 Judy F 14 64.3 90 Louise F 12 56.3 77 Mary F 15 66.5 112 Philip M 16 72 150 Robert M 12 64.8 128 Ronald M 15 67 133 Thomas M 11 57.5 85 William M 15 66.5 112

Let us assume that we only data for male students for a particular analysis.

 Name Sex Age Height Weight Alfred M 14 69 112.5 Alice F 13 56.5 84 Barbara F 13 65.3 98 Carol F 14 62.8 102.5 Henry M 14 63.5 102.5 James M 12 57.3 83 Jane F 12 59.8 84.5 Janet F 15 62.5 112.5 Jeffrey M 13 62.5 84 John M 12 59 99.5 Joyce F 11 51.3 50.5 Judy F 14 64.3 90 Louise F 12 56.3 77 Mary F 15 66.5 112 Philip M 16 72 150 Robert M 12 64.8 128 Ronald M 15 67 133 Thomas M 11 57.5 85 William M 15 66.5 112

We can create a subset of male students (Sex="M") using the below code.

## SAS code

data class;
set sashelp.class;
where sex="M";
run;
• SAS data step is used to create a new dataset
• data statement is used to specify the name of the output dataset to be created
• set statement is used to specify the name of the input dataset
• where statement is used to specify the filter condition (sex="M", in this case). Only the records which meet the filter condition will be read from input dataset when creating this new dataset

## R tidyverse code

`library(tidyverse)library(haven)setwd(dir = "D:/SAS/Home/dev/clinical_sas_samples/mycsg/SAS/SASnR/")class<-haven::read_sas("class.sas7bdat")males<-filter(class,Sex=="M")`
• tidyverse library is loaded into the current R session using library function
• as haven is not a core tidyverse package, it has be explicitly loaded
• setwd function is used to set the working directory, working directory is set to the same directory containing the SAS dataset
• read_sas function of haven is used to read the sas dataset into R session as a tibble
• filter verb(function) is used to subset only the required rows (male student rows) and the associated result is stored in a new dataset/tibble name males
• the first argument of filter function is the name of the input dataset
• the second argument is the filter condition
• note that a double equals sign is used as part of the filter condition (Sex=="M"). In R, to check whether two objects are equal (equality) we need to use a double equals sign ==.