
Subset Observations


SAS code


/* Build the sample dataset from pipe-delimited inline data */
data CLASS;
infile datalines dlm='|' dsd missover;
input Name : $8. Sex : $1. Age : best32. Height : best32. Weight : best32.;
datalines4;
Alfred|M|14|69|112.5
Alice|F|13|56.5|84
Carol|F|14|62.8|102.5
Henry|M|14|63.5|102.5
James|M|12|57.3|83
;;;;
run;
 
data males;
    set class;
    where sex="M";
run;
 
data preteen;
    set class;
    where age in (11,12);
run;
 

SAS code description

The SAS code first builds a small "class" dataset from inline data, then creates two new datasets by filtering it on specific conditions.

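In the first snippet, a dataset named "males" is created by keeping only the observations from the "class" dataset where the variable "sex" equals "M". SAS variable names are not case-sensitive, so "sex" refers to the variable read in as "Sex". With the sample data, this keeps Alfred, Henry, and James.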

In the second snippet, a dataset named "preteen" is created by keeping only the observations from the "class" dataset where "age" is either 11 or 12. With the sample data, only James (age 12) qualifies, because nobody in the data is 11.

Both data steps use the set statement to read the original dataset and the where statement to apply the filter. The where condition is evaluated as observations are read, so only the matching observations are processed by the data step.

The original "class" dataset is left unchanged, and the new datasets can be used for targeted analysis or further processing.

R code

library(tidyverse)

class <- tribble(
  ~Name,~Sex,~Age,~Height,~Weight,
  "Alfred","M",14,69,112.5,
  "Alice","F",13,56.5,84,
  "Carol","F",14,62.8,102.5,
  "Henry","M",14,63.5,102.5,
  "James","M",12,57.3,83,
)


males <- filter(class, Sex == "M")

preteen <- filter(class, Age %in% c(11, 12))

R code description

The R Tidyverse code first builds a small "class" data frame with tribble, then creates two new data frames by filtering it on specific conditions.

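In the first snippet, a data frame named "males" is created by filtering the "class" data frame to include only rows where the "Sex" variable equals "M". This uses the filter function from the dplyr package, which is loaded as part of the tidyverse. Unlike SAS, R is case-sensitive, so the variable must be written as "Sex". With the sample data, this keeps Alfred, Henry, and James.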

In the second snippet, a data frame named "preteen" is created by filtering the "class" data frame to include only rows where "Age" is either 11 or 12. This uses the filter function together with the %in% operator, which tests whether each value appears in the given vector. With the sample data, only James (age 12) qualifies.
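
As a quick check of the %in% condition, the sketch below rebuilds the same "class" data frame and compares the %in% filter with an explicit "or" condition. The name preteen_or is hypothetical and is introduced only for this comparison.

```r
library(tidyverse)

# Same sample data as in the R code above
class <- tribble(
  ~Name, ~Sex, ~Age, ~Height, ~Weight,
  "Alfred", "M", 14, 69, 112.5,
  "Alice", "F", 13, 56.5, 84,
  "Carol", "F", 14, 62.8, 102.5,
  "Henry", "M", 14, 63.5, 102.5,
  "James", "M", 12, 57.3, 83
)

# Membership test: keep rows whose Age appears in the vector
preteen <- filter(class, Age %in% c(11, 12))

# Equivalent condition written with == and | (hypothetical name)
preteen_or <- filter(class, Age == 11 | Age == 12)

# Both approaches select the same single row (James, age 12)
print(preteen$Name)
print(preteen_or$Name)
```

The %in% form tends to be easier to read and extend once the list of allowed values grows beyond two or three.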

Both snippets use filter to return a new data frame containing only the rows that satisfy the condition. The original "class" data frame is left unchanged, and the filtered results can be used for targeted analysis or further processing.
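
The two conditions can also be combined in a single call. The following is a minimal sketch, not part of the original example: filter treats conditions separated by commas as a logical "and", and the name preteen_males is hypothetical.

```r
library(tidyverse)

# Same sample data as in the R code above
class <- tribble(
  ~Name, ~Sex, ~Age, ~Height, ~Weight,
  "Alfred", "M", 14, 69, 112.5,
  "Alice", "F", 13, 56.5, 84,
  "Carol", "F", 14, 62.8, 102.5,
  "Henry", "M", 14, 63.5, 102.5,
  "James", "M", 12, 57.3, 83
)

# Comma-separated conditions must all be TRUE for a row to be kept
preteen_males <- class %>%
  filter(Sex == "M", Age %in% c(11, 12))

print(preteen_males$Name)
```

With the sample data, the only row that is both male and aged 11 or 12 is James.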