mycsg SASnR 2210_Last-dot-concept

SAS code

data CLASS;

infile datalines dlm='|' dsd missover;

input Name : $8. Sex : $1. Age : best32. Height : best32. Weight : best32.;

label ;

format ;

datalines4;

Alfred|M|14|69|112.5
Alice|F|13|56.5|84
Barbara|F|13|65.3|98
Carol|F|14|62.8|102.5
Henry|M|14|63.5|102.5
James|M|12|57.3|83
Jane|F|12|59.8|84.5
Janet|F|15|62.5|112.5
Jeffrey|M|13|62.5|84
John|M|12|59|99.5
Joyce|F|11|51.3|50.5
Judy|F|14|64.3|90
Louise|F|12|56.3|77
Mary|F|15|66.5|112
Philip|M|16|72|150
Robert|M|12|64.8|128
Ronald|M|15|67|133
Thomas|M|11|57.5|85
William|M|15|66.5|112
;;;;

run;


*------------------------------------------------------------------------------;
*subset of highest height;
*------------------------------------------------------------------------------;

proc sort data=class;

    by sex height;

run;


data highestheight;

    set class;

    by sex height;

    if last.sex;

    keep sex height name;

run;


 

This SAS code snippet demonstrates how to create a subset of data containing observations with the highest height for each unique value of the "Sex" variable.
First, the PROC SORT step is used to sort the "class" dataset in ascending order by the variables "Sex" and "Height".
Then, in the DATA step:
The SET statement is used to read the sorted "class" dataset.
The BY statement specifies the variables "Sex" and "Height" for processing the data in a sorted manner.
The IF LAST.SEX condition is used to identify the last observation within each unique value of "Sex". This condition becomes true only for the last observation of each group.
The KEEP statement is used to select the variables "Sex", "Height", and "Name" to include in the resulting dataset.
After executing this code snippet, the "highestheight" dataset will contain the subset of observations with the highest height for each unique value of "Sex".

R code

library(tidyverse)
class<-tribble(
~Name,~Sex,~Age,~Height,~Weight,
"Alfred","M",14,69,112.5,
"Alice","F",13,56.5,84,
"Barbara","F",13,65.3,98,
"Carol","F",14,62.8,102.5,
"Henry","M",14,63.5,102.5,
"James","M",12,57.3,83,
"Jane","F",12,59.8,84.5,
"Janet","F",15,62.5,112.5,
"Jeffrey","M",13,62.5,84,
"John","M",12,59,99.5,
"Joyce","F",11,51.3,50.5,
"Judy","F",14,64.3,90,
"Louise","F",12,56.3,77,
"Mary","F",15,66.5,112,
"Philip","M",16,72,150,
"Robert","M",12,64.8,128,
"Ronald","M",15,67,133,
"Thomas","M",11,57.5,85,
"William","M",15,66.5,112,
)


highestheight<-class %>%   
  arrange(Sex,Height) %>%   
  group_by(Sex) %>%  
  slice(n()) %>%   
  select(Name,Sex,Height)

This R Tidyverse code snippet demonstrates how to create a subset of data containing observations with the highest height for each unique value of the "Sex" variable.

Using the pipe operator %>%, the following operations are performed:

The arrange function is used to sort the "class" data frame in ascending order by the variables "Sex" and "Height".
The group_by function is applied to group the data frame by the variable "Sex".
The slice function is used to extract the last observation within each group, which corresponds to the observation with the highest height for each unique value of "Sex".
The select function is used to choose the variables "Name", "Sex", and "Height" to include in the resulting data frame.
After executing this code snippet, the "highestheight" data frame will contain the subset of observations with the highest height for each unique value of "Sex".

Last dot concept