Why do we need SDTM?


This post is part of 'SDTM | General' series

Let us assume that, as part of a clinical trial, we want to collect basic information such as the sex, age, and race of the trial participants (subjects).

Subject number  
Sex
  • (1) Male
  • (2) Female
Age  
Race
  • (1) Asian
  • (2) White
  • (3) African American

This information, along with other trial data, needs to be submitted in a tabular format to the regulatory authorities. 

Let us examine some theoretical possibilities for how the collected data can be structured for submission. All of the representations below hold the data for an Asian male, aged 52, who was assigned the identification number 1001.

DEMOG
Subject  Gender  Collected_age  Race
1001     1       52             1

DM
Subject  Sex  Collected_age  Race
1001     M    52             Asian

SUBJINFO
SUBNUM  Sex  age_years  Race
1001    1    52         Asian

What is the problem here?

  • There are multiple ways in which the same collected data can be organized or presented
  • When this data is provided to someone for review, we also need to provide metadata (data about this data)
  • For the same data, we end up having different metadata based on the organizing structure we choose
  • Data reviewers will need to spend time understanding the metadata before they can review the actual data
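
To make the metadata problem concrete, here is a minimal Python sketch. The three dictionaries reproduce the DEMOG, DM, and SUBJINFO rows shown earlier. The decoder functions, the common output field names (subject, sex, age, race), and the 'F' code for female are made up for illustration and are not part of any standard. The point is only that a reviewer needs separate decoding logic, driven by separate metadata, for each layout.

```python
# Code lists taken from the collection form in this post.
SEX_CODES = {1: "Male", 2: "Female"}
RACE_CODES = {1: "Asian", 2: "White", 3: "African American"}

# The same subject in the three layouts discussed in this post.
demog = {"Subject": 1001, "Gender": 1, "Collected_age": 52, "Race": 1}
dm = {"Subject": 1001, "Sex": "M", "Collected_age": 52, "Race": "Asian"}
subjinfo = {"SUBNUM": 1001, "Sex": 1, "age_years": 52, "Race": "Asian"}


def decode_demog(row):
    # DEMOG stores both sex and race as numeric codes.
    return {
        "subject": row["Subject"],
        "sex": SEX_CODES[row["Gender"]],
        "age": row["Collected_age"],
        "race": RACE_CODES[row["Race"]],
    }


def decode_dm(row):
    # This layout stores sex as a letter ('F' is an assumed code) and race as text.
    return {
        "subject": row["Subject"],
        "sex": {"M": "Male", "F": "Female"}[row["Sex"]],
        "age": row["Collected_age"],
        "race": row["Race"],
    }


def decode_subjinfo(row):
    # SUBJINFO mixes a numeric sex code with a text race value.
    return {
        "subject": row["SUBNUM"],
        "sex": SEX_CODES[row["Sex"]],
        "age": row["age_years"],
        "race": row["Race"],
    }


decoded = [decode_demog(demog), decode_dm(dm), decode_subjinfo(subjinfo)]

# All three layouts carry the same facts, but only after three different decoders.
assert decoded[0] == decoded[1] == decoded[2]
print(decoded[0])
```

Every new layout would need yet another decoder, which is the review overhead described in the list above.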

 

  • Some of the underlying data element concepts remain constant irrespective of the person who is collecting data
  • If we can standardize the structure for the commonly used data concepts and enforce it on the data submitters, the end data will always be structured in a predictable way
  • A predictable structure has several advantages: easy and efficient data pooling, little or no time spent understanding the metadata, reusable computer programs for data analysis, etc.
  • The Study Data Tabulation Model (SDTM) is aimed at this standardization

How does SDTM help in standardization of data structure and formatting?

  • Each collected data point has a 'focus' (purpose)
  • SDTM has standard dataset structures based on the focus of the observation
  • SDTM has standard variables to present the information based on the purpose of the collected data
  • SDTM has standard variable values for the commonly used variables

In other words, SDTM gives users:

  • Standard dataset names
  • Standard variables in each dataset
  • Fixed list of allowed values in the variables where applicable
  • Flexibility to create custom datasets based on existing standard dataset structures
  • Flexibility to extend the allowed values in variables where applicable
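
The last three points can be sketched in a few lines of Python. This is only an illustration of the idea of a fixed versus an extensible list of allowed values. The `Codelist` class, the `check_value` function, and the example lists are hypothetical, and the values and extensible flags are not copied from published CDISC controlled terminology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Codelist:
    name: str
    values: frozenset
    extensible: bool  # True if submitters may add their own values


# Illustrative codelists only; check the real terminology before relying on them.
SEX_CL = Codelist("SEX_EXAMPLE", frozenset({"M", "F"}), extensible=False)
UNIT_CL = Codelist("UNIT_EXAMPLE", frozenset({"YEARS", "MONTHS"}), extensible=True)


def check_value(codelist, value):
    """Return 'ok', 'extended', or 'invalid' for a value checked against a codelist."""
    if value in codelist.values:
        return "ok"
    # A value outside the list is acceptable only when the list can be extended.
    return "extended" if codelist.extensible else "invalid"


print(check_value(SEX_CL, "M"))      # ok
print(check_value(SEX_CL, "Male"))   # invalid
print(check_value(UNIT_CL, "WEEKS")) # extended
```

Under these assumptions, 'Male' is rejected because the sex list is fixed, while 'WEEKS' is accepted as an extension of the extensible unit list.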

 

The standard way of organizing data collected in our example above is:

  • The focus of the observation is the 'demographics' of the subject, so present it in the standard DM (Demographics) dataset (dataset level standardization)
  • Sex of the subject when collected must be presented in a variable named 'SEX' (variable level standardization)
  • Age of the subject when collected must be presented in a variable named 'AGE' (variable level standardization)
  • Provide the units in which age is collected in a variable named 'AGEU' (variable level standardization)
  • Race of the subject when collected must be presented in a variable named 'RACE' (variable level standardization)
  • Always use the value 'M' when the subject is male (value level standardization)

DM
SUBJID  SEX  AGE  AGEU   RACE
1001    M    52   YEARS  ASIAN
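
As a rough sketch, the mapping from the collected form values to this standard row could look like the Python below. The function name `to_dm` and the input field names are hypothetical. The uppercase race terms reflect our understanding of CDISC controlled terminology and should be checked against the current terminology release. A real DM dataset also carries further variables (for example STUDYID, DOMAIN, and USUBJID) that are left out here because this post does not discuss them.

```python
# Collected codes from the form, mapped to standardized values.
SEX_TO_SDTM = {1: "M", 2: "F"}
RACE_TO_SDTM = {
    1: "ASIAN",
    2: "WHITE",
    3: "BLACK OR AFRICAN AMERICAN",  # assumed standard term for 'African American'
}


def to_dm(collected):
    """Build a simplified DM row from one collected demographics record."""
    return {
        "SUBJID": str(collected["subject_number"]),  # stored as character
        "SEX": SEX_TO_SDTM[collected["sex"]],
        "AGE": collected["age"],
        "AGEU": "YEARS",  # age is assumed to be collected in years
        "RACE": RACE_TO_SDTM[collected["race"]],
    }


row = to_dm({"subject_number": 1001, "sex": 1, "age": 52, "race": 1})
print(row)
```

Because the output names and values are fixed by the standard, the same downstream program can read demographics from any study that was mapped this way.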

 

So, the main advantage of the SDTM standard is that the data collected in clinical trials will have a predictable and consistent structure.




