The previous lesson showed how to write format values directly in a `proc format` VALUE statement
When a format has many values, or when the values are stored in a dataset rather than being known in advance, typing each value individually is impractical
The `CNTLIN=` option in PROC FORMAT reads format definitions from a specially structured dataset rather than from manually coded VALUE statements
This allows formats to be built programmatically from lookup tables, controlled terminology datasets, or any other data source
CNTLIN is widely used in clinical programming where dictionary-driven decode values are stored in reference datasets
The required structure of a CNTLIN dataset
A CNTLIN dataset must contain at minimum three specific variables: `FMTNAME`, `START`, and `LABEL`
`FMTNAME` holds the name of the format being created — all rows with the same FMTNAME define one format
`START` holds the raw value that the format maps from — this is the code or key that will appear in the data
`LABEL` holds the decoded or display value that the format maps to
An optional `TYPE` variable controls whether the format is for character variables (value `C`) or numeric variables (value `N`) — if omitted, SAS infers from whether `START` is character or numeric
Optionally, `END` can be used for range formats where a range of raw values maps to a single label
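The variables listed above can be sketched as a DATA step that reshapes an ordinary reference table into CNTLIN layout. This is a minimal illustration, not a prescribed method: the dataset names `race_codes` and `race_cntlin`, the column names `code` and `decode`, the format name `RACEFMT`, and the codes themselves are all hypothetical.

```sas
/* Hypothetical reference table: one row per code/decode pair */
data work.race_codes;
    length code $8 decode $40;
    infile datalines dsd;
    input code $ decode $;
    datalines;
W,White
B,Black or African American
A,Asian
;
run;

/* Reshape into CNTLIN layout: rename the columns to START and LABEL, */
/* then add constant FMTNAME and TYPE values to every row             */
data work.race_cntlin;
    set work.race_codes(rename=(code=start decode=label));
    retain fmtname 'RACEFMT' type 'C';
run;

/* Register the format from the reshaped dataset */
proc format cntlin=work.race_cntlin;
run;
```

Because `TYPE` is `C`, the resulting format would be referenced as `$RACEFMT.` in a FORMAT statement.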
Create a CNTLIN dataset and apply it as a format
We create a lookup dataset that maps short character sex codes to their full descriptions
We then pass this dataset to PROC FORMAT via CNTLIN to register the format
Finally we apply the format in a PROC PRINT to confirm it works
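The three steps described above might look like the following sketch. The lesson's original program is not reproduced here, so the dataset names `sex_lookup` and `demo`, the codes `M` and `F`, and the subject identifiers are assumptions chosen for illustration; the format name `SEXFMT` and the `cntlout=cntlout` step follow the lesson text.

```sas
/* Lookup dataset in CNTLIN layout; codes and labels are illustrative */
data work.sex_lookup;
    length fmtname $32 type $1 start $8 label $40;
    fmtname = 'SEXFMT';
    type    = 'C';                 /* character format, used as $SEXFMT. */
    start = 'M'; label = 'Male';   output;
    start = 'F'; label = 'Female'; output;
run;

/* Register the format from the dataset; no VALUE statement is coded */
proc format cntlin=work.sex_lookup;
run;

/* Export the registered formats back to a dataset for inspection */
proc format cntlout=cntlout;
run;

/* Small illustrative dataset to format */
data work.demo;
    length usubjid $8 sex $1;
    input usubjid $ sex $;
    datalines;
S001 M
S002 F
S003 M
;
run;

/* Apply the character format; note the $ prefix */
proc print data=work.demo noobs;
    format sex $sexfmt.;
run;
```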
Setting `type='C'` in the CNTLIN dataset is important: it tells SAS this is a character format, so `FMTNAME` can hold the name without a leading `$`
Character formats must be referenced with the `$` prefix in FORMAT statements, so the PROC PRINT uses `$SEXFMT.` not `SEXFMT.`
The `proc format cntlout=cntlout;` step immediately after the CNTLIN load exports every format in the default catalog (`WORK.FORMATS`) back to a SAS dataset named `cntlout`. This is a quick way to verify exactly what was registered and to inspect the FMTNAME, START, LABEL, and TYPE values
No VALUE statement was needed because the format came entirely from the dataset
Build a larger decode format from a reference dataset
In practice, CNTLIN is most valuable when the lookup dataset has many rows or is generated automatically
Here we create a visit code lookup with ten entries to demonstrate how CNTLIN scales without requiring any additional PROC FORMAT coding
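A ten-row lookup of this kind could be sketched as follows. The dataset name `visit_lookup` comes from the lesson; everything else is an assumption made for illustration, including the use of numeric visit numbers, the format name `VISITFMT`, the visit labels, and the `visits` test dataset.

```sas
/* Ten-row visit lookup in CNTLIN layout (illustrative values) */
data work.visit_lookup;
    length fmtname $32 type $1 label $40;
    retain fmtname 'VISITFMT' type 'N';   /* numeric format */
    infile datalines dsd;
    input start label $;
    datalines;
1,Screening
2,Baseline
3,Week 2
4,Week 4
5,Week 8
6,Week 12
7,Week 16
8,Week 24
9,End of Treatment
10,Follow-up
;
run;

/* The PROC FORMAT step is the same size however many rows there are */
proc format cntlin=work.visit_lookup;
run;

/* Illustrative data carrying raw visit numbers */
data work.visits;
    input usubjid $ visitnum;
    datalines;
S001 1
S001 2
S001 9
S002 10
;
run;

proc print data=work.visits noobs;
    format visitnum visitfmt.;
run;
```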
Adding a new visit to the decode table only requires adding one row to `visit_lookup` — no PROC FORMAT code needs editing
Inspect the PROC PRINT output and confirm that all visit codes are decoded to their full descriptions
Use FMTLIB to verify the registered formats
After creating formats with CNTLIN, you can use PROC FORMAT with the FMTLIB option to display all currently registered formats and confirm the values loaded correctly
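A minimal sketch of the FMTLIB check is shown here. To keep it self-contained it first loads a small hypothetical format (`YNFMT`, built from an assumed dataset `yn_cntlin`); in a real program the SELECT statement would name the formats loaded earlier.

```sas
/* Hypothetical two-row character format loaded through CNTLIN */
data work.yn_cntlin;
    length fmtname $32 type $1 start $8 label $40;
    fmtname = 'YNFMT';
    type    = 'C';
    start = 'Y'; label = 'Yes'; output;
    start = 'N'; label = 'No';  output;
run;

proc format cntlin=work.yn_cntlin;
run;

/* List the contents of the named format in the WORK catalog.  */
/* Character formats take the $ prefix in SELECT; omitting the */
/* SELECT statement lists every format in the catalog.         */
proc format library=work fmtlib;
    select $ynfmt;
run;
```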
The output lists every START-LABEL pair for each named format, confirming the CNTLIN load was successful
This is a useful verification step in production programs before the formats are applied to analysis datasets
Build a CNTLIN dataset from an existing dataset using a DATA step
Rather than hard-coding rows in a DATA step, CNTLIN datasets are often generated programmatically from a source reference file
In this example we derive a numeric age-group format from calculated ranges, showing how CNTLIN supports range-based formats using `START`, `END`, and the `HLO` variable
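One way such a range format might be generated is sketched below. The 20-year band width, the format name `AGEGRP`, the labels, and the `patients` test data are assumptions for illustration, and the bands assume ages recorded as whole numbers (a fractional age such as 19.5 would fall between two bands).

```sas
/* Generate age bands in a loop rather than typing each row */
data work.age_cntlin;
    length fmtname $32 type $1 label $40 hlo $1;
    retain fmtname 'AGEGRP' type 'N';     /* numeric format */
    hlo = '';                             /* every row carries HLO */
    do start = 0 to 60 by 20;
        end   = start + 19;
        label = catx(' to ', start, end);
        if start = 60 then do;
            hlo   = 'H';                  /* upper end of last band is open */
            label = '60 and over';
        end;
        output;
    end;
run;

proc format cntlin=work.age_cntlin;
run;

/* Illustrative patient data */
data work.patients;
    input usubjid $ age;
    datalines;
S001 12
S002 35
S003 59
S004 83
;
run;

proc print data=work.patients noobs;
    format age agegrp.;
run;
```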
`type='N'` specifies a numeric format because `start` and `end` are numeric
`hlo=''` is initialised at the top of the DATA step so all rows carry the variable. Then `hlo='H'` on the last row tells SAS the upper end of that range is open (extends to the highest possible numeric value)
In CNTLIN datasets, `HLO` is the correct way to express open-ended ranges: the `HIGH` and `LOW` keywords used in VALUE statements cannot be stored in a numeric `END` or `START` variable
Inspect the PROC PRINT and confirm that each patient's age is grouped correctly
Key points to remember
CNTLIN= in PROC FORMAT reads format definitions from a dataset instead of requiring coded VALUE statements
The CNTLIN dataset must contain at minimum `FMTNAME`, `START`, and `LABEL` variables
Add a `TYPE` variable with value `C` or `N` to explicitly declare character or numeric formats — especially important for numeric range formats
Use `END` along with `START` to define range-based formats where a span of values maps to one label
Use `HLO='O'` in a CNTLIN row to create an OTHER category (the equivalent of `other=` in a VALUE statement)
CNTLIN is most valuable when format values are numerous, change frequently, or are derived from a data source such as a controlled terminology file
Always verify CNTLIN-built formats with PROC FORMAT FMTLIB before applying them to analysis datasets
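The `HLO='O'` point above can be illustrated with a short sketch. The format name `ARMFMT`, the dataset name `arm_cntlin`, and the treatment codes are hypothetical; the final step simply writes the formatted result of an unmapped code to the log so the OTHER row can be checked.

```sas
/* Character format with an OTHER row supplied through HLO */
data work.arm_cntlin;
    length fmtname $32 type $1 start $8 label $40 hlo $1;
    fmtname = 'ARMFMT';
    type    = 'C';
    hlo     = '';
    start = 'A'; label = 'Active';  output;
    start = 'P'; label = 'Placebo'; output;
    /* OTHER row: HLO='O' marks the catch-all, START is left blank */
    start = ''; hlo = 'O'; label = 'Unknown'; output;
run;

proc format cntlin=work.arm_cntlin;
run;

/* Check how a code that is not in the lookup is decoded */
data _null_;
    length decoded $40;
    decoded = put('Z', $armfmt.);
    put decoded=;
run;
```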