pharmaverse.ndexr.io

Open-source R for the clinical pipeline.

The pharmaverse is a community-curated, opinionated R stack for getting clinical-trial data from raw collection through ADaM analysis datasets and submission-ready tables, listings, and graphs — without leaving the open-source toolchain.

What it is

Founded by Roche, GSK, Atorus, and Janssen; now backed by sponsors across the pharma industry. The premise: clinical reporting has been SAS-shaped for decades, yet the regulatory framework around it (CDISC, FDA submission packages) doesn't actually require SAS. It requires standards-compliant outputs and an auditable workflow. The pharmaverse delivers both, in R, with packages that:

  • Implement the CDISC standards (SDTM, ADaM and the ADaM IG, Define-XML).
  • Export submission-grade XPT v5 files the way the FDA expects them.
  • Produce TLGs (tables, listings, graphs) with pixel-stable layouts that survive manual review.
  • Carry auditable logs and validated package risk scores so QA / Statistical Programming teams can sign off on R the same way they signed off on SAS.

Project home: pharmaverse.org. Code: github.com/pharmaverse.

The clinical pipeline, in R

Each step is one or more pharmaverse packages. Inputs and outputs are CDISC-compliant at every stage.

Step 1
Collected → SDTM

Raw CRF / EDC data is mapped to Study Data Tabulation Model datasets (DM, AE, EX, LB, VS, …). pharmaverseraw / pharmaversesdtm provide reference datasets for examples and tests.
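
As a rough illustration of what this mapping step involves, the base-R sketch below reshapes a made-up raw demographics extract into a minimal DM-shaped data frame. The raw column names (subject_id, site, sex_raw, birth_date) and the study identifier are invented for the example. A real study would follow its mapping specification and would normally use dedicated tooling rather than hand-written code like this.

```r
# Hypothetical raw demographics extract (column names invented for this sketch).
raw_dm <- data.frame(
  subject_id = c("001", "002"),
  site       = c("101", "102"),
  sex_raw    = c("Female", "Male"),
  birth_date = c("1980-03-14", "1975-11-02"),
  stringsAsFactors = FALSE
)

study_id <- "STUDY01"  # placeholder study identifier

# Lookup from the raw sex values to the SDTM-style codes used below.
sex_map <- c(Female = "F", Male = "M")

# Map to a minimal DM-shaped data frame using standard SDTM variable names.
dm <- data.frame(
  STUDYID = study_id,
  DOMAIN  = "DM",
  USUBJID = paste(study_id, raw_dm$site, raw_dm$subject_id, sep = "-"),
  SUBJID  = raw_dm$subject_id,
  SITEID  = raw_dm$site,
  BRTHDTC = raw_dm$birth_date,              # already ISO 8601 in this toy input
  SEX     = unname(sex_map[raw_dm$sex_raw]),
  stringsAsFactors = FALSE
)

print(dm)
```

The point of the sketch is only the shape of the transformation: one row per subject, standard variable names, and a unique subject identifier built from study, site, and subject.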

Step 2
SDTM → ADaM

admiral builds Analysis Data Model datasets (ADSL, ADAE, ADTTE, ADLB) from SDTM with derivation patterns shared across studies. Therapeutic-area extensions: admiralonco / admiralophtha / admiralpeds / admiralvaccine / admiralneuro / admiralmetabolic.
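
A minimal sketch of the derivation style, assuming admiral and dplyr are installed. The DM and EX rows are toy data invented for the example, and the derivation choices (first and last exposure record as treatment start and end) are one common pattern, not a prescription for any particular study.

```r
library(admiral)
library(dplyr, warn.conflicts = FALSE)

# Toy SDTM-like inputs; a real study would read DM and EX from its SDTM library.
dm <- tribble(
  ~STUDYID,  ~USUBJID,     ~ARM,
  "STUDY01", "STUDY01-01", "Active",
  "STUDY01", "STUDY01-02", "Placebo"
)

ex <- tribble(
  ~STUDYID,  ~USUBJID,     ~EXSEQ, ~EXSTDTC,     ~EXENDTC,
  "STUDY01", "STUDY01-01", 1,      "2021-01-05", "2021-01-18",
  "STUDY01", "STUDY01-01", 2,      "2021-01-19", "2021-02-01",
  "STUDY01", "STUDY01-02", 1,      "2021-01-07", "2021-01-20"
)

# Convert the ISO 8601 character dates to Date variables (EXSTDT, EXENDT).
ex_dt <- ex %>%
  derive_vars_dt(new_vars_prefix = "EXST", dtc = EXSTDTC) %>%
  derive_vars_dt(new_vars_prefix = "EXEN", dtc = EXENDTC)

adsl <- dm %>%
  mutate(TRT01P = ARM) %>%
  # First exposure record per subject gives the treatment start date.
  derive_vars_merged(
    dataset_add = ex_dt,
    by_vars = exprs(STUDYID, USUBJID),
    new_vars = exprs(TRTSDT = EXSTDT),
    order = exprs(EXSTDT, EXSEQ),
    mode = "first"
  ) %>%
  # Last exposure record per subject gives the treatment end date.
  derive_vars_merged(
    dataset_add = ex_dt,
    by_vars = exprs(STUDYID, USUBJID),
    new_vars = exprs(TRTEDT = EXENDT),
    order = exprs(EXENDT, EXSEQ),
    mode = "last"
  ) %>%
  # Treatment duration in days: TRTEDT - TRTSDT + 1.
  derive_var_trtdurd()

print(adsl)
```

Each derivation is an ordinary function call on a data frame, which is what makes the same pattern reusable from one study to the next.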

Step 3
ADaM → TLG

rtables + rlistings + tern + chevron + tlf produce statistician-grade tables, listings, and graphs. The same layout primitives are reused in every study, parametrised by the analysis spec.
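
To give a feel for the layout primitives, here is a small sketch using rtables alone, assuming the package is installed. The ADSL-like data frame is invented, and the summary function is a hand-rolled stand-in for the richer statistics that tern provides.

```r
library(rtables)

# Toy ADSL-like data; values are invented for the example.
adsl <- data.frame(
  USUBJID = sprintf("S-%02d", 1:6),
  ARM = factor(rep(c("Placebo", "Active"), each = 3),
               levels = c("Placebo", "Active")),
  AGE = c(34, 51, 47, 29, 62, 55)
)

# The layout is declared first, independently of any data.
lyt <- basic_table(title = "Age by treatment arm", show_colcounts = TRUE) |>
  split_cols_by("ARM") |>
  analyze("AGE", afun = function(x) {
    in_rows(
      "n"         = rcell(length(x), format = "xx"),
      "Mean (SD)" = rcell(c(mean(x), sd(x)), format = "xx.x (xx.x)")
    )
  })

# The same layout can then be applied to any data frame with ARM and AGE.
tbl <- build_table(lyt, adsl)
print(tbl)
```

Separating the layout from the data is the design choice that lets one table definition serve several studies.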

Step 4
ADaM → submission

xportr writes XPT v5 files with the variable labels, lengths, and metadata regulators require. metacore / metatools manage the define-XML metadata that travels with them.
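
The export step might look roughly like the following, assuming xportr and haven are installed. The variable-level metadata is a hand-built data frame standing in for a real specification, its column names (dataset, variable, label) follow what I understand to be xportr's default option settings, and the output goes to a temporary directory rather than a submission folder.

```r
library(xportr)

# Toy ADSL; a real dataset would come out of the ADaM step.
adsl <- data.frame(
  STUDYID = "STUDY01",
  USUBJID = c("STUDY01-01", "STUDY01-02"),
  AGE     = c(34, 51),
  stringsAsFactors = FALSE
)

# Hand-built variable metadata standing in for a study specification.
var_spec <- data.frame(
  dataset  = "ADSL",
  variable = c("STUDYID", "USUBJID", "AGE"),
  label    = c("Study Identifier", "Unique Subject Identifier", "Age"),
  stringsAsFactors = FALSE
)

xpt_path <- file.path(tempdir(), "adsl.xpt")

# Apply the variable labels from the metadata, then write the transport file.
adsl |>
  xportr_label(metadata = var_spec, domain = "ADSL") |>
  xportr_write(path = xpt_path)

# Read the file back to confirm the labels travelled with the data.
back <- haven::read_xpt(xpt_path)
attr(back$AGE, "label")
```

A full pipeline would usually also apply types, lengths, ordering, and formats from the same metadata before writing.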

Package families we mirror
admiral

Core ADaM dataset construction.

admiraldev

Internal dev utilities for the admiral family.

admiralonco

Oncology-specific ADaM derivations (RECIST, response, TTE).

admiralophtha

Ophthalmology-specific ADaM patterns.

admiralpeds

Pediatrics: growth charts, age-windowed visits.

admiralvaccine

Vaccine trials: reactogenicity, immunogenicity datasets.

admiralneuro

Neurology endpoints and scales.

admiralmetabolic

Metabolic / endocrine derivations.

pharmaversesdtm

Reference SDTM datasets used as examples.

pharmaverseadam

Reference ADaM datasets — the canonical worked example.

rtables

Layout engine for clinical TLGs — programmable, pixel-stable.

rlistings

Patient-level listings with the same layout DSL as rtables.

tern

Statistical layouts for rtables (descriptive, KM, MMRM, GEE).

chevron

Boilerplate templates for common clinical TLGs.

tlf

Common TLF formatters — fonts, footnotes, headers.

formatters

Number / date formatting primitives.

xportr

Write XPT v5 files for FDA / PMDA submission.

metacore

In-memory metadata model for studies.

metatools

Apply the metacore model to ADaM construction.

logr / logrx

Auditable run logs — who, when, with what inputs.

riskmetric

Quantify package risk for validation packets.

diffdf

Compare two data frames cell-by-cell — used for regression / SAS parity tests.

Tplyr

Tabular data summary in a tidyverse idiom.

staggered

Staggered DiD estimators — methodology rather than CDISC.

pharmaverse

Umbrella + entry-point package; pointers to the docs and Slack.
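
As one concrete use of the list above, a SAS parity check with diffdf could be sketched like this, assuming the package is installed. Both data frames are invented: one plays the SAS-produced dataset and the other the R-produced one, with a single deliberate discrepancy.

```r
library(diffdf)

# Stand-in for the dataset produced by the legacy SAS program.
sas_adsl <- data.frame(
  USUBJID = c("01", "02", "03"),
  AGE     = c(34, 51, 47),
  stringsAsFactors = FALSE
)

# Stand-in for the R-produced dataset, with one deliberate mismatch.
r_adsl <- sas_adsl
r_adsl$AGE[2] <- 52

# Compare cell by cell, matching rows on the subject identifier.
cmp <- diffdf(
  base = sas_adsl,
  compare = r_adsl,
  keys = "USUBJID",
  suppress_warnings = TRUE
)

print(cmp)

# TRUE when any difference was found; convenient as a test assertion.
diffdf_has_issues(cmp)
```

In a regression suite the last call would typically be wrapped in an assertion so that any mismatch fails the build.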

Why it matters for biostatistics

Pharmaceutical statistical programming has historically been a SAS-only world because the FDA submission packet, the QA processes, and the regulator-side reviewers were all built around it. The pharmaverse is the credible, opinionated R alternative — same outputs, same audit trail, open source.

  • Reproducibility: every step is plain R code, version-pinned via renv. No black-box procs.
  • Cross-study reuse: ADSL / ADTTE construction is a function call, not a per-study macro.
  • Validation packets: riskmetric scores + reverse dependency tests give Statistical Programming + QA the artefacts they need to sign off.
  • Audit trail: logrx wraps a programme and records inputs, outputs, environment, and Git commit — what regulators look for.
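
The audit-trail bullet can be sketched as follows, assuming logrx is installed. The script is a throwaway file written to a temporary directory, and the log name is arbitrary. What exactly ends up in the log depends on the logrx version and its options.

```r
library(logrx)

# Write a tiny stand-in program to a temporary location.
log_dir <- tempdir()
script  <- file.path(log_dir, "adsl_demo.R")
writeLines(
  c(
    'adsl <- data.frame(USUBJID = c("01", "02"))',
    'print(nrow(adsl))'
  ),
  script
)

# Run the program under logrx; a log file is written next to it.
# quit_on_error = FALSE keeps a non-interactive session alive if the script fails.
axecute(
  script,
  log_name = "adsl_demo.log",
  log_path = log_dir,
  quit_on_error = FALSE
)

# Show the first few lines of the resulting log.
log_file <- file.path(log_dir, "adsl_demo.log")
head(readLines(log_file), 10)
```

In production the same call would point at the real analysis program, with the log stored alongside the outputs it documents.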

Where it lives in our mirror

Every pharmaverse package above is on CRAN, and every CRAN package is in our bucket:

    s3://ndexr/cran/src/contrib/admiral_<ver>.tar.gz
    s3://ndexr/cran/src/contrib/rtables_<ver>.tar.gz
    s3://ndexr/cran/src/contrib/xportr_<ver>.tar.gz
    ... and so on across the family.

Live coverage at repo.ndexr.io (filter admiral, pharmaverse, rtables, etc. in the package column).
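
The bucket layout above follows the usual CRAN src/contrib naming, so the expected location of a tarball can be built from a package name and version. The helper below is a hypothetical convenience, not part of any package, and the version numbers passed to it are placeholders rather than statements about what the mirror currently holds.

```r
# Build the expected source-tarball location for packages in the mirror.
# The prefix is taken from the listing above; mirror_tarball() is a made-up helper.
mirror_tarball <- function(pkg, ver, prefix = "s3://ndexr/cran/src/contrib") {
  stopifnot(
    is.character(pkg),
    is.character(ver),
    length(pkg) == length(ver)
  )
  sprintf("%s/%s_%s.tar.gz", prefix, pkg, ver)
}

# Placeholder versions, used only to show the resulting path shape.
mirror_tarball(c("admiral", "rtables"), c("1.0.0", "0.6.0"))
```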