SCILHS i2b2 PCORnet Common Data Model Ontology

The Scalable Collaborative Infrastructure for a Learning Health System (SCILHS, pronounced "skills") is a network of 10 health centers across the United States that will cover over 8 million patients. SCILHS is a Clinical Data Research Network (CDRN) in the Patient-Centered Outcomes Research Institute's PCORnet.

SCILHS (with much help from others) has developed an i2b2 information model that represents the PCORnet Common Data Model (CDM). This information model consists of an i2b2 ontology/terminology and a process for mapping local data elements to the ontology without changing the underlying imported data. This approach highlights i2b2's ability to separate the data model from both the information model and the underlying data format.
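The idea of mapping without touching imported data can be sketched in a few lines. This is a minimal illustration, not the SCILHS mapping process: the two tables are heavily simplified stand-ins for i2b2's observation_fact and concept_dimension, and the ontology path, the local codes, and the helper names `map_local_code` and `patients_for` are all hypothetical.

```python
import sqlite3

# Simplified stand-ins for two i2b2 tables (real tables have many more columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE observation_fact (patient_num INTEGER, concept_cd TEXT);
CREATE TABLE concept_dimension (concept_path TEXT, concept_cd TEXT);
""")

# Imported data. The mapping step below never modifies these rows.
conn.executemany(
    "INSERT INTO observation_fact VALUES (?, ?)",
    [(1, "LOCAL:DX_4401"), (2, "ICD9:250.00"), (3, "LOCAL:DX_9999")],
)

# Hypothetical ontology path for one diagnosis term.
ONTOLOGY_PATH = "\\PCORI\\DIAGNOSIS\\09\\250.00\\"


def map_local_code(conn, ontology_path, local_code):
    """Attach a local code beneath an ontology term by adding a path row."""
    child_path = ontology_path + local_code + "\\"
    conn.execute(
        "INSERT INTO concept_dimension VALUES (?, ?)", (child_path, local_code)
    )


def patients_for(conn, ontology_path):
    """Return patients with any fact coded at or below the ontology term."""
    rows = conn.execute(
        """SELECT DISTINCT f.patient_num
           FROM observation_fact f
           JOIN concept_dimension c ON c.concept_cd = f.concept_cd
           WHERE c.concept_path LIKE ?
           ORDER BY f.patient_num""",
        (ontology_path + "%",),
    )
    return [r[0] for r in rows]


# The standard code sits at the ontology term itself.
conn.execute(
    "INSERT INTO concept_dimension VALUES (?, ?)", (ONTOLOGY_PATH, "ICD9:250.00")
)
# The site-specific code is mapped as a child of that term.
map_local_code(conn, ONTOLOGY_PATH, "LOCAL:DX_4401")

print(patients_for(conn, ONTOLOGY_PATH))
```

In this sketch, a query on the ontology term picks up patients coded with either the standard code or the mapped local code, while the unmapped local code is ignored and the fact rows are left exactly as imported.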

By conforming to this ontology, SCILHS sites can query across the network with an interoperable data model using SHRINE. We also provide a script that programmatically generates data marts in the PCORnet data model format, which will enable detailed analysis through the PCORnet Distributed Research Network (DRN). Likewise, if PCORnet queries were rewritten to run against the i2b2 schema, they could run reliably at every site using the PCORnet CDM i2b2 ontology.
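To make the data mart idea concrete, here is a rough sketch of how facts mapped to an ontology could be turned into rows of a CDM-style diagnosis table. It is not the SCILHS script. The table layouts are simplified, the path convention (code type, then code, as the first segments under the diagnosis root) is an assumption made for the example, and `build_diagnosis_mart` is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE observation_fact (patient_num INTEGER, encounter_num INTEGER, concept_cd TEXT);
CREATE TABLE concept_dimension (concept_path TEXT, concept_cd TEXT);
-- Simplified target table; the real CDM DIAGNOSIS table has more fields.
CREATE TABLE diagnosis (patid TEXT, encounterid TEXT, dx TEXT, dx_type TEXT);
""")

conn.executemany(
    "INSERT INTO observation_fact VALUES (?, ?, ?)",
    [(1, 10, "ICD9:250.00"), (2, 20, "LOCAL:DX_4401"), (3, 30, "LOCAL:LAB_77")],
)

# Assumed path layout: \PCORI\DIAGNOSIS\<code type>\<code>\[local code\]
DX_ROOT = "\\PCORI\\DIAGNOSIS\\09\\"
conn.executemany(
    "INSERT INTO concept_dimension VALUES (?, ?)",
    [
        (DX_ROOT + "250.00\\", "ICD9:250.00"),
        (DX_ROOT + "250.00\\LOCAL:DX_4401\\", "LOCAL:DX_4401"),
    ],
)


def build_diagnosis_mart(conn, dx_root):
    """Fill the diagnosis table from facts mapped under dx_root."""
    rows = conn.execute(
        """SELECT f.patient_num, f.encounter_num, c.concept_path
           FROM observation_fact f
           JOIN concept_dimension c ON c.concept_cd = f.concept_cd
           WHERE c.concept_path LIKE ?
           ORDER BY f.patient_num""",
        (dx_root + "%",),
    ).fetchall()
    # The last segment of the root is the code type, e.g. "09".
    dx_type = dx_root.rstrip("\\").split("\\")[-1]
    for patient_num, encounter_num, path in rows:
        # The first segment below the root is the standard code, so a
        # mapped local code resolves to its parent term.
        dx = path[len(dx_root):].split("\\")[0]
        conn.execute(
            "INSERT INTO diagnosis VALUES (?, ?, ?, ?)",
            (str(patient_num), str(encounter_num), dx, dx_type),
        )


build_diagnosis_mart(conn, DX_ROOT)
for row in conn.execute("SELECT * FROM diagnosis ORDER BY patid"):
    print(row)
```

Because the standard code is read from the ontology path rather than from the fact itself, locally coded facts land in the output table under the standard code they were mapped to, and facts outside the diagnosis tree are skipped.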


The ontology is live on the demo site. Change the username to pcori, leave the password as demouser, and click 'Login'. The existing demo data has been mapped to the PCORI ontology, so most queries involving demographics, diagnoses, procedures, and enrollment will work. This did not require any changes to the demo data, only to the information model (ontology). Adding new demo data for the remaining sections of the ontology is on our roadmap.

This release, v2.1, represents several rounds of improvement over our initial release, and it includes much of CDM v1-v3. It has been vetted by ten of our sites: the mapping has been completed, and the ontology is successfully being used for inter- and intra-network querying. This release corresponds to the PCORnet CDM v3.0, though it is presently missing the death and PRO tables.

- Jeffrey Klann, PhD

About the ontology

The core ontology was originally generated with Dan Connolly's code for generating ontologies from the PCORnet CDM v1 spec, and it was then edited significantly based on site feedback. Note that some clarifications on the meaning of various elements appear in the tooltips. Many terminology trees were added, as described below. The ontology has been modified to do as much as possible out of the box. By default, the ontology is mapped to relevant sections in the i2b2 metadata, and some elements are computed automatically (such as the number of patients enrolled). Sites can adapt the ontology to their local data through a mapping process described in the documentation, and no changes to actual data are necessary.

What does the ontology cover?


What's not in the ontology?


Download the ontology

To download, please view the README at our new GitHub-hosted site!