Ontology Working Group
Comment: much refactoring of Part 2 and Part 3

...

No. Several i2b2 ontologies have been developed and are openly available for use. Any organization may also modify an existing ontology for its own use, or develop a new ontology. (See Ontologies 201 – Custom Metadata – Additions and Modifications.)

The i2b2 software includes i2b2 ontologies that you can use right away. There are additional ontology trees that can substitute for, or can be added to, the standard i2b2 ontology trees. (See Appendix E – Advanced Ontologies.)

Which ontologies can I use right away ('out of the box')?

...

Each ontology is designed to be used with some version of the i2b2 Common Data Model.

| Name | Folder | Description | When To Use | Target Data Model |
|---|---|---|---|---|
| i2b2 Demo Ontology | demo | default metadata from i2b2 authors; legacy categories | demonstration | i2b2 Common Data Model (star-schema) |
| ACT Ontology | act | modern categories and concepts, supplied by the ENACT project | production | i2b2 Common Data Model (star-schema) |
| ACT-on-OMOP Ontology | act-omop | modern categories and concepts, supplied by the ENACT project | production, at sites using OMOP CDM | multi-fact-table-enhanced i2b2 Common Data Model (star-schema) |

...

The ontology you choose depends on your institution's research goals and patient data.

Research Goals

If your goal is simply to understand and show how i2b2 works, then the i2b2 demo ontology is sufficient. If your goal is to set up i2b2 for research use, one of the ACT ontologies will be far more useful. The ACT ontologies are more modern, more robust, and will satisfy the needs of more researchers.

Furthermore, any institution that is planning to join the ENACT Network will need to use an ACT ontology to ensure compatibility with the other institutions in the network.

Patient Data

First, a little more terminology. The i2b2 "patient data" are stored in the i2b2 Clinical Research Chart database, or "CRC database." The ontology categories and concepts that describe those patient data are known as the "metadata." The metadata are stored in the i2b2 Metadata database, or "ONT database."

When setting up the patient data for i2b2, your institution needs to decide how it will conduct the ETL mapping from the EHR to the i2b2 CRC database. If your patient data are going to be set up only for i2b2 and SHRINE use, then many institutions set up the patient data in the default i2b2/tranSMART Common Data Model (CDM). In this case, they typically use the ACT Ontology with its robust collection of coding standards; the patient data from the EHR system need to be mapped to the multiple coding schemes in the ACT Ontology as they are loaded into the CRC database.

If your patient data are going to be set up in a database for queries by other systems besides i2b2, then many institutions use the OMOP Common Data Model (CDM) for their patient data in the CRC database. In this case, they typically use the ACT-on-OMOP Ontology, which relies chiefly on the OMOP Vocabulary standard; the patient data from the EHR system need to be mapped to the appropriate OMOP Vocabulary coding scheme as they are loaded into the OMOP CDM in the CRC database.

Tip Box

Newcomers are strongly encouraged to begin by setting up the Demo Ontology with the demo CRC patient data.

Setting up i2b2 can be a complex undertaking. You can learn a lot about how it works, and get the system up and running most quickly, by first setting up your i2b2 instance with the i2b2 Demo Ontology and demo patient data. Those databases will comprise a "demo" project in your i2b2 instance.

When you have proven i2b2's deployment with the demo project, you can add a separate, new project for research. In this case you could use actual patient data in a second CRC database and an ACT Ontology in a second ONT database. Those new databases will comprise a "research" project in your i2b2 instance.




...

Part 3: How Ontologies and Patient Data Are Tied Together

How does it work? How does the i2b2 ontology make the patient data queryable?

The patient data in the CRC database work in tandem with the metadata in the ONT database to allow i2b2 to conduct queries for the researcher. The metadata spell out all the various healthcare concepts or terms that a researcher may wish to query for in the user interface, and the patient data must be coded in such a way that they reflect the codes defined in the metadata.

Only when a patient's codes match a query's ontology codes will the patient be counted as part of a query result. Each patient "observation" in the CRC database must have a code associated with it, and that code must match a healthcare concept (diagnosis, procedure, medication, lab test, demographic descriptor, etc.) that is defined in the ONT database. So the CRC patient data will be queryable in i2b2 only if the patient facts in the CRC database are recorded utilizing standard codes that are defined in the i2b2 metadata (ONT database).

For instance, if the patient's electronic health record indicates that the patient had the procedure "tonsillectomy with adenoidectomy," then that fact needs to be recorded in the CRC database using the standard code for that particular procedure as defined in the ONT database. In the case of using the default i2b2 Procedures metadata tree, that code would be "ICD9:28.3". When a researcher makes a query in the user interface for "tonsillectomy with adenoidectomy," then i2b2 will query the CRC database for all patients who have the code ICD9:28.3 in their data records.
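
The code-matching idea in the tonsillectomy example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how i2b2 itself builds its queries: the two tables are heavily simplified stand-ins for the CRC tables CONCEPT_DIMENSION and OBSERVATION_FACT, and the concept path, the patient numbers, the "OTHER:0000" code, and the helper patients_for_concept are all hypothetical.

```python
import sqlite3

# Simplified stand-ins for two CRC tables; the real tables have many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE concept_dimension (concept_path TEXT, concept_cd TEXT, name_char TEXT);
    CREATE TABLE observation_fact (patient_num INTEGER, concept_cd TEXT);
""")

# One concept as it might be defined for the Procedures tree (path is hypothetical).
conn.execute(
    "INSERT INTO concept_dimension VALUES (?, ?, ?)",
    ("\\Procedures\\Tonsillectomy with adenoidectomy\\", "ICD9:28.3",
     "Tonsillectomy with adenoidectomy"),
)

# Made-up patient facts: patients 1 and 2 carry the matching code, patient 3 does not.
conn.executemany(
    "INSERT INTO observation_fact VALUES (?, ?)",
    [(1, "ICD9:28.3"), (2, "ICD9:28.3"), (3, "OTHER:0000")],
)


def patients_for_concept(conn, concept_name):
    """Return patient numbers whose facts carry the code defined for the concept."""
    rows = conn.execute(
        """
        SELECT DISTINCT f.patient_num
        FROM observation_fact f
        JOIN concept_dimension c ON c.concept_cd = f.concept_cd
        WHERE c.name_char = ?
        ORDER BY f.patient_num
        """,
        (concept_name,),
    ).fetchall()
    return [row[0] for row in rows]


matching_patients = patients_for_concept(conn, "Tonsillectomy with adenoidectomy")
print(matching_patients)  # [1, 2]
```

Patient 3 is left out because the code on that fact is not the one the concept defines, which is the same reason uncoded or differently coded patient data cannot be found by a query.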

Info

It's important to understand that each institution will have its own protocols for coding diagnoses, procedures, medications, etc., in the patient electronic health records (aka EHR, as in Cerner or Epic databases), and that these institutional protocols may use standard, non-standard, or proprietary codes. So, when loading patient data into the CRC database for i2b2, it's necessary for the data curators who perform the ETL process (extract-transform-load) to map the codes existing in the patient EHR into the codes that are defined in the i2b2 metadata (ONT database).

For instance, let's say a patient's EHR record includes an NDC code for a medication. And let's say that your institution's i2b2 ontology tree only has an RxNorm code for that particular medication. Then the medication record from the EHR would need to be mapped into a patient record in the i2b2 CRC database in such a way that the ontology's RxNorm code (not the EHR's NDC code) appears in the patient record. If the patient record in the CRC database has the NDC code from the EHR, then it would not be matched in a query, since the i2b2 query would be using the RxNorm code.
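
As a rough sketch of the mapping step described above, the following Python shows an ETL transform that swaps an EHR code for the ontology's code before loading. Everything here is assumed for illustration: the codes are placeholders rather than real NDC or RxNorm values, and the crosswalk dictionary and the function names map_ehr_code and transform are hypothetical. A real ETL would draw its crosswalk from a curated source.

```python
# Placeholder codes only; these are not real NDC or RxNorm identifiers.
NDC_TO_RXNORM = {"NDC:00000-0000-01": "RXNORM:0000001"}


def map_ehr_code(ehr_code, crosswalk):
    """Translate an EHR code to the ontology's code, or None if no mapping exists."""
    return crosswalk.get(ehr_code)


def transform(ehr_records, crosswalk):
    """Split EHR records into loadable facts and records that still need a mapping."""
    loadable, unmapped = [], []
    for patient_num, ehr_code in ehr_records:
        ontology_code = map_ehr_code(ehr_code, crosswalk)
        if ontology_code is None:
            # Loading the raw EHR code would leave this fact invisible to queries.
            unmapped.append((patient_num, ehr_code))
        else:
            loadable.append((patient_num, ontology_code))
    return loadable, unmapped


ehr_records = [(1, "NDC:00000-0000-01"), (2, "NDC:00000-0000-02")]
loadable, unmapped = transform(ehr_records, NDC_TO_RXNORM)
print(loadable)   # [(1, 'RXNORM:0000001')]
print(unmapped)   # [(2, 'NDC:00000-0000-02')]
```

Keeping the unmapped records in a separate list, instead of loading them as-is, gives the data curators a worklist of codes that need either a crosswalk entry or an ontology customization.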

Info
If your local institution does not have data in the CRC database for a certain domain in your chosen i2b2 ontology, then user queries referencing that domain may come back empty. To avoid that, you can exclude that domain from the i2b2 user interface, so that the domain without matching data in the CRC database is never used in a query.

What's the relationship between the metadata and the patient data?

This is another key question/concept.

The earlier question, "How does it work? How does the i2b2 ontology make the patient data queryable?", introduced how the patient data in the CRC database work in tandem with the metadata in the ONT database to allow i2b2 to conduct queries for the researcher.

...

For some institutions, the ETL mapping from the EHR to the i2b2 CRC databases is the most problematic process in the setup. Those institutions may decide to minimize the complexity of their ETL, and simply copy the coding scheme from their EHR into the patient records in the CRC database. If the coding schemes in their EHR are not standard coding schemes, then they may have to customize their i2b2 ontology to reflect the coding schemes present in their CRC database. In this case, they can reduce the complexity of the ETL mapping, but this will surely increase the complexity of setting up the ontology tree.



What if my institution records new concepts into the EHR that are not already included in the ontology?

If your institution is using concepts or codes that are not reflected in your ontology tree, the researchers will not be able to query for those concepts in i2b2.

One remedy is to customize your ontology trees to include the new concepts that are missing from the i2b2 ontologies. See Ontologies 201 – Custom Metadata – Additions and Modifications.
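
Before customizing a tree, it can help to know which codes in the patient data have no counterpart in the ontology. The sketch below is one possible check, assuming the codes have already been exported to plain Python lists; the "LOCAL:" codes and the function name unqueryable_codes are hypothetical.

```python
from collections import Counter


def unqueryable_codes(fact_codes, ontology_codes):
    """Count facts whose code is not defined in the ontology, keyed by code."""
    defined = set(ontology_codes)
    return Counter(code for code in fact_codes if code not in defined)


# Made-up sample: one standard code and two local codes absent from the ontology.
fact_codes = ["ICD9:28.3", "LOCAL:ABC", "LOCAL:ABC", "LOCAL:XYZ"]
ontology_codes = ["ICD9:28.3"]

missing = unqueryable_codes(fact_codes, ontology_codes)
print(missing.most_common())  # [('LOCAL:ABC', 2), ('LOCAL:XYZ', 1)]
```

Sorting by frequency puts the codes that affect the most patient facts at the top, which is a reasonable order for deciding which concepts to add first.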

What if my institution's EHR is missing concepts that are found in one of the ontology trees?

If your institution's patient data do not include concepts from one of the ontology trees, then queries made from that tree's concepts may come up with zero results.

Trees with zero matching patient records can be handled in these ways:

  • The tree can be renamed in the metadata-configuration table (TABLE_ACCESS) to include a caveat, for instance "tree name (Unused)."
  • The tree can be deactivated in the metadata-configuration table (TABLE_ACCESS).
  • Do nothing. Since the patient count is displayed for each concept in the ontology tree, those concepts will have no count, and researchers will be unlikely to use those concepts in a query.
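
To decide which of the options above applies, a site first needs to know which trees have no matching patient records. The following is a minimal sketch under simplifying assumptions: each tree is reduced to a flat list of its codes, the facts are (patient number, code) pairs, and the tree names, the sample data, and both function names are hypothetical. It does not touch TABLE_ACCESS; it only identifies candidate trees.

```python
def patients_per_tree(tree_codes, facts):
    """Map each tree name to the number of distinct patients with a matching code."""
    counts = {}
    for tree, codes in tree_codes.items():
        code_set = set(codes)
        counts[tree] = len({patient for patient, code in facts if code in code_set})
    return counts


def empty_trees(tree_codes, facts):
    """Return the names of trees with no matching patient records, sorted."""
    counts = patients_per_tree(tree_codes, facts)
    return sorted(tree for tree, count in counts.items() if count == 0)


# Made-up sample: the Procedures tree has data, the other tree has none.
tree_codes = {
    "Procedures": ["ICD9:28.3"],
    "Hypothetical Tree": ["HYPO:0001", "HYPO:0002"],
}
facts = [(1, "ICD9:28.3"), (2, "ICD9:28.3"), (2, "ICD9:28.3")]

print(patients_per_tree(tree_codes, facts))  # {'Procedures': 2, 'Hypothetical Tree': 0}
print(empty_trees(tree_codes, facts))        # ['Hypothetical Tree']
```

Counting distinct patients rather than facts mirrors the per-concept patient counts mentioned in the last option, so a patient with repeated observations is counted once.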

What else is pertinent to setting up my first ontology?


But there is more to the relationship between the ONT and CRC databases. One additional topic is the i2b2 "project." The other additional topic is the "secret sauce" of the relationship.

...

In addition, in your CRC database, there should be a CONCEPT_DIMENSION table.

Suggested Next Steps

...

  • Visit Appendix C – Ontology Tables to learn more about the structure of the ontology database.
  • Visit Appendix D – Test Queries to learn how to run some "sanity check" queries on your new database.
  • Visit Ontologies 102 – Patient Counts ("totalnum") to learn how to add patient counts to your ontology concepts in the user interface.

...
