Ontologies 101 – Introduction – Your First i2b2 Ontology

What is an ontology?

The term "ontology" is borrowed from philosophy, where it refers to the study of existence.

In the realm of i2b2, an ontology is actually a "taxonomy": a hierarchical grouping of terms or concepts. If you recall from high-school biology how Linnaeus classified and named living organisms (the system that gave us "binomial nomenclature"), then you already know what a taxonomy is. For instance, the common house fly, Musca domestica, and other flies in the family Muscidae are members of the order Diptera, which belongs to the class Insecta, phylum Arthropoda, kingdom Animalia. A taxonomy defines the hierarchical "is-a" relationship between two concepts. Likewise, a Camry is a Toyota, which is an automobile, which is a type of motorized, wheeled conveyance.
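
The "is-a" chain described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how i2b2 stores its ontology: the `PARENT` map and the helper names `ancestors` and `is_a` are hypothetical, and the house fly lineage is abbreviated to the ranks named in the paragraph.

```python
# Hypothetical child -> parent map built from the examples in the text.
# Each entry reads "key is-a value".
PARENT = {
    "Musca domestica": "Muscidae",
    "Muscidae": "Diptera",
    "Diptera": "Insecta",
    "Insecta": "Arthropoda",
    "Arthropoda": "Animalia",
    "Camry": "Toyota",
    "Toyota": "automobile",
    "automobile": "motorized, wheeled conveyance",
}


def ancestors(term):
    """Return the chain of parents from the term up to its root."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def is_a(term, candidate):
    """True if candidate is the term itself or any of its ancestors."""
    return term == candidate or candidate in ancestors(term)


print(ancestors("Musca domestica"))
print(is_a("Camry", "automobile"))
```

Because every term has exactly one parent, walking upward from any term always ends at a single root, which is what lets a taxonomy be drawn as nested folders.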

Why does i2b2 have an ontology?

An i2b2 instance requires an ontology so that researchers can locate "concepts" of interest and query data.

The medical concepts contained in an ontology are essentially the building blocks of an i2b2 query.
The ontology is what "drives" the upper-left corner of the i2b2 webclient. To make the facts or observations you load into i2b2 queryable, your ontology must contain corresponding entries that account for the codes used to represent those facts or observations.

These concepts are rendered as a collection of nested folders and are assumed to represent child-parent (or is-a) relationships. Researchers may locate these concepts by using a hierarchical tree, or taxonomy. When you log in to the demo i2b2 site (username "demo", password "demouser" are already filled in), you will see the i2b2 ontology tree on the upper left side of the user interface. We also refer to that listing as a group of ontologies, where each domain or top-level taxonomy is called an ontology in its own right.

In this screenshot, we would say that i2b2 is displaying an ontology tree with 13 top-level ontologies or domains, namely: Clinical Trials, Custom Metadata, Demographics, etc.

For instance, diabetes mellitus is an endocrine disorder, which is a type of diagnosis. Aspirin is a non-steroidal anti-inflammatory drug, which is a type of drug.

How does it work? (How does an ontology make the data queryable?)

The ontology is made up of medical concepts. It is visible to researchers through the webclient, and each medical concept includes one or more codes that match medical observation facts (such as diagnoses, medications, and procedures). A researcher builds a query by selecting the concepts of interest, and the medical data matching those concepts' codes are counted.
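
The matching of concept codes to observation facts can be sketched with an in-memory SQLite database. This is a simplified illustration under stated assumptions, not the actual i2b2 query engine: the table names are borrowed from the i2b2 star schema but reduced to two columns each, the `DEMO:` codes and patient numbers are made up, and `count_patients` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-ins; the real i2b2 tables have many more columns.
    CREATE TABLE concept_dimension (concept_path TEXT, concept_cd TEXT);
    CREATE TABLE observation_fact (patient_num INTEGER, concept_cd TEXT);
""")

# Ontology side: each concept has a folder-like path and a code (made-up codes).
conn.executemany("INSERT INTO concept_dimension VALUES (?, ?)", [
    ("\\Diagnoses\\Endocrine disorders\\Diabetes mellitus\\", "DEMO:DM"),
    ("\\Diagnoses\\Endocrine disorders\\Diabetes mellitus\\With neuropathy\\", "DEMO:DM-NEURO"),
    ("\\Medications\\NSAIDs\\Aspirin\\", "DEMO:ASA"),
])

# Data side: each fact records that a patient was observed with a code.
conn.executemany("INSERT INTO observation_fact VALUES (?, ?)", [
    (1, "DEMO:DM"),
    (2, "DEMO:DM-NEURO"),
    (2, "DEMO:DM-NEURO"),  # repeated observation for the same patient
    (1, "DEMO:ASA"),
    (3, "DEMO:ASA"),
])


def count_patients(conn, folder_path):
    """Count distinct patients with a fact coded at or below folder_path."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT f.patient_num)
        FROM observation_fact AS f
        JOIN concept_dimension AS c ON c.concept_cd = f.concept_cd
        WHERE c.concept_path LIKE ? || '%'
        """,
        (folder_path,),
    ).fetchone()
    return row[0]


print(count_patients(conn, "\\Diagnoses\\Endocrine disorders\\"))
```

Selecting a folder matches every concept whose path starts with that folder's path, so a query on "Endocrine disorders" also counts patients coded with the more specific diabetes concepts beneath it. A fact whose code has no entry in the ontology is never matched, which is why unmapped data cannot be queried.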

What are the sources of the concepts and domains? (Where do they come from?)

Typically, the concepts that make up domains are based on medical terminology standards. Some commonly used standards to represent basic structured clinical patient data collected in EHRs include:

Data domains	Typical Standards
Demographics	HL7 Administrative
Diagnoses	ICD
Procedures	ICD, CPT, HCPCS
Medications	RxNorm + VA Classes hierarchy
Labs	LOINC
Vital Signs	LOINC

Medical terminology releases are not directly usable by i2b2. Standardized i2b2 tables must be created for each desired medical terminology to make it usable within i2b2.
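
As a rough sketch of what "creating i2b2 tables from a terminology" involves, the snippet below turns a tiny parent/child term list into path-based rows. Everything here is an assumption for illustration: the `TERMS` input format, the `DEMO` code prefix, and the `build_rows` helper are hypothetical, only a handful of the metadata columns are shown, and the level numbering (top-level folder counted as 0) is a choice made for this example.

```python
# Hypothetical terminology release: (code, name, parent_code).
# A parent_code of None marks a top-level term.
TERMS = [
    ("DX", "Diagnoses", None),
    ("DX-ENDO", "Endocrine disorders", "DX"),
    ("DX-DM", "Diabetes mellitus", "DX-ENDO"),
]


def build_rows(terms, prefix="DEMO"):
    """Convert parent/child terms into path-based ontology rows."""
    by_code = {code: (name, parent) for code, name, parent in terms}

    def names_from_root(code):
        # Walk up the parent links, then reverse to get root-first order.
        names = []
        while code is not None:
            name, code = by_code[code]
            names.append(name)
        return list(reversed(names))

    rows = []
    for code, name, _parent in terms:
        path = names_from_root(code)
        rows.append({
            "c_hlevel": len(path) - 1,                      # depth in the tree
            "c_fullname": "\\" + "\\".join(path) + "\\",  # folder-style path
            "c_name": name,                                 # label shown to users
            "c_basecode": f"{prefix}:{code}",               # code matched to facts
        })
    return rows


rows = build_rows(TERMS)
for row in rows:
    print(row)
```

The key step is flattening the terminology's parent/child links into one full path per concept, since the nested-folder display and the code lookup both work from that path.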

Do all i2b2 instances always have the same ontology?

No. Several i2b2 ontologies have been developed and are openly available for use. Any organization may also modify an existing ontology for its own use, or develop a new ontology. (See Ontologies 201.)

What ontologies can I use right away ('out of the box')?

The i2b2 database loading modules come with at least 3 sets of "metadata" or ontology trees. These are the demo ontology, the ACT ontology, and the ACT-on-OMOP ontology.

Name	Description	Included Domains	Target Data Model
i2b2 demo ontology	default metadata from i2b2 authors	fixme	i2b2 Common Data Model (star-schema); default CRC database has matching concepts
ACT ontology	ENACT project	fixme	i2b2 Common Data Model (star-schema); ACT CRC demo database has matching concepts
ACT-on-OMOP ontology	ENACT project	fixme	i2b2 Common Data Model (star-schema), but modified with views into the OMOP Common Data Model; CRC database loaded with SYNPUF demo data has matching concepts

(Note on Included Domains: I don't think all domains can be listed, or it will be hard to understand. I think this merges with description and tries to convey the level of detail and comprehensiveness.)

What is an ontology?

Why does i2b2 have an ontology?

How does it work? (How does an ontology make the data queryable?)

What are the sources of the concepts and domains? (Where do they come from?)

Do all i2b2 instances always have the same ontology?

What ontologies can I use right away ('out of the box')?

That may be the "demo" ontology, but it doesn't resemble the "ACT" ontology that our PI was showing us. How many different ontology trees does i2b2 have?

How about a thumbnail description of these ontologies? How do they differ from each other?

You just mentioned Domains. What do you mean by that?

There are many terms related to i2b2 ontologies that are new to me. Is there a Glossary?

How should I choose which ontology tree to use?

Those ontology trees are called metadata, too. Where are the data?

What's the relationship between the metadata and the data?

How do I deploy my chosen ontology tree?

What should my metadata database look like when I am done?
