Population-Wide Analysis Bundle


Using SHRINE to perform federated queries of i2b2 data

Introduction


This population-wide analysis bundle provides researchers with real-time access to data on large patient populations at multiple healthcare organizations. It includes i2b2, which enables query and analysis of data within an institution, and SHRINE (Shared Health Research Information Network), which is a federated query tool that connects different sites' i2b2 systems. In this bundle, patient-level data never leaves an institution. The patient data are stored locally within each site's i2b2 database, and only aggregate counts and statistics are shared with others in the network through SHRINE. The bundle also includes a common ontology called ACT (Accrual for Clinical Trials), which has been implemented in a SHRINE network with more than 50 institutions and 125 million patients.

Figure 1. High-level view of the bundle. The SHRINE Query Tool broadcasts queries to multiple i2b2 sites using a common ontology and returns real-time results.

Use Cases


Uses cases for this bundle include:


Bundle Components


This bundle includes documentation on how to install and configure the following items:


Demo

A public demo of this bundle is available at the following URL:
http://shrine-node3.i2b2transmartbundles.org/shrine-api/shrine-webclient/
Log in with the user demo and password demouser.
It consists of a 3-node SHRINE with Synthea demo data.

Technical Architecture


i2b2 Components

i2b2 consists of independent applications that provide different functionality called "cells" (Figure 2). A collection of cells form an i2b2 "hive". Most i2b2 hives include (1) a Project Management (PM) cell for authentication and authorization; (2) a Clinical Research Chart (CRC) cell, which contains patient data and the query engine; and, (3) an Ontology (ONT) cell, which describes the concepts and codes contained within the CRC cell. Many i2b2 hives also include (4) a Workplace (Work) cell, which enables users to "bookmark" frequently used items in the user interface and share these with collaborators; and (5) an Identity Management (IM), which allows authorized users to retrieve identified patient data. Cells communicate with each other using i2b2 XML messages sent to APIs. When a cell receives a request message, it queries a table in the HiveData database to determine the location of the main database for that cell, based on the user's project. An exception is the PM cell, which uses a single database for all projects. The i2b2 Web Client is written entirely in HTML and JavaScript. It communicates with a Web Proxy on a web server, which redirects messages to the appropriate cell.

Figure 2. i2b2 components.

SHRINE Components


The SHRINE software consists of several parts (Figure 3). (1) At the site where investigators are forming queries (the Site Originating Query), there is a SHRINE Web Client and a SHRINE application called the Query Entry Point (QEP). The QEP authenticates the user, provides the Web Client with access to the ontology, sends queries to the SHRINE Hub, and polls the hub for results from sites. In SHRINE Version 3.0, the QEP does not have any dependencies on i2b2, except for using the i2b2 PM cell for authentication. It copies the ontology into a Lucene index to enable fast searches for concepts. In older versions of SHRINE, the Legacy Web Client used the i2b2 ONT cell to access the ontology. (2) The SHRINE Hub includes the main hub software and database that contains the configuration settings for the network. It also contains a Message-Oriented Middleware (MOM) component to support asynchronous communication across sites. (3) The sites that are running the queries and returning aggregate counts contain a SHRINE Adapter. This polls the SHRINE MOM for new queries and runs them on an i2b2 CRC cell. (This CRC cell also requires a PM and ONT cell to run, but these are not shown in the figure.) Usually sites that are running queries also have investigators that are forming queries. These sites have a single i2b2 instance along with the SHRINE Web Client, QEP, and Adapter.

Figure 3. SHRINE components.

SHRINE Configurations


The i2b2 and SHRINE components can be combined in different ways (Figure 4): (a) Most commonly, they are used to form a federated network, with a central SHRINE Hub, and each hospital having the items needed to both send queries to the network and respond to queries from other sites. Some sites also use the i2b2 Web Client to perform local queries that are not sent to other institutions. (b) An alternative approach is to use a central SHRINE Web Client and QEP. This simplifies the architecture of the network, but it requires sites to send their investigators to an external website to run queries, which might not be possible. (c) The same i2b2 instance at a site can participate in multiple networks by installing a SHRINE Adaptor for each network. (d) A single site can create a one-node network if they want to use the SHRINE Web Client as an alternative user interface for i2b2.

Figure 4. SHRINE configurations. (a) Federated network with a SHRINE Web Client at each site. (b) A centralized SHRINE Web Client. (c) A site participating in two SHRINE networks. (d) One site using the SHRINE Web Client as an alternative user interface for i2b2.

Key Technical Concepts

System Requirements

It is recommended to install SHRINE and i2b2 on separate machines, as they are both resource intensive. If you install them on the same machine, two versions of the Java JDK will need to be active (JDK 11 for SHRINE and JDK 8 for i2b2). So, a three-node SHRINE network requires a minimum of four virtual machines (one for each node and one for the Hub), but optimally seven (one for each SHRINE node, one for each i2b2, and one for the Hub).
Both platforms consist of a database layer, an application server layer, and a web-based client layer. i2b2 must be configured for use with the ACT ontology when used by this SHRINE bundle. High-level requirements for each of these are listed in the next subsections.

i2b2

i2b2 requirements can be found here. A summary of the key requirements:


SHRINE Nodes and Hubs

SHRINE requirements can be found here. A summary of the key requirements:

Installation

Youtube Tutorial Videos


We have developed several Youtube tutorial videos that parallel the installation steps below.

  1. Install i2b2: https://www.youtube.com/watch?v=Oi9tlLVYXU8
  2. PostgreSQL ACT Ontology:     https://youtu.be/np33ydjricg
  3. Synthea COVID19:     https://youtu.be/RoMmL7-R14s
  4. SHRINE Node:      https://youtu.be/TDVkf8N0R-E 


i2b2 Install

There is a public demo of i2b2 available at https://www.i2b2.org/webclient/

Installing the ACT Ontology in i2b2


There is a public demo of the ACT ontology in i2b2 available at. https://dbmi-ncats-test01.dbmi.pitt.edu//webclient/


SHRINE node install

The SHRINE 3.0 install guide can be found here: https://open.catalyst.harvard.edu/wiki/display/SHRINE/SHRINE+3.0.0+Installation+Guide 

It is divided into 14 chapters. Some notes on the steps for install are shown below:

SHRINE Hub Install

The SHRINE Hub is a special SHRINE node that coordinates communication between the nodes. To install it, follow the install instructions for a SHRINE node above and then the additional instructions below. The Hub does not require an i2b2 installation.