Data Science Bundle (alpha version!)

Combining clinical and genomic data using i2b2 and tranSMART to perform complex analyses of real-world data

Introduction

This data science bundle supports complex analyses of real-world clinical and genomic data. It includes i2b2, which enables query and cohort identification, and tranSMART, which adds a suite of tools for data exploration, R-based advanced analytics (e.g., correlation analysis, heat maps, and PCA), and genomic modules for genome-wide association studies (GWAS) and high-dimensional data analysis such as RNA-seq.

Figure 1. High-level view of the bundle. The i2b2 common data model integrates clinical and genomic data. i2b2 provides tools for query and cohort selection, and tranSMART contains modules for high-dimensional analyses.

Use Cases

Use cases for this bundle include:

Translational research
Genome association studies

Bundle Components

This bundle includes documentation on how to install and configure the following items:

i2b2 - Local query tool
- Database
- Application Layer
- i2b2 Web Client
- Sample synthetic data
tranSMART - Analysis tools
- Additional database tables
- Application Layer
- tranSMART User Interface
Demo data - Synthetic datasets for testing the software

Demo

A public demo of this bundle is available at the following URL:

...

It consists of both i2b2 and tranSMART running on the same database with Synthea demo data.

Technical Architecture

i2b2 Components

i2b2 consists of independent applications called "cells", each providing different functionality (Figure 2). A collection of cells forms an i2b2 "hive". Most i2b2 hives include (1) a Project Management (PM) cell for authentication and authorization; (2) a Clinical Research Chart (CRC) cell, which contains patient data and the query engine; and (3) an Ontology (ONT) cell, which describes the concepts and codes contained within the CRC cell. Many i2b2 hives also include (4) a Workplace (Work) cell, which enables users to "bookmark" frequently used items in the user interface and share these with collaborators; and (5) an Identity Management (IM) cell, which allows authorized users to retrieve identified patient data.

Cells communicate with each other using i2b2 XML messages sent to APIs. When a cell receives a request message, it queries a table in the HiveData database to determine the location of the main database for that cell, based on the user's project. An exception is the PM cell, which uses a single database for all projects.

The i2b2 Web Client is written entirely in HTML and JavaScript. It communicates with a Web Proxy on a web server, which redirects messages to the appropriate cell.
Figure 2. i2b2 components.
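
The per-project database lookup described above can be illustrated with a minimal sketch. This is not i2b2 code: the lookup table contents, the database names, and the `resolve_database` function are all hypothetical, and the real HiveData table and column names are not reproduced here. The sketch only shows the routing rule, in which most cells resolve their database from the user's project while the PM cell always uses the same one.

```python
# Hypothetical stand-in for the HiveData lookup table:
# (cell, project) -> location of that cell's main database.
HIVE_LOOKUP = {
    ("CRC", "Demo"): "crc_db_demo",
    ("ONT", "Demo"): "ont_db_demo",
    ("WORK", "Demo"): "work_db_demo",
}

# The PM cell is the exception: one database for all projects (name is made up).
PM_DATABASE = "pm_db"


def resolve_database(cell: str, project: str) -> str:
    """Return the database a cell should use for a request in the given project."""
    if cell == "PM":
        return PM_DATABASE
    try:
        return HIVE_LOOKUP[(cell, project)]
    except KeyError:
        raise LookupError(
            f"no database registered for cell {cell!r} in project {project!r}"
        ) from None


if __name__ == "__main__":
    print(resolve_database("CRC", "Demo"))
    print(resolve_database("PM", "AnyProject"))
```

In this sketch, adding a project means adding one row per cell to the lookup table, which mirrors why the cells themselves do not need to know about projects in advance.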

tranSMART Components

The tranSMART web user interface is a single Tomcat application with an extended set of plugins, which may be enabled or disabled in the configuration file.
tranSMART is delivered with a set of supporting applications:

Transmart-data creates an empty database including all schemas and tables for i2b2, and includes stored procedures for data loading and data management. It also provides targets to install R and a set of required R/BioConductor packages from source.
Transmart-etl provides Pentaho Data Integration (Kettle) jobs to load clinical and omics data types, plus a loader application for genes, pathways, proteins, and other metadata.
RInterface provides an API for scripts to log in and extract study data for external analysis.
Transmart-batch and tranSMART-ICE are alternative data loading tools.
Transmart-manual is the online manual, to be installed alongside the server using Tomcat or a web server.
Scripts provides all-in-one installation scripts for supported operating systems.
GWAVA provides a server for GWAS data visualisation within tranSMART.
Transmart-test is an automated test environment for developers.

System Requirements

i2b2

i2b2 requirements can be found here. A summary of the key requirements:

Database: Oracle (>=11g), PostgreSQL (>=9), MSSQL (>=2012).
OS: Windows, Mac, or Linux
Software components: JDK 8, Wildfly 17, web server (no specific requirement)
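
As a rough aid for checking a candidate database against the minimums listed above, the following sketch compares a reported version string with the i2b2 minimum. The function names and the dictionary are illustrative assumptions, not part of i2b2, and the parsing is deliberately simple (leading digits of each dot-separated piece).

```python
from itertools import takewhile
from typing import Tuple

# Minimum database versions taken from the i2b2 summary above.
I2B2_DB_MINIMUMS = {
    "oracle": (11,),      # Oracle 11g or later
    "postgresql": (9,),   # PostgreSQL 9 or later
    "mssql": (2012,),     # MSSQL 2012 or later
}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '9.6.24' into (9, 6, 24); suffixes such as the 'g' in '11g' are dropped."""
    parts = []
    for piece in text.strip().split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break  # stop at the first piece with no leading number
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(database: str, version: str) -> bool:
    """Return True if the version is at least the listed minimum for that database."""
    minimum = I2B2_DB_MINIMUMS[database.lower()]
    return parse_version(version) >= minimum


if __name__ == "__main__":
    for db, ver in [("PostgreSQL", "9.6.24"), ("oracle", "11g"), ("mssql", "2008")]:
        print(db, ver, "ok" if meets_minimum(db, ver) else "too old")
```

The check covers only the database line; JDK, Wildfly, and web server requirements would need their own checks.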

tranSMART

tranSMART is supported with:

Database: Oracle Enterprise Edition 12.0.1, PostgreSQL (>=9.4)
OS: Linux (Ubuntu 20.04, 18.04, 16.04, 14.04, CentOS 7, Fedora 33, …)
Software components: JDK 8, Tomcat (>=7), Groovy, R (>=4.0.0)

The additional database requirements (e.g., Oracle Enterprise Edition) are to support partitioning of the large tables used for omics data in the tranSMART-specific schemas.

Installation

i2b2 Install

Download:

...

There are 3 components:
- Data (Chapter 3). Install the i2b2 database on MSSQL, Oracle, or Postgres. This provides many metadata tables for querying and authentication, as well as the actual core data tables.
  - It is essential to configure i2b2 with the ACT ontology, to be compatible with the current SHRINE 3.0 bundle. When installing i2b2 data, follow the instructions in the next section on installing the ACT ontology.
- Server (Chapter 4). A Java program that runs in the Wildfly container and provides an API and data analytic methods on the database. It is divided into components called cells. SHRINE uses some of these cells: CRC to communicate with the database, ONT to provide the query ontology, and PM to manage authentication.
- Webclient (Chapter 5). A web interface to i2b2, which is not required for SHRINE but could be useful for local querying and testing (SHRINE is network-only).

tranSMART Install

Installation instructions are on the tranSMART wiki. They apply generally to any Linux operating system.

Install scripts are provided for a set of supported operating systems. They assume a fresh, clean installation of the operating system and can be amended and re-launched in case of problems (e.g., files not in the expected path or format, or changes to the components or requirements for R installation from the public R distributions).
