IDRT - Integrated Data Repository Toolkit
Contributors: Igor Engel, Thomas Ganslandt 

Note

TODO TG: content

Relevance & Approach

Large collections of biosamples, e.g. from tumors or "remainder" material from routine clinical laboratories, play an increasing role in translational research. The intuitive query capabilities of i2b2 make it an ideal platform for querying combined data from clinical or study records, biosamples, and analysis data derived from biomaterial. Until now, however, importing biosample data into i2b2 has required individually coded ETL pathways.

...

In the IDRT project, a driver was developed to import biosample data from the STARLIMS Biorepository® by Abbott Informatics (tested with versions 10.5 and 10.7).

...

This section describes the specific implementation of the STARLIMS® driver. The driver was implemented on the Talend Open Studio platform in order to integrate with the other components of the IDRT toolkit. It creates CSV files for the full ontology and the fact data generated from the source system, which are then imported using the standard IDRT CSV extractor. A driver for a different biosample management system only needs to implement the extraction and preparation of the ontology and fact CSV files.
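As a rough illustration of the CSV preparation step, the following sketch serializes one ontology row and one fact row. The column layouts, the sample ID, and the class name are hypothetical and do not reproduce the actual input format of the IDRT CSV extractor; only the quoting rule (RFC 4180 style) is standard.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Sketch only: the column layouts below are hypothetical, not the IDRT CSV format. */
final class BiosampleCsvSketch {

    /** Quotes a field when it contains a separator, a quote, or a line break. */
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        // Embedded quotes are doubled inside a quoted field.
        String doubled = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + doubled + "\"" : field;
    }

    /** Joins the fields of one record into a single CSV line. */
    static String toCsvLine(List<String> fields) {
        return fields.stream()
                .map(BiosampleCsvSketch::escape)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Hypothetical ontology row: concept path, display name.
        System.out.println(toCsvLine(List.of("\\Biosamples\\Material\\Serum\\", "Serum")));
        // Hypothetical fact row: sample ID, concept path, value.
        System.out.println(toCsvLine(List.of("S-0001", "\\Biosamples\\Material\\Serum\\", "1,5 ml")));
    }
}
```

A driver for another source system could reuse such a helper unchanged and replace only the code that reads the rows from its own database.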

...

| Context | Table | Description |
|---|---|---|
| Sample core data | INVENTORY | Biosample inventory objects (e.g. samples, aliquots, containers), including relevant core attributes (e.g. sample ID, material type) |
| | INVENTORY_TRANSACTIONS | Actions taken with samples (e.g. splitting into aliquots, moving), including links to parent samples of aliquots |
| | MATERIALS | Material types |
| | RASPROJECT_INVENTORY | Links between samples and projects |
| Sample metadata | METADATA | Metadata content for samples |
| | METADATA_TEMPLATE_FIELDS | Field definitions for flexible sample metadata |
| Storage hierarchy | DEPARTMENTS | Top level of storage hierarchy |
| | BUILDINGS | Buildings inside of departments |
| | ROOMS | Rooms inside of buildings |
| | LOCATIONS | Recursive storage structure inside of a room (e.g. a freezer subdivided into slots, racks and rack positions) |

...

The "lower" levels of the storage hierarchy (the recursive part stored in LOCATIONS) require further processing to determine the parent/child relations and the hierarchy level of each entry.
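The kind of processing this requires can be sketched as follows: each location is followed upwards through its parent links until a top-most location is reached, which yields both its level and its full path. This is a minimal sketch under stated assumptions, not the driver's actual code. The map of location ID to parent ID and the example IDs are hypothetical, and the real column names in LOCATIONS may differ.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Sketch only: resolves level and path in a self-referencing location table. */
final class LocationHierarchySketch {

    /**
     * Returns the chain of IDs from the top-most location down to the given ID.
     * parentOf maps each location ID to its parent location ID (assumed layout);
     * a null or missing parent marks a location placed directly inside a room.
     */
    static Deque<String> pathTo(String id, Map<String, String> parentOf) {
        Deque<String> path = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        for (String current = id; current != null; current = parentOf.get(current)) {
            // Guard against inconsistent source data that would loop forever.
            if (!seen.add(current)) {
                throw new IllegalStateException("Cycle in location hierarchy at " + current);
            }
            path.addFirst(current);
        }
        return path;
    }

    /** Level within the recursive part: 1 for a top-most location, 2 for its children, and so on. */
    static int levelOf(String id, Map<String, String> parentOf) {
        return pathTo(id, parentOf).size();
    }

    public static void main(String[] args) {
        Map<String, String> parentOf = new HashMap<>();
        parentOf.put("FREEZER-1", null);
        parentOf.put("RACK-A", "FREEZER-1");
        parentOf.put("POS-07", "RACK-A");
        System.out.println(levelOf("POS-07", parentOf));
        System.out.println(String.join("\\", pathTo("POS-07", parentOf)));
    }
}
```

Walking upwards from each entry keeps the sketch short. For large tables, caching the resolved paths or using a recursive query in the database might be preferable.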

STARLIMS® provides flexible metadata templates which can be designed to fit individual project needs (e.g. to collect additional information about sample quality or processing steps not covered by SPREC). Metadata templates are versioned and can be defined separately for each project and material type. Samples within one project may thus contain metadata from different templates or template versions. In addition to the TOS standard components, a small dedicated Java program is needed to generate a consistent concept hierarchy of templates, versions, fields and values from the raw source data.
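The idea behind such a program can be sketched as follows: every metadata entry contributes one node per level (template, version, field, value), and nodes shared between entries are emitted only once. This is not the dedicated program shipped with the driver. The `Entry` class, its field names, the `v` prefix for versions, and the example values are assumptions made for illustration; the backslash-delimited paths follow the usual i2b2 concept path style.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Sketch only: derives concept paths from flattened metadata rows. */
final class MetadataConceptSketch {

    /** One metadata entry; all field names here are hypothetical. */
    static final class Entry {
        final String template;
        final int version;
        final String field;
        final String value;

        Entry(String template, int version, String field, String value) {
            this.template = template;
            this.version = version;
            this.field = field;
            this.value = value;
        }
    }

    /** Returns every node of the hierarchy exactly once, in sorted order. */
    static Set<String> conceptPaths(List<Entry> entries) {
        Set<String> paths = new TreeSet<>();
        for (Entry e : entries) {
            StringBuilder path = new StringBuilder("\\");
            for (String segment : List.of(e.template, "v" + e.version, e.field, e.value)) {
                path.append(segment).append('\\');
                // The set silently drops nodes already added by another entry.
                paths.add(path.toString());
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        List<Entry> entries = List.of(
                new Entry("Quality", 1, "Hemolysis", "yes"),
                new Entry("Quality", 1, "Hemolysis", "no"),
                new Entry("Quality", 2, "Hemolysis", "no"));
        conceptPaths(entries).forEach(System.out::println);
    }
}
```

Keeping the template version as its own level means that samples recorded under different versions of the same template end up in separate branches and can be queried separately.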

...

An example of the concept hierarchy obtained after loading biosample data with the IDRT biomaterial extractor is shown in the following figure: