Developers Getting Started With i2b2
Space shortcuts
Space Tools
Developers Getting Started With i2b2 getstarted

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Data Export Configuration

Design and Architecture

The i2b2 breakdown architecture is modified to support the data table definition and new breakdown types for User created data requests. The database tables have been modified to support the data table definitions.

  • RPDO_TABLE_REQUEST table stores the data table definitions
  • HIVE_CELL_PARAMS has new parameters for global and email configurations and the data file generation location (defaults to the server's drive)
  • QT_RESULT_TYPE table will contain a new entry for each data table definition
  • QT_BREAKDOWN_PATH will contain a new entry for each data table definition which has the data export execution details.
  • QT_XML_RESULT contains metadata about each data request, such as status and e-mail details.  

User Roles and Actions

User visibility to Data Table Creation, Data Request and Data Export is based on User role configuration. Configuration is managed by the Admin user using Admin Dashboard plugin.

Below is the list of Actions available based on the User role. 

...

Database Configuration changes

RPDO_TABLE_REQUEST

In order to support the data Table design, a new table RPDO_TABLE_REQUEST is added.  The metadata in Data Table design is stored in the table when the data Table is saved.

Image Removed

Three default rows are initially loaded into the RPDO_TABLE_REQUEST, defining patient_dimension columns which are required in every export table. These are:  race_cd, age_in_years_num, and sex_cd

Image Removed

QT_BREAKDOWN_PATH

QT_Breakdown_Path table has been modified to include a new column Group_id. It stores the Project ID, in order to support project-scoped data table definitions.

...

Image Removed

Database updates- Data table definition creation

The database is dynamically modified when the user creates the data table definition or submits a data request 

RPDO_TABLE_REQUEST 

Rows for the data table design are dynamically inserted in the RPDO_TABLE_REQUEST when a data table is saved.  

...

auto generated incremental value, corresponding to each concept in the data table. Unique for each concept

example: 1282,1283, 1284

...

unique for each Data table ( same value for all the concepts underneath)

example: 17

...

logs the table definition variables details needed for the  execution of the stored procedure to create the data file

...

Example:

Image Removed
QT_QUERY_RESULT_TYPE

The data table is logged as a breakdown type  in the QT_QUERY_RESULT_TYPE table when the user saves the data table definition.

...

Auto generated number

example : 144

...

RPDO_<TABLE_INSTANCE_ID>

example: RPDO_17

...

LU (User tables)

LP (Project shared tables)

...

Example:

Image Removed

Info

We recommend data requests be limited to DATA_LDS users and data exports be limited to MANAGER users. The user role determines if exports/requests are visible and runnable. Refer to section on User Roles and Actions

QT_BREAKDOWN_PATH 

A new row is also added here when the user saves the data table definition. 

  • Name column references the  Table_instance_id when the user saves the table_definition. 
  • Value column contains the command to run the stored procedure that generates the data file.
  • Group_id has the project ID
  • The result_instance_id parameter gets the numeric value from the QUERY_RESULT_INSTANCE for the patientset (result_type_id =1) when the export file is created.

Example:

...

RPDO_<Table_instance_id>

Example: RPDO_17

...

EXEC i2b2synthea8.dbo.usp_rpdo2 @TABLE_INSTANCE_ID=<Table_instance_id> @RESULT_INSTANCE_ID={{{RESULT_INSTANCE_ID}}} @MIN_ROW=0 @MAX_ROW=10000

...

SQLServerLarge

Example:

Image Removed

Database updates- Data request submission

1.8.2 requests
QT_QUERY_RESULT_INSTANCE

When the User runs a query with User created Data Request breakdown option selected, a row is logged in the QT_Query_Result_Instance for the  result_type_id generated in the QT_RESULT TYPE

...

Example:

Image Removed

QT_BREAKDOWN_PATH 

The EXEC statement in Value column is updated with value for table_instance_id

Example:

...

RPDO_<Table_instance_id>

Example: RPDO_144

...

EXEC i2b2synthea8.dbo.usp_rpdo2 @TABLE_INSTANCE_ID=17 @RESULT_INSTANCE_ID={{{RESULT_INSTANCE_ID}}},@MIN_ROW=0 @MAX_ROW=99999999

{{{RESULT_INSTANCE_ID}}} gets replaced with QT_QUERY_RESULT_INSTANCE.Result_instance_id  as the stored-procedure in the back-end gets executed.

at the time of data file creation.

QT_XML_RESULT

A row is logged for the result_instance_id of the data request submitted at the time of query run. Metadata about data request queries are stored in the QT_XML_RESULT table, in the XML_VALUE field. The metadata  gets updated as the data export process is completed (from data request submission to data file creation)

...

Data file format/location and Email parameters set-up

HIVE_CELL_PARAMS

The generated Data file format, location and email server parameters are configurable in the HIVE_CELL_PARAMS. 

Info

Wildfly must be restarted for changes to the HIVE_CELL_PARAMs to take effect.



Cell IDParameter NameExample ValueNotes
CRCedu.harvard.i2b2.crc.smtp.hostsmtp.partners.orgSMTP host
CRCedu.harvard.i2b2.crc.smtp.port25SMTP port
CRCedu.harvard.i2b2.crc.smtp.ssl.enabledFALSETRUE will enable SSL
CRCedu.harvard.i2b2.crc.smtp.authFALSETRUE will enable SMTP authentication
CRCedu.harvard.i2b2.crc.smtp.usernamenoneSMTP username (required for SMTP authentication)
CRCedu.harvard.i2b2.crc.smtp.passwordnoneSMTP password (required for SMTP authentication)
CRCedu.harvard.i2b2.crc.smtp.enabledFALSETRUE will enable e-mails
CRCedu.harvard.i2b2.crc.exportcsv.defaultescapecharacter"Escape character for export files
CRCedu.harvard.i2b2.crc.exportcsv.maxfetchrows-1Maximum number of rows to export, or -1 for no limit
CRCedu.harvard.i2b2.crc.exportcsv.defaultlineend\nLine ending for export files
CRCedu.harvard.i2b2.crc.exportcsv.defaultseperator\tField separator for export files
CRCedu.harvard.i2b2.crc.exportcsv.resultfetchsize50000Number of records retrieved during each database fetch.
CRCedu.harvard.i2b2.crc.exportcsv.filename{{{PROJECT_ID}}}/{{{DATE_yyyyMMdd}}}_{{{FULL_NAME}}}.tsvParameterized template for export file names. If the extension is .zip, the file is zipped.
CRCedu.harvard.i2b2.crc.exportcsv.defaultquotechar"Quote character for export files
CRCedu.harvard.i2b2.crc.exportcsv.workfolder/tmp/i2b2Folder on the i2b2 server for data exports
CRCedu.harvard.i2b2.crc.exportcsv.zipencryptmethodnoneEncryption method for the exported ZIP file. One of STANDARD, NONE, or AES.
PM_PROJECT_PARAMS

Stores the parameters for the

  • Email address of data Manager,
  • Subject line for the data User emails
  • Subject matter of  User email
  • mail address of the data Manager, and data request details for the User and data Manager


Image Added

Info

User receives email entered in the Query Options Email box at the time of data request submission

Design and Architecture

The i2b2 breakdown architecture is modified to support the data table definition and new breakdown types for User created data requests. The database tables have been modified to support the data table definitions.

  • RPDO_TABLE_REQUEST table stores the data table definitions
  • HIVE_CELL_PARAMS has new parameters for global and email configurations and the data file generation location (defaults to the server's drive)
  • QT_RESULT_TYPE table will contain a new entry for each data table definition
  • QT_BREAKDOWN_PATH will contain a new entry for each data table definition which has the data export execution details.
  • QT_XML_RESULT contains metadata about each data request, such as status and e-mail details.  

User Roles and Actions

User visibility to Data Table Creation, Data Request and Data Export is based on User role configuration. Configuration is managed by the Admin user using Admin Dashboard plugin.

Below is the list of Actions available based on the User role. 


User RoleCreate Data TableData RequestData Request Manager View DetailsData Export/Generate Data file Change Request Status
Obfuscated




Aggregate




ProtectedEnabledEnabledEnabled

LDSEnabledEnabledEnabled

Manager/No LDSEnabledEnabledEnabled

Manager/LDSEnabledEnabledEnabledEnabledEnabled
AdminEnabledEnabledEnabledEnabledEnabled


Database Configuration changes

RPDO_TABLE_REQUEST

In order to support the data Table design, a new table RPDO_TABLE_REQUEST is added.  The metadata in Data Table design is stored in the table when the data Table is saved.

Image Added


Three default rows are initially loaded into the RPDO_TABLE_REQUEST, defining patient_dimension columns which are required in every export table. These are:  race_cd, age_in_years_num, and sex_cd

Image Added


QT_BREAKDOWN_PATH

QT_Breakdown_Path table has been modified to include a new column Group_id. It stores the Project ID, in order to support project-scoped data table definitions.

Column NameData Type
Group_IdVarchar (50)


Image Added

Database updates- Data table definition creation

The database is dynamically modified when the user creates the data table definition or submits a data request 

RPDO_TABLE_REQUEST 

Rows for the data table design are dynamically inserted in the RPDO_TABLE_REQUEST when a data table is saved.  

RPDO_TABLE_REQUEST PARAMETERSDATA TABLE Values
TABLE_REQUEST_ID

auto generated incremental value, corresponding to each concept in the data table. Unique for each concept

example: 1282,1283, 1284

TABLE_INSTANCE_ID

unique for each Data table ( same value for all the concepts underneath)

example: 17

TABLE_INSTANCE_NAMEData table definition name
USER_IDdata request user
GROUP_IDproject id
SET_INDEXauto incremented for each concept row for the table instance
C_FACTTABLECOLUMN"concept_cd" for the fact table or "patient_num" for the patient_dimension
C_TABLENAMEcorresponds to table storing concept_cd/patient_num( concept_dimension or patient_dimension )
COLUMN_NAMEThe column name of the data table definition
C_FULLPATHcorresponds to the concept's c_fullname in the ontology table
C_COLUMNNAMEusually "concept_path" for concept_dimension
C_COLUMNDATATYPEcorresponds to ontology table  column value
C_OPERATORcorresponds to ontology table  column value
C_DIMCODEcorresponds to ontology table  column value
AGG_TYPEcorresponds to Aggregate option selection in data table definition
CONSTRAIN_BY_DATE_TOcorresponds to date constraint option selection in data table definition
CONSTRAIN_BY_DATE_FROMcorresponds to date constraint option selection in data table definition
CONSTRAIN_BY_VALUE_OPERATORcorresponds to set value constraint selection in data table definition
CONSTRAIN_BY_VALUE_CONSTRAINTcorresponds to set value constraint selection in data table definition
CONSTRAIN_BY_VALUE_UNIT_OF_MEASUREcorresponds to set value constraint selection in data table definition
CONSTRAIN_BY_VALUE_TYPEcorresponds to set value constraint selection in data table definition
SHAREDY (for project and system shared table definitions); N (User table definition)
DELETE_FLAGY (if deleted) or N 
C_VISUALATTRIBUTESvalues are LA  or LH  (based on visibility table definitions in the panel)
REQUIREDY for default variables ( race, age, sex); N for other variables
JSON_DATA

logs the table definition variables details needed for the  execution of the stored procedure to create the data file

GENERATED_SQLlogs the SQL query that is generated by stored procedure call  for each variable
DELETE_DATEdate of table definition deletion
CREATE_DATEdate of table definition creation
UPDATE_DATEdate table last updated
USE_AS_COHORTN by default; not used currently


Example:

Image Added
QT_QUERY_RESULT_TYPE

The data table is logged as a breakdown type  in the QT_QUERY_RESULT_TYPE table when the user saves the data table definition.


RESULT_TYPE_IDNAMEDESCRIPTIONDISPLAYTYPE_IDVISUAL_ATTRIBUTE_TYPE_IDUSER_ROLE_CDCLASSNAME

Auto generated number

example : 144

RPDO_<TABLE_INSTANCE_ID>

example: RPDO_17

User-created <Table_Name>RequestCATNUM

LU (User tables)

LP (Project shared tables)


DATA_LDSedu.harvard.i2b2.crc.dao.setfinder.QueryResultPatientRequest


Example:

Image Added



Info

We recommend data requests be limited to DATA_LDS users and data exports be limited to MANAGER users. The user role determines if exports/requests are visible and runnable. Refer to section on User Roles and Actions

QT_BREAKDOWN_PATH 

A new row is also added here when the user saves the data table definition. 

  • Name column references the  Table_instance_id when the user saves the table_definition. 
  • Value column contains the command to run the stored procedure that generates the data file.
  • Group_id has the project ID
  • The result_instance_id parameter gets the numeric value from the QUERY_RESULT_INSTANCE for the patientset (result_type_id =1) when the export file is created.


Example:

NAMEVALUEGroup_id

RPDO_<Table_instance_id>

Example: RPDO_17

EXEC i2b2synthea8.dbo.usp_rpdo2 @TABLE_INSTANCE_ID=<Table_instance_id> @RESULT_INSTANCE_ID={{{RESULT_INSTANCE_ID}}} @MIN_ROW=0 @MAX_ROW=10000

SQLServerLarge


Example:

Image Added

Database updates- Data request submission

1.8.2 requests
QT_QUERY_RESULT_INSTANCE

When the User runs a query with User created Data Request breakdown option selected, a row is logged in the QT_Query_Result_Instance for the  result_type_id generated in the QT_RESULT TYPE

Column NameValue (Example)
Result_Type_ id144
Result_instance_id1729


Example:

Image Added

QT_BREAKDOWN_PATH 

The EXEC statement in Value column is updated with value for table_instance_id

Example:

NAMEVALUE

RPDO_<Table_instance_id>

Example: RPDO_144

EXEC i2b2synthea8.dbo.usp_rpdo2 @TABLE_INSTANCE_ID=17 @RESULT_INSTANCE_ID={{{RESULT_INSTANCE_ID}}},@MIN_ROW=0 @MAX_ROW=99999999


{{{RESULT_INSTANCE_ID}}} gets replaced with QT_QUERY_RESULT_INSTANCE.Result_instance_id  as the stored-procedure in the back-end gets executed.

at the time of data file creation.

QT_XML_RESULT

A row is logged for the result_instance_id of the data request submitted at the time of query run. Metadata about data request queries are stored in the QT_XML_RESULT table, in the XML_VALUE field.

Example:

Image Removed

 

1.8.1 Requests

3 new default rows are logged in RPDO_TABLE_REQUEST when user runs a query and submits 1.8.1 data request for the first time . This will enable the 1.8.1 requests to be displayed on data request Manager panel along with 1.8.2 requests. Subsequent requests utilize the same default rows for display.

RPDO_TABLE_REQUEST

Image Removed

QT_QUERY_RESULT_TYPE

A row is logged in QT_RESULT_TYPE for the default table definition. Subsequent data request submissions utilize the same row for the  default rows result_type_id

Image Removed

QT_RESULT_INSTANCE

A row is logged in the QT_Query_Result_Instance for the  same  result_type_id generated in the QT_RESULT TYPE every time user submits a request

Image Removed

QT_XML_RESULT

A row is logged for the result_instance_id of the data request submitted at the time of query run. Metadata about data request queries(including user email address) are stored in the QT_XML_RESULT table, in the XML_VALUE field. The metadata  gets updated as the data export process is completed (from data request submission to data file creation) 

Image Removed

Data file creation

1.8.2 Data file creation

CREATE FILES(s)  executes usp_rpdo2 stored procedure in the back end by calling the EXEC statement in QT_BREAKDOWN_PATH to create the data file.

  • The select statement in the VALUE column is executed to create the data file The result_instance_id value of the query is inserted  dynamically  each time of data file is created for the same table_instance_id. and does not store in the QT_BREAKDOWN_PATH 
1.8.1 Data file creation

CREATE FILES(s)  executes the <data request name>.csv  in QT_BREAKDOWN_PATH.

  • The select statement in the VALUE column is executed for the default table data file creation. The result_instance_id value of the query is inserted  dynamically  each time of data file is created for the same table_instance_id. and does not store in the QT_BREAKDOWN_PATH 
  • The select statement in the VALUE column  of the <data request name>.CSV for the data request is executed to create the data file. 

Data file format/location and Email parameters set-up

HIVE_CELL_PARAMS

The generated Data file format, location and email server parameters are configurable in the HIVE_CELL_PARAMS. 

Info

Wildfly must be restarted for changes to the HIVE_CELL_PARAMs to take effect.

...

Column Name Value (example)
Result_Instance_Id1729
XML_Value <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns10:i2b2_result_envelope xmlns:ns6="http://www.i2b2.org/xsd/cell/crc/psm/analysisdefinition/1.1/" xmlns:ns5="http://www.i2b2.org/xsd/hive/msg/1.1/" xmlns:ns8="http://www.i2b2.org/xsd/cell/pm/1.1/" xmlns:ns7="http://www.i2b2.org/xsd/cell/crc/psm/querydefinition/1.1/" xmlns:ns9="http://www.i2b2.org/xsd/cell/ont/1.1/" xmlns:ns10="http://www.i2b2.org/xsd/hive/msg/result/1.1/" xmlns:ns2="http://www.i2b2.org/xsd/hive/pdo/1.1/" xmlns:ns4="http://www.i2b2.org/xsd/cell/crc/psm/1.1/" xmlns:ns3="http://www.i2b2.org/xsd/cell/crc/pdo/1.1/">
    <body>
        <ns10:result name="RPDO_17">
            <data column="SUBMITTED" type="string">20250808_125429</data>
            <data column="EMAIL" type="string">rmetta@mgb.org</data>
            <data column="QUEUED" type="string">20250808_125631</data>
            <data column="PROCESSING" type="string">20250808_125631</data>
            <data column="DIRECTORY" type="string">/opt/dataexport/SQLServerLarge/724</data>
            <data column="FINISHED" type="string">20250808_125633</data>
            <data column="APPROVEDBY" type="string">demoManager</data>
        </ns10:result>
    </body>
</ns10:i2b2_result_envelope>

Example:

Image Added

 

1.8.1 Requests

3 new default rows are logged in RPDO_TABLE_REQUEST when user runs a query and submits 1.8.1 data request for the first time . This will enable the 1.8.1 requests to be displayed on data request Manager panel along with 1.8.2 requests. Subsequent requests utilize the same default rows for display.

RPDO_TABLE_REQUEST

Image Added

QT_QUERY_RESULT_TYPE

A row is logged in QT_RESULT_TYPE for the default table definition. Subsequent data request submissions utilize the same row for the  default rows result_type_id

Image Added


QT_RESULT_INSTANCE

A row is logged in the QT_Query_Result_Instance for the  same  result_type_id generated in the QT_RESULT TYPE every time user submits a request

Image Added

QT_XML_RESULT

A row is logged for the result_instance_id of the data request submitted at the time of query run. Metadata about data request queries(including user email address) are stored in the QT_XML_RESULT table, in the XML_VALUE field. The metadata  gets updated as the data export process is completed (from data request submission to data file creation) 


Image Added



Data file creation

1.8.2 Data file creation

CREATE FILES(s)  executes usp_rpdo2 stored procedure in the back end by calling the EXEC statement in QT_BREAKDOWN_PATH to create the data file.

  • The select statement in the VALUE column is executed to create the data file The result_instance_id value of the query is inserted  dynamically  each time of data file is created for the same table_instance_id. and does not store in the QT_BREAKDOWN_PATH 
1.8.1 Data file creation

CREATE FILES(s)  executes the <data request name>.csv  in QT_BREAKDOWN_PATH.

  • The select statement in the VALUE column is executed for the default table data file creation. The result_instance_id value of the query is inserted  dynamically  each time of data file is created for the same table_instance_id. and does not store in the QT_BREAKDOWN_PATH 
  • The select statement in the VALUE column  of the <data request name>.CSV for the data request is executed to create the data file. 
Info
PM_PROJECT_PARAMS

Stores the parameters for the

  • Email address of data Manager,
  • Subject line for the data User emails
  • Subject matter of  User email
  • mail address of the data Manager, and data request details for the User and data Manager

Image Removed

User receives email entered in the Query Options Email box at the time of data request submission
Info

Software Changes:

  • Data: Entries in RPDO_TABLE_REQUEST, HIVE_CELL_PARAMS, PM_PROJECT_PARAMS, QT_QUERY_RESULT_TYPE, and QT_BREAKDOWN_PATH, QT_XML_RESULT define the data exporter functionality.
  • Java code: New breakdown classes and updates to existing java classes to support the data exporter functionality.

...

Developers Getting Started With i2b2 getstarted