Page History
...
The Data Exporter functionality enables an i2b2 user to create a data table definition and request patient data for the patients found by a desired query. In the configured workflow, data is requested and generated through new cohort. First, a table of variables is designed using create data table for export tool. The defined data tables are then displayed under the Data Request(s) breakdown types that can be selected in the Run Query dialog, After the query run, the Manager user can view and manage the data requests under the Data Request Manager tool. i2b2 Users will see "data request" breakdowns, which send an e-mail to a predefined data manager requesting a type of data. The data manager (Manager user role) be able to view their data requests status and details . An e-mail is generated for the Data Manager when the request is submitted. The Data Manager processes the request by rerunning generating the query with a special "data export" breakdowndata file under the Data Request Manager tool. The patient data is exported as a file and stored in a specified location for retrieval, and the user is notified by e-mail.
Info |
---|
The patient data |
...
file is generated as per the data table definition design specification |
Data Export Workflow
Data
...
Table Creation - User process
...
The Create Data Table for export feature allows User to design a table of variables of interest using the Design table feature. The table design can then be saved as a template that can be loaded and re- used to request data export.
Design Table
The DESIGN TABLE panel allows i2b2 User to create a data table definition with list of variables of interest.
User logs into the webclient web client and accesses Data Export create data table for export tool under the Analysis Tools plugin
The design table DESIGN TABLE panel displays list of predefined variables that will be included by default in the data export file. These predefined variables are (by default) Gender, Age, Race, and Ethnicity
From the Ontology Terms tab, drag and drop additional variables of interest onto the DESIGN TABLE grid. These variables will be added to the as a list of variables included in the data export. The predefined variables are locked and cannot be unchecked since they are required for every data table. Other variables can be deleted under Actions column to exclude them from the data set file.
Aggregation method for the variable can be applied by clicking on the Value drop down box in the Aggregation Method column to assign aggregate options to each variable
...
Aggregate Options
The availability of aggregation options depends on the type of concept it aggregates on.
...
Aggregation options for a non-numeric variable such as Diagnosis are Existence (Yes/No), Count: Number of Concepts, Count: Number of Dates, Count Number of Encounters, Count Number of Facts, Count Number of Providers, Date (First), Date (Most Recent), Most Frequent Concept (Codes)
Aggregation options for concepts that support numeric values such as labs include Average, Min, Max, etc. in addition to all the functions available to normal variables
Aggregation Options | Explanation |
Existence (Yes/No) | Whether the patient has an observation of this concept. This is the default option. |
Count: Number of Concepts | Total number of concepts |
Count: Number of Dates | Total number of dates for the participant |
Count: Number of Encounters | Total number of encounters for the participant |
Count: Number of Facts | Total number of observations |
Count: Number of Providers | Total number of providers for the participant |
Date |
: First |
Date | Date of earliest observation |
Date |
: Last Date | Date of the most recent observation |
Count
Total number of observations
All Concepts (Names/Text)
All concept names are listed
Most Frequent Concept (Names/Text)
Most frequent concept name
All Concepts (Codes)
All concept codes are listed
Most Frequent Concept (Codes)
Most frequent concept code
Minimum Value
Calc: First Value | Minimum value of all numeric values observations |
Calc: Last Value | Maximum value of all numeric values observations |
Calc: Number of Values | Total numbe of all numeric values observations |
Calc: Average Value | Average value of all numeric values observations |
Calc: Minimum Value |
Lowest value of all numeric values observations |
Calc: Maximum Value |
Highest value of all numeric values observations |
Calc: Median Value |
Median value of all numeric values observations |
Mode (Most Frequent Value)
Most frequent value for numeric, enum, blob observations
List of All Values
List of all values for numeric, enum, blob observations
Preview and Save Table
Preview Table
Table Design can be previewed to get a preview of the Table Design can be previewed to get a preview of the data export file that will be generated.
User clicks on Preview Table PREVIEW TABLE to verify the variables added are displayed displayed in the data columns in the format desired.User clicks on Save menu option to save the table definition under Save Table Definition panel
Info |
---|
DESIGN AND PREVIEW Design and Preview is an interactive, iterative process where you User can add and remove variables . Then assign an Aggregate Method in the DESIGN TABLE tab to the newly added varibles. |
Load Table
Data Export - Request Data Export (User process)
- User logs into webclient and creates and runs a query.
- In the Run Query dialog box, select one or more Data Request checkboxes
- Emails are automatically sent - one to the requester's e-mail, to inform them the request has been made; and one to the specified manager user to inform them to run the user's export request. The query name (in previous queries) is tagged with the query master id, so it can be identified by the manager.
Data Export - Request Data Export (Manager process)
...
The manager logs into the webclient and finds the query from their e-mail by locating the query master id, which is now part of the title of the previous query.
The manager user then re-runs the previous query. In the Run Query dialog, the manager selects the requested export option in the Data export breakdown list.
Info |
---|
It is very important to put the original query master id (or some other identifier that links the export to the request) into the query name. The export letter will contain the query name, and this is the only way to trace the export back to the original request. (Currently there is no way to tie the export directory to the data requester query_master_id.) If the data manager runs the query as a query-in-query, the query name will automatically be populated with this identifier. |
...
The data export runs in the background and the file is generated in a specified folder
...
Sample Letter generated for Data Manager
Results of the i2b2 request entitled - "Demographics_12_48_52", submitted on Apr 18, 2024 1:47:58 PM, are available.
Important notes about your data:
- Total number of patients returned in your data request: 133
- i2b2 reviewer:
Only persons specifically authorized and selected (as listed at the top of this letter) can download these files. If additional user access is needed, please ensure the person is listed on your project IRB protocol and contact the i2b2 team.
Specifically:
- Remove all PHI from computer, laptop, or mobile device after analysis is completed.
- Do NOT share PHI or PII with anyone who is not listed on the IRB protocol.
Your guideline for the storage of Protected Health Information can be found at: https://www.site.com/guidelines_for_protecting_and_storing_phi.pdf
*To download these files*
- You must be logged onto your site
These results are the data that was requested under the authority of the Institutional Review Board. The query resulting in this identified patient data is included at the end of this letter. A copy of this letter is kept on file and is available to the IRB in the event of an audit.
Thank you,
The i2b2 Team
Data Export - Manage Data Export Requests
Example Export files (all data are fake)
Demographics
Diagnosis
Data Export Configuration
Design and Architecture
The i2b2 breakdown architecture has been modified to support new breakdown types for data request and export. HIVE_CELL_PARAMS has new parameters for global configuration, the QT_RESULT_TYPE table has new entries for query export types, and the QT_BREAKDOWN_PATH defines the request and export. These are XML documents containing e-mails to send and SQL to execute. These breakdown types display in the run query dialog. User e-mails are sent to the e-mail address of the currently logged-in use. Data Manager e-mails are sent to an address configured in HIVE_CELL_PARAMS, and the data manager e-mail in an export is also saved as a file in the export location. The export file generation location, also configured in HIVE_CELL_PARAMS, is on the local drive hosting the i2b2 application.
Database changes
Currently QT_BREAKDOWN_PATH .value column has datatype of varchar(2000). In order to support the XML structure needed for the Data Exporter, the value column has been modified to CLOB datatype (ORACLE) and varchar(max) for (SQLServer).
Configuring Database Parameters
Request/Export query types: QT_QUERY_RESULT_TYPE
The data request and data export query types are configured as breakdowns in the QT_QUERY_RESULT_TYPE table. VISUAL_ATTRIBUTE_TYPE_ID should be set to LR for Data request and LX for Data export
USER_ROLE_CD should be set to DATA_LDS for Data Request and USER_ROLE_CD should be MANAGER for Data Export.
...
as well as assign or edit Aggregate options |
Save Table
- Click on Save
- The table definition can be either saved under MY TABLES or PROJECT SHARED TABLES
Info |
---|
MY TABLES are editable and will appear as Run Query options only for the current user PROJECT SHARED TABLES are editable and will appear as Run Query options for any user in the current i2b2 project SYSTEM SHARED TABLES is editable only for Admins and are templates for i2b2 users' tables |
Load Table
Load table definition enables the User to Load a saved table definition
- User clicks on Load menu option to display Load Table Definition with list of user saved definitions
- Select desired table definition and clicks on LOAD button
Info |
---|
Data file definitions from Load Table definitions can be further refined and saved |
Data Request - User process
Request 1.8.2 data exports
- User logs into web client and creates a query.
- In the Run Query dialog box, User created table definitions are displayed under Data Request(s) section
- Select one or more data request checkboxes.
- 1.8.2 requests (custom tables) appear as User Created:<table definition Request> checkboxes
- 1.8.1 data requests (long fact-oriented tables) appear as Request <xx> Data checkboxes.
4. Click run query
5. An email is automatically sent to the data manager to inform them of a new data request.
6. The table definition name with the FINISHED status is displayed as one of the breakdown items in the previous query results.
Data Request Manager
User's Data Requests List
User-submitted data requests (both 1.8.2 and 1.8.1) are logged under the Data Request Manager tool.
- Click on Tools=> Data Request Manager plugin
The Data Request Manager displays data requests along with their status and details
Info |
---|
For non-managers, only the user's data requests are displayed under the Request Data Manager. |
View Data Request Details
2. Click on View Details
View Details displays the Request details and Request status as well as other options
- Query Name (Click on Query ID number to display the previous query in the Find Patients window)
- Data Table Definition (Click on View under Data Request Type drop down Request item)
- Option to Withdraw Request
- Option to enter Comments
- Log info box displays the User's actions. and log of the Request status
Data Export - Manager process
The User-created data requests are managed by the Data Manager using the Data Request Manager tool.
Data Request Manager
Data Requests List
- Click on Tools=> Data Request Manager plugin.
List of User submitted data requests are displayed . The initial status of the requests is Submitted.
The Manager User can view the details of the Requests using VIEW DETAILS button and generate the Data files using Create File(s) button)
View Data Request Details
Click on View Details.
The View Details panel is similar in display for non-Manager user, except the Manager User can change the Data Request Status..
The Log info box displays the status of the Data Request as well as the Data File creation status.
Create Data File(s)
Click on Create File(s) on the Data Request Manager page
The data export runs in the background and the file is generated in a specified folder defined in the HIVE_CELL_PARAMS (see Configuration below). Data file generation status is displayed under the Status column under Data Request Manager
Status is updated as the Data File is processed, from Submitted to File in Progress to File Available. When the file is available, the requesting user is informed via e-mail. The process for giving the user access to the file will be site-specific.
Example Export files (all data are fake)
Two files are generated for each data table exported.
Definition file: has the variable names and data types for the data file
Data file: has the values for the variables
User-created Demographics
Definition file
Example:1969_Definition_For_Demographics_8_5_25.csv
TABLE_INSTANCE_ID | TABLE_INSTANCE_NAME | USER_ID | GROUP_ID | SET_INDEX | C_FACTTABLECOLUMN | C_TABLENAME | COLUMN_NAME | C_FULLPATH | C_COLUMNNAME | C_COLUMNDATATYPE | C_OPERATOR | C_DIMCODE | AGG_TYPE | CONSTRAIN_BY_DATE_TO | CONSTRAIN_BY_DATE_FROM | CONSTRAIN_BY_VALUE_OPERATOR | CONSTRAIN_BY_VALUE_CONSTRAINT | CONSTRAIN_BY_VALUE_UNIT_OF_MEASURE | CONSTRAIN_BY_VALUE_TYPE | CREATE_DATE |
7 | DEM_6_27 | demoManager | SQLServerLarge | 0 | sex_cd | patient_dimension | Gender | @ | @ | @ | @ | @ | Value | 37:14.3 | ||||||
7 | DEM_6_27 | demoManager | SQLServerLarge | 1 | age_in_years_num | patient_dimension | Age | @ | @ | @ | @ | @ | Value | 37:14.3 | ||||||
7 | DEM_6_27 | demoManager | SQLServerLarge | 2 | race_cd | patient_dimension | Race | @ | @ | @ | @ | @ | Value | 37:14.3 | ||||||
7 | DEM_6_27 | demoManager | SQLServerLarge | 3 | patient_num | patient_dimension | (children) < 18 years old | \\ACT_DEMO\ACT\Demographics\Age\< 18 years old\ | birth_date | N | > | dateadd(YY, -18, GETDATE()) | Exists | 37:14.3 |
Data file
Example: 1969_Demographics_8_5_25_08052025.csv
patient_num | Gender | Age | Race | (children) < 18 years old | 75-84 years old | >= 65 years old |
26 | M | 72 | @ | No | Yes | Yes |
53 | F | 55 | WHITE | No | No | No |
87 | F | 59 | WHITE | No | No | No |
92 | F | 88 | DECLINED | No | No | Yes |
94 | F | 67 | ASIAN | No | No | Yes |
153 | M | 77 | @ | No | Yes | Yes |
User-created Labs
Definition file
Example: 2282_Definition_For_LAB_TESTS_HUGEPT_AGGOPTIONS
TABLE_INSTANCE_ID | TABLE_INSTANCE_NAME | USER_ID | GROUP_ID | SET_INDEX | C_FACTTABLECOLUMN | C_TABLENAME | COLUMN_NAME | C_FULLPATH | C_COLUMNNAME | C_COLUMNDATATYPE | C_OPERATOR | C_DIMCODE | AGG_TYPE | CONSTRAIN_BY_DATE_TO | CONSTRAIN_BY_DATE_FROM | CONSTRAIN_BY_VALUE_OPERATOR | CONSTRAIN_BY_VALUE_CONSTRAINT | CONSTRAIN_BY_VALUE_UNIT_OF_MEASURE | CONSTRAIN_BY_VALUE_TYPE | CREATE_DATE |
310 | LAB_TESTS_HUGEPT_AGGOPTIONS | act | NODE9 | 0 | sex_cd | patient_dimension | Gender | @ | @ | @ | @ | @ | Value | 25:57.3 | ||||||
310 | LAB_TESTS_HUGEPT_AGGOPTIONS | act | NODE9 | 1 | age_in_years_num | patient_dimension | Age | @ | @ | @ | @ | @ | Value | 25:57.3 | ||||||
310 | LAB_TESTS_HUGEPT_AGGOPTIONS | act | NODE9 | 2 | race_cd | patient_dimension | Race | @ | @ | @ | @ | @ | Value | 25:57.3 | ||||||
310 | LAB_TESTS_HUGEPT_AGGOPTIONS | act | NODE9 | 3 | race_cd | patient_dimension | Ethnicity | @ | @ | @ | @ | @ | Value | 25:57.3 | ||||||
310 | LAB_TESTS_HUGEPT_AGGOPTIONS | act | NODE9 | 4 | concept_cd | concept_dimension | Cholesterol (Group:CHOL) | \\i2b2_LABS\i2b2\Labtests\LAB\(LLB16) Chemistry\(LLB17) Lipid Tests\CHOL\ | concept_path | N | like | \i2b2\Labtests\LAB\(LLB16) Chemistry\(LLB17) Lipid Tests\CHOL\ | NumFacts | 25:57.3 |
Data file
Example: 2282_LAB_TESTS_HUGEPT_AGGOPTIONS_08122025
patient_num | Gender | Age | Race | Ethnicity | Cholesterol (Group:CHOL) |
9 | M | 49 | white | white | 1 |
89 | F | 57 | white | white | 1 |
13 | F | 53 | white | white | 1 |
21 | F | 72 | white | white | 7 |
77 | F | 80 | white | white | 1 |
Data Export Configuration
Data file format/location and Email parameters set-up
HIVE_CELL_PARAMS
The generated Data file format, location and email server parameters are configurable in the HIVE_CELL_PARAMS.
Info |
---|
Wildfly must be restarted for changes to the HIVE_CELL_PARAMs to take effect. |
Cell ID | Parameter Name | Example Value | Notes |
CRC | edu.harvard.i2b2.crc.smtp.host | smtp.partners.org | SMTP host |
CRC | edu.harvard.i2b2.crc.smtp.port | 25 | SMTP port |
CRC | edu.harvard.i2b2.crc.smtp.ssl.enabled | FALSE | TRUE will enable SSL |
...
We recommend data requests be limited to DATA_LDS users (irrespective of Admin path role - admin, user, manager) and data exports be limited to MANAGER users (irrespective of data path role - data_lds, data_agg,data_obf, data_deid). The user role determines if exports/requests are visible and runnable.
...
DATA_LDS
...
DATA_LDS
...
Success
...
non-display
...
MANAGER
...
MANAGER
...
non-display
...
Success
...
MANAGER with
DATA_LDS
...
MANAGER
...
Success
...
MANAGER
with DATA_LDS
...
DATA_LDS
...
Success
Request/Export file definition details: QT_BREAKDOWN_PATH
QT_BREAKDOWN_PATH defines e-mail content (for request and export) and SQL code (for export) in the following XML format. Any parameters defined in an export XML override the default in HIVE_CELL_PARAMS (e.g., field separator character.)
...
- <request name>_REQUEST defines the request email generation parameters.
- Request email can be configured in PM_USER_DATA parameters to define the email address of the User.
- User email content configurable in <RequesterEmailMessage>/>xml value parameter
- Data Manager gets the data manager email (email address configured in the HIVE_CELL_PARAMS)
- Data manager email content configurable in the <DataManagerEmailMessage>/xml value parameter
...
<?xml version="1.0"?>
<ValueExporter>
<Version>[...version number...]</Version>
<CreationDateTime>08/09/2024 12:00:00</CreationDateTime>
<RequesterEmailMessage>[...details sent to User on data request...] </RequesterEmailMessage>
<DataManagerEmailMessage>[...details sent to data Manager on user data request...]</DataManagerEmailMessage>
</ValueExporter>
...
- <request name>_CSV defines the data export email parameters and the SQL to for the export file generation.
- The data export email has detailed info on the file location and access info and can be customized according to your local institution's policies.
- The exported file(s) are generated in a secure location on the server's file system along with the data export email file in a zipped folder
- Email content and export SQL are configurable in the XML entries for the request/export in the Value parameters.
- Multiple SQL statements can be configured in the Value <file></file> tags as one SQL statement per file tag
...
Example Requests
...
Export file format/location and Email parameters set-up: HIVE_CELL_PARAMS
Export File format, location and email server parameters are configurable in the HIVE_CELL_PARAMS.
Info |
---|
The <Filename> parameter in the QT_BREAKDOWN_PATH definition will override the default value specified in the HIVE_CELL_PARAMS. Wildfly must be restarted for changes to the HIVE_CELL_PARAMs to take effect. |
Cell ID | Parameter Name | Example Value | Notes | CRC | edu.harvard.i2b2.crc.exportcsv.datamanageremail | userid@partners.org | Email address used for sending request/export e-mails
CRC | edu.harvard.i2b2.crc.smtp. | smtp.partners.org | SMTP host|
auth | FALSE | TRUE will enable SMTP authentication | |
CRC | edu.harvard.i2b2.crc.smtp. | ||
username | |||
none | SMTP | ||
username (required for SMTP authentication) | |||
CRC | edu.harvard.i2b2.crc.smtp. | ||
password | FALSE | TRUE will enable SSL||
none | SMTP password (required for SMTP authentication) | ||
CRC | edu.harvard.i2b2.crc.smtp. | ||
enabled | FALSE | TRUE will enable | |
e-mails | |||
CRC | edu.harvard.i2b2.crc. | ||
exportcsv. | |||
defaultescapecharacter | none | SMTP username (required for SMTP authentication)||
" | Escape character for export files | ||
CRC | edu.harvard.i2b2.crc. | ||
exportcsv. | |||
maxfetchrows | none | SMTP password (required for SMTP authentication)||
-1 | Maximum number of rows to export, or -1 for no limit | ||
CRC | edu.harvard.i2b2.crc. | ||
exportcsv. | |||
defaultlineend | FALSE | TRUE will enable e-mails||
\n | Line ending for export files | ||
CRC | edu.harvard.i2b2.crc. | ||
exportcsv. | Data Manager | Name that e-mails will be sent from.||
defaultseperator | \t | Field separator for export files | |
CRC | edu.harvard.i2b2.crc. | ||
exportcsv. | datamanager@site.org | E-mail address that e-mails will be sent from. | CC|
resultfetchsize | 50000 | Number of records retrieved during each database fetch. | |
CRC | edu.harvard.i2b2.crc. | ||
exportcsv. | i2b2 Data Request | Subject line for e-mails||
filename | {{{PROJECT_ID}}}/{{{DATE_yyyyMMdd}}}_{{{FULL_NAME}}}.tsv | Parameterized template for export file names. If the extension is .zip, the file is zipped. | |
CRC | edu.harvard.i2b2.crc.exportcsv. | ||
defaultquotechar | " | ||
Quote character for export files | |||
CRC | edu.harvard.i2b2.crc.exportcsv. | ||
workfolder | /tmp/i2b2 | Folder on the i2b2 server for data exports | -1 | Maximum number of rows to export, or -1 for no limit
CRC | edu.harvard.i2b2.crc.exportcsv. | ||
zipencryptmethod | \n | Line ending for export files | |
CRC | edu.harvard.i2b2.crc.exportcsv.defaultseperator | \t | Field separator for export files |
CRC | edu.harvard.i2b2.crc.exportcsv.resultfetchsize | 50000 | Number of records retrieved during each database fetch. |
CRC | edu.harvard.i2b2.crc.exportcsv.filename | {{{PROJECT_ID}}}/{{{DATE_yyyyMMdd}}}_{{{FULL_NAME}}}.tsv | Parameterized template for export file names. If the extension is .zip, the file is zipped. |
CRC | edu.harvard.i2b2.crc.exportcsv.defaultquotechar | " | Quote character for export files |
CRC | edu.harvard.i2b2.crc.exportcsv.workfolder | /tmp/i2b2 | Folder on the i2b2 server for data exports |
CRC | edu.harvard.i2b2.crc.exportcsv.zipencryptmethod | none | Encryption method for the exported ZIP file. One of STANDARD, NONE, or AES. |
Export query definition variables
These variables can be used in HIVE_CELL_PARAMS entries and in query export definitions:
none | Encryption method for the exported ZIP file. One of STANDARD, NONE, or AES. |
PM_PROJECT_PARAMS
The parameters for the emails sent to the project's Data Manager about a data request and to the requesting user when export is complete are set on a per-project basis.
Parameter Name | Example Value | Notes |
---|---|---|
Data Request Template | This user {{{USER_NAME}}} in project {{{PROJECT_ID}}} requested ... | Template text for the body of the data request message; supports placeholders like {{{USER_NAME}}} and {{{PROJECT_ID}}} . |
Data Request Email Address | someone@somewhere.org | E-mail address of the Data Manager, where the data request e-mail will be sent. |
Data Request Letter | "Results of the i2b2 request entitled - '{{{QUERY_NAME}}}', ..." | Template text for the e-mail to the user announcing the successful completion of the data export; supports placeholders like {{{QUERY_NAME}}} . |
Data Request Subject | i2b2 Data Request | Subject line for the data request e-mail. |
Info |
---|
The e-mail address for the data request letter (sent when export is complete) is entered in the Query Options Email box at the time of data request submission |
Template substitution variables (parameters/placeholders inside VALUE)
Placeholder | Meaning (inferred) | Example |
---|---|---|
{{{USER_NAME}}} | Display name / username of the requester | Jon Smith |
{{{PROJECT_ID}}} | i2b2 project ID in which the request was run | ACT |
{{{QUERY_NAME}}} | Name/title of the query the user executed | Hypertension Adults |
{{{QUERY_STARTDATE}}} | Timestamp when the query was submitted | 2025-09-19 09:42 |
Value | Description | |
{{{USER_NAME}}} | Current user running query | |
{{{PROJECT_ID}}} | The project name | |
{{{RANDOM_xxx}}} | A random integer where xxx is the max integer size | |
{{{DATE_xxx}}} | Date/Time format, where xxx can be from the table below. The string yyyy-MM-dd would output dates that look like 2022-07-26 | |
{{{QUERY_NAME}}} | Query Name | |
{{{QUERY_STARTDATE}}} | Query Start date/time | |
{{{QUERY_ENDDATE}}} | Query End date/time | |
{{{PATIENT_COUNT}}} | Number of patients. | {{{FULL_NAME}}} | Fullname of user who ran query
{{{QUERY_MASTER_ID}}} | ||
Internal i2b2 Query Master ID of the request | 5412 | |
{{{ | ||
PATIENT_ | ||
COUNT}}} | Time to run the query | |
{{{FULL_SCHEMA }}} | Used in SQL export definitions. Inserts the full schema name (e.g., 'i2b2.dbo.'). Needed to be sure i2b2 accesses the correct data tables! |
Table of allowable characters in {{{DATE_xxx}}} strings:
...
Request and Export Results:
Metadata about data request and export queries are stored in the QT_XML_RESULT table, in the XML_VALUE field:
...
Number of patients returned by the request | 1999 |
Design and Architecture
The i2b2 breakdown architecture is modified to support the data table definition and new breakdown types for User created data requests. The database tables have been modified to support the data table definitions.
- RPDO_TABLE_REQUEST table stores the data table definitions
- HIVE_CELL_PARAMS has new parameters for global and email configurations and the data file generation location (defaults to the server's drive)
- PM_PROJECT_PARAMS has new parameters for e-mail notifications of data export
- QT_RESULT_TYPE table will contain a new entry for each data table definition
- QT_BREAKDOWN_PATH will contain a new entry for each data table definition which has the data export execution details.
- QT_XML_RESULT contains metadata about each data request, such as status and e-mail details.
User Roles and Actions
User visibility to Data Table Creation, Data Request and Data Export is based on User role configuration. Configuration is managed by the Admin user using Admin Dashboard plugin.
Below is the list of Actions available based on the User role.
User Role | Create Data Table | Data Request | Data Request Manager View Details | Data Export/Generate Data file | Change Request Status |
---|---|---|---|---|---|
Obfuscated | |||||
Aggregate | |||||
Protected | Enabled | Enabled | Enabled | ||
LDS | Enabled | Enabled | Enabled | ||
Manager/No LDS | Enabled | Enabled | Enabled | ||
Manager/LDS | Enabled | Enabled | Enabled | Enabled | Enabled |
Admin | Enabled | Enabled | Enabled | Enabled | Enabled |
Database Configuration changes
RPDO_TABLE_REQUEST
In order to support the data Table design, a new table RPDO_TABLE_REQUEST is added. The metadata in Data Table design is stored in the table when the data Table is saved.
Three default rows are initially loaded into the RPDO_TABLE_REQUEST, defining patient_dimension columns which are required in every export table. These are: race_cd, age_in_years_num, and sex_cd
QT_BREAKDOWN_PATH
QT_Breakdown_Path table has been modified to include a new column Group_id. It stores the Project ID, in order to support project-scoped data table definitions.
Column Name | Data Type |
---|---|
Group_Id | Varchar (50) |
Database updates- Data table definition creation
The database is dynamically modified when the user creates the data table definition or submits a data request
RPDO_TABLE_REQUEST
Rows for the data table design are dynamically inserted in the RPDO_TABLE_REQUEST when a data table is saved.
RPDO_TABLE_REQUEST PARAMETERS | DATA TABLE Values |
---|---|
TABLE_REQUEST_ID | auto generated incremental value, corresponding to each concept in the data table. Unique for each concept example: 1282,1283, 1284 |
TABLE_INSTANCE_ID | unique for each Data table ( same value for all the concepts underneath) example: 17 |
TABLE_INSTANCE_NAME | Data table definition name |
USER_ID | data request user |
GROUP_ID | project id |
SET_INDEX | auto incremented for each concept row for the table instance |
C_FACTTABLECOLUMN | "concept_cd" for the fact table or "patient_num" for the patient_dimension |
C_TABLENAME | corresponds to table storing concept_cd/patient_num( concept_dimension or patient_dimension ) |
COLUMN_NAME | The column name of the data table definition |
C_FULLPATH | corresponds to the concept's c_fullname in the ontology table |
C_COLUMNNAME | usually "concept_path" for concept_dimension |
C_COLUMNDATATYPE | corresponds to ontology table column value |
C_OPERATOR | corresponds to ontology table column value |
C_DIMCODE | corresponds to ontology table column value |
AGG_TYPE | corresponds to Aggregate option selection in data table definition |
CONSTRAIN_BY_DATE_TO | corresponds to date constraint option selection in data table definition |
CONSTRAIN_BY_DATE_FROM | corresponds to date constraint option selection in data table definition |
CONSTRAIN_BY_VALUE_OPERATOR | corresponds to set value constraint selection in data table definition |
CONSTRAIN_BY_VALUE_CONSTRAINT | corresponds to set value constraint selection in data table definition |
CONSTRAIN_BY_VALUE_UNIT_OF_MEASURE | corresponds to set value constraint selection in data table definition |
CONSTRAIN_BY_VALUE_TYPE | corresponds to set value constraint selection in data table definition |
SHARED | Y (for project and system shared table definitions); N (User table definition) |
DELETE_FLAG | Y (if deleted) or N |
C_VISUALATTRIBUTES | values are LA or LH (based on visibility table definitions in the panel) |
REQUIRED | Y for default variables ( race, age, sex); N for other variables |
JSON_DATA | logs the table definition variables details needed for the execution of the stored procedure to create the data file |
GENERATED_SQL | logs the SQL query that is generated by stored procedure call for each variable |
DELETE_DATE | date of table definition deletion |
CREATE_DATE | date of table definition creation |
UPDATE_DATE | date table last updated |
USE_AS_COHORT | N by default; not used currently |
Example:
QT_QUERY_RESULT_TYPE
The data table is logged as a breakdown type in the QT_QUERY_RESULT_TYPE table when the user saves the data table definition.
RESULT_TYPE_ID | NAME | DESCRIPTION | DISPLAYTYPE_ID | VISUAL_ATTRIBUTE_TYPE_ID | USER_ROLE_CD | CLASSNAME |
Auto generated number example : 144 | RPDO_<TABLE_INSTANCE_ID> example: RPDO_17 | User-created <Table_Name>Request | CATNUM | LU (User tables) LP (Project shared tables) | DATA_LDS | edu.harvard.i2b2.crc.dao.setfinder.QueryResultPatientRequest |
Example:
Info |
---|
We recommend data requests be limited to DATA_LDS users and data exports be limited to MANAGER users. The user role determines if exports/requests are visible and runnable. Refer to section on User Roles and Actions |
QT_BREAKDOWN_PATH
A new row is also added here when the user saves the data table definition.
- Name column references the Table_instance_id when the user saves the table_definition.
- Value column contains the command to run the stored procedure that generates the data file.
- Group_id has the project ID
- The result_instance_id parameter gets the numeric value from the QUERY_RESULT_INSTANCE for the patientset (result_type_id =1) when the export file is created.
Example:
NAME | VALUE | Group_id |
---|---|---|
RPDO_<Table_instance_id> Example: RPDO_17 | EXEC i2b2synthea8.dbo.usp_rpdo2 @TABLE_INSTANCE_ID=<Table_instance_id> @RESULT_INSTANCE_ID={{{RESULT_INSTANCE_ID}}} @MIN_ROW=0 @MAX_ROW=10000 | SQLServerLarge |
Example:
Database updates- Data request submission
1.8.2 requests
QT_QUERY_RESULT_INSTANCE
When the User runs a query with User created Data Request breakdown option selected, a row is logged in the QT_Query_Result_Instance for the result_type_id generated in the QT_RESULT TYPE
Column Name | Value (Example) |
---|---|
Result_Type_ id | 144 |
Result_instance_id | 1729 |
Example:
QT_BREAKDOWN_PATH
The EXEC statement in Value column is updated with value for table_instance_id
Example:
NAME | VALUE |
---|---|
RPDO_<Table_instance_id> Example: RPDO_144 | EXEC i2b2synthea8.dbo.usp_rpdo2 @TABLE_INSTANCE_ID=17 @RESULT_INSTANCE_ID={{{RESULT_INSTANCE_ID}}},@MIN_ROW=0 @MAX_ROW=99999999 |
{{{RESULT_INSTANCE_ID}}} gets replaced with QT_QUERY_RESULT_INSTANCE.Result_instance_id as the stored-procedure in the back-end gets executed.
at the time of data file creation.
QT_XML_RESULT
A row is logged for the result_instance_id of the data request submitted at the time of query run. Metadata about data request queries are stored in the QT_XML_RESULT table, in the XML_VALUE field. The metadata gets updated as the data export process is completed (from data request submission to data file creation)
Column Name | Value (example) |
---|---|
Result_Instance_Id | 1729 |
XML_Value | <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <ns10:i2b2_result_envelope xmlns:ns6="http://www.i2b2.org/xsd/cell/crc/psm/analysisdefinition/1.1/" xmlns:ns5="http://www.i2b2.org/xsd/hive/msg/1.1/" xmlns:ns8="http://www.i2b2.org/xsd/cell/pm/1.1/" xmlns:ns7="http://www.i2b2.org/xsd/cell/crc/psm/querydefinition/1.1/" xmlns:ns9="http://www.i2b2.org/xsd/cell/ont/1.1/" xmlns:ns10="http://www.i2b2.org/xsd/hive/msg/result/1.1/" xmlns:ns2="http://www.i2b2.org/xsd/hive/pdo/1.1/" xmlns:ns4="http://www.i2b2.org/xsd/cell/crc/psm/1.1/" xmlns:ns3="http://www.i2b2.org/xsd/cell/crc/pdo/1.1/"> <body> <ns10:result name="RPDO_17"> <data column="SUBMITTED" type="string">20250808_125429</data> <data column="EMAIL" type="string">rmetta@mgb.org</data> <data column="QUEUED" type="string">20250808_125631</data> <data column="PROCESSING" type="string">20250808_125631</data> <data column="DIRECTORY" type="string">/opt/dataexport/SQLServerLarge/724</data> <data column="FINISHED" type="string">20250808_125633</data> <data column="APPROVEDBY" type="string">demoManager</data> </ns10:result> </body> </ns10:i2b2_result_envelope> |
Example:
1.8.1 Requests
3 new default rows are logged in RPDO_TABLE_REQUEST when user runs a query and submits 1.8.1 data request for the first time . This will enable the 1.8.1 requests to be displayed on data request Manager panel along with 1.8.2 requests. Subsequent requests utilize the same default rows for display.
RPDO_TABLE_REQUEST
QT_QUERY_RESULT_TYPE
A row is logged in QT_RESULT_TYPE for the default table definition. Subsequent data request submissions utilize the same row for the default rows result_type_id
QT_RESULT_INSTANCE
A row is logged in the QT_Query_Result_Instance for the same result_type_id generated in the QT_RESULT TYPE every time user submits a request
QT_XML_RESULT
A row is logged for the result_instance_id of the data request submitted at the time of query run. Metadata about data request queries(including user email address) are stored in the QT_XML_RESULT table, in the XML_VALUE field. The metadata gets updated as the data export process is completed (from data request submission to data file creation)
Data file creation
1.8.2 Data file creation
CREATE FILES(s) executes usp_rpdo2 stored procedure in the back end by calling the EXEC statement in QT_BREAKDOWN_PATH to create the data file.
- The select statement in the VALUE column is executed to create the data file The result_instance_id value of the query is inserted dynamically each time of data file is created for the same table_instance_id. and does not store in the QT_BREAKDOWN_PATH
1.8.1 Data file creation
CREATE FILES(s) executes the <data request name>.csv in QT_BREAKDOWN_PATH.
- The select statement in the VALUE column is executed for the default table data file creation. The result_instance_id value of the query is inserted dynamically each time of data file is created for the same table_instance_id. and does not store in the QT_BREAKDOWN_PATH
- The select statement in the VALUE column of the <data request name>.CSV for the data request is executed to create the data file.
Info |
---|
Currently,1.8.1 request submission send out one email per request selected to the User. |
Software Changes:
- Data: Entries in RPDO_TABLE_REQUEST, HIVE_CELL_PARAMS, PM_PROJECT_
Software Changes:
- Data: New entries in HIVE_CELL_PARAMS, QT_QUERY_RESULT_TYPE, and QT_BREAKDOWN_PATH, QT_XML_RESULT define the data exporter configurationfunctionality.
- Java code: New breakdown classes and updates to existing java classes to support the data exporter functionality.
...