Overview
The Data Exporter functionality enables an i2b2 user to request patient data for the patients found by a desired query. In the configured workflow, data is requested and generated through new breakdown types that can be selected in the Run Query dialog. Users will see "data request" breakdowns, which send an e-mail to a predefined data manager requesting a type of data. The data manager (Manager user role) processes the request by rerunning the query with a special "data export" breakdown. The patient data is exported as a file and stored in a specified location for retrieval.
The patient data export types, request options, and file specifications are configured in the database using specific request type entries and configuration parameters in the database.
Data Export Workflow
Data Request (User Process)
- User logs into webclient and creates and runs a query.
- In the Run Query dialog box, select one or more Data Request checkboxes
- An email is automatically sent to specified manager user to run the user's export request, and the previous query name is tagged with the query master id.
Data Export (Manager Process)
The manager logs into the webclient and finds the query from their e-mail by locating the query master id, which is now part of the title of the previous query.
Manager user then re-runs the previous query and clicks on export. The export option is displayed selected in the Data export breakdown list.
Data export runs in the background and file is generated in a specified folder
- Manger user manually sends a confirmation email to the user on the file generation and retrieval info.
Sample LetterResults of the i2b2 request entitled - "Demographics_12_48_52", submitted on Apr 18, 2024 1:47:58 PM, are available. Important notes about your data: Only persons specifically authorized and selected (as listed at the top of this letter) can download these files. If additional user access is needed, please ensure the person is listed on your project IRB protocol and contact the i2b2 team. Specifically: Your guideline for the storage of Protected Health Information can be found at: https://www.site.com/guidelines_for_protecting_and_storing_phi.pdf *To download these files* These results are the data that was requested under the authority of the Institutional Review Board. The query resulting in this identified patient data is included at the end of this letter. A copy of this letter is kept on file and is available to the IRB in the event of an audit. Thank you, The i2b2 Team |
---|
Sample Letter generated for Data Manager
Example Export files (all data are fake)
Demographics
Diagnosis
Data Export Configuration
Design and Architecture
The i2b2 breakdown architecture has been modified to support new breakdown types for data request and export. These display in the run query dialog corresponding to the entries in the QT_RESULT_TYPE table. When a manager user exports user request option, an export file is generated based on the entries in QT_BREAKDOWN_PATH table. Email notification sent to the manager is also based on the entries in the QT_BREAKDOWN_PATH table. The generated file is formatted as per the entries in HIVE_CELL_PARAMS.
Database changes
Currently QT_BREAKDOWN_PATH .value column has datatype of varchar(2000). In order to support the XML structure needed for the Data Exporter, the value column has been modified to CLOB datatype.
Configuring Database Parameters
Request/Export query types: QT_QUERY_RESULT_TYPE
The data request and data export query types are configured as breakdowns in the QT_QUERY_RESULT_TYPE table. VISUAL_ATTRIBUTE_TYPE_ID should be set to LR for Data request and LX for Data export
USER_ROLE_CD should be set to DATA_LDS for Data Request and USER_ROLE_CD should be MANAGER for Data Export.
NAME | DESCRIPTION | DISPLAYTYPE_ID | VISUAL_ATTRIBUTE_TYPE_ID | USER_ROLE_CD | CLASSNAME |
PATIENT_DEMOGRAPHIC_REQUEST | Request Demographics Data | CATNUM | LR | DATA_LDS | edu.harvard.i2b2.crc.dao.setfinder.QueryResultPatientRequest |
PATIENT_DEMOGRAPHIC_CSV | Export Demographics Data | CATNUM | LX | MANAGER | edu.harvard.i2b2.crc.dao.setfinder.QueryResultPatientDownload |
Request/Export file definition details: QT_BREAKDOWN_PATH
QT_BREAKDOWN_PATH defines e-mail content (for request and export) and SQL code (for export) in the following XML format:
Name | Value |
---|---|
| <?xml version="1.0"?> <ValueExporter> <Version>[...version number...]</Version> <CreationDateTime>08/09/2024 12:00:00</CreationDateTime> <RequestLetter>[...details sent to Manager on user data request...] </RequestLetter> </ValueExporter> |
| <?xml version="1.0"?> <ValueExporter> <Version>[...version number...]</Version> <CreationDateTime><date value></CreationDateTime> <RequesterEmailLetter>details sent to user on data request</RequesterEmailLetter> <LetterFilename><name of the file></LetterFilename> <Letter> <letter content></Letter> <Table> <Filename>filename</Filename> <Query>[... SQL query to generate the data export content...]</Query> <SeparatorCharacter>[...file format seperator...]</SeparatorCharacter> </Table> </ValueExporter> |
The data manager receives both the Requester Email and the Request Emails. Data Manager will then forward the Requester email to the data user that initiated the request.
Email content and export SQL are configurable in the XML entries for the request/export definitions.
Example Request types
NAME | VALUE |
PATIENT_MEDICATION_CSV | <?xml version="1.0"?> <ValueExporter> <Version>3.02</Version> <CreationDateTime>08/09/2024 12:00:00</CreationDateTime> <RequesterEmailLetter>Your request on {{{PROJECT_ID}}} requested i2b2 request entitled - "{{{QUERY_NAME}}}", submitted on {{{QUERY_STARTDATE}}}, with the query master of {{{QUERY_ID}}}</RequesterEmailLetter> <RequestLetter>This user {{{USER_NAME}}} in project {{{PROJECT_ID}}} requested i2b2 request entitled - "{{{QUERY_NAME}}}", submitted on {{{QUERY_STARTDATE}}}, with the query master of {{{QUERY_ID}}}. </RequestLetter> <LetterFilename>/{{{USER_NAME}}}/{{{QUERY_MASTER_ID}}}/Readme.txt</LetterFilename> <Letter> Results of the i2b2 request entitled - "{{{QUERY_NAME}}}", submitted on {{{QUERY_STARTDATE}}}, are available. Important notes about your data: - Total number of patients returned in your data request: {{{PATIENT_COUNT}}} - i2b2 reviewer: Only persons specifically authorized and selected (as listed at the top of this letter) can download these files. If additional user access is needed, please ensure the person is listed on your project IRB protocol and contact the i2b2 team. Specifically: - Remove all PHI from computer, laptop, or mobile device after analysis is completed. - Do NOT share PHI or PII with anyone who is not listed on the IRB protocol. Your guideline for the storage of Protected Health Information can be found at: https://www.site.com/guidelines_for_protecting_and_storing_phi.pdf *To download these files* - You must be logged onto your site These results are the data that was requested under the authority of the Institutional Review Board. The query resulting in this identified patient data is included at the end of this letter. A copy of this letter is kept on file and is available to the IRB in the event of an audit. Thank you, The i2b2 Team </Letter> <Table> <Filename>/{{{USER_NAME}}}/{{{QUERY_MASTER_ID}}}/Medication.csv</Filename> <Query>SELECT to_char(a.PATIENT_NUM) as "I2B2_PATIENT_NUMBER" ,a.start_date as "START_DATE" ,a.start_date as "END_DATE" ,b.name_char as "MEDICATION_NAME" ,b.concept_cd as "NDC_CODE" ,a.units_cd as "UNIT" ,a.quantity_num as "DOSE_QUANTITY" ,a.modifier_cd as "MODIFIER" ,m.name_char as "MODIFIER_NAME" ,a.instance_num as "INSTANCE_NUM" ,a.location_cd as "FACILITY" ,case v.inout_cd when 'O' then 'Outpatient' when 'I' then 'Inpatient' else 'Unknown' end as "ENCOUNTER_TYPE" ,p.name_char as "PROVIDER" ,a.encounter_num as "ENCOUNTER_NUMBER" FROM observation_fact a INNER JOIN concept_dimension b on a.concept_cd = b.concept_cd and b.concept_path like '\i2b2\Medications\%' JOIN {{{DX}}} c on a.patient_num = c.patient_num LEFT OUTER JOIN provider_dimension p on a.provider_id = p.provider_id JOIN visit_dimension v on a.encounter_num = v.encounter_num and a.patient_num = v.patient_num LEFT OUTER JOIN modifier_dimension m on a.modifier_cd = m.modifier_cd</Query> <SeparatorCharacter>\t</SeparatorCharacter> </Table> </ValueExporter> |
PATIENT_MEDICATION_REQUEST | <?xml version="1.0"?> <ValueExporter> <Version>3.02</Version> <CreationDateTime>08/09/2024 12:00:00</CreationDateTime> <RequesterEmailLetter>Your request on {{{PROJECT_ID}}} requested i2b2 request entitled - "{{{QUERY_NAME}}}", submitted on {{{QUERY_STARTDATE}}}, with the query master of {{{QUERY_MASTER_ID}}}</RequesterEmailLetter> <RequestLetter>This user {{{USER_NAME}}} in project {{{PROJECT_ID}}} requested i2b2 request entitled - "{{{QUERY_NAME}}}", submitted on {{{QUERY_STARTDATE}}}, with the query master of {{{QUERY_MASTER_ID}}}. </RequestLetter> </ValueExporter> |
Export file format/location and Email parameters set-up: HIVE_CELL_PARAMS
Export File format, location and email server parameters are configurable in the HIVE_CELL_PARAMS
Cell ID | Parameter Name | Example Value | Notes |
CRC | edu.harvard.i2b2.crc.exportcsv.datamanageremail | userid@partners.org | Email address used for sending request/export e-mails |
CRC | edu.harvard.i2b2.crc.smtp.host | smtp.partners.org | SMTP host |
CRC | edu.harvard.i2b2.crc.smtp.port | 25 | SMTP port |
CRC | edu.harvard.i2b2.crc.smtp.ssl.enable | FALSE | TRUE will enable SSL |
CRC | edu.harvard.i2b2.crc.smtp.auth | FALSE | TRUE will enable SMTP authentication |
CRC | edu.harvard.i2b2.crc.smtp.username | none | SMTP username (required for SMTP authentication) |
CRC | edu.harvard.i2b2.crc.smtp.password | none | SMTP password (required for SMTP authentication) |
CRC | edu.harvard.i2b2.crc.exportcsv.defaultescapecharacter | " | Escape character for export files |
CRC | edu.harvard.i2b2.crc.exportcsv.maxfetchrows | -1 | Maximum number of rows to export, or -1 for no limit |
CRC | edu.harvard.i2b2.crc.exportcsv.defaultlineend | \n | Line ending for export files |
CRC | edu.harvard.i2b2.crc.exportcsv.defaultseperator | \t | Field separator for export files |
CRC | edu.harvard.i2b2.crc.exportcsv.resultfetchsize | 50000 | ?????? |
CRC | edu.harvard.i2b2.crc.exportcsv.filename | {{{PROJECT_ID}}}/{{{DATE_yyyyMMdd}}}_{{{FULL_NAME}}}.tsv | Parameterized template for export file names |
CRC | edu.harvard.i2b2.crc.exportcsv.defaultquotechar | " | Quote character for export files |
CRC | edu.harvard.i2b2.crc.exportcsv.workfolder | /tmp/i2b2 | Folder on the i2b2 server for data exports |
CRC | edu.harvard.i2b2.crc.exportcsv.zipencryptmethod | none | ????????? |
These variables can be used in HIVE_CELL_PARAMS entries and in export query letter definitions:
Value | Description |
{{{USER_NAME}}} | Current user running query |
{{{PROJECT_ID}}} | The project name |
{{{RESULT_INSTANCE_ID}}} | The result instance ID <--- THIS IS COMMENTED OUT IN THE CODE https://github.com/i2b2/i2b2-core-server/blob/668813d1b50d6299d003b4418b9e0f61b4cea842/edu.harvard.i2b2.crc/src/server/edu/harvard/i2b2/crc/dao/setfinder/QueryResultPatientRequest.java#L483 |
{{{RANDOM_xxx}}} | A random integer where xxx is the max integer size |
{{{DATE_xxx}}} | Date/Time format, where xxx can be from the table below. The string yyyy-MM-dd would output dates that look like 2022-07-26 |
{{{QUERY_NAME}}} | Query Name |
{{{QUERY_STARTDATE}}} | Query Start date/time |
{{{QUERY_ENDDATE}}} | Query End date/time |
{{{PATIENT_COUNT}}} | Number of patients |
{{{FULL_NAME}}} | Fullname of user who ran query |
{{{QUERY_MASTER_ID}}} | The query Master ID |
{{{QUERY_RUNTIME}}} | Time to run the query |
Table of allowable characters in {{{DATE_xxx}}} strings:
Value | Description | Value | Description |
G | era | K | hour-of-am-pm (0-11) |
u | year | k | clock-hour-of-am-pm (1-24) |
y | year-of-era | H | hour-of-day (0-23) |
D | day-of-year | m | minute-of-hour |
M/L | month-of-year | s | second-of-minute |
d | day-of-month | S | fraction-of-second |
Q/q | quarter-of-year | A | milli-of-day |
Y | week-based-year | n | nano-of-second |
w | week-of-week-based-year | N | nano-of-day |
W | week-of-month | V | time-zone ID |
E | day-of-week | z | time-zone name |
e/c | localized day-of-week | O | localized zone-offset |
F | week-of-month | X | zone-offset 'Z' for zero |
a | am-pm-of-day | x | zone-offset |
h | clock-hour-of-am-pm (1-12) | Z | zone-offset |
p | pad next |
Currently, the export file generation location is designed to be within the application Server hosting the i2b2 application
Java Changes
Two new breakdown classes are designed to perform the file export process in addition to update to existing java classes to support the data exporter functionality.
- edu.harvard.i2b2.crc.dao.setfinder.QueryResultPatientRequest
- edu.harvard.i2b2.crc.dao.setfinder.QueryResultPatientDownload
Limitations in current functionality
The current data exporter functionality has the following design limitations
- QT_BREAKDOWN_PATH table configuration for Request/Export file XML definition support is limited to SQL statements only.
- Data Export file location configuration is supported on the local machine hosting the application.
- Requester email is configurable to be generated as a txt file in the data export file directory The data Manager will be required to manually copy the file content into an email and send it to the data request User.
- The default export file location and format should be set in the HIVE_CELL_PARAMS and requires a restart of the wildfly for the changes to take effect.