...
When the CRC data is installed via ant, a new SQL script updates the age_in_years_num in the patient dimension based on the birth dates of the sample patients. As a reminder, this load process can be triggered with ant -f data_build.xml db_demodata_load_data in the CRC directory of NewInstall.
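The SQL script itself is not reproduced here. As a rough illustration of the calculation it performs, the following Python sketch derives an age in completed years from a birth date. The function name and the "as of" parameter are hypothetical, and the real script may round or handle edge cases differently.

```python
from datetime import date
from typing import Optional


def age_in_years(birth_date: date, as_of: Optional[date] = None) -> int:
    """Return the number of completed years between birth_date and as_of.

    as_of defaults to today's date. This mirrors the idea behind
    age_in_years_num, not the exact logic of the i2b2 script.
    """
    as_of = as_of or date.today()
    # Subtract one year if the birthday has not yet occurred in the as_of year.
    had_birthday = (as_of.month, as_of.day) >= (birth_date.month, birth_date.day)
    return as_of.year - birth_date.year - (0 if had_birthday else 1)
```

For example, a patient born on 15 June 2000 is 23 on 14 June 2024 and 24 from 15 June 2024 onward.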
Concept Dimension Updater
Insert_Concept_FROMTableAccess populates the concept_dimension table using the records in table_access.
The stored procedure loops through table_access and inserts values from each metadata table (named in the c_table_name column) where c_dimtablename is set to 'concept_dimension'.
Example usage: exec Insert_Concept_FROMTableAccess
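The loop described above can be sketched in Python against an in-memory SQLite database. This is only an illustration of the control flow, not the stored procedure's source: the column mapping (c_dimcode to concept_path, c_basecode to concept_cd, c_name to name_char) is an assumption, and the real procedure may apply additional filters.

```python
import sqlite3


def insert_concepts_from_table_access(conn: sqlite3.Connection) -> int:
    """Copy concept rows from every metadata table listed in table_access.

    Returns the number of rows inserted into concept_dimension.
    The column mapping used here is an assumption for illustration.
    """
    before = conn.total_changes
    # Read all table names first so no cursor is open while inserting.
    tables = [
        row[0]
        for row in conn.execute("SELECT DISTINCT c_table_name FROM table_access")
    ]
    for table in tables:
        # Table names cannot be bound as parameters, so validate before interpolating.
        if not table.replace("_", "").isalnum():
            raise ValueError(f"unexpected table name: {table!r}")
        conn.execute(
            "INSERT INTO concept_dimension (concept_path, concept_cd, name_char) "
            f"SELECT c_dimcode, c_basecode, c_name FROM {table} "
            "WHERE LOWER(c_dimtablename) = 'concept_dimension'"
        )
    return conn.total_changes - before
```

Rows whose c_dimtablename points at another dimension (for example patient_dimension) are skipped, which matches the filter described above.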
...
Synthetic patient data generated by Synthea can be loaded into i2b2. The Synthea SyntheticMass sample files have been converted to i2b2-ACT format, and scripts to load Synthea data from scratch are available here: https://github.com/i2b2/i2b2-synthea
Synthea Load Process
...
- Set up an i2b2 project with the ACT ontology.
- Either download the SyntheticMass 63k sample in i2b2 format from https://github.com/i2b2/i2b2-synthea/blob/main/syntheamass_63K_sample.zip, or follow the instructions below to load any Synthea dataset from scratch. This information can also be found on the Synthea-i2b2 Community Project page.
Loading Synthea data from scratch
- Download SyntheticMass Data, Version 2 (24 May, 2017)
- All data sets (1k, COVID 10k, COVID 100k) have been verified to work EXCEPT the 100k patients in the large SyntheticMass Version 2 download.
- The 100k patients in the large SyntheticMass Version 2 download need an extra step to delete invalid records before import. In this case, download synthea_cleanup.pl to your disk, and then run "synthea_cleanup <directory-for-synthea-csv-files>". The fixed CSV files will be in <directory-for-synthea-csv-files>/fixcsv.
- Set up an i2b2 project with the ACT ontology.
- Download the scripts from https://github.com/i2b2/i2b2-synthea
- Run create_synthea_table_<your dbServerType>.sql in your project to create the Synthea tables.
- Import the Synthea data you downloaded in step one into the Synthea tables in your project.
- Load the i2b2-to-SNOMED table in this repository into your project.
- Download the SNOMED-CT to ICD-10-CM mapping from https://www.nlm.nih.gov/healthit/snomedct/us_edition.html
- Click on the "Download SNOMED-CT to ICD-10-CM Mapping Resources" link to download. (You will need a UMLS account.)
- Unzip the file
- Import the TSV file into a table called SNOMED_to_ICD10 in your database.
- In Postgres and Oracle, follow the additional instructions in the comments at the top of synthea_to_i2b2_<your dbServerType>.sql to clean up the date formatting.
- Run synthea_to_i2b2_<your dbServerType>.sql to convert Synthea data into i2b2 tables (this will truncate your existing fact and dimension tables!)
- Replace references to i2b2metadata.dbo in the script. Use the database and schema where your ACT ontology tables are.
- Replace references to
...
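The step above that imports the mapping TSV file into a table called SNOMED_to_ICD10 can be sketched as follows. This is a minimal illustration that uses SQLite as a stand-in for your project database. The function name is hypothetical, column names are taken from the file's header row rather than assumed, and a real load would normally use your database's own bulk import tool.

```python
import csv
import sqlite3


def import_tsv(conn: sqlite3.Connection, tsv_path: str,
               table: str = "SNOMED_to_ICD10") -> int:
    """Load a tab-separated file into a table, one column per header field.

    Returns the number of rows inserted. Rows whose field count does not
    match the header are skipped. Header names are assumed not to contain
    double quotes.
    """
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        header = next(reader)
        columns = ", ".join(f'"{name}"' for name in header)
        placeholders = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
        rows = [row for row in reader if len(row) == len(header)]
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)
```

Reading the header from the file avoids hard-coding column names, which may differ between releases of the mapping.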
Note |
---|
The CPT4 ontology table is not included with i2b2 due to AMA restrictions on redistribution of CPT code information. Contact the ACT team to get a copy if your institution is an AMA member. |
ACT4 data load process
...
- Download the newinstall zip package from https://www.i2b2.org/software/download.html?d=452
- Extract the metadata\act folder from the downloaded zip file
- Replace the edu.harvard.i2b2.data\Release_1-7\NewInstall\Metadata\act folder with the newly extracted act folder
- Edit the db.properties file in your metadata folder to set the project property to ACT: db.project=ACT
- From the edu.harvard.i2b2.data\Release_1-7\NewInstall\Metadata folder, run the ant command: ant -f data_build.xml db_metadata_load_data
- This will execute the SQL scripts from the edu.harvard.i2b2.data\Release_1-7\NewInstall\Metadata\act\scripts\<db type> folder, which create and load the ACT4 ontology metadata tables
- You can now verify the new Ontology by logging into the webclient.
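If you prefer to script the db.properties change from the steps above instead of editing the file by hand, a small helper like the following could do it. This is a sketch only: the function name is hypothetical, and it handles simple key=value lines, not the full Java properties syntax.

```python
def set_property(text: str, key: str, value: str) -> str:
    """Return properties text with key set to value, appending it if absent."""
    lines = text.splitlines()
    found = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Leave comment lines untouched.
        if stripped.startswith(("#", "!")):
            continue
        name, sep, _ = stripped.partition("=")
        if sep and name.strip() == key:
            lines[i] = f"{key}={value}"
            found = True
    if not found:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"
```

Calling set_property(contents, "db.project", "ACT") on the file's contents and writing the result back leaves every other line unchanged.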
...