Synthetic patient data generated by Synthea can now be loaded into i2b2. The Synthea SyntheticMass sample files have been converted to i2b2-ACT format, and scripts to load Synthea data from scratch are available here: https://github.com/i2b2/i2b2-synthea
Synthea Load Process:
- Set up an i2b2 project with the ACT ontology.
- Either download the SyntheticMass 63k sample in i2b2 format from https://github.com/i2b2/i2b2-synthea/blob/main/syntheamass_63K_sample.zip, or follow the instructions below to load any Synthea dataset from scratch.
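
If you want to script the sample download, the following is a minimal Python sketch rather than part of the i2b2-synthea repository. It assumes that GitHub serves the file itself when `/blob/` in the page URL is replaced with `/raw/` (the link above points at the GitHub file page, not at the archive). The function names and the `synthea_sample` destination directory are hypothetical.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL of the 63k sample as given in the instructions above (a GitHub "blob" page).
SAMPLE_URL = "https://github.com/i2b2/i2b2-synthea/blob/main/syntheamass_63K_sample.zip"


def blob_to_raw(url: str) -> str:
    """Turn a GitHub 'blob' page URL into a direct-download 'raw' URL.

    Assumption: GitHub serves the file content at the same path with
    '/raw/' in place of '/blob/'.
    """
    return url.replace("/blob/", "/raw/", 1)


def fetch_and_extract(url: str, dest: Path) -> list:
    """Download the zip archive into `dest`, unpack it there, and
    return the names of the archive members."""
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(blob_to_raw(url), archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return zf.namelist()

# Example call (requires network access; destination name is hypothetical):
#   fetch_and_extract(SAMPLE_URL, Path("synthea_sample"))
```

The extracted files still have to be loaded into your i2b2 project with your database's own import tooling; this sketch only covers the download and unzip.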
Loading Synthea data from scratch
- Download SyntheticMass Data, Version 2 (24 May, 2017)
- All data sets (1k, COVID 10k, COVID 100k) have been verified to work EXCEPT the 100k patients in the large SyntheticMass Version 2 download.
- The 100k patients in the large SyntheticMass Version 2 download need an extra step to delete invalid records before import. In this case, download synthea_cleanup.pl to your disk, and then run "synthea_cleanup <directory-for-synthea-csv-files>". The fixed CSV files will be in <directory-for-synthea-csv-files>/fixcsv.
- Set up an i2b2 project with the ACT ontology.
- Download the scripts from https://github.com/i2b2/i2b2-synthea
- Run create_synthea_table_<your dbServerType>.sql in your project to create the Synthea tables.
- Import the Synthea data you downloaded in step one into the Synthea tables in your project.
- Load the i2b2-to-SNOMED table in this repository into your project. https://www.nlm.nih.gov/healthit/snomedct/us_edition.html
- Click on the "Download SNOMED-CT to ICD-10-CM Mapping Resources" link to download. (You will need a UMLS account.)
- Unzip the file
- Import the TSV file into a table called SNOMED_to_ICD10 in your database.
- In Postgres and Oracle, follow the additional instructions in the comments at the top of synthea_to_i2b2_<your dbServerType>.sql to clean up the date formatting.
- Run synthea_to_i2b2_<your dbServerType>.sql to convert Synthea data into i2b2 tables. (This will truncate your existing fact and dimension tables!)
- Replace references to i2b2metadata.dbo in the script with the database and schema where your ACT ontology tables are.
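
The i2b2metadata.dbo replacement in the last step can be done in any text editor. If you prefer to script it, here is a minimal Python sketch, not part of the i2b2-synthea scripts. The helper names, the `_retargeted` output suffix, and the example target `myontology.public` are hypothetical, and the sketch assumes the SQL file is UTF-8 text.

```python
import re
from pathlib import Path


def retarget_metadata(sql: str, target: str, source: str = "i2b2metadata.dbo") -> str:
    """Replace every reference to `source` (matched case-insensitively)
    with `target`, e.g. the database and schema holding your ACT ontology."""
    # A function is used as the replacement so that any backslashes in
    # `target` are inserted literally rather than read as regex escapes.
    return re.sub(re.escape(source), lambda _match: target, sql, flags=re.IGNORECASE)


def retarget_file(script: Path, target: str) -> Path:
    """Write a rewritten copy next to the original script and return its path.

    Assumption: the script is UTF-8 encoded; adjust `encoding` if yours is not.
    """
    out = script.with_name(script.stem + "_retargeted" + script.suffix)
    text = script.read_text(encoding="utf-8")
    out.write_text(retarget_metadata(text, target), encoding="utf-8")
    return out

# Example call (file name and target are placeholders for your own values):
#   retarget_file(Path("synthea_to_i2b2_<your dbServerType>.sql"), "myontology.public")
```

Writing to a copy leaves the downloaded script untouched, so you can compare the two files before running the conversion, which truncates your existing fact and dimension tables.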