
Synthea Data in i2b2 Home


Synthetic patient data generated by Synthea can now be loaded into i2b2. The Synthea SyntheticMass sample files have been converted to i2b2-ACT format, and scripts to load Synthea data from scratch are available here: https://github.com/i2b2/i2b2-synthea

Synthea Load Process:

  1. Set up an i2b2 project with the ACT ontology.
  2. Either download the SyntheticMass 63k sample in i2b2 format from https://github.com/i2b2/i2b2-synthea/blob/main/syntheamass_63K_sample.zip, or follow the instructions below to load any Synthea dataset from scratch.

Loading Synthea data from scratch

  1. Download SyntheticMass Data, Version 2 (24 May, 2017).
    • All data sets (1k, COVID 10k, COVID 100k) have been verified to work EXCEPT the 100k patients in the large SyntheticMass Version 2 download.
    • The 100k patients in the large SyntheticMass Version 2 download need an extra step to delete invalid records before import. In this case, download synthea_cleanup.pl to your disk, and then run "synthea_cleanup <directory-for-synthea-csv-files>". The fixed CSV files will be in <directory-for-synthea-csv-files>/fixcsv.
  2. Set up an i2b2 project with the ACT ontology.
  3. Download the scripts from https://github.com/i2b2/i2b2-synthea
  4. Run create_synthea_table_<your dbServertype>.sql in your project to create the Synthea tables.
  5. Import the Synthea data you downloaded in step 1 into the Synthea tables in your project.
  6. Load the i2b2-to-SNOMED table in this repository into your project. Then load the SNOMED-CT to ICD-10-CM mapping from https://www.nlm.nih.gov/healthit/snomedct/us_edition.html:
    • Click on the "Download SNOMED-CT to ICD-10-CM Mapping Resources" link to download. (You will need a UMLS account.)
    • Unzip the file.
    • Import the TSV file into a table called SNOMED_to_ICD10 in your database.
  7. In Postgres and Oracle, follow the additional instructions in the comments at the top of synthea_to_i2b2_<your dbServerType>.sql to clean up the date formatting.
  8. Run synthea_to_i2b2_<your dbServertype>.sql to convert Synthea data into i2b2 tables. (This will truncate your existing fact and dimension tables!)
    • Replace references to i2b2metadata.dbo in the script. Use the database and schema where your ACT ontology tables are.
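
The TSV import into SNOMED_to_ICD10 can be scripted instead of done through a database GUI. The following is only a minimal sketch, not the loader used by the i2b2-synthea project. It assumes a Python DB-API connection (sqlite3 stands in here for your actual i2b2 database), a tab-separated file whose first line holds the column names, and that storing every column as text is acceptable. The function name load_tsv is hypothetical.

```python
import csv
import sqlite3


def load_tsv(conn, tsv_path, table="SNOMED_to_ICD10"):
    """Create `table` from the TSV header row and insert every data row as text.

    Assumptions: the first line of the file holds the column names, and the
    driver uses '?' placeholders (true for sqlite3; other drivers differ).
    Returns the number of data rows inserted.
    """
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE: treat quote characters in the file as ordinary data.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        header = next(reader)
        # Quote identifiers so header names with spaces or mixed case survive.
        columns = ", ".join(
            '"{}" TEXT'.format(name.replace('"', '""')) for name in header
        )
        placeholders = ", ".join("?" for _ in header)
        conn.execute('CREATE TABLE "{}" ({})'.format(table, columns))
        rows = list(reader)
        conn.executemany(
            'INSERT INTO "{}" VALUES ({})'.format(table, placeholders), rows
        )
    conn.commit()
    return len(rows)
```

For a real load, the bulk-import tool of your database server is likely faster; the sketch mainly shows that the table layout can be taken from the file's own header row.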

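Replacing the i2b2metadata.dbo references by hand is easy to get wrong in a long script. A minimal sketch of doing it programmatically is shown here. The function name retarget_metadata_refs and the target name in the docstring are made up for illustration, and the sketch assumes the references appear literally as i2b2metadata.dbo in the script text.

```python
import re


def retarget_metadata_refs(script_text, target):
    """Replace every `i2b2metadata.dbo` prefix in a SQL script with `target`.

    `target` is the database and schema that hold your ACT ontology tables,
    for example "myontology.dbo" (a made-up name used for illustration).
    """
    # Case-insensitive match. The lookbehind keeps longer names that merely
    # end in "i2b2metadata" (e.g. "old_i2b2metadata.dbo") from being rewritten.
    pattern = re.compile(r"(?<![\w.])i2b2metadata\.dbo\b", re.IGNORECASE)
    # A function replacement avoids backslash interpretation in `target`.
    return pattern.sub(lambda _match: target, script_text)
```

Write the result to a new file and review it before running it, since the conversion script truncates the existing fact and dimension tables.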
