A Digital Twin is a concise, current, and accurate “in silico” representation of a functioning real-world entity. The concept originated in industrial manufacturing and design, where it is used to model the assembly of many complex, interacting parts before analysis. In healthcare, creating a Digital Twin of a person consists of assembling data from many sources and processing the assembled data to obtain an accurate representation of the individual. That representation can then be combined with the Digital Twins of other persons to run “in silico” studies. Digital Twins make possible more accurate population studies based on real-world data (RWD): producing them adds a layer of data reconciliation, which allows a population study based on Digital Twins to arrive at more accurate conclusions than one based on raw data.
This bundle, distributed in a beta version, will allow i2b2 administrators to try the i2b2 Digital Twin Tools:
1) Loyalty Cohorts: Determination that data completeness is sufficient for creation of a Digital Twin. This is done by calculating a “loyalty cohort” to ensure that most of the individual's care is received in the hospital systems producing the data set used to calculate the twin. This step provides the logic to exclude the conditions the individual does NOT have, and to ensure there is sufficient data to calculate the conditions that the individual does have.
2) Computational Phenotypes: We have previously found that half of patients with an ICD-9 or ICD-10 diagnosis code in the electronic health record (EHR) for Type 2 Diabetes (T2DM) do not actually have the disease. The code for T2DM thus has low "precision" for predicting the patient's true condition or "phenotype". Most diagnosis codes have this problem to varying degrees. One consequence of this is that clinical trials overestimate the number of eligible patients from the EHR. As a result, the trials have low yield in recruiting patients and are slow or unable to meet enrollment targets.
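To make the "precision" figure above concrete, here is a minimal Python sketch of the arithmetic. The patient counts are hypothetical and chosen only to match the "half of patients" statement; the function names are illustrative and are not part of the Digital Twin tools.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of coded patients who truly have the condition."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no coded patients to evaluate")
    return true_positives / total


def expected_true_cases(coded_patients: int, code_precision: float) -> float:
    """Number of coded patients expected to truly have the condition."""
    return coded_patients * code_precision


# Hypothetical counts: 1000 patients carry a T2DM code,
# and chart review confirms the disease in 500 of them.
coded = 1000
confirmed = 500

t2dm_precision = precision(confirmed, coded - confirmed)
print(f"precision of the T2DM code: {t2dm_precision:.2f}")
print(f"expected true cases among {coded} coded patients: "
      f"{expected_true_cases(coded, t2dm_precision):.0f}")
```

Under these assumed numbers, a feasibility query that counts coded patients would report twice as many candidates as actually have the disease, which is the overestimate described above.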
Installation:
The Digital Twin tools are in the i2b2-digitaltwin repository: https://github.com/i2b2/i2b2-digitaltwin. Steps for installation:
1) Download a copy of the repository. In the Release_1-8/NewInstall/Crcdata directory, edit the db.properties file as is done to install other i2b2 data components. Note that the project should be set to act, as some components require the ENACT ontology.
2) Run the following ant targets to install both loyalty cohorts and computational phenotypes.
Linux Run Command
ant -f data_build.xml create_crcdata_digitaltwin_tables_release_1-8
ant -f data_build.xml create_crcdata_digitaltwin_procedures_release_1-8
ant -f data_build.xml db_digitaltwin_load_data
Windows Run Command
%ANT_HOME%\bin\ant.bat -f data_build.xml create_crcdata_digitaltwin_tables_release_1-8
%ANT_HOME%\bin\ant.bat -f data_build.xml create_crcdata_digitaltwin_procedures_release_1-8
%ANT_HOME%\bin\ant.bat -f data_build.xml db_digitaltwin_load_data
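If you prefer to script the installation, the three targets above can be run in order from a small Python wrapper that stops at the first failure. This is only a sketch and is not shipped with the repository: the names `build_commands` and `run_all` are hypothetical, and it assumes the script is run from (or pointed at) the directory containing data_build.xml.

```python
import subprocess
from typing import Callable, List

# Ant targets listed in the installation steps, in the order given there.
TARGETS = [
    "create_crcdata_digitaltwin_tables_release_1-8",
    "create_crcdata_digitaltwin_procedures_release_1-8",
    "db_digitaltwin_load_data",
]


def build_commands(ant: str = "ant") -> List[List[str]]:
    """Return one ant command line (as an argument list) per target."""
    return [[ant, "-f", "data_build.xml", target] for target in TARGETS]


def run_all(ant: str = "ant", cwd: str = ".",
            runner: Callable = subprocess.run) -> None:
    """Run every target in order.

    With the default runner, check=True makes subprocess.run raise
    CalledProcessError on a non-zero exit code, so later targets
    are skipped if an earlier one fails.
    """
    for command in build_commands(ant):
        runner(command, cwd=cwd, check=True)
```

Calling `run_all()` uses the `ant` found on the PATH. On Windows, the full path to ant.bat under ANT_HOME can be passed as the `ant` argument instead.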
Running these ant targets creates several new tables in your i2b2 CRC database:
SQL Table | Tool | Description |
---|---|---|
dt_keser_concept_children | KESER | Stores hierarchical relationships between concepts for KESER analysis. |
dt_keser_concept_feature | KESER | Contains features associated with concepts for KESER analysis. |
dt_keser_embedding | KESER | Contains embedding vectors for various concepts used in KESER. |
dt_keser_feature | KESER | Contains feature information used in KESER analysis. |
dt_keser_feature_cooccur | KESER | Contains co-occurrence data of features within patient records. |
dt_keser_feature_cooccur_temp | KESER | Temporary table for storing intermediate co-occurrence data during KESER processing. |
dt_keser_feature_count | KESER | Contains the count of patients per feature. |
dt_keser_import_concept_feature | KESER | Stores imported concept features for KESER analysis. |
dt_keser_patient_partition | KESER | Contains partitions of patients into training and test cohorts for parallel processing. |
dt_keser_patient_period_feature | KESER | Contains features for each patient over specific time periods. |
dt_keser_phenotype | KESER | Contains phenotype definitions identified by KESER. |
dt_keser_phenotype_feature | KESER | Links features to their corresponding phenotypes in KESER. |
dt_komap_base_cohort | KOMAP | Base cohort of patients for the KOMAP program. |
dt_komap_patient_feature | KOMAP | Contains features for each patient for KOMAP analysis. |
dt_komap_phenotype | KOMAP | Contains phenotype definitions used in KOMAP. |
dt_komap_phenotype_covar | KOMAP | Contains covariates used in phenotype analysis for KOMAP. |
dt_komap_phenotype_covar_inner | KOMAP | Contains intermediate covariate data used during KOMAP phenotype analysis. |
dt_komap_phenotype_feature_coef | KOMAP | Contains coefficients for features used in KOMAP phenotype computation. |
dt_komap_phenotype_feature_dict | KOMAP | Dictionary of features used in KOMAP phenotype analysis. |
dt_komap_phenotype_gmm | KOMAP | Contains Gaussian Mixture Model results used in KOMAP phenotype clustering. |
dt_komap_phenotype_gold_standard | KOMAP | Contains gold standard phenotypes used for validating KOMAP computational phenotype models. |
dt_komap_phenotype_patient | KOMAP | Links patients to their computed phenotypes in KOMAP. |
dt_komap_phenotype_sample | KOMAP | Contains sampled patient data for KOMAP phenotype analysis. |
dt_komap_phenotype_sample_feature | KOMAP | Contains features for each phenotype sample in KOMAP. |
dt_komap_phenotype_sample_results | KOMAP | Contains results of KOMAP phenotype analysis on sampled data. |
DT_LOYALTY_CHARLSON | LOYALTY | Charlson Comorbidity Index data for loyalty cohort analysis. |
DT_LOYALTY_PATHS | LOYALTY | Ontology paths associated with features used in loyalty cohort analysis. |
DT_LOYALTY_PSCOEFF | LOYALTY | Coefficients for loyalty score calculation in the loyalty cohort analysis. |
DT_LOYALTY_RESULT | LOYALTY | Results of loyalty cohort analysis. |
DT_LOYALTY_RESULT_CHARLSON | LOYALTY | Results of Charlson Comorbidity Index analysis for the loyalty cohort. |
DT_LOYALTY_RESULT_SUMMARY | LOYALTY | Summary results of the loyalty cohort analysis. |
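After installation, one way to confirm that every table listed above exists is to compare the list against the table names reported by your database. The sketch below shows only the comparison. How you obtain the existing names depends on your database platform and is left to you, and the helper name `missing_tables` is hypothetical. The comparison ignores case because databases differ in how they fold identifiers.

```python
from typing import Iterable, List

# Table names copied from the table above.
EXPECTED_TABLES = [
    "dt_keser_concept_children", "dt_keser_concept_feature",
    "dt_keser_embedding", "dt_keser_feature", "dt_keser_feature_cooccur",
    "dt_keser_feature_cooccur_temp", "dt_keser_feature_count",
    "dt_keser_import_concept_feature", "dt_keser_patient_partition",
    "dt_keser_patient_period_feature", "dt_keser_phenotype",
    "dt_keser_phenotype_feature",
    "dt_komap_base_cohort", "dt_komap_patient_feature", "dt_komap_phenotype",
    "dt_komap_phenotype_covar", "dt_komap_phenotype_covar_inner",
    "dt_komap_phenotype_feature_coef", "dt_komap_phenotype_feature_dict",
    "dt_komap_phenotype_gmm", "dt_komap_phenotype_gold_standard",
    "dt_komap_phenotype_patient", "dt_komap_phenotype_sample",
    "dt_komap_phenotype_sample_feature", "dt_komap_phenotype_sample_results",
    "DT_LOYALTY_CHARLSON", "DT_LOYALTY_PATHS", "DT_LOYALTY_PSCOEFF",
    "DT_LOYALTY_RESULT", "DT_LOYALTY_RESULT_CHARLSON",
    "DT_LOYALTY_RESULT_SUMMARY",
]


def missing_tables(expected: Iterable[str], existing: Iterable[str]) -> List[str]:
    """Return the expected table names not found in `existing`.

    Names are compared case-insensitively; the result keeps the
    spelling and order used in `expected`.
    """
    present = {name.lower() for name in existing}
    return [name for name in expected if name.lower() not in present]
```

For example, `missing_tables(EXPECTED_TABLES, names_from_your_database)` returns an empty list when all tables are present, where `names_from_your_database` stands for whatever list of names your own catalog query produces.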