Algorithm Outline

This tool implements the loyalty cohort algorithm described and evaluated in:

Klann, Jeffrey G., Darren W. Henderson, Michele Morris, Hossein Estiri, Griffin M. Weber, Shyam Visweswaran, and Shawn N. Murphy. 2023. “A Broadly Applicable Approach to Enrich Electronic-Health-Record Cohorts by Identifying Patients with Complete Data: A Multisite Evaluation.” Journal of the American Medical Informatics Association: JAMIA, August. https://doi.org/10.1093/jamia/ocad166.

The algorithm was developed from a regression equation validated in:

Lin, Kueiyu Joshua, Gary E. Rosenthal, Shawn N. Murphy, Kenneth D. Mandl, Yinzhu Jin, Robert J. Glynn, and Sebastian Schneeweiss. 2020. “External Validation of an Algorithm to Identify Patients with High Data-Completeness in Electronic Health Records for Comparative Effectiveness Research.” Clinical Epidemiology 12 (February): 133–41.

Written primarily by Darren Henderson with contributions from: Jeffrey Klann, PhD; Michele Morris; Andrew Cagan; Barbara Benoit

Outline of algorithm

The program accepts a cohort definition consisting of patient IDs with per-patient index dates (index_date, the date at which patient loyalty is to be evaluated), the number of years to look back from the index date when evaluating each binary variable, and several control variables that alter the behavior of the summary table output (such as whether demographic data are stored in the observation_fact table). Both the model coefficients and the ontology elements mapped to each variable are stored in database tables and can be customized at each site. Specifically, the tool performs the following steps:

Remove patients from the cohort if they have no visits or data (other than demographics) during the measure period.
Exclude patients who are 18 or younger at the beginning of the measure period, as well as those with an unknown age. A pediatric population would require different proxies and measurements of utilization, which are likely not generalizable to adults and were not considered in the original model.
Exclude patients who are deceased at any time from the beginning of the measure period onward.
Compute the 20 binary variables by determining which patients have facts of the required types during the measure period, using a site-configurable table of ontology paths (i.e., folders).
Compute and save (in the database) the final loyalty scores using the published regression equation.
Produce a summary output that implementers can use to validate that their loyalty cohorts’ characteristics are as expected. This includes flag prevalence in the cohort, age and sex breakdowns, and average Charlson scores.
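The exclusion and flag steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's database script: the Patient class, the function names, the ontology paths in VARIABLE_PATHS, and the rule of counting each in-period fact once are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Patient:
    patient_id: int
    index_date: date
    birth_date: Optional[date]            # None means the age is unknown
    death_date: Optional[date] = None
    # Non-demographic facts as (ontology_path, fact_date) pairs (hypothetical shape).
    facts: list = field(default_factory=list)


def period_start(index_date: date, lookback_years: int) -> date:
    """First day of the measure period: the index date minus the lookback."""
    try:
        return index_date.replace(year=index_date.year - lookback_years)
    except ValueError:                    # Feb 29 mapped onto a non-leap year
        return index_date.replace(year=index_date.year - lookback_years, day=28)


def age_on(birth: date, on: date) -> int:
    """Age in whole years on a given day."""
    return on.year - birth.year - ((on.month, on.day) < (birth.month, birth.day))


def is_eligible(p: Patient, lookback_years: int) -> bool:
    start = period_start(p.index_date, lookback_years)
    # No visits or data in the measure period: removed from the cohort.
    if not any(start <= d <= p.index_date for _, d in p.facts):
        return False
    # Unknown age, or 18 and under at the start of the measure period.
    if p.birth_date is None or age_on(p.birth_date, start) <= 18:
        return False
    # Deceased at any time from the start of the measure period onward.
    if p.death_date is not None and p.death_date >= start:
        return False
    return True


def compute_flags(p: Patient, lookback_years: int, variable_paths: dict) -> dict:
    """One binary flag per variable, from facts under the mapped ontology folders."""
    start = period_start(p.index_date, lookback_years)
    in_period = [path for path, d in p.facts if start <= d <= p.index_date]
    flags = {}
    for name, (prefixes, min_count) in variable_paths.items():
        hits = sum(1 for path in in_period if path.startswith(tuple(prefixes)))
        flags[name] = hits >= min_count
    return flags


# Hypothetical site configuration: variable -> (ontology folders, minimum fact count).
VARIABLE_PATHS = {
    "any_diagnosis": (["\\ACT\\Diagnosis\\"], 1),
    "two_diagnoses": (["\\ACT\\Diagnosis\\"], 2),
    "any_medication": (["\\ACT\\Medications\\"], 1),
}

adult = Patient(1, date(2023, 1, 1), date(1970, 6, 15), facts=[
    ("\\ACT\\Diagnosis\\example_a", date(2022, 3, 1)),
    ("\\ACT\\Diagnosis\\example_b", date(2022, 9, 9)),
    ("\\ACT\\Medications\\example_c", date(2019, 1, 1)),  # outside a 1-year lookback
])
child = Patient(2, date(2023, 1, 1), date(2010, 1, 1), facts=[
    ("\\ACT\\Diagnosis\\example_d", date(2022, 5, 5)),
])
```

With a 1-year lookback, `is_eligible(adult, 1)` holds while `child` is excluded on age, and `compute_flags(adult, 1, VARIABLE_PATHS)` sets both diagnosis flags but not the medication flag, because the only medication fact predates the measure period.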

The loyalty score script also computes the Charlson Comorbidity Index for each patient.

Compute and save Charlson comorbidities by examining diagnosis codes. For each patient in the cohort, a 1-year lookback from the index date evaluates the diagnosis codes present in the ACT data model. Diagnosis codes associated with each Charlson Comorbidity disease group are retrieved from a data dictionary, and a patient is assigned the appropriate Charlson weight for each group present in the healthcare record. The index is calculated from the patient's age group and the sum of these weights across categories. From this, the 10-year probability of survival can be calculated.
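As a rough illustration of this calculation, the sketch below uses the commonly cited Charlson group weights, age points, and 10-year survival formula. The group names and function names are hypothetical, the hierarchy rules that some implementations apply (for example, counting only the more severe of two related groups) are omitted, and the values stored in the tool's own data dictionary may differ.

```python
import math

# Commonly cited Charlson weights per disease group (assumed here; the tool
# reads its own weights from a data dictionary).
CHARLSON_WEIGHTS = {
    "myocardial_infarction": 1, "congestive_heart_failure": 1,
    "peripheral_vascular_disease": 1, "cerebrovascular_disease": 1,
    "dementia": 1, "chronic_pulmonary_disease": 1, "rheumatic_disease": 1,
    "peptic_ulcer_disease": 1, "mild_liver_disease": 1,
    "diabetes_without_complications": 1, "diabetes_with_complications": 2,
    "hemiplegia_or_paraplegia": 2, "renal_disease": 2, "malignancy": 2,
    "moderate_or_severe_liver_disease": 3, "metastatic_solid_tumor": 6,
    "aids_hiv": 6,
}


def age_points(age: int) -> int:
    """One point per decade from age 50, capped at 4 points (80 and over)."""
    if age < 50:
        return 0
    return min((age - 40) // 10, 4)


def charlson_index(groups: set, age: int) -> int:
    """Age points plus the weight of each disease group found in the lookback."""
    return age_points(age) + sum(CHARLSON_WEIGHTS[g] for g in groups)


def ten_year_survival(cci: int) -> float:
    """Commonly cited estimate: 0.983 raised to exp(0.9 * CCI)."""
    return 0.983 ** math.exp(0.9 * cci)


cci = charlson_index({"congestive_heart_failure", "diabetes_without_complications"}, age=65)
```

Here a 65-year-old with two single-weight groups gets an index of 4 (2 age points plus 2), which corresponds to an estimated 10-year survival of roughly 53% under this formula.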

From supplementary data in Klann et al. 2023 (full citation above).

Variables and their coefficients

Variable	Coding System	Label
Any diagnosis code	I	Exactly 1 Diagnosis
Any ED visit	n/a	ED Visit
Any inpatient or outpatient encounter	n/a	In/Out-patient Visit
Any medication	R	Exactly 1 Medication
Any two diagnosis codes	I	2+ Diagnoses
Any two outpatient encounters	n/a	2+ Outpatient Visits
Any two visits with the same provider	C	2+ Visits with Same MD
Any three visits with the same provider	C	3+ Visits with Same MD
At least two medications	R	2+ Medications
At least two routine care fact types (bold-faced)	n/a	2+ Routine Care Facts
Body Mass Index measurement	I	BMI
Colonoscopy	C,H,I	Colonoscopy
Fecal occult blood test	C,H	Fecal Occult Test
General medical examination	I	Medical Exam
Having A1C ordered or value recorded	C,L	A1C
Influenza vaccine	C,H,I	Flu Shot
Mammography	C,I	Mammography
Pap smear	C,H	Pap test
Pneumococcal vaccine	C,H,I	Pneumococcal Vaccine
PSA Test	C,H,I,L	PSA Test

Table S1. The 20 variables used to compute a loyalty score, the terminologies within ACT to which we mapped the variables, and the label used in figures. Adapted from Table 2 in [29].

Terminology key: I = ICD-9 and ICD-10; R = RxNorm; C = CPT-4; L = LOINC; H = HCPCS
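For scripts that consume this table, the Coding System column can be expanded with the terminology key. The snippet below is a small sketch; the dictionary and function names are hypothetical and not part of the tool.

```python
# Terminology key from the table caption.
TERMINOLOGY_KEY = {
    "I": "ICD-9 and ICD-10",
    "R": "RxNorm",
    "C": "CPT-4",
    "L": "LOINC",
    "H": "HCPCS",
}


def expand_coding_systems(cell: str) -> list:
    """Turn a Coding System cell such as 'C,H,I' into terminology names."""
    if cell.strip().lower() == "n/a":
        return []                        # encounter-based variables have no code set
    return [TERMINOLOGY_KEY[code.strip()] for code in cell.split(",")]


psa_terminologies = expand_coding_systems("C,H,I,L")
```

For the PSA Test row this yields CPT-4, HCPCS, ICD-9 and ICD-10, and LOINC, while rows marked n/a yield an empty list.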

From supplementary data in Klann et al. 2023 (full citation above).

The regression equation can be found in appendix table S1 in Lin et al. 2020 (full citation above).
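Once the binary flags are known, applying a regression equation of this kind is a short computation. The sketch below assumes the score is an intercept plus the sum of the coefficients of the flags that are set. The coefficient values are placeholders chosen for readability, not the published ones, and the variable names are hypothetical. In the tool itself the coefficients are read from a site-configurable database table.

```python
# Placeholder coefficients for illustration only; NOT the published values.
EXAMPLE_INTERCEPT = 0.0
EXAMPLE_COEFFICIENTS = {
    "any_diagnosis": 0.25,
    "any_medication": 0.5,
    "flu_shot": 0.125,
}


def loyalty_score(flags: dict, coefficients: dict, intercept: float = 0.0) -> float:
    """Intercept plus the coefficient of every flag that is set for the patient."""
    return intercept + sum(coefficients[name] for name, present in flags.items() if present)


patient_flags = {"any_diagnosis": True, "any_medication": True, "flu_shot": False}
score = loyalty_score(patient_flags, EXAMPLE_COEFFICIENTS, EXAMPLE_INTERCEPT)
```

With these placeholder values the patient scores 0.75. Keeping the coefficients in a lookup structure rather than hard-coding them mirrors the site-level customization described in the outline.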
