i2b2 ontologies have a column c_totalnum that can store the total count of patients associated with every item in the ontology tree. This can be visualized in the i2b2 webclient to assist with query building (e.g., to find concepts that have many patients) or be used for data quality (to find areas where patient counts do not make sense). It is also used by the query builder to optimize queries. The ENACT network uses these counts for additional analytics across their network. We recommend you run these counts after each ETL.
i2b2 users must have the DATA_AGG user permission to view the counts through the web client. The stored procedures loaded in the Metadata schema must have read access to the CRC schema (more information in Installation below). Mapping codes through the concept_dimension or Adapter Mappings files is not supported; i.e., the c_basecode values in your ontology tables must be the same codes used in your fact tables.
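Because the ontology codes must match the fact codes exactly, it can help to look for ontology codes that never occur in the facts before counting. The following is a minimal sketch in Python using an in-memory SQLite database. It is not part of the totalnum scripts. The table name demo_ontology and all sample rows are invented for illustration; only the column names c_basecode, concept_cd and patient_num follow the usual i2b2 naming.

```python
import sqlite3


def unmatched_basecodes(conn):
    """Return ontology c_basecode values that never appear as a concept_cd in the facts."""
    rows = conn.execute(
        """
        SELECT DISTINCT o.c_basecode
        FROM demo_ontology o
        LEFT JOIN observation_fact f ON f.concept_cd = o.c_basecode
        WHERE o.c_basecode IS NOT NULL   -- folders without a code are skipped
          AND f.concept_cd IS NULL       -- no fact row matched this code
        ORDER BY o.c_basecode
        """
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical sample data: the second ontology code uses a different prefix
# than the facts, so it can never be counted.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE demo_ontology (c_name TEXT, c_basecode TEXT);
CREATE TABLE observation_fact (patient_num INTEGER, concept_cd TEXT);
INSERT INTO demo_ontology VALUES
    ('Diagnoses', NULL),
    ('Cholera', 'ICD10CM:A00'),
    ('Chickenpox', 'ICD10:B01');
INSERT INTO observation_fact VALUES (1, 'ICD10CM:A00'), (2, 'ICD10CM:B01');
""")

missing = unmatched_basecodes(conn)
print(missing)
```

A code with no facts is not necessarily an error, since a concept may simply have no patients. A large number of unmatched codes, however, may suggest that codes are being mapped somewhere other than the ontology tables.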
This should have already occurred in previous sections of this guide, but verify you have run these steps:
i2b2 1.8 introduces a version that is 5-10x faster. This faster version is presently only available for MSSQL and has only been extensively tested with the ACT ontology. 1.8.1 and later versions will improve on this faster version. These replace the pat_count_dimensions and run_all_counts stored procedures.
1. The first time you run this, and whenever your local ontology changes, you must run the preparatory procedure. This creates a view of distinct concept codes and patient nums (OBSFACT_PAIRS), a unified ontology table (TNUM_ONTOLOGY) and a transitive closure table (CONCEPT_CLOSURE). It could take an hour to run.
exec FastTotalnumPrep or exec FastTotalnumPrep 'dbo'
2. Run the actual counting. This relies on the i2b2 data tables and the closure and ontology tables created in step 1. It takes no parameters. Its output goes into the totalnum table, which was created when upgrading/installing i2b2 1.7.12 or 1.7.13 or 1.8. It typically runs in 1-3 hours.
exec FastTotalnumCount
3. Output the results to the totalnum_report table (as obfuscated counts) and into the c_totalnum column in the ontologies (for viewing in the query tool).
exec FastTotalnumOutput or exec FastTotalnumOutput 'dbo','@'
Optionally, you can specify the schema name and a single table name to run on a single ontology table (or @ for all).
Run the following commands in a SQL client.
exec FastTotalnumPrep or exec FastTotalnumPrep 'dbo' (Run once when ontology changes.)
exec FastTotalnumCount (Actual counting, takes several hours.)
exec FastTotalnumOutput or exec FastTotalnumOutput 'dbo','@' (Output results to report table and UI.)
It is possible to run counts on OMOP tables through the ENACT-OMOP feature in i2b2 1.8. The new 1.8 totalnum procedure works on OMOP: simply load the file totalnum_usp/sqlserver/totalnum_fast_prep_OMOP.sql instead of totalnum_fast_prep.sql.
If using multiple fact tables, the recommended approach is to create a fact table view as the union of all your fact tables. (This is essentially going back to a single fact table, but it is only used for totalnum counting. This is needed to correctly count patients whose facts span multiple fact tables within a hierarchy.)
e.g., create view observation_fact_view as select * from CONDITION_VIEW union all select * from drug_view
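The effect of the union view can be seen in a small sketch. The Python below uses an in-memory SQLite database, not SQL Server, and the two-column tables and sample rows are invented for illustration; real i2b2 fact views have many more columns. It compares the sum of per-table patient counts with the distinct patient count taken over the union view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-ins for two fact views; patient 2 appears in both.
CREATE TABLE condition_view (patient_num INTEGER, concept_cd TEXT);
CREATE TABLE drug_view (patient_num INTEGER, concept_cd TEXT);
INSERT INTO condition_view VALUES (1, 'C1'), (2, 'C1');
INSERT INTO drug_view VALUES (2, 'D1'), (3, 'D1');

CREATE VIEW observation_fact_view AS
    SELECT * FROM condition_view
    UNION ALL
    SELECT * FROM drug_view;
""")


def distinct_patients(table):
    # Table names are fixed literals in this demo, so string formatting is safe here.
    query = f"SELECT COUNT(DISTINCT patient_num) FROM {table}"
    return conn.execute(query).fetchone()[0]


per_table_sum = distinct_patients("condition_view") + distinct_patients("drug_view")
via_view = distinct_patients("observation_fact_view")
print(per_table_sum, via_view)
```

Adding the per-table counts gives 4 because patient 2 is counted once in each table, while counting over the view gives the correct total of 3 distinct patients.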
See database-specific instructions below. After running the scripts, results are placed in: c_totalnum column of all ontology tables, the totalnum table (keeps a historical record), and the totalnum_report table (most recent run, obfuscated). These total counts will also be visible in the ontology browser in the web client.
By Mike Mendis and Jeff Klann, PhD based on code by Griffin Weber, MD, PhD
Run with:
exec RunTotalnum or exec RunTotalnum 'observation_fact','dbo','@'
The optional parameters are:
Note that visit and patient dimension will only be counted in conjunction with the default (observation_fact) tablename!
Option 1) If you have at most one fact table per ontology, run this once with each fact table specified!
e.g., to use on a fact table called derived_fact with just the act_covid ontology: exec RunTotalnum 'derived_fact','dbo','act_covid'
Option 2) Create a fact table view as the union of all your fact tables. (This is essentially going back to a single fact table, but it is only used for totalnum counting. This is needed to correctly count patients whose facts span multiple fact tables within a hierarchy.)
e.g.,
Example 1: Counting using OMOP tables
create view observation_fact_view as
select * from CONDITION_VIEW
union all
select * from drug_view
And then run the totalnum counter with the wildcard flag, to ignore multifact references in the ontology, e.g.,
exec RunTotalnum 'observation_fact_view','dbo','@','Y'
Example 2: Counting using a derived fact table and the regular fact table, using a single ontology
create view observation_fact_view as
select * from observation_fact
union all
select * from derived_fact
Run the totalnum counter with the wildcard flag, to ignore multifact references in the ontology, and specify an ontology table, e.g.,
exec RunTotalnum 'observation_fact_view','dbo','act_covid_v4','Y'
Note this approach does not work if you have conflicting concept_cds across fact tables.
By Mike Mendis, based on SQL Server code by Griffin Weber, MD, PhD
Performance improvements by Jeff Green and Jeff Klann, PhD 03-20
Run the procedure like this (but with your schema name instead of i2b2demodata):
begin
runtotalnum('observation_fact','i2b2demodata');
end;
You can optionally include a table name if you only want to count one ontology table (this IS case sensitive):
begin
runtotalnum('observation_fact','i2b2demodata','I2B2');
end;
Note: If you get the error "ERROR at line 1: ORA-01031: insufficient privileges", run the command:
grant create table to (DB USER)
Original PostgreSQL code by Dan Vianello, Center for Biomedical Informatics, Washington University in St. Louis
2019 - Modified for i2b2 1.7.12 release by Mike Mendis, Partners Healthcare
2020 - Updated to support reporting and single-table runs by Jeff Klann, Massachusetts General Hospital
Usage example:
select runtotalnum('observation_fact','public')
Run the ant command to execute the data_build.xml file with below specified target
Some users have reported difficulty executing the totalnum scripts due to user permissions. Lav Patel at UKMC has offered some solutions:
GRANT ALL PRIVILEGES ON DATABASE i2b2 to i2b2;
ALTER USER i2b2 with SUPERUSER;
select change_schema_owner('i2b2demodata', 'i2b2');
select change_schema_owner('i2b2metadata', 'i2b2');
select change_schema_owner('i2b2pm', 'i2b2');
select change_schema_owner('i2b2hive', 'i2b2');
The scripts produce three outputs:
Parent folders will get counts (of all patients with facts in the leaves) except for ontology folders derived from visit_dimension or patient_dimension. These cannot be rolled up because of the way these terms are defined in the ontology. They will have no count at all (not a zero).