IDRT - Integrated Data Repository Toolkit
Space shortcuts
Space Tools

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Disabling Index Maintenance during Loading

As mentioned in the query optimization page, the i2b2 database schema contains several indexes to speed up query performance. When data is being loaded into indexed tables, the databases has to simultaneously update or extend both the actual table row data as well as the related index structures, in order to maintain index consistency at all times. Depending on the number of indexes this can incur a notable performance impact due to concurrent write operations across different areas of the physical hard drives. This effect becomes especially large when tables are not incrementally extended (like in a transactional setting), but truncated and fully loaded (like in a typical data warehouse setting). 

...

Especially during the design phase or when new data types are being added to an i2b2 warehouse, it may be necessary to frequently reload only a specific portion of the overall dataset (e.g. only lab data). If the OBSERVATION_FACT table has been partitioned (as described in the page on query optimization), individual partitions can be quickly truncated, which can be notably faster than issuing DELETE statements for individual records or wildcarded concept codes.

...