  1. i2b2 Core Software
  2. CORE-61

The crc_loader (triggered via a web service) does not correctly load data into the patient dimension



    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 1.5.2
    • 1.7.04
    • CRC Cell
    • None
    • i2b2 1.5.2 running on Linux, using JDBC to connect to a SQL Server instance.
      uname -a reported the following: Linux lamp-api-16 #1 SMP 2011-07-14 14:47:44 +0200 x86_64 x86_64 x86_64 GNU/Linux
    • Did not change anything. This works except possibly in postgres. There might be a timestamp issue, but it is minor and it is Java so Mike should fix it, in 1.7.05 I think.


      I've been working on producing a command line upload facility which uses the inbuilt web services route to import PDO data into i2b2. This looks an attractive route as we have a need to make small, regular imports of data based around a questionnaire. The questionnaire gains patient consent (required in the UK) as well as providing a fair amount of further information. For us this is what kicks off entry of a patient into i2b2. The plan is to augment this with further data from clinical systems once consent (and basic data) has been loaded.

      Having the facility as a command line utility enables us to script the upload so that it can be automated.

      I've studied the workbench import plugin to learn how to do this, and so far we have a utility which can log on to the PM cell, upload a PDO file to the File Repository cell, and trigger the CRC loader to process the file. I'm particularly taken with the idea of the hive allocating its own internal identifiers for new patients. We have tried two forms of PDO: one with i2b2 identifiers explicitly set (patient_num and encounter_num with HIVE as a source), and one where we leave the CRC loader to do the allocating and the sources within the PDO are only external.

      We are encountering problems whichever form of PDO we try. Both have the same failures for the same tables:

      (a) Patient_Mapping is perfect as far as I can see.
      (b) Encounter_Mapping could be OK. Each row seems to lack the patient-id and its source. But the encounter-id and its source are there.
      (c) Patient_Dimension. All the optional columns are null when they shouldn't be (e.g. Birth_Date, Sex_Cd, Age_In_Years_Num).
      (d) Visit_Dimension is empty when it shouldn't be.
      (e) Observation_Fact is perfect as far as I can see. Everything recorded as it should be.

      This is annoying, because the two tables which look the most difficult to get correct are correct as far as I can see.
      Looking at the relevant temporary tables:

      (1) The temporary eid mapping (corresponding to point b above) has the same data in it.
      (2) The temporary patient (point c above) has the same data in it: all the optional columns are null, which is clearly wrong.
      (3) The temporary visit (point d above) is not empty and may be as it should be (although it is missing the internally allocated encounter_num).

      I've re-read the documentation.
      I've looked at the example PDOs.
      I've examined the programme that sends the PDO with its processing options. This is the relevant section of code:

                              LoadOptionType loType = new LoadOptionType() ;
                              loType.setEncryptBlob( false ) ;
                              loType.setIgnoreBadData( true ) ;

                              FactLoadOptionType floType = new FactLoadOptionType() ;
                              // appendFlag is supplied by the command line args...
                              floType.setAppendFlag( this.appendFlag ) ;
                              // NOTES:
                              // (1) Need to consider the possible differences here.
                              // (2) ObserverSet is not requested.
                              loadType.setLoadPidSet( loType ) ;
                              loadType.setLoadObservationSet( floType ) ;
                              loadType.setLoadEidSet( loType ) ;
                              loadType.setLoadEventSet( loType ) ;
                              loadType.setLoadPatientSet( loType ) ;

      I cannot see many clues of where we are going wrong at the moment. Given the order of processing, it could be the encounter mapping or the patient dimension that is letting us down.

      As an alternative to using the web service upload, we have another scripted route that uses an XSLT transform of the PDO to generate SQL insert commands. This is our fall-back position. When we try this alternative route we get usable data within i2b2, so this somewhat validates our PDO formatting procedure.

      After writing the above, I have been code reading the crc_loader. There appear to be at least two problems with the code...

      The CRCLoaderDAO uses the attribute Name rather than Column as a source of keys to establish a map of params (see method buildNVParam(List<ParamType> paramTypeList) ). When this map is used within the PatientDAO it uses constants which reflect the column names, so nothing is found within the map. When the CRCLoaderDAO is changed, things start to appear. But...
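A minimal sketch of the key mismatch described above. It does not reproduce the actual crc_loader source: `Param` is a hypothetical stand-in for the PDO `ParamType`, assumed to carry a display name, a column and a value, and `ParamMapSketch` with its two methods is invented for illustration only.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for ParamType (assumption: it exposes name, column and value).
final class Param {
    final String name;    // display label, e.g. "Date of Birth"
    final String column;  // target column, e.g. "birth_date"
    final String value;

    Param(String name, String column, String value) {
        this.name = name;
        this.column = column;
        this.value = value;
    }
}

final class ParamMapSketch {
    // Behaviour as described in the report: keys come from the name attribute,
    // so a later lookup by column name finds nothing.
    static Map<String, Param> buildByName(List<Param> params) {
        Map<String, Param> map = new HashMap<>();
        for (Param p : params) {
            if (p.name != null) {
                map.put(p.name, p);
            }
        }
        return map;
    }

    // Adjusted behaviour: keys come from the column attribute, which matches
    // lookups that use column-name constants.
    static Map<String, Param> buildByColumn(List<Param> params) {
        Map<String, Param> map = new HashMap<>();
        for (Param p : params) {
            if (p.column != null) {
                map.put(p.column, p);
            }
        }
        return map;
    }
}
```

With a single param whose name is "Date of Birth" and whose column is "birth_date", `buildByName(...).get("birth_date")` returns null, whereas `buildByColumn(...).get("birth_date")` returns the param. That would be consistent with the optional columns ending up null.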

      The PatientDAO has not been adjusted to use dates correctly. It tries to load plain string data as an SQL timestamp, and then fails.
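One way to turn a plain date string into an SQL timestamp is sketched below. This is not the change made to PatientDAO: the class and method names are hypothetical, the accepted formats (ISO-8601 date, local date-time, or offset date-time) are an assumption about what the PDO carries, and `java.time` requires Java 8 or later.

```java
import java.sql.Timestamp;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.format.DateTimeParseException;

final class PdoDateSketch {
    // Converts an ISO-8601 string to java.sql.Timestamp.
    // Returns null for null or blank input; throws DateTimeParseException
    // if the text matches none of the three assumed formats.
    static Timestamp toTimestamp(String text) {
        if (text == null || text.trim().isEmpty()) {
            return null;
        }
        String s = text.trim();
        try {
            // e.g. 2011-07-14T14:47:44+02:00
            return Timestamp.from(OffsetDateTime.parse(s).toInstant());
        } catch (DateTimeParseException ignored) {
            // fall through to the next format
        }
        try {
            // e.g. 2011-07-14T14:47:44 (interpreted in the JVM's default time zone)
            return Timestamp.valueOf(LocalDateTime.parse(s));
        } catch (DateTimeParseException ignored) {
            // fall through to the next format
        }
        // e.g. 2011-07-14 (midnight in the JVM's default time zone)
        return Timestamp.valueOf(LocalDate.parse(s).atStartOfDay());
    }
}
```

The resulting value could then be bound with `PreparedStatement.setTimestamp`, using `setNull` with `java.sql.Types.TIMESTAMP` when the method returns null.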

      After making appropriate code changes, the patient_dimension loads correctly as far as I can see, but I still have no rows inserted into the visit_dimension. I've briefly looked at 1.6 too, and I believe the same changes should be applied there.





            jmd86 Janice Donahoe
            jefflusted Jeff Lusted