SCILHS Data Mart Script Generator

The SCILHS Data mart script generator is designed to assist administrators by simplifying the process of creating new data marts from an existing i2b2 database. This form will gather the appropriate information to generate a database script that can be run on your new database. The script will create the required CRC, Ontology, and Workplace data tables as well as populate them with the requested data from your existing i2b2 database.

REQUIREMENTS:

The following needs to be done prior to proceeding with this form.

  1. The new database schemas where your new data mart will reside needs to be set up.
    • The script that is generated from this plug-in will create all the required i2b2 tables; therefore, you do not need to create the i2b2 tables that will reside in your new database schemas.
    • However, you do need to know the name of the database schemas so they can be entered into the script correctly.
  2. Need to know the cases (i.e. i2b2 Patient Sets) that will be included and / or excluded in the new data mart.

Script Name:

You have the option of entering a name for the script that will be generated. This name will be added as a comment and will not be used when the script is run. It is simply a way to help you organize and keep track of the scripts you are generating.
Enter up to 255 characters.
Cases:

The new data mart will be created to include a set of cases or cohorts that are based on an existing set of patients. These patient sets were generated in the current i2b2 database and can be added to this form by dragging the name of the patient set from Previous Queries or Workplace to the appropriate box on this form.
Drop the Patient Set here

Identify the Cases to be included.

Click the Patient Set to remove it

Drop the Patient Set here

Identify the Cases to be excluded.

Click the Patient Set to remove it

Database System:

Select the type of Database Management System used in your i2b2 environment. This information is necessary to generate the appropriate script for your type of database.
Database Names:

Enter the CRC, and Ontology, database schema names for your existing i2b2 database. This "existing i2b2 database" will be used as the source directory when your new data mart is created. Therefore it is important you enter these schema names correctly.

Example: In the i2b2 demo data database the CRC database schema name is i2b2demodata, the Ontology database schema name is i2b2metadata, and the Workplace database schema name is i2b2workdata.
Enter the name of the CRC database schema. Limited to 100 characters.

Enter the name of the Ontology database schema. Limited to 100 characters.

When the above form is complete, click the "Submit" button to generate the database script.
Please return to the Specify Data tab and fill out the request form.

SCILHS Data Mart Script Generator

Workflow: Creating New Disease Specific Data Marts

The following workflow is an outline of the steps to follow when requesting and creating a new disease specific data mart for SCILHS related studies. It is important to note that the first few items need to be done prior to generating the database script to create the new data mart.

  1. In the SCILHS SHRINE Network, the Principal Investigator ("PI") runs a query to find the appropriate number of cohorts needed for the study.
  2. The PI flags this query so it can be easily identified in the local instance of the i2b2. The Site Administrator ("Admin") will use this query to generate the set of patients to be included in the new disease specific data mart.
  3. NOTE

    Depending on your study, there may be specific case studies that you do not want included in your new disease specific data mart. In this instance you will need to run a second query with the exclusion criteria and flag it so it can be used later on in the data mart creation process.

  4. The PI submits a request for a new project (data mart) to be created by the local Admin. The request needs to include the following:
    1. The name of the SHRINE query that will be used by the Admin to generate the list of patients to be included in the new data mart.
    2. The name of the SHRINE query that will be used by the Admin to generate the list of patients to be excluded in the new data mart. *Optional*
    3. The username of all i2b2 users who can have access to the new project. The list of users should also include the appropriate level of access for each user.
  5. The Admin creates the new database schemas for the CRC Data, Ontology Metadata, and Workplace tables.
  6. TIP

    Do not create the tables for these schemas as they will be created when running the create data mart script. You do need to make note of the new database schema names as they will be needed when generating the new data mart scripts.

  7. The Admin creates the database users for the new schemas.
  8. IMPORTANT

    Make sure the database user(s) for the new schemas have read / write access.

  9. In the local i2b2, using the SHRINE query flagged by the PI, the Admin creates the list of patients to be included in the new data mart.
  10. The Admin creates the list of patients to be excluded in the new data mart. This step is only necessary if the PI has flagged and identified a second SHRINE query for exclusion.
  11. Using the SCILHS Disease Data Mart Request plug-in, the Admin generates the database scripts for creating the new disease specific data mart.
  12. Generate the scripts for creating the new disease specific data mart by logging into the i2b2 Web Client and using the SCILHS Data Mart Script Generator plug-in.
  13. Additional Resources: Additional information on how to generate the database script can be found in the section titled Generating Data Mart Scripts in i2b2.

  14. The Admin creates the new disease specific data marts by running the data mart scripts created in the previous step. The data mart script will create the required tables and populate them with the appropriate data from your i2b2 database.
  15. Additional Resources: Additional information on how to use the database script can found in the section titled How to use the Database Script.

  16. Update the CRC_DB_LOOKUP, ONT_DB_LOOKUP, and WORK_DB_LOOKUP tables with the new database schema names and project information.
  17. Setup the new project and it's users in the i2b2 Admin module.
  18. Additional Resources: The i2b2 Installation Guide on the i2b2 Community Wiki will provide additional information about how to setup and manage Projects and Users in the i2b2 Admin module.

  19. The Admin notifies the PI that the new project has been setup and is available for use.

Instructions

The SCILHS Data Mart Script Generator plug-in is designed to assist administrators with the creation of a new disease specific data mart. The plug-in will use the information entered on the form to generate a database script. This database script will create a new data mart from an existing one.

Generating Data Mart Scripts in i2b2

The following instructions explain how to use the SCILHS Disease Data Mart Request plug-in to generate the scripts for creating a disease specific data mart in your environment.

  1. Verify the following information is entered on the Specify Data page.
  2. Script Name:

    This is an optional field that can be used to help organize and keep track of the various scripts you may be running. The name entered will be added to the script as a comment and will not be used when the script is fun against the database.

    Cases:

    In simplest terms, cases are a set of existing i2b2 patients that will be included or excluded when the new data mart is created. In order to populate teh new data mart with the appropriate patients and their observations, it is necessary you identify the cases to be included in the new data mart. Identifying the cases to be excluded is optional.

    The patient sets to be included or excluded are generated in the existing i2b2 database and can be added to the form by dragging the name of the patient set from Previous Queries or Workplace to the appropriate box.

    Database System:

    Depending on your environment, you will need to select the Database Management System that is appropriate for your i2b2 database. The plug-in uses this information to generate a script that is specific to your type of database. Currently there are two options; SQL Server and Oracle.

    Database Names:

    The database names are the database schema names for your existing i2b2 database. The script will use this information as the source for the data that will be inserted into the new data mart. Therefore, it is important that you correctly enter the names for your existing i2b2 database so the appropriate data is selected.


  3. Click the Submit button to generate the database script.
  4. IMPORTANT

    The information entered on the Specify Data page will be used to define the variables that will be used by the script when it is run in your environment. Therefore it is important you have defined everything correctly before submitting the form.

  5. The page will automatically change to the View Results tab once the SCILHS Data Mart plug-in finishes creating the database script.
  6. Create a new file on a local machine in your environment.
  7. Copy and paste the script into a file in your local environment.
  8. TIP

    To make it easier for users the plug-in will highlight the entire script in one simple step. All you need to do is use your mouse to click on the text of the generated script. At this point the text for the entire script will be highlighted. To copy the text and paste it in your new file you can use one of the following standard keyboard commands:

               Windows:
                    Copy = Ctrl+C
                    Paste = Ctrl+V

               Macintosh:
                    Copy = Command+C
                    Paste = Command+V

  9. Save the file containing the newly generated script.

How to use the Generated Database Script

The information in this section includes instructions on how to use the database script once it has been generated via the plug-in.

  1. Log into the application you use to manage your databases.
  2. Connect to your NEW database.
  3. Open the database script you previously created.
  4. Run the script from your NEW database.
  5. IMPORTANT

    The database script creates all the required i2b2 tables, indexes, and stored procedures when it is run against your new database schema. If a table with the same name already exists in the new database schema then the script will stop running.

    If the tables do exist in your new database then you will need to delete them before you proceed with running the scripts. Make sure you also delete the indexes and stored procedures if they also exist in your new environment.

    If you ran the script and it stopped running because of existing tables, then you will need to manually delete the existing tables, indexes and stored procedures; This includes any tables the script was able to create before it encountered the problem. Once you have deleted the table(s). you will need to run the script again.

  6. Finished creation of datamart” will display once the script finishes running. At this point your new CRC and Ontology tables have been created and the appropriate data inserted into them.

About the Database Script

What the Database Script Does

The following is an outline of what the database script will do when you run it against your new database.

  1. Drops and creates the required CRC, Ontology, and Workplace tables. (see list of tables created in the section called "Tables Created During This Process")
  2. Populates the QT_QUERY_RESUT_TYPE, QT_QUERY_STATUS_TYPE, and QT_PRIVELEGE tables with data from your existing i2b2 database.
  3. Creates the set of patients to use when selecting the data to insert into the patient oriented tables. This set of patients is based on the "include" and "exclude" patient lists you entered when submitting the form.
  4. Populates the Ontology tables with data from your existing i2b2 database.
  5. Populates the WORKPLACE_ACCESS table with the required data (Shared folder).
  6. Populates the OBSERVATION_FACT, PATIENT_DIMENSION, and VISIT_DIMENSION tables with selected data. All observations, patients, and visit data for the patients listed in your patient set will be inserted into the appropriate tables.
  7. Populates the remaining dimension tables (CONCEPT_DIMENSION, MODIFIER_DIMENSION and PROVIDER_DIMENSION) with data from your existing i2b2 database.
  8. Populates the mapping tables with self-mapping data.
  9. Creates the required indexes for the i2b2 tables.
  10. Creates the required stored procedures for the CRC data mart .

Tables Created During This Process

The following tables are created when you run the database script.

CRC Dimension and Observation Tables

  • CONCEPT_DIMENSION
  • MODIFIER_DIMENSION
  • OBSERVATION_FACT
  • PATIENT_DIMENSION
  • PROVIDER_DIMENSION
  • VISIT_DIMENSION

CRC Mapping Tables

  • PATIENT_MAPPING
  • ENCOUNTER_MAPPING

CRC QT Tables

  • QT_ANALYSIS_PLUGIN
  • QT_ANALYSIS_PLUGIN_RESULT_TYPE
  • QT_BREAKDOWN_PATH
  • QT_PATIENT_ENC_COLLECTION
  • QT_PATIENT_SET_COLLECTION
  • QT_PDO_QUERY_MASTER
  • QT_PRIVILEGE
  • QT_QUERY_INSTANCE
  • QT_QUERY_MASTER
  • QT_QUERY_RESULT_INSTANCE
  • QT_QUERY_RESULT_TYPE
  • QT_QUERY_STATUS_TYPE
  • QT_XML_RESULT

Ontology Tables

  • CODE_LOOKUP
  • PCORNET_DEMO
  • PCORNET_DIAG
  • PCORNET_ENC
  • PCORNET_PROC
  • PCORNET_VITAL
  • PCORNET_ENROLL
  • SCHEMES
  • TABLE_ACCESS

Workplace Tables

  • WORKPLACE
  • WORKPLACE_ACCESS

Glossary

Database Terms Defined

To avoid confusion the following table contains database terms that are used within this document and may have different meanings depending upon the database management system you use.

Term Oracle Definition 1 SQL Server Definition 2
Database An Oracle database is a collection of data treated as a unit. The purpose of a database is to store and retrieve related information. A collection of information, tables, and other objects organized and presented to serve a specific purpose, such as searching, sorting, and recombining data. Databases are stored in files.
Instance A system global area (SGA) and the Oracle background processes constitute an Oracle database instance. Every time a database is started, a system global area is allocated and Oracle background processes are started. The SGA is deallocated when the instance shuts down. A copy of SQL Server that is running on a computer.
Schema Collection of database objects, including logical structures such as tables, views, sequences, stored procedures, synonyms, indexes, clusters, and database links. A schema has the name of the user who controls it. The names of tables, fields, data types, and primary and foreign keys of a database. Also known as the database structure.
*Microsoft refers to this as the “database schema”.

NOTE

The behavior of schemas changed in SQL Server 2005. Schemas are no longer equivalent to database users; each schema is now a distinct namespace that exists independently of the database user who created it. In other words, a schema is simply a container of objects. A schema can be owned by any user, and its ownership is transferable. 3

About Us

This plugin was created by Mike Mendis and Janice Donahoe

Terms of Use

This plugin is distributed with the i2b2 Web Client Framework and may be used free of charge.

References