i2b2 Core Software / CORE-204

Duplicate data in GitHub repo i2b2-data

Details

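    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.07
    • Fix Version/s: 1.7.07
    • Component/s: Data
    • Labels: None
    • i2b2 Sponsored Project/s: i2b2 Core
    • Developer Notes: Removed duplicate files. Deleted the release_1-7 folder and all its contents.
    • Testing Notes: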
      TEST STATUS: Completed
      COMPLETION DATE: 01/13/2016
      TESTED BY: Janice Donahoe

      ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

      Test Date: 01/13/2016
      Build Number: 1.7.07.0002
      Test Status: Passed Testing

      Clients Tested :
           Not applicable

      Environments Tested :
           Browsers: Not applicable for this test
           Databases: Oracle, PostgreSQL, SQL Server
           Client OS: Not applicable for this test

      Test Comments:
      Tested with the latest Data build and it appears to be working correctly. Bamboo did not have any errors when running the install scripts.

      ISSUES FOUND:
      An unrelated issue was found with the Bamboo scripts. The tests for age-related queries failed due to the new year. These tests will be updated to reflect the new age of the test patients.

    Description

      The total size of all files inside the GitHub repo i2b2-data (the equivalent of the old i2b2createdb artifact) is 3 GB. Previously, this artifact took up 1.51 GB when unzipped.

      Looking at the repo, there is the "edu.harvard.i2b2.data" folder, which has always been present. However, there is also an additional "release_1-7" folder which contains the same data as the "Release_1-7" folder underneath "edu.harvard.i2b2.data".

      This duplicate data is unnecessary, and one of these folders should be removed. Looking at the repo, it appears the "release_1-7" folder hasn't been committed to in 6 months, while "edu.harvard.i2b2.data" has seen recent activity.
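
      One way to confirm that two folders really hold the same data is to hash every file in both trees and compare the results by relative path. The following is a minimal sketch of that check, not a tool from the i2b2 project: the function names are hypothetical, and it assumes both folders are available in a local checkout.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path relative to root to its content digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            digests[os.path.relpath(full, root)] = file_digest(full)
    return digests


def compare_trees(left_root, right_root):
    """Return (identical, differing, only_left, only_right) relative paths."""
    left = tree_digests(left_root)
    right = tree_digests(right_root)
    shared = left.keys() & right.keys()
    identical = sorted(p for p in shared if left[p] == right[p])
    differing = sorted(p for p in shared if left[p] != right[p])
    only_left = sorted(left.keys() - right.keys())
    only_right = sorted(right.keys() - left.keys())
    return identical, differing, only_left, only_right
```

      Calling compare_trees on "release_1-7" and "edu.harvard.i2b2.data/Release_1-7" would show how much of the content is byte-for-byte identical. Files listed as differing or present on only one side would need review before either folder is deleted.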

        Activity

          kdwyer Keith Dwyer created issue -
          jmd86 Janice Donahoe added a comment -
          Thanks Keith! I will take a look at this and fix it right away. I will also place a note in the current 1.7.07-RC1 release not to download as data is duplicated.
          jmd86 Janice Donahoe made changes -
          Field                      Original Value   New Value
          Fix Version/s                               1.7.07 [ 10201 ]
          Assignee                                    Janice Donahoe [ jmd86 ]
          i2b2 Sponsored Project/s                    i2b2 Core [ 10196 ]
          Status                     New [ 10000 ]    Open [ 1 ]
          jmd86 Janice Donahoe made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          kdwyer Keith Dwyer added a comment -
          Would it be possible to update both the master branch and the 1.7.07-RC1 tag with the removed duplicate data directory?
          Additionally, which folder (edu.harvard.i2b2.data/ or release_1-7/) is the one that will remain in the repository? My first guess would be the one that has most recently been updated (edu.harvard.i2b2.data, which matches the old createdb archives from i2b2.org). However, the previous 1.7.06 tag does not have this folder (it only has release_1-7/, which does not match the old createdb archives from i2b2.org).

          This would affect our internal deployment scripts.
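
          Because the data folder differs between tags (the 1.7.06 tag has only release_1-7/, while later checkouts have edu.harvard.i2b2.data/), a deployment script can probe for whichever layout is present instead of hard-coding one path. The following is a minimal sketch of that idea. The function name, the candidate ordering, and the error handling are assumptions for illustration and are not taken from any i2b2 or site-specific script.

```python
import os

# Candidate data folders, relative to the repository root. The folder names
# come from the issue text; trying the newer layout first is an assumption.
CANDIDATE_DATA_DIRS = (
    os.path.join("edu.harvard.i2b2.data", "Release_1-7"),
    "release_1-7",
)


def find_data_dir(repo_root, candidates=CANDIDATE_DATA_DIRS):
    """Return the first candidate directory that exists under repo_root."""
    for relative in candidates:
        path = os.path.join(repo_root, relative)
        if os.path.isdir(path):
            return path
    tried = ", ".join(candidates)
    raise FileNotFoundError(
        f"no known data folder under {repo_root!r}; tried: {tried}"
    )
```

          With this approach the same script could run against a checkout of either layout, and it fails loudly if neither folder exists.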
          jmd86 Janice Donahoe made changes -
          Developer Notes Removed duplicate files. Deleted the release_1-7 folder and all its contents.
          Status In Progress [ 3 ] Ready to Test [ 10001 ]
          jmd86 Janice Donahoe made changes -
          Status Ready to Test [ 10001 ] Testing [ 10002 ]
          jmd86 Janice Donahoe made changes -
          Testing Notes TEST STATUS: Completed
          COMPLETION DATE: 01/13/2016
          TESTED BY: Janice Donahoe

          ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

          Test Date: 01/13/2016
          Build Number: 1.7.07.0002
          Test Status: Passed Testing

          Clients Tested :
               Not applicable

          Environments Tested :
               Browsers: Not applicable for this test
               Databases: Oracle, PostgreSQL, SQL Server
               Client OS: Not applicable for this test

          Test Comments:
          Tested with the latest Data build and it appears to be working correctly. Bamboo did not have any errors when running the install scripts.

          ISSUES FOUND:
          An unrelated issue was found with the Bamboo scripts. The tests for age-related queries failed due to the new year. These tests will be updated to reflect the new age of the test patients.
          Status Testing [ 10002 ] Testing [ 10002 ]
          jmd86 Janice Donahoe added a comment -
          I updated the master branch and it now has a new tag called v1.7.07-RC2. The folder that remains is edu.harvard.i2b2.data. We decided to keep this one to prevent any problems sites may have running their ant scripts. The tag and zip file for the final released product will be called v1.7.07.
          jmd86 Janice Donahoe made changes -
          Resolution Fixed [ 1 ]
          Status Testing [ 10002 ] Resolved [ 5 ]
          jmd86 Janice Donahoe added a comment - - edited
          On 01/22/2016, the 1.7.07 Release was made available at the following locations.

          https://www.i2b2.org/software/
           - zip files for release 1.7.07 are available on this site. This includes both the code and documentation.

          https://github.com/i2b2
           - source code has been tagged with v1.7.07.
          jmd86 Janice Donahoe made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

          People

            Assignee: jmd86 Janice Donahoe
            Reporter: kdwyer Keith Dwyer
            Votes: 0
            Watchers: 2
