Encryption
Overview
When downloading studies, users may choose to encrypt the downloaded study; encryption is on by default. This page explains how encryption in mi2b2 is accomplished. It serves as a reference for those who want to determine whether the strength and techniques of mi2b2 encryption are sufficient and suitable for their particular use case. That goal necessitates some technical detail on cryptography.
For how to set up an encryption key and configure other download settings, please refer to this page.
In this document, "random" and "randomly" mean "pseudo-random" and "pseudo-randomly": the randomness is produced by a deterministic algorithm that simulates true randomness, which can only be reliably obtained by observing natural phenomena.
Encryption Standard
The mi2b2 client is written in Java 6 Standard Edition and uses its javax.crypto package. This package offers a number of encryption algorithms. We have chosen the AES algorithm with 128-bit keys.
AES is used in Cipher Block Chaining (CBC) mode so that identical blocks of image data do not encrypt to identical blocks of ciphertext, which could otherwise leave an encrypted image partially readable. CBC mode requires an initialization vector (a list of random numbers that starts the encryption process), and the initialization vector is randomly generated before an encryption occurs.
A key (randomly generated via the key-generation mechanism or user-provided) and the initialization vector are then used in the encryption process.
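The following is a minimal sketch of AES-128 in CBC mode with a randomly generated initialization vector, using the standard javax.crypto classes. The class name AesCbcSketch and its methods are hypothetical, and the PKCS5 padding scheme is an assumption; this page only names AES, CBC, and the 128-bit key length.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

/** Illustrative only: AES-128 in CBC mode with a random IV. */
final class AesCbcSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Generates a random 128-bit AES key. */
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128, RANDOM);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Generates a random IV; for AES it is one block (16 bytes) long. */
    static byte[] newIv() {
        byte[] iv = new byte[16];
        RANDOM.nextBytes(iv);
        return iv;
    }

    static byte[] encrypt(SecretKey key, byte[] iv, byte[] plain) {
        return run(Cipher.ENCRYPT_MODE, key, iv, plain);
    }

    static byte[] decrypt(SecretKey key, byte[] iv, byte[] encrypted) {
        return run(Cipher.DECRYPT_MODE, key, iv, encrypted);
    }

    private static byte[] run(int mode, SecretKey key, byte[] iv, byte[] input) {
        try {
            // Assumption: PKCS5 padding. The source does not state the padding scheme.
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(mode, key, new IvParameterSpec(iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because each ciphertext block is mixed into the next one, two identical 16-byte plaintext blocks in the same input encrypt to different ciphertext blocks, which is the property CBC mode is chosen for here.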
Encrypting a Study
When users download a study from the mi2b2 server's cache, the study is copied from the cache and written to the user's download location. The study resides on the cache as a zip file. As the zip file is streamed to the user's mi2b2 client, the client unzips and encrypts the data as it arrives. That is, the client does not wait for the entire zip file to finish downloading before unzipping and encrypting. As long as there is enough data to unzip, the client will unzip, encrypt that chunk of data, and write to disk when appropriate.
This ensures that an unencrypted zip file never sits at the download location. The measure prevents an attacker from, for example, interrupting the power supply after the download completes but before encryption takes place in order to obtain unencrypted images.
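The streaming approach described above can be sketched with ZipInputStream and CipherOutputStream from the standard library. This is an illustration under stated assumptions, not the mi2b2 implementation: the class StreamingEncryptSketch and the Sink hook are hypothetical, the padding scheme is assumed, and the output starts with a bare 16-byte IV instead of the header mi2b2 actually writes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

/** Illustrative only: unzip and encrypt a stream chunk by chunk. */
final class StreamingEncryptSketch {
    /** Hypothetical hook that opens the destination for one entry, e.g. a FileOutputStream. */
    interface Sink {
        OutputStream open(String entryName) throws IOException;
    }

    static void unzipAndEncrypt(InputStream zipped, SecretKey key, Sink sink)
            throws IOException, GeneralSecurityException {
        SecureRandom random = new SecureRandom();
        ZipInputStream zip = new ZipInputStream(zipped);
        byte[] buffer = new byte[8192];
        ZipEntry entry;
        while ((entry = zip.getNextEntry()) != null) {
            if (entry.isDirectory()) {
                continue;
            }
            byte[] iv = new byte[16];
            random.nextBytes(iv);
            // Assumption: PKCS5 padding; the source names only AES and CBC.
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));

            OutputStream raw = sink.open(entry.getName());
            raw.write(iv); // simplified header for this sketch only
            CipherOutputStream encrypted = new CipherOutputStream(raw, cipher);
            int read;
            while ((read = zip.read(buffer)) != -1) {
                // Unzipped bytes go straight into the cipher; only ciphertext reaches the sink.
                encrypted.write(buffer, 0, read);
            }
            encrypted.close(); // writes the final padded block and closes raw
        }
    }
}
```

The unzipped bytes exist only in the in-memory buffer, so nothing unencrypted is written to the destination at any point.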
Writing Encrypted Files
When writing an encrypted file, mi2b2 prepends a cryptographic hash (a secure one-way function) of the key, followed by the initialization vector, to the output file. When an encrypted image is opened, this allows mi2b2 to verify whether the key the user supplied is the same as the key used for encryption, without divulging the real key.
The key is hashed with SHA-256, which produces a hash that is 256 bits (32 bytes) long. The initialization vector is the length of the key (128 bits, or 16 bytes) plus 2 more bytes, 18 bytes in total. This means every encrypted image file is prepended with 32 + 18 = 50 bytes of information. Otherwise the encrypted image files are exactly the same size as the unencrypted ones.
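The 50-byte layout can be sketched as follows. The class EncryptedFileWriterSketch is hypothetical. The page does not say what the two extra bytes of the initialization vector field contain; this sketch assumes they are the tag and length bytes that Java's AlgorithmParameters.getEncoded() adds when it encodes a 16-byte AES IV, which happens to yield 18 bytes. It also assumes the hash is taken over the raw key bytes and that PKCS5 padding is used, so the encrypted body here grows by up to 16 bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only: key hash + IV field + ciphertext. */
final class EncryptedFileWriterSketch {
    static final int HASH_LENGTH = 32;      // SHA-256 digest of the key
    static final int IV_FIELD_LENGTH = 18;  // 16-byte IV plus 2 extra bytes
    static final int HEADER_LENGTH = HASH_LENGTH + IV_FIELD_LENGTH; // 50

    static byte[] encryptWithHeader(byte[] keyBytes, byte[] image) {
        try {
            SecretKeySpec key = new SecretKeySpec(keyBytes, "AES");
            // Assumption: PKCS5 padding; the source does not state the padding scheme.
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key); // the provider generates a random IV

            // Assumption: the hash covers the raw key bytes.
            byte[] keyHash = MessageDigest.getInstance("SHA-256").digest(keyBytes);
            // Assumption: the IV field is the encoded form of the IV (tag, length, 16 bytes).
            byte[] ivField = cipher.getParameters().getEncoded();

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            out.write(keyHash);               // bytes 0..31
            out.write(ivField);               // bytes 32..49
            out.write(cipher.doFinal(image)); // encrypted image data
            return out.toByteArray();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Only the hash of the key is stored, so reading the header does not reveal the key itself.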
The study zip file is organized hierarchically by Study UID (top level), Series ID, and Image Instance ID. The first two levels are directories named by their IDs. These names are produced according to DICOM standards, and those who know how to read the IDs can identify the institution, department, machine, and date/time at which a particular study started, then use the assembled information to narrow down or identify the patient.
To prevent this kind of attack, when encryption is chosen, mi2b2 renames the studies, series, and images. Study UIDs and Series IDs are hashed using the MD5 cryptographic hash function. The hashed value (16 bytes long) is written out as a string of 16 two-digit hexadecimal numbers (32 characters), prepended with "Study-" or "Series-" as appropriate, and appended with "encrypted". The image files, on the other hand, are named "Study-num", where num stands for the instance number of that image in its series. The image files are also appended with "-encrypted". Keeping the instance number helps users identify which files appear before which.
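A minimal sketch of the directory renaming follows. The class HashedNameSketch is hypothetical, and several details are assumptions: the ID is encoded as UTF-8 before hashing, the hexadecimal digits are lowercase, and a "-" separates the hash from "encrypted".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative only: replaces identifying DICOM IDs with MD5-based names. */
final class HashedNameSketch {
    /** Returns the MD5 digest of the ID as 32 lowercase hexadecimal characters. */
    static String md5Hex(String id) {
        try {
            // Assumption: the ID string is hashed as UTF-8 bytes.
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(id.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff)); // two hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Assumption: a "-" separates the hash from the "encrypted" suffix.
    static String studyDirectoryName(String studyUid) {
        return "Study-" + md5Hex(studyUid) + "-encrypted";
    }

    static String seriesDirectoryName(String seriesId) {
        return "Series-" + md5Hex(seriesId) + "-encrypted";
    }
}
```

Calling studyDirectoryName with any UID yields a fixed-length name from which the institution, machine, and date can no longer be read directly.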
Decrypting Images
The mi2b2 client first reads the first 50 bytes of the encrypted file. It applies the same SHA-256 algorithm to the key the user supplied and compares the result with the hash of the key stored in the file (the first 32 bytes). If the two match, the mi2b2 client uses the last 18 of the 50 bytes to recreate the initialization vector. The key (supplied by the user and verified to match the key used for encryption) and the initialization vector are then used to initialize the decryption algorithm.
Decryption occurs when users select a series from the Image Browser or move the image slider in the Image Viewer. For most images, decryption is fairly fast. However, users may experience a lack of responsiveness with some larger images, and some modalities are worse than others. All decrypted data are stored only in RAM.
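The verification and decryption steps can be sketched as follows. The class EncryptedFileReaderSketch is hypothetical. It assumes that the hash covers the raw key bytes, that the 18-byte field is the encoded form produced by Java's AlgorithmParameters.getEncoded() for an AES IV, and that PKCS5 padding was used; none of these details is confirmed by this page.

```java
import java.io.IOException;
import java.security.AlgorithmParameters;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only: verify the key hash, then decrypt in memory. */
final class EncryptedFileReaderSketch {
    static final int HASH_LENGTH = 32;
    static final int HEADER_LENGTH = 50;

    /** Returns the decrypted bytes, or null if the supplied key does not match. */
    static byte[] decrypt(byte[] file, byte[] keyBytes) {
        if (file.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("file is shorter than the 50-byte header");
        }
        try {
            byte[] storedHash = Arrays.copyOfRange(file, 0, HASH_LENGTH);
            byte[] ivField = Arrays.copyOfRange(file, HASH_LENGTH, HEADER_LENGTH);

            // Hash the supplied key and compare it with the stored hash.
            byte[] suppliedHash = MessageDigest.getInstance("SHA-256").digest(keyBytes);
            if (!MessageDigest.isEqual(storedHash, suppliedHash)) {
                return null; // wrong key: do not attempt decryption
            }

            // Assumption: the 18 bytes are the encoded AES parameters (the IV).
            AlgorithmParameters params = AlgorithmParameters.getInstance("AES");
            params.init(ivField);

            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(keyBytes, "AES"), params);
            // The result is returned as a byte array and never written to disk.
            return cipher.doFinal(file, HEADER_LENGTH, file.length - HEADER_LENGTH);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Checking the hash first lets the client reject a wrong key with a clear result instead of producing garbage or a padding error.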