Image De-identification and Distribution Specifications

The HIRO follows a set of standard guidelines when de-identifying and organizing image data. Unless you request otherwise, the HIRO will process and format your image data according to the guidelines described below.

Image Format

File Formats

The HIRO's default image format is DICOM, the standard medical image format for the healthcare industry. We can also provide MR images in NIfTI format if desired (we will convert the data from DICOM to NIfTI using the dcm2nii program provided with the MRIcroN software package). If you require MR image data in PAR-REC and/or SPECTRO format, you will need to make special arrangements with the HIRO before the scans in question are performed. You will also need to make special arrangements if you require "raw" data (like CT sinogram data) or "stacked" DICOM data (also known as enhanced DICOM format).

All image data will be provided in DICOM format unless (1) the data is only available in another format, or (2) the requesting user explicitly specifies an alternate format. The HIRO will generally not provide image data in JPEG, TIFF, BMP or other similar formats because they are non-diagnostic and because many free programs available on the Internet will convert DICOM images into any of these formats (the HIRO recommends ImageJ). If you would like the HIRO's assistance with converting a large number of DICOM files into another format (like JPEG), please feel free to contact us (hirohelp [at] bsd [dot] uchicago [dot] edu) and we will try to assist you.

Image Viewers

The HIRO does not normally provide copies of, or support for, DICOM image viewers. There are many free DICOM viewing and manipulation programs available on the Internet, and we can make recommendations if necessary. There are also links to some popular DICOM viewers on our Useful Links page (under Software and Utilities). If you require that a DICOM viewer be included with your image request, please indicate this in the comments of your request.

Image File Organization

When processing data for an image request, the HIRO offers two different methods for organizing the resulting image files. Both methods are described below. If you do not specify which method you would like the HIRO to use, the first method (Type 1) will be used (unless your data request is for a clinical trial).

TYPE 1 - Exam-Series Structure (default)

Example of a HIRO image data file tree (type 1)

The HIRO's default directory (folder) structure can be generalized as patient-study-series-images. That is, image files will first be organized by patient (or subject), then by the study (or exam or scan), and then by the image sequences within each exam (also known as image series). The series directories will contain the corresponding DICOM image files. An example of this structure is shown in the figure at right. In this example, there are five patients (labeled with subject numbers PATIENT1_00001, PATIENT2_00002, etc). The directory (folder) for PATIENT3_00003 has been expanded to show directories for three different scans performed on this subject (labeled with names that start with 'study_'). The directory for one of these scans has been expanded to show directories for the different sequences (or series) contained within the scan; in this case, it is a PET/CT scan that contains several different scan reconstructions. Within each series directory are its corresponding images.

The directories (folders) themselves are named according to the following convention: the patient directories will be named with the patient's subject number. The exam directories will be named with "study_dddddddd_hhhhhh_Description" where 'dddddddd' is the eight-digit exam date, 'hhhhhh' is a number that is unique to the exam, and 'Description' is a generic description of the exam (for example, 'CT-CHEST-WO'). The series directories will be named with "MMn_Description_hhh" where 'MM' is the modality of the series (CT, MR, etc), 'n' is the series number, 'Description' is a description of the series as entered by the imaging technologist, and 'hhh' is a code unique to the series. The image files themselves will be named as 'yyyyy_hhhhh.dcm' where 'yyyyy' is the image number within the series and 'hhhhh' is a number that is unique to the image.
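The naming convention above can be sketched in a few lines of Python. This is only an illustration, not the HIRO's actual tooling: the function names are hypothetical, the eight-digit exam date is assumed to be in YYYYMMDD order, and the image number is assumed to be zero-padded to five digits. The unique codes are passed in as plain strings because the text does not say how they are generated.

```python
from datetime import date
from pathlib import PurePosixPath


def exam_dir_name(exam_date: date, exam_code: str, description: str) -> str:
    """Build 'study_dddddddd_hhhhhh_Description' (date order YYYYMMDD is assumed)."""
    return f"study_{exam_date:%Y%m%d}_{exam_code}_{description}"


def series_dir_name(modality: str, series_number: int, description: str, series_code: str) -> str:
    """Build 'MMn_Description_hhh', e.g. 'CT2_AXIAL_abc'."""
    return f"{modality}{series_number}_{description}_{series_code}"


def image_file_name(image_number: int, image_code: str) -> str:
    """Build 'yyyyy_hhhhh.dcm' (zero-padding of the image number is assumed)."""
    return f"{image_number:05d}_{image_code}.dcm"


def type1_path(subject_id, exam_date, exam_code, exam_desc,
               modality, series_number, series_desc, series_code,
               image_number, image_code) -> PurePosixPath:
    """Assemble the patient/study/series/image path for the Type 1 layout."""
    return (
        PurePosixPath(subject_id)
        / exam_dir_name(exam_date, exam_code, exam_desc)
        / series_dir_name(modality, series_number, series_desc, series_code)
        / image_file_name(image_number, image_code)
    )


# Hypothetical values, chosen only to show the shape of the result.
example = type1_path(
    "PATIENT3_00003", date(2020, 1, 15), "123456", "CT-CHEST-WO",
    "CT", 2, "AXIAL", "abc", 7, "00042",
)
print(example)
```

With these made-up inputs the path has four components: the subject directory, the study directory, the series directory, and the image file.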

TYPE 2 - DICOMDIR-compliant Structure

Example of a HIRO image data file tree (type 2)

The HIRO can also format your data into a DICOMDIR-compliant directory structure. This structure can be generalized as patient-exam-images. Data formatted in this way will comply with the DICOM Part 10 off-line media standard. This format is particularly useful for image data that will be burned to a disc and/or image data that you intend to review using clinical viewing software. This format is recommended for data that will be submitted to clinical trial central reviewers. If you request image data for a clinical trial, the HIRO will use this format by default. An example of this structure is shown in the figure at right.

The directories themselves are named according to the following convention: the patient directories will be named with the patient's subject number. The exam directories will be named with "study_hhhhhh_dddddddd" where 'hhhhhh' is a number that is unique to the exam and 'dddddddd' is the eight-digit exam date. Within each exam directory will be a file named "DICOMDIR" and a directory named "DICOM". The DICOMDIR file is an index file that can be read by most clinical image viewing programs. The DICOM directory will contain all of the images for the exam. The image files themselves will be named as 'IM_nnnnn' where 'nnnnn' is a number that is unique to the image.
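As a rough sketch of the Type 2 layout described above, the following Python lists the relative paths one exam would contain. The function name is hypothetical, the exam date is taken as an already formatted eight-digit string, and the image numbers are assumed to be zero-padded to five digits. It only builds path names; it does not write a real DICOMDIR index.

```python
from pathlib import PurePosixPath


def type2_exam_layout(subject_id: str, exam_code: str, exam_date: str, image_numbers):
    """Return the relative paths for one exam in the DICOMDIR-compliant layout.

    Each exam directory holds a DICOMDIR index file and a DICOM directory
    containing every image for the exam, named IM_nnnnn.
    """
    exam_dir = PurePosixPath(subject_id) / f"study_{exam_code}_{exam_date}"
    paths = [exam_dir / "DICOMDIR"]
    paths += [exam_dir / "DICOM" / f"IM_{n:05d}" for n in image_numbers]
    return [str(p) for p in paths]


# Hypothetical subject, exam code, and image numbers.
layout = type2_exam_layout("PATIENT3_00003", "123456", "20200115", [1, 2, 3])
for entry in layout:
    print(entry)
```

Note that, unlike Type 1, there is no series level: all images for an exam sit side by side in the DICOM directory, and the DICOMDIR file carries the information a viewer needs to group them.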

Data Compression

Once the HIRO has finished processing your images, the data will normally be compressed to reduce its size and to make it easier to download/transfer. The HIRO uses industry-standard, lossless zip compression. This type of compression will not impact the quality of the image data in any way, and most major operating systems (including Windows and Mac) have built-in tools for automatically uncompressing zipped data. The HIRO may compress your data into a single or multiple zip files depending on size.
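If you would like to confirm for yourself that zip compression is lossless, one approach is to compare a checksum of a file before compression and after extraction. The sketch below uses only the Python standard library; the file name and payload are placeholders, and the Deflate method is an assumption (the text above says only "zip compression").

```python
import hashlib
import tempfile
import zipfile
from pathlib import Path


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as hex."""
    return hashlib.sha256(data).hexdigest()


def zip_roundtrip_is_lossless(payload: bytes, member_name: str = "series/00001_00001.dcm") -> bool:
    """Zip a payload, unzip it again, and compare checksums."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "images.zip"
        # Write the payload into a new archive using Deflate compression.
        with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
            zf.writestr(member_name, payload)
        # Read it back out of the archive.
        with zipfile.ZipFile(archive) as zf:
            restored = zf.read(member_name)
    return sha256_hex(restored) == sha256_hex(payload)


# Placeholder bytes standing in for the contents of an image file.
sample = bytes(range(256)) * 64
print(zip_roundtrip_is_lossless(sample))
```

The same comparison works on real files: hash the original, extract the copy from the archive, and hash it again.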

CD/DVD Formatting

Disc Formatting

Strictly speaking, the DICOM specification requires that CDs/DVDs conform to a specific directory structure. The length of the image file paths is limited, and the names of the files themselves must adhere to a generic naming convention. The disc must also contain a special index file called a DICOMDIR file that describes the content on the disc. Although this formatting is not terribly useful from a research perspective, it is often beneficial (and sometimes required) if the images must be opened by a clinical image viewing program or imported into a PACS.

As such, the HIRO has the capability to create CDs and DVDs containing your image data using either file organization method described in the image file organization section above. The HIRO can format your image data using either of these methods for cloud-based delivery or electronic submission as well. If you are requesting image data for a clinical trial, the HIRO recommends the DICOMDIR-compliant format, as this format will likely be the most familiar to the study sponsor.

Disc Labeling

If you request discs for a clinical trial, the HIRO will label them with the study's name (or abbreviated name) and number, the site number (if applicable), the subject ID number, the exam time point, the exam date, and the exam type. If you request discs for basic research purposes, the HIRO will label them with your name, the study name, and the HIRO request number. Disc labels for clinical trials will not contain references to the HIRO or the University of Chicago; disc labels for basic research will. You may request alternate disc labeling if desired. If your study's sponsor has provided you with discs for the purpose of submitting images, please feel free to drop these off with the HIRO and we will be more than happy to use them when processing your study's image requests.

Disc Type

The HIRO attempts to conserve discs whenever possible. A typical CD will hold approximately 700 MB of data, and a typical single-layer DVD will hold approximately 4.7 GB of data. We will attempt to fit as much data onto a single disc as possible unless you specify otherwise (for example, if you request one disc per subject or one disc per exam). The HIRO will favor CDs over DVDs if all of your data will fit onto a CD, unless you specify otherwise. Be aware, however, that many types of scans, like breast MRI scans or high-resolution chest CT scans, will often require DVDs due to their size. Discs produced by the HIRO are compatible with Windows, Mac and Linux-based computer systems.
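For planning purposes, the disc arithmetic can be estimated with a short calculation. This is a simplified sketch rather than the HIRO's actual procedure: it assumes decimal units (1 MB = 1,000,000 bytes), ignores file system overhead, assumes data can be split freely across discs, and assumes DVDs are used whenever the data does not fit on one CD. The function name is hypothetical.

```python
CD_BYTES = 700 * 10**6      # approximately 700 MB per CD
DVD_BYTES = 4_700 * 10**6   # approximately 4.7 GB per single-layer DVD


def estimate_discs(total_bytes: int):
    """Return (disc_type, disc_count) for a data set of the given size."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    if total_bytes <= CD_BYTES:
        # Everything fits on one CD, so a CD is favored over a DVD.
        return ("CD", 1)
    # Otherwise use DVDs; ceiling division gives the number of discs.
    return ("DVD", -(-total_bytes // DVD_BYTES))


print(estimate_discs(500 * 10**6))   # a 500 MB request
print(estimate_discs(10 * 10**9))    # a 10 GB request
```

Under these assumptions a 500 MB request fits on one CD, while a 10 GB request needs three DVDs because two would hold only 9.4 GB.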

Online Data Delivery

The HIRO provides several online methods for distributing requested image data. You may indicate your desired download method in your image request, or the HIRO can make a recommendation based on the size and type of requested data.

Web-based Downloading

Some image data as well as supporting documents will be made available for download directly via the image request details page on the website. To download the data, simply click on the appropriate link(s) that will appear in the Downloads section of the request details page.

User-provided Media and Cloud-based Downloading

If you would like your image data copied directly to a device that you provide, like a portable hard drive or a USB stick, the HIRO is happy to accommodate you (please note that your portable media must comply with all relevant BSD ISO policies). You can simply drop the media off at the HIRO's office and we will contact you once the data has been transferred. The HIRO can also copy image data directly to your CRI Lab Share network storage space if you have one (you will need to grant HIRO staff access to your share; please contact the HIRO directly if you would like to use this option) or to your UChicago Box.com account (again, please contact the HIRO directly to discuss the details of this option). The HIRO's full Image Data Delivery policy can be found in our Policies and Guidance Documents section.

Electronic Submission to Study Sponsors

If your study offers (or requires) an option to submit image data electronically via online methods directly to the study sponsor or a third party, the HIRO can accommodate you. The HIRO has experience with a wide variety of electronic submission methods, including FTP and SFTP, DICOM transfers, and web-based clinical trial management systems like AG Mednet, BioClinica/Clario Smart Submit (formerly WebSend), Ambra (formerly DICOM Grid), iMedidata and others. The HIRO can contact the appropriate sponsor team members and set up access to the necessary online systems, and once this access has been set up you may request that the HIRO submit your image data directly to the sponsor in your image data requests.

Image De-identification

The HIRO's standard de-identification procedure will remove almost all information that is considered to be Protected Health Information (PHI) according to HIPAA standards. This information may be present in the DICOM headers of the image files or physically present on the images themselves (or both). The patient identifiers will either be replaced with alternate, research-related identifiers, or will simply be removed completely. This includes:

Patient Identifiers               | Institution Identifiers
First, Middle and Last Names      | Facility and Department Names
Medical Record Numbers            | Facility and Department Addresses
Dates of Birth                    | Referring and Ordering Physicians
Ages (if 90 years old or greater) | Accession Numbers
Physical Addresses                | Study UIDs
IP Addresses                      | Station Names and Serial Numbers

Standard Tag Modifications

The HIRO will normally replace the DICOM Patient Name tag (0010,0010) and Patient ID tag (0010,0020) with the subject ID you provide. If you do not provide a subject ID, the HIRO will create its own sequential subject IDs starting with 00001; however, the HIRO recommends you provide your own subject IDs whenever possible. The HIRO will insert the patient's age at time of exam into the DICOM Patient Age tag (0010,1010). If the patient is over 89 years old, the age value will be capped at 90 (also known as age clipping). If a site number is provided, the HIRO will insert this information into the DICOM Institution Name tag (0008,0080). If time point information is provided, the HIRO will insert this information into the DICOM Clinical Trial Time Point tag (0012,0051). The DICOM De-Identification Method tag (0012,0063) will also be populated.
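The tag modifications above can be illustrated with a small Python sketch that models a DICOM header as a plain dictionary keyed by (group, element) tag numbers. This is not the HIRO's software and does not read or write real DICOM files. The function names and the de-identification method string are hypothetical, and the age is written in the standard DICOM age string form (three digits followed by Y).

```python
PATIENT_NAME = (0x0010, 0x0010)
PATIENT_ID = (0x0010, 0x0020)
PATIENT_AGE = (0x0010, 0x1010)
INSTITUTION_NAME = (0x0008, 0x0080)
TIME_POINT = (0x0012, 0x0051)
DEID_METHOD = (0x0012, 0x0063)


def sequential_subject_id(counter: int) -> str:
    """Fallback subject ID when none is supplied: 00001, 00002, ..."""
    return f"{counter:05d}"


def clipped_age_string(age_years: int) -> str:
    """Cap ages over 89 at 90 and format as a DICOM age string, e.g. '090Y'."""
    return f"{min(age_years, 90):03d}Y"


def apply_standard_tags(header: dict, subject_id: str, age_years: int,
                        site_number=None, time_point=None,
                        method: str = "EXAMPLE-METHOD") -> dict:
    """Return a copy of the header with the standard replacements applied."""
    updated = dict(header)
    updated[PATIENT_NAME] = subject_id
    updated[PATIENT_ID] = subject_id
    updated[PATIENT_AGE] = clipped_age_string(age_years)
    if site_number is not None:
        updated[INSTITUTION_NAME] = site_number
    if time_point is not None:
        updated[TIME_POINT] = time_point
    updated[DEID_METHOD] = method  # placeholder value; the real text is not specified
    return updated


# Made-up header values for illustration only.
original = {PATIENT_NAME: "DOE^JANE", PATIENT_ID: "1234567"}
result = apply_standard_tags(original, sequential_subject_id(1), 93, site_number="012")
print(result[PATIENT_NAME], result[PATIENT_AGE], result[INSTITUTION_NAME])
```

In this example no time point was supplied, so the time point tag is simply left out of the result, and the 93-year-old patient's age is written as 090Y.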

Dates of Service

The HIRO currently leaves exam dates (sometimes known as dates of service) intact by default, which makes most of the image data it produces Limited Datasets (as opposed to fully de-identified datasets) as defined by HIPAA. The HIRO is capable of performing date-shifting if desired, so that the true exam date will be removed but the time between exams will be preserved. This may make the image data produced by the HIRO fully de-identified by HIPAA "expert determination" standards, but it may still not necessarily comply with Safe Harbor standards; please contact the HIRO to discuss. The HIRO can also completely remove all dates of service, but users should be aware that the resulting image data may not technically be considered DICOM compliant. Users may request date-shifting or complete date removal when submitting an image data request.
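The key property of date-shifting is that every exam date for a subject moves by the same amount, so the intervals between exams do not change. The Python sketch below demonstrates this with made-up dates. The function names are hypothetical, and the offset is passed in explicitly because the text does not describe how the HIRO chooses it.

```python
from datetime import date, timedelta


def shift_exam_dates(exam_dates, offset_days: int):
    """Shift every exam date for one subject by the same number of days."""
    offset = timedelta(days=offset_days)
    return [d + offset for d in exam_dates]


def days_between_exams(exam_dates):
    """Return the number of days between each consecutive pair of exams."""
    return [(later - earlier).days for earlier, later in zip(exam_dates, exam_dates[1:])]


# Made-up exam dates and an arbitrary example offset.
true_dates = [date(2021, 3, 1), date(2021, 6, 1), date(2022, 3, 1)]
shifted_dates = shift_exam_dates(true_dates, -137)

print(shifted_dates)
print(days_between_exams(true_dates) == days_between_exams(shifted_dates))
```

Because a single offset is applied per subject, longitudinal analyses that depend on the time between scans still work on the shifted data.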

Image Acquisition Tags and Private Tags

Information regarding imaging settings and parameters is generally left intact, including most "private" information that may be inserted by an equipment manufacturer. This information will only be modified or removed if it contains patient information (for example, if a study or series description tag contains patient-specific information along with the imaging settings, the patient information will be removed while leaving the imaging settings intact).

Custom Tag Modifications

If your study requires that certain types of data remain intact, and your IRB-approved research protocol allows it, the HIRO can customize its de-identification process for your study. It is not uncommon, for example, for a clinical trial to require that the birth dates of enrolled patients be left intact, or for a patient's name to be replaced with their initials. Clinical trials will also almost always require that the exam date remain intact. If your study has specific de-identification requirements, you simply need to provide the HIRO with the details and we will work with you to provide data that is de-identified according to your specifications. Where required, the HIRO will provide you with a key that will allow you to associate the de-identified data with the original patient data (provided your research protocol allows this).