Generation of Synthetic Medical Images
Synthetic Data Sets
The collection of medical image data for research can be an expensive, time-consuming task. Positron emission tomography (PET), x-ray computed tomography (CT), and magnetic resonance imaging (MRI) systems can easily cost over a million dollars. They may require dedicated staff, maintenance contracts, and access to expensive supporting equipment such as a cyclotron. In addition, collecting data for large studies may take months. The process is complicated by equipment schedules, the organization of volunteers and subjects, the use of potentially harmful electromagnetic radiation, radiopharmaceuticals, and contrast agents, and patient privacy rights. These difficulties limit the availability of clinical data, especially for smaller academic research programs.

Creating software models of human anatomy and of imaging systems, and modeling the medical physics of the image acquisition process, provides a means to generate realistic synthetic data sets. In many cases synthetic data sets can be used in place of real images, reducing the time and cost of data collection and making data sets available to institutions without clinical imaging systems.

Synthetic data sets can be used for training purposes and as evaluation data for image processing and analysis algorithms. An additional advantage, one that can make synthetic data sets an invaluable tool, is that they have a known ground truth. Ground truth refers to having exact knowledge of the object being imaged. It is, in many cases, nearly impossible to obtain for real images of living humans. In addition, system models can be used to improve system design and to study imaging parameter selection and acquisition protocols.
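
As a minimal sketch of how a known ground truth supports algorithm evaluation, the following Python compares a hypothetical segmentation result against the true label mask of a phantom using the Dice overlap coefficient. The masks, the function name, and the choice of Dice as the metric are illustrative assumptions, not part of the simulation software described here.

```python
def dice_coefficient(truth, predicted):
    """Dice overlap between two equally sized binary masks (flat lists of 0/1)."""
    if len(truth) != len(predicted):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for t, p in zip(truth, predicted) if t and p)
    total = sum(1 for t in truth if t) + sum(1 for p in predicted if p)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical 1-D strip of voxels: 1 marks lesion tissue in the phantom.
ground_truth = [0, 0, 1, 1, 1, 1, 0, 0]
# Hypothetical output of a segmentation algorithm run on the synthetic image.
segmentation = [0, 1, 1, 1, 1, 0, 0, 0]

score = dice_coefficient(ground_truth, segmentation)
print(f"Dice = {score:.2f}")
```

With real patient images the `ground_truth` mask would have to be approximated, for example by manual tracing; with a synthetic image it is the phantom itself.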

While medical image simulation software has been under development since the 1980s, until recently the complexity of the procedures and long computation times limited the realism and accuracy of artificially generated images. Improvements in computational systems have facilitated simulations that were previously infeasible. Advancements in processor architecture, increases in memory speed and capacity, and the development of large storage systems have enabled computers to be used for increasingly complex problems. The use of distributed systems and technologies provides unparalleled computational capabilities.

With these technological improvements and an increased understanding of human anatomy and medical physics, three-dimensional, high-resolution, realistic synthetic medical data sets can be generated for the first time. In addition, the large quantities of images often necessary for studies (hundreds or thousands of images) can be quickly generated and made available. In some cases simulations can even be performed in real time.

Advancements at RIT
RIT and the Chester F. Carlson Center for Imaging Science provide an ideal location for research on medical image simulation. Faculty in the department and the Biomedical and Materials Multimodal Imaging Lab have an expert understanding of the image formation processes being simulated. In addition, RIT has a well-established reputation for computer technologies and software development. RIT Research Computing provides access to the necessary advanced computational facilities, both on-site and off-site, through collaborative efforts such as NYSGrid.

Improvements made at RIT to both PET and MRI simulation software allow simulations to take full advantage of distributed systems. In particular, highly parallelized MRI simulator code can take advantage of IBM Blue Gene series supercomputers. The 32,768-core IBM Blue Gene/L hosted at the Rensselaer Polytechnic Institute’s Computational Center for Nanotechnology Innovations allows simulations that would take two years on a modern Intel or AMD system to be completed in a single day.

Synthetic PET image of a brain phantom. Brain phantom from: Aubert-Broche, B., Evans, A. C., Collins, L., “A New Improved Version of the Realistic Digital Brain Phantom,” NeuroImage, 32:138-145 (2006).

In addition to using modern technology to decrease run times, RIT researchers optimize simulation algorithms and improve memory management to maximize simulation efficiency. New capabilities are investigated to make the simulators more robust and to broaden their applications. For example, efforts to model motion during the image acquisition process will support motion artifacts (e.g., from flowing blood), new modalities (e.g., diffusion tensor imaging), and new applications (e.g., fast cardiovascular imaging).

Related Efforts
A complementary project aims to use medical image simulators to enhance education. It is well known that interactive learning improves comprehension and retention. In addition, simulator technology has improved to the point that small simulations demonstrating specific concepts can be executed quickly. One goal is to create a tool with a user-friendly interface that will enable simulations to be run as part of a course, either to create teaching supplements or for hands-on laboratory experimentation. This tool will provide an environment in which a trainee can experiment with various acquisition protocols and parameters as well as gain an understanding of different anomalies.

(a) Digital phantom imaged. A single small spot is first moved towards the right, and then back to its original location as shown by the red arrows. (b) Real portion of k-space (raw data) acquired by simulated imaging of the phantom. (c) Reconstructed image after windowing to show side lobes, demonstrating the impact of spin packet motion.
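
The effect in the figure can be illustrated with a deliberately simplified one-dimensional sketch. It assumes a single unit point source, one k-space sample acquired per time step, and a triangular out-and-back motion path; the names `N`, `X0`, and `AMPLITUDE` and their values are hypothetical, and this is not the simulator's actual model (which works on two-dimensional k-space).

```python
import cmath

N = 64          # number of k-space samples and image pixels (assumed)
X0 = 20         # resting position of the spot, in pixels (assumed)
AMPLITUDE = 4   # peak displacement to the right, in pixels (assumed)


def position(k, moving):
    """Spot position while k-space sample k is acquired."""
    if not moving:
        return float(X0)
    # Triangular path: out to X0 + AMPLITUDE at mid-scan, then back to X0.
    return X0 + AMPLITUDE * (1.0 - abs(2.0 * k / (N - 1) - 1.0))


def acquire(moving):
    """k-space signal of a unit point source, one sample per time step."""
    return [cmath.exp(-2j * cmath.pi * k * position(k, moving) / N)
            for k in range(N)]


def reconstruct(kspace):
    """Inverse discrete Fourier transform, written out directly."""
    return [sum(s * cmath.exp(2j * cmath.pi * k * n / N)
                for k, s in enumerate(kspace)) / N
            for n in range(N)]


static_image = [abs(v) for v in reconstruct(acquire(moving=False))]
moving_image = [abs(v) for v in reconstruct(acquire(moving=True))]

print("static peak:", round(max(static_image), 3))
print("moving peak:", round(max(moving_image), 3))
```

In the static case all of the signal reconstructs into the single pixel at `X0`. When the source moves during acquisition, the k-space samples are no longer consistent with any one position, so the same total energy is spread across neighboring pixels, which is the origin of the side lobes.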


Synthetic images of a brain using a standard spin-echo pulse sequence with a long repetition time (2000 ms). The effect of the echo time acquisition parameter is demonstrated by varying the parameter value (from left to right: 20 ms, 30 ms, 40 ms, 80 ms).
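
The trend across the four images can be approximated with the textbook idealized spin-echo signal equation, S = PD (1 - exp(-TR/T1)) exp(-TE/T2), which holds when TE is much shorter than TR. The sketch below uses the repetition and echo times from the caption; the tissue proton densities and relaxation times are rough illustrative values assumed for this example, not parameters taken from the simulator or the phantom.

```python
import math

TR_MS = 2000.0                           # repetition time from the caption
TE_VALUES_MS = [20.0, 30.0, 40.0, 80.0]  # echo times from the caption

# Rough, illustrative values (assumed, not from the source):
# tissue -> (relative proton density, T1 in ms, T2 in ms)
TISSUES = {
    "white matter": (0.70, 600.0, 80.0),
    "gray matter": (0.80, 950.0, 100.0),
    "CSF": (1.00, 4000.0, 2000.0),
}


def spin_echo_signal(pd, t1, t2, tr, te):
    """Idealized spin-echo signal: PD * (1 - exp(-TR/T1)) * exp(-TE/T2)."""
    return pd * (1.0 - math.exp(-tr / t1)) * math.exp(-te / t2)


for te in TE_VALUES_MS:
    row = ", ".join(
        f"{name}: {spin_echo_signal(pd, t1, t2, TR_MS, te):.3f}"
        for name, (pd, t1, t2) in TISSUES.items()
    )
    print(f"TE = {te:>4.0f} ms -> {row}")
```

With these assumed values, the white and gray matter signals decay much faster than the CSF signal as the echo time grows, so the image contrast shifts toward T2 weighting at longer echo times.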

Simulated data is also currently being used for visual perception studies. This includes a collaboration with Dr. Roxanne Canosa of the Computer Science Department, who is exploring the interpretation of medical images for diagnostic purposes. “The long-term objective of this research is to improve the ability of medical practitioners to extract useful information from single- and multi-modal medical images by providing a model of human visual attention integrated with mode-specific image saliency,” says Canosa. “The goal of this project is to create a biologically-inspired computational model of visual attention that highlights likely lesion locations in medical images for the purpose of improving the lesion detection abilities of radiology residents.”

Synthetic multimodal PET/MRI image with metabolically active lesion present in PET.

Another exciting collaboration is with Dr. James Ferwerda of the Munsell Color Science Laboratory and Dr. Kent Ogden of the Department of Radiology at SUNY Upstate. Advancements in technology now allow displays with increased contrast, due to both a wider range of luminance (brightness) and an increase in the number of perceivable gray levels. These so-called high-dynamic-range displays may prove beneficial in a clinical setting, improving the ability of radiologists to detect difficult-to-see, low-contrast malignancies. Synthetic data sets will play a role in evaluating observer performance with the new display systems.

Having the ground truth and the associated images provides data sets that can be used to develop and validate novel classification algorithms. In many medical imaging modalities, multiple images are collected over a period of time, allowing a change to be observed. One such example is PET, where a radiotracer is introduced into the subject and the PET system is used to observe the uptake and release of the tracer by the body. Having the time dimension allows researchers to investigate sub-voxel classification techniques. These techniques attempt to identify the amount contributed by each tissue to the signal observed at each voxel (a single element or point in the image), whereas most classification techniques are limited to identifying the tissue that contributes most to a given voxel. Validation of these techniques, which can allow more detailed classification and the identification of smaller features, can be prohibitively difficult with real images because only limited knowledge of the underlying tissue boundaries is available.
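
One simple way to picture sub-voxel classification is as linear unmixing of time-activity curves. The sketch below assumes a voxel contains only two tissues whose fractions sum to one, that each tissue's curve is known, and that the voxel signal is their weighted sum; it then recovers the fraction by least squares and compares it with the known ground truth. The curves, the function name, and the two-tissue linear model are hypothetical illustrations, not the technique used by the researchers.

```python
def tissue_fraction(voxel_tac, tac_a, tac_b):
    """Least-squares fraction of tissue A in a voxel holding tissues A and B.

    Model (assumed): voxel_tac[t] = f * tac_a[t] + (1 - f) * tac_b[t].
    """
    diff = [a - b for a, b in zip(tac_a, tac_b)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        raise ValueError("tissue curves are identical; fraction is undefined")
    numer = sum((y - b) * d for y, b, d in zip(voxel_tac, tac_b, diff))
    # Clamp so noise cannot push the estimate outside the physical range.
    return min(1.0, max(0.0, numer / denom))


# Hypothetical time-activity curves (arbitrary units) at six frame times.
gray_tac = [0.0, 8.0, 12.0, 10.0, 7.0, 5.0]
white_tac = [0.0, 3.0, 5.0, 5.5, 5.0, 4.5]

# Synthetic voxel with a known ground truth of 30% gray matter.
true_fraction = 0.30
voxel = [true_fraction * g + (1.0 - true_fraction) * w
         for g, w in zip(gray_tac, white_tac)]

estimate = tissue_fraction(voxel, gray_tac, white_tac)
print(f"true = {true_fraction:.2f}, estimated = {estimate:.2f}")
```

A conventional classifier would label this voxel simply as white matter. Because the synthetic voxel was built from a known mixture, the recovered fraction can be checked directly, which is the kind of validation that is difficult with real images.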