3D real-time parallel volume rendering using NVIDIA CUDA for medical imaging.
Asavei, Victor; Ionita, Vlad-Valentin; Moldoveanu, Florica et al.
1. INTRODUCTION
The importance of medical imaging is widely acknowledged. Doctors
need to see inside the human body in order to determine what is wrong,
and several methods have been developed for this purpose (X-ray,
ultrasonography, magnetic resonance imaging and computed tomography).
The problem that arises is that the accuracy of the diagnosis depends
less on the machine and more on the person reading the image. The
patient's health therefore depends heavily on the doctor's ability to
interpret the resulting image, and an accurate diagnosis can sometimes
be very hard to reach.
This problem can be mitigated if the results are displayed in a more
natural manner. If the doctor can look at a realistic view of the human
body instead of an abstract representation, the underlying problem can
be noticed faster and with more precision.
3D volume rendering for medical purposes has been around for some
time, but most applications suffer either from poor performance (no
real-time rendering) due to slow hardware, or from very high equipment
costs.
Using regular computers for this job is more cost-effective. By
implementing classic volume rendering algorithms and fine-tuning them
for the new architecture provided by NVIDIA in its new line-up of
consumer video cards, interactive volume rendering can now be performed
with a level of visual detail not previously available on a regular
personal computer (Akenine-Moller et al., 2008).
2. OUR SOLUTION
Splitting a volume into evenly spaced slices and then stacking them
produces an approximation of the initial volume. For this purpose, a CT
or an MRI machine is used to acquire cross-sectional slices of the human
body.
A parallel ray casting algorithm is then used to shoot rays into the
stack and collect samples at different points along each ray. These
samples are then fed through a transfer function, which assigns a
transparency and a colour to each of them, and finally a painting
algorithm produces the final image containing the volume rendering
(Lewiner, 2006).
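The sampling step described above can be sketched as follows. This is a minimal CPU illustration in Python, not the CUDA implementation itself; the names `sample_volume` and `cast_ray`, the nearest-neighbour lookup and the fixed step size are assumptions made for the example.

```python
def sample_volume(volume, x, y, z):
    """Nearest-neighbour lookup; returns 0.0 outside the volume."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    xi, yi, zi = int(round(x)), int(round(y)), int(round(z))
    if 0 <= zi < depth and 0 <= yi < height and 0 <= xi < width:
        return volume[zi][yi][xi]
    return 0.0


def cast_ray(volume, origin, direction, step, num_steps):
    """Collect samples at evenly spaced points along one ray."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    samples = []
    for i in range(num_steps):
        t = i * step
        samples.append(sample_volume(volume, ox + t * dx, oy + t * dy, oz + t * dz))
    return samples


# A stack of three 2x2 slices; each slice holds one constant density.
volume = [[[0.1, 0.1], [0.1, 0.1]],
          [[0.5, 0.5], [0.5, 0.5]],
          [[0.9, 0.9], [0.9, 0.9]]]

# One ray entering at pixel (0, 0) and travelling along the slice axis.
samples = cast_ray(volume, origin=(0.0, 0.0, 0.0), direction=(0.0, 0.0, 1.0),
                   step=1.0, num_steps=4)
```

The ray passes through one voxel of each slice and then leaves the stack, so the last sample falls outside the volume and is read as empty space.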
[FIGURE 1 OMITTED]
MRI and CT produce images based on the density of the tissue
encountered in the body, so the samples collected along a ray can be
interpreted as a measure of density. By assigning different
transparencies and colours to different density intervals, unwanted
tissues can be hidden from the rendering. For example, a doctor who
wants to examine the patient's bones must be able to hide the skin and
muscle covering them. In our implementation this is done by modifying
the level of transparency assigned to those tissues.
To make this approach practical, the transfer function must be
easily configurable by the doctor, which our implementation also
supports (***, 2009).
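A transfer function of this kind can be illustrated with a small lookup table. The sketch below is only an assumption about how such a table might look: the interval bounds, the colours and the tissue labels are invented for the example and are not calibrated values, and `transfer` and `hide_interval` are hypothetical helper names.

```python
# Hypothetical transfer function: each entry is
# (lower bound, upper bound, (red, green, blue, alpha)) for normalised density.
TRANSFER_TABLE = [
    (0.0, 0.3, (0.9, 0.7, 0.6, 0.05)),  # skin: nearly transparent
    (0.3, 0.7, (0.8, 0.2, 0.2, 0.30)),  # muscle: semi-transparent
    (0.7, 1.0, (1.0, 1.0, 0.9, 0.95)),  # bone: nearly opaque
]


def transfer(density, table=TRANSFER_TABLE):
    """Map a density in [0, 1] to an (r, g, b, a) tuple."""
    for low, high, rgba in table:
        if low <= density < high or (density == high == 1.0):
            return rgba
    return (0.0, 0.0, 0.0, 0.0)  # outside the table: fully transparent


def hide_interval(table, index):
    """Return a copy of the table with one interval made fully transparent."""
    new_table = list(table)
    low, high, (r, g, b, _) = new_table[index]
    new_table[index] = (low, high, (r, g, b, 0.0))
    return new_table


# Hide skin and muscle so that only bone remains visible.
bone_only = hide_interval(hide_interval(TRANSFER_TABLE, 0), 1)
```

Because `hide_interval` returns a new table, the doctor's original settings stay available and several presets can be kept side by side.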
[FIGURE 2 OMITTED]
The doctor can choose between two painting algorithms. The first
one simply alpha blends the samples and produces images similar to
X-rays. The second one takes into account the transparency and the
order in which the samples are encountered along the ray, in a way
similar to the painter's algorithm, so that distant objects are hidden
by closer ones. This produces more realistic images, which can make the
diagnosis process more accurate.
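The difference between the two painting modes can be sketched as below. The exact blending formulas are not given in the text, so this is one plausible reading: an order-independent, alpha-weighted mean for the X-ray-like mode, and standard front-to-back compositing for the occluding mode. A single intensity channel is used for brevity, and the function names and the `opacity_cutoff` parameter are assumptions.

```python
def paint_xray(samples):
    """Order-independent blend: alpha-weighted mean of the sample intensities."""
    total_alpha = sum(alpha for _, alpha in samples)
    if total_alpha == 0.0:
        return 0.0
    return sum(value * alpha for value, alpha in samples) / total_alpha


def paint_front_to_back(samples, opacity_cutoff=0.99):
    """Order-dependent compositing: closer samples occlude distant ones."""
    colour = 0.0
    opacity = 0.0
    for value, alpha in samples:  # samples are ordered from near to far
        colour += (1.0 - opacity) * alpha * value
        opacity += (1.0 - opacity) * alpha
        if opacity >= opacity_cutoff:  # the ray is saturated; stop early
            break
    return colour


# Each sample is (intensity, alpha) after the transfer function.
near_opaque = [(1.0, 1.0), (0.2, 1.0)]   # bright opaque sample in front
far_opaque = [(0.2, 1.0), (1.0, 1.0)]    # same samples, reversed order
```

With these inputs the X-ray-like mode gives the same result for both orderings, while the front-to-back mode returns only the intensity of the nearest opaque sample.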
[FIGURE 3 OMITTED]
3. IMPLEMENTATION
The CUDA architecture exposes a considerable number of streaming
processors on the GPU, which can be programmed and used for various
tasks, including tasks unrelated to graphics.
Because the architecture supports a very large number of threads,
it is very well suited for parallel algorithms, especially those
involving repetitive calculations over huge data volumes.
Because the rays in ray casting are largely independent of one
another, classic rendering algorithms of this type can be implemented
very efficiently on this architecture, obtaining a significant speed-up
(Lacroute, 1995).
Ray casting involves "shooting" a ray from every pixel of the image.
In our implementation, each ray is assigned to a thread by an
appropriate "kernel" function. Because the threads run concurrently and
process the data in a SIMD manner, performance is greatly increased and
real-time rendering is achieved (Nguyen et al., 2008).
In our implementation the number of threads needed (based on the
number of pixels of the resulting image) is determined dynamically, and
the CUDA runtime chooses how many streaming processors to use so that
the threads run efficiently. Thanks to this automatic load balancing,
our implementation avoids potential problems when the application runs
on a GPU with a different number of streaming processors, and it also
scales to future architectures (Nguyen et al., 2008).
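The one-thread-per-pixel mapping and the dynamic thread count can be illustrated with the index arithmetic a kernel launch typically relies on. The sketch below mimics it on the CPU in Python; in CUDA the corresponding quantities are the built-in `blockIdx`, `threadIdx` and `blockDim`. The 16x16 block size and the helper names are assumptions, not values taken from our implementation.

```python
def grid_size(image_width, image_height, block_width=16, block_height=16):
    """Number of thread blocks needed to cover the image (ceiling division)."""
    blocks_x = (image_width + block_width - 1) // block_width
    blocks_y = (image_height + block_height - 1) // block_height
    return blocks_x, blocks_y


def pixel_for_thread(block_idx, thread_idx, block_dim, image_size):
    """Map a (block, thread) index pair to a pixel, or None if out of range."""
    x = block_idx[0] * block_dim[0] + thread_idx[0]
    y = block_idx[1] * block_dim[1] + thread_idx[1]
    if x < image_size[0] and y < image_size[1]:
        return x, y
    return None  # padding thread: the kernel would return without casting a ray


# The grid is derived from the image size, not from the number of processors.
blocks = grid_size(800, 600)
```

Since the grid depends only on the image dimensions, the same launch configuration works unchanged on GPUs with different numbers of streaming processors; the hardware decides how the blocks are distributed.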
4. CONCLUSION
Using regular computers for medical imaging is far less costly than
maintaining and operating expensive equipment. The power of parallel
algorithms can now be harnessed through the features of CUDA. As a
result, a whole new class of applications, which until now were
considered impossible to develop because of slow architectures, can be
explored and developed.
Another advantage of our solution is the possibility of
investigating the medical data remotely. A doctor can conduct an MRI
investigation in a clinic and then analyse the volume rendering at home
or at the office. The patient only needs to show up for the MRI; the
doctor can then investigate and establish a diagnosis without the
patient being present the whole time.
The same technique that we have used for medical imaging can also
be applied to seismic imaging and even to oil, coal and gold
prospecting. In addition, prosthetics can be developed with a much finer
degree of precision, because automatic measurements can now be
performed on the volume itself.
A very interesting field to explore is that of automatic diagnosis,
or computer-aided diagnosis. The time the doctor spends analysing the
rendered volume is a real issue. Sometimes many adjustments must be made
to make the problem stand out visually. For example, the colour and
transparency values must be very specific; otherwise the problem will
blend into or hide in the surrounding tissue, making it impossible to
spot.
To avoid this problem, a set of parallel algorithms can parse the
whole volume, even before the volume rendering takes place, and scan for
unusual densities (foreign objects, tumours, blood clots, etc.) or
malformations (missing bones, malformed internal organs, brain damage).
To accomplish this task, guidelines must first be established
regarding density variations in particular regions of the human body.
For example, how fast may the blood density vary inside a blood vessel
before a clot is signalled, or what is the difference in density between
a brain tumour and healthy brain tissue?
The diagnosis algorithm will mark the troubled regions with reserved
density values that are very different from those found in the human
body, so that upon rendering these areas light up and signal the
trouble spots using colour codes, saving the doctor time and eye
strain.
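The marking idea can be sketched as a pre-pass over the volume. This is only a toy illustration of the proposal, not a diagnostic method: a single global density range stands in for the per-region guidelines discussed above, and the marker value, the thresholds and the function names are all assumptions.

```python
RESERVED_MARKER = -1.0  # hypothetical value that no real tissue density takes


def mark_anomalies(volume, expected_low, expected_high, marker=RESERVED_MARKER):
    """Return a copy of the volume with out-of-range voxels replaced by marker."""
    return [[[voxel if expected_low <= voxel <= expected_high else marker
              for voxel in row]
             for row in slice_]
            for slice_ in volume]


def transfer_with_marker(density, marker=RESERVED_MARKER):
    """Render marked voxels in opaque red; everything else as translucent grey."""
    if density == marker:
        return (1.0, 0.0, 0.0, 1.0)
    return (density, density, density, 0.1)


scan = [[[0.40, 0.42], [0.41, 0.95]]]      # one slice with one dense outlier
marked = mark_anomalies(scan, expected_low=0.3, expected_high=0.6)
```

Every voxel is tested independently of the others, so a pass of this kind maps naturally onto one thread per voxel.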
5. REFERENCES
Akenine-Moller, T.; Haines, E. & Hoffman, N. (2008). Real-Time
Rendering, Third Edition, A K Peters, ISBN 978-1-56881-424-7, Wellesley,
Massachusetts
Lewiner, T. (2006). VSVR: a very simple volume rendering
implementation with 3D textures, Available from:
http://www.mat.puc-rio.br/~tomlew Accessed: 2009-02-19
Lacroute, P. (1995). Fast Volume Rendering Using a Shear-Warp
Factorization of the Viewing Transformation, Available from:
http://graphics.stanford.edu/papers/shear/lacroute-shearwarp-sig94.pdf
Accessed: 2009-02-19
Nguyen, H. (2008). GPU Gems 3, Addison-Wesley, ISBN
978-0-321-51526-1
*** (2009). Graphics Runner. Volume Rendering 102: Transfer Functions,
Available from:
http://graphicsrunner.blogspot.com/2009/01/volumerendering-102-transfer-functions.html
Accessed: 2009-04-20