Image Segmentation
Hyeun-gu Choi
Background
Digital Image Processing
consists of several steps. The first step is image acquisition-that is,
to acquire a digital image. After a digital image has been obtained, the
next step deals with preprocessing that image. The key function of preprocessing
is to improve the image in ways that increase the chances for success of
the other processes. The next stage deals with image segmentation. Image
segmentation partitions an input image into its constituent parts or objects.
The next step is representation
and description. Representation is the transformation of raw data into
a descriptive form suitable for computer processing. Description deals
with extracting features that result in some quantitative information of
interest. Such descriptions are necessarily task specific. The last step
is recognition and interpretation. Recognition is the process that assigns
a label to an object based on the information of the object. Interpretation
assigns meaning to recognized objects (1).
Image segmentation
is an essential procedure in many applications of image processing. Image
segmentation can be classified to boundary representation and regional
representation. Each representation is identification of homogeneous regions
or contours of local inhomogeneity, respectively (2).
Segmentation algorithms for monochrome images generally are based on one
of two basic properties of gray-level values: discontinuity and similarity.
In the first category, the approach is to partition an image based on abrupt
changes in gray level. The principal areas of interest within this category
are detection of isolated points and detection of lines and edges in an
image.
The principal approaches
in the first category are based on edge detection, and boundary detection.
Basically, the idea
underlying most edge-detection techniques is the computation of a local
derivative operator. The first derivative of the gray-level profile is
positive at the leading edge of a transition, negative at the trailing
edge, and zero in areas of constant gray level. Hence the magnitude of
the first derivative can be used to detect the presence of an edge in an
image. (1)
Hough transform is
commonly used for edge linking and boundary detection. When we consider
a point, infinitely many lines pass through the point, but they all satisfy
a line equation for varying values of slope and intercept. In the parameter
(slope and intercept) space, when we find the line intersects, those intersect
points mean that they are on the same line. Therefore, we can find edges
and boundaries by this technique. (1)
The principal approaches
in the second category are based on thresholding and label region algorithm.
The concept of segmenting an image is based on discontinuity or similarity
of the gray-level values of its pixels. (1)
Thresholding is one
of the most important approaches to image segmentation. Suppose that the
gray-level histogram corresponds to an image, f(x,y), composed of light
objects on a dark background. Furthermore, suppose that the object and
background pixels have gray levels grouped into two dominant modes. One
obvious way to extract the objects from the background is to select a threshold
T that separates these modes. Then, any point (x,y) for which f(x,y) >
T is called an object point; otherwise, the point is called a background
point. If three or more dominant modes characterize the image histogram
(for example, two types of light objects on a dark background), it is sometimes
possible to segment the image by multilevel thresholding. This is generally
less reliable than its single-level thresholdings. Great care must be taken
with illumination because it plays a crucial role in establishing the shape
of the histogram in the resulting image. (1)
Label region is a procedure
that groups pixels or subregions into larger regions. The simplest of these
approaches is pixel aggregation, which starts with a set of “seed” points
and from these grows regions by appending to each seed point those neighboring
pixels that have similar properties (such as gray level, texture, and color).
Methods
Image Collection
Two groups of images
(medical and remote sensing images) were collected because those are easily
collectable and have various characteristics in terms of digital images.
Medical images are MRI (Magnetic Resonance Images) images. Production procedures
for MRI images are very different from the other kind of images (remote
sensing images) because MRI images are constructed by the information collected
from different kinds of molecules within the human body. Each molecule
gives different radiation after exposure to a strong magnetic flux. Therefore,
MRI represents images created by computer processing of non-visual information.
The remote sensing images represent visual light, thermal or infrared images.
The benefits of these two different kinds of images with clearly different
backgrounds is to provide different kinds of tests for a segmentation system.
Synthetic images were
used to create a controlled test environment for the system. This made
it possible to validate the software and to illustrate simplified situations
that demonstrate challenges provided by the real-world images.
Region Based Algorithms
IDL is very powerful and
helpful in prototyping and demonstrating digital image processing algorithms.
This programming language was used to construct two kinds of segmentation
algorithms.
Created region based
algorithms were thresholding and label region algorithms. Thresholding
algorithms were consisted of global, multi thresholding and the thresholding
combined with gradient operation. User could decide the threshold value
just by clicking a point in a histogram plot of a test image in the written
program.
Generally threshold value was chosen between
pick and pick in histogram.
Figure 1 - Original image
Figure 2 - Histogram of original image
with two big clusters
IDL language already
had a region growing algorithm so modified four neighbor region growing
algorithm was created and tested by using synthetic test images in order
to make sure it work well.
Edge Based Algorithms
Written edge based algorithms
were gradient operators and Hough transform. The tested first derivative
operator was Sobel and the second derivative operator was Laplacian operator.
Figure 3 - Sobel operator kernel
Figure 4 - Laplacian operator kernel
These operators had three by three local
derivative kernels.
 |
 |
| Figure 5 - XY plane |
Figure 6 - ab plane (slope-intercept
plane) |
rq
plane was adopted in created Hough transform algorithm to prevent both
the slope and intercept from approaching infinity as the line approaches
the vertical. This rq plane
is illustrated in Figure 7. r
value represents the distance between the zero point and the line. q
value represents the angle between x-axis and the normal line to the data
line. Figure 8 shows quantization of the parameter plane by accumulator
cells. Accumulator cell counts the number of data lines passing the accumulator
cell in order to find intersection points. The size of accumulator cells
in this research is two by two(1).

Figure 7 - rq
plane
Figure 8 - accumulator cell
The MRI and remote sensing
images were processed with the written segmentation algorithms. Generally,
the segmentation quality of each image was decided by human visual inspection
of the results presented on a computer display. Each image was tested by
both the region based and the edge based methods and compared.
The performance of the
segmentation methods depends upon many factors. It is difficult to understand
the effect of each factor with real images. Therefore, the factors that
appeared to be the most important were explored by constructing synthetic
images. This provided a controlled test of demonstration environment. The
synthetic images contained specific qualities that were assumed to be important.
Those synthetic images were used
to confirm characteristics of the two
image segmentation methods.
Demonstration of Image Segmentation (Widget
interface)
IDL(5)
version 5 programming language has the capability of widget interface.
Widget interface is a kind of interacting
window to give an easily recognizable demonstration of a program. Therefore,
the region growing and the contour
following segmentation methods could be easily demonstrated with a graphic
interface.
Figure 9 - Graphic user interface (Widget
interface)
As shown in Figure.
9, the created widget interface has six windows; five for images and one
for data plot. Users can easily control
the characteristics of an image and easily compare the results of the region
based and the edge based algorithms
by using these six windows. Each section of the eleven top menu have submenus.
Eleven top menus are consisted of file, switch, filter,arithmetic, add
noise, label region, histogram, color table, Hough, zoom, and done. The
benefits of this interface are easy organization, user friendly interface,
easy comparison among result images and easy modification.
Results and Discussion
Region based algorithms
Simple global thresholding
and double thresholding were developed and tested. The difference between
these two algorithms is the number of threshold values. Simple global thresholding
uses one threshold value and double thresholding uses two threshold values.
Double thresholding is chosen when a test image contains several kinds
of information and we want to find specific regions in a test image. Figure
10 - 14 shows the results of double thresholding. Each result has different
pixel threshold range and gives different regions. Figure 15 gives the
histogram information of Figure 10. It has about three peaks and the segmented
result images have the threshold values between pick and pick. Each cluster
may represent different property of the image. Therefore, the result images
contain the information of clustered pixel values in histogram domain and
give different regions with different characteristics.
 |
 |
| Figure 10 - Original remote sensing image |
Figure 11 - Segmented image with pixel value range 0-80 |
 |
 |
| Figure 12 - Segmented image with pixel
value range 80-130 |
Figure 13 - Segmented image with pixel
value range 130-230 |
 |
 |
| Figure 14 - Segmented image with pixel
value range 230-250 |
Figure 15 - Histogram of Figure 10 with
two big clusters and one small cluster |
Figure 16 - 19 gives
another example of double thresholding. The scaled threshold spin density
values are 70 and 230. These values are adopted from the results of reference
(3). This reference paper researched about
a multispectral analysis of brain tissues. Though Figure 18 is not a multispectral
segmentation it gives similar segmentation result to Figure 19. The segmented
region is CSF (cerebrospinal tissue) (3).
 |
 |
| Figure 16 - Original MRI image (spin density) |
Figure 17 Histogram equalized image |
 |
 |
| Figure 18 - Segmented CSF tissue region
by double thresholding |
Figure 19 - Reference image (CSF
tissue) (3) |
These thresholding
algorithms are simple and give very good results but deciding the threshold
values is not easy. There is no guarantee that histogram of an image has
well separated clusters in order to decide threshold values. Specially
this is really difficult problem for automated system because the system
should decide threshold values by itself not by human.
Another region based
algorithm is label region algorithm. IDL language already has a label region
algorithm as a library. However, this label region algorithm is very vulnerable
to noise because each noise is also classified to a new region. Label region
algorithm was modified to get over this disadvantage. The modified label
region algorithm has some margin. That is, when pixel values are compared,
only if the difference of pixel values is bigger than some value, the algorithm
classify the pixel to new class. Modified label region algorithm is less
vulnerable to noise but it still have problems. When a test image
is complicated, the result of modified label region algorithm gives better
result (Figure 22) than that (Figure 21) of normal label region algorithm.
However, modified label region algorithm may miss the regions which have
small pixel value differences though those are totally different regions.
Preprocessing like thresholding is required for this algorithm in order
to remove noise effect and to simplify the test image.
 |
 |
| Figure 20 - Original MRI image with preprocessing
(thresholding) |
Figure 21 - The result of normal label
region algorithm |
Figure 22 - The result of modified label
region algorithm
Edge based algorithms (Gradient operators and Hough transform)
Gradient operators (Sobel
and Laplacian operators) are to find changes of pixel gray levels by local
derivative operation. In terms of detecting edges, they detect edges as
well as noises. Gradient operators are vulnerable to noises. As shown in
Figure 23 - 26, the gradient operators cannot get away from noise effect
so these operators are generally used for preprocessing of other segmentation
algorithms.
 |
 |
| Figure 23 - Original image |
Figure 24 - The result of Sobel operator |
Figure 25 - Enhanced Figure 24 by histogram
equalization to see noise effect
Figure 26 - The result of Laplacian operator
Figure 27 - Enhanced Figure 26 by histogram
equalization to see noise effect
To overcome this noise
effect with gradient operators, an algorithm combined with thresholding
and gradient operators is suggested (1). Figure 28 gives the rule of threshold
selection based on boundary characteristic. This algorithm can remove noise
effect and find edges but it still has a problem to decide a proper threshold
value. Figure 29 has a background scene and Figure 30 is the result of
this algorithm with removed background scene.
Figure 28 - Threshold selection based
on boundary characteristic
Figure 29 - Original image with background
scene
Figure 30 - Segmented image without background
scene
Hough transform (1)
(4) was developed to detect a specific
shape. Not like other edge detecting algorithms, it looks for already chosen
or designed shape in an image. Therefore, its advantage is it can specify
a desired useful shape. Of course, disadvantages of Hough transform exist.
It has difficulty of defining the specific shape equation in mathematic
expression. As the searching shape is more complicated, it is more difficult
to find the shape because the equation of that shape will have more parameters.
That means it requires more complicated calculation. Hough transform requires
long processing time. When an image with 192 by 128 pixel size was processed
by Hough transform without any preprocessing, it took about 25 minutes.
Hough transform is consisted of gradient operation, transformation to parameter
space (slope-intercept plane) and calculation of accumulator cell. Hough
transform is vulnerable to noise effect without preprocessing. Because
the first step of Hough transform is the gradient operator, it can not
get away from noise effect but after thresholding, it gives right result.
 |
 |
| Figure 31 - Four dots |
Figure 34 - Found lines for four dots |
Figure 32 - Transformation data points
into parameter space
Figure 33 - Found intersection points
by accumulator cell
Figure 35 - 39 shows
the results of Hough transform without preprocessing. Because of noise
effect and complication of the test image, it gives diagonal lines of a
test image as a result. Diagonal directions have more data points than
other directions, so Hough transform finds diagonal direction as the best
guess.
 |
 |
| Figure 35 - Original image |
Figure 36 - Segmented image by Sobel operator |
Figure 37 - Transferred data points into
parameter space
Figure 38 - Found intersection points
in parameter space by accumulator cell
Figure 39 - Result of Hough transform
without preprocessing
Figure 40 - 44 shows
the result of Hough transform after preprocessing (thresholding). It clearly
give better performance. The used threshold value is about 110. Therefore,
Hough transform should have preprocessings in order to remove noise effect
and to get simple image.
Figure 40 - Segmented image of Figure
35 by thresholding
Figure 41 - Histogram of Figure 35
Figure 42 - Transferred data points of
Figure 40 into parameter space
Figure 43 - Found intersection points
in parameter space by accumulator cell
Figure 44 - Result of Hough transform
with preprocessing
Graphic User Interface (Widget Interface)
Figure 45 - Widget interface
A graphic user interface,
so-called Widget interface was created for demonstration of segmentation
algorithms in IDL (Interactive Data Language) (5).
The created widget interface has five image windows and one plot window.
User can decide which window is source window and target (result of segmentation
algorithms) window. This Widget interface is consisted of 11 main menus
and submenus of each main menu.
1. Files
· New - Getting a new image
· Save - Saving an target window’s
image
2. Switch - User can switch two windows.
3. Filter
· High pass, low pass, unsharp,
laplacian, vertical edges, horizontal edges, Sobel, median, and
Laplacian filters
4. Arithmetic - Adding and subtracting two
images.
5. Add Noise
6. Label region
· Label region - Normal four neighbor
label region algorithm
· MD label region - Modified label
region algorithm
7. Histogram
· Histogram of source window -
Displaying histogram of source window at the plot window
· Histogram equalization - Visually
enhancing an image by histogram equalization
· One threshold - Simple global
thresholding (User can choose threshold value by clicking mouse
in histogram plot in the plot window.)
· Two threshold - Two thresholding
(User chooses threshold values.)
8. Color table - Displaying and selecting
color table by an user
9. Hough
· Manual - An user can choose intersection
points in parameter space.
· Auto - Automatically doing all
procedures of Hough transform
10. Zoom - Zooming an image with several magnification
submenus (2x, 3x, 4x)
11. Done - Finishing this Widget interface
Conclusions
All tested segmentation
algorithms have a common disadvatage to noise. Noise effect weaken almost
all algorithms. This means that preprocessing to remove or reduce noise
effect must be required. Generally thresholding is used as this preprocessing
but thresholding has a ploblem. The threshold value should be decided so
automated systems may confront this problem. The shape of histogram can
be modeled as Gaussian function but the shape of histogram is affected
by illumination (1). There is no gaurantee that a histogram has Gaussian
function shape and well separated clusters. Another problem is we do not
know which cluster has useful information. Some clusters can be noise or
some clusters may contain useless information.
Label region
algorithm is vulnerable to noise because it classifies all noises to new
regions. To overcome this problem, when pixel values are compared, only
if pixel value difference is bigger than some threshold value, this algorithm
classify the pixel to a new class. However, this modified label region
algorithm also has problems. If the threshold value is too big, it can
not classify pixels having small gray level difference. If the threshold
value is too small, it become vulnerable to noise. It is very difficult
to decide a proper threshold value.
Gradient operators
also have noise problems. They can find edges as well as noises. However,
when gradient operators and thresholding algorithm are combined, it gives
much better results but it has the problem of deciding threshold value.
Hough transform also
has noise problem but it is less affected by random noise than other algorithms
because it looks for a specific shape. It is very vulnerable to complicated
and noisy image. It gives diagonal directions as the best fit results when
the test image is very complicated and noisy. The reason why the diagonal
direction lines are the best fit for complicated and noisy image is that
the diagonal directions have more data points than any other. Hough transform
is a time-cost algorithm. When a test image is bigger, it takes much more
time for processing because more data points should be transformed to parameter
space and counted by accumulator cell. Hough transform has a limitation
about the shape looking for. Because Hough transform uses mathematical
equation of the searching shape, the equation of the searching shape should
be known. When the equation is complicated, the processing time will increase
very much. For example, a circle has an equation (X-A)^2 + (Y-B)^2
= C^2. This equation has three unknown parameter variables so Hough transform
should adopt three dimentional parameter space.
Widget interface
has many benefits. It can be easily modified because each menu and submenu
is consisted of its own small algorithm and then integrated to one interface.
When a menu or submenu need to be modified, only that menu can be easily
modified. Widget interface also have the possiblity to add other algorithms
freely. That is, future upgrade is very easy and many small algorithms
can be organized into one. Widget interface is user friendly interface.
It is almost the same as normal window menus so there is no need to teach
users.
Table of Contents