Image Segmentation
Hyeun-gu Choi


    Digital Image Processing consists of several steps. The first step is image acquisition-that is, to acquire a digital image. After a digital image has been obtained, the next step deals with preprocessing that image. The key function of preprocessing is to improve the image in ways that increase the chances for success of the other processes. The next stage deals with image segmentation. Image segmentation partitions an input image into its constituent parts or objects.
    The next step is representation and description. Representation is the transformation of raw data into a descriptive form suitable for computer processing. Description deals with extracting features that result in some quantitative information of interest. Such descriptions are necessarily task specific. The last step is recognition and interpretation. Recognition is the process that assigns a label to an object based on the information of the object. Interpretation assigns meaning to recognized objects (1).
    Image segmentation is an essential procedure in many applications of image processing. Image segmentation can be classified to boundary representation and regional representation. Each representation is identification of homogeneous regions or contours of local inhomogeneity, respectively (2).  Segmentation algorithms for monochrome images generally are based on one of two basic properties of gray-level values: discontinuity and similarity. In the first category, the approach is to partition an image based on abrupt changes in gray level. The principal areas of interest within this category are detection of isolated points and detection of lines and edges in an image.
    The principal approaches in the first category are based on edge detection, and boundary detection.
    Basically, the idea underlying most edge-detection techniques is the computation of a local derivative operator. The first derivative of the gray-level profile is positive at the leading edge of a transition, negative at the trailing edge, and zero in areas of constant gray level. Hence the magnitude of the first derivative can be used to detect the presence of an edge in an image. (1)
    Hough transform is commonly used for edge linking and boundary detection. When we consider a point, infinitely many lines pass through the point, but they all satisfy a line equation for varying values of slope and intercept. In the parameter (slope and intercept) space, when we find the line intersects, those intersect points mean that they are on the same line. Therefore, we can find edges and boundaries by this technique. (1)
    The principal approaches in the second category are based on thresholding and label region algorithm. The concept of segmenting an image is based on discontinuity or similarity of the gray-level values of its pixels. (1)
    Thresholding is one of the most important approaches to image segmentation. Suppose that the gray-level histogram corresponds to an image, f(x,y), composed of light objects on a dark background. Furthermore, suppose that the object and background pixels have gray levels grouped into two dominant modes. One obvious way to extract the objects from the background is to select a threshold T that separates these modes. Then, any point (x,y) for which f(x,y) > T is called an object point; otherwise, the point is called a background point. If three or more dominant modes characterize the image histogram (for example, two types of light objects on a dark background), it is sometimes possible to segment the image by multilevel thresholding. This is generally less reliable than its single-level thresholdings. Great care must be taken with illumination because it plays a crucial role in establishing the shape of the histogram in the resulting image. (1)
    Label region is a procedure that groups pixels or subregions into larger regions. The simplest of these approaches is pixel aggregation, which starts with a set of “seed” points and from these grows regions by appending to each seed point those neighboring pixels that have similar properties (such as gray level, texture, and color).




Image Collection

     Two groups of images (medical and remote sensing images) were collected because those are easily collectable and have various characteristics in terms of digital images. Medical images are MRI (Magnetic Resonance Images) images. Production procedures for MRI images are very different from the other kind of images (remote sensing images) because MRI images are constructed by the information collected from different kinds of molecules within the human body. Each molecule gives different radiation after exposure to a strong magnetic flux. Therefore, MRI represents images created by computer processing of non-visual information. The remote sensing images represent visual light, thermal or infrared images. The benefits of these two different kinds of images with clearly different backgrounds is to provide different kinds of tests for a segmentation system.
    Synthetic images were used to create a controlled test environment for the system. This made it possible to validate the software and to illustrate simplified situations that demonstrate challenges provided by the real-world images.

Region Based Algorithms

    IDL is very powerful and helpful in prototyping and demonstrating digital image processing algorithms. This programming language was used to construct two kinds of segmentation algorithms.
    Created region based algorithms were thresholding and label region algorithms. Thresholding algorithms were consisted of global, multi thresholding and the thresholding combined with gradient operation. User could decide the threshold value just by clicking a point in a histogram plot of a test image in the written program.
Generally threshold value was chosen between pick and pick in histogram.

Figure 1 - Original image

Figure 2 - Histogram of original image with two big clusters

    IDL language already had a region growing algorithm so modified four neighbor region growing algorithm was created and tested by using synthetic test images in order to make sure it work well.

Edge Based Algorithms

    Written edge based algorithms were gradient operators and Hough transform. The tested first derivative operator was Sobel and the second derivative operator was Laplacian operator.

 Figure 3 - Sobel operator kernel

Figure 4 - Laplacian operator kernel
These operators had three by three local derivative kernels.
Figure 5 - XY plane   Figure 6 - ab plane (slope-intercept plane) 

    rq plane was adopted in created Hough transform algorithm to prevent both the slope and intercept from approaching infinity as the line approaches the vertical. This rq plane is illustrated in Figure 7. r value represents the distance between the zero point and the line. q value represents the angle between x-axis and the normal line to the data line. Figure 8 shows quantization of the parameter plane by accumulator cells. Accumulator cell counts the number of data lines passing the accumulator cell in order to find intersection points. The size of accumulator cells in this research is two by two(1).

Figure 7 - rq  plane                                        Figure 8 - accumulator cell

    The MRI and remote sensing images were processed with the written segmentation algorithms. Generally, the segmentation quality of each image was decided by human visual inspection of the results presented on a computer display. Each image was tested by both the region based and the edge based methods and compared.

    The performance of the segmentation methods depends upon many factors. It is difficult to understand the effect of each factor with real images. Therefore, the factors that appeared to be the most important were explored by constructing synthetic images. This provided a controlled test of demonstration environment. The synthetic images contained specific qualities that were assumed to be important. Those synthetic images were used
to confirm characteristics of the two image segmentation methods.

Demonstration of Image Segmentation (Widget interface)

    IDL(5) version 5 programming language has the capability of widget interface. Widget interface is a kind of interacting window to give an easily recognizable demonstration of a program. Therefore, the region growing and the contour following segmentation methods could be easily demonstrated with a graphic interface.

Figure 9 - Graphic user interface (Widget interface)

    As shown in Figure. 9, the created widget interface has six windows; five for images and one for data plot. Users can easily control the characteristics of an image and easily compare the results of the region based and the edge based algorithms by using these six windows. Each section of the eleven top menu have submenus. Eleven top menus are consisted of file, switch, filter,arithmetic, add noise, label region, histogram, color table, Hough, zoom, and done. The benefits of this interface are easy organization, user friendly interface, easy comparison among result images and easy modification.

Results and Discussion


Region based algorithms

    Simple global thresholding and double thresholding were developed and tested. The difference between these two algorithms is the number of threshold values. Simple global thresholding uses one threshold value and double thresholding uses two threshold values. Double thresholding is chosen when a test image contains several kinds of information and we want to find specific regions in a test image. Figure 10 - 14 shows the results of double thresholding. Each result has different pixel threshold range and gives different regions. Figure 15 gives the histogram information of Figure 10. It has about three peaks and the segmented result images have the threshold values between pick and pick. Each cluster may represent different property of the image. Therefore, the result images contain the information of clustered pixel values in histogram domain and give different regions with different characteristics.
Figure 10 - Original remote sensing image  Figure 11 - Segmented image with pixel value range 0-80
Figure 12 - Segmented image with pixel value range 80-130 Figure 13 - Segmented image with pixel value range 130-230 
Figure 14 - Segmented image with pixel value range 230-250  Figure 15 - Histogram of Figure 10 with two big clusters and one small cluster

    Figure 16 - 19 gives another example of double thresholding. The scaled threshold spin density values are 70 and 230. These values are adopted from the results of reference (3). This reference paper researched about a multispectral analysis of brain tissues. Though Figure 18 is not a multispectral segmentation it gives similar segmentation result to Figure 19. The segmented region is CSF (cerebrospinal tissue) (3).
Figure 16 - Original MRI image (spin density)  Figure 17 Histogram equalized image
Figure 18 - Segmented CSF tissue region by double thresholding  Figure 19 - Reference image (CSF tissue) (3) 
    These thresholding algorithms are simple and give very good results but deciding the threshold values is not easy. There is no guarantee that histogram of an image has well separated clusters in order to decide threshold values. Specially this is really difficult problem for automated system because the system should decide threshold values by itself not by human.

    Another region based algorithm is label region algorithm. IDL language already has a label region algorithm as a library. However, this label region algorithm is very vulnerable to noise because each noise is also classified to a new region. Label region algorithm was modified to get over this disadvantage. The modified label region algorithm has some margin. That is, when pixel values are compared, only if the difference of pixel values is bigger than some value, the algorithm classify the pixel to new class. Modified label region algorithm is less vulnerable to noise but it still have problems.  When a test image is complicated, the result of modified label region algorithm gives better result (Figure 22) than that (Figure 21) of normal label region algorithm. However, modified label region algorithm may miss the regions which have small pixel value differences though those are totally different regions. Preprocessing like thresholding is required for this algorithm in order to remove noise effect and to simplify the test image.
Figure 20 - Original MRI image with preprocessing (thresholding) Figure 21 - The result of normal label region algorithm

Figure 22 - The result of modified label region algorithm

Edge based algorithms (Gradient operators and Hough transform)

    Gradient operators (Sobel and Laplacian operators) are to find changes of pixel gray levels by local derivative operation. In terms of detecting edges, they detect edges as well as noises. Gradient operators are vulnerable to noises. As shown in Figure 23 - 26, the gradient operators cannot get away from noise effect so these operators are generally used for preprocessing of other segmentation algorithms.
Figure 23 - Original image  Figure 24 - The result of Sobel operator

Figure 25 - Enhanced Figure 24 by histogram equalization to see noise effect

Figure 26 - The result of Laplacian operator

Figure 27 - Enhanced Figure 26 by histogram equalization to see noise effect

    To overcome this noise effect with gradient operators, an algorithm combined with thresholding and gradient operators is suggested (1). Figure 28 gives the rule of threshold selection based on boundary characteristic. This algorithm can remove noise effect and find edges but it still has a problem to decide a proper threshold value. Figure 29 has a background scene and Figure 30 is the result of this algorithm with removed background scene.

Figure 28 - Threshold selection based on boundary characteristic

Figure 29 - Original image with background scene

Figure 30 - Segmented image without background scene

    Hough transform (1) (4) was developed to detect a specific shape. Not like other edge detecting algorithms, it looks for already chosen or designed shape in an image. Therefore, its advantage is it can specify a desired useful shape. Of course, disadvantages of Hough transform exist. It has difficulty of defining the specific shape equation in mathematic expression. As the searching shape is more complicated, it is more difficult to find the shape because the equation of that shape will have more parameters. That means it requires more complicated calculation. Hough transform requires long processing time. When an image with 192 by 128 pixel size was processed by Hough transform without any preprocessing, it took about 25 minutes. Hough transform is consisted of gradient operation, transformation to parameter space (slope-intercept plane) and calculation of accumulator cell. Hough transform is vulnerable to noise effect without preprocessing. Because the first step of Hough transform is the gradient operator, it can not get away from noise effect but after thresholding, it gives right result.
Figure 31 - Four dots Figure 34 - Found lines for four dots
Figure 32 - Transformation data points into parameter space

Figure 33 - Found intersection points by accumulator cell

    Figure 35 - 39 shows the results of Hough transform without preprocessing. Because of noise effect and complication of the test image, it gives diagonal lines of a test image as a result. Diagonal directions have more data points than other directions, so Hough transform finds diagonal direction as the best guess.
Figure 35 - Original image Figure 36 - Segmented image by Sobel operator

Figure 37 - Transferred data points into parameter space

Figure 38 - Found intersection points in parameter space by accumulator cell

Figure 39 - Result of Hough transform without preprocessing

    Figure 40 - 44 shows the result of Hough transform after preprocessing (thresholding). It clearly give better performance. The used threshold value is about 110. Therefore, Hough transform should have preprocessings in order to remove noise effect and to get simple image.

Figure 40 - Segmented image of Figure 35 by thresholding

Figure 41 - Histogram of Figure 35

Figure 42 - Transferred data points of Figure 40 into parameter space

Figure 43 - Found intersection points in parameter space by accumulator cell

Figure 44 - Result of Hough transform with preprocessing

Graphic User Interface (Widget Interface)

Figure 45 - Widget interface

    A graphic user interface, so-called Widget interface was created for demonstration of segmentation algorithms in IDL (Interactive Data Language) (5). The created widget interface has five image windows and one plot window. User can decide which window is source window and target (result of segmentation algorithms) window. This Widget interface is consisted of 11 main menus and submenus of each main menu.

1. Files

2. Switch - User can switch two windows.
3. Filter 4. Arithmetic - Adding and subtracting two images.
5. Add Noise
6. Label region 7. Histogram 8. Color table - Displaying and selecting color table by an user
9. Hough 10. Zoom - Zooming an image with several magnification submenus (2x, 3x, 4x)
11. Done - Finishing this Widget interface



     All tested segmentation algorithms have a common disadvatage to noise. Noise effect weaken almost all algorithms. This means that preprocessing to remove or reduce noise effect must be required. Generally thresholding is used as this preprocessing but thresholding has a ploblem. The threshold value should be decided so automated systems may confront this problem. The shape of histogram can be modeled as Gaussian function but the shape of histogram is affected by illumination (1). There is no gaurantee that a histogram has Gaussian function shape and well separated clusters. Another problem is we do not know which cluster has useful information. Some clusters can be noise or some clusters may contain useless information.
     Label region algorithm is vulnerable to noise because it classifies all noises to new regions. To overcome this problem, when pixel values are compared, only if pixel value difference is bigger than some threshold value, this algorithm classify the pixel to a new class. However, this modified label region algorithm also has problems. If the threshold value is too big, it can not classify pixels having small gray level difference. If the threshold value is too small, it become vulnerable to noise. It is very difficult to decide a proper threshold value.
     Gradient operators also have noise problems. They can find edges as well as noises. However, when gradient operators and thresholding algorithm are combined, it gives much better results but it has the problem of deciding threshold value.
    Hough transform also has noise problem but it is less affected by random noise than other algorithms because it looks for a specific shape. It is very vulnerable to complicated and noisy image. It gives diagonal directions as the best fit results when the test image is very complicated and noisy. The reason why the diagonal direction lines are the best fit for complicated and noisy image is that the diagonal directions have more data points than any other. Hough transform is a time-cost algorithm. When a test image is bigger, it takes much more time for processing because more data points should be transformed to parameter space and counted by accumulator cell. Hough transform has a limitation about the shape looking for. Because Hough transform uses mathematical equation of the searching shape, the equation of the searching shape should be known. When the equation is complicated, the processing time will increase very much. For example, a circle has an equation  (X-A)^2 + (Y-B)^2 = C^2. This equation has three unknown parameter variables so Hough transform should adopt three dimentional parameter space.
     Widget interface has many benefits. It can be easily modified because each menu and submenu is consisted of its own small algorithm and then integrated to one interface. When a menu or submenu need to be modified, only that menu can be easily modified. Widget interface also have the possiblity to add other algorithms freely. That is, future upgrade is very easy and many small algorithms can be organized into one. Widget interface is user friendly interface. It is almost the same as normal window menus so there is no need to teach users.


Table of Contents