Daniel Newland

1. Introduction

The analysis of remotely sensed imagery is generally reduced to three questions: What are we looking at, where is it, and what are its properties? New sensors are making it easier and easier to answer those questions. New techniques can overcome the hardware limitations (e.g., resolution) of those new sensors to provide solutions with ever-increasing assurance. One useful procedure, Spectral Mixture Analysis, (1) can calculate the abundance of various material types present within an individual pixel. In 1996 Gross (2) proposed a superior Stepwise Unmixing approach that gives a more intelligent statistics-based solution.

To this point, the stepwise unmixing algorithm has been developed and documented largely on synthetically generated data. It has yet to be rigorously quantitatively tested on true data, and the remote sensing community will have little confidence in the procedure until this has been done. This research, though limited in scope, attempts to verify the power of the stepwise technique. Ultimately, it is anticipated that the stepwise procedure will supplant classic unmixing and become widely used in image exploitation.

There is a central hypothesis for this
research: that the stepwise spectral unmixing algorithm shown to work on
synthetic imagery will also be successful on real-world data. That success
will be measured in part by an error metric that describes the performance
of the technique. It will also be measured by comparing the stepwise results
to those of the traditional method, as well as to a hierarchical version
of traditional unmixing, which is more complex but more successful.

The Background first introduces the fundamental
resolution compromise in remote sensing instruments that prompts the need
for subpixel analysis. We will explain the mixed pixel concept and the
assumptions required by this study. Considerations in gathering material
information and building spectral libraries will also be reviewed.

Images of ground scenes have a trade-off
between *spatial *and *spectral *resolution. The spatial resolution
of a sensor can be defined as the area of ground imaged by one pixel. High
spatial resolution means each pixel represents a small square of ground,
so it is easy to identify tiny features in an image. Spectral resolution
is the width of the regions of the electromagnetic spectrum that a sensor
will detect. High spectral resolution allows material identification through
a characterization of its spectrum.

The digital counts of an image are defined by a linear function of the detected radiance,

(1)

where dA (the differential area of a pixel) and (the differential solid angle through the optics onto the ground) combine to be the spatial resolution, and is the spectral resolution. With a finite amount of photons reaching a detector, there must be a practical compromise between the size of a detector and the width of its spectral bandpass, in order to keep the signal-to-noise ratio (SNR) of the sensor high. A low SNR is undesirable because it impairs radiometric precision.

Sensors tend to congregate around the
extremes of this tradeoff, as illustrated in Figure 1. *Panchromatic
*systems which have several hundred nanometer bandwidths offer far better
spatial resolution than *multispectral *(around ten bands) or *hyperspectral
*(hundreds of bands) systems. However, the many bands, each with different
spectral sensitivities, can potentially be used to increase understanding
of the spatial (i.e., subpixel) content of a scene despite spatial resolution
limitations.

**Figure 1: **Spectral vs. spatial resolution
(after Gross (2))

While some pixels in an image could be
clearly labeled as "trees" or "roads," for example,
a number of pixels will be *mixed*, containing two or more material
classes (called *endmembers*). Figure 2 displays the spectral information
from two example materials, and what the spectrum of a mixed pixel would
look like if it contained equal amounts of each. Note that these are image-derived
spectra; see Section 2.4 below for details. Figure 3 shows how two endmembers
could be spatially arranged in a given pixel.

**Figure 2: **Example spectral mixture

**Figure 3:** Basic spatial mixture types (after
Robinson (3))

An *intrinsic *mixture is a microscopically-combined
material. Describing one requires a nonlinear model, which will not be
covered by this research. An *aggregate *mixture is combined at the
macroscopic level, but the elements cannot be resolved with even a high-resolution
sensor. An *areal *mixture is similar to an aggregate mixture, except
the components can be meaningfully segmented spatially. The radiance leaving
these last two types of scenes is a function of the linear interactions
between their constituents and incident photons. (3)
The process known as Spectral Mixture Analysis, (4)
or *unmixing*, can calculate the fractional presence of those predetermined
materials.

The traditional unmixing method approaches
all pixels as a linear combination of pure-material spectral vectors; from
this it is relatively simple to find the fractional makeup of each pixel.
The end product is a set of *material maps* (or *fraction maps*)
where the value of a pixel indicates the relative abundance of an endmember.
A simple four-pixel two-material example can be seen in Figure 4.

**Figure 4:** Illustration of mixed pixels
and fraction maps

2.4 Endmembers

The unmixing routines require a library of spectra to apply to a given pixel. "Grass" and "road" would be two good, distinct endmembers for an example suburban scene. Depending on the requirements of a study, there could be much finer classes (e.g., dry soil, disturbed soil, wet soil, moist soil, shaded soil, etc.). The maximum number of materials that can be unmixed is the number of bands in the input image, though this will not yield good results because linear unmixing forces some fraction for every endmember. Physically fitting a hundred materials inside a pixel is unlikely; expecting to accurately unmix the contributions of a hundred materials is even less realistic. Instead, using the minimum number of endmembers necessary to characterize the scene will reduce errors. For better solutions to the unmixing equations (cf. Section 3.2), the endmembers should be linearly independent. Earlier research on the unmixing procedures used synthetically-generated data where every pixel was a pure material and came from libraries selected to construct particular scenes. In contrast, users analyzing real data must supply their own libraries.

There are two approaches for endmember
selection with real scenes. The simpler method is to have a human operator
examine the image and make a list of the distinguishable material types,
based on their abundance and spectral distinctions. These image-derived
classes are called *in-scene* endmembers. The reference spectral signatures
will be averages of selected pixels from uniform regions of unique endmembers.
Then the spectral library will have mean digital count values for every
endmember at every band of data. It has been suggested that the Pixel Purity
Index algorithm could be used to find spectrally pure pixels in a scene,
(3) but this may not produce as many
or as finely separated endmembers as a user would desire.

The other way is to correct the data for atmospheric effects and use spectral reflectance measurements taken on-site as the reference library. This provides more accurate endmember characterization, but atmospheric compensation does introduce the possibility of more errors. Using measured spectra also requires an interpolation from the reference wavelengths to match the bands of data in the hyperspectral image.

When ground data are not available, an
appropriate library can be assembled from literature spectra that are believed
to be in the scene (Note that the operator must be careful - a reference
"deciduous forest" spectrum could be significantly different
from the forest in his image. And if he only has access to spectra measured
in a laboratory, what are the chances he will find "deciduous forest"
listed?). This would again require user selection of the material types,
though it does allow testing of an unmixing procedure by using more spectra
than are actually present. (5)

We will cover three techniques for subpixel
material investigation: traditional linear unmixing, hierarchical linear
unmixing, and stepwise unmixing. The classic linear unmixing has been available
for a number of years. Hierarchical unmixing is a more successful but also
more complicated approach that iteratively unmixes a scene with a hierarchical
materials library. Stepwise unmixing is a relatively new statistics-based
approach that has many advantages over the two better-known methods.

3.2 Traditional Spectral Unmixing

The traditional unmixing equation (4) is described as follows:

(2)

R_{i} is the reflectance of a given spectrum in the ith spectral
band, N is the number of endmembers, R_{e,i} is the reflectance
of reference endmember e in the ith spectral band, f_{e }is the
unknown fraction of endmember e in the pixel, is
the error in band i for the fit of N endmembers, and k is the number of
bands in the image. The calculation can equivalently be performed in digital
count space when in-scene spectral libraries are used; that is,

(3)

with DC_{i} the image value of a pixel in the ith spectral band
and DC_{e,i} the digital count in the ith spectral band of the
library for reference endmember e. From one of these equations it is relatively
simple to find the N values of f_{e} in the pixel by minimizing with
least-squares techniques.

There are three constraint conditions
that may be applied to solving the equation. (3)
The first is *unconstrained*, described above, where fractions are
allowed to take any value needed to minimize the residual error. The second
condition is called *partially constrained*. Here, the sum of all
the fractions within a pixel must be unity, giving one equality constraint.

(4)

Both unconstrained and partially constrained unmixing allow positive
and negative fractions. The *fully constrained* condition requires
that each fraction lie between zero and one.

(5)

This has 2*N inequality constraints. In practice, the code that implements these algorithms still produces negative fractions and values greater than one regardless of constraint, but the frequency and magnitude of the aberrations are reduced by increasing the constraint of the process.

Processing time increases with more constraints (more iterations are necessary), which would seem to be unavoidable in order to achieve the best solution. But small negative fractions are acceptable because the reference endmembers are just averages of the true material vectors, which are assumed to be distributed in a gaussian fashion. Regardless of constraints, the weakness of the classic unmixing process is that it calculates the presence of all available endmembers in every pixel across an image. The introduction of too many materials results in errors in the fractions. (5)

3.3 Hierarchical Linear Unmixing

To overcome this primary fault of classic
unmixing, a *hierarchical *approach can be used to reduce the number
of endmembers found per pixel. Rather than solving fractions for all materials
at once, this technique first unmixes broad material classes, then proceeds
to a group's constituents only if the unmixed fraction is greater than
a given threshold. The figure below is an example of how a materials hierarchy
could be arranged.

**Figure 5: **Example Materials Hierarchy and
Complete Library

The upper-level class spectra are simple averages of the spectra below them. Here the two tree types would be averaged together to form a single representative "tree" class, which would be combined with the "grass" spectrum to define the "vegetation" class. A decision threshold would be set so that classes not significantly contributing to the mixed spectrum would not be refined in the next unmixing (We will have a threshold of 0.2 for this example.). The routine would first unmix the top tier of endmembers. Let's assume the results are as follows:

**Figure 6:** Hierarchical Unmixing Example

At the next iteration, the classes unmixed would be "concrete," "metal," "water," and "vegetation." Once a class fails a threshold, it is not subdivided later. If there were a third level to the "man-made" class, and "vegetation" had an unmixed fraction of 0.2 at the second iteration, a third unmixing would not include vegetation sub-classes. This makes the coarser unmixings more important, and allows the algorithm to find fewer endmembers in each pixel.

There are two disadvantages to this method. The first is the labor required to choose material groupings and average spectra together at the various levels. The example had just six endmembers; consider 50- or 100-endmember libraries with five to eight hierarchical levels. The second drawback is that at higher decision thresholds, a pixel might not reach any of the base materials. A user might have to run the code several times to find a threshold that had few or no empty pixels in the output.

3.4 Stepwise Spectral Unmixing

The *stepwise *technique designed
by Gross (6) employs statistical regression
to also select an subset of the full library of endmembers, but without
the cumbersome preparation of hierarchical unmixing. These endmembers are
those that minimize the error. The general predictive equation

(6)

is used, where is the estimated spectral vector for the pixel, A is the matrix of reflectance values, and x is a vector containing the fractions.

A basic ANOVA (Analysis of Variance) table
is illustrated in Figure 7. This table can be used to analyze the variance
in a predictive model into its component parts: one due to the relationship
with the predictor variable(s), and one due to error. Using Equation 6
as a model, let be
an m-vector, x an n-vector, and A an m x n matrix (m is the number of bands
and n is the number of endmembers). The first column in the table below
contains the variation source. The second column contains the degrees of
freedom for that source. The third column defines the Sum of Squares (SS).
The degrees of freedom and the Sum of Squares are referred to as *corrected
*because the mean of y is subtracted from the measurements. The fourth
and fifth columns show uncorrected degrees of freedom and how to calculate
the SS in matrix notation. The final column calculates the Mean Square
(MS) by dividing the SS by the appropriate degrees of freedom.

**Figure 7: **Basic ANOVA Table

The errors should have a chi-square (X^{2})
distribution if the regression model is a good one and the errors are gaussian
with zero mean. If the regression model is weak, then the errors will not
be chi-square distributed. Knowing that the ratio of two chi-square variables
divided by their degrees of freedom has an F-distribution, as described
by Equation 7, a hypothesis test can be formed where m and n are the degrees
of freedom for the two chi-square variables.

(7)

If the errors are gaussian, then SSR and
SSE are X^{2 }distributed

(8)

(9)

and the ratio of

(10)

will follow an F_{n-1,m-n} distribution. This ratio is compared
to a tabulated F-statistic with n-1 and m-n degrees of freedom at a desired
confidence level. If the ratio is greater than the F-statistic, then the
regression model is a good one. If the ratio is less than the tabulated
value, the model cannot explain enough of the variance to justify using
it, and it should be replaced by a better model.

Stepwise regression is based on an ANOVA
calculation of the *Extra Sum of Squares.* (7)
This method compares an n-term model with an (n-1)-term model to determine
the benefit of adding the additional term. The reduced-order term is defined
by

(11)

where z is an (n-1)-vector and W is an m x (n-1) matrix. The SS and MS are then calculated as shown in Table 3.

**Figure 8:** Extra Sum Of Squares ANOVA Table

As with Figure 7, the sum of squares is
X^{2} distributed. The ratio of MS_{extra_term}/MSE is
compared to the value in a F-statistic table with the appropriate degrees
of freedom at the desired confidence level. If the ratio is greater than
the tabulated value, then the regression model is valid, and the model
with the added term is required. If the ratio is smaller, then the simpler
model is retained. In the coded implementation, a fixed value of F-to-enter
and F-to-remove is used rather than an F-statistics table, regardless of
the degrees of freedom in the model of interest. Higher F-values produce
a more stringent endmember selection; lower values let the algorithm accept
more endmembers.

The stepwise selection continues until
the final subset contains the optimum combination of endmembers from the
reference library. The program has loops for both adding and removing variables.
This method can map a greater number of endmembers than the classic procedure,
and can also prevent extraneous fractions from being over-fit to the image
noise. The fraction maps can be generated with or without constraints;
if the problem is constrained, the solution is obtained through a *restricted
least squares*, involving linear equality and inequality constraints.
(2)

3.5 Error Metric and Performance Evaluation

When truth maps are available, the error
metric proposed by Gross (2) can be used
to quantitatively evaluate the fraction maps that are produced by the unmixing
algorithms. The *squared error* (SE) is defined for each set of fraction
maps by

(12)

The pixel summation is over all N pixels in the image; the material summation is over all materials in the library. The relative magnitude of this error is a measure of the closeness of the test fractions to the truth fractions. Errors of both comission and omission are penalized by this metric. Qualitatively, a simple visual check of the correlation between the fraction maps and the original data can also be performed.

The validation of the stepwise unmixing routine underwent two phases. This section describes initial testing and results on the stepwise and linear unmixing techniques. We will discuss data processing, endmember collection, truth map production, and the tools used to perform the operations. Section 5 details how the second approach was refined from the first to test the stepwise routine against a hierarchical linear unmixing.

We chose an image from the HYDICE (Hyperspectral Digital Imagery Collection Experiment) sensor, which is an airborne pushbroom imager. (8) HYDICE images have 210 spectral bands covering wavelengths of 0.4 - 2.5 microns in roughly 10nm bandwidths. The flight strips are 320 pixels in the across-track direction, with about six pixels reserved on either end for background measurement. The specific image chosen was taken over the ARM (Atmospheric Radiation Measurement) program site in Lamont, Oklahoma in June 1997. It was cropped to 320 pixels square for ease of processing.

With a sensor flying height of approximately 3.5 km and a GIFOV (Ground Instantaneous Field of View) of 0.5 mrad, the spatial resolution is 1.75 meters per pixel. The scene (Figure 9) is largely pasture, with a section of cut pasture in the center that has calibration panels arranged in it. Other notable features include trees, a road, a parking lot, and vehicles. An excellent ground truth collection accompanied the overflight. Photographs and diagrams of the entire scene, photographs and reflectance measurements from various features, and radiosonde data recording the atmospheric profile over the site were all very useful in analyzing the data.

**Figure 9: **Test image

To generate a "truth" image, every pixel in the image had to be classified into only one of a set of endmembers. With strips of zeros on either side of the image, the 320 x 320 image was further cropped to 300 x 300 pixels to avoid a "blank data" endmember. Using "blank data" in the unmixing would not only be meaningless for the majority of the image, but would also introduce errors. To make the input for the unmixing routines, the image was convolved down with a 4 x 4 averaging kernel. That is, the convolved image was sized 75 x 75 pixels, where each convolved pixel represented the simple average of a 4 x 4 window (16 total pixels) from the original image, for all bands. Figure 10 shows this spatial massaging. The plan was that the original high-resolution image would represent pure-material pixels, and the convolved image would then have mixed pixels containing fractions of those pure spectra.

**Figure 10:** Spatial processing

After spatial issues, spectral decisions came next. Using radiosonde data taken at the ARM site, MODTRAN (9) was run to determine which bands of the image were impacted by atmospheric absorption. Band 108 (1.4 microns), shown in Figure 11 below, is an example of the interference caused by the atmosphere. Figure 12 gives the atmospheric transmittance over the spectral coverage of the HYDICE instrument. Based on defaults from previous studies, (5) 128 of the 210 total bands were chosen to be used in the unmixing routines. When selecting endmembers, all bands were included so that the fullest possible spectral characterization could be viewed.

**Figure 11: **Band 108, noisy band example

**Figure 12:** Atmospheric attenuation and
band selection

Realistically, with a 1.75m ground spot for every pixel, all the pixels in the original image are mixed themselves. But the definition of "mixed" depends on what endmembers are used. Regardless of how well a sensor can spatially resolve detail in a scene, real images will always have mixed pixels at some level of distinction. A patch of "pasture" could be broken down into "soil" and "vegetation," which could in turn be divided into soils of different moisture and mineral content, and vegetation of different species and health. Blades of grass at different orientations would be a further extension. To make unmixing tractable, endmembers were defined broadly enough so that most pixels in the image would be "pure." Tables of the 24 classes and their descriptions follow. A few examples from the library are shown in Figure 14. The effects of the atmosphere can be seen in the large dips in the spectra past 0.9 microns.

**Figure 13:** Endmember listing

**Figure 14:** Sample endmember spectra

Note that these are all image-derived endmembers.
Actual reflectance spectra from the ground truth was available for many
materials, but using in-scene endmembers was better for two reasons. First,
the reflectance data did not cover the full scope of the materials we wanted
to investigate. Secondly, using reflectance material spectra would necessitate
some form of atmospheric correction to transform the image to reflectance
space, which would only complicate the process. It should also be noted that ground truth is a luxury
that many operational applications of the unmixing routines will not have.

Creation of the truth map was the largest hurdle to overcome in the first approach.. A truth map is essentially a classified image, where every pixel has been assigned a material class. The problem we were faced with is that "truth" will always be arbitrary for real images. The best truthing scenario would have scores of scientists combing through the entire area to be imaged, recording and identifying spectra (and making the "right" decisions) at every meter. That won't happen.

But we can make a good effort to mimic that approach. Classification algorithms operating on hyperspectral data use a variety of techniques to make those decisions. All the standard methods available in the hyperspectral exploitation package ENVI (10) were examined for their accuracy. Note that "examined" means a visual assessment of the material assignments, and "accuracy" is relative to what an operator believes is appropriate for a pixel. Again, truthing real data is an arbitrary task.

K-means and IsoData unsupervised classifiers, which choose their own endmembers, were run to view the spectral separability of the data. Supervised classifiers, such as Minimum Distance, Spectral Angle Mapper, and the Parallelepiped Classifier, were used to map the endmembers listed above across the scene. The Gaussian Maximum Likelihood and Mahalanobis Distance classifiers could not be applied because they require endmember sample sizes greater than the number of bands in the image - some classes, most notably the calibration panels, had relatively few sample points. The outputs were not very encouraging; none of the routines produced a satisfactory classification. The challenge we faced was that a human operator would have to classify all 90,000 pixels in the image, using the ENVI software to define material regions.

This has some advantages. A human user can incorporate texture information, neighboring spectra, and contextual information when making decisions. For example, classification algorithms wouldn't know that calibration panels had not been placed anywhere other than the cut pasture in the center of the scene. The endmembers were specifically chosen to make this human-driven task easier. For example, about 60% of the image can be quickly identified as the "pasture" class. The "unknown man-made" class is the repository for miscellaneous spectra unlike the major classes.

Different ways of displaying the data assisted material identification. Figure 15 gives true-color (RGB = 565, 530, 420 nm) and false-color IR representations of the scene (RGB = 870, 565, 530 nm). In false-color IR images, bright red indicates healthy or dense plant life, which is useful for segmenting the vegetation-related classes. Another technique, the Principal Components transform, decorrelates an image cube into bands of decreasing variance contribution. (11) The first two bands are shown below. Both have good contrast around the vehicles and calibration panels; band one also isolates the road well, and band two highlights vegetation health and parking lot detail.

(a) (b)

**Figure 15: **Test image, a) true color and
b) false-color IR

(a) (b)

**Figure 16: **Principal Components image,
a) band 1 and b) band 2

**Figure 17: **Truth map and materials key

Figure 17 is the operator-classified truth map of the original 320 x 320 scene. The cyan strips on either side of the image are from an earlier endmember set that included "blank data." Most regions are fairly large and homogenous, which makes interpretation of the unmixing outputs easier. Note the crisp boundaries, particularly around calibration panels. Because all pixels were assumed pure, an edge between a panel and the ground had to be one of the two classes, even if its spectra was a mixture of the two. In general, mixed spectra were classified with endmembers that had larger populations, keeping smaller regions more distinct.

An example of one of these compromises is shown below. A mixed pixel lay between "bottom vegetation" and "metal." As the pixels were in the region of the broader "parking lot" endmember, the mixed pixel was classified as that.

**Figure 18:** Example classification decision
in the truthing process

Basic data processing such as cropping and convolving was done with ENVI and IDL. (12) ENVI was used for all image and spectra viewing. Programs were written to compile the ENVI material regions-of-interest into a single truth image, to split the truth image into separate material maps, to calculate the squared error of unmixed fraction maps, and to produce mosaics of output fraction maps. Existing code from Konno's research (5) was modified to perform the linear and stepwise unmixing. Both routines offered unconstrained, partially constrained, and fully constrained options. ENVI also has a linear unmixing routine with unconstrained and partially constrained options. The squared error and appearance of fraction maps from the ENVI and Konno linear unmixing were approximately the same; Konno's algorithm was used in testing because of the extra constraint option and the ease of multiple-run scripting.

The linear and stepwise unmixing routines were applied to the convolved test image for all three constraint conditions. The F-to-enter/exit parameter for the stepwise unmixing was set at 20 based on results of Konno's studies. The plot below gives the squared error results, showing the clear superiority of the stepwise routine.

**Figure 19: **Squared error plot

The fraction maps reflect this success as well. The following figures first show the materials key and the reference truth fractions, then the partially constrained results for the two algorithms. The maps are scaled linearly between zero and one; values less than zero are set to black and values greater than one are set to white. Because of this scaling, some of the material maps with small populations (e.g., "shadow," "bad data") appear to be empty. Of the three constraint cases, partially constrained looks the best. The unconstrained outputs look similar but not as clean, and the fully constrained outputs look blocky and noisy compared to the other constraint conditions despite lower squared error. Complete linear unmixed results can be viewed here; complete stepwise results can be seen here.

**Figure 20: **Material labels for fraction
maps

**Figure 21: **Truth fraction maps

**Figure 22: **Linear unmixed fraction maps

**Figure 23: **Stepwise unmixed fraction maps

We see that the stepwise material maps are much cleaner. Although both algorithms received the same band selections, the linear routine appears to track band noise in the maps for endmembers with relatively small magnitudes - shadow and lower-reflectance panels, for example. The stepwise results are somewhat noisy, but there are far fewer false positives than the linear results.

Out-of-bound fractions do occur. With more constraints the algorithms produce fewer aberrant pixels with smaller magnitudes. The fraction maps can be color-coded to illustrate the effects of the constraints, as shown below. Negative values appear red and large positive values appear green, with a small amount of error on either end of the expected region. The smaller abundance of colored values in the fully constrained images explain their smaller squared error. Again, complete linear unmixed results can be viewed here; complete stepwise results can be seen here.

**Figure 24: **Key for color maps

**Figure 25: **Linear unmixed color maps

**Figure 26: **Stepwise unmixed color maps

Total-image squared error is simply the sum of the contribution from every pixel. 3D surface plots (Figure 27) of the individual values can be made to show where in the image the unmixing routines had difficulty fitting endmember fractions. Note the different scales in the images below: the linear plot ranges from 0.0 to 0.23, while the top of the stepwise plot is just 0.008. The plot of the linear unmixing shows that much of the error came from a few points at the left edge of the parking lot; other significant contributions are from the circular field in the lower left hand corner, the vehicles at the top of the cut pasture, and more points spread over the parking lot. The stepwise routine does well throughout the image, with perhaps slightly more error from the parking lot. This is unsurprising, since all these areas have pixels on boundaries (around cars, etc.) of different materials.

(a)(b)

The last proof of the stepwise routine is in the histograms of the individual fraction maps. Below we have examples of the "field" mapping for the partially constrained outputs of the two routines. The histogram for linear unmixing is very wide because it is finding some fraction for every pixel across the image. The stepwise unmixing histogram has a sharp peak around zero, because it has correctly rejected "field" from most of the image. This is indicative of its ability to intelligently select only a few materials per pixel.

**Figure 28:** Histogram comparisons of "field"
fraction maps

The stepwise routine had clearly outperformed traditional linear unmixing in the first approach. The goal of the second approach was to compare stepwise unmixing to a hierarchical linear unmixing. Other objectives included better band selection, a more thorough endmember library, and a more realistic truth classification.

5.2 Methods

We started with the same 320 x 320 x 210 image as before. Rather than cropping it, we kept the original spatial size, again convolved down by a factor of four to produce an 80 x 80 unmixing input. At the end of testing, when computing squared error and generating viewable fraction maps, pixels affected by blank data and bad data points were zeroed out. This allowed a study of more pixels and eliminated aberrations from bad data that were left in the results of the first approach.

**Figure 29: **Spatial processing

A new spectral band subset was chosen. The band
selection from code defaults was changed to better match the atmospheric
absorption for this particular scene, as shown in the figure below. Previously
the entire 210-band image was given to the unmixing routines and bad bands
were blanked out during processing. This time the bands were eliminated
beforehand, which shortened computation times.

**Figure 30: **Spatial processing

Many improvements were made in the endmember library, which was newly designed to support a materials hierarchy (cf. Section 3.3). The listing is described in Figure 31. Figure 32 shows some examples from the remade library. "Strong," "medium," and "weak" are used to distinguish spectra of similar shape but varying magnitude.

Some calibration panels, renamed after the peak digital counts in their spectra, were merged together. The spectral shapes and magnitudes of "16% Gray Panel" and "Emissivity Panel #4," for example, were very similar. "Bad data points" from the first library was eliminated not only because it was not a useful class, but because its high negative and positive values introduce errors into the unmixing. The old "vehicle/metal" endmember contained metal-like spectra of varying magnitudes; these are fairly well approximated by the new panel endmembers. This allowed better classification of trucks, vehicles, roofs, equipment, etc. Although many of the classes have similar spectral shapes, they are not simply linear combinations of one another. Real data has the natural variations necessary to insure linear independence of spectra and satisfy the unmixing calculations.

**Figure 31: **Endmember details

**Figure 32: **Sample endmember spectra

The three-level endmember hierarchy was arranged as shown below. The first level of "man-made" and "vegetation" is fairly intuitive for this scene, and the base endmembers fall believably into either of the two groupings. The second level is more arbitrary. "Parking lot" was closer to the "metal" spectra than one of the "road" types. The difference between the "cut" and "uncut" vegetation types is that the "cut" vegetation was cropped closer to the ground and had a more prominent soil contribution in the spectra. This can be seen in the plot above, as the "cut" spectra have a gentler slope towards the vegetation peak at about 0.7 microns.

**Figure 33: **Materials hierarchy

Endmember selection and construction of the truth map were done at the same time. The finer distinctions among endmembers necessitated an automated classification. Accurate separation of the vegetation classes in particular would have been an enormously time-consuming task if performed manually. Once again the ENVI classification techniques were applied. Classifications with Gaussian Maximum Likelihood and Mahalanobis Distance were made possible by selecting the same sample areas repetitively; this tricked the routines into believing that endmember populations were large enough to perform the calculations. After repeatedly modifying the endmembers based on classified results, the simplest technique, Minimum Distance to the Mean, was chosen. This algorithm assigns a pixel to the class whose spectrum is the smallest Euclidean distance from the pixel's spectral vector. (11)

**Figure 34: **Truth map and materials key

**Figure 35: **Truth detail

Figure 34 is the classified image. While more difficult
to interpret, it is a more realistic truth map, with natural transitions
between regions. The "medium vegetation" class does bleed into
the original "cut pasture" rectangle, which is an acceptable
classification based on the better health of the upper-half of the rectangle.
One interesting misclassification is that the bright vegetation in the
parking lot was identified as "strong vegetation" rather than
the "tree" class, though the "tree" class had been
sampled partially from that region. From the detail in Figure 35 the edge
effects are apparent. The boundary between a stronger calibration panel
and the vegetation next to it spectrally appears to be a weaker panel.
The classification more accurately describes such mixed pixels throughout
the image.

The only new tool introduced in the second
approach was the IDL code that performed the hierarchical linear unmixing.
This employed the linear routine used previously, but was lengthy because
it also performed the averaging of the spectra necessary to build the spectra
at the different hierarchical levels. Most of the program is hard-coded
for this particular study; the unmixing loop, for example, only iterates
three times.

Stepwise unmixing outperformed both the linear and hierarchical unmixings. The graph below gives the squared error results for the partially constrained cases of the three techniques. A few trials were run where the F-to-enter/exit of the stepwise routine was varied, but values other than 20 did not significantly improve squared error or appearance.

**Figure 36: **Final squared error results

As in the first testing phase, unconstrained
outputs did worse both quantitatively and qualitatively, and fully constrained
fraction maps had lower squared errors but poorer appearances. When the
hierarchical unmixing was fully constrained, in fact, the fraction maps
seemed barely related to the scene. Hierarchical tests were run for varying
decision thresholds, and as the threshold increased, the images got cleaner
and the squared error decreased. The problem was that, after a threshold
of 0.3, the number of unclassified pixels rose dramatically (see Figure
37 below). The 0.3 threshold gave the best performance without skipping
any pixels.

**Figure 37: **Hierarchical squared error results

The following images are fraction maps and key for the three unmixing routines at the partial constraint. Because of the new endmember set, the results are somewhat more difficult to interpret. The linear routine once again produces many noisy fraction maps, despite the more stringent band selection. In contrast both the hierarchical and stepwise techniques give fairly clean and accurate results. One noticeable problem with the hierarchical unmixing is that the parking lot area appears in many of the panel maps. The spectra are somewhat similar, but the stepwise routine does not encounter the same difficulty. Even in the vegetation classes it can be seen that the stepwise results are closer to the truth.

**Figure 38: **Material labels for fraction
maps

**Figure 39: **Truth fraction maps

**Figure 40: **Linear unmixed fraction maps

**Figure 41: **Hierarchical unmixed fraction
maps

**Figure 42: **Stepwise unmixed fraction maps

Color material maps again show the dismal performace of the linear routine, with out-of-bounds fractions occuring for most materials in 50-80% of the pixels. The stepwise maps are more speckled than the hierarchical maps, which have smaller and denser colored regions.

**Figure 43: **Key for color maps

**Figure 44: **Linear unmixed color maps

**Figure 45: **Hierarchical unmixed color maps

**Figure 46: **Stepwise unmixed color maps

The squared error plots for the three
algorithms look fairly similar, but note again the scale differences. High squared error appears along the bottom
of the parking lot at the same locations. These pixels were likely influenced
by material boundaries or bad data that were not blanked out. A few points
in the cut pasture are also noticeable in all three plots. The stepwise
routine has larger errors from both lower corners as well.

(a)(b)

(c)

The fraction map histograms of the linear and stepwise unmixings are similar to the results from the first approach: wide curves for the linear unmixing, sharp spikes for the stepwise unmixing. Hierarchical histograms are very close to stepwise histograms, with a small skirt of nonzero values around the central peak. An example is shown below.

**Figure 41: **Histogram comparisons of "strong
cut vegetation" fraction maps

The results support the research hypothesis that the stepwise routine performs better than linear and hierarchical linear unmixing. Unfortunately, there was not time to perform tests on other real images. But combined with earlier work on synthetic data, we believe there is sufficient proof that validates the algorithm. Previous research (5), (3) has examined such issues as spectral library size and spectral and spatial resolution. Quantitative evaluation requires a truth mapping; here we have used two different approaches for generating truth from actual data that both supported stepwise unmixing.

Linear unmixing, as stated previously (cf. Section 3.2), unmixed all endmembers in every pixel across the image, which gave poor results. This was expected, however. Traditional unmixing can handle about five endmembers fairly well, but begins to fail when too many materials are added to the model. An example of a successful large-scale application would be 30-meter pixels unmixed into urban, water, forest, and agriculture. Restricting endmembers to certain types of vegetation or rocks could also work in other scenarios if a user were only interested in particular materials. When higher-resolution data is examined, with the intent of a more thorough and realistic material mapping, classic unmixing will generally not perform well.

The hierarchical approach is just a method for nominating a subset of the full material library; this allows adaptive endmember selection on a pixel-by-pixel basis, which reduces errors. The unmixing math is the same as the classic technique, with fewer endmembers to unmix. In this study, hierarchical unmixing did twice as well quantitatively and much better qualitatively. The problem is that good hierarchical unmixings can only be produced by scene-, library-, and task-specific material groupings. It is possible that there are better ways of coding the algorithm and constructing the hierarchical library than the way it was done in this research. However, we believe that the process will typically be a tedious and time-consuming task. In addition, it may necessitate unrealistic material groupings.

In contrast the stepwise routine handles all the endmembers and still succeeds. We have tested with libraries of about 20 endmembers; the stepwise routine unmixed around 4-8 materials per pixel. This gives the subset selection that allows better unmixing without the lengthy preparation of hierarchical linear unmixing, plus the stepwise squared error was half of the hierarchical squared error. Given that computation times for the three routines were roughly the same, the performance and ease of use of the stepwise technique make it the clear choice for spectral unmixing.