1. Introduction

 

1.1 Background and Significance


         In this research I explored the application of derivative spectroscopy to the remote sensing of water quality. Here, 'water quality' means simply the concentrations of a few common constituents of the imaged water: water itself, sediment, chlorophyll, and dissolved organic matter (1, 2, 3, 4, 5). Each of these constituents has an absorption spectrum, which describes the degree to which the substance absorbs radiant energy as a function of wavelength. For the constituents named above, the absorption spectra have been measured with an absorption spectrophotometer, a device that measures the relative intensity of the lines or bands of an absorption spectrum, and are therefore known functions. When a target surface is imaged by remote sensing, the recorded information is related to these spectra.

         When remote sensing data is recorded, it is a function of the reflectance of the surface target and of the propagation of the reflected energy through the earth's atmosphere. The atmospheric contribution is removed through calibration, leaving data on the reflectance of the surface target. In multispectral imaging, this data is taken over several relevant spectral bands, or wavelength regions. In hyperspectral imaging, the data is approximately continuous and covers a wide range of wavelengths. The spectral reflectance of the target surface is to a large degree a function of its absorption. Light incident upon the surface is either scattered or absorbed. The amount of energy absorbed depends on the absorption spectra of the constituents of the surface, and the scattered energy is reflected and recorded. Therefore the recorded energy, once the atmospheric effects have been removed, is to a first approximation inversely proportional to the absorption spectra of the constituents of the target surface.

         The currently popular method of extracting information about a surface target's constituents from the recorded energy is spectral ratioing, a multispectral image processing method in which one spectral band is divided by another. This method suits multispectral data, which contains only a few carefully selected and usually discontinuous wavebands. It is less appropriate for hyperspectral data. Few researchers have tried to employ approaches commonly used in spectroscopy, or have treated the data as truly spectrally continuous (6). Those who have done so found that not all methods in spectroscopy can be applied directly to remote sensing. Among the techniques that have been developed in spectroscopy, derivative analysis is particularly promising for use with remote sensing data (6).
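         As a minimal illustration of spectral ratioing, the sketch below divides one band by another, pixel by pixel. The function name `band_ratio`, the band names, and the reflectance values are hypothetical and chosen only for illustration.

```python
def band_ratio(band_a, band_b, eps=1e-12):
    """Divide band_a by band_b element by element.

    Pixels where band_b is (near) zero are returned as None
    instead of raising ZeroDivisionError.
    """
    if len(band_a) != len(band_b):
        raise ValueError("bands must have the same number of pixels")
    return [a / b if abs(b) > eps else None for a, b in zip(band_a, band_b)]


# Hypothetical reflectance values for three pixels in two bands.
green = [0.08, 0.10, 0.06]
red = [0.04, 0.05, 0.00]
ratios = band_ratio(green, red)
print(ratios)
```

Because only two bands enter each ratio, the rest of a hyperspectral curve is ignored, which is why the method is a poor fit for continuous data.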

         Derivative spectroscopy uses the derivative(s) of a spectrum to characterize it. This process is a form of feature extraction. The original spectrum may have hundreds of data values, and thus as many features. By extracting the important features from the data, we can represent it more compactly. Differentiation yields helpful features, such as maxima, minima, and points of inflection.

        Previous work in derivative spectroscopy has been performed on reflectance spectra (6, 1, 7, 8, 9, 10). In this case, the derivative(s) of the reflectance spectral data are examined to determine the composition of the target surface. The first derivative is the slope of the spectrum as a function of wavelength. It has a value of zero at wavelengths where the original spectrum has a maximum or minimum. Similarly, the second derivative has a value of zero where the first derivative reaches a minimum or maximum. These points also correspond to points of inflection in the original curve. In this way, the first and second derivatives help extract features that characterize the spectra. In past research applying derivative spectroscopy to remote sensing, the analysis has been performed on synthetically generated or experimentally collected reflectance data (11, 10, 12, 5, 7, 13, 3, 1, 8). This research, however, focused on characterizing the known absorption spectra of the main constituents. This process is called forward analysis because it begins with known functions and uses them to develop methods that can be applied to experimental data.
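        The link between derivative zero crossings and spectral features can be sketched as follows. This is only an illustration: the single Gaussian peak, the 5 nm band spacing, and the helper name `sign_changes` are assumptions, not data or code from this research.

```python
import math


def sign_changes(values):
    """Return indices i where values[i] and values[i + 1] have opposite signs."""
    return [i for i in range(len(values) - 1)
            if values[i] * values[i + 1] < 0]


# Hypothetical spectrum: one Gaussian absorption peak centred at 500 nm.
wavelengths = [400 + 5 * i for i in range(41)]  # 400..600 nm, 5 nm spacing
spectrum = [math.exp(-((w - 500) / 40.0) ** 2) for w in wavelengths]

step = 5.0
# Differences between adjacent bands approximate the first derivative.
first = [(spectrum[i + 1] - spectrum[i]) / step
         for i in range(len(spectrum) - 1)]
# Differences of the first derivative approximate the second derivative.
second = [(first[i + 1] - first[i]) / step for i in range(len(first) - 1)]

# A sign change in the first derivative marks a maximum or minimum near band i + 1.
extrema = [wavelengths[i + 1] for i in sign_changes(first)]
# A sign change in the second derivative marks an inflection point;
# the band reported is the last one before the curvature changes sign.
inflections = [wavelengths[i + 1] for i in sign_changes(second)]

print("extrema near:", extrema)
print("inflection points near:", inflections)
```

A spectrum of 41 values is thereby reduced to three characteristic wavelengths: the peak and the two inflection points on its flanks.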
 
 

1.2 Theory

1.2.1 Smoothing


         Derivatives are highly sensitive to noise in the original signal. Therefore, an appropriate smoothing algorithm must be applied to the signal before differentiation. Smoothing is a method of reducing noise, ideally without losing the information in the signal. Many types of smoothing are in use, based upon different assumptions about the noise. Several of these have been developed for simultaneous smoothing and differentiation. The most commonly used of these methods is perhaps that of Savitzky and Golay. Their method assumes that the noise has similar characteristics across the spectrum, and it exploits this assumption by applying the same filter at every wavelength.
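         To make the idea of simultaneous smoothing and differentiation concrete, here is a minimal sketch of a 5-point Savitzky and Golay filter using the standard quadratic-fit coefficients. It is not the implementation explored in this research; the function name `savgol5` and the test data are hypothetical. Note that the same coefficients are applied at every band, which is the wavelength-invariance described above.

```python
# Standard 5-point Savitzky-Golay coefficients for a quadratic fit.
SMOOTH_5 = [-3, 12, 17, 12, -3]  # smoothing weights, divided by 35
DERIV_5 = [-2, -1, 0, 1, 2]      # first-derivative weights, divided by 10 * step


def savgol5(values, step=1.0):
    """Return (smoothed, first_derivative) for interior points only.

    The two points at each end are skipped because a full
    5-point window does not fit there.
    """
    smoothed, deriv = [], []
    for j in range(2, len(values) - 2):
        window = values[j - 2:j + 3]
        smoothed.append(sum(c * v for c, v in zip(SMOOTH_5, window)) / 35.0)
        deriv.append(sum(c * v for c, v in zip(DERIV_5, window)) / (10.0 * step))
    return smoothed, deriv


# A noise-free quadratic is reproduced exactly by a quadratic fit.
values = [x ** 2 for x in range(7)]  # 0, 1, 4, 9, 16, 25, 36
smoothed, deriv = savgol5(values)
print(smoothed)  # values at x = 2, 3, 4
print(deriv)     # slopes 2x at x = 2, 3, 4
```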

        This assumption may not hold, however, for remote sensing data. The noise present in a remote sensing signal is often machine noise, an artifact of the equipment used in capturing, recording, and transferring the signal. The degree to which this noise affects the signal varies with wavelength, and it usually affects only a portion of the spectrum. Therefore Savitzky and Golay’s method is not ideal for this application.

        Kawata and Minami maintained that even random noise varies across the spectrum. They presented a method of minimizing the mean-squared error that accounts for a varying signal-to-noise ratio. The mean filter algorithm also smooths the signal locally. It uses a window of a set width and replaces the center value in that window with the mean value computed over the window.

        All three methods introduced above were explored, but only mean filter smoothing was used in the spectral modeling for this project. This algorithm is described in equation (1) below.

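        \hat{s}(\lambda_j) = \frac{1}{n} \sum_{i} s(\lambda_i)          eq. 1

In this equation, s(λ_i) is the value of the original spectrum at band i, \hat{s}(λ_j) is the smoothed value, n is the filter size, expressed in number of sampling points, and j is the index denoting the midpoint of the filter. The sum runs over the n sampling points in the window centered on j. This process only smooths the original signal, however, so a separate algorithm is needed to compute the derivatives.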
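The mean filter can be sketched in a few lines. This is a minimal illustration rather than the IDL routine actually used; the function name `mean_filter`, the sample values, and the choice to shrink the window at the two ends of the spectrum are assumptions.

```python
def mean_filter(spectrum, n):
    """Smooth a spectrum with a moving average of odd width n.

    Each output value is the mean of the n samples centred on that index.
    Near the ends, where a full window does not fit, the window is
    truncated to the available samples (an edge-handling assumption).
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    half = n // 2
    smoothed = []
    for j in range(len(spectrum)):
        lo = max(0, j - half)
        hi = min(len(spectrum), j + half + 1)
        window = spectrum[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


noisy = [1.0, 3.0, 2.0, 6.0, 4.0]
smoothed = mean_filter(noisy, 3)
print(smoothed)
```

Other edge treatments, such as leaving the end points unsmoothed, are equally defensible; the interior values are the same either way.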
 

1.2.2 Differentiation


         To compute the derivatives of the spectra, a method known as finite approximation was used. This algorithm is in fact a “finite divided difference approximation” of the derivative. It uses a fixed (finite) band resolution, Δλ, to compute differences. Equation (2) below shows the finite approximation of the first derivative.

        \left. \frac{ds}{d\lambda} \right|_{i} \approx \frac{s(\lambda_j) - s(\lambda_i)}{\Delta\lambda}          eq. 2
Here, Δλ is the separation between adjacent bands: Δλ = |λ_j - λ_i|, with λ_j > λ_i. Thus, the finite approximation of the first derivative at each wavelength band λ_i is the difference between the value of the spectrum at the next band, λ_j, and its value at this band, scaled by the inverse of the distance between the bands. The finite approximation of the second derivative works in a similar manner, as shown in equation (3) below.
        \left. \frac{d^2 s}{d\lambda^2} \right|_{i} \approx \frac{s(\lambda_i) - 2 s(\lambda_j) + s(\lambda_k)}{(\Delta\lambda)^2}          eq. 3
The band resolution in this case is also the distance to the next farther band, λ_k. Thus Δλ = |λ_j - λ_i| = |λ_k - λ_j|. The finite approximation of the second derivative is a logical extension of the first: the estimated value at the band λ_i is the value of the original spectrum at this band, minus twice the value of the original spectrum at the next band, λ_j, plus the value of the original spectrum at the following band, λ_k, all scaled by the inverse of the squared band resolution.
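
The two finite approximations can be sketched as below. This is an illustration under the assumption of evenly spaced bands, with the first derivative taken as the next band minus the current band; the function names are hypothetical and this is not HyperSpec code.

```python
def first_derivative(spectrum, step):
    """Finite approximation (s[i+1] - s[i]) / step for each adjacent band pair."""
    return [(spectrum[i + 1] - spectrum[i]) / step
            for i in range(len(spectrum) - 1)]


def second_derivative(spectrum, step):
    """Finite approximation (s[i] - 2*s[i+1] + s[i+2]) / step**2 for each band triple."""
    return [(spectrum[i] - 2 * spectrum[i + 1] + spectrum[i + 2]) / step ** 2
            for i in range(len(spectrum) - 2)]


# Check on s = wavelength**2, whose exact second derivative is 2 everywhere.
wavelengths = [0.0, 1.0, 2.0, 3.0, 4.0]
spectrum = [w ** 2 for w in wavelengths]
d1 = first_derivative(spectrum, 1.0)   # [1.0, 3.0, 5.0, 7.0]
d2 = second_derivative(spectrum, 1.0)  # [2.0, 2.0, 2.0]
print(d1)
print(d2)
```

Each differentiation shortens the array by one band, and the first-derivative estimates are most accurate midway between the two bands used, which is why they read 1, 3, 5, 7 rather than 0, 2, 4, 6.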



 

2. Methods

2.1 Constituent Spectra


         To begin this research, I first needed the absorption spectra of the major spectral components of water color that I had chosen to study: water itself, chlorophyll, sediment, and dissolved organic matter. This data was already available to me through my research advisor for this project. The absorption spectra had been measured from water samples collected from lakes, ponds, and rivers near Rochester, NY. Basic chemical methods allowed the various components to be separated and measured independently.

         As discussed previously, the original spectra were soon found to be too noisy for effective derivative analysis. Therefore, they were smoothed in IDL (Interactive Data Language). To accomplish this, I wrote a simple program that used the language's pre-existing routines. This program first read the text file containing the spectrum and stored the data in an array. It then passed the array to a function that applied mean-filter smoothing, with a user-specified window size. Finally, the smoothed spectrum was written to a text file and saved. For each spectrum needing smoothing, I ran this program several times, obtaining results for a range of window sizes. I then plotted each result in Excel. By comparing the graphs, I was able to determine the most appropriate window size for the particular spectrum and noise under consideration. To make this determination, I distinguished between the noise and the signal by assuming that the noise is higher in frequency and that the signal varies more slowly. I was then able to compromise between the desired effect of noise reduction and the undesired effect of signal reduction. As the size of the window increases, the degree to which the spectrum is smoothed also increases. Over-smoothing results in loss of relevant data in the signal, but under-smoothing leaves noise that is greatly amplified through differentiation. Therefore a middle ground must be reached for each spectrum. The smoothed spectra are shown below in Figs. 1 - 4.

Figure 1: Absorption Spectrum of Water between 300 and 750 nm

Figure 2: Absorption Spectrum of Organic Matter between 300 and 750 nm

Figure 3: Absorption Spectrum of Chlorophyll between 300 and 750 nm

Figure 4: Absorption Spectrum of Sediment between 300 and 750 nm
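
The window-size trade-off described in Section 2.1 can be illustrated numerically. The sketch below is not the IDL program used in this research: the synthetic peak, the noise level, the window sizes, and the two measures (a second-difference roughness and the surviving peak height) are all assumptions chosen for illustration.

```python
import math
import random


def mean_filter(values, n):
    """Moving average of odd width n; windows are truncated at the ends."""
    half = n // 2
    out = []
    for j in range(len(values)):
        window = values[max(0, j - half):j + half + 1]
        out.append(sum(window) / len(window))
    return out


def second_difference_rms(values):
    """Root-mean-square of s[i] - 2*s[i+1] + s[i+2], a simple roughness measure."""
    diffs = [values[i] - 2 * values[i + 1] + values[i + 2]
             for i in range(len(values) - 2)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Synthetic stand-in for a measured spectrum: one peak plus white noise.
rng = random.Random(0)
wavelengths = list(range(300, 751))  # 300..750 nm, 1 nm spacing
clean = [math.exp(-((w - 675) / 15.0) ** 2) for w in wavelengths]
noisy = [c + rng.gauss(0.0, 0.02) for c in clean]

results = []
for n in (1, 5, 11, 21):
    # Roughness of the smoothed noisy spectrum: how much noise would
    # survive into the second derivative.
    roughness = second_difference_rms(mean_filter(noisy, n))
    # Peak height of the smoothed noise-free spectrum: how much signal is lost.
    peak = max(mean_filter(clean, n))
    results.append((n, roughness, peak))
    print(f"window={n:2d}  roughness={roughness:.5f}  peak height={peak:.3f}")
```

Both numbers fall as the window grows, so the chosen window is the largest one for which the loss in peak height is still acceptable.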

2.2 Modeling Concentration


         In Microsoft Excel, I scaled and summed the spectra in a variety of combinations to model a range of constituent concentrations in water. First, I produced an appropriately formatted text file of each spectrum alone. These files served as a reference for understanding the spectra of the four components involved. Next, I modeled water with varying concentrations of only one component. This was accomplished by choosing the spectrum of one component, weighting it with a series of values (resulting in an array of several spectra), and adding the spectrum of water to each result. These data files show how each component acts in water at different levels of concentration. Finally, I combined one component with a constant weight, another component with a variety of weights, and water. In this case, several similar files were created, each with a different constant weight applied to the first component. These files show how two components interact at different levels of concentration in water.
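
         The weighting and summing steps can be sketched as follows. The function names, the three-band spectra, and the weights are hypothetical; the actual modeling was done in Excel on the measured spectra.

```python
def mix(water, component, weights):
    """Return one spectrum per weight: water + weight * component."""
    return [[w + k * c for w, c in zip(water, component)] for k in weights]


def mix_two(water, fixed, fixed_weight, varied, weights):
    """Water plus one component at a constant weight and another at varying weights."""
    base = [w + fixed_weight * f for w, f in zip(water, fixed)]
    return mix(base, varied, weights)


# Hypothetical three-band absorption values, for illustration only.
water = [0.01, 0.05, 0.40]
chlorophyll = [0.30, 0.10, 0.20]
organic = [0.80, 0.20, 0.02]

# One constituent in water at three concentrations.
one_component = mix(water, chlorophyll, [0.5, 1.0, 2.0])
# Organic matter held at a constant weight while chlorophyll varies.
two_component = mix_two(water, organic, 0.5, chlorophyll, [0.5, 1.0, 2.0])

for row in two_component:
    print([round(v, 3) for v in row])
```

Repeating the last call with a different `fixed_weight` gives the series of files described above, one per concentration of the constant component.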
 
 

2.3 Differentiation


         Once this data was created and properly formatted, the derivatives were computed in HyperSpec (15). HyperSpec is a program written in MATLAB and developed specifically for derivative analysis. It supports many different types of smoothing and differentiation, which can be computed separately or concurrently. HyperSpec also allows great flexibility in choosing the size of the smoothing window, the bandwidth, and other important variables. The data was therefore transferred over the network to a machine with MATLAB, where HyperSpec could be run. I read the data into HyperSpec and used finite approximation to find the first and second derivatives of each data set described above. Since most files contained several spectra, the outputs were three-dimensional graphs. These graphs show wavelength on the x-axis, change in absorption per change in wavelength on the y-axis, and data set number on the z-axis. I then saved these figures in JPEG and TIFF format, transferred them back over the network, and printed them for analysis.



 

3. Results

3.1 One Constituent in Water


         The first step in this research was to understand how each component behaves in water. This is demonstrated in the six graphs that follow, which depict the first and second derivatives of a series of spectra representing water with varying concentrations of each constituent alone. The concentration of each constituent in water increases along the z-axis, spanning the full range of concentrations that may be encountered in the greater Rochester area.

Figure 5: First Derivative of Water with Organic Matter at Nine Concentrations
 
 
 
 

Figure 6: Second Derivative of Water with Organic Matter at Nine Concentrations
 
 

Figure 7: First Derivative of Water with Chlorophyll at Five Concentrations
 
 


Figure 8: Second Derivative of Water with Chlorophyll at Five Concentrations
 


Figure 9: First Derivative of Water with Sediment at Ten Concentrations
 
 


Figure 10: Second Derivative of Water with Sediment at Ten Concentrations






3.2 Two Constituents in Water


         A more realistic case, with two constituents present at once, was built in order to understand how the constituents behave together. The following graphs depict the first and second derivatives of chlorophyll and organic matter in water. In each graph, chlorophyll varies in concentration, increasing along the z-axis. Organic matter varies in concentration between the graphs, increasing from Figs. 11 and 12 to Figs. 19 and 20.


Figure 11: First Derivative of Water with Chlorophyll (at Five Concentrations) and Very Low Concentration of Organic Matter
 


Figure 12: Second Derivative of Water with Chlorophyll (at Five Concentrations) and Very Low Concentration of Organic Matter
 


Figure 13: First Derivative of Water with Chlorophyll (at Five Concentrations) and Low Concentration of Organic Matter
 


Figure 14: Second Derivative of Water with Chlorophyll (at Five Concentrations) and Low Concentration of Organic Matter


Figure 15: First Derivative of Water with Chlorophyll (at Five Concentrations) and Moderate Concentration of Organic Matter


Figure 16: Second Derivative of Water with Chlorophyll (at Five Concentrations) and Moderate Concentration of Organic Matter
 


Figure 17: First Derivative of Water with Chlorophyll (at Five Concentrations) and High Concentration of Organic Matter
 


Figure 18: Second Derivative of Water with Chlorophyll (at Five Concentrations) and High Concentration of Organic Matter


Figure 19: First Derivative of Water with Chlorophyll (at Five Concentrations) and Very High Concentration of Organic Matter


Figure 20: Second Derivative of Water with Chlorophyll (at Five Concentrations) and Very High Concentration of Organic Matter












 

4. Discussion


         This research began with the idea that derivative analysis of the major spectral components that combine to produce water color (water itself, chlorophyll, sediment, and dissolved organic matter) can be used to create algorithms for quantifying the constituents of the water column. This idea was tested through spectral modeling, a forward analysis approach. Spectral modeling in this case consisted of smoothing the absorption spectra, weighting and adding them in various combinations, and computing the first and second derivatives. The spectral modeling completed indicates that derivative analysis can be used to estimate constituent concentration.

         Figures 1 - 4 demonstrate some basic yet important characteristics of the spectra examined that influence how well each constituent can be estimated. The absorption spectrum of water (Fig. 1) is significant mainly at the longer wavelengths and has characteristic bumps around 600 nm and 680 nm. These bumps have proved to be crucial in estimating the concentration of a constituent. The absorption spectra of organic matter (Fig. 2) and sediment (Fig. 4) are similar in shape, which makes the two difficult to distinguish in concentration estimates. Both are significant only at the shorter wavelengths and thus do not overlap appreciably with the absorption spectrum of water. The absorption spectrum of chlorophyll (Fig. 3), however, is present throughout the wavelengths examined. In particular, it overlaps the spectrum of water, including the characteristic bumps around 600 nm and 680 nm.

         Figures 5 - 10 confirm these arguments. As the concentration of organic matter increases in Figs. 5 and 6, the effects are seen primarily at the shorter wavelengths. The longer wavelengths are left unaffected, and thus show the first and second derivatives of the characteristic bumps from the water spectrum. As the concentration of chlorophyll increases (Figs. 7 and 8), however, these characteristic bumps are masked by the chlorophyll spectrum. Figures 9 and 10 show that increasing the concentration of sediment is similar in effect to increasing the concentration of organic matter, although the effects are less dramatic and the outcome is noisier. The noise comes from the absorption spectrum of sediment used in this research. Although that spectrum was smoothed before analysis, a balance had to be maintained between smoothing out the noise and preserving the integrity of the information. Some noise therefore remains, and it is amplified greatly by each differentiation.

         Figures 11 - 20 show that increasing the concentration of chlorophyll and increasing the concentration of organic matter have distinctly different effects. When the chlorophyll concentration is high and the organic matter concentration is low (Figs. 11 and 12, magenta and light blue trendlines), the characteristic bumps in the water spectrum are not visible, and neither is the downturn at the shorter wavelengths that is characteristic of the organic matter spectrum. In the reverse situation (low chlorophyll concentration and high organic matter concentration: Figs. 19 and 20, blue and green trendlines), the characteristic bumps are quite visible and the negative contribution of the organic matter spectrum at the shorter wavelengths is strong.



 

5. Conclusions

         Recall that reflectance is a function of spectral backscatter and absorption. It is this relationship between absorption and reflectance that lends significance to the work performed here on the absorption spectra of the constituents. Through the following equation, this information may be translated into algorithms applicable to remote sensing data.
                      eq. 4
Here R is reflectance, b_b is the backscatter, a is the absorption, and the summations are over all constituents (14). This illustrates the inverse relationship between absorption and reflectance. This relationship is central to this research because it establishes the connection between the absorption spectra with which we worked and the reflectance data found in remote sensing images.
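The inverse relationship can be illustrated with a short sketch. It assumes one simple, commonly used form, R proportional to sum(b_b) / (sum(a) + sum(b_b)), with the proportionality constant set to 1; this form, the function name, and all numerical values are assumptions for illustration and may differ from eq. 4.

```python
def reflectance(absorptions, backscatters):
    """Assumed form at a single wavelength: R = sum(bb) / (sum(a) + sum(bb)).

    Each list holds one value per constituent. The proportionality
    constant is taken as 1, which is an assumption.
    """
    a_total = sum(absorptions)
    bb_total = sum(backscatters)
    return bb_total / (a_total + bb_total)


# Hypothetical per-constituent values at one wavelength.
backscatters = [0.05, 0.05]
r_low_absorption = reflectance([0.1, 0.2], backscatters)
r_high_absorption = reflectance([0.1, 0.8], backscatters)

print(r_low_absorption, r_high_absorption)
```

With the backscatter held fixed, raising the absorption of one constituent lowers the modeled reflectance, which is the behavior the absorption-based analysis relies on.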
         The results suggest that it is possible to determine water quality to some degree of accuracy through derivative analysis of absorption spectra. They show that while some of the main constituents that combine to create water color have a distinct effect on the resultant spectrum, others may not. Of the constituents examined, the concentration of chlorophyll promises to be the easiest to estimate. Organic matter also has a notable effect on the resultant spectrum, though a less distinctive one. Organic matter and sediment could be difficult to distinguish, since they seem to have similar effects on the resultant spectrum. With more information about the absorption spectrum of sediment, it may be possible to determine characteristic features that distinguish it from organic matter. These features are possibly present but not visible in the data used because of noise. If this is the case, it would be useful to examine the sediment and dissolved organic matter spectra at longer wavelengths, because water with a high concentration of sediment reflects in the near-IR portion of the spectrum, whereas dissolved organic matter does not.
        This research has identified the spectral features that will be useful in creating algorithms for estimating the concentration of constituents in water. These features now need to be translated into parameters for use in these algorithms. It will also be important to verify the results, using synthetically generated data and experimental data obtained via remote sensing. The synthetic data should be obtained through a radiative transfer model such as HydroLight (15). Generating reflectance curves that correspond to the absorption spectra studied here will shed light on how these features carry over into reflectance. It will then be useful to obtain experimental data with known component spectra. Using the component spectra to compare these results to those from HydroLight will further show how effective this method is.
