This chapter describes the details of the experiments that were carried out during our research study. First, in order to describe the input images in terms of their color properties, several color classification schemes were tested. We examined numerous color representation systems to see which of them could provide the most stable representation of the perceived color experiences. Chapter 3 introduces all the measures that were tested and their performance results. In order to account for the varying illumination conditions, a reflectance-finding algorithm, inspired by Edwin Land’s work, was also implemented and evaluated. As the image processing experiments revealed some distortion effects on the inputs, we added a linearization procedure to the algorithm. This was expected to eliminate the non-linear aberrations that had prevented accurate calculations. This chapter also describes how the set of reflectance values assigned to the locale model regions was selected.
3.1 Color Classification
3.1.1 Basic Color Terms
In the first stage of our research, the problem of color classification was attacked. This involved associating terms with visual color experiences. As the goal of our research was to match input images with an already existing environment representation by using color information, it seemed desirable to enable Susan B. to actually name the color of the regions visible on a given image. Several color classification algorithms were considered that would associate a particular set of characterizing values with an English color name. Some of the questions that emerged were the following. What colors should be selected to represent a wide range of visual experiences? What terms would be appropriate to identify these?
To answer the above questions, the initial ideas were obtained from the Basic Color Terms [Kay, 1991]. The authors, Kay and Berlin, described their research on the chronological and ethnological development of color terms and examined the question of whether there exists a set of basic color terms that is common to almost all existing languages and is encoded by all ethnic groups. They were also curious to know whether it was possible to show a systematic chronological development in the enrichment of color vocabularies and whether this process was related to the state of cultural development at all.
The authors used very rigorous definitions to interpret the notion of a "basic color term". In the following, the four most significant restrictions, which were applied by Kay and Berlin to their classification task, are described:
Kay and Berlin also demonstrated that color categorization was not a random procedure. In fact, all languages contain a similar set of color terms. The partial order, indicated on Figure 7, represents two major ideas of their study.
First, the table represents a chronological order of the lexical encoding of the basic color categories in all languages. That is to say, the groupings of Figure 7 are also interpreted as evolutionary stages in the linguistic development of expressing color sensations. As an example, white and black were the first color notions that people distinguished, and all the rest of the color terms appeared later (in the given order) during the development of language vocabularies.
Secondly, the table also indicates the distribution of applied color terms in contemporary languages. This simply means that if a language encodes merely three color-terms, for instance, then these terms cannot be other than black, white and red. The "<" ("less than") sign on the figure indicates the partial order between the color equivalence classes. If (color1 < color2), then color1 is present in all languages that encode color2 and it is also present in some languages where color2 is not identified. Based upon these rules, Kay and Berlin sorted the examined language groups into seven categories. (The authors determine the green and yellow group as two instead of one stage.) The more basic color terms a language encodes, the "higher developed" it is declared to be.
According to Kay and Berlin, the English language belongs to the most prominent category, as it encodes all the elements of the presented basic group. That is the reason why the first color classification experiments of our research project applied exactly the above listed eleven color terms.
The reader should note that this categorization does not mean that English speakers do not differentiate between more than eleven color terms. It only indicates that all the existing terms expressing color sensation are built upon this basis group. Humans, in fact, are able to recognize several thousands of color variations. However, as it is demonstrated in [Fathima, 1992], they divide their color space into no more than a total of 220 categories.
3.1.2 Experiments with Color Classification
The next challenge that emerged in our study was related to the quantification of colors. In other words, it was necessary to associate each of the color terms with a characteristic set of numbers (usually a triplet). These numbers were to capture and measure the distinguishable properties of the color sensations in order to enable the classification procedure to effectively differentiate between the eleven categories. That task, in general, addressed several interesting questions. Will the RGB color-coding of our system provide a sufficient solution? How well can a set of triplets represent all different shades of the same color? Is the change between various tints of a color linear? How can the best fitting nomenclature be identified in three-dimensional-space?
3.1.2.1 RGB
Our first experiments utilized the RGB color values. Eleven constant RGB triplets were defined to identify the above-described basic color terms. Then, for each input, the Euclidean distance of the input color was measured from all the color terms. The triplet with the smallest distance was identified as the examined color. This simplistic approach did not turn out to be adequate. The main problem with the pure distance calculation lay in the fact that the total distance did not indicate the wavelength band in which the main difference occurred. For instance, both (0, 200, 200) and (200, 200, 0) are of the same distance from green (0, 255, 0), but they are very different in nature. While the first triplet represents a light blue color, the second refers to a dirty yellow. Additionally, the distance measure did not incorporate the possible intensity level differences of the same color. For example (0, 0, 255) and (0, 0, 125) both should be identified as "blue", but they definitely have different distances from any of the stored constant triplets.
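The nearest-triplet rule and both of its weaknesses can be illustrated with a minimal sketch. The reference triplets below are illustrative assumptions (the eleven constants actually used are not listed in the text), and `rgb_distance` and `nearest_color` are hypothetical helper names.

```python
import math

# Illustrative reference triplets; assumed values, not the constants used in the study.
REFERENCE_COLORS = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
}

def rgb_distance(a, b):
    """Euclidean distance between two RGB triplets."""
    return math.dist(a, b)

def nearest_color(rgb, references=REFERENCE_COLORS):
    """Return the name of the reference triplet closest to rgb."""
    return min(references, key=lambda name: rgb_distance(rgb, references[name]))

green = REFERENCE_COLORS["green"]
d_blue = rgb_distance((0, 200, 200), green)    # light blue versus green
d_yellow = rgb_distance((200, 200, 0), green)  # dirty yellow versus green
dark_blue_name = nearest_color((0, 0, 125))    # a darker shade of blue
```

Here `d_blue` and `d_yellow` are equal although the two inputs differ in different bands, and with these assumed references the darker blue (0, 0, 125) lies closer to black (distance 125) than to blue (distance 130), so it is not named "blue".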
3.1.2.2 Color-Intensity
After several sets of experiments, where the input colors were "misidentified" because of the above-mentioned reasons, another color basis was tested. In VLSys, there exists a color-intensity representation that can be described as follows. The term "Color" refers to a normalized set of color values in the range of [0, 255]. If the original RGB triplet of the input is represented by (c1, c2, c3), all the variables are multiplied by {255/max (c1, c2, c3)}. In this way one indicator always takes the 255 value. The "Intensity" measure signifies the quantity of the original maximum descriptor value divided by 255, that is {max (c1, c2, c3)/255}.
This representation was more efficient in handling the delicate difference between darker and lighter tints of a color than the earlier tested one. For example, light red and dark red were both classified as red. It was not their color but their intensity measure that was different.
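The Color-Intensity decomposition described above can be sketched as follows. The function name `color_intensity` and the handling of pure black are assumptions of this sketch; the text does not say how VLSys treats a triplet whose maximum is zero.

```python
def color_intensity(rgb):
    """Split an RGB triplet into a normalized Color and a scalar Intensity.

    Color scales the triplet by 255 / max(c1, c2, c3), so its largest component is 255.
    Intensity is max(c1, c2, c3) / 255.
    """
    peak = max(rgb)
    if peak == 0:
        # Black has no defined Color; returning zeros is an assumption of this sketch.
        return (0.0, 0.0, 0.0), 0.0
    scale = 255.0 / peak
    color = tuple(c * scale for c in rgb)
    return color, peak / 255.0

light_red = color_intensity((255, 0, 0))
dark_red = color_intensity((125, 0, 0))
```

Both inputs receive (up to rounding) the same Color, (255, 0, 0), and differ only in Intensity, which is 1.0 for the light red and 125/255 for the dark red.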
Recognizing intensity differences, in fact, is an important feature in classification tasks. In case the lighting conditions are different in two environments, the surface colors are to be matched despite their varying intensity attributes. For the Color-Intensity color basis, the classification strategy was modified. As a first step, the input color was assigned into one of three groups based on its intensity value. The color was labeled "dark" or "light" in the extreme cases and did not take an extra tag if it had its brightness value in between. Then, the Color categories were compared. Instead of merely calculating the distance (dist) between two Color values, a confidence measure (c) was defined. The confidence level assigned to a Color indicated how similar it was to another Color term.
In the above equation, the term "max_dist" (~441.673) refers to the maximum possible distance between two color triplets. This exists between the color black and white or (0, 0, 0) and (255, 255, 255). Then the intensity difference ("d") between the examined terms was also calculated and incorporated into the formula. To find the appropriate color name of the input, the algorithm selected the term that made the expression take its minimum value.
According to the evaluation tests, this last version of color-naming schemes operated more effectively than the earlier ones. It was able to distinguish consistently between different intensities of the colors white and brown, for example, and even if it misidentified a color (predicted gray instead of white) the mistakes could be labeled less severe than the ones resulting in previous operations.
3.1.2.3 New Set of Color Terms
Throughout the experiments another shortcoming of the classification algorithm was discovered. The eleven basic color categories, introduced by Basic Color Terms, did not give an adequately detailed representation of the colors that appear in Susan B.’s office environment. The eleven colors appeared to be "too pure" and not generally visible in a usual office setting. For instance, it is very rare to find pink or purple objects in the hallway of the science laboratory. However, one can easily count six or seven different shades of brown. That realization was followed by the enlargement and modification of the collection of constant color terms that were used in the classification procedure. Several new tints of earth colors (e.g.: dark timber, light brown) were added to the collection and others (e.g.: pink, purple, yellow) were eliminated.
The introduction of the new characterization set could not correct all the above-mentioned limitations of the algorithm. By tailoring the constant color set to the goal, it definitely increased the chance of a correct match; however, it could not identify color regions of the same surface consistently on different input images. Most frequently, it was the differences caused by varying illumination settings that prevented the system from correctly identifying the input color. Hence, a method was needed that would be independent of (or at least less dependent on) the lighting conditions. In Chapter 2 (Section 2.4), Land’s experiments with surface reflectance measures were briefly introduced. He carefully described the phenomenon of human color constancy and then implemented an algorithm intended to produce reflectance values independent of uneven illumination. As our project was in need of an indicator that behaved in such a way, surface reflectance values were evaluated. Eventually these color measures were the ones selected to work with the scene identification/localization algorithm.
3.2 Surface Reflectance Measures
The color of an object is often referred to as surface color and the nature of it is determined by surface reflectance properties. Humans are able to make good judgements about these relative surface reflectance measures despite varying illuminating wavelengths. The automation of this process, however, is not trivial. It is, in fact, impossible to obtain reflectance information from image brightness measurements without forming assumptions. Brightness (the intensity value registered by the sensory organs) is proportionally dependent both on surface reflectance and illumination. If both of these variables are unknown, their value cannot be computed from one equation. One of the terms has to be estimated first in order to obtain a value for the other. After obtaining the reflectance values, color information can be acquired.
Reflectance values indicate the percentage of all incident light that is reflected (not absorbed or refracted) from the object surface in one of the three color-bands. It is determined by the microstructure of the surface [Horn, 1970]. This value is presumed to be a unique feature of surfaces and hence it might be used efficiently in characterizing objects.
3.2.1 Computing Surface Reflectance Measures
The theory of human color constancy states that humans are able to identify surface color values even under varying illumination conditions. Edwin Land, when demonstrating his results in the "Retinex Theory", suggested an algorithm to calculate surface reflectance measures that are independent of varying radiance. In order to justify these calculations, he formulated an assumption stating that illumination is a smoothly varying entity. In other words, in the immediate proximity of a region, the magnitude of the incident light does not significantly change.
Edwin Land, in [Land, *], emphasized the significance of edges in defining areas or objects in a scene. He first examined the ratio of the readings of two luminance detectors focused on opposite ends of a single-colored board. If the illumination of the board was not uniform, the readings on the light detectors were naturally different. As the two detectors were moved closer together, their ratio approached unity. If, however, another region of a different color was located in the middle of the original board, the ratio of the luminance detectors eventually approached the ratio of the region reflectances. Land realized that taking the ratio between two adjacent points could both detect edges along surface regions and eliminate the effect of non-uniform illumination. If the whole input image is processed in terms of ratios of luminances at closely adjacent points, dimensionless numbers are generated that are independent of illumination. In order to obtain surface reflectance measures, the light ratios have to be related to each other. That is why the ratios have to be multiplied together along the path of calculation. The outcome of this process is a sequential product that can be used as a substitute for the placement of any two areas adjacent to each other. Forming the sequential product can take place anywhere on the image. It is also important to note that if the sequential product exceeds unity during the computations, the sequence has to start afresh from the new region, the one that caused the exception. That step ensures that the highest reflectance value on the input is found once the whole input has been traversed. Figure 8 represents a color board segment with rectangular regions of different reflectance. If one begins with the top left region of the setup and then follows the indicated path, the sequential product is formed as follows: (16/9)*(9/5)*(5/11)*(11/7)*(7/4)*(4/8).
According to Land, the sequential multiplication of edge ratios can generate values equivalent to relative reflectance for all areas along the path [Land, 1971].
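The worked example can be checked with a short sketch. `sequential_product` multiplies the ratios exactly as written in the example (previous value over next value), so the intermediate terms cancel. `relative_reflectances` is a hypothetical helper showing one way to apply the restart rule; it assumes next/previous ratios and a provisional reflectance of 1 for the starting region, which is an interpretation rather than Land’s exact formulation.

```python
from fractions import Fraction

def sequential_product(values):
    """Multiply the ratios of adjacent region values along a path (previous / next)."""
    product = Fraction(1)
    for prev, nxt in zip(values, values[1:]):
        product *= Fraction(prev, nxt)
    return product

def relative_reflectances(values):
    """Give each region on the path a reflectance relative to the brightest region so far.

    Assumptions of this sketch: ratios are taken as next / previous, the first
    region provisionally gets 1, and the sequence restarts at 1 whenever the
    running product would exceed unity.
    """
    result = [Fraction(1)]
    for prev, nxt in zip(values, values[1:]):
        candidate = result[-1] * Fraction(nxt, prev)
        result.append(Fraction(1) if candidate > 1 else candidate)
    return result

path = [16, 9, 5, 11, 7, 4, 8]         # region values from the worked example
total = sequential_product(path)        # telescopes to 16/8
along_path = relative_reflectances(path)
restarted = relative_reflectances([9, 16, 8])  # 16/9 exceeds unity, so restart at 16
```

The product equals 16/8 = 2, the ratio of the first region to the last, regardless of the regions crossed in between. In the second call the restart makes the brighter region the new reference, so the last region is reported as 8/16 of it.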

3.2.2 Related Research Results
There seem to be successful applications of the Retinex Theory algorithm introduced above. In [Marni, 1993], for example, the authors gave an account of the algorithm’s role in improving image equalization, color correction and constancy against varying light characteristics and white balancing. Some, however, criticize Land’s approach. The authors of [Brainard, 1986] believe that Land’s algorithm is too sensitive to changes in the color of nearby objects. They state that the main difficulty with the Retinex Theory is that it normalizes to different reference surfaces for different scenes. They conclude that the Retinex algorithm corrects for the light in a manner that strongly depends on the composition of the nearby surfaces in the image and that, therefore, it could not be an adequate model of human color constancy. These psychologists are right that the separation of the examined product (Incident Illumination × Reflectance), without any additional information, is mathematically impossible. There is one equation given with two unknown variables. However, by introducing certain assumptions, it is believed to be feasible to obtain close estimates of both of these values.
3.2.3 Our Implementation
After comparing the results described in the literature, it seemed a promising approach to assign reflectance measures to regions according to Land’s method. In fact, one of the main motivations behind our study was to confirm the validity of the outcomes of an algorithm based on Land’s calculations and to determine how successful it can be if applied in a more natural setting. In an office environment, the spatial and the spectral distribution of illumination are generally not standardized and the analyzed objects are in a three-dimensional instead of a two-dimensional setting. The extra dimension in the experiments often leads to distorting shadows and reflections appearing on the input images. For example, Figure 9 clearly shows the shadows and the inter-reflection effects (blue reflected light arriving at the surface of the wall from the plastic can) that make the automated recognition of the otherwise uniform wall color extremely complex. These phenomena were not accounted for in Land’s algorithm. That complexity allowed us to test the magnitude of the "simultaneous contrast" effect, which signifies the changes in color appearance caused by variation in the surface reflectance functions of the surrounding objects. More details about the consequences of this distorting phenomenon can be found in Chapter 4.

Figure 9: The calibrated version of the "cans.clr" image
A color constancy algorithm, inspired by the original version of Land’s calculations, had already been implemented in VLSys, independently of our project. That version unites the region analysis algorithm described in [Brice, 1970] with finding the reflectance values. Grouping together pixels of the exact same intensity value, the system first forms a set of elementary regions. The reflectance measures are calculated next. The algorithm traverses the whole input image while taking intensity ratios of the neighboring regions encountered on the way. This algorithm does not start its operation all over again, in case the sequential product exceeds unity. Instead, it normalizes all the reflectance values to the highest output of the sequential assignments. Then, some of the elementary regions are further merged together by the Weakness algorithm. This process joins regions together based upon the strength of their common boundary (I) that separates them [Brice, 1970]. The strength of a boundary is defined as the difference between the properties of the picture elements (pixels) on the right and on the left of it. In this case, this is the absolute value intensity difference on the two sides of the boundary. The length of the weak (W) segment is defined as the number of boundary vectors that have strength value below a threshold. Two regions are joined if W/I > T, i.e. the ratio of the weak part of the common boundary over the total length of common boundary is greater than an experimentally determined threshold (T).
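The merge test of the Weakness algorithm can be sketched as below. This is a simplified illustration, not the VLSys code: the boundary is assumed to be given as a list with one strength value per boundary vector, and the names `should_merge`, `strength_threshold` and `merge_threshold` are hypothetical.

```python
def should_merge(boundary_strengths, strength_threshold, merge_threshold):
    """Decide whether two regions are joined under the W/I > T rule.

    boundary_strengths: absolute intensity difference across each boundary vector.
    strength_threshold: vectors weaker than this count toward W.
    merge_threshold: the experimentally determined T.
    """
    total_length = len(boundary_strengths)  # I, total length of the common boundary
    if total_length == 0:
        return False  # no common boundary, nothing to merge
    weak_length = sum(1 for s in boundary_strengths if s < strength_threshold)  # W
    return weak_length / total_length > merge_threshold

mostly_weak = should_merge([1, 2, 0, 30, 1], strength_threshold=5, merge_threshold=0.6)
mostly_strong = should_merge([40, 35, 2, 50], strength_threshold=5, merge_threshold=0.6)
```

In the first call four of the five boundary vectors are weak (W/I = 0.8), so the regions are joined. In the second call only one of four is weak (W/I = 0.25), so they stay separate.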
The reader should note that the individual image regions are not necessarily colored correctly on the screen after the image region analysis. This is because changes in the magnitude of illumination are not considered. For all regions, a constant lighting magnitude value, 255, is applied for representation purposes. As long as the numerical calculations are consistent, this display property is of minor importance.
In order to make the reflectance information illumination-independent, Land’s algorithm takes sequential products of color intensity values in neighboring regions. The original implementation of the algorithm in VLSys started this process by selecting an arbitrary region from the input image. Then, by continuing the calculations with the neighbors of the neighbors of the starting region, the whole image was eventually scanned. The arbitrary selection of the beginning region could not ensure that it was the one with the highest reflectance value on the whole image (which is a requirement by Land). Hence all the ratios were normalized at the end of the computations, to the highest numerical value located on the image.
This implementation closely followed Land’s theory, which stated that the reflectance calculations could be started at an arbitrary location on the input. Truncation errors, however, modify this assumption for practical implementation and introduce serious distortions to the examined image. When the normalizing variable (the largest value assigned by the "FindReflectances ()" method) was high at the end of the analysis (the value was greater than 15), the input image was significantly dimmed and when it was below 5, the transformation behaved as expected. For that reason, the algorithm had to be modified in a way that the reflectance computations produced a relatively small normalizing variable. One way to achieve that goal was to begin the analysis with the region that possessed the highest reflectance value instead of an arbitrary one. To provide a quick approximation for that measure, a function was written to select the region with the highest amount of total intensity value (IR+IG+IB). This method did not necessarily ensure the retrieval of the highest reflectance value from the input image, but it provided a close estimate. This was justified by the outcomes of the new reflectance analysis. After the adjustment was completed, all the processed images retained their original colors and the Weakness algorithm correctly formed the growing regions. Look at Figures 10, 9 and 11 as a sequence.
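The two adjustments discussed here, choosing the starting region by total intensity and normalizing to the largest computed value, can be sketched as follows. The dictionary layout and the names `brightest_region` and `normalize` are assumptions of this sketch; the actual data structures of VLSys are not described in the text.

```python
def brightest_region(regions):
    """Return the id of the region with the largest total intensity IR + IG + IB."""
    return max(regions, key=lambda region_id: sum(regions[region_id]))

def normalize(reflectances):
    """Scale all values so that the largest one becomes 1."""
    peak = max(reflectances.values())
    return {region_id: value / peak for region_id, value in reflectances.items()}

# Hypothetical mean (R, G, B) intensities of three regions.
regions = {
    "wall": (200, 190, 180),
    "door": (120, 80, 60),
    "paper": (240, 240, 235),
}
start = brightest_region(regions)
scaled = normalize({"a": 4.0, "b": 2.0, "c": 1.0})
```

Starting from the brightest region keeps the running products at or below unity in most cases, so the final normalizing value stays small, which is the behavior the modification was meant to achieve.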


3.2.4 Reflectance Measures of Environment Surfaces
The aim of the experiments that followed was to find stable reflectance measures that were characteristic of various surfaces in the office environment. These values could then be stored in the model and utilized in the region identification procedure. Obtaining them with the newly created reflectance-finding operator turned out to be an impossible task, though. The values calculated by the implementation of ideas from the Retinex Theory varied greatly from one image to another. For instance, in the case of two images that were taken of the same wall segment across Room 401a, the reflectance values (W1, W2) for the wall regions were the following: W1 = (0.612; 0.634; 0.588) and W2 = (0.646; 0.707; 0.643). Such a big variance in the descriptors (especially in the case of 0.634 vs. 0.707) could not allow for effective classification.
As the critics pointed out, the Land reflectance indicators were all dependent on the actual scene in which the image was taken. The normalization procedure always assumes the presence of a 100% reflectance value on the input and compares the rest of the regions to that. In practical experiments, though, pure white surfaces are not always included. Thus the base of comparison is different for each individual image. How, then, could the "true" reflectance values describing the environment model be calculated?
In order to carry out the localization procedure, the analyzed input reflectances have to be compared to a standard set of model descriptors. The following sections introduce a method that allowed for measuring the "real" reflectance triplets of the environment surfaces that were later stored in the locale model.
3.3 Single Reference Value
An alternate procedure that could produce comparable reflectance attributes applied a well-defined reference object, a gray card [Figure 12, left].

A standard gray card is calibrated to reflect exactly 18% of the incident light in all three of the color-bands. With that information and the image intensities obtained by the color camera, the magnitude of illumination can be computed. This technique is widely used in photography-related industries.
To calculate the reflectance values of surfaces with this tool, a new set of images had to be taken. In all of these, the gray card was included in the immediate neighborhood of the region of interest.
That was necessary in order to ensure that approximately the same amount of light fell on the regions that were to be compared. That is to say, adjacent regions of an image were exposed to the same lighting conditions, which changed more significantly as the distance between the two locations grew. Then, surface reflectance measurements could be obtained relative to the calibrated value. To carry out the computations, some of the basic principles of geometrical optics were applied.
In physics, the relationship between the color intensity, irradiance (I) and reflectance (Ref) measures of a region can be expressed, in each of the three color-bands (R, G and B), as follows:
$\text{Intensity}_c = I_c \cdot \text{Ref}_c$, for $c \in \{R, G, B\}$
In a picture that does not include the gray card these equations have two unknown variables: the reflectance and the irradiance. Knowing only the intensity measure is not enough to separate these unknowns. However, in the case of the calibrated gray card, the reflectance value of the surface was known. Thus, the amount of incident light on the card’s surface could be expressed in the following way:
$I_c = \text{Intensity}_{c,\text{gray}} / 0.18$, for $c \in \{R, G, B\}$
As it was assumed that the intensity of illumination was smoothly varying in case of closely located surfaces, the illumination values computed for the gray card could then be used to obtain the reflectance measures for all adjacent regions in its proximity.
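The gray card computation can be sketched as below. It assumes that, in each color-band, intensity is the product of irradiance and reflectance, in line with the proportional dependence described in Section 3.2. The function names and the sample intensities are hypothetical.

```python
GRAY_CARD_REFLECTANCE = 0.18  # calibrated reflectance of the standard gray card

def irradiance_from_gray_card(gray_intensity):
    """Per-band irradiance estimate: card intensity divided by its known reflectance."""
    return tuple(i / GRAY_CARD_REFLECTANCE for i in gray_intensity)

def reflectance(region_intensity, gray_intensity):
    """Per-band reflectance of a region assumed to be lit like the nearby gray card."""
    irradiance = irradiance_from_gray_card(gray_intensity)
    return tuple(i / e for i, e in zip(region_intensity, irradiance))

# Hypothetical measurements, not values from the experiments.
gray = (45.9, 45.9, 45.9)
region = (127.5, 102.0, 51.0)
estimated = reflectance(region, gray)
```

With these sample numbers the estimated irradiance is about 255 in every band and the region’s reflectance comes out near (0.5, 0.4, 0.2). The estimate is only as good as the assumption that the card and the region receive the same light.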
Several reflectance triplets were calculated by this method and then tested on another set of images that also contained the gray card. In the second, testing case, however, the arrangement of the objects in the neighborhood of the examined surfaces was modified. Such a modification could involve change of object location in the environment, change in the composition of the neighboring surfaces and change in the composition or strength of illumination. An example pair of the results is the following: (0.246; 0.193; 0.139) and (0.285; 0.203; 0.16). Both of these values were calculated for a timber segment of the entrance of Room 418. The reflectance measures calculated for identical surfaces on different input images were not as close to each other as it was expected. In order for surface reflectance values to be stable distinguishing factors, such an inconsistency could not be allowed.
During the experiments there were two other significant phenomena observed. First, the input images acquired from the color camera were all grayish and dimmed. Secondly, the color intensity values measured for the standardized gray region significantly varied in the three color-bands (e.g.: 118.66; 149.62; 180.94). The collection of these observations suggested that some type of a distortion must be present in the input images. Either the assumption about the smoothly varying lighting conditions was too strong or another type of distortion was present in the images. The fact that the gray card intensity values varied so widely made us assume that the nature of the aberrations might not be linear. That would be able to explain the inconsistencies in case of the reflectance values as well. A possible source of this kind of a distortion could have been the color camera, the input device itself. The lighting conditions in the laboratory setting could also be responsible for some of the distortions. The neon lights located in the hallway and the office areas of the laboratory building are not standardized. Instead of emitting white light, they often introduce intensity variations in any of the three color-bands. This research project first addressed the problem related to the camera distortions. Then, as it is described in Chapter 4, the illumination variations were accounted for when computing the color identification transforms.
3.4 Camera Distortions
To investigate the possibility of some complex input device distortions, a thorough analysis of the input images was completed. This gave an explanation for the aberrations. The examined data showed that in most of the images, the R, G, B color histograms were shifted towards and clustered in the neighborhood of the maximum 255 value. This phenomenon, illustrated on Figure 13, is called "clipping" in photography. The lack of intensity values in the lower regions of the intensity interval caused the omission of details about the surfaces and introduced imprecision in the calculations. This type of distortion was mostly prevalent in case of the blue and moderately in the green color-bands.

The first solution to the problem seemed to be the adjustment of the frame grabber settings. For individual images, this could succeed. It was mostly the U saturation and contrast settings that had to be reset. The former adjusted the video chrominance signal properties and the latter attempted to lower the contrast value of the image. If the contrast setting is too high for an image, the step between "successive" colors in a progression is too large. This situation is illustrated on Figure 14. Instead of regular-sized increments and several different colors between the beginning and the end point of the scale, the transition is more drastic and carried out in fewer steps. That amounts to the loss of the skipped colors. To control the amount of light reaching the camera lenses, the f-stop value of the color camera could also be manipulated. That reduced the distortions caused by excess or lack of illumination. Although these manipulations managed to correct the image distortions, all of the configurations had to be modified over and over again in order to acquire an image that would hold intensity values in the whole range of color intervals. The amount of necessary modification was different for the majority of the images. Even for images that were taken in the same room, the adequate settings did not necessarily agree. Thus, the goal of finding a single setting that would be sufficient for the majority of the images could not be fulfilled.

The only other solution that seemed to be feasible, after the unsuccessful attempts with the camera settings, was the calibration of the camera and the reprocessing of the input image before the reflectance analysis took place.
In photography, there are standard methods for calibration. One, in case of general images, applies the color of an object or a surface (e.g.: human skin color, white wall) as a reference value. The average color of the particular/selected surface is calculated and all photographs are adjusted to that value. The other method, a more sophisticated one, relies on density cards. There was not sufficient time to experiment with finding the general intensity value of a great variety of office environment images; thus the second method was applied.
3.5 The Linearization Process
As the original gray card with its single reflectance reference would not have been sophisticated enough in case of non-linear distortions, a tool providing higher accuracy was searched for. This "tool" turned out to be a higher quality gray card that had two additional calibrated regions besides the gray one. The white section reflects 90%, the gray 18% and the black 3% of all the incoming light [Figure 12, right].
To examine the true nature of the aberrations of the images, a two-dimensional coordinate system was created. The x-axis of the coordinate system stood for the reflectivity and the y-axis signaled the intensity values. The RGB intensity measures corresponding to the three calibrated regions of the gray card were plotted in all three color-bands. These points were B = (3, By), G = (18, Gy) and W = (90, Wy). By connecting these points, it was assumed that the BG and GW line-segments would contain the rest of the image pixel values from the input. That is, even with the distortion, the different intensity values would linearly fit one of the two segments in each color band. This step was necessary in order to characterize the general nature of the aberrations and to be able to correct all the possible image values. The fact that only three reference values were used to predict the characteristics of the whole input image might seriously weaken the assumption. However, that first approximation turned out to be fairly acceptable for the indicated purposes.
Chart 1 clearly shows the non-linear nature of the distortions. The reconstructed color lines are not merely shifted from the ideal one. They are also broken. (On the chart, the ideal color line is the only one that goes through the origin. It is yellow and connects (0, 0) and (100, 255).) That phenomenon is most visible in the blue and the green lines. However, careful analysis of the red intensity values showed that even the two line-segments in the red band, BG and GW, had different slopes. That was the reason why the operations developed with the single-step (18%) gray card could not account for the distortions and why the gray card intensities varied so greatly. A multiplication that adjusts the intensity values of regions with respect to the 18% gray region simply rescales both line segments by a common factor, moving them up or down in the y-direction, and leaves the break between them in place. This operation does not address the non-linearity.
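The limitation of the single-multiplier correction can be illustrated numerically. The following minimal Python sketch uses hypothetical intensity readings for the three card regions (the values 20, 60 and 230 are invented for illustration and are not measurements from this study) and scales them so that the 18% gray region lands on the ideal line. All names are illustrative.

```python
# Ideal color line: intensity = 2.55 * reflectance, through (0, 0) and (100, 255).
IDEAL_SLOPE = 255 / 100

# Reflectance (%) of the black, gray and white card regions.
reflectances = [3, 18, 90]

# HYPOTHETICAL distorted intensities for one color band (not measured data).
measured = [20.0, 60.0, 230.0]

ideal = [IDEAL_SLOPE * r for r in reflectances]

# Single-reference correction: one multiplier chosen so that gray lands on the ideal line.
scale = ideal[1] / measured[1]
scaled = [scale * y for y in measured]

# Residual error of each region after the single-multiplier correction.
residuals = [s - i for s, i in zip(scaled, ideal)]

for name, r in zip(["black", "gray", "white"], residuals):
    print(f"{name}: residual {r:+.2f}")
```

With these illustrative numbers the gray residual vanishes, while the black and white regions remain off the ideal line, which is the behavior described above.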

In order to eliminate the aberration, the new task was to perform a transformation that would take the original lines (from the distorted image) and fit them on the ideal one. The mathematical approach that could straighten and shift the color lines into their "proper" position is described below. The reader should note that an important assumption was made throughout the argument: namely, that the images use the whole [0, 255] range for specifying intensity values. In other words, the ideal illumination is 255 in all cases.
3.5.1 Linearization Calculations
In order to correct for the distortions and to obtain consistent measures in the images, each line-segment had to be transformed into the ideal color line. The equation of a line in the two-dimensional plane is:
$$y - y_0 = m\,(x - x_0),$$
where "m" represents the slope and $(x_0, y_0)$ is a specific point on the line. The ideal line for the color graph has the equation $y = 2.55\,x$. It connects the coordinate point (0, 0) with (100, 255). To achieve the required alignment of the distorted image lines, the original color segments had to be both shifted and rotated. These two transforms are depicted in Figure 15. As each color line had two different segments, the algorithm described below was applied six times altogether.

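First, constants were defined for the reflectance values of the black, gray and white rectangles of the gray card (BX, GX, and WX) and the slope of the ideal color line (M). As the reflectance properties of the black, gray and white regions are 3%, 18% and 90% respectively, the constants took the following values.
$$BX = 3, \qquad GX = 18, \qquad WX = 90, \qquad M = \frac{255}{100} = 2.55$$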
Then the equations for the distorted lines have to be specified in order to correct them. Chart 1 depicts the values for $B = (BX, B_y)$, $G = (GX, G_y)$ and $W = (WX, W_y)$, the reflectance values coupled with their current distorted intensity values. (The chart labels refer to the values in the blue color-band.) The slope of the line segments could be calculated by substituting into the formula
$$m_{BG} = \frac{G_y - B_y}{GX - BX}, \qquad m_{GW} = \frac{W_y - G_y}{WX - GX}. \tag{3.1}$$
The next step was to compute where the extensions of these line segments would intercept the y-axis. As in case of the ideal line this y-intercept is zero, any value different from that had to be deducted in the future transformation. (This operation was equivalent to a shifting process along the y-axis in the opposite direction of the current value.) Point G lies on both of the line segments for a given color band, so the coordinates of this point were utilized to define the equations. With "m" denoting the slope of the segment in question and "b" its y-intercept:
$$y - G_y = m\,(x - GX), \qquad b = G_y - m \cdot GX. \tag{3.2}$$
The transformation has to shift the color line by $-b = m \cdot GX - G_y$.
The other essential operation of the transformation, the rotation, adjusted the slope of the line to that of the ideal one (2.55). This rotation multiplier (K) was calculated by
$$K = \frac{M}{m}. \tag{3.3}$$
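As an illustration of the slope, y-intercept and rotation multiplier described above, the sketch below computes them for both segments of one color band. The intensity readings are hypothetical placeholders, and the function and variable names are assumptions made for this example; they are not taken from VLSys.

```python
# Constants from the text: card reflectances (%) and the ideal slope.
BX, GX, WX = 3.0, 18.0, 90.0
M = 2.55

def segment_parameters(x_end, y_end, gy):
    """Return (slope m, y-intercept b, rotation multiplier K) for the segment
    joining (x_end, y_end) to the gray point (GX, gy)."""
    m = (gy - y_end) / (GX - x_end)
    b = gy - m * GX        # where the extended segment meets the y-axis
    k = M / m              # scales the slope to the ideal 2.55
    return m, b, k

# HYPOTHETICAL readings for one color band (illustrative only).
by, gy, wy = 20.0, 60.0, 230.0

bg = segment_parameters(BX, by, gy)   # black-gray segment
gw = segment_parameters(WX, wy, gy)   # gray-white segment
print("BG:", bg)
print("GW:", gw)
```

Repeating the call for each of the three color bands yields the six parameter sets that the text refers to.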
The above-described (3.1), (3.2) and (3.3) formulas were sufficient to compute the new grayscale values for the image pixels. With $y$ denoting the original intensity of a pixel, $y'$ its corrected intensity and $b = G_y - m \cdot GX$ the y-intercept of the relevant segment:
$$y' = K\,(y - b).$$
As $y = m\,x + b$, the above formula could be simplified and written as
$$y' = K \cdot m \cdot x.$$
This transformation had to be applied to each pixel of the input image. The two variables "K" and "m" could be computed via the formulas; however, the reflectance value "x" was unknown. By using the line equation for the original, distorted image, this value could be extracted by:
$$x = \frac{y - G_y}{m} + GX. \tag{3.4}$$
When applying the (3.4) formula, the slope value had to be carefully selected. As in each color band there were two line-segments describing the original color line, the appropriate slope value could be obtained by comparing the original intensity value of the examined pixel to $G_y$. If $G_y$ was greater than the intensity, the point was expected to lie along the first (BG) line-segment, and the slope of that segment was used. Otherwise, when the original intensity was greater than $G_y$, the slope of the second line-segment was applied. (At an intensity of exactly $G_y$ both segments give the same result.) [See Chart 1]
Consequently, the transformation formula to find the calibrated color values could be expressed by:
$$y' = K \cdot m \left( \frac{y - G_y}{m} + GX \right). \tag{3.5}$$
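The complete per-pixel correction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the calibration readings and the handling of the boundary case are hypothetical, and the original implementation may have differed in detail.

```python
BX, GX, WX = 3.0, 18.0, 90.0   # card reflectances in percent
M = 2.55                       # slope of the ideal color line

def linearize(y, by, gy, wy):
    """Correct one intensity value y in a single color band.

    by, gy, wy are the distorted intensities of the black, gray and
    white card regions in that band."""
    # Pick the segment by comparing the pixel intensity with the gray reading.
    if y < gy:
        m = (gy - by) / (GX - BX)    # BG segment
    else:
        m = (wy - gy) / (WX - GX)    # GW segment (also used when y == gy)
    k = M / m                        # rotation multiplier
    x = (y - gy) / m + GX            # estimated reflectance in percent
    return k * m * x                 # corrected intensity

# HYPOTHETICAL calibration readings for one band (illustrative only).
BY, GY, WY = 20.0, 60.0, 230.0
corrected = [linearize(v, BY, GY, WY) for v in (20.0, 60.0, 230.0, 120.0)]
print(corrected)
```

By construction, the three card readings map onto the ideal line at reflectances 3, 18 and 90. Results are not clipped to [0, 255] in this sketch; whether and how to clamp them is left open.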
Two further assumptions were implicit in the line transform. First, it was presupposed that for reflectance values greater than 90% or lower than 3%, the intensity values resided on the extension of one of the line-segments, and the slope of the line-segment closer to the examined point was used. In this way the transformation could be applied to extreme values as well. As image reflectance values do not, in general, exceed 90% or fall below 3%, this assumption should not be considered very limiting. Secondly, the transform also relied heavily on the monotonically increasing nature of the line segments that were to characterize the whole input image. If the real intensity values did not satisfy this criterion (for example, if the image followed a curve such as the one in Chart 2), the transformation outcomes would not be adequate.
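The second assumption can be checked before the transform is applied. The helper below is a hypothetical sketch (its name and the sample readings are assumptions made for illustration): it verifies that the reference intensities increase strictly with reflectance, which also guarantees a positive slope for every segment.

```python
def is_strictly_increasing(points):
    """points: iterable of (reflectance, intensity) pairs.
    Returns True if intensity rises strictly with reflectance."""
    ordered = sorted(points)  # sort by reflectance
    return all(y0 < y1 for (_, y0), (_, y1) in zip(ordered, ordered[1:]))

# HYPOTHETICAL readings: the first set is usable, the second is not.
good = [(3, 20.0), (18, 60.0), (90, 230.0)]
bad = [(3, 20.0), (18, 60.0), (90, 55.0)]
print(is_strictly_increasing(good), is_strictly_increasing(bad))
```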

The effectiveness of the algorithm using formula (3.5) was first judged by examining the resulting pictures. As the transformed images (for example, Figure 9) did become sharper, and the dim and grayish colors of the input [Figure 10] did become more vivid, it was expected that the operation would be a useful tool.
Then, naturally, the corresponding intensity values were checked analytically for more precise information. When the reflectance values around the gray card were calculated in several different scenes, the results seemed promising. The intensity values corresponding to the calibrated gray card surfaces were approximately the same in all three color-bands (e.g., 46.521; 47.624; 47.942). The differences in illumination settings were clearly visible, though, in the reflectance values of neighboring surfaces. For instance, the following values were measured for two wall sections next to Room 401a on two different input images: Wall1 (0.84; 0.819; 0.715) and Wall2 (0.76; 0.715; 0.66). The spreading of the color histograms does not account for varying lighting conditions; as already mentioned, that is the task of the color identification transforms. As some of the reflectance differences might have been the result of weak assumptions about the linearity along each examined line-segment, an advanced procedure was applied next.
3.5.2 Advanced Linearization Algorithm
In an attempt to improve the above-described linearization transform, the use of a multi-step gray card, which displayed more steps leading from black to white, was considered. It was reasonable to believe that more reference values could provide a higher level of accuracy in estimating the nature of the distortions and in completing the image transforms. Consequently, a Large Gray Scale Target (8.5" x 11" mylar variable density target) was acquired [Figure 16].


Figure 16. The 15-step Kodak density card.
The new density-step target has two identical gray-scale progressions covering the density range [0.07, 1.5]. The densities run from the highest to the lowest value on the upper scale and from the lowest to the highest on the lower scale. The variation between the density steps on the new card is linear in both cases. This implies that the steps are evenly spaced on a logarithmic scale of reflectivity.
Since reflectance evaluators had been used for the earlier calculations and comparisons, the given density values first had to be converted to the appropriate indicators. In photography, the term "density" refers to the degree of concentration and the magnitude of the silver deposit in the emulsion [Basic Photography, 1941]. For precise quantitative measurements, it is expressed in terms of the light incident upon a negative and the light transmitted through the negative (or reflected in case of photographs on paper).
Opacity is also a photographic term. It expresses how much light is absorbed and is calculated as the ratio of the incident to the transmitted light. Transmission (T), the percentage of the incident light that is transmitted by a given density, is the inverse of opacity: it is the ratio of transmitted light to incident light.
Density (D) can be mathematically expressed as the base 10 logarithm of opacity (O).
$$D = \log_{10} O = \log_{10} \frac{1}{T}$$
Then converting the density value to transmission/reflectance values from this equation is simple.
$$T = 10^{-D} \tag{3.6}$$
As the density range of the progressions is [0.07, 1.5] and there are 15 steps (and thus 14 intervals) in each, the linear step value between the individual rectangle elements is (1.5 - 0.07) / 14 ≈ 0.102. Based upon all this information and formula (3.6), a conversion table [Figure 17] was created. It displays the appropriate relationship between the given density values of the target and the required reflectance measurements.
Figure 17. Conversion table between the density values of the target and the corresponding reflectance values.
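The conversion can also be reproduced programmatically. The sketch below assumes, as stated above, fifteen density steps evenly spaced over [0.07, 1.5], and it uses the relation T = 10^(-D), which follows from density being the base 10 logarithm of opacity. The variable names are illustrative, and the printed values are computed from these assumptions rather than copied from the original table.

```python
D_MIN, D_MAX, STEPS = 0.07, 1.5, 15

# Linear density increment between neighbouring patches (14 intervals).
step = (D_MAX - D_MIN) / (STEPS - 1)

# Density of each patch and its reflectance in percent: R = 100 * 10**(-D).
table = []
for i in range(STEPS):
    density = D_MIN + i * step
    reflectance = 100.0 * 10 ** (-density)
    table.append((i + 1, density, reflectance))

for index, density, reflectance in table:
    print(f"{index:2d}  D = {density:.3f}  R = {reflectance:6.2f}%")
```

Under these assumptions the lightest and darkest patches fall close to the white and black regions of the three-step card, which makes the two targets easy to compare.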
The new calibration method relied on fifteen calibrated samples instead of only three. Before its application to the color images, the values from the distorted images were plotted in the same two-dimensional reflectance-intensity coordinate system that was created earlier. This provided a good opportunity to see how accurately the three reference regions (Black, Gray and White) estimated the amount of aberration on the input images. [See Charts 3 and 4]
Chart 3. The distorted color lines measured by the 15-density-step target; upper progression

The distorted color lines proved to follow logarithmic curves most closely. Surprisingly, though, it was mainly along the former BG line-segments that the difference from the previous (piecewise-linear) measures was larger. A less accurate approximation had been expected for the GW line-segments, as these sections were considerably longer. However, they preserved an almost linear behavior. (This is exactly what was assumed by all the previous computations.) A possible explanation of this phenomenon could be that, on average, values in the lower portion of the [0, 255] intensity interval are used more extensively. Consequently, assigning approximate values to these intensities resulted in larger errors. (It is also true that the number of reference values was approximately the same for the two original line segments, although the GW segment was much longer. That difference might not allow for a direct comparison of the level of aberration along the two segments.)
The newly processed images were tested. Although the modified inputs were clear and preserved the original image features, they were all darkened. [Compare Figures 9 and 18] The most probable reasons were the very precise calculations on the BG segment of the color lines and the spreading out of the histograms. Apart from these differences, the calculated values were just as consistent as with the three-step card.

For further calculations, the three-step card (3%, 18%, 90%) was selected to assist in the calibration task. This card had some practical advantages over the more sophisticated version. The regions on its surface were considerably easier to measure, and the transformed images looked brighter and more natural.
3.5.3 Application
The calibration method described above was a useful tool that could be applied before the reflectance analysis of the input color image. The one serious limitation that it imposed on the identification procedure was that the gray card had to be present in all the examined pictures. This was a very strong requirement, which reduced the practical value of the application. Gray cards are not found throughout an office environment. Even if they were, a human would have to point out their location on the input images during the image analysis, or the card would have to appear at a pre-specified part of the input in all cases. That dependency was too limiting, and it motivated the generalization of the linearization algorithm. It was decided that the camera would be calibrated once, with respect to one specific image, and that this calibration would be used for all input images. The intensity information required for the correction process could therefore be stored in global variables. The selected image is displayed in Figure 19. The gray card is present with a white wall segment in the background. The illumination conditions are "normal": the incident light falling on the examined surface is neither too bright nor too dim. From then on, whenever the place recognition algorithm started, the linearization procedure was automatically applied according to these global values. In this way, assuming that the nature of camera distortions could be reliably approximated, the introduced errors were corrected for.
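Storing the calibration once and reusing it for every input can be sketched as follows. The per-band readings in CALIBRATION are hypothetical placeholders standing in for the values measured on the reference image, and the function names are assumptions made for this illustration rather than names from VLSys.

```python
BX, GX, WX, M = 3.0, 18.0, 90.0, 2.55

# HYPOTHETICAL global calibration: (black, gray, white) intensities per band,
# as they would be measured once on the reference image.
CALIBRATION = {
    "R": (15.0, 55.0, 235.0),
    "G": (18.0, 58.0, 232.0),
    "B": (20.0, 60.0, 230.0),
}

def correct_band(y, band):
    """Linearize one intensity using the stored readings for that band."""
    by, gy, wy = CALIBRATION[band]
    m = (gy - by) / (GX - BX) if y < gy else (wy - gy) / (WX - GX)
    return M * ((y - gy) / m + GX)   # K * m simplifies to the ideal slope M

def correct_pixel(rgb):
    """Apply the stored calibration to an (R, G, B) triple."""
    return tuple(correct_band(v, band) for v, band in zip(rgb, "RGB"))

# The gray-card pixel of the reference image maps to the same value in all bands.
gray_pixel = correct_pixel((55.0, 58.0, 60.0))
white_pixel = correct_pixel((235.0, 232.0, 230.0))
print(gray_pixel, white_pixel)
```

Because the readings are fixed, any image taken under different lighting is corrected only for the camera distortion, not for the illumination difference.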

The generalization of the linearization process introduced the need for an auxiliary operation. As the corrections were computed for lighting conditions that did not necessarily agree with those of the current input image, a transform mechanism was required to compensate for such illumination differences. After the Land reflectance values were assigned to all the input image pixels and the region finding algorithm located the major surface measures that had to belong together, the color transforms sought an appropriate multiplier triplet that could eliminate the remaining distorting effects caused by illumination. This identification operation is described in more detail in Chapter 4.
With this solution implemented, the gray card was no longer required to be present in the examined images. If, however, one wanted to analyze an image with one of the more sophisticated gray cards, this option was still available in VLSys.
3.6 Collecting Surface Reflectance Measures
After the linearization procedure was introduced, it was possible to find the reflectance measures belonging to various surfaces in Susan B.’s environment. Gathering this information was necessary for the identification procedure, which allowed for finding the corresponding values of the input image regions in the model. All the regions that were analyzed could be located on input images that contained a gray card reference. Right after the correction transform and the elementary region analysis, measures were taken to characterize the intensity of the surfaces surrounding the gray card reference. Only the values of the immediate neighborhood were registered. In this way the assumption about smoothly varying illumination was highly likely to be met. The reflectance measures were collected before the Weakness algorithm acted further on the image, because that process modified the reflectance values of regions while merging them.
Figure 20. The constant reflectance values collected for the different surfaces of the environment model.
For the final calculations, the simple physics formulas already described in Section 3.3 ("Single reference value") were used. As a result, a collection of constant indicators [Figure 20] was stored in the environment model as the Lambertian reflectance evaluators of the different surfaces. In the color identification procedure (described in Chapter 4) these values are called "constant reflectances" and they guide the confidence evaluation process as well.