next up previous contents
Next: Applications Up: Extended concepts Previous: k-distance transformations

Distance transformations on gray-scale images

If one associates a gray-level image G(p) to the original binary image containing the object O, it is then possible to define a large variety of gray-level distance functions. Most of those are defined as modified geodesic distances, i.e. the distance between pixels p and q is the minimum of the lengths of the paths P:(p=p1,p2,...,pn=q) linking p and q where the path length l(P) is defined as

 \begin{displaymath}l(P) = \sum_{i=1}^{n-1}f(d_N(p_i,p_{i+1}),G(p_i),G(p_{i+1}))
\end{displaymath} (2.31)

where the choice of the function f allows to define many different distances. The purpose of this section is by no means to present an exhaustive review of gray-level distance functions, but merely to illustrate the diversity of their possible definitions.

Rutovitz [138], in 1968, introduced the idea of the gray-weighted distance transformation, in which the gray-level is associated with height and the gray-weighted distance is less along paths with lower gray-level values. In [139] Rutovitz again defines the fall-distance, where the only permitted paths are those with strictly decreasing gray-values. The set of points reached by such decreasing paths is known as the fall-set. Vossepoel, Smeulders and Van der Broek [174] propose a queueing scheme to compute fall-distances, somewhat similar to Piper and Granum's [122], who show that their own ordered propagation algorithm can be applied to gray-level distance functions.

Levi and Montanari [94] introduce the gray-weighted medial axis transform (GRAYMAT), where the length of a path is computed as the sum of all gray-levels along the path. Preteux [125,124] defines the topographical distance, where dN(pi,pi+1) corresponds to the slope between the two pixels. Finally, Toivanen [156] presents the distance transform on curved space (DTOCS) and the weighted distance transform on curved space (WDTOCS), where dN(pi,pi+1)=|G(pi)-G(pi+1)|+1 and $d_N(p_i,p_{i+1})=\sqrt{(G(p_i)-G(p_{i+1}))^2+1}$, respectively.

A common feature to all of the above definitions is that, in contrast with distances from binary images, the influence zone of an object pixel is usually not convex. Therefore, the above papers that implement the DT with raster scanning have to do so iteratively. On the other hand, since those distances are all defined as geodesic distances - i.e. as the minimum length of the paths joining the pixels - we know that the optimal implementation of all of the above distance functions is the uniform cost algorithm of Verwer, Verbeek and Dekker [167], described in section [*]. The only restrictions are that function f in ([*]) should be positive defined, and that it should be quantified, so that bucket indexes can be computed from the distance values.

To illustrate this, let us consider the well-known watershed transform introduced by Beucher and Lantuéjoul [5], and for which Vincent and Soille [170] proposed an efficient implementation by immersion. Intuitively, the watershed transform is defined as the division of a relief into its catchment basins. A catchment basin is associated with every local minimum of the relief. It consists of locations such that a water drop falling on that location would flow downstream towards that minimum. In image processing, the gray-scale level of a pixel, or a function of it, is considered as an altitude, so that the image constitutes the relief.

  
Figure 2.16: Minima, catchment basins, watersheds, dams and flooding level
\begin{figure}\centerline{ \epsfysize=3,5cm
\epsfbox{figures/chapter1/figure16.eps} }
\end{figure}

An alternative, more algorithmic, definition of watersheds relies on the "flooding" analogy. We consider that holes are pierced at every local minimum of the image, and that the relief is slowly immersed into water. Starting from the minima of lowest altitude, the water progressively floods the different catchment basins. Whenever waters coming from two different basins meet, a dam is constructed to keep those waters separated. After the whole image is flooded, dams separate the catchment basins completely. They are the watershed lines.

Vincent implements this by first sorting all pixels in the image by increasing gray-level. Essentially, he performs bucket-sorting in two steps. First he determines the frequency of each gray-level value, which allows him to allocate the needed memory for each list of pixels. Then, he scans the image a second time and inserts the pixels in the list corresponding to their gray-level value.

Next comes the flooding step, illustrated at Figure [*]. Supposing that the flooding has reached level h, the pixels of level h+1 have to be divided between the catchment basins of level h and the new basins corresponding to the local minima at level h+1. First, the pixels at level h+1 that are neighbors to pixels from a catchment basin at level h are put into a FIFO (first in first out) queue. Then, the catchment basins are extended by propagation under the constraint of considering only pixels at level h+1. When several catchments basins of level h are connected at level h+1, this procedure separates the resulting basins at level h+1 along the geodesic skeleton by influence zone (SKIZ). The pixels at level h+1 that are not reached by one of the catchment basins must be local minima, and become the seeds of new basins.

  
Figure 2.17: Watershed segmentation of a 3-level image by Vincent's algorithm. Upper left: Original image Upper right: flooding reaches level 1, 3 minima are detected. Lower left: flooding reaches level 2 Lower right: flooding reaches level 3.
\begin{figure}\centerline{ \epsfxsize=8cm
\epsfbox{figures/chapter1/figure17.eps} }
\end{figure}

Alternatively, Meyer [106,107,108] shows that the watershed transform can also be expressed in terms of a topographical distance function. If points p and q belong to a line of steepest slope between p and q (G(q)>G(p)), the topographical distance is defined as disttopo(p,q)=G(q)-G(p). The catchment basins are then the equivalent of the Voronoi division of the image, where object pixels are the local minima and the distance considered is disttopo.

Meyer implements this computation by using hierarchical waiting queues, which allows him to process the image by increasing topographical distance values. This technique was later adapted by Thiran, Warscotte and Macq [152] to implement a variation of the watershed segmentation.

Clearly, both Vincent's [170] and Meyer's [108] algorithms can be seen as adaptation of the optimal bucket sorting algorithm of Verwer, Verbeek and Dekker [167].


next up previous contents
Next: Applications Up: Extended concepts Previous: k-distance transformations
Olivier Cuisenaire
1999-10-05