next up previous contents
Next: Application: tissue classification in Up: k-NN classification and k-distance Previous: The k-DT algorithm.

Computational complexity.

In [175], it is shown that using a k-DT to compute a lookup table is the most efficient method to perform the k-NN classification of a large data set where the number of possible different patterns is comparable of lower than the size of the data set. Thanks to this result, we can restrict the present complexity analysis to showing that the algorithm of section [*] has an optimal computational complexity for a k-DT.
The output of the algorithm is made of k maps covering the F patterns. The complexity of the output is therefore O(F.k). This is the absolute lower bound for the optimal algorithmic complexity of a k-DT. More realistically, a k-DT algorithm should at least consider the neighbors of a pixel to compute its value, which means a O(F.D.k) complexity in D dimensions.
In algorithm [*], the procedure assign(p,l) is called exactly F.k times, since as soon as NN icur(p)(p) is assigned a value, icur(p) is incremented. Because they are always called together, the procedure propagate(p,l) is also called exactly F.k times. In that procedure, the neighbors of p are entered in the buckets structure. Using 2D-direct neighborhoods, restricted to those in the same direction as p-q(l), there are between D and 2D neighbors propagated for each of the F.k pixels that enter propagate(p,l). Thus, the total amount of elements passing through the buckets is O(F.D.k).
The distance distE(p,q(l) is only computed inside the propagate(p,l) procedure, in order to determine in which bucket (p,l) should go. Thus, the total amount of distance computations is exactly the same as the total amount of elements passing through the bucket structure, i.e O(F.D.k).
Finally, the number of comparisons performed inside the main loop for an element (p,l) taken from bucket(d) is fixed, unless dcur(p) = d. In this case, it will be compared icur(p)-idcur(p) times. This number is in average very low when prototypes are unique. The total number of comparisons is then also O(F.D.k).
In the worst case, with k identical prototypes at every prototype location, the average number of comparisons is close to k. It raises the number of comparisons to O(F.D.k2). This could be avoided by replacing prototypes (represented by a label l) by prototype locations (represented by a label l and a number m of occurrences in that location). Nevertheless, for practical applications, this does not appear to be needed.

  
Figure 10.1: k-DT algorithmic complexity: dependence from the number of neighbors k for several image sizes
\begin{figure}\centerline{ \epsfysize=7,5cm
\epsfbox{figures/chapter5/knn_testk.eps} }
\end{figure}

In order to confirm this theoretical analysis, we ran 3 experiments on synthetical 2D data, varying the number k of nearest neighbors, the size of the $n \times n$ images and the number N of prototypes. In experiment 1 (Figure [*]), k varies from 1 to 10, 3 values of n are considered and N is fixed to 1000. In experiment 2 (Figure [*]), n varies from 128 to 1024, 3 values of k are considered and N=1000. In experiment 3 (Figure [*]), N varies from 100 to 1000, n is fixed to 512 and 3 values of k are considered. In all experiments, the prototypes are randomly generated.
Theoretically, the computational complexity is O (F.D.k) = o(n2.2.k), so that the ratio of the needed CPU time by k.n2 should be a constant. The 3 experiments were performed on a Pentium II at 233 MHz computer. CPU times were recorded and their ratio to k.n2 are displayed in Figures [*] to [*].
In experiment 1 (Figure [*]), the CPU time per k.pixel is constant for k > 3. For $k \leq 3$, the fixed cost of handling the additional information in icur,dcur and idcur is a non-negligible factor, so that the CPU time per k.pixel is slightly higher.

  
Figure 10.2: k-DT algorithmic complexity: dependence from the image size ( $n \times n$) for several values of k.
\begin{figure}\centerline{ \epsfysize=7,5cm
\epsfbox{figures/chapter5/knn_testF.eps} }
\end{figure}


  
Figure 10.3: k-DT algorithmic complexity: dependence from the number N of training patterns for several values of k.

In experiment 2 (Figure [*]), the image size has no influence on the CPU time per k.pixel. In experiment 3 (Figure [*]), the number of prototypes has no influence either. In both Figures, the line for which k=1 is significantly higher than the other two, as suggested by the results of experiment 1.
Finally, let us notice that the CPU time required by the k-DT algorithm can be further reduced if, in the application considered, patterns that are too far away from all prototypes would better be not classified. In this case, the propagation can be stopped as soon as a critical distance is reached, even though the whole pattern space hasn't been labeled yet.

 
Table 10.1: Complexity of k-NN classification algorithms, with F the number of patterns to classify, N the number of training prototypes, k the number of nearest neighbors and D the dimension of the pattern space.
  Distance computations Comparison operations
Cover F.N O $(F.N.\log(N))$
Friedman O (F.k1/D.N1-1/D) $D.N.\log(N) + $O (F.k1/D.N1-1/D)
Warfield O (2D.F.(D+1).k) O $(2^D.F.(D+1).k.\log((D+1).k)$
This k-DT O(F.D.k) O(F.D.k)
 

In table [*], extended from the original in [175], the performance of this k-DT algorithm is compared to the brute force algorithm of Cover [24], and those of Friedman [59] and Warfield [175].
next up previous contents
Next: Application: tissue classification in Up: k-NN classification and k-distance Previous: The k-DT algorithm.
Olivier Cuisenaire
1999-10-05