An ordered matrix is a direct representation of the distance matrix in which colours or degrees of shading represent different distance ranges. The samples in the matrix are arranged so that nearest neighbours lie as close together as possible. Figure 30 gives an example of a simple ordered matrix for the DMD data. An excellent way to arrive at the best possible sequence of neighbours for the ordered matrix is to use Kruskal's multidimensional scaling technique (ref. 134), which is strongly based on neighbourship criteria, to scale the n-dimensional distance matrix to a one-dimensional sequence.
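The idea of ordering samples along a one-dimensional sequence and then shading the reordered distance matrix can be sketched in a few lines. The sketch below is only an illustration and makes several assumptions: it substitutes classical (Torgerson) metric scaling, computed by power iteration, for Kruskal's non-metric procedure, and the function names `order_by_1d_scaling` and `shaded_rows`, the shading characters, and the toy distances are all hypothetical rather than taken from the original work.

```python
def order_by_1d_scaling(dist, iters=200):
    """Return sample indices sorted along a one-dimensional scaling axis.

    Assumption: classical (Torgerson) scaling is used as a simple stand-in
    for Kruskal's non-metric scaling. `dist` is a symmetric list of lists.
    """
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]
    total_mean = sum(row_mean) / n
    # Double-centred matrix B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    # Power iteration for the dominant eigenvector of B
    v = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # The sign of v is arbitrary, so the sequence may come out reversed.
    return sorted(range(n), key=lambda i: v[i])


def shaded_rows(dist, order, shades="#*:. "):
    """Render the reordered matrix as text; darker characters = smaller distances."""
    dmax = max(max(row) for row in dist) or 1.0
    rows = []
    for i in order:
        row = ""
        for j in order:
            level = min(int(dist[i][j] / dmax * len(shades)), len(shades) - 1)
            row += shades[level]
        rows.append(row)
    return rows


if __name__ == "__main__":
    positions = [0, 5, 1, 4, 2]  # toy data, not the DMD data
    toy = [[abs(a - b) for b in positions] for a in positions]
    for line in shaded_rows(toy, order_by_1d_scaling(toy)):
        print(line)
```

In a well-ordered matrix the dark cells concentrate along the diagonal, which is the visual cue the ordered matrix relies on.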

Non-linear mapping techniques have been used extensively in this work and are sufficiently complex to be considered as multivariate statistical analysis procedures in their own right. In fact, the non-linear mapping technique most often used by us, which is based on Kruskal's multidimensional scaling method (ref. 134), has sometimes been compared with principal component factor analysis techniques (ref. 132) because of its remarkable ability to display the most prominent numerical trends in the data. The concept of non-linear mapping is complicated and therefore often a source of confusion. For instance, a Kruskal-type two-dimensional non-linear map (three-dimensional versions also exist) is not a projection of multidimensional space on to two-dimensional space, but the result of an iterative computer procedure (ref. 45) aimed at minimizing the "stress", an empirical goodness-of-fit value (ref. 135) which compares the configuration in two-dimensional space with the original configuration in multidimensional space. Low stress values (e.g. <5%) are obtained if the nearest neighbour sequence for each point in the non-linear map is nearly identical with the nearest neighbour sequence for the corresponding point in multidimensional space. Whereas "relative neighbourship" is thus preserved as far as possible, relative distance between points is often sacrificed.
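To make the notion of stress more concrete, the following minimal sketch computes a stress value of the kind commonly called Kruskal's stress-1 for a given map. It assumes the usual textbook formulation, in which the map distances are compared with their best monotone (rank-preserving) fit to the original distances. The function names `pairwise`, `isotonic` and `kruskal_stress` are hypothetical, and the exact stress formula used in refs. 45 and 135 may differ in detail.

```python
import math


def pairwise(points):
    """Euclidean distance matrix for a list of coordinate tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)]
            for i in range(n)]


def isotonic(values):
    """Least-squares non-decreasing fit (pool-adjacent-violators algorithm)."""
    blocks = []  # each block holds [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge neighbouring blocks while their means violate monotonicity.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def kruskal_stress(original, mapped):
    """Stress-1 between an original distance matrix and the map's distance matrix.

    Assumption: ties in the original distances are left in input order.
    """
    n = len(original)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Rank the pairs by their distance in the original multidimensional space.
    pairs.sort(key=lambda p: original[p[0]][p[1]])
    d = [mapped[i][j] for i, j in pairs]
    d_hat = isotonic(d)  # best rank-preserving approximation of the map distances
    num = sum((a - b) ** 2 for a, b in zip(d, d_hat))
    den = sum(a * a for a in d)
    return math.sqrt(num / den) if den else 0.0
```

Because only the rank order of the original distances enters the calculation, a map that reproduces every neighbour ranking has zero stress even when its actual distances are distorted, which is the sense in which relative distance is sacrificed. An iterative mapping procedure would move the map points repeatedly to reduce this value.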

The non-linear map of the distance matrix in Table 6 is shown in Figure 31. This map conforms closely to the mental picture formed from the distance matrix, as also evidenced by the very low stress value of 0.8%. In particular, the non-linear map appears to be a synthesis of the relationships shown in the scatter plots of Figures 26 and 27.

The loss of accurate distance information limits the usefulness of the non-linear map for quantitative decisions. However, non-linear maps are powerful tools for visualising differentiation tendencies in the data. Especially when the samples form clusters of complex shape, as often occurs with biological samples, non-linear maps may be the only possible way of obtaining insight into the relationship between
