Combination of molecular similarity measures using data fusion

CLAIRE M.R. GINN1, PETER WILLETT1* and JOHN BRADSHAW2

1 University of Sheffield, Western Bank, Sheffield S10 2TN, U.K.

2 GlaxoWellcome Research and Development Limited, Stevenage SG1 2NY, U.K.

Summary. Many different measures of structural similarity have been suggested for matching chemical structures, each focusing on a particular type of molecular characteristic. The multi-faceted nature of biological activity suggests that an appropriate similarity measure should encompass many different types of characteristic, and this article discusses the use of data fusion methods to combine the results of searches based on multiple similarity measures. Experiments with several different types of dataset and activity suggest that data fusion provides a simple but effective approach to the combination of individual similarity measures. The best results were generally obtained with a fusion rule that sums the rank positions achieved by each molecule in searches using the individual measures.

Key words: data fusion, database searching, molecular similarity, similarity measure
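
The sum-of-ranks fusion rule mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the molecule identifiers and the similarity scores are all hypothetical, and the sketch assumes that every search scores the same set of molecules and that ties are broken by identifier.

```python
from typing import Dict, List


def rank_positions(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank molecules by decreasing similarity; rank 1 is the most similar.

    Ties are broken alphabetically by identifier (an assumption of this sketch).
    """
    ordered = sorted(scores, key=lambda mol: (-scores[mol], mol))
    return {mol: pos for pos, mol in enumerate(ordered, start=1)}


def sum_rank_fusion(searches: List[Dict[str, float]]) -> List[str]:
    """Fuse several searches by summing each molecule's rank positions.

    Molecules with the smallest rank sum come first in the fused list.
    Assumes every search scores the same set of molecules.
    """
    rankings = [rank_positions(search) for search in searches]
    molecules = sorted(rankings[0])
    fused = {mol: sum(ranking[mol] for ranking in rankings) for mol in molecules}
    return sorted(molecules, key=lambda mol: (fused[mol], mol))


# Hypothetical similarity scores from two different similarity measures.
search_a = {"mol1": 0.9, "mol2": 0.7, "mol3": 0.4, "mol4": 0.2}
search_b = {"mol1": 0.3, "mol2": 0.8, "mol3": 0.6, "mol4": 0.1}

fused_order = sum_rank_fusion([search_a, search_b])
print(fused_order)  # ['mol2', 'mol1', 'mol3', 'mol4']
```

In this toy example mol2 is ranked 2nd and 1st by the two measures (rank sum 3), which places it ahead of mol1 (ranked 1st and 3rd, rank sum 4), even though mol1 has the single highest score in the first search.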
