Ten optimal 15 x 15 sub-libraries were generated by the genetic algorithm described above. To assess the quality of the results, they were compared to (1) 10 000 randomly generated libraries, (2) 10 libraries optimal with respect to maximal diversity regardless of the other criteria, and (3) 10 libraries with minimal diversity. In the ten optimal libraries, about 78% of the compounds had a crop protection score greater than 0.3, with a diversity index of 5.2 and starting-material costs of about 3000 USD (20 g per building block). The 10 000 randomly drawn libraries behaved much less favorably. Figures 6 and 7 show the distribution of the percentage of suitable compounds over these libraries and the distribution of the costs of the starting materials.

Figure 7. Distribution of the cost of the starting materials for 10 000 randomly drawn 15 x 15 libraries.

As can be seen from Figure 6, about 80% of the 10 000 random libraries have 0-10% suitable compounds; the rest contain 10-30% suitable compounds, a rather insufficient rate. The same holds for the building block costs (20 g per building block). No random library has starting-material costs below 30 000 USD. The mean value is 80 000 USD and the maximum about 130 000 USD. Thus, the GA optimization is far more advantageous. Figure 8 illustrates this superiority again. The diversity index of the sub-libraries is plotted against the percentage of suitable compounds (crop protection score >0.3) for the 10 000 random libraries, the 10 libraries optimal with respect to score, diversity and cost, 10 maximally diverse libraries regardless of the other criteria, and 10 minimally diverse libraries regardless of the other criteria. The last two groups were obtained by GA runs that maximized or minimized diversity alone, ignoring the other criteria (score and cost), in order to find the lower and upper ends of the diversity scale for this type of combinatorial library, since there is no absolute diversity scale.
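The random baseline can be illustrated with a short sketch. It is not the authors' implementation: it assumes that a 15 x 15 sub-library is formed by picking 15 building blocks from each of two reagent pools, that a precomputed score matrix `score[a][b]` holds the crop protection score of each product, and that `price_a` and `price_b` hold the price of 20 g of each building block. All names and the data layout are hypothetical.

```python
import random


def draw_random_library(n_a, n_b, size, rng):
    """Pick `size` distinct building blocks from each of the two pools."""
    rows = rng.sample(range(n_a), size)
    cols = rng.sample(range(n_b), size)
    return rows, cols


def percent_suitable(rows, cols, score, threshold=0.3):
    """Percentage of products whose score exceeds `threshold`."""
    hits = sum(1 for a in rows for b in cols if score[a][b] > threshold)
    return 100.0 * hits / (len(rows) * len(cols))


def library_cost(rows, cols, price_a, price_b):
    """Starting-material cost: sum of the prices of the selected building blocks."""
    return sum(price_a[a] for a in rows) + sum(price_b[b] for b in cols)


def random_baseline(score, price_a, price_b, n_libraries=10_000, size=15, seed=0):
    """Return (percent suitable, cost) for each randomly drawn sub-library."""
    rng = random.Random(seed)  # fixed seed so the baseline is reproducible
    n_a, n_b = len(score), len(score[0])
    results = []
    for _ in range(n_libraries):
        rows, cols = draw_random_library(n_a, n_b, size, rng)
        results.append(
            (percent_suitable(rows, cols, score),
             library_cost(rows, cols, price_a, price_b))
        )
    return results
```

Histograms of the two components of the returned pairs would correspond to the kind of distributions shown in Figures 6 and 7, under the assumptions stated above.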

The diagram shows the 10 best libraries on the right-hand side of the score axis and in the upper third of the diversity scale, with the minimally and maximally diverse libraries as end points. This is much better with respect to both criteria than the 10 000 random libraries and much better with respect to the score than the maximally diverse libraries. Thus, the GA found a sufficient compromise between several independent criteria for combinatorial library optimization.

Figure 8. Diversity index vs. percentage of suitable compounds (score >0.3): 10 000 randomly drawn 15 x 15 libraries, the best library after optimization, the maximally and the minimally diverse libraries.
