Standard vs. accessibility-scaled H-bond score

Ranking a library of thrombin inhibitors

The thrombin library used in this study has two properties that should be kept in mind. First, 75% of the compounds contain an amidinium or guanidinium functional group, and another 7% contain a different basic functional group designed to fit into the S1 pocket. Therefore, any successful ranking of binding affinity must rely on secondary hydrogen bonds and hydrophobic interactions rather than on the presence or absence of a basic group binding to Asp189 in the S1 pocket. Second, Figure 2a shows that the library contains a large percentage of compounds with many rotatable bonds and that there is a clear tendency for the large compounds to have higher activity. To minimize the effect of molecule size, we discuss results separately for two subsets: set A, containing all molecules with 3-8 rotatable bonds (1627 compounds, 141 of them <0.1 µM), and set B, containing all molecules with more than 8 rotatable bonds (1513 compounds, 452 of them <0.1 µM). The enrichment factors calculated with standard and accessibility-scaled (AS) H-bond scoring for both sets are depicted in Figures 2b and c. An activity threshold of pK = 7 was applied. Because the percentage of molecules defined as active differs between the two sets, the maximum achievable enrichment factors are also quite different (11.5 for set A vs. 3.3 for set B).
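The maximum enrichment factors quoted above can be checked with a short calculation. The sketch below assumes the common definition of the enrichment factor (hit rate in the top-ranked subset divided by the hit rate in the whole library); the text does not spell out the definition, and the function names are illustrative only.

```python
def enrichment_factor(is_active_ranked, top_fraction):
    """Enrichment factor for the top `top_fraction` of a ranked library.

    `is_active_ranked` is a list of booleans ordered from best to worst score.
    Assumed definition: (hit rate in top subset) / (hit rate in whole library).
    """
    n_total = len(is_active_ranked)
    n_actives = sum(is_active_ranked)
    n_top = max(1, round(n_total * top_fraction))
    hits_top = sum(is_active_ranked[:n_top])
    return (hits_top / n_top) / (n_actives / n_total)


def max_enrichment_factor(n_total, n_actives):
    """Upper bound on the enrichment factor: a top subset made up only of actives.

    This bound is reached only while the top subset is no larger than the
    number of actives in the library.
    """
    return n_total / n_actives


# Library sizes and active counts as given in the text.
max_ef_set_a = max_enrichment_factor(1627, 141)  # about 11.5
max_ef_set_b = max_enrichment_factor(1513, 452)  # about 3.3

# Toy ranking (hypothetical data): 2 actives among 10 compounds, both ranked first.
toy_ranking = [True, True] + [False] * 8
toy_ef = enrichment_factor(toy_ranking, 0.2)
```

Under this definition, reaching 30 to 50% of the maximum in set A corresponds to enrichment factors of roughly 3.5 to 5.8.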

Generally speaking, 30 to 50% of the maximum achievable enrichment is reached in the top 10% subsets of the ranked databases. The differentiation between high-affinity and low-affinity thrombin inhibitors is thus moderate, which must be attributed to errors of the scoring function as well as errors in the ligand poses generated by the docking algorithm. For set B, slightly better performance is found with the standard scoring scheme. In order to distinguish between large active and inactive molecules, hydrogen bonds formed at the periphery of the cavity must obviously be accounted for. For set A, the AS scoring scheme performs significantly better than standard scoring. A more detailed look at the docking results reveals structural reasons for this fact. Many of the compounds defined as 'inactive' in set A are either docked outside the active site, where their interactions have less weight when accessibility scaling is used, or form few unfavorable hydrogen bonds within the cavity. This stresses the importance of hydrogen bonds formed in buried pockets of a protein. In this sense, accessibility scaling of hydrogen bonds emphasizes the pharmacophore-like character of specific atoms in protein cavities. Set A clearly corresponds more closely than set B to a typical distribution of molecules in a virtual screening experiment, because large molecular weight and high molecular flexibility are usually not desirable properties of compounds in a screening library. The AS H-bond potential therefore seems to be appropriate for virtual screening experiments. In order to test this hypothesis, further enrichment experiments were performed using a 5000-molecule random subset of the WDI database.

Retrieving thrombin inhibitors from a WDI subset library

Two random sets of 100 thrombin inhibitors were generated from the thrombin library. The first random set was restricted to compounds with binding constants of 0.001-0.1 µM (high-affinity set); the second set contained only 5-10 µM inhibitors (low-affinity set). In both cases, the additional restriction of 3-8 rotatable bonds per thrombin inhibitor was applied in order to avoid the scoring bias for large molecules mentioned above. Docking results for the thrombin inhibitor sets were combined with those for the WDI subset, and the total library was re-ranked according to FlexX scores. The resulting enrichment curves are plotted in the left column of Figure 3; the percentage of thrombin inhibitors found at each percent level of the total library is plotted in the right-hand column of Figure 3 for better comparison.

Figure 3. Illustration of the enrichment achieved with two different sets of 100 thrombin inhibitors in a 5000-molecule random subset of the WDI. (a, b) high-affinity subset, (c, d) low-affinity subset. Dashed lines: standard scoring, solid lines: accessibility-scaled H-bond scoring. The right-hand plots show the fraction of active compounds found at each percent level of the ranked library (x-axis: % of ranked library).

FlexX can clearly distinguish between the compounds from the high-affinity set and the WDI subset (Figures 3a and b). Standard scoring achieves enrichment factors higher than 20 and retrieves all thrombin inhibitors within the top 9% of the total library. AS H-bond scoring performs even better: the maximum enrichment factor of about 50 is almost reached, and all thrombin inhibitors are among the top 4% of the ranked library. Finding high-affinity inhibitors of thrombin in a WDI subset, however, is certainly not a realistic virtual screening experiment. When a screening library is searched for inhibitors of a new target, it is more reasonable to expect inhibitors in the low micromolar range at best. This situation is mimicked by the low-affinity data set added to the WDI subset. Figures 3c and d illustrate that picking weakly binding thrombin inhibitors from a library is a far more difficult task for a docking program, even though thrombin has a well-defined set of H-bond donor and acceptor groups that play a role in inhibitor recognition and binding. Standard FlexX scoring retrieves only about 60% of the thrombin medium-affinity inhibitors and 50% of the low-affinity inhibitors within the top 5% of the total library. In the same fraction of the library, AS H-bond scoring finds 20% more active compounds. The WDI subset contained only 143 compounds with amidinium systems (less than 3%). In the combined low-affinity and WDI libraries, 34 of these amidinium compounds were contained in the top 5% when standard scoring was applied. With AS H-bond scoring, this number decreases to 21. This means that the modified scoring scheme does not simply enrich any amidinium system but is able to distinguish between active and inactive amidinium compounds to some extent.
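The retrieval percentages discussed above follow from merging the two score lists, re-ranking, and counting inhibitors in the top fraction. The following is a minimal sketch of that bookkeeping, not the procedure actually used in this study. It assumes that lower (more negative) docking scores rank better; the function name and the toy scores are hypothetical.

```python
def fraction_retrieved(inhibitor_scores, background_scores, top_fraction):
    """Fraction of the inhibitors found in the top `top_fraction` of the combined library.

    Assumption: lower (more negative) scores are better, so the combined
    library is sorted in ascending order of score.
    """
    combined = [(score, True) for score in inhibitor_scores]
    combined += [(score, False) for score in background_scores]
    combined.sort(key=lambda pair: pair[0])  # best-scoring compounds first
    n_top = max(1, round(len(combined) * top_fraction))
    hits = sum(1 for _, is_inhibitor in combined[:n_top] if is_inhibitor)
    return hits / len(inhibitor_scores)


# Toy scores (hypothetical, not taken from the study).
toy_inhibitors = [-30.0, -28.0, -12.0, -10.0]
toy_background = [-25.0, -20.0, -15.0, -14.0, -13.0, -11.0]

# Top 30% of 10 compounds = 3 compounds, 2 of which are inhibitors.
retrieved_top_30 = fraction_retrieved(toy_inhibitors, toy_background, 0.3)
```

For the libraries described here (100 inhibitors added to 5000 WDI compounds), the upper bound on the enrichment factor under the usual definition is 5100/100 = 51, consistent with the value of about 50 quoted above.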

It can be concluded that, at least for this application, AS H-bond scoring is a significant improvement over the standard scoring scheme. AS H-bond scoring can help to focus the inhibitor search for targets where active site cavities contain a number of buried donors and acceptors that must be satisfied. It has meanwhile been successfully applied in a number of Roche projects that fulfill these criteria, e.g. p38 MAP kinase and Staphylococcus aureus gyrase B. It might also be useful for the assembly of motif libraries (small, weakly but specifically binding molecules) for SAR by NMR experiments [69-71].

Modifications of scoring functions like the one presented here should be employed with caution, since there is an inherent danger of a loss of generality. For example, in the case of thrombin, AS H-bond scaling focuses on charged S1-binding groups and could easily overlook inhibitor classes with neutral S1 groups such as those from Merck or Eli Lilly (for reviews see References 72 and 73). As long as there is no truly generally applicable scoring function, however, it is and will remain fair practice to tailor scoring functions to individual types of targets [58,74-77].
