Modifications of the scoring function in FlexX for virtual screening applications


F: Hoffmann-La Roche Ltd., Pharmaceutical Research, CH-4070 Basel, Switzerland (E-mail:[email protected])

Summary. A modification of the hydrogen bond score in the docking program FlexX is presented. Hydrogen bonds formed in inaccessible regions of protein cavities thereby gain larger weight than others formed at the protein surface. The modified scoring function is tested with thrombin as a target. Secondly, a recently published knowledge-based scoring function is compared to the FlexX scoring function in several database ranking experiments.

Key words: docking, hydrogen bonds, scoring, virtual screening


The goal of 'virtual screening' is to select subsets of chemical libraries in such a way that they are enriched with compounds showing a desired affinity towards a given macromolecular target [1,2]. Docking calculations are a means of database prioritization that makes use of the 3D structure of a receptor in a quantitative way [3-8]. The computational expenditure for docking calculations is higher than for 2D similarity and most 3D pharmacophore search methods. Reasonable database sizes are 100-10 000 compounds; depending on the problem specification, between 20 and 1000 compounds can nowadays be docked per CPU-hour. Docking has the advantages that - besides the target 3D structure - no other experimental information is needed and that there is no bias towards finding active compounds of specific structural classes. Since the seminal work by the Kuntz group [9], many docking algorithms have been proposed [10-24], some of which have resulted in commercially available packages such as GOLD [17,18], DOCK [9-11,13] and FlexX [21,22], which are suitable for virtual screening purposes.

Flexible docking programs search the translational, rotational and con-formational space of a putative ligand in the active site pocket of the receptor (see References 25-28 for recent analyses of search strategies). For each pose (denoting a conformation together with its orientation in space relative to the receptor coordinates), the interactions formed between the receptor and the ligand are evaluated. This leads to an energy value, a score, for each pose. The score is a measure ofthe free energy ofbinding. It serves various ranking purposes, (i) ranking of poses generated for one ligand molecule and one receptor structure (structure prediction), (ii) ranking of different ligands relative to the same receptor (database prioritization) and (iii) comparison of binding energies of a ligand in two different receptors (selectivity assessment). Up to now, there exists no scoring function with satisfactory performance in all these ranking problems. Indeed, the accurate and rapid prediction of binding free energies is the one challenging problem of structure-based drug design [29-33]. Many types of fast general scoring functions have been proposed, ranging from molecular mechanics force fields [9,15,25,33-36], to functions empirically fitted to experimental binding energies [37-43] and potentials of mean force [44-50].

While the reproduction of experimentally solved complex structures is an accepted way of testing and comparing docking programs [51,52], surprisingly few virtual screening applications have been published. Compound selection based on docking calculations, for example, has been done for thrombin [53,54], thymidylate synthase [55,56], DHFR enzymes [57] and HIV protease [58,59]. However, only in rare cases [13], such studies are conclusive as to the efficiency of the docking tools in enriching subsets of a database in active compounds. The reason may be that only few consistent sets of measured binding energies are available to the public. Here we describe applications of the docking program FlexX [21,22] to a number of Roche in-house target structures and compound libraries in order to assess the performance of this docking tool in virtual screening applications. A simple modification of H-bond score in the FlexX empirical scoring function is presented as well as a comparison of the FlexX empirical scoring function [37] with a recently published knowledge-based scoring function [47], The results presented here give insight into the usefulness of current docking tools and scoring functions in practical virtual screening applications.

Computational methods

Preparation of ligand libraries

Three sets of Roche compounds were prepared, (i) a set of 3700 Roche compounds with known thrombin Ki values, (ii) a set of470 diaminopyrimidines with associated IC50 data for S. aureus DHFR, and (iii) a set of 650 COX2 inhibitors. The thrombin and the DHFR data sets uniformly cover 6 orders of magnitude in activity and contain around 15% inactive molecules, while the

COX2 dataset covers 4 orders of magnitude. The program Corina [60] was used to generate 3D structures in Sybyl [61] mol2 format. The files generated by Corina were further processed by a C routine that generated a likely protonation state of acidic and basic functional groups. Aliphatic amines, amidines and guanidines were protonated, carboxylic acids were deproton-ated, while the protonation state of aromatic nitrogen-containing heterocycles was left as generated by Corina. A set of5000 randomly selected compounds from the WDI [62] was converted to mol2 format by means of the same procedure.

Docking protocol

An X-ray structure ofhuman a-thrombin complexed with NAPAP (PDB [63] code ldwd) was chosen as thrombin target structure. All protein atoms within a distance of 8 A, of any NAPAP ligand atom were defined as active site atoms. The water molecule 47 in the P1 pocket was retained as an active site atom. In-house X-ray structures of Staphylococcus aureus DHFR [64] and COX2 were prepared in a similar way. All libraries were docked into the thrombin active site by means of the standard scoring scheme for hydrogen bonds. The WDI and thrombin libraries were also docked using a modified version employing accessibility scaling (vide infra). FlexX default settings were used except for AGrol, which was set to 0.7 kJ mol1. Automatic base placement was used for the WDI, thrombin and COX2 libraries (average execution time per molecule: 2 min on an SGI R10K processor). The DHFR library was docked using a fixed placement of the diaminopyrimidine moiety that was taken from the X-ray structure (average execution time 6 s). Only rank 1 solutions were considered for each compound. Since many compounds contained stereocenters with unknown configuration and because a complete enumeration of all stereoisomers could not be afforded, docking was performed with just two enantiomers of an arbitrary stereoisomer generated by Corina. Only the structure and energy with the better rank 1 score was used for database ranking.

Hydrogen bond scoring schemes

When solvation effects are not explicitly accounted for, empirical scoring functions treat hydrogen bond (or in the case of force fields electrostatic) contributions equal, whether they are formed on the surface of the protein or within a protein pocket. In our modified scheme, individual H-bond contributions are evaluated as a function of the solvent accessibility of the H-bond partner in the binding pocket, simulating the electrostatic shielding experienced by buried hydrogen bonds [65]. The standard hydrogen bond scoring

accessibility = 1 accessibility = 0

Figure 1. Schematic representation of the calculation of surface accessibility. Instead of the 18 probe directions on this 2D representation, 45 spherical directions are used in the 3D case. In a first step, the number of directions is counted along which access to a surface point P is unhindered from the outside of the protein. In a second step, the resulting integer values are scaled in an interval between 0 and 1. Points on planar or convex parts of a surface receive accessibility values of 1.

accessibility = 1 accessibility = 0

Figure 1. Schematic representation of the calculation of surface accessibility. Instead of the 18 probe directions on this 2D representation, 45 spherical directions are used in the 3D case. In a first step, the number of directions is counted along which access to a surface point P is unhindered from the outside of the protein. In a second step, the resulting integer values are scaled in an interval between 0 and 1. Points on planar or convex parts of a surface receive accessibility values of 1.

scheme in FlexX is based on the penetration of so-called interaction geometries [21] of the binding partners. If a hydrogen bond exists by this definition, its contribution to the score is calculated as a constant term multiplied by a penalty function describing the amount of deviation from preset ideal angle and distance values. The modified scoring scheme introduces a second scaling term. For each point on the Connolly surface [66] of the binding site, an accessibility value a is assigned, whose calculation is described in detail elsewhere [67] (cf. Figure 1). Average a values are calculated for each surface atom.

A sigmoidal function was empirically chosen to scale the hydrogen bond scores according to the accessibility values. Several functions of the general form (1 + exp(p(a - q) - r)), where a is the atomic accessibility value, were tested with varying values of p, q and r, ranging from almost linear to very steeply ascending scaling functions. Best performance on a series of in-house data sets (data not shown) was found with values of p = 10, q = 0.2 and r = 5. This scaling function virtually removes the contributions of all hydrogen bonds formed at values of a = 0.6 and above and removes about half of the contribution of hydrogen bonds formed at a = 0.3. As a consequence, there is a strong overall reduction of hydrogen bond energy in the total score, which reflects our observation that contributions of the lipophilic contact surface are often underestimated in the FlexX scoring function. The performance of FlexX for structure prediction is not altered when the modified score is used (results not shown). It is obvious that the modified scoring scheme for hydrogen bonds is still a crude approximation. Still, all hydrogen bonds between various functional groups are treated equal and the chemical environment of each hydrogen bond is not considered [68]. Attempts by Bohm to improve his regression-based scoring scheme by terms accounting for the environment of hydrogen bonds have failed [41], but it is likely that terms for such secondary interactions cannot be derived by regression techniques.

Calculation of enrichment factors

In this study we were not interested in absolute scoring energy values obtained from single docking runs, but focussed exclusively on the ranking of molecules with respect to each other. The quality of the rankings was assessed by enrichment factors. For this purpose, libraries were divided into 'active' and 'inactive' compounds at an arbitrary pKi threshold. In enriching experiments employing the WDI library, sets of 100 randomly selected compounds from the thrombin library were selected and added as the only 'active' compounds to the WDI library. In all cases, enrichment factors were calculated as a function of the size of library subsets by the formula ef(subset size) = fraction active in subset/fraction active in library

On average, random screening should result in enrichment factors of ef= 1, values of ef< 1 are obtained when the subset contains less active compounds than should be expected from the library average. Enrichment factors were not scaled, i.e. absolute values obtained depend on the ratio of active and inactive compounds defined in each specific case and should be compared with the maximum achievable ef that is obtained by dividing the total number of molecules by the number of active molecules. For very large subsets of a library the term 'enrichment' naturally loses its meaning.

Figure 2. a) Plot of pKi vs. number of rotatable bonds for the thrombin library. Enrichment plots for two subsets of the thrombin library containing only compounds with 3-8 rotatable bonds (b) and only compounds with more than 8 rotatable bonds (c) are shown.
0 0

Post a comment