The development of combinatorial chemistry and its application to drug design [1,2] has led to new search problems in the area of virtual screening [3]. First, the number of molecules that can be synthesized by combinatorial chemistry has increased dramatically compared with classical methods. This implies a much larger search space that virtual screening methods must cover.

* To whom correspondence should be addressed. E-mail: [email protected]

Probably more important for the development of virtual screening methods is the introduction of structure into this increased search space. If an unstructured compound collection is given, each molecule has to be analyzed independently in a screening experiment. Combinatorial libraries, however, follow a systematic build-up law, synthesizing molecules from a highly limited set of building blocks. This structure can be exploited to substantially reduce the runtime of virtual screening calculations.
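To make the effect of this build-up law concrete, the following minimal sketch contrasts the number of enumerated products with the number of building blocks that generate them. The R-group counts are hypothetical and chosen only for illustration, and the function names are introduced here.

```python
from math import prod

def library_size(rgroup_counts):
    """Number of enumerated molecules: one building block is chosen per R-group position."""
    return prod(rgroup_counts)

def building_block_count(rgroup_counts):
    """Number of distinct building blocks across all R-group positions."""
    return sum(rgroup_counts)

# Hypothetical library: three R-group positions with 20, 25 and 40 alternatives.
counts = [20, 25, 40]
print(library_size(counts))          # 20000 molecules to screen one by one
print(building_block_count(counts))  # 85 building blocks to place
```

A method that works on building blocks rather than on enumerated molecules therefore scales with the sum of the R-group counts instead of their product, provided the fragment results can be combined cheaply.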

Here we focus on the structure-based design of targeted combinatorial libraries through fast molecular docking. In general, we can distinguish three kinds of problems. In all cases, we assume that the 3D structure of the target protein is known:

Combinatorial Docking Problem: Given a library, calculate the docking score (and the geometry of the complex) for each molecule of the library.

R-group Selection Problem: Given a library, select molecules for the individual R-groups in order to form a smaller sublibrary with an enriched number of hits for the target protein.

de Novo Library Design Problem: Given a catalog of molecules, design a library (including the rules of synthesis) optimizing the number of hits for the target protein.
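The first two problem statements can be illustrated by a brute-force baseline. This is a sketch under stated assumptions, not a method from the literature cited here: a molecule is represented as a tuple with one building block per R-group position, `score` is a hypothetical stand-in for a docking score (lower is assumed to be better), and the selection rule (keep the blocks that appear in the best-scoring molecules) is only one of many possible criteria.

```python
from itertools import product

def dock_all(rgroups, score):
    """Combinatorial docking by brute force: score every enumerated molecule."""
    return {mol: score(mol) for mol in product(*rgroups)}

def select_rgroups(rgroups, scores, keep):
    """R-group selection: per position, keep the `keep` blocks whose best
    molecule has the lowest score (lower is assumed to be better)."""
    selected = []
    for pos, blocks in enumerate(rgroups):
        best = {
            b: min(s for mol, s in scores.items() if mol[pos] == b)
            for b in blocks
        }
        selected.append(sorted(blocks, key=best.get)[:keep])
    return selected

# Toy data: hypothetical per-block contributions, summed to a molecule score.
toy = {"a1": -1.0, "a2": -3.0, "a3": -2.0, "b1": -0.5, "b2": -4.0}
rgroups = [["a1", "a2", "a3"], ["b1", "b2"]]
scores = dock_all(rgroups, lambda mol: sum(toy[b] for b in mol))
print(len(scores))                              # 6
print(select_rgroups(rgroups, scores, keep=2))  # [['a2', 'a3'], ['b2', 'b1']]
```

The baseline evaluates `score` once per enumerated molecule, which is exactly the cost that combinatorial docking methods try to avoid.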

Methods for these problems have emerged from the areas of molecular docking and de novo ligand design (see [4] for an overview of combinatorial docking methods and [5-12] for reviews of docking and de novo design). In the former case, the docking algorithms are applied to fragments of the whole molecule and the resulting information is then connected, yielding placements for individual library molecules. In the latter case, the de novo ligand design is constrained by predefined rules of synthesis.

Early algorithms for the combinatorial docking problem analyzed the similarity within given ligand datasets in order to speed up the search process. These papers focus on relating the ligands within a dataset structurally. One approach is to generate a minimal tree structure representing the whole ligand dataset [13]. Another is to speed up conformational searching by clustering similar molecules [14]. In both cases, the derived hierarchy of molecules can then be used in an incremental construction docking method.

The combinatorial docking tools PRO_SELECT [15] and CombiDOCK [16] are also based on the incremental construction method. In both approaches, a library is formed by a template (or core) molecule with a set of attachment points, to each of which one out of a predefined set of substituents can be connected. The template is positioned in the active site without considering the substituents. Starting from a few orientations of the template, the substituents are placed into the active site of the protein independently. In the case of PRO_SELECT, substituents are then selected based on score and additional criteria such as 2D similarity and feasibility of synthesis. CombiDOCK performs an additional step in order to calculate a score for whole library molecules by combining fragment scores.
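The template-plus-substituent scheme can be sketched as follows. This is a simplified illustration and not the actual scoring of PRO_SELECT or CombiDOCK: it assumes that each substituent has already been scored independently at its attachment point for every template orientation, that lower scores are better, and that a molecule score is the plain sum of the template and substituent scores. All data and function names are hypothetical.

```python
def score_molecule(template_score, site_scores, choice):
    """Additive combination: template score plus one substituent score per site."""
    return template_score + sum(
        scores[sub] for scores, sub in zip(site_scores, choice)
    )

def best_molecule(orientations):
    """orientations: list of (template_score, site_scores), where site_scores
    holds one dict {substituent: score} per attachment point.
    Returns (score, orientation index, chosen substituents)."""
    best = None
    for idx, (t_score, site_scores) in enumerate(orientations):
        # With independent, additive scores the best substituent can be
        # picked per site, without enumerating whole molecules.
        choice = tuple(min(s, key=s.get) for s in site_scores)
        total = score_molecule(t_score, site_scores, choice)
        if best is None or total < best[0]:
            best = (total, idx, choice)
    return best

# Two hypothetical template orientations, two attachment points each.
orientations = [
    (-2.0, [{"Me": -0.5, "Ph": -1.5}, {"OH": -1.0, "NH2": -0.25}]),
    (-1.0, [{"Me": -0.5, "Ph": -3.0}, {"OH": -0.25, "NH2": -1.25}]),
]
print(best_molecule(orientations))  # (-5.25, 1, ('Ph', 'NH2'))
```

The independence assumption is what makes the per-site choice valid. It ignores interactions between substituents at different attachment points, such as steric overlap.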

Some approaches based on de novo ligand design software have been published for R-group selection problems. Kick et al. [17] applied a variant of the BUILDER program [18] to the preselection of substituents for a library targeted to cathepsin D. The paper also contains an experimental validation of the method. Böhm [19] applied the LUDI program to the docking of two groups of fragments that can be connected pairwise in a single-step reaction, in order to search for new thrombin inhibitors. In principle, all programs for fragment-based de novo ligand design can be applied in a similar way to the R-group selection problem.

Finally, we mention two methods for de novo library design. Caflisch [20] applied the MCSS technique to generate fragment placements, which are subsequently connected. The DREAM++ software [21] combines tools for fragment placement and selection. The selection is performed such that only a small set of well-characterized organic reactions is needed to create the library.

Here we propose a new algorithm for the combinatorial docking problem based on the incremental construction algorithm implemented in FlexX [22-24]. The algorithm is part of a new combinatorial docking extension of FlexX called FlexXc. It is more flexible than previous approaches with respect to the variety of libraries that can be processed. In addition, R-groups are placed sequentially, so that dependencies between R-groups are taken into account. We applied FlexXc to three libraries with up to 20 000 molecules and compared the results with sequential docking calculations on the enumerated library.
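Why sequential placement can capture R-group dependencies is illustrated by the toy sketch below. It is not the FlexXc algorithm, whose details are not given in this section. It assumes a hypothetical per-fragment score, a hypothetical pairwise penalty between building blocks at different positions (for example a steric clash), lower scores being better, and a simple beam that keeps the best partial solutions after each R-group.

```python
def place_sequentially(rgroups, fragment_score, pair_penalty, beam=2):
    """Extend partial molecules one R-group at a time, keeping the `beam`
    best partial solutions. Each new block is scored in the context of the
    blocks already chosen, so dependencies between R-groups are included."""
    partial = [(0.0, ())]
    for blocks in rgroups:
        extended = []
        for score, chosen in partial:
            for b in blocks:
                penalty = sum(pair_penalty(prev, b) for prev in chosen)
                extended.append((score + fragment_score(b) + penalty, chosen + (b,)))
        extended.sort(key=lambda item: item[0])
        partial = extended[:beam]
    return partial

# Hypothetical data: a1 and b1 score best on their own but clash with each other.
frag = {"a1": -2.0, "a2": -1.0, "b1": -3.0, "b2": -1.0}
clash = {("a1", "b1"): 4.0}
result = place_sequentially(
    [["a1", "a2"], ["b1", "b2"]],
    fragment_score=frag.get,
    pair_penalty=lambda x, y: clash.get((x, y), 0.0),
)
print(result[0])  # (-4.0, ('a2', 'b1'))
```

Choosing the best block per position independently would combine a1 with b1, which scores only -1.0 once the clash is counted. Scoring each new block in the context of those already placed avoids this.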
