Virtual screening (VS, also referred to as molecular database screening or in silico screening) is the process of reducing a library containing an unmanageable number of compounds (available or virtual) to a limited number of potentially promising compounds for the target (or target family) of interest by means of computational techniques [1]. The rapid advance of new technologies such as combinatorial chemistry [2] and high-throughput screening [3] offers the possibility of synthesizing and testing hundreds of thousands of compounds. However, as the pressure on pharmaceutical companies to deliver more targets to the lead discovery pipeline grows, VS will increasingly become a valuable strategy for prioritizing compounds for screening. This should strike an optimal balance between retaining the possibility of screening every single target and keeping time, cost, and waste of compounds at a reasonable level.

* To whom correspondence should be addressed. E-mail: j [email protected]

** Present address: Vertex Pharmaceuticals, 88 Milton Park, Abingdon, Oxon OX14 4RY, U.K.

The recent advances and developments in library design and automated computational methods for VS [4-6] have broadened its applicability to the different stages of a pharmaceutical project, namely, hit discovery, hit optimization, and lead optimization. At the hit discovery level, VS can be applied to general libraries of diverse available compounds in the search for compounds containing alternative scaffolds with the shape and chemical characteristics appropriate for the target of interest. At this stage, the activity criterion is usually not very stringent and a compound active in the micromolar range can usually be considered a 'hit'. Once hits are identified and confirmed, a hit optimization program starts. At this level, VS can be applied to more focused libraries of available or virtual compounds built around each of the selected hits. If the activity of an original hit is improved to the sub-micromolar range, that hit becomes a 'lead'. Difficulties during the optimization of a hit may foreshadow further difficulties during lead optimization. Therefore, it is important to have several hit choices at this level, as some of the hits identified at the discovery level will never be promoted to leads for strategic reasons. Finally, in a lead optimization program, VS can be applied to more targeted libraries aiming at further improving not only the activity of the compound but also its selectivity, toxicity and ADME properties [7]. As the selection criteria for compounds are narrowed when going from hit discovery to lead optimization, the efficiency of VS will depend more on the amount and type of structural information available.

With the introduction of biophysical methods in drug discovery [8] during the 1980s it became possible to determine the binding modes of ligands in the context of a protein binding site at atomic resolution. This made it extremely attractive to use the shape and chemical composition of the binding site for searching databases for complementary molecules. On this basis, putative ligands can be potentially retrieved from databases of (virtual) molecules [9] or designed de novo to fit the protein binding site [10]. Since then, the relevance of docking methods for, on one hand, identifying molecules from a database that optimally fit into the protein active site and, on the other hand, proposing a binding mode for those molecules, has been widely established [11-20].

Although docking molecules to the 3D structure of their biological targets provides, in principle, the most accurate filter to identify and improve lead compounds, docking protocols suffer from a number of limitations. First of all, the conformational and orientational space to be searched for each potential ligand and its receptor is vast [21,22]. Any attempt to reduce it has consequences for the breadth and accuracy of the search. Secondly, the thermodynamic and quantum-mechanical descriptions of protein-ligand interactions are daunting and currently still beyond a definite and exact theoretical description [23-27]. This is, for instance, reflected in the painful absence of reliable and broadly applicable scoring functions for protein-ligand interactions [28-31]. Finally, the receptor structure is often treated as a constant [32], while in fact protein structures are known to adapt themselves to varying extents to the ligand presented to them [33-36]. Due to the sharp increase in van der Waals repulsion at short distances, small changes in the receptor structure can drastically change the course of a docking simulation. Clearly, attempting to dock a ligand to a non-complementary protein conformation defies the lock-and-key paradigm that underlies the concept of docking. In conclusion, despite its proven usefulness, molecular docking has not yet achieved the degree of accuracy that may have been expected when it was originally conceived for situations where structures of the protein or a protein-ligand complex are known in atomic detail [37-42].
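The sensitivity to small structural changes mentioned above can be illustrated numerically. The sketch below assumes a generic 12-6 Lennard-Jones term as a stand-in for the van der Waals contribution; the text does not specify a functional form, and the well depth and optimal distance used here are illustrative values, not parameters from any particular force field.

```python
def lennard_jones(r, epsilon=0.1, r_min=3.8):
    """12-6 Lennard-Jones energy for one atom pair.

    r        -- interatomic distance (angstrom)
    epsilon  -- well depth (kcal/mol); illustrative value, an assumption
    r_min    -- distance of the energy minimum (angstrom); also an assumption
    """
    ratio6 = (r_min / r) ** 6
    return epsilon * (ratio6 * ratio6 - 2.0 * ratio6)


if __name__ == "__main__":
    # Shrinking the contact distance in small steps shows how quickly
    # a favourable contact turns into a strongly repulsive one.
    for r in (3.8, 3.4, 3.0, 2.6):
        print(f"r = {r:.1f} A  ->  E = {lennard_jones(r):8.3f} kcal/mol")
```

With these assumed parameters the energy equals the negative well depth at the optimal distance and becomes positive once the pair is compressed by roughly ten percent or more, which is why a side chain displaced by a fraction of an angstrom in a rigid receptor model can turn an acceptable pose into a rejected one.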

Central to the application of computational techniques in medicinal chemistry is the similarity-property principle, which states that structurally related molecules should demonstrate similar biological activities [43,44]. The derivation of classical QSAR [45,46] and more recently 3D-QSAR methods [47-51] as well as the concept of molecular similarity and diversity for designing virtual libraries [52-57] rely on this assumption. Since in many cases only structural information is available on active ligands at the onset of a project, many computational approaches have focused on relating observed activities with the 2D or 3D structural and physico-chemical properties of the ligands. Some methods aim at evaluating molecular descriptors [58-60] or identifying the presence or absence of certain chemical groups (fingerprints) [61,62]. Other methods search for substructure sets [63,64], geometric patterns or pharmacophores [65-67] within sets of active and inactive molecules. Finally, several methods based on the steric, electrostatic, hydrophobic and/or hydrogen-bonding surface properties of active molecules have also been developed [68-78]. In the last two categories, the alignment of molecules, in a way that is of relevance to their mode of action, represents a fundamental and challenging issue for the entire similarity approach [79].
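As a concrete illustration of fingerprint-based similarity, the sketch below ranks a small library against a known active using the Tanimoto coefficient. The Tanimoto coefficient is a common choice for comparing presence/absence fingerprints but is not named in the text, so its use here is an assumption; the fingerprints are hypothetical sets of 'on' bit indices, and the compound names are invented for the example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints.

    Each fingerprint is an iterable of the indices of the bits that are set,
    i.e. the chemical groups judged present in the molecule.
    """
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints: define similarity as zero
    return len(a & b) / union


def rank_by_similarity(query_fp, library):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    known_active = {1, 2, 3, 8}          # hypothetical fingerprint
    library = {                          # hypothetical compounds
        "cpd_A": {1, 2, 3, 8},
        "cpd_B": {2, 3, 4},
        "cpd_C": {5, 6, 7},
    }
    for name, score in rank_by_similarity(known_active, library):
        print(f"{name}: {score:.2f}")
```

Under the similarity-property principle, the compounds at the top of such a ranking would be prioritized for testing. Note that a measure of this kind is alignment-free, so it does not address the alignment issue that affects pharmacophore-based and surface-based methods.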

Even though protocols for 3D alignment of molecules may yield 'perfect' alignments, according to the definition of similarity being used, there is no guarantee that the alignment obtained has relevance to the true binding modes of these molecules when complexed with their biological receptor. This presents an intrinsic caveat to the approach and indeed, cases are known where similar ligands display radically different binding modes [79-81]. Even more subtle deviations from the true binding mode, suggested by rigorously superimposing similar molecules, may affect the predictive power of models based on such superpositions [82].

Despite their limitations, both approaches have developed into practical and useful tools for the computational chemist working in the pharmaceutical industry. The principles of molecular similarity and chemical complementarity to the receptor have recently been implemented in computational tools for virtual screening of compound libraries. In practice, however, both methods are often used separately. When the structure of the receptor is unknown, active ligands are used to understand and predict the activities of novel compounds. If crystal structures are available for the system at hand, structure-based design would be the method of choice [83]. Given the strengths and limitations of both methodologies, however, a combination of ligand-based and receptor-based approaches could be used to improve the results obtained with either of them. The alignment of ligands may be improved by including knowledge on the binding site (vide infra) and docking may be performed more effectively if similarity with known ligand binding modes is enforced [20]. Although molecular docking has always been aimed at searching for all possible binding modes of novel ligands, even if they are very different from previously observed binding modes, a complete search of the binding site may not be necessary in most practical applications. On the basis of currently available crystal structures one can conclude that diverse inhibitors in most instances occupy the same regions within the binding site, and in structure-based focusing of combinatorial libraries the use of a rigid and pre-placed scaffold greatly simplifies the docking of analogues [16]. Finally, large discrepancies between the results obtained with the two methods may either identify compounds that deviate from the known SAR by manifesting new binding modes or call into question the applicability of one of the methods to the system under study.
In order to investigate some of these critical issues, the results obtained from two implementations of similarity and docking methods in a 3D virtual screening test case will be compared.
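The idea of using disagreement between the two methods as a diagnostic can be sketched as a simple rank comparison. This is a minimal illustration and not the protocol used in this work: it assumes each compound has a docking score (lower is better) and a similarity score (higher is better), and the function names, the score conventions, and the `min_gap` threshold are all hypothetical.

```python
def to_ranks(scores, higher_is_better):
    """Map each compound name to its 1-based rank under the given scores."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {name: position + 1 for position, name in enumerate(ordered)}


def rank_discrepancies(docking_scores, similarity_scores, min_gap):
    """List compounds whose docking and similarity ranks differ by >= min_gap.

    Assumed conventions: docking scores are energy-like (lower is better),
    similarity scores lie on a scale where higher is better.
    Only compounds scored by both methods are compared.
    """
    common = docking_scores.keys() & similarity_scores.keys()
    dock_rank = to_ranks({n: docking_scores[n] for n in common}, False)
    sim_rank = to_ranks({n: similarity_scores[n] for n in common}, True)
    gaps = {n: abs(dock_rank[n] - sim_rank[n]) for n in common}
    flagged = [n for n in common if gaps[n] >= min_gap]
    # Largest disagreement first; names break ties for a stable order.
    return sorted(flagged, key=lambda n: (-gaps[n], n))


if __name__ == "__main__":
    docking = {"a": -9.0, "b": -7.0, "c": -5.0, "d": -3.0}   # invented scores
    similarity = {"a": 0.1, "b": 0.8, "c": 0.7, "d": 0.9}    # invented scores
    print(rank_discrepancies(docking, similarity, min_gap=2))
```

Compounds flagged this way are candidates for closer inspection: they may bind in a mode not represented among the known ligands, or they may expose a weakness of one of the two scoring schemes for the system at hand.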
