[Figure 9 plots: panels Thrombin, Trypsin, and Thermolysin (I); axis label pKi (calc.)]

Figure 9. Correlation of experimentally determined pKi values versus calculated ones by DrugScore for (a) a set of thrombin and trypsin inhibitors docked into thrombin (taken from [55]) and trypsin (taken from 1pph) and (b) a set of thermolysin inhibitors [56] docked into the protein structure 1tlp, respectively, by FlexX. In the latter case, only those ligands are displayed for which FlexX finds a reasonable geometry (Thermolysin(I)). Together with the ideal correlation line, deviations by ±1 log units are depicted.


Figure 10. Intrinsic geometrical constraints reflected by the atom pair preferences of O.2-O.3 and C.2-O.3. Given the minima of the statistical pair preferences (O.2-O.3: 2.55 Å; C.2-O.3: 3.45 Å) and the bond length (C.2-O.2: 1.22 Å), the C.2-O.2-O.3 angle is calculated to be 128°.

the squared correlation coefficient for scoring values calculated for the crystal geometries of this set amounts to 0.34. Thus, in the present case the affinity prediction based on computer-generated geometries is more precise than the one based on the crystal coordinates. However, we anticipate that this is not usually the case. The affinity predictions for the thrombin and trypsin inhibitors (sample (b), Figure 9) deviate from the experimental pKi values by 0.7 log units and yield a squared correlation coefficient of 0.56. The predictions are of the same quality for thrombin and trypsin. In the case of the thermolysin inhibitors (c), a squared correlation coefficient of 0.35 is calculated for the total set of 61 ligands ('Thermolysin') (data not shown). If the set is restricted to only those ligands for which FlexX predicts at least one geometry with less than 3 Å rmsd from modeled reference structures (using crystal geometries as templates for the modeling), the R² value increases to 0.42 (standard deviation: 1.68 log units) (Thermolysin(I), 43 cases, Figure 9). Considering the set of 15 thermolysin inhibitors (Thermolysin(II)), an R² value of 0.36 is calculated. Excluding ZGPLNH2 as an outlier from the latter set (Thermolysin(III)), a squared correlation coefficient of 0.50 (standard deviation: 1.39) is found.
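As a minimal sketch of how the two quality measures quoted above can be evaluated, the following Python snippet computes a squared Pearson correlation coefficient and a root-mean-square deviation between experimental and calculated pKi values. The function names and the four value pairs are purely illustrative placeholders and are not data from this study; reading the reported "standard deviation" as a root-mean-square deviation is likewise an assumption.

```python
import math

def squared_correlation(observed, predicted):
    """Squared Pearson correlation coefficient between two equally long series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)

def rms_deviation(observed, predicted):
    """Root-mean-square deviation between the series, in the units of the input (log units for pKi)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

# Placeholder values for illustration only (NOT taken from the data sets discussed here)
pki_experimental = [5.0, 6.0, 7.0, 8.0]
pki_calculated = [5.5, 5.5, 7.5, 7.5]

r_squared = squared_correlation(pki_experimental, pki_calculated)
deviation = rms_deviation(pki_experimental, pki_calculated)
print(f"R^2 = {r_squared:.2f}, deviation = {deviation:.2f} log units")
```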

Implicit directionality of spherically symmetric pair-potentials

Since the compiled preferences for a given atom pair are calculated solely as a function of the mutual distance, any directional features can only be implicitly contained in the derived pair-potentials. As an example, consider the hydrogen bond between a carbonyl group and an O.3-type oxygen (Figure 10). The most favorable interaction for O.2-O.3 occurs at a mutual distance of 2.55 Å, that for C.2-O.3 at 3.45 Å. Starting with these values and assuming a C.2-O.2 bond length of 1.22 Å, a C.2-O.2-O.3 angle of 128° is calculated, in good agreement with the expected bond angle. A similar orientational preference is observed if representative fragments stored in ISOSTAR are consulted [57]. Additional contact preferences formed by the neighboring atoms will further constrain the spatial arrangement of a specific directional interaction.
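The angle quoted above follows from the law of cosines applied to the triangle spanned by C.2, O.2, and O.3. A short Python sketch of that arithmetic, using only the three distances given in the text (the function and constant names are illustrative), is:

```python
import math

def enclosed_angle(side_a, side_b, opposite_side):
    """Angle in degrees between side_a and side_b of a triangle (law of cosines)."""
    cos_theta = (side_a ** 2 + side_b ** 2 - opposite_side ** 2) / (2.0 * side_a * side_b)
    return math.degrees(math.acos(cos_theta))

# Distances in Å as stated in the text
C2_O2_BOND = 1.22      # carbonyl bond length
O2_O3_MINIMUM = 2.55   # minimum of the O.2-O.3 pair preference
C2_O3_MINIMUM = 3.45   # minimum of the C.2-O.3 pair preference

# The angle at O.2 is enclosed by the C.2-O.2 and O.2-O.3 sides, opposite to C.2-O.3
angle_c2_o2_o3 = enclosed_angle(C2_O2_BOND, O2_O3_MINIMUM, C2_O3_MINIMUM)
print(f"C.2-O.2-O.3 angle: {angle_c2_o2_o3:.1f} degrees")
```

With these inputs the result lands within about one degree of the 128° quoted above; the small difference depends on how the distances were rounded.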

Visualization of 'hot spots' in protein binding pockets

A regularly spaced grid is generated inside the binding site and scoring values are calculated at every grid point using different ligand-atom types. The results are contoured individually for each atom type.
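The grid procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_pair_potential` is a hypothetical stand-in with a single minimum and is not a DrugScore potential, a single potential is used for all protein atoms although the real potentials depend on both atom types, and the contour level is interpreted as "all grid points whose value lies within a given fraction of the global minimum", matching the 10% level used for the examples in this section.

```python
import itertools
import math

def make_grid(lower, upper, spacing):
    """Regularly spaced grid points inside an axis-aligned box."""
    axes = []
    for lo, hi in zip(lower, upper):
        n = int(math.floor((hi - lo) / spacing + 1e-9)) + 1
        axes.append([lo + i * spacing for i in range(n)])
    return list(itertools.product(*axes))

def toy_pair_potential(distance):
    """Hypothetical preference with a minimum of -1.0 at 3.0 Å (NOT a DrugScore potential)."""
    return (distance - 3.0) ** 2 - 1.0 if distance < 4.0 else 0.0

def grid_scores(points, protein_atoms, potential, cutoff=6.0):
    """Score of one probe atom type at every grid point: sum over protein atoms within the cutoff."""
    scores = []
    for point in points:
        total = 0.0
        for atom in protein_atoms:
            distance = math.dist(point, atom)
            if distance <= cutoff:
                total += potential(distance)
        scores.append(total)
    return scores

def contour_points(points, scores, level=0.10):
    """Grid points whose score lies within `level` (a fraction) of the global minimum."""
    minimum = min(scores)
    threshold = minimum + level * abs(minimum)
    return [p for p, s in zip(points, scores) if s <= threshold]

# Illustrative setup: one protein atom at the origin, 0.5 Å grid in an 8 Å box
points = make_grid((-4.0, -4.0, -4.0), (4.0, 4.0, 4.0), 0.5)
scores = grid_scores(points, [(0.0, 0.0, 0.0)], toy_pair_potential)
hot_spots = contour_points(points, scores, level=0.10)
print(f"{len(hot_spots)} of {len(points)} grid points lie inside the 10% contour")
```

In practice one such set of scores would be computed and contoured separately for each probe atom type.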

Arabinose-binding protein (1abe) binds both epimers of arabinose preferentially to other sugar ligands [53]. The isocontour surface for C.3 (comprising values 10% above the minimum) encompasses all ligand-atom positions of this type as found in the crystal structure (Figure 11). The O.3 contour (10% level) depicts three favorable regions in space; all are occupied by hydroxyl groups of the ligand in the crystal structure. Interestingly, the contour for oxygen O-1 extends over the range where the oxygens of the α- and β-epimers actually bind. O-2 and O-5 do not coincide with regions contoured as most favorable. For O-2, this can be explained by its orientation towards the solvent, where no favorable interactions with the protein can be determined.

Contours encompassing scoring values calculated for C.3, O.3, and N.am (10% level) inside the binding pocket of thermolysin (5tmn) are displayed together with the phosphorus-containing analogue (ZGPLL) of the peptide carbobenzoxy-Gly-Leu-Leu (Figure 11). For the phosphonamidate, Bartlett and Marlowe [58] reported a Ki value of 9.1 nM, while the phosphonate isomer (substitution of P-NH by P-O: ZGP(O)LL) resulted in 990 times weaker binding. In contrast, for the phosphinate analogue (substitution of P-NH by P-CH2: ZGP(C)LL) an inhibition constant of 180 nM is reported [59], only 20 times weaker than the phosphonamidate. These findings were attributed mainly to solvation effects and the potential to form a hydrogen bond between the ligand atom adjacent to phosphorus and the carbonyl oxygen of Ala 113. Interestingly, DrugScore highlights favorable binding both for an N.am and a C.3 atom at the position of the ligand atom adjacent to P. O.3 is not favorable at this position. However, for this atom type, promising regions coincide with the terminal oxygens of the -PO2⁻- moiety binding to zinc.
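To relate the inhibition constants quoted above to the pKi scale used elsewhere in this section, the following Python sketch converts them with pKi = -log10(Ki). The Ki of the phosphonate is not stated directly; it is derived here from the quoted 990-fold weakening and should be read as an estimate. All names are illustrative.

```python
import math

def pki_from_ki_nanomolar(ki_nm):
    """pKi = -log10(Ki in mol/L); Ki is supplied in nM."""
    return -math.log10(ki_nm * 1e-9)

KI_PHOSPHONAMIDATE_NM = 9.1       # ZGPLL, from the text
KI_PHOSPHINATE_NM = 180.0         # ZGP(C)LL, from the text
PHOSPHONATE_FOLD_WEAKER = 990.0   # ZGP(O)LL relative to ZGPLL, from the text

# Derived estimate (assumption): Ki of the phosphonate from the quoted fold change
ki_phosphonate_nm = KI_PHOSPHONAMIDATE_NM * PHOSPHONATE_FOLD_WEAKER

pki_values = {
    "ZGPLL": pki_from_ki_nanomolar(KI_PHOSPHONAMIDATE_NM),
    "ZGP(C)LL": pki_from_ki_nanomolar(KI_PHOSPHINATE_NM),
    "ZGP(O)LL": pki_from_ki_nanomolar(ki_phosphonate_nm),
}
phosphinate_fold_weaker = KI_PHOSPHINATE_NM / KI_PHOSPHONAMIDATE_NM

for name, pki in pki_values.items():
    print(f"{name}: pKi = {pki:.2f}")
print(f"Phosphinate is weaker by a factor of {phosphinate_fold_weaker:.1f}")
```

On this scale the phosphinate loses roughly 1.3 log units relative to the phosphonamidate, whereas the phosphonate loses about 3 log units.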

The contours described are displayed at a 10% level for each atom type, thus resulting in a relative description. However, a per-atom contribution to the total score using C.3, O.3, and N.am, respectively, yields an absolute affinity description. The values for C.3 (-11.11) and N.am (-9.33) at the positions adjacent to P deviate by only 15%; however, that of an O.3 (-4.34) at the same position contributes much less to binding affinity. As a consequence, the phosphonamidate ZGPLL (5tmn) and the phosphinate ZGP(C)LL (modeled from 5tmn by N.am/C.3 exchange) obtain comparable total scores, while the phosphonate ZGP(O)LL (6tmn) is predicted to bind weakest. The rank ordering is qualitatively predicted correctly; however, the phosphonate is still predicted too high in affinity (vide supra). Presumably, this can be attributed to the simultaneous consideration of hydroxyl and ether oxygens in the general O.3 atom type used in our compilation of the statistical potentials. Accordingly, in our potentials, the unfavorable ester-oxygen to carbonyl-oxygen interactions are outweighed by other, more favorable O.3/O.2 contacts.

Figure 11. Isocontour surfaces encompassing values 10% above the global minimum of all scoring values on a 0.5 Å spaced grid for several atom types are depicted. Grid values are calculated by DrugScore inside the binding pockets of 1abe (top) and 5tmn (bottom) for different ligand probe atoms. In the case of 1abe, the α- and β-epimers of arabinose are shown. The surfaces are color-coded as: dark blue (sp3-hybridized oxygen), yellow (aliphatic carbon) [1abe] and cyan (sp3-hybridized oxygen), yellow (amide nitrogen), magenta (aliphatic carbon) [5tmn]. For 5tmn, the arrow indicates the phosphonamidate nitrogen of ZGPLL, which is substituted by oxygen in the phosphonate ZGP(O)LL and by carbon in the phosphinate ZGP(C)LL.

Correspondence of 'hot spots' and observed ligand atom types

In order to determine how often 'hot spots' detected by DrugScore actually match with ligand atom types, 159 crystallographically determined protein-ligand complexes were analyzed. In principle, DrugScore can be computed analytically at every position of the binding site, accordingly also at the crystallographically determined positions of the ligands under investigation. However, to use a more general approach, in particular with respect to the analysis of unoccupied binding sites, we estimate the scoring by extrapolating from precalculated values assigned to the intersections of a 0.5 Å grid.
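The text does not state which scheme is used to estimate a score between grid intersections. One common choice, assumed here purely for illustration, is trilinear interpolation from the eight surrounding grid nodes. The Python sketch below uses hypothetical names and a synthetic test grid rather than real scoring values.

```python
import math

def trilinear_interpolate(grid, origin, spacing, point):
    """Estimate a value at `point` from the eight surrounding grid nodes.

    grid[i][j][k] holds the precalculated value at origin + spacing * (i, j, k).
    The point must lie inside the grid; otherwise a ValueError is raised.
    """
    shape = (len(grid), len(grid[0]), len(grid[0][0]))
    indices, fractions = [], []
    for coord, start, n in zip(point, origin, shape):
        t = (coord - start) / spacing
        if t < 0.0 or t > n - 1:
            raise ValueError("point lies outside the grid")
        i = min(int(math.floor(t)), n - 2)  # keep the upper neighbor inside the grid
        indices.append(i)
        fractions.append(t - i)
    (i, j, k), (fx, fy, fz) = indices, fractions
    value = 0.0
    for di, wx in ((0, 1.0 - fx), (1, fx)):
        for dj, wy in ((0, 1.0 - fy), (1, fy)):
            for dk, wz in ((0, 1.0 - fz), (1, fz)):
                value += wx * wy * wz * grid[i + di][j + dj][k + dk]
    return value

# Synthetic 3 x 3 x 3 grid with 0.5 Å spacing, filled with f(x, y, z) = x + 2y + 3z
SPACING = 0.5
test_grid = [[[SPACING * (i + 2 * j + 3 * k) for k in range(3)]
              for j in range(3)] for i in range(3)]

estimate = trilinear_interpolate(test_grid, (0.0, 0.0, 0.0), SPACING, (0.3, 0.7, 0.2))
print(f"interpolated value: {estimate:.3f}")
```

Because trilinear interpolation reproduces linear functions exactly, the synthetic grid offers a simple check of the implementation before it is applied to real grid values.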

In a first step, we focus on fully buried ligand atoms only and test how frequently these atoms fall next to local minima in DrugScore for the five atom types C.3, O.3, O.2, O.co2, and N.3. By selecting these atom types, we intended to consider a small set of typical representatives for distinct types of interactions (hydrophobic, H-bond donors/acceptors, positively/negatively charged). In addition, this choice is close to the one used for the validation of Superstar [38] and allows a direct comparison (vide infra). The results are summarized in Table 3.

Assuming equal weights for all five atom types and full coincidence of ligand atomic positions with local minima of DrugScore, in the worst case a chance prediction of 20% would result. However, significantly higher prediction rates are observed. In particular, aliphatic carbon and amino nitrogen atoms (also implicitly including ammonium groups) are correctly predicted in 92 and 73% of all cases, respectively, while a carbonyl oxygen atom is recognized in only 27% of the cases. For O.3 and O.co2 types, the rates amount to 37 and 46%, respectively. In total, an overall prediction rate of 74% is achieved. Grouping similar atom types together ('hydrophobic': C.3; 'hydrophilic': O.3, O.2, O.co2, N.3), in 86% of all cases the correct or a similar atom type is recognized. Assuming that most of the N.3-type atoms (amino nitrogen) are protonated in proteins, atom types can be grouped according to a possible interaction type of the atom under investigation (C.3 (hydrophobic); O.3 (donor + acceptor); O.2, O.co2 (only acceptor); N.3 (only donor)). With this classification, in 85% of all cases the correct interaction type is predicted.

Table 3. Statistics on the prediction rates of buried ligand atom types suggested by DrugScore compared to atom types observed in crystallographically determined protein-ligand complexes at the same spatial positions

| Actual (a) | No. | Predicted C.3 (a) | Predicted O.3 (a) | Predicted O.2 (a) | Predicted O.co2 (a) | Predicted N.3 (a) | Correct | H.phob./h.phil. (b) | Sim. interact. (c) |
|---|---|---|---|---|---|---|---|---|---|
| C.3 | 745 | 92% | 3% | <1% | <1% | 4% | | 92% | 92% |
| O.3 | 168 | 40% | 37% | 5% | 7% | 11% | | 60% | 60% |
| O.2 | 124 | 18% | 26% | 27% | 23% | 6% | | 82% | 76% |
| O.co2 | 67 | 6% | 27% | 17% | 44% | 6% | | 94% | 88% |
| N.3 | 15 | 20% | 7% | 0% | 0% | 73% | | 80% | 80% |
| Overall | 1119 | | | | | | 74% | 86% | 85% |

(a) Atom types are according to the SYBYL notation: aliphatic carbon (C.3), sp3-hybridized oxygen (O.3), carbonyl-oxygen (O.2), carboxyl(ate)-oxygen (O.co2), (protonated) amino-nitrogen (N.3).

(b) Atoms are grouped separating hydrophobic - hydrophilic properties: C.3 versus O.3/O.2/O.co2/N.3.

(c) Atoms are grouped showing similar type of interaction (N.3 is considered to be protonated): C.3 (hydrophobic); O.3, N.3 (hydrogen-bond donors); O.3, O.2, O.co2 (hydrogen-bond acceptors).
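The overall rates can be understood as averages of the per-type rates weighted by the number of atoms of each type. The Python sketch below pools the rounded percentages listed in Table 3; the variable and function names are illustrative, and the per-type "correct" rate is taken as the diagonal entry of the table (actual type equals predicted type). Since the inputs are rounded to whole percent, the pooled values can only be expected to match the reported overall rates to within about one percentage point.

```python
# Per actual atom type: (number of atoms, correct %, hydrophobic/hydrophilic %, similar interaction %)
TABLE_3 = {
    "C.3":   (745, 92, 92, 92),
    "O.3":   (168, 37, 60, 60),
    "O.2":   (124, 27, 82, 76),
    "O.co2": (67, 44, 94, 88),
    "N.3":   (15, 73, 80, 80),
}

def pooled_rate(rows, column):
    """Atom-count-weighted average (in %) of the per-type rates in the given column."""
    total_atoms = sum(row[0] for row in rows.values())
    hits = sum(row[0] * row[column] / 100.0 for row in rows.values())
    return 100.0 * hits / total_atoms

total_atoms = sum(row[0] for row in TABLE_3.values())
overall_correct = pooled_rate(TABLE_3, 1)
overall_hydrophobicity = pooled_rate(TABLE_3, 2)
overall_interaction = pooled_rate(TABLE_3, 3)

# Chance level when one of five equally weighted atom types is picked at random
chance_rate = 100.0 / len(TABLE_3)

print(f"atoms: {total_atoms}, chance: {chance_rate:.0f}%")
print(f"correct: {overall_correct:.1f}%, hydrophobic/hydrophilic: {overall_hydrophobicity:.1f}%, "
      f"similar interaction: {overall_interaction:.1f}%")
```

The strong weight of the 745 aliphatic carbon atoms explains why the overall rate of correct predictions is much higher than the rates for the individual oxygen types.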

The recently described tool Superstar [38] only considers four distinct probes (aliphatic carbon; hydroxyl and carbonyl oxygen; protonated amino nitrogen) for a similar validation study. Following the arguments described above, a chance prediction of 25% has to be assumed. For a test set of 122 protein-ligand complexes, Superstar detects the correct atom type in 82% of the cases if only solvent-inaccessible ligand atoms are considered. This figure increases to 90% if 'similar' atom types are admitted. We included the carboxylate oxygen to consider an atom type most likely bearing a negative charge. It serves as the counterpart to an amino nitrogen, which is most likely positively charged. Indeed, if our approach suggests a carboxylate oxygen as the most favorable type, an amino nitrogen is the atom type found least frequently at this position by experiment. The same holds for the reverse.
