## Distance dependence of reference states

Equations 5-13 show that the radius R of the reference sphere affects the derived potentials. Figure 1 illustrates that part of this effect comes from the implicit introduction of solvent effects into the model. We are particularly interested in finding the minimum radius of the reference sphere that is sufficient to account implicitly for solvent contributions. Below that radius, a ligand atom would be buried in the protein matrix such that it no longer feels the effect of the solvent, including cooperative effects due to other ligand atoms that may be more exposed to the solvent. In addition, we would like to know the optimal reference sphere radius, that is, the one that produces the best correlation between score and experiment.

These questions are addressed here by studying the four sets of protein-ligand complexes and predicting their binding affinities using PMFs derived with reference sphere radii between 6 and 12 Å. To compare the scoring results for different reference sphere radii on an equal footing, we use a scoring cutoff radius of 6 Å for all protein-ligand interactions. Note that this differs from the optimal cutoff scheme found by Muegge and Martin, which uses a 6 Å cutoff for carbon-carbon interactions and a 9 Å cutoff for all other interactions.

For PMF/Ref1, Figure 8 shows the squared correlation coefficient R² between calculated score and measured log K as a function of the radius of the reference sphere. The test sets of diverse protein-ligand complexes (sets 3 and 4) show plateaus with statistically significant correlation between binding affinity and PMF score at reference sphere radii between 8 and 12 Å. Smaller radii lead to insignificant correlation between score and measured binding constants. For the reference state derived with a 6 Å radius, R² is only 0.36 for set 3 and 0.32 for set 4. The test sets of special protein classes (sets 1 and 2) also show a continuous decline in R² with decreasing reference sphere radius below 9 Å.
Set 1 shows a smaller but still significant decrease in correlation with decreasing radius. Set 2 seems to be optimal between 7 and 8 Å; towards smaller radii, its R² also decreases. However, the significantly better scoring of set 2 between 6.5 and 9 Å compared to 10-12 Å is mostly due to the outlier 1mnc. Removing the outlier from set 2 improves the correlation at all distances, especially at distances above 9 Å. It is interesting to note that the optimal reference sphere radius for all four test cases (using set 2a instead of 2) was found to be 9 Å. This finding is independent of the cutoff scheme used. It has been reproduced for the original 6/9 Å cutoff scheme proposed earlier [27] for reference spheres with radii of >9 Å (data not shown).
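The radius scan described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used to produce Figure 8: the function names (`r_squared`, `scan_radii`, `best_radius`) are hypothetical, R² is taken to be the squared Pearson correlation coefficient, and the numbers in the usage part are synthetic placeholders rather than scores or binding constants from any of the test sets.

```python
def r_squared(scores, log_k):
    """Squared Pearson correlation between scores and measured log K values."""
    n = len(scores)
    if n != len(log_k) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_s = sum(scores) / n
    mean_k = sum(log_k) / n
    cov = sum((s - mean_s) * (k - mean_k) for s, k in zip(scores, log_k))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_k = sum((k - mean_k) ** 2 for k in log_k)
    if var_s == 0 or var_k == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov * cov / (var_s * var_k)


def scan_radii(scores_by_radius, log_k):
    """Map each reference sphere radius to the R^2 of its scores against log K.

    scores_by_radius: {radius: [score for each complex]} (hypothetical layout),
    where every score list is ordered like log_k.
    """
    return {radius: r_squared(scores, log_k)
            for radius, scores in scores_by_radius.items()}


def best_radius(r2_by_radius):
    """Radius with the highest R^2 in the scan."""
    return max(r2_by_radius, key=r2_by_radius.get)


# Synthetic placeholder data (NOT taken from the paper): four complexes,
# scored with PMFs derived for two reference sphere radii.
toy_log_k = [1.0, 2.0, 3.0, 4.0]
toy_scores = {
    6.0: [4.0, 1.0, 3.0, 2.0],      # weakly related to log K
    9.0: [-1.0, -2.0, -3.0, -4.0],  # perfectly (anti)correlated with log K
}
toy_r2 = scan_radii(toy_scores, toy_log_k)
print(toy_r2, best_radius(toy_r2))
```

Because R² is the square of the correlation coefficient, the sign convention of the score (more negative for tighter binding) does not affect the comparison between radii.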

The results suggest that a radius of 7-8 Å is sufficient to capture most of the solvation effects in a PMF scoring function. A reference sphere with a 6 Å radius leads to significantly worse correlation between binding affinities and calculated scores. This result is consistent with the good correlation recently reported by Mitchell et al. for a PMF score on a test set of 90 protein-ligand complexes from the PDB [41]. Their scoring function (BLEEP) uses a reference sphere of 8 Å to derive the PMF. As with the findings of Muegge and Martin [27], additional solvation terms were found not to improve the scoring.

The finding that Mitchell's scoring function performs similarly to, but not quite as well as, the PMF score may be attributed to the fact that it lacks the volume correction factor that has been introduced for PMF scoring [27]. In addition, Mitchell et al. may use a reference state that is similar to Ref2. The effect of the ligand volume correction factor needs to be studied further but is beyond the scope of this work. PMF scoring functions that use a reference sphere radius of 6 Å, such as those reported by Verkhivker et al. [24] and Gohlke et al. [28], include additional solvation terms in their models in order to obtain good correlation with experimental binding affinities. For instance, the PMF term of Verkhivker's scoring function alone does not show any correlation with the binding affinities of the corresponding test set. This is again consistent with our finding that a radius of 6 Å leads to a sub-optimal correlation between the PMF score and measured binding affinities.
