Structure and Modeling of Loop Regions

The seven transmembrane helices of a GPCR are connected by three intracellular loops (ICLs) and three extracellular loops (ECLs). The shorter loops ICL1, ICL2, and ECL1 are well resolved in the crystal structures and are amenable to homology modeling for most Class A GPCRs. However, modeling of the other loop regions (ICL3, ECL3, and ECL2) is very challenging due to their conformational flexibility, large length variation, and low sequence conservation among GPCRs. For example, ICL3, which is closely involved in G protein recognition and signal transduction, ranges in length from only 4-5 residues (e.g., in CXCR4) to more than 45 residues in the β-adrenergic receptor. The structure of ICL3 in rhodopsin (~12 residues) has been resolved, albeit with low electron density and high B-factors [64]. In the β2AR and AA2aR crystal structures, the ICL3 conformation remains unknown, as the loop is replaced by T4 lysozyme [14, 16]. However, the structure of the β1AR ICL3 loop has been reported by Schertler's group (Proceedings of the GPCR2009 Congress). The structures of ICL3 for bRho and β1AR provide a first glance at the potential structural diversity in this region; however, the biological relevance of these conformations will be better assessed in the context of co-crystallized G proteins.

The conformations of ECL3 and ECL2 are well resolved in the available crystal structures of bRho, β2AR, β1AR, and AA2aR. Nonetheless, these structures reveal great diversity among GPCRs in this region, particularly in ECL2, as discussed above (Fig. 15.4). Unlike the cytoplasmic loops, the ECLs are anticipated to play a major role in orthosteric small molecule binding, and ECL2 has been demonstrated to directly impact ligand specificity/selectivity for several GPCRs [72, 73]. As the results of a recent blind GPCR modeling assessment show [40], correct prediction of the ECL2-TM3 disulfide bond and of the Phe168 side-chain conformation in AA2aR is critical for accurate ligand placement in the model. The models that correctly predicted the Phe168 conformation not only predicted the highest number of ligand-receptor contacts [40] but also achieved improved performance in a large-scale VLS benchmark (V. Katritch, M. Rueda, P. Lam, M. Yeager, and R. Abagyan, unpublished data). Unlike the relatively conserved downstream part of ECL2, the part of ECL2 upstream of the conserved cysteine residue shows large variations in length, sequence, and secondary structure elements. These variations preclude template-based modeling of this region, leaving ab initio modeling as the only available option for most GPCRs.

In the context of small soluble proteins, ab initio loop modeling methods can be quite accurate (RMSD < ~1.0 Å) for short loops of four to six residues [74]. This indicates that several of the smaller GPCR loops may be well approximated using loop prediction methods. However, for longer loops, the accuracy rapidly diminishes. In a recent loop prediction benchmark, over 50% of medium-length loops (seven to nine residues) had RMSDs worse than 2 Å even when the lowest-RMSD conformation was selected for comparison [74]. For membrane proteins, further inaccuracies arise, as interactions among the receptor loops, receptor termini, transmembrane helical bundle, and phospholipid membrane may affect loop conformation and are frequently not considered during the modeling procedure. These errors may be greater for GPCR homology models, as the end-to-end distances of the helices are uncertain and may even be inappropriate for the target loop length. Recently, Mehler et al. developed a de novo loop prediction algorithm for membrane proteins that explicitly considers a loop in the context of the full protein and other predicted loop conformations. This procedure also attempts to minimize error by predicting an ensemble of compatible loop conformations rather than a single low-energy structure [75]. For shorter loops, such as ICL1 and ECL1, this method was found to be quite effective, though the performance on longer loops has yet to be extensively verified [76].
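The RMSD thresholds quoted above (roughly 1.0 Å for short loops, 2 Å for medium-length loops) can be made concrete with a minimal sketch. It assumes that the predicted and reference loops are already in a common frame (for example, after superimposing the rest of the receptor) and that atoms are listed in the same order; the function name `loop_rmsd` and the toy coordinates are illustrative only and are not taken from the cited benchmarks.

```python
import math


def loop_rmsd(predicted, reference):
    """Root-mean-square deviation (in Å) between two equally ordered
    lists of (x, y, z) atom coordinates.

    Assumption: both coordinate sets are already in the same frame;
    no superposition is performed here.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xp, yp, zp), (xr, yr, zr) in zip(predicted, reference):
        squared_sum += (xp - xr) ** 2 + (yp - yr) ** 2 + (zp - zr) ** 2
    return math.sqrt(squared_sum / len(predicted))


# Hypothetical three-atom loop fragment, with every predicted atom
# displaced by exactly 1 Å along x relative to the reference.
reference_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 1.0, 0.0)]
predicted_atoms = [(x + 1.0, y, z) for (x, y, z) in reference_atoms]

toy_rmsd = loop_rmsd(predicted_atoms, reference_atoms)
print(f"loop RMSD = {toy_rmsd:.2f} Å")
```

Because every atom in the toy example is shifted by the same 1 Å, the RMSD equals that shift. In a real comparison the per-atom deviations differ, and a few badly placed residues can dominate the value.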

Recently, "modeling by omission" has been highlighted as an important strategy for error reduction in computational models [30]. In this technique, areas of conformational uncertainty are eliminated to facilitate correct ligand docking and conformational prediction. This tactic follows from a similar strategy implemented in the recently validated Scan with Alanines and REfine (SCARE) algorithm, wherein pairs of side chains are systematically deleted to allow for induced fit docking [50]. Previous work with bRho found that retinal could be correctly docked following elimination of all receptor loops, as well as the N- and C-termini [77]. In a separate study, VLS results were compared for three GPCR homology models wherein ECL2 was either modeled de novo or deleted. In two of the three cases, the ECL2-deleted models outperformed the models with de novo loop conformations, leading the authors to suggest that ECL2 be eliminated in homology models except in circumstances where extensive experimental restraints are available [78]. We have also assessed the impact of this loop on β2AR docking by conducting VLS trials with ECL2-deleted versions of the β2AR crystal structure and of the agonist-bound TM5-shifted model described earlier in the chapter (Table 15.2). The absence of ECL2 did not affect the docked conformation of either (-)-carazolol with the crystal structure or (-)-isoproterenol with the agonist-bound model. Carazolol docked to the ECL2-truncated version of the receptor with an RMSD of 0.23 Å to the crystallized orientation, while the RMSD between isoproterenol docked to the loop-deleted and intact agonist-bound models was 0.16 Å. Additionally, omission of ECL2 had a relatively small impact on VLS performance (Table 15.2). Antagonist enrichment factors for the top-scoring 1% and 5% of the 1K test set are comparable in the presence and absence of ECL2, though the total yield of antagonists in the top-scoring 10% decreased from 80% to 66.7%. Enrichment factors were also slightly decreased for the agonist-bound model in VLS. Nonetheless, VLS in the absence of ECL2 provides high enrichment factors and hit rates, with enrichment factors of 50.9 (antagonists) and 31.8 (agonists) for the top-scoring 1% of the 1K test set. This indicates that omission of ECL2 from receptor homology models is a suitable alternative when: (1) mutagenesis data indicate minimal interactions between ligand and ECL2; and (2) experimental data are unable to provide accurate conformational information.
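To show how enrichment factors and yields of the kind reported above are typically derived from a ranked screening list, the following is a minimal sketch. It uses one common definition (hit rate in the top-scoring fraction divided by hit rate in the whole library) and assumes that lower docking scores are better; the chapter's exact formula and scoring convention are not stated in this section. The function names and the toy library of 1,000 compounds with 15 actives are hypothetical and do not reproduce the data in Table 15.2.

```python
def top_hits(scores, is_active, fraction):
    """Return (hits in the top-scoring fraction, size of that fraction).

    Assumption: lower score means a better-ranked compound.
    """
    if not scores or len(scores) != len(is_active):
        raise ValueError("scores and labels must be non-empty and of equal length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n_top = max(1, round(len(scores) * fraction))
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    hits = sum(1 for _, active in ranked[:n_top] if active)
    return hits, n_top


def enrichment_factor(scores, is_active, fraction):
    """Hit rate in the top fraction divided by hit rate in the full library."""
    total_actives = sum(1 for active in is_active if active)
    if total_actives == 0:
        raise ValueError("library contains no actives")
    hits, n_top = top_hits(scores, is_active, fraction)
    return (hits / n_top) / (total_actives / len(scores))


def active_yield(scores, is_active, fraction):
    """Share of all actives recovered within the top fraction."""
    total_actives = sum(1 for active in is_active if active)
    if total_actives == 0:
        raise ValueError("library contains no actives")
    hits, _ = top_hits(scores, is_active, fraction)
    return hits / total_actives


# Hypothetical library: compound i has score i, so rank equals index.
active_ranks = {0, 1, 2, 3, 4, 5, 6, 7, 20, 30, 100, 200, 300, 400, 500}
toy_scores = [float(i) for i in range(1000)]
toy_labels = [i in active_ranks for i in range(1000)]

ef_1 = enrichment_factor(toy_scores, toy_labels, 0.01)
ef_5 = enrichment_factor(toy_scores, toy_labels, 0.05)
yield_10 = active_yield(toy_scores, toy_labels, 0.10)
print(f"EF(1%) = {ef_1:.1f}, EF(5%) = {ef_5:.1f}, yield(10%) = {yield_10:.1%}")
```

Under this definition, the maximum attainable enrichment factor depends on the share of actives in the library, so values are only comparable between runs that use the same test set, as is the case for the ECL2-intact and ECL2-deleted screens compared here.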
