Motifs and domains

The literature on proteins that interact with GPCRs, and on the extensive role of GPCRs in numerous signalling pathways, is ever growing [for a recent review see Marinissen and Gutkind (2001) and chapters 5, 6, and 7]. This section gives an overview of resources that allow: (i) identification of motifs that characterize a sequence as a member of a specific GPCR subfamily, (ii) identification of motifs that are not specific to the GPCR superfamily but are additional features of the extracellular and intracellular domains of GPCRs, and (iii) identification of motifs that lead to post-translational events or affect cell signalling and cell trafficking.

With respect to the last category, interactions with G protein-coupled receptor kinases or arrestins are not covered here, although this is an active research area (Oakley et al. 2001).

(i) While it is clear that a new receptor can be classified into a subfamily with some confidence from initial pairwise sequence searches (using BLAST, for example), it is necessary to analyse sequences at a more specific level. Thus, GPCR fingerprints (searchable in PRINTS: http://www.bioinf.man.ac.uk/dbbrowser/PRINTS/) were developed, which allow regions of sequence to be characterized in a more fine-grained way, for example, by comparison at the level of the individual TMDs (Attwood et al. 2000). Contrast this with other profile databases, for example, PROSITE (http://www.expasy.ch/prosite/) and the more sophisticated PFAM (http://sanger.ac.uk/Pfam/) (Bateman et al. 1999), which provide sequence information at the higher GPCR family level. InterPro (http://www.ebi.ac.uk/interpro/scan.html) (Apweiler et al. 2001) is an integrated documentation resource of functional descriptions and literature for protein families, domains and functional sites, developed as a means of rationalizing many complementary efforts, such as those quoted above. It allows a user to compare the results of all these methods visually. iProClass (http://pir.georgetown.edu/iproclass/) is another integrated resource, based around PIR and SWISSPROT, that provides comprehensive family relationships and structural and functional features of proteins (Wu et al. 2001).

(ii) In addition to the domain search databases mentioned (PRINTS, PFAM, PROSITE, InterPro) there are other databases that allow the identification of domains in the N and C termini of GPCRs which are not specific to the GPCR superfamily. SMART (http://smart.embl-heidelberg.de/), a simple modular architecture research tool (Schultz et al. 1998), contains extensively annotated data on signalling domains, classified by extracellular and intracellular location, as alignments, profiles and hidden Markov models. This is a coordinated source of literature and sequence information. ProDom (http://www.toulouse.inra.fr/prodom.html) (Corpet et al. 2000) derives alignments based on PSI-BLAST. Each resource has its own style, and their inter-relatedness is evident from the way they mostly link to one another. A feature of some members of Type III GPCRs is the coiled coil motif found in the cytoplasmic C termini. These motifs are known to be important for the GABAB receptors and are one of the reasons why the known subtypes, GABABR1 and GABABR2, heterodimerize. Indeed, this feature was key to the yeast two-hybrid experiments that led to the discovery of the second subtype (Kammerer et al. 1999; Kuner et al. 1999; White et al. 1998). Predicting coiled coils is straightforward using programs such as COILS (http://www.ch.embnet.org/software/COILS_form.html) or Multicoil (http://nightingale.lcs.mit.edu/cgi-bin/multicoil) (Wolf et al. 1997).

(iii) Sequence motifs that lead to post-translational processing, for example, N-glycosylation sites (found in the extracellular domains of GPCRs), can be readily predicted using PROSITE. Sequence motifs that influence cellular trafficking events (e.g. manifest in cloning studies by lack of cell surface expression) and cell signalling events are less well understood and cannot be predicted reliably. Just as the knowledge of amino acid residues involved in the interactions of receptors with heterotrimeric G proteins (discussed separately below) has been generated by years of biochemical study, the same will be true for understanding the sequence basis of trafficking and signalling pathways. The PSORT program (http://psort.nibb.ac.jp/) (Nakai and Horton 1999) is a good general place to start for a wide range of predicted features, including some of those covered in earlier sections. More specifically, it is becoming increasingly recognized through work on ion channels that there are more ER retention signals than first thought, and that these affect the trafficking of membrane proteins to the cell surface (Teasdale and Jackson 1996; Zerangue et al. 1999; Ma et al. 2001). Some of the known motifs are RXR(R), KKXX, and RKR, generally found in the C terminus of sequences. These can easily be searched for textually, or by using a pattern-searching program, such as those in GCG. It is the RXRR motif present in the GABAB1 receptor that results in the receptor being retained in the ER. Coexpression with the GABAB2 homologue results in masking of the motif by the interacting coiled coil domains, and allows the heterodimer to reach the cell surface (Margeta-Mitrovic et al. 2000; Calver et al. 2001, see chapter 28).
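A textual search of this kind can be sketched in a few lines of Python. This is only an illustration, not the GCG or PROSITE implementation: the function name `find_motifs` and the toy sequence in the usage note are hypothetical, the three ER retention motifs are taken directly from the text with X read as "any residue", and the N-glycosylation pattern is the PROSITE sequon N-{P}-[ST]-{P}. The search is not restricted to the C terminus, so hits still need to be judged against the predicted topology, and a match is no evidence that the motif is functional.

```python
import re

# Motifs written as regular expressions (X = any residue).
# Lookaheads are used so that overlapping matches are all reported.
MOTIFS = {
    "RXR(R)": re.compile(r"(?=(R.RR?))"),   # RXR with an optional fourth R
    "KKXX": re.compile(r"(?=(KK..))"),
    "RKR": re.compile(r"(?=(RKR))"),
    # PROSITE-style N-glycosylation sequon: N-{P}-[ST]-{P}
    "N-glycosylation": re.compile(r"(?=(N[^P][ST][^P]))"),
}


def find_motifs(seq):
    """Return (motif name, 1-based start, matched text), sorted by position."""
    seq = seq.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(seq):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


if __name__ == "__main__":
    # Toy sequence for illustration only, not a real receptor.
    for hit in find_motifs("MNGTAAARSRRAAKKLL"):
        print(hit)
```

Because RKR is a special case of RXR, a sequence containing RKR is reported under both names; this is left as is so that no motif named in the text is silently dropped.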

There are other examples of motifs that currently appear subfamily specific, which over time may form part of a more integrated picture and allow predictions to be made. A specific example concerns the interaction of some members of the metabotropic glutamate receptor family with Homer adaptor proteins. These interactions have been explained through identification of a proline-rich sequence (PPSPF) in the C terminus of the receptors (Xiao et al. 2000). These adaptor proteins are again important, not only in the intracellular signalling process of the receptors, but also for getting the receptors to the cell surface (Roche et al. 1999). Several receptors (dopamine D4, muscarinic M4) have been shown to have polyproline-based motifs (PXXP) in their third intracellular loops, which interact with the SH3 domains of adaptor proteins, for example, Grb2 (Ren et al. 1993; Oldenhof et al. 1998; Tang et al. 1999). Recent experiments on mutant dopamine D3 receptors indicate that the picture for some receptors may be more complicated than originally thought (Oldenhof et al. 2001).
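Since these motifs are only meaningful in a particular part of the receptor (the third intracellular loop for PXXP, the C terminus for PPSPF), a search is best restricted to that region. The sketch below assumes the region boundaries are already known, for example from a topology prediction; the function name `motif_in_region`, the coordinates and the toy sequence are hypothetical and serve only to show the bookkeeping.

```python
import re


def motif_in_region(seq, pattern, start, end):
    """Find `pattern` within residues start..end (1-based, inclusive).

    Returns (position in the full sequence, matched text) pairs.
    A lookahead is used so that overlapping matches are reported.
    """
    region = seq.upper()[start - 1:end]
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + start, m.group(1)) for m in regex.finditer(region)]


if __name__ == "__main__":
    # Toy sequence; residues 1-10 stand in for a loop, 11-17 for a C terminus.
    toy = "AAAAPAAPAAAAPPSPF"
    print(motif_in_region(toy, "P..P", 1, 10))    # PXXP in the assumed loop
    print(motif_in_region(toy, "PPSPF", 11, 17))  # PPSPF in the assumed tail
```

A hit shows only that the sequence pattern is present in the chosen region; whether it binds an SH3 domain or a Homer protein remains an experimental question.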

PDZ domains (constituents of scaffold proteins) are easy to predict, but their interacting proteins less so. For one PDZ domain protein, PSD-95 (Cho et al. 1992), originally identified as part of the post-synaptic density, the partner proteins are known to bind through their C termini. This is true for the beta-adrenergic receptors (Hu et al. 2000 and refs therein), where the signature ESKV from the beta 1 subtype is important and is consistent with the generalized pattern S/TXV(L/I). Recently, the multi-PDZ domain protein MUPP1 has been shown to interact with the extreme C-terminal SSV sequence of the 5-HT2C receptor (Becamel et al. 2001).
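Checking a sequence against the generalized C-terminal pattern is a one-line test. The sketch below assumes one reading of S/TXV(L/I): Ser or Thr, then any residue, then Val, Leu or Ile as the final residue of the protein. The function name `has_pdz_ligand` and the toy sequences are hypothetical; only the endings ESKV and SSV come from the text.

```python
import re

# Assumed reading of S/T-X-V(L/I): [ST], any residue, then V, L or I,
# anchored to the extreme C terminus of the sequence.
PDZ_LIGAND = re.compile(r"[ST].[VLI]$")


def has_pdz_ligand(seq):
    """True if the last three residues fit the assumed PDZ ligand pattern."""
    return PDZ_LIGAND.search(seq.upper()) is not None


if __name__ == "__main__":
    # Toy sequences carrying the C-terminal endings quoted in the text.
    for toy in ("AAAAESKV", "AAAASSV", "AAAAESKA"):
        print(toy, has_pdz_ligand(toy))
```

Both endings quoted above satisfy the pattern, but a match says nothing about which PDZ protein, if any, is the partner.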

These specific examples are important because they indicate that, while some general domain features of receptor sequences are easily identified, the more specific features needed to understand the biology will emerge only from very specific experimentation.

(iv) Knowledge of receptor interactions with heterotrimeric G proteins has important implications for work on all receptors, particularly orphans. Being able to predict the correct association between a receptor and a heterotrimeric G protein family will help assay development and functional studies. For any receptor, the possible functional effects of genetic variations will be important in the context of disease associations. The current knowledge of key amino acid residues has been gleaned from years of extensive biochemical studies, typically by performing site-directed mutagenesis experiments and generating chimeric receptors. The results of these studies demonstrate that most of the intracellular regions (loops and C terminus) of a receptor are implicated (Wess 1998).

Computational approaches to predicting G protein coupling profiles from primary sequence data have been published for known receptors. Vriend and colleagues applied a sequence analysis method (correlated mutation analysis, CMA) to aminergic and adenosine receptors (Horn et al. 2000). The conclusion was that a weak signal could be detected in receptor families where specialization for coupling to a given G protein occurred during a recent divergent evolutionary event, for example, with the muscarinic receptors.

An alternative data mining method for predicting the coupling specificity of G protein-coupled receptors to their G proteins combines pattern discovery with membrane topology prediction. The patterns were derived by analysing a set of receptor sequences for which the literature indicated clear non-promiscuous coupling to G proteins. The method discovered patterns of amino acid residues in the intracellular domains that are specific for coupling to particular functional classes of G proteins and hence can be used to predict the coupling specificity of orphans (Moller et al. 2001).
