
Fig. 4. Comparison, using the Multiple Alignment program, of the translated human EST contig sequence (HUM) and the yeast MRS4 protein sequence. Boxes indicate the blocks of homology between the human and yeast sequences. The two blocks previously identified by the BLAST search (see Fig. 2) are framed again, and a third block of homology is also shown. The homology between the human sequence and the whole yeast sequence is 45.8%; it is 55.6, 53.1, and 43.5% in blocks 1, 2, and 3, respectively.

(retrieved by BLAST searching, using the yeast MRS4 protein) are the HT015 protein (accession number NP_057696), the hypothetical protein FLJ10618 (accession number NP_060625), and the CGI-69 protein (accession number NP_057100), which show, respectively, 60, 41, and 46% similarity and 40, 25, and 27% identity with the yeast MRS4 sequence. The sequence alignments presented in the scrolling screen show a stretch of homology between the yeast MRS4 protein and the human HT015 protein. No such stretch is observed for the hypothetical FLJ10618 protein or the CGI-69 protein, which calls for caution in interpreting their homology with MRS4. Nevertheless, all three are mitochondrial carrier proteins, as indicated after clicking on the corresponding GenBank number. Looking for information about the HT015 protein reveals that it is the mitochondrial solute carrier (UniGene Hs.300496) that we previously noted in the first step of the BLAST search. Alignment of the yeast MRS4 protein and the human HT015 protein reveals 42.1% homology (not shown), which is slightly less than the 45.8% homology between the yeast MRS4 and our EST-deduced amino acid sequence (see Fig. 4). At this step, no other available data allow us to determine whether the EST-derived amino acid sequence or the HT015 protein is the real MRS4 homolog; only complementation studies in yeast might answer this question. Since December 2000, the complete sequence of the human homolog of yeast MRS4 has been identified (UniGene Cluster Hs.326104, GenBank AAK49519), and it matches the partial protein sequence retrieved by the method described here.
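The identity percentages quoted above come from the alignment programs themselves, and the exact convention each program uses is not stated here. As a rough illustration only, the sketch below computes percent identity from two already-aligned sequences, counting identical residues over all aligned columns (columns where both sequences have a gap are ignored). The function name and the toy sequences are hypothetical, not taken from the MRS4 alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns that hold the same residue in both sequences.

    Assumption: gaps are written as '-' and both strings come from the
    same pairwise alignment, so they have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # Drop columns where both sequences carry a gap; they carry no information.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0

    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, for illustration only).
    print(percent_identity("MKV-LLA", "MKVALIA"))
```

Other conventions divide by the length of the shorter sequence or exclude all gapped columns, so values computed this way may differ slightly from those reported by BLAST or a multiple alignment program.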

Bench work is then absolutely required to check the in silico cloning before claiming to have identified a human cDNA. Oligonucleotide primers must be designed from the genomic or EST contig sequence to perform PCR or RT-PCR, followed by sequencing, in order to correct possible mismatches and gaps.
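As a first pass at the primer design step, the sketch below picks a forward primer from the start of a contig and a reverse primer from the reverse complement of its end, then estimates melting temperatures with the Wallace rule, Tm = 2(A+T) + 4(G+C). This is a minimal sketch under stated assumptions: the function names and the toy sequence are hypothetical, the Wallace rule is only a rough estimate for short oligonucleotides, and dedicated primer design software should be preferred for real experiments.

```python
from typing import Tuple

# Translation table for DNA base complementation (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(primer: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2(A+T) + 4(G+C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def primer_pair(template: str, length: int = 20) -> Tuple[str, str]:
    """Take a forward primer from the 5' end of the template and a reverse
    primer from the reverse complement of its 3' end (both written 5'->3')."""
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    forward = template[:length].upper()
    reverse = reverse_complement(template[-length:])
    return forward, reverse


if __name__ == "__main__":
    # Toy contig fragment (hypothetical, not the MRS4 EST contig).
    contig = "ATGGCTGCAGGTCTGAAGGACCTGATCGCTGGTGGCATTGCCGGAGCAGTT"
    fwd, rev = primer_pair(contig, length=20)
    print(fwd, wallace_tm(fwd))
    print(rev, wallace_tm(rev))
```

In practice the two primers would also be checked for similar melting temperatures, self-complementarity, and uniqueness within the genome before being ordered.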

In the selected case of MRS4, BLAST searching retrieves several ESTs showing significant homology scores. However, in some cases, no homologous human sequence can be identified. As most ESTs have been obtained by sequencing the 3' part of mRNAs, the 5' part of large mRNAs is often lacking. Consequently, this identification strategy may be inefficient if the C-terminal regions of the known protein from the model organism and its human counterpart show low levels of homology. This can result either from interspecies differences in the protein repertoire or from the loss of particular functions of a given protein during evolution. Another explanation may be a low overall degree of sequence conservation between species. Finally, the absence from the EST database of a human ortholog of a gene identified in a model organism may also be due to a low transcription level of the corresponding gene, as the EST database is both a qualitative and a quantitative picture of the transcriptome.
