Some Elementary Features Of The Human Genome

The human genome consists of nuclear and mitochondrial DNA sequences. In round numbers, the nuclear sequences of humans consist of approximately three billion base pairs.

The Nuclear Genome

Nuclear DNA is linear and roughly 2 m in length when extended; when folded into the nucleus, it occupies a sphere about 10 μm in diameter. In human cells, the nuclear DNA sequences are carried on 22 pairs of autosomes plus an X and a Y chromosome in males and two X chromosomes in females. Normally, everyone has two copies of each autosome and of the genes it carries; the sex chromosomes in males are the exception.
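The figure of roughly 2 m can be checked with simple arithmetic. The sketch below assumes a rise of about 0.34 nm per base pair (the usual value for B-form DNA) and counts both copies of the genome of about three billion base pairs carried by a diploid cell; both are round-number assumptions rather than values taken from this chapter.

```python
# Rough check of the extended length of nuclear DNA in one diploid cell.
# Assumptions (not from the text): 0.34 nm rise per base pair (B-form DNA)
# and a diploid cell carrying two copies of the ~3e9 bp genome.
RISE_NM_PER_BP = 0.34
HAPLOID_BP = 3.0e9

diploid_bp = 2 * HAPLOID_BP
length_m = diploid_bp * RISE_NM_PER_BP * 1e-9  # nanometres -> metres

print(f"Extended length: about {length_m:.1f} m")
```

Under these assumptions the estimate lands close to the 2 m quoted above.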

Each chromosome has a characteristic length¹ and a distinctive banding pattern. Bands on the long and short arms are numbered from the centromere toward the telomere. Bands on the short arm are designated by "p" and bands on the long arm by "q," followed by the region number and the band number within that region; subdivisions of bands are designated by a decimal point and a number. Each gene occupies a particular spot on its chromosome. For example, NAT2 is located at 8p21.3-23.1, which means that NAT2 is situated on the short arm of chromosome 8 within region 2, between band 1, sub-band 3 and band 3, sub-band 1. The glucose-6-phosphate dehydrogenase (G6PD) gene, assigned to Xq28, is located in region 2, band 8 on the long arm of the X chromosome.
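The band notation can be made concrete with a small parser. The following is a minimal sketch, assuming simple labels with one region digit and one band digit (such as 8p21.3 or Xq28); the helper name `parse_band` and the returned field names are hypothetical, and ranges such as 8p21.3-23.1 are not handled.

```python
import re

# Pattern for simple band labels such as "8p21.3" or "Xq28":
# chromosome, arm (p or q), one region digit, one band digit, optional sub-band.
BAND_RE = re.compile(
    r"^(?P<chrom>1[0-9]|2[0-2]|[1-9]|X|Y)"
    r"(?P<arm>[pq])"
    r"(?P<region>\d)(?P<band>\d)"
    r"(?:\.(?P<sub>\d+))?$"
)


def parse_band(label):
    """Split a cytogenetic band label into its parts (hypothetical helper)."""
    m = BAND_RE.match(label)
    if m is None:
        raise ValueError(f"unrecognized band label: {label!r}")
    return {
        "chromosome": m["chrom"],
        "arm": "short" if m["arm"] == "p" else "long",
        "region": int(m["region"]),
        "band": int(m["band"]),
        "sub_band": m["sub"],  # None when no decimal part is given
    }


print(parse_band("8p21.3"))
print(parse_band("Xq28"))
```

Read this way, 8p21.3 is chromosome 8, short arm, region 2, band 1, sub-band 3, and Xq28 is the X chromosome, long arm, region 2, band 8.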

In both eukaryotes and prokaryotes, genes and the polypeptides they encode are colinear along the DNA molecule, but, unlike prokaryotic genes, almost all eukaryotic genes are divided into exons (expressed regions) separated by introns (intervening regions). Much of the human genome, like the genomes of other higher eukaryotes, consists of sequences whose genetic function is not clearly defined, and for many purposes it is convenient to divide the genome into sequences of three types: the coding sequences, along with the untranslated regions (UTRs) of exons and their associated promoter regions; the intronic sequences and various repetitive sequences; and the mobile sequences, such as the transposons. Differences in structure and function among these three types are relevant to the molecular basis of protein diversity.

Coding sequences encode the messages for synthesis and targeting of proteins. They occupy a small part, 5% or less, of the human genome. Proteins are composed of structural domains that contribute to their three-dimensional architecture and stability, and of functional domains such as binding regions, catalytic sites, and folding elements. Exons are often regarded as units that encode structural or functional domains that remain intact during synthesis and targeting of proteins, and it has been proposed that domain duplication at the protein level corresponds to exon duplication at the DNA level. It has also been suggested that exons can be shuffled independently and that exon shuffling has played a major role in the evolution of higher organisms.²

The sequences of eukaryotic nuclear DNA surrounding exons are composed of introns and repetitive sequences. Introns and repetitive sequences may occupy 70% or more of the human genome. Introns may protect exons from mutation, but, except for the DNA close to exons, they may not be crucial for exon expression, as changes in intronic sequences have no known effect. Repetitive tracts of DNA contribute a variable but significant proportion of the total genomic DNA of eukaryotic chromosomes. The base composition of these sequences is highly heterogeneous. When total genomic DNA is sheared and allowed to reassociate, these elements tend to form "satellite bands" on centrifugation in a density gradient. All such bands have been called satellite DNA from time to time, but the term is now usually restricted to tandem repeats of DNA in centromeric and telomeric regions of the chromosome, regardless of whether they form bands on a density gradient. Satellite DNAs are often associated with heterochromatin, although their genetic function has not been clearly defined. Primary sequence analysis of cloned satellite DNA has shown that a great deal of variation occurs within them. In certain instances, a repetitive satellite unit may contain polymorphic sites for restriction endonucleases, and because these polymorphisms are inherited in Mendelian fashion, they are useful in the isolation and analysis of hereditary traits, as discussed below (see p. 130).
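How a polymorphic restriction site shows up as a difference in fragment lengths can be illustrated with a toy digest. This is a minimal sketch under stated assumptions: the two allele sequences are invented for illustration, the DNA is treated as a linear single-strand string, and the enzyme is EcoRI, which recognizes GAATTC and cuts after the first G. The function name `digest` is hypothetical.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at each site.

    cut_offset is the position of the cut within the recognition site
    (1 for EcoRI, which cuts G^AATTC).
    """
    lengths = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        lengths.append(cut - start)
        start = cut
        pos = seq.find(site, pos + 1)
    lengths.append(len(seq) - start)
    return lengths


# Two invented alleles that differ by one base inside the recognition site.
allele_a = "ACGTACGTACGT" + "GAATTC" + "TTGCATTGCA"  # site present
allele_b = "ACGTACGTACGT" + "GAGTTC" + "TTGCATTGCA"  # site lost (A -> G)

print(digest(allele_a))  # two fragments
print(digest(allele_b))  # one uncut fragment
```

An individual carrying both alleles would show both fragment patterns, which is what allows such a site to be followed through a pedigree as a Mendelian marker.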

The mobile sequences occupy the remainder of the human genome. They are transposable from one location to another and are also referred to as "jumping genes." Transposable elements have been identified as a common feature of prokaryotic and eukaryotic genomes. They are a specific group of genomic elements dispersed throughout the genome, many of which have repetitive sequences present in hundreds or thousands of copies, and they may contain several genes in addition to those essential to their mobility. Transposable elements are responsible for a wide range of mutations and are thought to play a role in shaping evolution, but exactly how they move about and what effects they produce are unknown. Because the mobile sequences may contain other genes with functional regulatory domains, they may also interfere with gene expression. In other instances, insertion of transposons at a new location may interrupt coding sequences, UTRs, or their control regions and so inactivate genes.³

Evidence for the de novo insertion of a transposable element into the blood clotting factor VIII gene on the X chromosome of humans was reported for two hemophilia patients.⁴,⁵ The inserted element was also found to possess reverse transcriptase activity that could represent a potential source of the activity necessary for its relocation. This was one of the first demonstrations that transposable elements can also cause human genetic disease. On the other hand, the scattering of these mobile sequences around the human genome and the effects of mutations in them may, in some instances, benefit the organism by immobilizing them; or they may have been important in the development of whole families of proteins such as the hemoglobins, the cytochrome P450 drug-metabolizing enzymes, and the drug receptors.

Chromosomal translocation is another mechanism by which genes, in this case groups of genes, can move from one location to another in the genome. Reciprocal translocation entails breakage of two chromosomes, which then exchange segments and rejoin at their breakpoints, in some cases forming a chimeric gene. Sometimes a translocation moves a quiescent gene close to an active promoter. Examples of chromosomal translocation of pharmacogenetic importance are glucocorticoid-remediable hypertension⁶,⁷ and resistance to treatment of acute promyelocytic leukemia with all-trans-retinoic acid.⁸⁻¹⁰

Additional structural features of DNA that are capable of affecting gene expression are described below (see Chapter 6).

The Mitochondrial Genome

The genetic analysis of the mitochondrial genome (mtDNA) started as a search for a model system to study gene expression. It spanned some 15 years and involved the close collaboration of Giuseppe Attardi's laboratory at the California Institute of Technology and Fred Sanger's laboratory at the Laboratory of Molecular Biology in Cambridge. The work epitomizes the power of biochemical, electron microscopic, and immunological techniques, and it led to the discovery of a genome with unique features of gene organization and expression.¹¹ Human mtDNA is a compact, circular, double-stranded molecule of 16,569 base pairs that lies within the mitochondrial matrix and is anchored to the inner mitochondrial membrane.¹² mtDNA contains 37 genes encoding 22 tRNAs, the 12S and 16S ribosomal RNAs, and 13 proteins involved in oxidative phosphorylation. The sequence and organization of human mtDNA differ in several respects from those of nuclear DNA. For example, the genes have few or no noncoding bases between them. In many cases, the termination codons are not coded in the DNA but are created posttranscriptionally by polyadenylation of the mRNAs. Additionally, the genetic code differs from the universal code in that UGA codes for tryptophan and not for termination, AUA codes for methionine and not isoleucine, and AGA and AGG are termination rather than arginine codons. A 1-kilobase noncoding region contains both the promoters and the origin of replication. The two strands of mtDNA are distinguished as heavy (guanine rich) and light (cytosine rich). mtDNA does not have DNA repair systems analogous to those of the nucleus, potentially increasing the vulnerability of mtDNA to damage from lipophilic xenobiotics.
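The codon differences just described can be seen by translating the same message with both codes. The sketch below builds the standard codon table from its usual compact listing (bases ordered U, C, A, G) and then applies only the four reassignments named above; the test sequence and the helper name `translate` are illustrative assumptions, and stop codons are shown as `*` rather than ending translation.

```python
from itertools import product

BASES = "UCAG"
# Amino acids of the standard code, in codon order UUU, UUC, UUA, UUG, UCU, ...
STANDARD_AA = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AA)
}

# Human mitochondrial code: only the four changes described in the text.
MITO = dict(STANDARD)
MITO.update({"UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"})


def translate(rna, table):
    """Translate an RNA string codon by codon; '*' marks a stop codon."""
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    return "".join(table[rna[i:i + 3]] for i in range(0, usable, 3))


message = "AUGUGAAUAAGA"  # invented test sequence: AUG UGA AUA AGA
print(translate(message, STANDARD))
print(translate(message, MITO))
```

The same twelve bases read as Met-stop-Ile-Arg under the universal code and as Met-Trp-Met-stop under the mitochondrial code.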

At cell division the mitochondria are distributed randomly to daughter cells. If a parental cell contains only normal mitochondria, homoplasmy is said to exist. But if the parental cell contains mutant as well as normal mitochondria, a condition known as heteroplasmy exists, and the proportion of mutant versus normal mitochondria inherited by the heteroplasmic daughter cells will vary. In mammals, less than one mtDNA in a thousand is derived from the father in an average individual, and the transmission of mtDNA occurs almost exclusively through the maternal lineage. A trait that shows maternal inheritance and distributes to sons and daughters could thus be explained by a mutation in a mitochondrial gene.
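Random distribution at cell division can be sketched with a small simulation. The model below is a deliberate simplification and an assumption of this example, not a description from the text: every mitochondrion in the parent is duplicated once, and the doubled pool is then split at random into two daughters of equal size. The labels "M" (mutant) and "N" (normal) and the function names are hypothetical.

```python
import random


def divide(parent, rng):
    """Duplicate every mitochondrion, then split the pool randomly in two."""
    pool = parent * 2          # each mitochondrion replicates once
    rng.shuffle(pool)          # random distribution to the daughters
    half = len(pool) // 2
    return pool[:half], pool[half:]


def mutant_fraction(cell):
    """Fraction of mitochondria in a cell that carry the mutation."""
    return cell.count("M") / len(cell)


rng = random.Random(0)                # fixed seed so the run is repeatable
parent = ["M"] * 30 + ["N"] * 70      # heteroplasmic parent, 30% mutant
d1, d2 = divide(parent, rng)

print(f"parent:     {mutant_fraction(parent):.2f}")
print(f"daughter 1: {mutant_fraction(d1):.2f}")
print(f"daughter 2: {mutant_fraction(d2):.2f}")
```

In this model a homoplasmic parent always gives homoplasmic daughters, whereas the daughters of a heteroplasmic parent usually differ from each other and from the parent in their mutant fraction, and the difference can widen over repeated divisions.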

Mitochondrial genetics is complicated by the unique characteristics of mtDNA inheritance. Mitochondrial disorders are attributed to at least three sources of variation: the particular locus affected, the type of mutation (point mutation, deletion, or duplication), and the fraction of mutant mitochondria present in tissues and individuals when heteroplasmy exists. Other factors, including the nuclear genotype, aging, and the environment (particularly chemical exposure), can also contribute to mitochondrial variation. Typically, a mutation in the mitochondrial genome is sought when clinical findings involve tissues with high energy demands, as in disorders of the brain or skeletal muscle. That clinical clue would not have suggested a mitochondrial mutation in a rare type of maturity-onset diabetes; instead, the pattern of transmission through the maternal line provided the clue. Similarly, the pattern of maternal transmission in aminoglycoside antibiotic-induced deafness provided the clue to the inheritance of this pharmacogenetic trait (see Appendix A).
