DNA has a unique status among biological molecules: it is both a chemical and a carrier of information, its nucleotide sequence encoding the amino acid sequence of a gene's cognate protein. In a recent article in Science, Michael Gottesman and his colleagues at the National Institutes of Health have demonstrated that even this straightforward understanding of the information flow involved in gene expression may be more complicated than it has appeared. These results have implications not only for understanding the underlying biology but also for protecting genes with patents.
The genetic code is degenerate, meaning that the number of possible triplet codons (64) is greater than the number of amino acids (20) corresponding to these codons. This fundamental concept in molecular biology was first postulated by Francis Crick and verified by Marshall Nirenberg and others in the 1950s and 1960s. The physical embodiment of the code is transfer RNA (tRNA), small RNA molecules having a triplet anticodon, complementary to a codon, at one "end" of the molecule and a position at the other end to which the corresponding amino acid is linked; the specificity resides in the "charging" enzymes that match each tRNA to its proper amino acid. Most amino acids are encoded by more than one triplet and have more than one tRNA charged with that amino acid. Leucine, serine, and arginine have the most codons (six each), and methionine and tryptophan are the exceptions, each encoded by a single codon.
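The degeneracy described above can be checked with a short Python sketch. It builds the standard genetic code from a compact 64-letter string (bases ordered T, C, A, G; `*` marks a stop codon) and counts how many codons correspond to each amino acid. The names `CODON_TABLE` and `degeneracy` are illustrative only and do not come from the article.

```python
from collections import Counter

# Standard genetic code, written with DNA bases in the order T, C, A, G.
# Each letter is the one-letter amino acid code; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with T
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

# Number of codons per amino acid, ignoring the three stop codons.
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

for amino_acid, n_codons in sorted(degeneracy.items(), key=lambda kv: -kv[1]):
    print(amino_acid, n_codons)
```

Run as written, the loop lists leucine (L), serine (S), and arginine (R) with six codons each, and methionine (M) and tryptophan (W) with one each.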
It is known that not all tRNAs for a particular amino acid are produced in the same abundance, either within cells of a particular species or between species. Thus, early genetic engineering efforts for some proteins involved changing the codon choice in DNA from one species (human) for expression in other species (bacteria and yeast). The underlying assumption, however, was that there was no difference between these codons other than tRNA abundance, and that an amino acid sequence could be encoded by any one of a number of nucleotide sequences representing "synonyms" of the native sequence.
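To make the idea of nucleotide "synonyms" concrete, the following Python sketch translates two different DNA sequences and confirms that they encode the same amino acid sequence. The two sequences are hypothetical four-codon examples invented for illustration; they are not taken from MDR1 or from any real gene, and the helper `translate` is likewise an assumption of this sketch.

```python
# Standard genetic code (bases ordered T, C, A, G; '*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip([a + b + c for a in BASES for b in BASES for c in BASES], AMINO_ACIDS)
)


def translate(dna: str) -> str:
    """Translate a DNA coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical synonymous sequences: same protein, different codon choices.
seq_a = "ATGCTGAGCCGT"  # ATG CTG AGC CGT
seq_b = "ATGTTATCACGA"  # ATG TTA TCA CGA

protein_a = translate(seq_a)
protein_b = translate(seq_b)

# Count the codon positions at which the two sequences differ.
differing_codons = sum(
    seq_a[i:i + 3] != seq_b[i:i + 3] for i in range(0, len(seq_a), 3)
)

print(protein_a, protein_b, differing_codons)
```

Both sequences translate to the same four-residue peptide even though three of their four codons differ, which is the sense in which they were assumed to be interchangeable.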
Gottesman's work changes all that. His team looked at the human MDR1 gene, which encodes a plasma membrane, ATP-dependent transporter that recognizes and transports from the cell many structurally different chemical compounds. This protein, termed P-glycoprotein (P-gp), is responsible for protecting colon, kidney, and other cells from environmental toxins; forms a part of the blood-brain barrier; and makes certain cancer cells resistant to chemotherapeutic drugs. Gottesman expressed two different embodiments of the MDR1 gene, in which a certain portion of the amino acid sequence of each was encoded by a synonymous stretch of nucleotides. In this portion of the gene, the encoded amino acid sequence was the same but the codons used in the two embodiments were different: one of them used rare triplet codons and the other did not. His results were startling: not only were the proteins produced at different rates (which could be expected if the rare codon-containing species was translated more slowly), but the phenotypes of the two proteins differed as well. For example, the rare codon-encoding embodiment showed different sensitivity to certain inhibitors. In addition, the different forms showed different susceptibility to proteolytic enzymes and different binding characteristics with conformation-specific antibodies. Gottesman interpreted these results to mean that differences in translation speed or efficiency could change the way the protein was folded as it was produced at the ribosome, and that the differences in protein folding caused (or at least contributed to) the observed phenotypic differences.
It is unknown how common it is to have these kinds of phenotypic differences result from triplet codon synonyms. P-gp differs from many other proteins in significant ways. For example, it is relatively nonspecific in substrate recognition, being able to transport many structurally different compounds across the cell's plasma membrane against a concentration gradient. Other researchers have found changes in substrate specificity associated with mutations in the amino acid sequence (see Choi et al., 1988, Cell 53:519-29 and Safa et al., 1990, Proc. Natl. Acad. Sci. U.S.A. 87:7225-29). The protein is also internally duplicated, and other members of the MDR gene family are involved in dramatically different phenotypes despite having very similar primary amino acid sequences (see Van der Bliek et al., 1987, EMBO J. 6:3325-31 and Smith et al., 1994, FEBS Lett. 354:263-66).
Gottesman's findings have implications for how patent protection is obtained for genes. Heretofore, it has been common to recite a gene claim in the form of "an isolated nucleic acid encoding a protein having an amino acid sequence identified by SEQ ID NO: X." This claim is typically supported by one cloned or otherwise identified sequence, and encompasses all nucleotide sequences encoding the amino acid sequence. Said another way, the claim encompasses all nucleotide sequence synonyms of the encoded amino acid sequence. When these synonymous sequences were believed to be equivalent (i.e., to encode the same protein with the same phenotypic characteristics), there was no reason not to consider elucidation of the nucleotide sequence and translation into the predicted amino acid sequence as encompassing all these synonymous sequences. If the effect of synonymous sequences detected by Gottesman is widespread, however, it may be more appropriate to permit patenting of claims to specific synonymous sequences, provided that they are associated with a phenotype that differs in some way from the phenotype of the protein encoded by the patented sequence. This would be the equivalent of patenting a chemical species encompassed within a prior art genus, where the species has unexpected or unappreciated properties that distinguish it from the properties of the generic compounds. These results also illustrate one other example of how the human genome has evolved to maximize genetic variability, and they help explain the general observation from the Human Genome Project that there are many fewer human genes than were expected to be found.
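The breadth of a claim to "all nucleotide sequences encoding the amino acid sequence" can be illustrated with a rough count. The Python sketch below multiplies the number of codons available for each residue to give the number of distinct coding sequences for a peptide. The function name `count_synonymous_sequences` and the four-residue peptide are hypothetical, chosen only for illustration, and the count ignores stop codons and any flanking sequence.

```python
from collections import Counter
from math import prod

# Standard genetic code (bases ordered T, C, A, G; '*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip([a + b + c for a in BASES for b in BASES for c in BASES], AMINO_ACIDS)
)

# Number of codons available for each amino acid.
CODONS_PER_AMINO_ACID = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def count_synonymous_sequences(peptide: str) -> int:
    """Return how many distinct DNA sequences encode the given peptide."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide)


# Hypothetical peptide: Met-Leu-Ser-Arg, i.e. 1 * 6 * 6 * 6 coding sequences.
example_count = count_synonymous_sequences("MLSR")
print(example_count)
```

Because the count is a product over residues, it grows very quickly with protein length, so a single amino acid sequence claim of this form covers an enormous genus of nucleotide sequences.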