Researchers from the Weizmann Institute of Science report the discovery of two new properties of the genetic code. Their work, which appears online in Genome Research, shows that the genetic code, used by organisms as diverse as reef coral, termites, and humans, is nearly optimal for encoding signals of any length in parallel to sequences that code for proteins. In addition, they report that the genetic code is organized so efficiently that when the cellular machinery slips out of the correct reading frame during protein synthesis, the process is promptly halted before energy and resources are wasted.
DNA sequences that code for proteins need to convey, in addition to the protein-coding information, several different signals at the same time. These “parallel codes” include binding sequences for regulatory and structural proteins, signals for splicing, and RNA secondary structure. Here, we show that the universal genetic code can efficiently carry arbitrary parallel codes much better than the vast majority of other possible genetic codes. This property is related to the identity of the stop codons. We find that the ability to support parallel codes is strongly tied to another useful property of the genetic code—minimization of the effects of frame-shift translation errors. Whereas many of the known regulatory codes reside in nontranslated regions of the genome, the present findings suggest that protein-coding regions can readily carry abundant additional information.
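To make the frame-shift idea concrete, here is a minimal Python sketch of what "halting promptly" could look like. It is not the authors' analysis: the function name `codons_until_stop` and the toy sequence are made up for illustration. It counts how many codons are read from a given offset before a stop codon of the standard genetic code turns up.

```python
# Stop codons of the standard genetic code, written in the DNA alphabet.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons_until_stop(seq, frame=0):
    """Count the complete codons read from offset `frame` before the first stop codon.

    Returns None if no stop codon is found in that frame.
    (Hypothetical helper, for illustration only.)
    """
    seq = seq.upper()
    for count, start in enumerate(range(frame, len(seq) - 2, 3)):
        if seq[start:start + 3] in STOP_CODONS:
            return count
    return None


# Toy sequence (an assumption, not real data): ATG CTG ACC AAA GGT TAA
example = "ATGCTGACCAAAGGTTAA"

print(codons_until_stop(example, frame=0))  # 5: the intended stop at the end
print(codons_until_stop(example, frame=1))  # 1: shifted frame reads TGC, then TGA
print(codons_until_stop(example, frame=2))  # None: no stop in this short frame
```

In this toy case, a one-nucleotide slip runs into a stop codon after a single codon, which is the kind of early termination the abstract describes. A real analysis would compare many sequences and many alternative codes.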
“Our findings open the possibility that genes can carry additional, currently unknown codes,” explains Dr. Uri Alon, principal investigator on the project. “These findings point at possible selection forces that may have shaped the universal genetic code.”
The genetic code consists of 64 codons, which are tri-nucleotide sequences of DNA. Sixty-one of them encode the 20 amino acids, the building blocks of proteins. The remaining three signal the cellular machinery to stop protein synthesis after a full-length protein is built.
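The codon counts can be checked with a short Python sketch. It builds the standard genetic code from the conventional T, C, A, G ordering found in textbook codon tables. The variable names are chosen here for illustration and do not come from the paper.

```python
from itertools import product

BASES = "TCAG"  # conventional base ordering of standard codon tables
# One-letter amino acid for each codon in that order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every tri-nucleotide (TTT, TTC, TTA, ...) to its amino acid or stop signal.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

STOP_CODONS = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
SENSE_CODONS = [c for c, aa in CODON_TABLE.items() if aa != "*"]
DISTINCT_AMINO_ACIDS = {aa for aa in CODON_TABLE.values() if aa != "*"}

print(len(CODON_TABLE))           # 64 codons in total
print(len(SENSE_CODONS))          # 61 codons that encode amino acids
print(len(DISTINCT_AMINO_ACIDS))  # 20 amino acids
print(STOP_CODONS)                # ['TAA', 'TAG', 'TGA']
```

Because 61 codons map onto only 20 amino acids, most amino acids can be written with more than one codon. That redundancy is what leaves room for a protein-coding sequence to carry other signals at the same time.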
While the best-known function of genes is to code for proteins, the DNA sequences of genes also harbor signals for folding, organization, regulation, and splicing. These signal sequences are typically longer than a three-nucleotide codon, ranging from four to 150 or more nucleotides in length.