By Kevin E. Noonan --
Hans Sauer, Associate General Counsel for Intellectual Property for the Biotechnology Industry Organization (BIO), frequently asks (when discussing patent-eligibility of genes): "What about cucumber genes? Should they be patented?" If Hans wishes to remain au courant, however, he will need to update this question by asking "What about tomato genes?" after the disclosure, in this week's Nature, that the entire genomic DNA sequence of the tomato (Solanum lycopersicum) has been deciphered (The Tomato Genome Consortium, "The tomato genome sequence provides insights into fleshy fruit evolution," Nature 485: 635–41 (31 May 2012)).
As with other species genome project results, the reported sequence reveals interesting relationships between tomatoes and closely-related species (like the potato, Solanum tuberosum), including "wild" species (Solanum pimpinellifolium) related to the domestic tomato, as well as identifying sequence characteristics resulting from evolutionary events in the tomato pedigree.
The researchers used an inbred cultivar, "Heinz 1706," having a predicted genome size of 900 Mb on 12 tomato chromosomes. The tomato genomic sequence was compared with the "wild" tomato sequence as well as other domesticated species of Solanum (particularly potato). The sequence differences between the domesticated and wild tomato amounted to 0.6% nucleotide divergence in 5.4 million single nucleotide polymorphisms (SNPs) scattered throughout the genome.
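As a rough consistency check, the two figures in the last sentence agree with each other if divergence is taken to be simply the SNP count divided by the genome size. That is a simplifying assumption made here for illustration, not necessarily how the Consortium computed its figure, and the names below are hypothetical.

```python
# Back-of-the-envelope check. Assumes divergence ~ SNP count / genome size,
# which is a simplification and may not match the Consortium's own method.
GENOME_SIZE_BP = 900_000_000  # reported genome size of "Heinz 1706" (900 Mb)
SNP_COUNT = 5_400_000         # reported SNPs between domesticated and wild tomato


def snp_divergence_percent(snp_count: int, genome_size_bp: int) -> float:
    """Return SNPs per base, expressed as a percentage."""
    return 100.0 * snp_count / genome_size_bp


divergence = snp_divergence_percent(SNP_COUNT, GENOME_SIZE_BP)
bases_per_snp = GENOME_SIZE_BP / SNP_COUNT

print(f"{divergence:.1f}% divergence, roughly one SNP every {bases_per_snp:.0f} bp")
```

On these numbers the estimate comes out at 0.6%, or about one SNP every 167 bases, which matches the reported divergence.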
Between tomato and potato, on the other hand, 8.7% sequence divergence was found, including nine large and several smaller inversions (which can facilitate sequence and species divergence by interfering with meiotic pairing), with intergenic and heterochromatic (repeat-rich) sequences showing (not unexpectedly) > 30% sequence divergence.
Evolutionarily, the tomato sequence showed evidence of two consecutive genome triplications (a phenomenon common among plants but unknown for animal species), one recent between tomato and potato (~7.3 Myr ago), and one ancient (at about the rosid (grape) - asterid (tomato) divergence, ~71 +/- 19.4 Myr ago). These events are believed to have provided the genetic plasticity to develop genes controlling fruit characteristics, such as color and "fleshiness." These include transcription factors and enzymes involved in ethylene biosynthesis (used for ripening), red light photoreceptors (PHYB1/PHYB2), and lycopene synthesis (PSY1/PSY2). Interestingly, the researchers reported that "[s]everal cytochrome P450 subfamilies associated with toxic alkaloid biosynthesis show contraction or complete loss in tomato and the extant genes show negligible expression in ripe fruits," a genetic adaptation to be expected for plants that have adopted a seed-dispersal mechanism requiring that the fruit be eaten by animals and the seeds passed in their spoor. Differential expression of genes for cell wall architecture was also observed and believed to be involved in this aspect of the tomato life cycle.
On the other hand, the observed patterns comparing tomato genes with grape orthologs (22.5% of grape genes have an orthologous region in tomato, 39.9% have two and 21.6% have three) suggested that triplication was followed by "widespread gene loss" (see Figure 2 of publication). Synteny maps between tomato chromosomes and chromosomes from potato, eggplant (aubergine), pepper, and tobacco were prepared and showed high levels of synteny, not unexpected in view of these species' interrelatedness.
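The gene-loss inference can be made concrete with the reported percentages. The sketch below assumes, purely for illustration, that a triplication with no subsequent loss would leave every grape gene with three orthologous tomato regions; the variable names are hypothetical.

```python
# Reported share (%) of grape genes by number of orthologous tomato regions.
grape_gene_share = {1: 22.5, 2: 39.9, 3: 21.6}

# Illustrative assumption: with no gene loss after a triplication,
# each grape gene would correspond to three tomato regions.
EXPECTED_COPIES = 3

with_ortholog = sum(grape_gene_share.values())
fully_retained = grape_gene_share[EXPECTED_COPIES]
reduced = with_ortholog - fully_retained

print(f"grape genes with one to three tomato regions: {with_ortholog:.1f}%")
print(f"all three regions retained: {fully_retained:.1f}%")
print(f"reduced to one or two regions: {reduced:.1f}%")
```

On the reported figures, only about a fifth of grape genes keep all three tomato counterparts, while roughly three times as many are down to one or two, which is the pattern the authors read as widespread loss.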
Turning to the overall chromosome structure, the tomato showed pericentric heterochromatin and distal euchromatin, with miRNA and chloroplast insertions "more evenly distributed" throughout euchromatin. There was evidence of fewer high-copy LTR-containing retrotransposons in the tomato genome than in Sorghum or Arabidopsis genomes, with "older" average insertion age (2.8M v. 0.8M years ago); the tomato genome is "unusual among angiosperms by being largely comprised of low copy number DNA." In view of these differences from Arabidopsis, it was paradoxical that "[c]hromosomal organization of genes, transcripts, repeats and small RNAs (sRNAs) is very similar" between the tomato and Arabidopsis genomes.
Sequence comparisons and open reading frame analyses found from 34,727 to 35,004 protein-coding genes in the tomato genome. Of these, 18,320 were orthologs to potato genes; comparisons of synonymous vs. non-synonymous nucleotide substitution patterns between potato and tomato with similar comparisons between sorghum and maize (11.9 Myr divergence) suggested stronger "diversifying" selection in the tomato/potato pair. Comparison of the tomato genome with the "wild" S. pimpinellifolium species genes showed 7,378 identical genes and 11,753 genes with only synonymous nucleotide sequence changes; of the rest (12,629 genes), there were not only non-synonymous changes but also gain/loss of stop codons, with implications for gross changes in functionality. In some instances, "several" chromosomal segments (particularly from cherry tomatoes) were more similar to the S. pimpinellifolium species than to the Heinz 1706 cultivar sequence.
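To put the wild-versus-domesticated comparison in proportion, the three reported gene counts can be converted to shares of the genes compared. This is only arithmetic on the figures above; the dictionary labels are shorthand introduced here, not the paper's terminology.

```python
# Gene counts from the tomato vs. S. pimpinellifolium comparison, as reported.
gene_classes = {
    "identical": 7_378,
    "synonymous changes only": 11_753,
    "non-synonymous and/or stop-codon changes": 12_629,
}

total_compared = sum(gene_classes.values())
shares = {label: 100.0 * count / total_compared for label, count in gene_classes.items()}

print(f"genes compared: {total_compared:,}")
for label, share in shares.items():
    print(f"{label}: {share:.1f}%")
```

The three classes sum to 31,760 genes, somewhat fewer than the 34,727 to 35,004 predicted in total, so presumably not every predicted gene entered the comparison. Of those that did, about 60% encode the same protein in both species (identical or synonymous-only), and about 40% carry potentially function-altering changes.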
The tomato genome shares with the soybean and potato genomes the localization of small regulatory RNAs to "gene-rich" chromosomal regions, particularly promoters. The researchers identified 96 conserved miRNA species in tomato, compared with 120 such species in potato, grouped into 34 families (10 "highly conserved" in plants and the others "less conserved"); the less-conserved families are more abundant in potato than in tomato, and some of the potato-specific miRNAs specifically target Toll interleukin receptor, nucleotide-binding site and leucine-rich repeat genes. Protein-coding genes from tomato, potato, Arabidopsis, rice, and grape were "clustered" into 23,208 gene groups (having at least 2 members), of which "8,615 are common to all five genomes, 1,727 are confined to eudicots (tomato, potato, grape and Arabidopsis), and 727 are confined to plants with fleshy fruits (tomato, potato and grape)."
The report concludes with these lessons from the genome that reflect the natural history of the tomato:
The genome sequences of tomato and S. pimpinellifolium also provide a basis for understanding the bottlenecks that have narrowed tomato genetic diversity: the domestication of S. pimpinellifolium in the Americas, the export of a small number of genotypes to Europe in the 16th century, and the intensive breeding that followed. Charles Rick pioneered the use of trait introgression from wild tomato relatives to increase genetic diversity of cultivated tomatoes. Introgression lines exist for seven wild tomato species, including S. pimpinellifolium, in the background of cultivated tomato. The genome sequences presented here and the availability of millions of SNPs will allow breeders to revisit this rich trait reservoir and identify domestication genes, providing biological knowledge and empowering biodiversity-based breeding.
The genomic data generated by the whole project are available online in GenBank under accession number AEKE00000000. The individual chromosome sequences are available under accession numbers CM001064–CM001075, and the data on expressed sequences are available in the "Sequence Read Archive" under accession numbers SRA049915, GSE33507, SRA050797 and SRA048144.
The report, for which the Tomato Genome Consortium was responsible, reflects a vast amount of work by hundreds of researchers around the world. Named as corresponding authors are:
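For anyone wanting to look up the chromosome records, the accession range can be expanded into individual identifiers. The helper below is a hypothetical convenience written for this post, not part of any NCBI tool, and it makes no assumption about which accession corresponds to which chromosome.

```python
def accession_range(prefix: str, first: int, last: int, width: int = 6) -> list[str]:
    """Expand an inclusive accession range into zero-padded identifiers."""
    return [f"{prefix}{number:0{width}d}" for number in range(first, last + 1)]


# Range quoted for the individual tomato chromosome sequences.
chromosome_accessions = accession_range("CM", 1064, 1075)

print(len(chromosome_accessions), "accessions")
print(", ".join(chromosome_accessions))
```

The range expands to 12 identifiers, consistent with the 12 tomato chromosomes.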
Shusei Sato, Satoshi Tabata, Lukas A. Mueller, Sanwen Huang, Yongchen Du, Chuanyou Li, Zhukuan Cheng, Jianru Zuo, Bin Han, Ying Wang, Hongqing Ling, Yongbiao Xue, Doreen Ware, W. Richard McCombie, Zachary B. Lippman, Stephen M. Stack, Steven D. Tanksley, Yves Van de Peer, Klaus Mayer, Gerard J. Bishop, Sarah Butcher, Nagendra Kumar Singh, Thomas Schiex, Mondher Bouzayen, Antonio Granell, Fernando Carrari, Gianluca De Bellis, Giovanni Giuliano, Glenn Bryan, Michiel J. T. van Eijk, Hiroyuki Fukuoka, Debasis Chattopadhyay, Roeland C. H. J. van Ham, Doil Choi, Jane Rogers, Zhangjun Fei, James J. Giovannoni, Rod Wing, Heiko Schoof, Blake C. Meyers, Jitendra P. Khurana, Akhilesh K. Tyagi, Tamas Dalmay, Andrew H. Paterson, Xiyin Wang, Luigi Frusciante, Graham B. Seymour, Bruce A. Roe, Giorgio Valle, Hans H. de Jong and René M. Klein Lankhorst
Note: the complete list can be found in the Supplementary Information.
The Tomato Genome Consortium includes scientists from The Kazusa DNA Research Institute, Japan; 454 Life Sciences Co. (Roche), USA; Amplicon Express Inc., USA; Beijing Vegetable Research Center, China; National Center for Gene Research, Chinese Academy of Sciences, Shanghai, China; BGI-Shenzhen, China; BMR-Genomics SrL, Italy; Boyce Thompson Institute for Plant Research, Cornell University, USA; Centre for Biosystems Genomics, The Netherlands; Centro Nacional de Analisis Genomico, Barcelona, Spain; China Agricultural University, Beijing, China; Institute of Vegetables and Flowers Chinese Academy of Agricultural Sciences, Beijing, China; Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; Wuhan Botanical Garden, Chinese Academy of Sciences, Beijing, China; Cold Spring Harbor Laboratory, USA; Colorado State University, USA; National Taiwan University, Taipei; Department of Plant Biology, Cornell University, USA; Centre for Genomic Regulation, University Pompeu Fabra, Barcelona, Spain; Department of Plant Biotechnology and Bioinformatics, Ghent University, Belgium; Faculty of Agriculture, The Hebrew University of Jerusalem, Israel; Institute of Industrial Crops, Heilongjiang Academy of Agricultural Sciences, China; Institute for Bioinformatics and Systems Biology (MIPS), Helmholtz Center for Health and Environment, Germany; College of Horticulture, Henan Agricultural University, China; Department of Life Sciences, Imperial College London, UK; NRC on Plant Biotechnology, Indian Agricultural Research Institute, India; INRA, Génétique et amélioration des fruits et légumes, France; INRA, Biologie du Fruit et Pathologie, France; Unité de Biométrie et d'Intelligence Artificielle, INRA, France; INRA-CNRGV, France; Plateforme bioinformatique Genotoul, Biométrie et Intelligence Artificielle, INRA, France; Institut National Polytechnique de Toulouse - ENSAT, Université de Toulouse, France; Instituto de Biología Molecular y Celular de 
Plantas (CSIC-UPV), Spain; Instituto de Hortofruticultura Subtropical y Mediterránea "La Mayora", Universidad de Malaga - Consejo Superior de Investigaciones Cientificas (IHSM-UMA-CSIC), Spain; Instituto de Biotecnología, Argentina; Institute for Biomedical Technologies, National Research Council of Italy; Institute of Plant Genetics, Research Division Portici, National Research Council of Italy; ENEA, Casaccia Research Center, Italy; Scuola Superiore Sant'Anna, Italy; ENEA, Trisaia Research Center, Italy; James Hutton Institute, UK; Barcelona Supercomputing Center, Spain; Institute of Research in Biomedicine, Barcelona, Spain; ICREA, Barcelona, Spain; Keygene N.V., Wageningen, The Netherlands; Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology, Republic of Korea; Life Technologies, USA; Life Technologies, France; Max Planck Institute for Plant Breeding Research, Germany; School of Agriculture, Meiji University, Japan; Department of Plant Science and Plant Pathology, Montana State University, USA; NARO Institute of Vegetable and Tea Science, Japan; National Institute of Plant Genome Research, India; Plant Research International, Business Unit Bioscience, Wageningen, The Netherlands; Institute of Plant Genetic Engineering, Qingdao Agricultural University, China; Roche Applied Science, Germany; Seoul National University, Department of Plant Science and Plant Genomics and Breeding Institute, Republic of Korea; Seoul National University, Department of Agricultural Biotechnology, Republic of Korea; Seoul National University, Crop Functional Genomics Center, Republic of Korea; High-Tech Research Center, Shandong Academy of Agricultural Sciences, China; Institute of Vegetables, Shandong Academy of Agricultural Sciences, China; School of Life Sciences, Sichuan University, China; Sistemas Genomicos, Spain; College of Horticulture, South China Agricultural University, China; Syngenta Biotechnology, Inc., USA; Norwich Research 
Park, UK; Department of Botany, The Natural History Museum, UK; United States Department of Agriculture - Agricultural Research Service, Robert W. Holley Center, USA; Instituto de Hortofruticultura Subtropical y Mediterranea, Departamento de Biologia Molecular y Bioquimica, Spain; Centre de Regulacio Genomica, Universitat Pompeu Fabra, Spain; Arizona Genomics Institute, USA; Crop Bioinformatics, Institute of Crop Science and Resource Conservation, University of Bonn, Germany; Department of Plant and Soil Sciences, and Delaware Biotechnology Institute, University of Delaware, USA; Interdisciplinary Centre for Plant Genomics and Department of Plant Molecular Biology, University of Delhi South Campus, India; University of East Anglia, UK; Department of Biology and the UF Genetics Institute, USA; Plant Genome Mapping Laboratory, University of Georgia, USA; Center for Genomics and Computational Biology, School of Life Sciences, and School of Sciences, Hebei United University, China; J. Craig Venter Institute, USA; University of Naples "Federico II" Department of Soil, Plant, Environmental and Animal Production Sciences, Italy; Division of Plant and Crop Sciences, University of Nottingham, UK; Department of Chemistry and Biochemistry, University of Oklahoma, USA; CRIBI, University of Padua, Italy; Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Science Center, USA; Department of Agriculture and Environmental Sciences, University of Udine, Italy; Wageningen University, Laboratory of Genetics, Wageningen, The Netherlands; Wageningen University, Laboratory of Plant Breeding, Wageningen, The Netherlands; Wellcome Trust Sanger Institute, Hinxton, UK; Ylichron SrL, Casaccia Research Center, Italy; and Plant Engineering Research Institute, Sejong University, Republic of Korea.