By Kevin E. Noonan --
Long before DNA sequencing technology existed (indeed, long before Watson and Crick proposed that DNA was the genetic material and proposed a structural basis for its ability to be replicated), scientists were able to study genome structure using strictly genetic approaches. Genetic linkage maps, for example, date from the work of Thomas Hunt Morgan and Hermann Muller on fruit flies at Columbia University and later at Caltech. Traditionally, these maps have been generated using phenotypic markers, much as Mendel did with his pea plants (in relative obscurity) a generation before.
Marker density is a limitation of traditional methods, however. An improvement in this aspect of the methodology was achieved in the 1980's with the advent of restriction fragment length polymorphism (RFLP) methods. These methods used physical changes in DNA sequence (specifically at restriction enzyme recognition sites) to greatly expand the scope of linkage analysis. This is because the "phenotype" of an RFLP is a polymorphism that can be identified using electrophoretic separation of fragments having different lengths; unlike traditional phenotypes, unless the RFLP happens to be associated with an amino acid change in a protein it is unlikely to be subject to natural selection, and thus will fall within the "neutral" sequence variation that has been appreciated to exist in natural populations since the work of Motoo Kimura in the 1960's. The RFLP method was used to great effect by David Botstein, Ray White, Mark Skolnick, and Ronald Davis in identifying the chromosomal location of genes associated with, inter alia, Huntington's disease, and notably by Mary-Claire King in identifying the locus of the BRCA1 gene and in forensic studies used to identify Argentines "disappeared" by the military dictatorship during the so-called "dirty war."
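The way an RFLP turns a sequence difference into a visible "phenotype" can be illustrated with a minimal sketch. The two alleles below are invented for illustration only (they are not from any study mentioned here), and the `digest` helper is a hypothetical stand-in for a real restriction digest; it assumes the enzyme EcoRI, which recognizes GAATTC and cuts after the first G.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the position within the site where the cut falls
    (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical alleles: allele_b differs by one base (A -> G) that
# destroys the second recognition site.
allele_a = "ATGCGT" + "GAATTC" + "TTAGGCCATA" + "GAATTC" + "CCGTA"
allele_b = "ATGCGT" + "GAATTC" + "TTAGGCCATA" + "GAGTTC" + "CCGTA"

fragments_a = digest(allele_a)
fragments_b = digest(allele_b)

print(fragments_a)  # [7, 16, 10]
print(fragments_b)  # [7, 26]
```

The single-base change leaves the total length unchanged but merges two fragments into one, which is the length difference that electrophoresis would reveal.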
Since the completion of the Human Genome Project, the existence of single nucleotide polymorphisms (SNPs), which, like RFLPs, are sequence polymorphisms but are not limited to restriction enzyme recognition sites, has expanded the extent and number of genetic markers useful for genome mapping. These SNPs have been used to construct a high-density linkage map for the sunflower (Helianthus annuus) in a paper by John E. Bowers, Eleni Bachlava, Robert L. Brunick, Loren H. Rieseberg, Steven J. Knapp and John M. Burke from the Department of Plant Biology and Center for Applied Genetic Technologies, University of Georgia, the Department of Crop and Soil Science, Oregon State University, the Department of Botany, University of British Columbia, and the Department of Biology, Indiana University. The study, "Development of a 10,000 Locus Genetic Map of the Sunflower Genome Based on Multiple Crosses," was reported in the online journal Genes/Genomes/Genetics (G3), 2: 721-729 (July 1, 2012). These researchers identified 10,080 polymorphic genetic loci in the approximately 3.5 Gbp sunflower genome using a high-throughput SNP genotyping array in developing their map. They also noted areas of the sunflower genome (up to 26 centiMorgans in size) having no markers that could be detected in individual crosses, which the authors speculate is the result of genetic identity between individuals in the four populations studied. However, none of these regions was shared by all four populations, permitting a "gapless" map of the sunflower genome to be elucidated.
Interestingly, the economic importance of sunflowers (the fourth largest source of cultivated vegetable oils worldwide; www.fas.usda.gov) is such that earlier genomic mapping technologies have been applied to them, including RFLP maps (Berry et al. 1995, "Molecular marker analysis of Helianthus-annuus L. 2. Construction of an RFLP linkage map for cultivated sunflower," Theor. Appl. Genet. 91: 195–99; Gentzbittel et al. 1995, "Development of a consensus linkage RFLP map of cultivated sunflower (Helianthus-annuus L)," Theor. Appl. Genet. 90: 1079–108), among others. According to the authors, these efforts have been directed towards "traits of agronomic importance" as well as comparisons with the genomic structure of related Helianthus species. But these traditional efforts suffered from the same limited marker density noted earlier (although SNPs and other markers identified using these methodologies were integrated into the high-density map). Compared with the marker density in these earlier studies (one marker per 2 Mbp), other species (Arabidopsis, rice, sorghum) were previously mapped at a density of ~160 bp per marker (i.e., a ~12,500-fold higher marker density). These efforts were also hampered by the fact that no plant species closely related to sunflower (family Asteraceae) has been fully sequenced (the closest sequenced relative is the potato), although other related species (tomato, lettuce) are currently subject to genomic sequencing efforts.
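The ~12,500-fold figure follows directly from the two densities quoted above. A short check, using only those two numbers (the constant names are illustrative):

```python
# Densities as quoted in the text, expressed as base pairs per marker.
BP_PER_MARKER_EARLIER_SUNFLOWER = 2_000_000  # "one marker per 2 Mbp"
BP_PER_MARKER_OTHER_SPECIES = 160            # "~160 bp per marker"

# Fewer base pairs per marker means a denser map, so the fold difference
# in density is the ratio of the two spacings.
fold_difference = BP_PER_MARKER_EARLIER_SUNFLOWER / BP_PER_MARKER_OTHER_SPECIES

print(f"~{fold_difference:,.0f}-fold higher marker density")  # ~12,500-fold
```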
Markers were analyzed from multiple genetic crosses between members of the four populations. The study reports crosses between individuals from an oilseed maintainer line with high oleic acid content and an oilseed "restorer" line, while other crosses involved individuals from the oilseed maintainer line and a wild sunflower line; an oilseed restorer line and a confectionery restorer line; and an oilseed restorer line that segregates for nuclear male sterility and a non-oilseed landrace line. From these four populations, four individual maps of ~3500-5500 marker loci were created, falling into 17 linkage groups, which is consistent with the 17 chromosomes in the sunflower species (although crosses with the landrace line showed genetic heterogeneity that required additional analysis). Analyses of genotyping errors showed that the "raw allele calls" made from the data were "highly robust" (i.e., having an error rate ranging from ~0.14% to 1%). Turning to the results, the authors report that the maps were "largely collinear with an average of 88.7% of all shared loci being syntenic in pairwise combinations." "Not surprisingly," according to the study, "the cross that showed the most and largest regions of reduced recombination was the only cross that involved wild sunflower," a phenomenon that had been previously seen in maize (McMullen et al. 2009, "Genetic properties of the maize nested association mapping population," Science 325: 737–40). The study also reports that 762 of the 5694 mapped loci could be ascribed to two different chromosomal locations in different crosses (and 21 markers were found at three genomic locations), a phenomenon the authors suggest could be due to the existence of multicopy genes, wherein the mapped markers were the result of detecting different paralogs located at different loci in the sunflower genome. This corresponded to <14% of all detected SNPs being associated with multicopy genes.
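The "<14%" figure can be reproduced from the counts given above. The text does not say whether the 21 three-location markers are counted within the 762 or in addition to them, so this sketch computes both readings; either way the fraction stays below 14%.

```python
# Counts as quoted in the text.
TOTAL_MAPPED_LOCI = 5694
LOCI_AT_TWO_LOCATIONS = 762
LOCI_AT_THREE_LOCATIONS = 21

# Reading 1: the 762 loci already include any three-location markers.
fraction_two_only = LOCI_AT_TWO_LOCATIONS / TOTAL_MAPPED_LOCI

# Reading 2 (an assumption): the 21 markers are additional to the 762.
fraction_combined = (
    LOCI_AT_TWO_LOCATIONS + LOCI_AT_THREE_LOCATIONS
) / TOTAL_MAPPED_LOCI

print(f"{fraction_two_only:.1%}")   # 13.4%
print(f"{fraction_combined:.1%}")   # 13.8%
```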
The combined consensus map reported in this paper comprised 10,080 loci spanning 1380 cM. The authors state that this SNP map represents "over 5 million molecular data points," made possible by using multiple mapping populations simultaneously, and that it can be used to help assemble a map from the "sunflower genome project" currently underway (Kane et al. 2011, "Progress towards a reference genome for sunflower," Botany-Botanique 89: 429–37). Establishing the relevance and utility of this marker map for understanding and improving sunflower species, subspecies and economically important variants remains a task for the future.
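For a rough sense of scale, the figures reported for the consensus map (10,080 loci, 1380 cM, and the approximately 3.5 Gbp genome size) imply the average spacings computed below. These averages are back-of-the-envelope values that assume evenly spread loci, which real maps are not; they are not numbers stated in the paper.

```python
# Figures as quoted in the text.
GENOME_SIZE_BP = 3.5e9   # approximate sunflower genome size
CONSENSUS_LOCI = 10_080  # loci in the combined consensus map
MAP_LENGTH_CM = 1380     # total genetic length of the map

# Average spacing, assuming (unrealistically) uniform marker distribution.
cm_per_locus = MAP_LENGTH_CM / CONSENSUS_LOCI
kbp_per_locus = GENOME_SIZE_BP / CONSENSUS_LOCI / 1000

print(f"~{cm_per_locus:.2f} cM per locus")    # ~0.14 cM per locus
print(f"~{kbp_per_locus:.0f} kbp per locus")  # ~347 kbp per locus
```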