
The Plant Genome - Original Research

Simple Sequence Repeat Marker Development and Mapping Targeted to Previously Unmapped Regions of the Strawberry Genome Sequence


This article in TPG

    Vol. 4 No. 3, p. 165-177
    Open access
    Received: May 12, 2011

    * Corresponding author(s): dan.sargent@emr.ac.uk

  1. Daniel J. Sargent,
  2. Paulina Kuchta,
  3. Elena Lopez Girona,
  4. Hailong Zhang,
  5. Thomas M. Davis,
  6. Jean-Marc Celton,
  7. Annalisa Marchese,
  8. Malgorzata Korbin,
  9. Kevin M. Folta,
  10. Vladimir Shulaev and
  11. David W. Simpson

D.J. Sargent, E. Lopez Girona, A. Marchese, and D.W. Simpson, East Malling Research, New Road, East Malling, Kent ME19 6BJ, UK; P. Kuchta and M. Korbin, Research Institute of Horticulture, Konstytucji 3 Maja 1/3, 96-100 Skierniewice, Poland; H. Zhang, The Glycomics Center, Univ. of New Hampshire, Durham, NH; T.M. Davis, Dep. of Biological Sciences, Univ. of New Hampshire, Durham, NH; J.M. Celton, Biotechnology Dep., Univ. of the Western Cape, Private Bag X17, Bellville 7535, South Africa; K.M. Folta, Horticultural Sciences Dep., Univ. of Florida, Gainesville, FL; V. Shulaev, Dep. of Biological Sciences, Univ. of North Texas, 1155 Union Cir., Denton, TX


The genome sequence of the woodland strawberry (Fragaria vesca L.) is an important resource providing a reference for comparative genomics studies and future sequenced rosaceous species and has great utility as a model for the development of markers for mapping in the cultivated strawberry Fragaria ×ananassa Duchesne ex Rozier. A set of 152 microsatellite simple sequence repeat (SSR) primer pairs was developed and mapped, along with 42 previously published but unmapped SSRs, permitting the precise assignment of 28.2 Mbp of previously unanchored genome sequence scaffolds (13% of the F. vesca genome sequence). The original ordering of F. vesca sequence scaffolds was performed without a physical map, using predominantly SSR markers to order scaffolds via anchoring to a comprehensive linkage map. This report complements and expands resolution of the Fragaria spp. reference map and refines the scaffold ordering of the F. vesca genome sequence using newly devised tools. The results of this study provide two significant resources: (i) the concurrent validation of a substantial set of SSRs associated with these previously unmapped regions of the Fragaria spp. genome and (ii) the precise placement of previously orphaned genomic sequence. Together, these resources improve the resolution and completeness of the strawberry genome sequence, making it a better resource for downstream studies in Fragaria spp. and the family Rosaceae.


Abbreviations: EMBL, European Molecular Biology Laboratory; EST, expressed sequence tag; FC, Fragaria pseudochromosome; FG, Fragaria linkage group; FV×FB, Fragaria vesca ‘815’ × Fragaria bucharica ‘601’; FvH4, Fragaria vesca L. ‘Hawaii 4’; PCR, polymerase chain reaction; SSR, simple sequence repeat; STS, sequence tagged site

Fragaria is the most commercially important soft fruit genus, primarily due to the cultivation of the genetically complex octoploid species F. ×ananassa (2n = 8x = 56). In 2009, the world production of strawberries exceeded 4.1 million t and the crop was valued in excess of US$4 billion (FAO, 2011). Due to its economic importance, F. ×ananassa has been the subject of much genetic research aimed at developing superior cultivars with enhanced disease resistance, fruit quality, and other characters, prompting the development of a number of molecular marker maps for this species (Rousseau-Gueutin et al., 2008; Sargent et al., 2009; van Dijk et al., 2010).

Simple sequence repeats (SSRs) have been the marker of choice in the genus Fragaria for linkage map development to date (Rousseau-Gueutin et al., 2008; Sargent et al., 2006; Sargent et al., 2009; Spigler et al., 2010; van Dijk et al., 2010) due to their abundance in the genome, their codominant and highly polymorphic nature, and their relative ease of development from enriched genomic libraries and expressed sequence tag (EST) collections (Celton et al., 2009; Sargent et al., 2003). The diploid Fragaria spp. reference map has been constructed using predominantly SSR markers, and a total of 272 have previously been mapped in the diploid Fragaria spp. reference linkage map Fragaria vesca ‘815’ × Fragaria bucharica ‘601’ (FV×FB) (Sargent et al., 2008). The reference map was developed from an interspecific F2 population derived from a cross between Fragaria vesca L. and its closest diploid relative Fragaria bucharica Losinsk.; it spans seven linkage groups, corresponding to the basic, seven-member Fragaria spp. chromosome set and covers a total of 528.1 cM. This map contains a total of 422 sequence-characterized sequence tagged site (STS) markers mapped in the full progeny (Ruiz-Rojas et al., 2010) and a further 230 STS markers mapped using the bin-mapping progeny (Illa et al., 2011; Sargent et al., 2008).

The recent publication of the woodland strawberry (F. vesca) genome sequence was a milestone in plant biology (Shulaev et al., 2011). Not only was it one of the smallest plant genomes ever to be sequenced, but sequencing was performed using only short-read sequencing technologies. The complex polyploid genome of the cultivated strawberry precludes de novo assembly of sequences generated using current sequencing technologies, and so F. vesca (2n = 2x = 14), a close relative of Fragaria ×ananassa Duchesne ex Rozier and widely regarded as the closest extant descendant of the octoploid's A genome donor (Davis et al., 2009), was chosen as a surrogate system for the development of a diploid reference genome sequence for the genus (Shulaev et al., 2011). The F. vesca ‘Hawaii 4’ (FvH4) genotype was chosen due to its amenability to genetic transformation, its self-compatibility and therefore highly homozygous genome, and its perpetually flowering “semperflorens” habit (Shulaev et al., 2011). Fragaria vesca ‘Hawaii 4’ was sequenced to 39× depth of coverage and the sequence is composed of a total of 219 Mbp of nucleotide data contained in more than 3200 genome sequence scaffolds.

Despite the large number of scaffolds, the majority of the genome sequence, 209.3 Mbp (96%), is contained in just 247 scaffolds of over 50 kbp in length (Shulaev et al., 2011). The genome was assembled into scaffolds and pseudochromosomes without a physical map, assigning the vast majority of contiguous sequences (198.1 Mbp of the sequence, contained in 204 sequence scaffolds) to a physical location on the F. vesca genome using molecular markers genetically mapped to the FV×FB linkage map (Shulaev et al., 2011). Of these, 131 scaffolds containing 169.5 Mbp of sequence have been anchored using the full FV×FB mapping progeny. The remaining 73 scaffolds were anchored using a selective mapping approach known as bin mapping (Sargent et al., 2008), the majority through sequencing of a reduced complexity genome scan of each of the six genotypes that comprise the selective mapping or “bin” set (Celton et al., 2010).

To complement and extend the existing SSR marker resources available for Fragaria spp., in particular for the development of saturated linkage maps, we developed a set of novel SSR markers from regions of the F. vesca genome that were unanchored and thus previously unmapped or had been anchored only through bin mapping. We designed 152 novel primer pairs flanking selected polymorphic SSRs and, through mapping of these loci and a further 42 previously published SSRs, anchored a further 28.2 Mbp of scaffolded genome sequence to precise positions on the FV×FB reference map using the full mapping progeny of 76 individuals. Using these additional data we have improved the resolution and completeness of the anchoring of the F. vesca genome and have created a set of revised pseudochromosomes for FvH4.

Materials and Methods

Fragaria vesca ‘815’ × Fragaria bucharica ‘601’ Diploid Fragaria Reference Population

Novel markers developed in this investigation were tested for polymorphism between the grandparental genotypes (F. vesca ‘815’ and F. bucharica ‘601’) of the F2 diploid Fragaria spp. reference mapping population FV×FB (Sargent et al., 2006). Segregation data for markers that were polymorphic between the grandparental genotypes were generated in the full FV×FB progeny of 76 individuals.

Simple Sequence Repeat Marker Development

Previously unanchored FvH4 sequence scaffolds over 50 kbp in length were submitted individually to the SSR server at the Genome Database for Rosaceae (Jung et al., 2008) to determine the precise locations of di- and trinucleotide SSR loci contained within each scaffold sequence. Di- and trinucleotide SSRs were investigated as these had been shown to be the most polymorphic motifs in previous mapping investigations in Fragaria spp. Cutoff criteria for candidate SSR marker loci were di- and trinucleotides with a minimum repeat length of 12 and 6 repeats, respectively. Where no SSRs were discovered under these criteria, the minimum repeat length for dinucleotides was reduced to eight repeats. Simple sequence repeats were considered for marker development when they occurred with a minimum of 200 bp of high-quality sequence data flanking either side of the repeat region. A maximum of four pairs of polymerase chain reaction (PCR) primers targeting SSR loci were designed and synthesized per scaffold. Where possible, SSRs were selected for marker development at evenly spaced intervals throughout the scaffold sequence. Primers were designed to amplify products between 100 and 350 bp in length. All primer pairs were designed with PRIMER 3 (Rozen and Skaletsky, 2000) to have an oligonucleotide melting temperature of 55 to 65°C (optimum 60°C), a primer length of 20 to 24 bp (optimum 22 bp), and a 2-bp GC clamp at the 3′ end of each primer. Primers were synthesized by Integrated DNA Technologies Ltd (Leuven, Belgium). Primer pairs amplifying markers mapping to the FV×FB map were named with the prefix FvH4 followed by a four-digit numerical identifier, for example, FvH40001.
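The cutoff criteria above can be illustrated with a minimal sketch. It is not the GDR SSR server's algorithm; it assumes perfect (uninterrupted) repeats only, and the function name `find_ssrs` and its parameters are hypothetical. The thresholds (12 dinucleotide repeats, 6 trinucleotide repeats, 200 bp of flanking sequence) are taken from the text.

```python
import re

# Minimum repeat counts from the text: 12 for dinucleotides, 6 for trinucleotides.
MIN_REPEATS = {2: 12, 3: 6}
FLANK = 200  # bp of flanking sequence required on each side of the repeat


def find_ssrs(seq, min_repeats=MIN_REPEATS, flank=FLANK):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Hypothetical helper: only uninterrupted di-/trinucleotide repeats are
    reported, and only when at least `flank` bp lie on both sides.
    """
    seq = seq.upper()
    hits = []
    for unit, n_min in min_repeats.items():
        # One motif of `unit` bases followed by at least n_min - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, n_min - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # homopolymer runs (e.g. AAAA...) are not di/tri SSRs
            start, end = m.span()
            if start >= flank and len(seq) - end >= flank:
                hits.append((start, motif, (end - start) // unit))
    return sorted(hits)
```

Relaxing the dinucleotide threshold to eight repeats, as described for scaffolds with no hits, would correspond to calling `find_ssrs(seq, min_repeats={2: 8, 3: 6})`.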

Polymerase Chain Reaction Conditions and Product Visualization

Following the touchdown protocol described by Sargent et al. (2003), amplicons were generated from the grandparental genotypes of the mapping population FV×FB for novel SSR markers developed in this investigation and SSRs previously published by Sargent et al. (2008) and Spigler et al. (2010) that had not been previously scored in the full FV×FB mapping progeny. Initially, PCR products were assessed for polymorphism following agarose gel electrophoresis, ethidium bromide staining, and visualization over ultraviolet light. Where possible, at least one marker identified as polymorphic between the grandparental genotypes per sequence scaffold was labeled on the forward primer with one of two fluorescent dyes: 6-FAM or HEX (IDT, Leuven, Belgium). Labeled products were then sized following fractionation by capillary electrophoresis through a 3100 genetic analyzer (Applied Biosystems, Warrington, UK) and the data generated were collected and analyzed using the GENESCAN and GENOTYPER (Applied Biosystems, Warrington, UK) software applications.

Based on compatible product sizes and fluorescent dye colors, markers (primer pairs) were grouped into sets of up to 16 and multiplex PCR was performed using the “Type-it” PCR mastermix (Qiagen, Crawley, UK) following the manufacturer's recommendations, except that PCRs were performed in a final volume of 12.5 μL. Reactions were performed using the following PCR cycles: an initial denaturation step of 95°C for 5 min was followed by 28 cycles of 95°C for 30 sec, an annealing temperature of 55°C decreasing by 0.5°C per cycle until 50°C for 90 sec, and 72°C for 30 sec, followed by a 30 min final extension step at 68°C. Samples were analyzed by capillary electrophoresis as described above.
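The touchdown annealing step can be written out cycle by cycle. The sketch below assumes that, once the annealing temperature reaches 50°C, it is held there for the remaining cycles; the text does not state this explicitly, and the function name `annealing_schedule` is hypothetical.

```python
def annealing_schedule(cycles=28, start=55.0, step=0.5, floor=50.0):
    """Annealing temperature (deg C) for each cycle of a touchdown program.

    Assumption: the temperature drops by `step` per cycle from `start`
    and is then held at `floor` for all remaining cycles.
    """
    return [max(floor, start - step * i) for i in range(cycles)]


schedule = annealing_schedule()
# Number of cycles run before the floor temperature is first reached.
touchdown_cycles = schedule.index(50.0)
```

Under this reading, the first 10 of the 28 cycles are touchdown cycles and the remaining 18 anneal at 50°C.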

Data Analysis and Map Construction

Segregation data for all new markers were analyzed for cosegregation with previously mapped markers using the published data for the FV×FB mapping population (Ruiz-Rojas et al., 2010) to determine the map position of the markers and thus the physical positions on the FvH4 genome sequence of the sequence scaffolds from which these markers were derived. Data were analyzed using JOINMAP 4.0 (Van Ooijen, 2006) using the Kosambi mapping function. Linkage group construction was determined using a minimum logarithm of the odds (LOD) score threshold of 3.0, a recombination fraction threshold of 0.35, ripple value of 1.0, jump threshold of 3.0, and a triplet threshold of 5.0.
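For readers unfamiliar with the Kosambi mapping function, the following sketch shows the standard conversion between a recombination fraction r and a map distance in centimorgans, d = 25 ln[(1 + 2r)/(1 − 2r)]. It illustrates the formula only and is not a description of JOINMAP's internals; the function names are hypothetical.

```python
import math


def kosambi_cm(r):
    """Map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(d_cm):
    """Inverse: recombination fraction for a map distance given in cM."""
    return 0.5 * math.tanh(2 * d_cm / 100.0)
```

For example, a recombination fraction of 0.10 corresponds to roughly 10.1 cM, slightly more than the 10 cM a purely additive conversion would give.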

Creation of Pseudochromosomes

Pseudochromosomes were created from FvH4 sequence scaffolds over 50 kbp in length that had been anchored to the FV×FB reference map by markers developed in this investigation or using previously mapped markers. Sequence scaffolds were arranged in order based on their map positions on the FV×FB linkage map and, where possible (when at least two markers were mapped to the same scaffold with sufficient resolution to permit orientation), scaffolds were orientated in accordance with the linkage group on which they were located. Scaffolds were separated in the pseudochromosomes by arbitrary gaps of 10,000 nucleotides, denoted (N)10k, to allow clear demarcation between the end of one scaffold and the beginning of the next within each pseudochromosome. Fragaria spp. pseudochromosomes were plotted alongside the FV×FB linkage groups using HARRY PLOTTER (Moretto et al., 2010).
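The assembly rule described above (order by map position, orientate where possible, separate with 10,000 N) can be sketched as follows. The tuple layout and the function names are assumptions made for illustration, not the pipeline used in this investigation.

```python
GAP = "N" * 10_000  # arbitrary spacer between scaffolds, as in the text

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse complement of a DNA string (N is left as N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_pseudochromosome(scaffolds, gap=GAP):
    """Join scaffolds into one sequence ordered by genetic map position.

    `scaffolds` is a list of (map_position_cM, sequence, flip) tuples, a
    hypothetical layout; `flip` is True when the mapped markers indicate
    that the scaffold runs opposite to the linkage group.
    """
    ordered = sorted(scaffolds, key=lambda s: s[0])
    parts = [reverse_complement(seq) if flip else seq
             for _, seq, flip in ordered]
    return gap.join(parts)
```

Scaffolds that could not be orientated would simply be passed with `flip=False`, which keeps their assembly orientation.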

Linkage Group and Pseudochromosome Nomenclature

For clarity, when referring to markers mapped to the FV×FB linkage map, linkage group nomenclature followed that of Vilanova et al. (2008): Fragaria spp. linkage groups (FG) FG1 through FG7. When referring to markers or scaffolds located to one of the FvH4 pseudochromosomes, nomenclature followed that of Shulaev et al. (2011): Fragaria spp. pseudochromosome (FC) FC1 through FC7.

Simple Sequence Repeat Identification in Fragaria vesca ‘Hawaii 4’ Sequence Scaffolds

A FASTA (Pearson, 1990) file containing the seven FvH4 pseudochromosomes created from FvH4 sequence scaffolds over 50 kbp in length that had been anchored to the FV×FB reference map was submitted to MSATCOMMANDER v0.8.2 for Windows (Faircloth, 2008) for SSR screening. All SSRs with a di- or trinucleotide motif of at least 12 and 6 repeats, respectively, were identified and compiled, listing their repeat length, motif, and pseudochromosome position.

Results


Simple Sequence Repeat Design, Amplification, Polymorphism, and Mapping in Fragaria vesca ‘815’ × Fragaria bucharica ‘601’

In total, 296 primer pairs were designed from FvH4 genome sequence scaffolds. Of these, 152 primer pairs (51%) fulfilled three key criteria: (i) they generated single-locus polymorphisms between the grandparental genotypes of the FV×FB mapping population, (ii) they were targeted to previously unmapped or bin-mapped sequence scaffolds and were thus located in areas of the genome from which SSR markers had not previously been developed, and (iii) they mapped to one of the seven diploid Fragaria spp. linkage groups. One hundred twenty-one of the remaining 144 primer pairs generated products that were monomorphic between the parents of the FV×FB mapping population. Of the remaining primer pairs, 14 generated complex, or nonspecific, amplification profiles, six failed to amplify a product, and three generated products of over 1 kbp in length that could not be reliably genotyped in the progeny. Supplemental Table S1 lists the locus names, primer sequences, repeat motif, repeat number, and European Molecular Biology Laboratory (EMBL) reference numbers of the 152 mapped primer pairs. Additionally, a further 42 polymorphic SSRs that had been developed in previous investigations (Sargent et al., 2008; Spigler et al., 2010) and that had been identified in scaffolds over 50 kbp in length but had not previously been scored in the full FV×FB mapping population were also mapped.

Genome Sequence Scaffolds Anchored to the Fragaria vesca ‘815’ × Fragaria bucharica ‘601’ Reference Map

The 194 SSR loci genotyped in the full mapping population anchored a total of 93 genome sequence scaffolds to the FV×FB reference map. Sixty of these scaffolds had been previously located to FV×FB mapping bins using a selective mapping strategy (Celton et al., 2010) while 33 had not previously been located to the FV×FB map. Simple sequence repeat markers developed for one scaffold (scf0513139) mapped to two distinct locations: one on FG2 and one on FG5 of the FV×FB reference map. The sequence of scaffold scf0513139 was therefore split at nucleotide 563,780 between the two mapped SSRs in an area of low sequence coverage, and the two sections of the scaffold were renamed 513139a and 513139b following the convention of Shulaev et al. (2011).

Ninety-three FvH4 genome sequence scaffolds were anchored using markers mapped in the full FV×FB progeny in this investigation, adding a further 28.2 Mbp of sequence to precise locations on the F. vesca genome. These, along with those anchored using markers mapped in the full FV×FB progeny by Shulaev et al. (2011), bring the total number of anchored scaffolds (including the two scaffolds created by splitting scaffold scf0513139) to 222. These scaffolds are anchored using a total of 411 genetic markers (Fig. 1), 194 mapped in the present investigation and a further 217 previously mapped markers (Ruiz-Rojas et al., 2010), and contain 197,682,269 nucleotides (197.7 Mbp) of sequence including embedded gaps. We used only markers that were contained in genome sequence scaffolds and removed markers whose positioning on the FV×FB map was questionable, whether through the erroneous positioning of an otherwise robustly positioned sequence scaffold within a region of the genetic map, poor fit of the data for a single marker in relation to other markers on the linkage group, or large amounts of data missing for a marker. In this way we have produced an updated version of the reference linkage map for Fragaria spp. in which the positioning of the markers and underlying sequence scaffolds is robust and reliable. The distribution of scaffolds per linkage group in the updated map was not even, with FG3 containing the highest number of scaffolds (47) and FG7 the lowest (14). The distribution of sequence was also uneven, the longest pseudochromosome being FC6 (39.5 Mbp), which was almost twice the length of FC1 (20.0 Mbp), while FC3 contained the greatest number of nucleotides per centimorgan of its equivalent linkage group (687 kbp) and FC7 contained the fewest (372 kbp). Table 1 lists the number of scaffolds, linkage groups, and equivalent pseudochromosome lengths along with the number of nucleotides per scaffold and per centimorgan of each of the linkage groups of the FV×FB map.

Figure 1.

A genetic linkage map of the Fragaria vesca ‘815’ × Fragaria bucharica ‘601’ (FV×FB) mapping progeny of 76 seedlings showing the map positions of the 411 sequence characterized genetic markers used to anchor the scaffolds over 50 kbp in length of the Fragaria vesca L. ‘Hawaii 4’ (FvH4) genome sequence assembly. Novel markers mapped for the first time in the full progeny in this investigation are given in bold and genetic distances are given in centimorgans. Asterisks denote the putative positions of centromeric regions on each linkage group. FG, Fragaria linkage group.


Table 1. Fragaria vesca L. ‘Hawaii 4’ (FvH4) pseudochromosome characteristics. The number of sequence scaffolds anchored to each Fragaria pseudochromosome, giving the lengths of the Fragaria vesca ‘815’ × Fragaria bucharica ‘601’ (FV×FB) linkage groups that they were anchored to and their equivalent pseudochromosome lengths, along with the number of nucleotides per scaffold and per centimorgan of each of the linkage groups of the FV×FB map.

Linkage group   No. of scaffolds   Pseudochromosome (FC) length (bp)   Linkage group length (cM)   Megabase pairs per scaffold   Base pairs per centimorgan
FG1             31                 19,956,132                          52.9                        0.64                          377,371
FG2             29                 24,527,021                          55.2                        0.85                          444,467
FG3             47                 33,188,684                          48.3                        0.71                          687,449
FG4             29                 29,352,125                          66.5                        1.01                          441,492
FG5             31                 27,567,315                          61.5                        0.89                          448,577
FG6             41                 39,546,558                          98.1                        0.97                          402,961
FG7             14                 22,432,496                          60.3                        1.60                          372,151
Total           222                196,570,331                         442.8

FC, Fragaria pseudochromosome.
FG, Fragaria linkage group.

The 222 scaffolds mapped to the FV×FB reference map using the full mapping progeny now represent 90% of the FvH4 genome sequence scaffolds over 50 kbp in length. Figure 2 shows the physical positions of the 222 sequence scaffolds in relation to the markers to which they are anchored on the FV×FB reference map while Supplemental Table S2 lists the 411 genetic markers mapped to the FV×FB reference map that have been identified in genome sequence scaffolds herein or previously by Shulaev et al. (2011), their map positions, and the FvH4 genome sequence scaffolds that they anchor, along with the scaffold sizes in base pairs. Figure 3 displays scaffolds anchored to FG2 of the FV×FB linkage map in this investigation using the full mapping progeny and those anchored in the investigation of Shulaev et al. (2011) in which some scaffolds were anchored using bin mapping, showing the increased precision in the positioning of scaffolds in this investigation. A total of 10% (24) of scaffolds over 50 kbp in length, collectively containing 11.9 Mbp of sequence, were not anchored to the FV×FB map with the full FV×FB mapping progeny using SSRs developed in this investigation; however, of the 11.9 Mbp, 56% (6.7 Mbp) was previously bin mapped, leaving just 5.2 Mbp (2.4% of the FvH4 genome sequence contained in scaffolds over 50 kbp) unanchored to the Fragaria spp. genome. In most cases (92%), when scaffolds were not anchored in this investigation it was due to difficulties in designing primers for amplification of single SSR loci or a lack of polymorphism in the SSRs designed; however, in a very small number of cases (8% of unmapped scaffolds), it was due to the absence of SSR motifs of sufficient length around which to design primers.

Figure 2.

The genetic locations of the Fragaria vesca L. ‘Hawaii 4’ (FvH4) genome sequence scaffolds. The physical positions of the 222 sequence scaffolds greater than 50 kbp in length are shown in relation to the markers to which they are anchored on the Fragaria vesca ‘815’ × Fragaria bucharica ‘601’ (FV×FB) reference map. Physical distances are given in kilobase pairs and genetic distances are given in centimorgans. LG, linkage group.

Figure 3.

A comparison of scaffold positions anchored by bin mapping and by markers mapped in the full Fragaria vesca ‘815’ × Fragaria bucharica ‘601’ (FV×FB) progeny on Fragaria linkage group (FG) 2 of the FV×FB reference map, showing A) sequence scaffolds mapped using bin mapping (blue scaffolds) by Shulaev et al. (2011) and B) the positions of the same scaffolds (shown in blue) anchored using markers mapped in the full progeny in this investigation. Scaffolds shown in green were anchored for the first time in this investigation and were not present on the map of Shulaev et al. (2011). LG, linkage group.


Fragaria vesca ‘Hawaii 4’ Pseudochromosomes

The seven FvH4 pseudochromosomes (FC1–FC7) containing the 222 genome sequence scaffolds over 50 kbp in length that were anchored to the FV×FB linkage map using the full mapping progeny in this investigation or by Shulaev et al. (2011), along with an eighth pseudochromosome, FC0, containing the 24 unanchored genome sequence scaffolds over 50 kbp in length, have been deposited at the strawberry genome browser (Shulaev et al., 2011; Jung et al., 2005) and have been denoted v1.1.

Simple Sequence Repeat Identification in Fragaria vesca ‘Hawaii 4’ Pseudochromosomes

From the 222 genome sequence scaffolds over 50 kbp in length anchored to the FV×FB genetic map, a total of 10,071 SSRs were identified, 7321 (72.7%) of which were dinucleotide and 2744 (27.3%) of which were trinucleotide, with a repeat length equal to or in excess of 12 and 6, respectively. Simple sequence repeat distribution across the seven FvH4 pseudochromosomes was random with no clustering observed in relation to physical distance (data not shown). There was a strong correlation (R² = 0.923) between pseudochromosome physical length and number of SSRs per pseudochromosome, with an average of one SSR every 19,518 nucleotides. Simple sequence repeat motifs were present in the genome at different frequencies. Within the dinucleotides, the (AG)n motif was present at the highest frequency (59%), followed by (AT)n (37%) and (AC)n (4%) (Fig. 4a). The (CG)n motif was not found in the FvH4 genome sequence scaffolds with a repeat number n ≥ 12. Within the trinucleotides, the (AAG)n motif was present at the highest frequency (33%), followed by (AAT)n (21%) and (ATG)n (10%), the other seven trinucleotide repeat motifs making up the remaining 36% (Fig. 4b).
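Motif frequencies such as these are normally reported per motif class, in which a motif, its rotations, and the rotations of its reverse complement are counted together, so that (AG)n, (GA)n, (CT)n, and (TC)n all fall into one class. The sketch below is an illustration of that grouping rather than MSATCOMMANDER's implementation, and its function names are hypothetical. It shows why there are four dinucleotide and ten trinucleotide classes, consistent with the three named plus "other seven" trinucleotide motifs above. It picks the alphabetically first member as the class label, which can differ from the label used in the text: the (ATG)n class is labeled ATC here.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif):
    """Alphabetically first rotation of a motif or of its reverse complement."""
    motif = motif.upper()
    revcomp = motif.translate(_COMPLEMENT)[::-1]
    rotations = [s[i:] + s[:i] for s in (motif, revcomp) for i in range(len(s))]
    return min(rotations)


def motif_classes(k):
    """All distinct motif classes of length k, excluding mononucleotide runs."""
    classes = set()
    for bases in product("ACGT", repeat=k):
        motif = "".join(bases)
        if len(set(motif)) > 1:
            classes.add(canonical_motif(motif))
    return sorted(classes)
```

Calling `motif_classes(2)` returns the four dinucleotide classes AC, AG, AT, and CG, and `motif_classes(3)` returns ten trinucleotide classes.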

Figure 4.

Simple sequence repeat (SSR) motifs identified in the Fragaria vesca L. ‘Hawaii 4’ (FvH4) pseudochromosomes showing frequency distributions of SSR repeat motifs within the FvH4 genome sequence scaffolds over 50 kbp in length. A) shows distribution of dinucleotide repeats and B) shows distribution of trinucleotide repeats.

Discussion



In this investigation we have developed and mapped a set of 152 novel polymorphic SSR markers from previously unmapped regions of the diploid Fragaria spp. genome, which will thus be useful for the continued development of genetic linkage maps in strawberry. Additionally, the mapping of these novel SSR markers has improved the precision of the anchoring of the FvH4 genome sequence assembly (Shulaev et al., 2011) through the location of a further 28.2 Mbp of sequence data to the diploid Fragaria spp. reference map FV×FB using the full mapping progeny (Ruiz-Rojas et al., 2010).

Due to their inherent transferability, codominant inheritance, and generally high levels of polymorphisms, SSRs have found enormous utility as genetic markers and have been used extensively as markers in numerous plant species including the development of reference maps for many rosaceous genera (Aranzana et al., 2003; Celton et al., 2009; Gisbert et al., 2009; Graham et al., 2004; Yamamoto et al., 2007). The SSR markers developed in this investigation mapped to discrete positions on the diploid Fragaria spp. reference map (Ruiz-Rojas et al., 2010) and permitted the precise anchoring of 28.2 Mbp more Fragaria spp. genome sequence data than had previously been anchored (Shulaev et al., 2011). In total, 90% of the genome sequence scaffolds over 50 kbp in length have now been anchored using STS markers in the full FV×FB progeny and a total of 97.6% are anchored either using a conventional- or bin-mapping strategy. The sequencing of the FvH4 genome was not performed as an end in itself, and it was envisaged that the resultant sequence would be utilized as a generic tool to assist the entire Fragaria spp. research community. Bin mapping, as performed by Shulaev et al. (2011), is a highly effective strategy for the placement of a large number of markers onto a linkage map with minimum experimental time and cost. However, the technique lacks precision and, while scaffolds anchored in this way were placed accurately in mapping bins on the FV×FB genetic map, the ordering of markers within the bins cannot be determined. Figure 3 displays the positions of scaffolds anchored in this investigation through the mapping of markers in the full FV×FB progeny compared to the placement of scaffolds using bin mapping by Shulaev et al. (2011). Mapping anchor markers in the full FV×FB progeny has increased the precision with which the scaffolds were anchored and therefore the utility of the anchored sequences for further research activities. Thus, the pseudochromosomes presented in this investigation will improve the utility and generic value of the FvH4 sequence as a reference tool for genomics studies within Fragaria spp. and between plant genera.

Marker systems exploiting single nucleotide polymorphisms, such as cleaved amplified polymorphic sequence (CAPS) markers (Ruiz-Rojas et al., 2010) and high resolution melting analyses, have found utility in other species including apple (Malus pumila Mill.), potato (Solanum tuberosum L.), and almond [Prunus dulcis (Mill.) D. A. Webb] (Chagné et al., 2008; De Koeyer et al., 2010; Wu et al., 2010), but such markers cannot be readily applied to the cultivated strawberry and its wild progenitors due to the complex polyploid nature of their genomes. The SSRs developed and utilized for scaffold anchoring in this investigation will be of great utility for the genetic characterization of and the continued development of genetic linkage maps for the octoploid cultivated strawberry F. ×ananassa (Sargent et al., 2009; van Dijk et al., 2010).

As an additional contribution to Fragaria spp. genomic information resources needed for mapping and other marker-dependent applications in strawberry, we have also surveyed the content and distribution of SSR repeats within the FvH4 genome sequence. As with other plant species, such as Arabidopsis thaliana (L.) Heynh., rice (Oryza sativa L.), almond (Prunus dulcis), peach [Prunus persica (L.) Batsch], and rose (Rosa spp.) (Jung et al., 2005; McCouch et al., 2002), the most abundant repeat motifs for di- and trinucleotide SSRs in the FvH4 genome sequence were (AG)n and (AAG)n, respectively. The distribution of repeat motifs for SSRs throughout the whole FvH4 genome sequence was similar to that found in ESTs sequenced from F. ×ananassa for dinucleotide repeats but differed for trinucleotide repeats, with the (AAT)n and (ATG)n repeats present in much lower frequencies in the ESTs sequenced and analyzed by Folta et al. (2005) using similar criteria to define the presence of di- and trinucleotide SSRs. In two more recent studies, however, Zhang and Deng (2010) found a similar percentage of (AAG)n trinucleotide repeats in SSRs found within the 17,565 Fragaria spp. ESTs deposited in the EMBL nucleotide sequence repository, with other motifs present at lower frequencies, and Bombarely et al. (2010) reported proportions of both di- and trinucleotide repeats within a collection of 4500 F. ×ananassa ESTs similar to those in this investigation, suggesting that SSR repeat motifs in Fragaria spp. are distributed in similar frequencies within the coding and noncoding portions of the genome.

Simple sequence repeat distribution on the FV×FB genetic map was nonrandom, with clustering apparent on all seven linkage groups (Fig. 1) (Ruiz-Rojas et al., 2010; Sargent et al., 2006). The random physical distribution of SSR loci across the FvH4 pseudochromosomes suggests that the clustering of markers on the FV×FB linkage map is due to recombination suppression in specific regions of the genome. Such recombination suppression is apparent on linkage maps of other species, including papaya (Carica papaya L.) and barley (Hordeum vulgare L.), and has been shown to be located within centromeric regions (Chen et al., 2007; Wenzl et al., 2006). The direct effects of centromeric regions on the suppression of recombination have been demonstrated by Lambie and Roeder (1986) and thus we postulate that the regions of distinct recombination suppression apparent on all seven FV×FB linkage groups may be the centromeric regions of the seven Fragaria spp. chromosomes. It cannot be ruled out, however, that the clustering, particularly where more than one distinct region was observed on a single linkage group as on FG3, might have been caused by other factors such as low homology between the interspecific chromosomes due to high sequence divergence between F. vesca and F. bucharica or genomic rearrangements in the evolution of the two species since they diverged from a common ancestor.

Using the 152 SSRs developed in this investigation and 42 previously published SSRs we have anchored 93 scaffolds that had not previously been located to precise but, in the majority of cases, unorientated positions using the full FV×FB mapping progeny and have created seven pseudochromosomes composed of 222 sequence scaffolds covering 197.7 Mbp. As greater numbers of plant genomes are sequenced (Kaul et al., 2000; Ming et al., 2008; Shulaev et al., 2011; Tuskan et al., 2006; Velasco et al., 2007, 2010; Vogel et al., 2010; Yu et al., 2002), it has become possible to perform whole genome synteny analyses at the DNA sequence level (Gar et al., 2011; Illa et al., 2011; Jung et al., 2010). Marker density on the FV×FB map presented is one marker every 1.09 cM and, thus, anchoring the majority of the FvH4 sequence using the full mapping progeny will permit such analyses to be performed for Fragaria spp. with far greater precision than could have been achieved if scaffolds had been located using bin mapping alone, where markers were assigned to 45 mapping bins with an average bin length of 12.6 cM (Sargent et al., 2008). The seven FvH4 pseudochromosomes presented in this report will also assist resequencing efforts and genomic investigations in the cultivated strawberry, F. ×ananassa, and its wild octoploid progenitors Fragaria chiloensis (L.) Mill. and Fragaria virginiana Mill.

Supplemental Information Available

Supplemental material is available free of charge at http://www.crops.org/publications/tpg.


We gratefully acknowledge the provision of HARRY PLOTTER by Marco Moretto (FEM-IASMA) and the assistance of Alastair Plant and Thomas Foreau in the development and testing of SSR primers identified using the GDR SSR miner. Fragaria genomics at East Malling Research is funded by the BBSRC and the East Malling Trust. This research was supported in part by USDA-CSREES National Research Initiative (NRI) Plant Genome Grant 2008-35300-04411 and New Hampshire Agricultural Experiment Station Project NH00535 (to TMD).




