 

The Plant Genome - Article

 

 

doi:10.3835/plantgenome2011.08.0023

An Improved Consensus Linkage Map of Barley Based on Flow-Sorted Chromosomes and Single Nucleotide Polymorphism Markers

  1. María Muñoz-Amatriaín,
  2. Matthew J. Moscou,
  3. Prasanna R. Bhat,
  4. Jan T. Svensson,
  5. Jan Bartoš,
  6. Pavla Suchánková,
  7. Hana Šimková,
  8. Takashi R. Endo,
  9. Raymond D. Fenton,
  10. Stefano Lonardi,
  11. Ana M. Castillo,
  12. Shiaoman Chao,
  13. Luis Cistué,
  14. Alfonso Cuesta-Marcos,
  15. Kerrie L. Forrest,
  16. Matthew J. Hayden,
  17. Patrick M. Hayes,
  18. Richard D. Horsley,
  19. Kihara Makoto,
  20. David Moody,
  21. Kazuhiro Sato,
  22. María P. Vallés,
  23. Brande B.H. Wulff,
  24. Gary J. Muehlbauer,
  25. Jaroslav Doležel and
  26. Timothy J. Close*
  1. M. Muñoz-Amatriaín and G.J. Muehlbauer, Dep. of Agronomy and Plant Genetics, Univ. of Minnesota, St. Paul, MN 55108; M.J. Moscou and B.B.H. Wulff, The Sainsbury Lab., Norwich Research Park, Norwich, NR4 7UH, UK; P.R. Bhat, J.T. Svensson, R.D. Fenton, and T.J. Close, Dep. of Botany and Plant Sciences, Univ. of California, Riverside, CA 92521; J.T. Svensson, Dep. of Plant Biology, Univ. of Copenhagen, DK-1871 Frederiksberg C, Denmark; J. Bartos, P. Suchánková, H. Šimková, and J. Doležel, Centre of the Region Haná for Biotechnological and Agricultural Research, Institute of Experimental Botany, Sokolovská 6, CZ-77200 Olomouc, Czech Republic; T.R. Endo, Lab. of Plant Genetics, Graduate School of Agriculture, Kyoto Univ., Kyoto, Japan; S. Lonardi, Dep. of Computer Science and Engineering, Univ. of California, Riverside, CA 92521; A.M. Castillo, L. Cistué, and M.P. Vallés, Departamento de Genética y Producción Vegetal, Estación Experimental Aula Dei, CSIC, 50059, Zaragoza, Spain; S. Chao, USDA-ARS, Biosciences Research Lab., Fargo, ND, 58105-5674; A. Cuesta-Marcos, P.M. Hayes, and L. Cistué, Dep. of Crop and Soil Science, Oregon State Univ., Corvallis, OR 97331; K. L. Forrest and M.J. Hayden, Dep. of Primary Industries Victoria, Victorian AgriBiosciences Centre, La Trobe R&D Park, Bundoora, VIC 3083, Australia; R.D. Horsley, Dep. of Plant Sciences, North Dakota State Univ., Fargo, ND 58108; K. Makoto, Bioresources Research and Development Dep., Sapporo Breweries Ltd., 37-1, Nittakizaki, Ota, Gunma 370-0393, Japan; D. Moody, InterGrain Pty Ltd, Kensington, WA 6100, Australia; K. Sato, Institute of Plant Science and Resources, Okayama Univ., Kurashiki, 710-0046, Japan. M. Muñoz-Amatriaín and M.J. Moscou contributed equally to this work

Abstract

Recent advances in high-throughput genotyping have made it easier to combine information from different mapping populations into consensus genetic maps, which provide increased marker density and genome coverage compared to individual maps. Previously, a single nucleotide polymorphism (SNP)-based genotyping platform was developed and used to genotype 373 individuals in four barley (Hordeum vulgare L.) mapping populations. This led to a 2943 SNP consensus genetic map with 975 unique positions. In this work, we add data from six additional populations and more individuals from one of the original populations to develop an improved consensus map from 1133 individuals. A stringent and systematic analysis of each of the 10 populations was performed to achieve uniformity. This involved reexamination of the four populations included in the previous map. As a consequence, we present a robust consensus genetic map that contains 2994 SNP loci mapped to 1163 unique positions. The map spans 1137.3 cM with an average density of one marker bin per 0.99 cM. A novel application of the genotyping platform for gene detection allowed the assignment of 2930 genes to flow-sorted chromosomes or arms, confirmed the position of 2545 SNP-mapped loci, added chromosome or arm allocations to an additional 370 SNP loci, and delineated pericentromeric regions for chromosomes 2H to 7H. Marker order has been improved and map resolution has been increased by almost 20%. These increased precision outcomes enable more optimized SNP selection for marker-assisted breeding and support association genetic analysis and map-based cloning. It will also improve the anchoring of DNA sequence scaffolds and the barley physical map to the genetic map.


Abbreviations

    BOPA, Barley oligonucleotide pool assay; DH, doubled haploid; EDTA, ethylenediaminetetraacetic acid; FC, Foster × CIho 4196; HA, Haruna Nijo × Akashinriki; HO, Haruna Nijo × OHU602; ID, Igri × Dobla; LG, linkage group; MB, Morex × Barke; MH, Mikamo Golden × Harrington; OPA, oligonucleotide pool assay; OWB, Oregon Wolfe Barley; OWBA.C., anther culture-derived Oregon Wolfe Barley; OWBH.b., Oregon Wolfe Barley created with the Hordeum bulbosum-based approach; POPA, pilot oligonucleotide pool assay; RIL, recombinant inbred line; SI, signal intensity; SM, Steptoe × Morex; SNP, single nucleotide polymorphism; VB, Vlamingh × Buloke; WBTAL, wheat–barley ditelosomic addition line; WGS, whole-genome shotgun

Barley (Hordeum vulgare L.) is cultivated worldwide due to its adaptation to cold, drought, salinity, and alkaline conditions. It ranks fourth among cereals in terms of total production and area of cultivation (FAOSTAT, 2011). Its main uses are as animal feed and in the malting and brewing industries, although benefits for human health have reignited interest in barley as a food (Baik and Ullrich, 2008). Since barley is a true diploid, it represents an attractive genomic model for other Triticeae species such as wheat (Triticum aestivum L.). The importance of barley in agriculture and its position as a model species for genetic studies led the barley research community to form the International Barley Sequencing Consortium with the goal of sequencing the >5000 Mb highly repetitive barley genome (Schulte et al., 2009; International Barley Sequencing Consortium, 2011). A combination of map-based sequencing and whole-genome shotgun (WGS) sequencing strategies is being followed to unveil the barley genome. Although significant achievements have been accomplished (Mayer et al., 2011; Sato et al., 2011b; Schulte et al., 2011), more effort and possibly new technologies will be required to overcome the challenges associated with this large genome.

Genetic linkage maps are crucial for a variety of studies including quantitative trait loci identification, marker-assisted selection, association mapping, comparative genomics, and map-based cloning. At the same time, high-fidelity and dense genetic maps help in the genetic anchoring of physical maps and in the ordering and orientation of WGS scaffolds. Recently, high-throughput, robust molecular marker technologies have been developed that have resulted in more densely populated genetic maps. As a result, these dense genetic maps have increased occurrence of common markers between individual maps. This has allowed the integration of individual linkage maps into consensus genetic maps, which enable the mapping of an increased number of loci and facilitate the use of markers across different germplasm. Several consensus maps of barley have been published combining information of a minimum of three to a maximum of seven mapping populations and different types of markers (restriction fragment length polymorphism, simple sequence repeat, single nucleotide polymorphism [SNP], diversity array technology, and amplified fragment length polymorphism) (Marcel et al., 2007; Rostoks et al., 2005; Stein et al., 2007; Varshney et al., 2007; Wenzl et al., 2006).

The high-throughput SNP genotyping platform developed by Close et al. (2009) was a major step forward in the development of high-density genetic maps for barley. This platform consists of three pilot oligonucleotide pool assays (POPA1, POPA2, and POPA3) containing 4596 SNPs, which resulted in two final Barley oligonucleotide pool assays (BOPA1 and BOPA2) of 3072 SNPs. In the same work, the authors constructed a consensus genetic map with SNP data from genotyping four doubled haploid (DH) populations that involved a total of 373 lines. The resulting SNP consensus map contained 2943 loci grouped in 975 marker bins and covered a distance of 1099 cM (Close et al., 2009). Since its release, the barley SNP platform has been extensively used by the barley community to characterize germplasm collections (e.g., Cockram et al., 2010; Comadran et al., 2011; Hamblin et al., 2010) and for association mapping studies (e.g., Cuesta-Marcos et al., 2010; Massman et al., 2011; Ramsay et al., 2011; Roy et al., 2010). The premise of the present work was that incorporation of SNP genotyping data from additional populations would considerably improve the resolution and accuracy of the consensus map.

Historically, cytogenetic resources developed during the 20th century have provided a complementary approach to genetic mapping. The recent development of chromosome arm sorting (Suchánková et al., 2006) coupled with whole genome amplification (Šimková et al., 2008) has made the application of these resources to high-throughput genomics technologies possible. An example of this was the analysis of wheat–barley disomic chromosome addition lines (Islam, 1983) using the Affymetrix Barley1 GeneChip (Bilgic et al., 2007; Cho et al., 2006) for transcriptome analyses. A total of 1787 transcribed genes were mapped to chromosome 2H to 7H of barley (chromosome 1H wheat–barley disomic chromosome addition line is not available) (Cho et al., 2006) and later, 1257 were mapped to chromosome arms using a similar approach (Bilgic et al., 2007). As sufficient DNA from individual chromosomes and arms can now be isolated, it is feasible to apply these cytogenetic resources to the same high-throughput genotyping technologies used in genetic mapping. This approach has many strengths, such as not depending on gene expression and the ability to map genes with little or no sequence variation. It also provides complementary mapping information to genetic map-based mapping and allows for the definition of pericentromeric regions if wheat–barley ditelosomic addition lines (WBTALs) are used.

Here, we reexamined the SNP genotyping calls from the original four populations included in the previous SNP consensus map (Close et al., 2009) and added data from another six mapping populations and additional individuals from one of the original populations. For most of the markers two independent lines of direct evidence, genetic mapping and flow sorting, supported the map positions. Cumulatively, we present a robust SNP-based consensus genetic map that incorporates marker data from 1133 individuals. Due to both its higher number of bins and improved marker order, the consensus map developed in this study constitutes a significant achievement in support of SNP selection for marker-assisted breeding, association genetic analysis, map-based cloning, and anchoring DNA sequence scaffolds and a physical map to the genetic map.


MATERIALS AND METHODS

Mapping Populations

A total of 10 mapping populations were used in this study, including nine segregating populations of DH lines and one recombinant inbred line (RIL) population. Four of those DH populations (Oregon Wolfe Barley [OWB] created with the Hordeum bulbosum L.-based approach [OWBH.b.], Steptoe × Morex [SM] 1 [SM1], Morex × Barke [MB], and Haruna Nijo × OHU602 [HO]) were used previously to construct a SNP consensus map (Close et al., 2009). The six additional populations were anther culture-derived OWB (OWBA.C.), Haruna Nijo × Akashinriki (HA), Mikamo Golden × Harrington (MH), Vlamingh × Buloke (VB), and Igri × Dobla (ID) DH populations and a Foster × CIho 4196 (FC) RIL population. These six populations had been previously developed in the context of other studies and some of them have been published (Cistué et al., 2011; Horsley et al., 2006; Sato et al., 2011a; Sato and Takeda, 2009). An additional 57 individuals from the SM population were also used and, together with the SM1 lines, made up population SM2. More details on the 10 populations and the additional individuals included in the SM population SM2 are given in Table 1.


Table 1. Information on the individual mapping population data used for consensus map construction.

| Population | Abbreviation | Growth habit† | Type‡ | No. of lines (before curation) | No. of lines (after curation) | No. of SNPs§ assayed | No. and percentage of polymorphic SNPs | Reference |
|---|---|---|---|---|---|---|---|---|
| Oregon Wolfe Barley created with the Hordeum bulbosum L.-based approach | OWBH.b. | S × S | DH | 93 | 82 | 4596 | 1469 (32%) | Costa et al., 2001; BarleyWorld, 2006 |
| Steptoe × Morex 1 | SM1¶ | S × S | DH | 93 | 92 | 4596 | 1215 (26%) | Kleinhofs et al., 1993; USDA-ARS, 2011b |
| Morex × Barke | MB | S × S | DH | 94 | 93 | 4596 | 1574 (34%) | N. Stein, personal communication, 2010 |
| Haruna Nijo × OHU602 | HO | S × W | DH | 100 | 94 | 1536 | 759 (49%) | Sato and Takeda, 2009 |
| Steptoe × Morex 2 | SM2 | S × S | DH | 150 | 146 | 3060 | 835 (27%) | Kleinhofs et al., 1993; USDA-ARS, 2011b |
| Anther culture-derived Oregon Wolfe Barley | OWBA.C. | S × S | DH | 94 | 93 | 3072 | 1271 (41%) | Cistué et al., 2011 |
| Haruna Nijo × Akashinriki | HA | S × S | DH | 68 | 54 | 1536 | 734 (48%) | Sato et al., 2011a |
| Mikamo Golden × Harrington | MH | S × S | DH | 95 | 91 | 1536 | 491 (32%) | Zhou et al., 2011 |
| Vlamingh × Buloke | VB | S × S | DH | 347 | 289 | 1536 | 440 (29%) | D. Moody, unpublished data, 2010 |
| Foster × CIho 4196 | FC | S × S | F8–9 RIL | 94 | 89 | 1524 | 409 (27%) | Horsley et al., 2006 |
| Igri × Dobla | ID | W × F | DH | 106 | 102 | 1536 | 446 (29%) | M.P. Vallés, unpublished data, 2010 |
| Total | | | | 1250 | 1133 | | | |

†Growth habit of the parental genotypes used for the cross (S, spring barley; W, winter barley; F, facultative).
‡DH, doubled haploid; RIL, recombinant inbred line.
§SNP, single nucleotide polymorphism.
¶SM1 is a subset of SM2.

Single Nucleotide Polymorphism Genotyping and Data Analysis

The high-throughput SNP-genotyping platform developed by Close et al. (2009) was used to genotype all the populations included in this study. Three of the populations (OWBH.b., SM1, and MB) were genotyped with the three pilot Illumina (San Diego, CA) GoldenGate oligonucleotide pool assays (POPA1, POPA2, and POPA3), which involved 4596 SNPs (Table 1). Additional individuals of the SM population were genotyped only with POPA1 and POPA2 (3060 markers) and were considered, together with SM1, as an individual population for mapping purposes (SM2). Foster × CIho 4196 was genotyped only with the 1524 SNPs represented on POPA1. Highly informative SNPs represented in these three POPAs were used to generate two BOPAs (BOPA1 and BOPA2), which were used to genotype the OWBA.C. population, with a total of 3072 assayed markers (Table 1). The rest of the populations (HO, HA, MH, VB, and ID) were genotyped only with BOPA1 (1536 SNPs).

Visualization and analysis of SNP data was performed using BeadStudio software (Illumina, 2008). Every SNP data cluster for each population was manually inspected to apply an accurate and consistent clustering method. Uniform criteria for inclusion or exclusion of SNPs were applied to all the populations and, as a consequence, some cases of apparent polymorphism were not used. Typical reasons for exclusion of an apparently polymorphic data clustering pattern included (i) homozygote clusters that are insufficiently separated (theta compressed) to readily distinguish heterozygotes from homozygotes in different germplasm, (ii) clusters that are vertically but not horizontally separated, which we found from the use of flow sorted chromosomes usually to be attributed to polymorphism in a locus different from the targeted SNP, and (iii) excessive dispersion of subclusters within an apparent homozygous cluster, which often manifested as segregation distortion but could be explained by signal interference from a different locus. In cases of minor doubts about the reliability of a SNP, we took advantage of barley–rice (Oryza sativa L.) synteny, viewed using HarvEST:Barley (HarvEST, 2011), to decide whether or not the cluster settings would cause the marker to map to a locus that seemed sensible in the context of synteny. Once this initial BeadStudio analysis was performed, genotyping data from each population were exported from the software for subsequent data processing.

Construction of the Individual Maps

Marker data were curated to identify and remove identical individuals and to exclude monomorphic or highly segregation-distorted markers. Individuals with a high number of heterozygous SNP loci and/or producing “No Calls” at an excessive number of SNPs were also detected and removed from the analysis. In general, these types of issues can be attributed to poor quality DNA samples, cross contamination between DNA samples, or intercrossing between lines. Individuals carrying nonparental alleles were also discarded; such individuals must represent errors in propagation, outcrossing, or DNA sample preparation. Command-line MSTMAP v4.3 (Wu et al., 2008; Wu, 2008a), which efficiently builds genetic maps by computing the minimum spanning tree of a graph associated with the genotyping data, was used to generate individual genetic maps for the 11 datasets (the 10 populations, with SM1 and SM2 mapped separately). The settings were a cut-off p-value of 0.000001, a maximum distance between markers of 15.0 cM, no estimation before clustering, and the COUNT objective function. Genetic distances were estimated using the Kosambi function (Kosambi, 1943).
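For readers less familiar with the Kosambi function named above, the following minimal Python sketch shows the standard conversion from a recombination fraction r to a map distance, d = 25 ln[(1 + 2r)/(1 − 2r)] cM. The function name and the example value are illustrative only and are not taken from MSTMAP.

```python
import math


def kosambi_cm(r: float) -> float:
    """Convert a recombination fraction r (0 <= r < 0.5) to centimorgans
    using the Kosambi (1943) mapping function."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    # d (Morgans) = 1/4 * ln((1 + 2r) / (1 - 2r)); multiply by 100 for cM.
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Hypothetical example: 10% recombinants between two markers.
distance = kosambi_cm(0.10)
```

For small r the result is close to 100r cM, and it grows faster than r as r approaches 0.5, reflecting the allowance the function makes for crossover interference.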

Linkage groups (LGs) were assigned to chromosomes based on the previous Close et al. (2009) map. As a fairly stringent p-value cut off was used, several maps have two or more LGs that assign to the same chromosome. In this case, LGs were merged based on the ordering from the Close et al. (2009) map and confirmed for order and orientation based on two-point linkage analysis from MadMapper (Kozik, 2006; West et al., 2006). The MadMapper software was also helpful to visualize and validate all 11 genetic maps as well as to identify double recombinants. All double recombination events were inspected and only those supported by several markers or that were preceded and/or succeeded by long genetic distances were considered as “real,” while double recombinants for singleton markers not involving large genetic distances were called as missing data.

Pilot oligonucleotide pool assay names were used to designate SNP loci in the maps (e.g., 1_0894), where the first number corresponds to the POPA number and the next four digits indicate the SNP order in the corresponding POPA. A cross reference between alternative marker names is included in Supplemental Table S1.
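The naming convention described above can be handled programmatically. The short Python sketch below splits a name such as 1_0894 into its POPA number and SNP order; the function name is hypothetical, and the pattern assumes exactly the format given in the text (POPA number 1 to 3, underscore, four digits).

```python
import re


def parse_snp_name(name: str) -> tuple[int, int]:
    """Split a POPA-style SNP name into (POPA number, SNP order in that POPA)."""
    match = re.fullmatch(r"([1-3])_(\d{4})", name)
    if match is None:
        raise ValueError(f"not a POPA-style SNP name: {name!r}")
    return int(match.group(1)), int(match.group(2))


popa, order = parse_snp_name("1_0894")  # POPA1, SNP number 894
```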

Construction of the Consensus Map

All 11 genetic maps were used to generate a consensus genetic map using MergeMap v1.2 (Wu et al., 2011; Wu, 2008b), a software based on graph theory wherein individual maps are converted into directed acyclic graphs that are then merged into a consensus graph on the basis of their shared vertices (Jackson et al., 2005; Jackson et al., 2008; Yap et al., 2003). Equal weight was given to all genetic maps (weight = 1.0). MergeMap implements an efficient algorithm for resolving conflicts in the marker order among individual maps by deleting the smallest set of marker occurrences (Wu et al., 2011). In the case of equal probability of deletion among maps, we manually inspected the quality of each marker in conflict and assigned a higher weight to the most reliable maps. This only occurred once on chromosome 5H, where SM2 was given priority over SM1 due to the greater number of lines in SM2 (54 additional lines).
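The graph-based idea can be illustrated with a small Python sketch. It is not MergeMap and does not reproduce its conflict-resolution algorithm; it only shows, under simplifying assumptions, how ordered marker bins from several maps can be turned into directed edges, combined, and checked for order conflicts (which appear as cycles). The function name and marker names are hypothetical.

```python
from collections import defaultdict, deque


def merge_orders(maps):
    """maps: list of maps; each map is a list of bins; each bin is a list of
    marker names. Markers within a bin are treated as unordered.

    Returns (order, conflicting): a consensus order of the markers that can
    be placed, and the set of markers caught in an order conflict."""
    successors = defaultdict(set)
    markers = set()
    for bins in maps:
        for marker_bin in bins:
            markers.update(marker_bin)
        # Every marker in a bin precedes every marker in the next bin.
        for left, right in zip(bins, bins[1:]):
            for a in left:
                successors[a].update(right)

    indegree = {m: 0 for m in markers}
    for a in successors:
        for b in successors[a]:
            indegree[b] += 1

    # Kahn's topological sort; markers left over lie on or behind a cycle.
    queue = deque(sorted(m for m in markers if indegree[m] == 0))
    order = []
    while queue:
        m = queue.popleft()
        order.append(m)
        for b in sorted(successors[m]):
            indegree[b] -= 1
            if indegree[b] == 0:
                queue.append(b)
    return order, markers - set(order)


map_1 = [["A"], ["B"], ["C"]]
map_2 = [["A"], ["C"], ["D"]]
order, conflicting = merge_orders([map_1, map_2])
```

When two maps disagree on the order of a pair of markers, the combined graph contains a cycle and those markers are reported as conflicting. MergeMap goes further by deleting the smallest set of marker occurrences needed to remove such conflicts, which this sketch does not attempt.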

The current implementation of MergeMap (Wu et al., 2011; Wu, 2008b) inflates genetic distances between markers in the consensus genetic map. Previously, Close et al. (2009) used the arithmetic mean of individual LGs to determine an appropriate scaling factor for each LG. Here, we compared the genetic distances between consecutive markers in individual genetic maps to the same genetic distance as estimated in the consensus genetic map. The most stable estimate was found by dividing the arithmetic mean of these genetic distances in individual genetic maps by that of the consensus genetic map, with a scaling factor of 0.612 ± 0.062.
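The scaling calculation described above amounts to a ratio of two means. The Python sketch below illustrates it with made-up distances; the function names and data are hypothetical, and the sketch does not show how the reported uncertainty (± 0.062) was obtained, since that is not specified here.

```python
from statistics import mean


def scaling_factor(pairs):
    """pairs: list of (individual_map_cm, consensus_map_cm) distances for the
    same consecutive marker pairs. Returns mean(individual) / mean(consensus)."""
    individual = [d_ind for d_ind, _ in pairs]
    consensus = [d_cons for _, d_cons in pairs]
    return mean(individual) / mean(consensus)


def rescale(positions_cm, factor):
    """Apply a single scaling factor to inflated consensus map positions."""
    return [p * factor for p in positions_cm]


# Hypothetical distances (cM) between consecutive markers.
example_pairs = [(1.0, 2.0), (3.0, 4.0)]
factor = scaling_factor(example_pairs)
scaled = rescale([0.0, 10.0, 25.0], factor)
```

Taking the ratio of means, rather than the mean of per-interval ratios, keeps very short intervals (where small absolute errors give large ratios) from dominating the estimate.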

Plant Material for Flow-Sorted Chromosomes

Seeds of WBTAL (21″ + t″) carrying chromosome arms 2HS, 2HL, 3HS, 3HL, 4HS, 4HL, 5HS, 5HL, 6HS, 6HL, 7HS and 7HL were obtained from the collection maintained at Kyoto University, Japan.

Preparation of Material for Chromosome Sorting

Chromosome preparation and sorting was performed according to Suchánková et al. (2006). Briefly, metaphase cells were accumulated by treatment of root tips with 2 mM hydroxyurea (18 h), recovery in hydroxyurea-free medium (6.5 h), treatment with 2.5 μM amiprophos-methyl (2 h), and overnight incubation in ice cold water. Chromosomes were released by mechanical homogenization after mild formaldehyde fixation (for details see Vrána et al. [2000]). Chromosome suspensions were stained by 2 μg ml−1 DAPI (4′,6-diamidino-2-phenylindole) and analyzed using a FACS Vantage SE flow cytometer (Becton Dickinson, San Jose, CA). Preparation of material for chromosome 1H was reported previously by Šimková et al. (2008). Chromosome arms were sorted from corresponding WBTAL at a quantity of 25,000 each and placed into 20 μL of double distilled H2O in a 0.5 mL polymerase chain reaction tube.

Amplification of Chromosomal Arm DNA

Flow-sorted arms were processed and amplified according to Šimková et al. (2008). Chromosome arms were treated with proteinase K (3 μg per 25,000 arms) for 36 h at 50°C in 70 μL (chromosomes) or 90 μL (arms) of buffer consisting of 2.5 mM Tris (pH 8.0), 1.25 mM ethylenediaminetetraacetic acid (EDTA) (pH 8.0), and 0.125% (w/v) sodium dodecyl sulfate. Half of the original amount of proteinase K was added after 20 h. Proteinase K was removed using a Microcon YM-100 column (Millipore Corporation, Bedford, MA) in four rounds of centrifugation (for details see Šimková et al. [2008]). Chromosomal DNA was amplified using an illustra GenomiPhi V2 DNA Amplification Kit (GE Healthcare, Chalfont St. Giles, UK) in 20 μL reaction for 90 min according to manufacturer's instructions. Amplified DNA was lyophilized and subsequently diluted to a final volume of 100 μL by 10 mM Tris-HCl and 0.1 mM EDTA (pH 8.0). 50 μL were then purified using MicroSpin G50 columns (GE Healthcare).

Assignment of Genes to Chromosomes and Arms using Flow-Sorted Material

Flow-sorted chromosome 1H or arms, following amplification and purification by gel filtration, were applied to BOPA1 and BOPA2 to determine the location of each gene. Two independently prepared samples were used as replicates for all samples except 2HS, 3HL, and 5HS, for which only a single sample of each was applied to BOPA1. The location of each gene was determined by comparing the signal intensities (SIs) from all flow-sorted samples to the SIs from barley genomic DNA samples as positive controls (Morex, Betzes, and Akcent) and to negative controls, either salmon (Oncorhynchus keta) sperm DNA for BOPA1 or Escherichia coli DNA for BOPA2. The proportion of flow-sorted chromosomes or arms in each sample was adjusted by mixing with negative control DNA to achieve two to three times the relative concentration as would be in complete barley genome DNA. The final total DNA concentration was 80 ng μL−1, of which 250 ng was applied to the GoldenGate assay. The BeadStudio software (Illumina, 2008) for BOPA1 (data generated fall 2007) and Genome Studio (Illumina, 2010) for BOPA2 (data generated winter 2010) were used to cluster the data points. In general the data could be partitioned into signals that clustered with the positive or negative controls (Supplemental Fig. S1). In nearly all cases the data clusters required manual adjustment because the default clustering algorithm is intended to first seek heterozygotes and then identify homozygotes, whereas in our case the distinction was simply gene presence or absence in the DNA sample. For some SNPs the data could not be adequately partitioned into gene-negative or gene-positive clusters, and in these cases the data were not used for further analysis. The SNP locus was assigned to a chromosome or arm if all replicate samples for that arm or chromosome provided the same interpretation; otherwise the SNP locus was considered to be unassigned.
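The replicate-agreement rule in the last sentence can be written compactly. The Python sketch below is one possible reading, not the procedure actually scripted for this study: each sample carries one presence/absence call per replicate, and a locus is assigned to a sample only when every replicate is gene-positive. The function name, data layout, and sample names are assumptions made for illustration.

```python
def assign_locus(calls):
    """calls maps a sample name (e.g. "2HS") to a list of per-replicate calls:
    True if the data point clustered with the positive controls,
    False if it clustered with the negative controls.

    Returns the sorted list of samples the locus is assigned to; an empty
    list means the locus is left unassigned."""
    assigned = []
    for sample, replicates in sorted(calls.items()):
        # Require every replicate of this sample to indicate gene presence.
        if replicates and all(replicates):
            assigned.append(sample)
    return assigned


# Hypothetical calls for one SNP locus across three arm samples.
example = {"2HS": [True, True], "2HL": [False, False], "3HS": [True, False]}
location = assign_locus(example)
```

In this example the locus is assigned to 2HS only; 3HS is ignored because its two replicates disagree.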


RESULTS AND DISCUSSION

Data Analysis and Curation

To achieve uniformity in the analysis of each individual population contributing to the consensus map, we reexamined the previous genotyping data corresponding to the OWBH.b., SM1, MB, and HO populations. Due to the more stringent criteria for SNP inclusion in the present work (see Materials and Methods), fewer polymorphic SNPs were considered for individual map construction compared to Close et al. (2009) (Table 1). In particular, 5.9, 4.3, and 4.7% of the polymorphic markers included in the previous OWBH.b., SM1, and MB genetic maps, respectively, were not included in the present study. In contrast, for the HO population, the number of SNPs considered was increased by 3.4% with respect to the previous study, due to the addition of genotypic data from five more lines of this population (Table 1). The removal of less-reliable markers from these populations was intended to help reduce conflicts in the marker order among component maps and therefore assist in the construction of a consensus map (Jackson et al., 2008). The loss of markers in some individual maps was sometimes compensated by their presence in other maps such that in total, 116 markers that were included in the Close et al. (2009) consensus map were not included in the new consensus map produced in this work. A list of those 116 markers, along with their consensus map LG positions and neighboring markers, is provided in Supplemental Table S2.

The same criteria were followed for inspecting the SNP data corresponding to the six additional populations included in this work and the 54 additional lines from the Steptoe × Morex population that had been genotyped with a subset of POPA markers. Although this cannot result in a higher number of mapped markers, their inclusion increased the number of recombination events in the SM population and hence the marker resolution. We were also able to use a new OWB population of 94 lines developed by anther culture (OWBA.C.) (Cistué et al., 2011), which, as expected due to its high degree of phenotypic variation (Costa et al., 2001), contributed a high number of polymorphic SNPs (1271; Table 1). A high percentage of polymorphism (48%; Table 1) was also found in the Japanese HA population developed by Sato et al. (2011a) from crossing the malting cv. Haruna Nijo with the food landrace Akashinriki. The fact that both parents were expressed sequence tag donors from which oligonucleotide pool assay (OPA)-SNPs were identified increased the likelihood of the platform to detect polymorphisms (Sato et al., 2011a). The other four new mapping populations, which included parents from Japan (Mikamo Golden), Australia (Vlamingh and Buloke), the United States (Harrington, Foster, and CIho 4196), and Europe (Igri and Dobla), had lower numbers of polymorphic SNPs (Table 1). Their lower polymorphism rate was probably due to the similarity of the parental genotypes or perhaps their absence from the SNP discovery panel (Close et al., 2009; Moragues et al., 2010).

Single nucleotide polymorphism data were examined afterward to identify identical individuals as well as problematic lines. With the development of high-throughput genotyping technologies, it is becoming easier to detect identical lines in mapping populations, resulting in removal of redundant genotyping information that can cause bias in the linkage analysis. To identify duplicate lines, we compared the genotype calls between all pairs of individuals. The presence of 11 and 14 duplicated individuals had been observed previously in the OWBH.b. and HA mapping populations, respectively (Chutimanitsakun et al., 2011; Sato et al., 2011a). In addition, we found and removed one duplicated individual from the OWBA.C., SM1, and ID populations, two duplicated lines from SM2, MH, and FC, five duplicated individuals from HO, and 42 duplicated lines from VB. Lines with an excessive number of heterozygous SNP calls and/or “No Calls” were also identified and removed from the data set, in particular one line from both MB and HO populations, two lines from the SM2 and MH populations, three individuals from FC and ID, and 16 lines from the VB population. Table 1 shows the final numbers of lines from each population that were considered for further analysis while the specific lines removed from each population can be found in Supplemental Table S3.
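The pairwise comparison used to find duplicate lines can be sketched in a few lines of Python. This is an illustration under assumptions, not the script used in this study: genotypes are coded as "A" or "B" with None for missing calls, and two lines are flagged when their shared non-missing calls agree at or above a hypothetical identity threshold.

```python
from itertools import combinations


def find_duplicates(genotypes, min_identity=1.0):
    """genotypes: {line_name: list of calls ("A", "B", or None if missing)},
    with all lists in the same marker order.

    Returns pairs of line names whose shared non-missing calls are identical
    in at least min_identity of the compared markers."""
    duplicates = []
    for name_1, name_2 in combinations(sorted(genotypes), 2):
        shared = [
            (a, b)
            for a, b in zip(genotypes[name_1], genotypes[name_2])
            if a is not None and b is not None
        ]
        if not shared:
            continue  # nothing to compare
        identity = sum(a == b for a, b in shared) / len(shared)
        if identity >= min_identity:
            duplicates.append((name_1, name_2))
    return duplicates


# Hypothetical calls for three lines at four markers.
lines = {
    "DH_01": ["A", "B", "A", "B"],
    "DH_02": ["A", "B", "A", "B"],
    "DH_03": ["B", "B", "A", "A"],
}
pairs = find_duplicates(lines)
```

Lowering min_identity slightly below 1.0 would tolerate occasional genotyping errors between true duplicates, at the cost of a higher risk of flagging closely related but distinct lines.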

Generation of Individual Linkage Maps

After curation of the SNP data, we constructed component maps from the 11 high-quality datasets. We chose the software tool MSTMAP (Wu et al., 2008; Wu, 2008a) to develop all the individual genetic maps due to its good performance compared to other available tools, especially in the speed and accuracy of map construction (Cheema and Dicks, 2009). The resulting linkage maps were also compared with those produced by JoinMap 4.0 (Van Ooijen, 2006). Maps generated by both programs were identical in marker order, probably due to the quality of our genotyping data, although MSTMAP assembled maps significantly faster than JoinMap. Since not all the individual maps had the same SNP coverage, we preferred not to force the number of LGs to match the number of chromosomes and to use a set of stringent parameters with MSTMAP, taking advantage of the wealth of genetic map information to link the disjointed LGs. Specifically, we used the Close et al. (2009) consensus map to join and orientate LGs.

All constructed individual maps were then validated by visualizing with CheckMatrix from MadMapper (West et al., 2006; Kozik, 2006). First, we confirmed the high quality of the genetic maps by generating two-dimensional heat plots, which show all pairwise recombination values for nonredundant markers. An example of a heat map from one of the individual maps is shown in Supplemental Fig. S2. Second, we generated a graphical genotyping plot from each map to easily identify all double crossovers. Double recombination events can be real or indicative of genotyping errors. We manually inspected all double recombinants that were not supported by large centimorgan distances between markers. In total, 98 singletons were replaced with missed calls. Most of these were identified in FC (49) and VB (34) mapping populations, probably because of their lower marker density compared to other maps, although in the case of FC some of these 49 double crossovers might be real, given the higher opportunity for recombination of RILs than DHs. We preferred to be conservative and err on the side of caution. The remaining rare singletons occurred in SM1, SM2, OWBA.C., HA, MH, and ID mapping populations.
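The handling of singleton double recombinants was done by manual inspection, but the underlying rule can be sketched in Python. The distance threshold below (max_gap_cm) is a hypothetical parameter, since no numerical cut-off is given here, and the function name and data are illustrative. A call is replaced with a missing value when a single marker differs from both of its neighbors and neither flanking interval is long.

```python
def mask_singletons(calls, positions_cm, max_gap_cm=5.0):
    """calls: genotype calls ("A", "B", or None) for one line, in map order.
    positions_cm: map position of each marker, same order.

    Returns a copy of calls in which isolated single-marker double
    recombinants flanked by short intervals are set to None."""
    masked = list(calls)
    for i in range(1, len(calls) - 1):
        left, middle, right = calls[i - 1], calls[i], calls[i + 1]
        if None in (left, middle, right):
            continue
        is_singleton = left == right and middle != left
        # A long interval on either side makes a real double crossover plausible.
        both_gaps_short = (
            positions_cm[i] - positions_cm[i - 1] <= max_gap_cm
            and positions_cm[i + 1] - positions_cm[i] <= max_gap_cm
        )
        if is_singleton and both_gaps_short:
            masked[i] = None
    return masked


# Hypothetical line with one suspicious call at the third marker.
cleaned = mask_singletons(["A", "A", "B", "A", "A"], [0.0, 1.0, 2.0, 3.0, 4.0])
```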

We generated each of the 11 component maps from the filtered genotype datasets, and both the individual maps and the genotyping data used for their construction are presented in Supplemental Table S4. A total of four markers could not be placed into individual genetic maps: marker 3_1434 from OWBH.b., marker 2_0029 from MB, and markers 1_0739 and 1_0780 from FC. The rest of the SNPs were distributed among the seven barley chromosomes in each of the component maps (Table 2), with average densities ranging from one SNP per 2.52 cM in the MB genetic map to one marker per 5.02 cM in the ID genetic map. Genetic map sizes varied among the different populations, from 956.6 cM for FC to 1257.8 cM for OWBA.C. (Table 2).
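The density figures quoted above can be reproduced from the totals in Table 2 if density is read as map length divided by the number of intervals between marker bins, that is, the number of bins minus one per chromosome. This reading is an inference from the numbers rather than a formula stated in the text, and the function name below is illustrative.

```python
def cm_per_interval(total_cm, total_bins, n_chromosomes=7):
    """Average map distance per interval between adjacent marker bins.

    Each chromosome with k bins has k - 1 intervals, so a map with
    total_bins bins over n_chromosomes has total_bins - n_chromosomes."""
    return total_cm / (total_bins - n_chromosomes)


mb_density = round(cm_per_interval(1089.2, 439), 2)  # MB totals from Table 2
id_density = round(cm_per_interval(1120.5, 230), 2)  # ID totals from Table 2
```

With the MB and ID totals from Table 2 this gives 2.52 and 5.02 cM, matching the values in the text.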


Table 2. Distribution of single nucleotide polymorphism loci in the individual component maps and the consensus map.

| Map† | Characteristic | 1H | 2H | 3H | 4H | 5H | 6H | 7H | All |
|---|---|---|---|---|---|---|---|---|---|
| OWBH.b. | Markers | 153 | 225 | 240 | 206 | 262 | 191 | 191 | 1468 |
| | Bins | 64 | 71 | 89 | 57 | 86 | 59 | 64 | 490 |
| | cM‡ | 154.0 | 179.3 | 199.3 | 123.4 | 229.0 | 150.8 | 195.3 | 1231.1 |
| SM1 | Markers | 139 | 211 | 233 | 125 | 217 | 119 | 171 | 1215 |
| | Bins | 46 | 57 | 62 | 48 | 78 | 41 | 56 | 388 |
| | cM | 138.9 | 146.6 | 154.7 | 141.5 | 187.3 | 151.5 | 140.8 | 1061.3 |
| MB | Markers | 208 | 259 | 236 | 142 | 285 | 205 | 238 | 1573 |
| | Bins | 60 | 72 | 77 | 41 | 74 | 51 | 64 | 439 |
| | cM | 133.7 | 151.9 | 178.0 | 138.2 | 194.7 | 133.8 | 158.9 | 1089.2 |
| HO | Markers | 97 | 135 | 128 | 97 | 116 | 98 | 88 | 759 |
| | Bins | 47 | 63 | 59 | 48 | 63 | 44 | 47 | 371 |
| | cM | 141.4 | 166.5 | 158.0 | 126.9 | 182.6 | 122.0 | 185.4 | 1082.8 |
| SM2 | Markers | 99 | 154 | 162 | 84 | 149 | 75 | 112 | 835 |
| | Bins | 50 | 61 | 69 | 49 | 80 | 36 | 59 | 404 |
| | cM | 133.8 | 136.9 | 151.2 | 127.3 | 179.9 | 158.3 | 134.8 | 1022.2 |
| OWBA.C. | Markers | 134 | 189 | 207 | 179 | 224 | 174 | 164 | 1271 |
| | Bins | 70 | 78 | 76 | 65 | 72 | 51 | 73 | 485 |
| | cM | 178.1 | 185.1 | 198.9 | 151.1 | 189.9 | 139.5 | 215.2 | 1257.8 |
| HA | Markers | 86 | 125 | 120 | 100 | 127 | 88 | 88 | 734 |
| | Bins | 34 | 46 | 35 | 26 | 52 | 31 | 35 | 259 |
| | cM | 155.6 | 180.6 | 149.4 | 100.4 | 190.1 | 132.4 | 157.0 | 1065.5 |
| MH | Markers | 49 | 82 | 99 | 55 | 65 | 76 | 65 | 491 |
| | Bins | 26 | 39 | 34 | 26 | 35 | 28 | 33 | 221 |
| | cM | 122.1 | 152.9 | 154.0 | 117.9 | 161.8 | 103.6 | 150.1 | 962.4 |
| VB | Markers | 32 | 94 | 41 | 76 | 89 | 42 | 66 | 440 |
| | Bins | 26 | 72 | 34 | 55 | 58 | 29 | 56 | 330 |
| | cM | 145.0 | 197.8 | 180.7 | 145.3 | 207.5 | 137.9 | 198.7 | 1212.9 |
| FC | Markers | 51 | 86 | 65 | 48 | 52 | 50 | 57 | 409 |
| | Bins | 33 | 45 | 38 | 33 | 30 | 29 | 33 | 241 |
| | cM | 126.2 | 167.3 | 146.2 | 113.6 | 139.8 | 102.4 | 161.0 | 956.6 |
| ID | Markers | 50 | 71 | 95 | 49 | 73 | 68 | 40 | 446 |
| | Bins | 27 | 39 | 38 | 30 | 39 | 35 | 22 | 230 |
| | cM | 147.5 | 191.3 | 148.4 | 136.2 | 165.1 | 132.4 | 199.6 | 1120.5 |
| Mean | cM | 143.3 | 167.7 | 165.4 | 129.2 | 184.3 | 133.1 | 172.4 | 1095.5 |
| Consensus | Markers | 345 | 491 | 487 | 359 | 540 | 357 | 415 | 2994 |
| | Bins | 145 | 191 | 180 | 155 | 198 | 126 | 168 | 1163 |
| | cM | 143.2 | 172.9 | 180.1 | 146.5 | 189.9 | 142.2 | 162.5 | 1137.3 |

†FC, Foster × CIho 4196; HA, Haruna Nijo × Akashinriki; HO, Haruna Nijo × OHU602; ID, Igri × Dobla; MB, Morex × Barke; MH, Mikamo Golden × Harrington; OWBA.C., anther culture-derived Oregon Wolfe Barley; OWBH.b., Oregon Wolfe Barley created with the Hordeum bulbosum-based approach; SM1, Steptoe × Morex 1; SM2, Steptoe × Morex 2; VB, Vlamingh × Buloke.
‡Centimorgans estimated using the Kosambi function.

More loci exhibiting segregation distortion were detected in the OWBA.C., HA, and VB genetic maps than in the other maps, but segregation distortion loci were present in almost every population, and regions affected by distortion were not always coincident among individual maps (Supplemental Fig. S3). It is unclear whether the method of population development (RIL, or doubled haploidization via H. bulbosum, anther culture, or microspore culture) is associated with a greater degree of segregation distortion.

Development of an Integrated Consensus Map

Individual genetic maps were merged into a consensus map using MergeMap (Wu et al., 2011; Wu, 2008b), a freely available software tool that implements an algorithm based on graph theory (Jackson et al., 2005, 2008; Yap et al., 2003) to integrate linkage maps. Although JoinMap (Van Ooijen, 2006) has been one of the most commonly used software packages for building consensus maps, MergeMap outperforms JoinMap in marker order accuracy and speed of operation (Wang et al., 2011; Wu et al., 2011) and has been successfully used to generate previous SNP consensus maps of barley (Close et al., 2009) and cowpea [Vigna unguiculata (L.) Walp.] (Muchero et al., 2009). Given the high number of individual maps and the differences in population size, which affect the accuracy of marker positioning, a few ordering conflicts were found in all chromosomes except 6H and 7H. MergeMap resolved most of these conflicts by deleting the smallest set of marker occurrences necessary to remove them. However, in chromosome 5H there was a case of equal probability of marker removal between the two maps in conflict (SM1 and SM2). We therefore used the option that the software offers of assigning "weights" to individual maps (Wu et al., 2011) to give priority to the marker order of SM2, because of the greater number of lines in this population (Table 1). Since genetic distances in the consensus map were expanded relative to the individual maps, which is an algorithmic anomaly of the coordinate system used in MergeMap, chromosomal lengths were normalized after consensus map construction (see Materials and Methods).
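To make the notion of an ordering conflict concrete, the following minimal sketch lists pairs of shared markers whose relative order is reversed between two component maps of the same chromosome. It is not the MergeMap algorithm, which operates on directed graphs and removes a minimal set of marker occurrences; it is only a pairwise check written for illustration. The function name `order_conflicts` and the marker names are hypothetical.

```python
from itertools import combinations

def order_conflicts(map_a, map_b):
    """List marker pairs whose order is strictly reversed between two maps.

    `map_a` and `map_b` map marker name -> position (cM) on the same
    chromosome. Markers that cosegregate (same position) in either map
    carry no order information and are never reported as conflicts.
    """
    shared = sorted(set(map_a) & set(map_b))
    conflicts = []
    for m1, m2 in combinations(shared, 2):
        delta_a = map_a[m1] - map_a[m2]
        delta_b = map_b[m1] - map_b[m2]
        if delta_a * delta_b < 0:  # opposite signs: order is reversed
            conflicts.append((m1, m2))
    return conflicts

# Hypothetical positions for four markers in two component maps.
toy_a = {"m1": 0.0, "m2": 1.5, "m3": 1.5, "m4": 4.0}
toy_b = {"m1": 0.0, "m2": 3.0, "m3": 1.0, "m4": 2.0}
toy_conflicts = order_conflicts(toy_a, toy_b)
```

Here m2 and m3 share a bin in the first map, so their order in the second map is not a conflict, whereas m2 and m4 are reversed.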

Although a comparison of the consensus genetic map with the individual component maps showed good consistency in locus order among the populations, four markers mapped twice in the consensus map because of their different chromosomal positions in the component maps. In particular, markers 1_0349 and 1_0716 mapped on both 1H and 3H, marker 2_1055 mapped on 1H and 6H, and 2_0029 was found to map on both 5H and 6H (Supplemental Table S5). Map data from flow-sorted chromosomes (see below) were used to manually curate two of these markers (1_0349 and 2_0029), for which the 3H and 6H map positions were retained, respectively, and the second position was removed. Removal of one of the map positions for SNPs 1_0716 and 1_1055 was based on population consistency and rice synteny, with the 1H and 6H map positions retained for these two markers, respectively.

The resulting consensus genetic map contained 2994 SNP loci in 1163 marker bins (unique loci) in an aggregate map size of 1137 cM (Table 2; Supplemental Table S6), providing an average marker bin density of 0.99 cM. The map has only one large gap, of 11 cM, in the long arm of chromosome 4H (Fig. 1; Supplemental Table S6); the remaining gaps are smaller than 5 cM. Most of the additional individual populations were genotyped with only a subset of the available OPA-SNP markers (Table 1), which limited the mapping of new markers on the consensus map. Even so, 167 new SNPs were placed into the new consensus map compared with the Close et al. (2009) map, most of them mapping to chromosomes 2H, 3H, and 4H (Supplemental Table S7). This is a net increase of 51 SNPs after subtracting the 116 SNPs that were used previously but are not included in the new map (Table 3). The resolution of the consensus linkage map was clearly higher with the inclusion of the seven additional individual maps, as shown by the increased number of bins in all chromosomes, with a total increase in map resolution of almost 20% (Table 3). Chromosomes 1H, 4H, and 6H had the smallest number of markers and bins and were also the smallest in size (Table 2). Close et al. (2009) reported similar results, although we were able to increase both the number of markers and the marker resolution for these three linkage groups, especially chromosome 4H (Table 3). In general, small rearrangements were observed when comparing the two consensus maps, with the largest near pericentromeric regions. Since both maps were generated using the same software, these differences mainly reflect the greater resolution of the current SNP consensus map as a result of the addition of populations that have informative recombination events between closely linked markers or in regions with little recombination.
This improved consensus genetic map is publicly available at HarvEST:Barley (version 1.82 and higher; HarvEST, 2011) and GrainGenes 2.0 (USDA-ARS, 2011a).
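The comparison with the Close et al. (2009) map can be checked with a short calculation. The sketch below uses only the bin and SNP counts reported in Table 3; the variable and function names are illustrative and not part of any published tool.

```python
# Bin counts per chromosome, taken from Table 3.
BINS_2009 = {"1H": 125, "2H": 161, "3H": 152, "4H": 113,
             "5H": 180, "6H": 111, "7H": 133}
BINS_NEW = {"1H": 145, "2H": 191, "3H": 180, "4H": 155,
            "5H": 198, "6H": 126, "7H": 168}

def percent_increase(old, new):
    """Relative increase from `old` to `new`, in percent."""
    return 100.0 * (new - old) / old

# Per-chromosome and overall gain in the number of bins (map resolution).
per_chromosome = {c: percent_increase(BINS_2009[c], BINS_NEW[c]) for c in BINS_2009}
total_increase = percent_increase(sum(BINS_2009.values()), sum(BINS_NEW.values()))
largest_gain = max(per_chromosome, key=per_chromosome.get)

# Net change in SNP number: 167 newly placed SNPs minus 116 SNPs dropped.
net_snp_gain = 167 - 116
```

With these inputs the total number of bins rises from 975 to 1163, an increase of about 19.3%, the largest per-chromosome gain is on 4H, and the net SNP gain of 51 equals the difference between the two map totals (2994 and 2943).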


Table 3. Comparison of the new single nucleotide polymorphism (SNP)-based consensus genetic map with the Close et al. (2009) consensus map.

                       Chromosome  Close et al. (2009)  New consensus map
Number of individuals              373                  1133
Number of SNPs         1H          341                  345
                       2H          485                  491
                       3H          475                  487
                       4H          338                  359
                       5H          535                  540
                       6H          352                  357
                       7H          417                  415
                       Total       2943                 2994
Number of bins         1H          125                  145
                       2H          161                  191
                       3H          152                  180
                       4H          113                  155
                       5H          180                  198
                       6H          111                  126
                       7H          133                  168
                       Total       975                  1163
Figure 1. The improved single nucleotide polymorphism-based consensus map of barley. The consensus genetic map is represented as concentric circles, from 1H (innermost) to 7H (outermost). Chromosomes are anchored based on their respective pericentromeric regions determined with flow-sorted chromosome arms (except for 1H, where homology to rice was used). Circles represent the number of markers at each position, with both size and color representing the total number of markers (darkest = 53; lightest = 1). A scale bar representing 5 cM is shown as concentric arcs beside each chromosome.

 

To assess the impact of individual populations on the integrated map, we performed a leave-one-out analysis in the construction of the consensus genetic map. As shown in Table 4, the MB population had the greatest impact on the number of markers, with a 9.0% reduction in the number of mapped SNPs if it is removed from the integrated map. This can be attributed to the application of the three POPAs for genotyping this population and the use of both cv. Morex and Barke in the design of the GoldenGate assays (Close et al., 2009). As expected, leaving out SM2 had no impact on marker number, since SM2 markers are included in SM1. Of the new populations used in this study, FC added the most markers to the updated consensus map (1.4%; Table 4). Regarding contribution to consensus map resolution, OWBH.b. had the highest impact on the number of unique bins, followed by OWBA.C. (6.5 and 5.2% bin reduction, respectively; Table 4), which reveals the importance of including this anther culture-derived population in the development of the new SNP-based consensus map. The inclusion of VB, FC, and the additional individuals of the SM population (SM2) also had a relevant influence on map resolution, with the numbers of marker bins reduced by 3.9, 2.4, and 2.1%, respectively, in their absence (Table 4). Surprisingly, the exclusion of HA increased the number of bins by 0.9% (11 bins; Table 4), which is likely a direct result of the small population size (54 lines). Despite this negative impact on map resolution, the HA population increased the number of SNP markers on the consensus map by 0.6% (Table 4), which led us to keep it as a component population of the integrated consensus map.


Table 4. Leave-one-out population analysis for the consensus genetic map.

                      Consensus genetic map  Percent reduction in  Percent reduction in
Population left out†  Bins     Markers       the number of bins    marker inclusion
FC                    1135     2953          2.4                   1.4
HA                    1174     2977          –0.9                  0.6
HO                    1140     2983          2.0                   0.4
ID                    1151     2987          1.0                   0.2
MB                    1121     2726          3.6                   9.0
MH                    1153     2989          0.9                   0.2
OWBA.C.               1103     2990          5.2                   0.1
OWBH.b.               1087     2880          6.5                   3.8
SM1                   1150     2891          1.1                   3.4
SM2                   1139     2994          2.1                   0.0
VB                    1118     2989          3.9                   0.2
None                  1163     2994          NA                    NA

† Population left out of the construction of the consensus genetic map. FC, Foster × CIho 4196; HA, Haruna Nijo × Akashinriki; HO, Haruna Nijo × OHU602; ID, Igri × Dobla; MB, Morex × Barke; MH, Mikamo Golden × Harrington; OWBA.C., anther culture-derived Oregon Wolfe Barley; OWBH.b., Oregon Wolfe Barley created with the Hordeum bulbosum-based approach; SM1, Steptoe × Morex 1; SM2, Steptoe × Morex 2; VB, Vlamingh × Buloke.
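The percentages in Table 4 follow directly from the bin and marker counts. As a minimal sketch, assuming each percentage is the reduction relative to the full consensus map (1163 bins, 2994 markers), they can be recomputed as follows; the names used are illustrative only.

```python
FULL_BINS, FULL_MARKERS = 1163, 2994

# (bins, markers) in the consensus map when one population is left out (Table 4).
LEAVE_ONE_OUT = {
    "FC": (1135, 2953), "HA": (1174, 2977), "HO": (1140, 2983),
    "ID": (1151, 2987), "MB": (1121, 2726), "MH": (1153, 2989),
    "OWBA.C.": (1103, 2990), "OWBH.b.": (1087, 2880),
    "SM1": (1150, 2891), "SM2": (1139, 2994), "VB": (1118, 2989),
}

def percent_reduction(full, reduced):
    """Reduction relative to the full map, in percent (negative = increase)."""
    return 100.0 * (full - reduced) / full

# population -> (percent reduction in bins, percent reduction in markers)
reductions = {
    pop: (round(percent_reduction(FULL_BINS, bins), 1),
          round(percent_reduction(FULL_MARKERS, markers), 1))
    for pop, (bins, markers) in LEAVE_ONE_OUT.items()
}

# Populations ranked by their contribution to map resolution (bins).
ranked_by_bins = sorted(reductions, key=lambda pop: reductions[pop][0], reverse=True)
```

A negative value, as for HA, means that the consensus map has more bins when that population is excluded.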

For reference to previous data sets, historical markers were integrated into the SM and OWBH.b. individual populations (Costa et al., 2001; Kleinhofs et al., 1993) and the consensus map was reconstructed (Supplemental Table S8).

Gene Mapping Using Flow-Sorted Chromosomes and Arms

The BOPA1 and BOPA2 platforms were applied to amplified, flow-sorted material to roughly map genes to chromosome 1H and to the chromosome arms of 2H to 7H. For the purpose of anchoring markers to individual chromosomes or arms, SI was more important than the clustering results, as the chromosome location is independent of the allele. This approach was extremely robust: 2930 genes were mapped with BOPA1 and BOPA2, which represents 96.1% of the genes surveyed (Table 5). A total of 2560 genes were mapped with both flow-sorted chromosomes or arms and genetic maps, with an agreement of 99.4% (2545 genes) between mapping approaches. The clear correspondence between the numbers of genes mapped using the two approaches indicates no significant bias in mapping based on chromosome or arm. An additional 370 genes that were not genetically mapped in any of the 10 populations were mapped using flow-sorted materials (Supplemental Table S9).
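The comparison between the two mapping approaches amounts to a set comparison of gene assignments. The sketch below is an illustration under stated assumptions, not the procedure used to build Table 5: it assumes flow-sorted assignments are labels such as "2HS" or "1H" whose first two characters give the chromosome, and genetic map assignments are chromosome labels such as "2H". The function name and gene names are hypothetical.

```python
def compare_assignments(flow_sorted, genetic):
    """Summarise agreement between flow-sorted and genetic map assignments.

    `flow_sorted`: gene -> arm or chromosome label (e.g. "2HS", "1H").
    `genetic`:     gene -> chromosome label (e.g. "2H").
    """
    shared = set(flow_sorted) & set(genetic)
    # Compare at chromosome level: "2HS"[:2] == "2H".
    agree = {g for g in shared if flow_sorted[g][:2] == genetic[g]}
    return {
        "shared": len(shared),
        "agree": len(agree),
        "conflicts": sorted(shared - agree),
        "flow_only": len(set(flow_sorted) - set(genetic)),
        "agreement_pct": 100.0 * len(agree) / len(shared) if shared else float("nan"),
    }

# Toy data with hypothetical gene names.
toy_flow = {"geneA": "2HS", "geneB": "5HL", "geneC": "1H", "geneD": "3HL"}
toy_genetic = {"geneA": "2H", "geneB": "6H", "geneC": "1H"}
toy_summary = compare_assignments(toy_flow, toy_genetic)

# The counts reported in the text give the same kind of percentage:
reported_agreement_pct = 100.0 * 2545 / 2560  # about 99.4
reported_flow_only = 2930 - 2560              # 370 genes mapped only by flow sorting
```

In the toy data, geneB is the single conflict and geneD is mapped only with flow-sorted material.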


Table 5. Genes mapped using flow-sorted chromosomes or arms and genetic maps.

                    Method of mapping
                    Flow-sorted                       Unique to flow-sorted  Correspondence between
Chromosome  Arm     chromosomes or arms  Genetic map  chromosome or arms     flow sorting and genetic mapping
1H          NA      432                  345          116                    316
2H          2HS     153                  157          21                     129
            2HC†    32                   38           NA                     32
            2HL     275                  296          27                     248
            2HB‡    3                    NA           1                      2
3H          3HS     118                  128          12                     102
            3HC†    62                   69           0                      62
            3HL     270                  290          28                     243
4H          4HS     116                  104          18                     97
            4HC†    25                   30           0                      25
            4HL     202                  225          15                     186
5H          5HS     95                   85           12                     81
            5HC†    34                   43           NA                     34
            5HL     370                  412          45                     325
            5HB‡    14                   NA           1                      11
6H          6HS     113                  117          11                     101
            6HC†    48                   56           0                      48
            6HL     170                  184          17                     152
            6HB‡    1                    NA           0                      1
7H          7HS     174                  182          24                     149
            7HC†    51                   57           0                      51
            7HL     168                  176          22                     146
            7HB‡    4                    NA           0                      4
Total       NA      2930                 2994         370                    2545

† C indicates a gene is located in the pericentromeric region based on information from flow-sorted chromosomes and the consensus genetic map.
‡ B indicates a gene was detected on both the short and long chromosome arm.

An advantage of gene mapping with flow-sorted material is that it accurately determines the physical position of genes relative to the chromosome arm. Applying this mapping information from the arms of chromosomes 2H to 7H permits the definition of the pericentromeric region when coupled with the consensus genetic map. Bins in the genetic map were evaluated for an admixture of genes mapped to both the short and long chromosome arms. The pericentromeric region was defined as the set of bins still containing this mixed state of physically mapped genes from both arms. These regions are shown in Table 6 for chromosomes 2H to 7H. A unique characteristic of these regions is the significant increase in gene density, likely caused by a complete lack of recombination. This is clearly observed in Fig. 1, where pericentromeric regions were used to anchor chromosomes on the horizontal axis.
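The definition of the pericentromeric region can be sketched in a few lines. This is an illustration of the idea rather than the procedure actually used: it assumes each bin of the consensus map is represented by the set of arm labels ("S" for short, "L" for long) of the flow-sorted genes it contains, and it reports the span from the first to the last bin holding genes from both arms. The function name and the toy input are hypothetical.

```python
def pericentromeric_bins(bin_arms):
    """Return (first, last) 1-based bin numbers of the mixed-arm region.

    `bin_arms` is a list of sets ordered along the chromosome; each set
    holds the arm labels ("S", "L") of the flow-sorted genes in that bin.
    A bin is "mixed" when it contains genes from both arms. Returns None
    when no bin is mixed.
    """
    mixed = [number for number, arms in enumerate(bin_arms, start=1)
             if {"S", "L"} <= arms]
    if not mixed:
        return None
    return (mixed[0], mixed[-1])

# Toy chromosome: short-arm bins, a mixed region, then long-arm bins.
toy_bins = [{"S"}, {"S"}, {"S", "L"}, set(), {"S", "L"}, {"L"}, {"L"}]
toy_region = pericentromeric_bins(toy_bins)
```

For the toy chromosome the region spans bins 3 to 5; the empty set stands for a bin with no flow-sorted genes. A single mixed bin, as reported for 4H and 5H in Table 6, would give identical first and last bin numbers.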


Table 6. Pericentromeric regions of chromosomes 2H to 7H.

Chromosome  Bins    cM region
1H          NA      NA
2H          74–76   67.1–68.1
3H          56–59   65.3–67.9
4H          57      56.2
5H          39      42.4
6H          48–50   59.3–60.6
7H          86–88   80.3–81.8

Conflicts between consensus genetic map position and flow-sorted chromosome or arm were observed for 15 genes (Supplemental Table S5); these may be explained by low-level error in the mapping of genes with the flow-sorting approach or by the mapping of paralogs. Reevaluation of the 22 SNP markers detected on both the short and long arms of a chromosome (Supplemental Tables S5 and S6) found four in the pericentromeric region (Supplemental Table S5). Flow-sorted chromosome or arm mapping supported the improved quality of the SNPs included in the consensus genetic map. Of the 167 new SNPs included in this consensus genetic map, 86 were mapped using flow-sorted material. In contrast, of the 116 SNPs removed with respect to Close et al. (2009), only 32 were mapped using flow-sorted material. An attempt was made to allocate SNP loci to chromosomes using DNA isolated from disomic wheat–barley addition lines. However, a much higher incidence of marker position conflicts than with flow-sorted materials indicates that flow sorting with subsequent amplification is the more robust approach for OPA-based gene mapping (data not included).

Supplemental Information Available

Supplemental figures and tables associated with this manuscript are located at http://www.crops.org/publications/tpg.

Acknowledgments

This work was supported by the USA National Science Foundation (DBI-0321756 to TJC and SL), the Human Frontier Science Program (LT000218/2011-L to MJM) and Gatsby Charitable Foundation (MJM and BBHW), the Czech Ministry of Education Youth and Sports and the European Regional Development Fund (LC06004 and CZ.1.05/2.1.00/01.0007 to JD, JB, HŠ and PS), the Ministry of Science and Innovation of Spain (AGL2007-62930/AGR and GEN2006-28560), and the Agriculture and Food Research Initiative Plant Genome, Genetics and Breeding Program of USDA's Cooperative State Research and Extension Service (2009-65300-05645 to TJC, SL and GJM). Authors thank Dr. Adam J. Lukaszewski for seeds of wheat-barley addition lines.

 
