Genetic linkage maps are crucial for a variety of studies including quantitative trait loci identification, marker-assisted selection, association mapping, comparative genomics, and map-based cloning. At the same time, high-fidelity and dense genetic maps help in the genetic anchoring of physical maps and in the ordering and orientation of WGS scaffolds. Recently, high-throughput, robust molecular marker technologies have been developed that have resulted in more densely populated genetic maps. As a result, individual maps now share a greater number of common markers. This has allowed the integration of individual linkage maps into consensus genetic maps, which enable the mapping of an increased number of loci and facilitate the use of markers across different germplasm. Several consensus maps of barley have been published combining information from a minimum of three to a maximum of seven mapping populations and different types of markers (restriction fragment length polymorphism, simple sequence repeat, single nucleotide polymorphism [SNP], diversity array technology, and amplified fragment length polymorphism) (Marcel et al., 2007; Rostoks et al., 2005; Stein et al., 2007; Varshney et al., 2007; Wenzl et al., 2006).
The high-throughput SNP genotyping platform developed by Close et al. (2009) was a major step forward in the development of high-density genetic maps for barley. This platform consists of three pilot oligonucleotide pool assays (POPA1, POPA2, and POPA3) containing 4596 SNPs, which resulted in two final Barley oligonucleotide pool assays (BOPA1 and BOPA2) of 3072 SNPs. In the same work, the authors constructed a consensus genetic map with SNP data from genotyping four doubled haploid (DH) populations that involved a total of 373 lines. The resulting SNP consensus map contained 2943 loci grouped in 975 marker bins and covered a distance of 1099 cM (Close et al., 2009). Since its release, the barley SNP platform has been extensively used by the barley community to characterize germplasm collections (e.g., Cockram et al., 2010; Comadran et al., 2011; Hamblin et al., 2010) and for association mapping studies (e.g., Cuesta-Marcos et al., 2010; Massman et al., 2011; Ramsay et al., 2011; Roy et al., 2010). The premise of the present work was that incorporation of SNP genotyping data from additional populations would considerably improve the resolution and accuracy of the consensus map.
Historically, cytogenetic resources developed during the 20th century have provided a complementary approach to genetic mapping. The recent development of chromosome arm sorting (Suchánková et al., 2006) coupled with whole genome amplification (Šimková et al., 2008) has made the application of these resources to high-throughput genomics technologies possible. An example of this was the analysis of wheat–barley disomic chromosome addition lines (Islam, 1983) using the Affymetrix Barley1 GeneChip (Bilgic et al., 2007; Cho et al., 2006) for transcriptome analyses. A total of 1787 transcribed genes were mapped to chromosomes 2H to 7H of barley (a chromosome 1H wheat–barley disomic chromosome addition line is not available) (Cho et al., 2006) and later, 1257 were mapped to chromosome arms using a similar approach (Bilgic et al., 2007). As sufficient DNA from individual chromosomes and arms can now be isolated, it is feasible to apply these cytogenetic resources to the same high-throughput genotyping technologies used in genetic mapping. This approach has several strengths: it does not depend on gene expression and it can map genes with little or no sequence variation. It also complements genetic mapping and allows pericentromeric regions to be defined if wheat–barley ditelosomic addition lines (WBTALs) are used.
Here, we reexamined the SNP genotyping calls from the original four populations included in the previous SNP consensus map (Close et al., 2009) and added data from another six mapping populations and additional individuals from one of the original populations. For most of the markers two independent lines of direct evidence, genetic mapping and flow sorting, supported the map positions. Cumulatively, we present a robust SNP-based consensus genetic map that incorporates marker data from 1133 individuals. Due to both its higher number of bins and improved marker order, the consensus map developed in this study constitutes a significant achievement in support of SNP selection for marker-assisted breeding, association genetic analysis, map-based cloning, and anchoring DNA sequence scaffolds and a physical map to the genetic map.
MATERIALS AND METHODS
A total of 10 mapping populations were used in this study, including nine segregating populations of DH lines and one recombinant inbred line (RIL) population. Four of those DH populations (Oregon Wolfe Barley [OWB] created with the Hordeum bulbosum L.-based approach [OWBH.b.], Steptoe × Morex [SM] 1 [SM1], Morex × Barke [MB], and Haruna Nijo × OHU602 [HO]) were used previously to construct a SNP consensus map (Close et al., 2009). The six additional populations were anther culture-derived OWB (OWBA.C.), Haruna Nijo × Akashinriki (HA), Mikamo Golden × Harrington (MH), Vlamingh × Buloke (VB), and Igri × Dobla (ID) DH populations and a Foster × CIho 4196 (FC) RIL population. These six populations had been previously developed in the context of other studies and some of them have been published (Cistué et al., 2011; Horsley et al., 2006; Sato et al., 2011a; Sato and Takeda, 2009). An additional 57 individuals from the SM population were also used and, together with the SM1 lines, made up population SM2. More details on the 10 populations and the additional individuals included in the SM population SM2 are given in Table 1.
|Population||Abbreviation||Growth habit†||Type‡||No. of lines (before curation)||No. of lines (after curation)||No. of SNPs§ assayed||No. and percentage of polymorphic SNPs||Reference|
|Oregon Wolfe Barley created with the Hordeum bulbosum L.-based approach||OWBH.b.||S × S||DH||93||82||4596||1469 (32%)||Costa et al., 2001; BarleyWorld. 2006|
|Steptoe × Morex 1||SM1¶||S × S||DH||93||92||4596||1215 (26%)||Kleinhofs et al., 1993; USDA-ARS, 2011b|
|Morex × Barke||MB||S × S||DH||94||93||4596||1574 (34%)||N. Stein, personal communication, 2010|
|Haruna Nijo × OHU602||HO||S × W||DH||100||94||1536||759 (49%)||Sato and Takeda, 2009|
|Steptoe × Morex 2||SM2||S × S||DH||150||146||3060||835 (27%)||Kleinhofs et al., 1993; USDA-ARS, 2011b|
|Anther culture-derived Oregon Wolfe Barley||OWBA.C.||S × S||DH||94||93||3072||1271 (41%)||Cistué et al., 2011|
|Haruna Nijo × Akashinriki||HA||S × S||DH||68||54||1536||734 (48%)||Sato et al., 2011a|
|Mikamo Golden × Harrington||MH||S × S||DH||95||91||1536||491 (32%)||Zhou et al., 2011|
|Vlamingh × Buloke||VB||S × S||DH||347||289||1536||440 (29%)||D. Moody, unpublished data, 2010|
|Foster × CIho 4196||FC||S × S||F8–9 RIL||94||89||1524||409 (27%)||Horsley et al., 2006|
|Igri × Dobla||ID||W × F||DH||106||102||1536||446 (29%)||M.P. Vallés, unpublished data, 2010|
Single Nucleotide Polymorphism Genotyping and Data Analysis
The high-throughput SNP-genotyping platform developed by Close et al. (2009) was used to genotype all the populations included in this study. Three of the populations (OWBH.b., SM1, and MB) were genotyped with the three pilot Illumina (San Diego, CA) GoldenGate oligonucleotide pool assays (POPA1, POPA2, and POPA3), which involved 4596 SNPs (Table 1). Additional individuals of the SM population were genotyped only with POPA1 and POPA2 (3060 markers) and were considered, together with SM1, as an individual population for mapping purposes (SM2). Foster × CIho 4196 was genotyped only with the 1524 SNPs represented on POPA1. Highly informative SNPs represented in these three POPAs were used to generate two BOPAs (BOPA1 and BOPA2), which were used to genotype the OWBA.C. population, with a total of 3072 assayed markers (Table 1). The rest of the populations (HO, HA, MH, VB, and ID) were genotyped only with BOPA1 (1536 SNPs).
Visualization and analysis of SNP data were performed using BeadStudio software (Illumina, 2008). Every SNP data cluster for each population was manually inspected to apply an accurate and consistent clustering method. Uniform criteria for inclusion or exclusion of SNPs were applied to all the populations and, as a consequence, some cases of apparent polymorphism were not used. Typical reasons for exclusion of an apparently polymorphic data clustering pattern included (i) homozygote clusters that are insufficiently separated (theta compressed) to readily distinguish heterozygotes from homozygotes in different germplasm, (ii) clusters that are vertically but not horizontally separated, which the use of flow-sorted chromosomes showed can usually be attributed to polymorphism in a locus different from the targeted SNP, and (iii) excessive dispersion of subclusters within an apparent homozygous cluster, which often manifested as segregation distortion but could be explained by signal interference from a different locus. In cases of minor doubts about the reliability of a SNP, we took advantage of barley–rice (Oryza sativa L.) synteny, viewed using HarvEST:Barley (HarvEST, 2011), to decide whether or not the cluster settings would cause the marker to map to a locus that seemed sensible in the context of synteny. Once this initial BeadStudio analysis was performed, genotyping data from each population were exported from the software for subsequent data processing.
Construction of the Individual Maps
Marker data were curated to identify and remove identical individuals and to exclude monomorphic or highly segregation-distorted markers. Individuals with a high number of heterozygous SNP loci and/or producing “No Calls” at an excessive number of SNPs were also detected and removed from the analysis. In general, these types of issues can be attributed to poor quality DNA samples, cross contamination between DNA samples, or intercrossing between lines. Individuals carrying nonparental alleles were also discarded; such individuals must represent errors in propagation, outcrossing, or DNA sample preparation. Command-line MSTMAP v4.3 (Wu et al., 2008; Wu, 2008a), which efficiently builds genetic maps by computing the minimum spanning tree of a graph associated with the genotyping data, was used to generate individual genetic maps for the 11 populations, using a cut off p-value of 0.000001, maximum distance between markers at 15.0 cM, no estimation before clustering, and the COUNT objective function and with genetic distances estimated using the Kosambi function (Kosambi, 1943).
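As a point of reference for the distance estimates mentioned above, the Kosambi mapping function relates a recombination fraction r to a map distance d (in Morgans) by d = 0.25 ln[(1 + 2r)/(1 − 2r)]. The following is a minimal sketch of that conversion and its inverse; the function names are illustrative and are not part of MSTMAP.

```python
import math


def kosambi_cm(r: float) -> float:
    """Convert a recombination fraction r (0 <= r < 0.5) to Kosambi centimorgans."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5)")
    # d (Morgans) = 0.25 * ln((1 + 2r) / (1 - 2r)); multiply by 100 for cM.
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse mapping: Kosambi centimorgans back to a recombination fraction."""
    return 0.5 * math.tanh(d_cm / 50.0)
```

For small r the two quantities are nearly proportional (1% recombination is close to 1 cM), whereas the correction grows as r approaches 0.5.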
Linkage groups (LGs) were assigned to chromosomes based on the previous Close et al. (2009) map. As a fairly stringent p-value cut off was used, several maps have two or more LGs that assign to the same chromosome. In this case, LGs were merged based on the ordering from the Close et al. (2009) map and confirmed for order and orientation based on two-point linkage analysis from MadMapper (Kozik, 2006; West et al., 2006). The MadMapper software was also helpful to visualize and validate all 11 genetic maps as well as to identify double recombinants. All double recombination events were inspected and only those supported by several markers or that were preceded and/or succeeded by long genetic distances were considered as “real,” while double recombinants for singleton markers not involving large genetic distances were called as missing data.
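The treatment of singleton double recombinants described above can be illustrated with a short sketch. It is a simplified stand-in for the manual inspection performed with MadMapper, not the procedure itself: the allele coding ("A", "B", "-" for missing) and the `min_gap_cm` threshold are assumptions made for illustration, since no numerical cut-off was applied automatically in this study.

```python
def mask_singletons(calls, positions, min_gap_cm=10.0):
    """Replace isolated double recombinants with missing data.

    calls      -- allele calls for one line along a linkage group ("A", "B", "-")
    positions  -- map positions (cM) of the same markers, in map order
    min_gap_cm -- hypothetical threshold: a singleton flanked by a gap at least
                  this long on either side is kept as a possible real event
    """
    masked = list(calls)
    called = [i for i, c in enumerate(calls) if c != "-"]
    for k in range(1, len(called) - 1):
        left, mid, right = called[k - 1], called[k], called[k + 1]
        # A singleton differs from both nearest called neighbours,
        # which themselves agree with each other.
        if calls[left] == calls[right] != calls[mid]:
            left_gap = positions[mid] - positions[left]
            right_gap = positions[right] - positions[mid]
            if left_gap < min_gap_cm and right_gap < min_gap_cm:
                masked[mid] = "-"
    return masked
```

Double recombinants supported by two or more adjacent markers are left untouched, because the flanking calls no longer agree across a single marker.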
Pilot oligonucleotide pool assay names were used to designate SNP loci in the maps (e.g., 1_0894), where the first number corresponds to the POPA number and the next four digits indicate the SNP order in the corresponding POPA. A cross reference between alternative marker names is included in Supplemental Table S1.
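The naming convention can be made explicit with a small parser. This is only a sketch; the `PopaMarker` type and `parse_marker` function are hypothetical helpers, and the pattern assumes the three pilot assays (POPA1 to POPA3) and four-digit SNP orders described above.

```python
import re
from typing import NamedTuple


class PopaMarker(NamedTuple):
    popa: int   # pilot oligonucleotide pool assay number (1, 2, or 3)
    order: int  # SNP order within that assay


_NAME = re.compile(r"^([1-3])_(\d{4})$")


def parse_marker(name: str) -> PopaMarker:
    """Split a POPA-style locus name such as '1_0894' into its two parts."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a POPA-style marker name: {name!r}")
    return PopaMarker(int(match.group(1)), int(match.group(2)))
```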
Construction of the Consensus Map
All 11 genetic maps were used to generate a consensus genetic map using MergeMap v1.2 (Wu et al., 2011; Wu, 2008b), a software based on graph theory wherein individual maps are converted into directed acyclic graphs that are then merged into a consensus graph on the basis of their shared vertices (Jackson et al., 2005; Jackson et al., 2008; Yap et al., 2003). Equal weight was given to all genetic maps (weight = 1.0). MergeMap implements an efficient algorithm for resolving conflicts in the marker order among individual maps by deleting the smallest set of marker occurrences (Wu et al., 2011). In the case of equal probability of deletion among maps, we manually inspected the quality of each marker in conflict and assigned a higher weight to the most reliable maps. This only occurred once on chromosome 5H, where SM2 was given priority over SM1 due to the greater number of lines in SM2 (54 additional lines).
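The graph-based idea behind map merging can be sketched in a few lines. The example below is a deliberately simplified illustration, not MergeMap's algorithm: each map is reduced to an ordered list of marker bins, the bins are turned into "precedes" edges, and the union of all edges is sorted topologically. An ordering conflict between maps appears as a cycle. The sketch only detects such a conflict, whereas MergeMap goes further and resolves it by deleting marker occurrences. All function names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def merge_orders(maps):
    """Return one marker order consistent with every input map.

    maps -- list of maps, each a list of bins, each bin a list of marker names
    Raises CycleError when the maps disagree on marker order.
    """
    predecessors = {}
    for bins in maps:
        for markers in bins:
            for marker in markers:
                predecessors.setdefault(marker, set())
        # Every marker in a bin precedes every marker in the next bin.
        for earlier, later in zip(bins, bins[1:]):
            for marker in later:
                predecessors[marker].update(earlier)
    return list(TopologicalSorter(predecessors).static_order())


def has_conflict(maps):
    """True if the maps cannot be merged without removing marker occurrences."""
    try:
        merge_orders(maps)
    except CycleError:
        return True
    return False
```

Markers shared between maps act as the common vertices that tie the individual graphs together, which is why denser individual maps with more shared markers yield a better-determined consensus order.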
The current implementation of MergeMap (Wu et al., 2011; Wu, 2008b) inflates genetic distances between markers in the consensus genetic map. Previously, Close et al. (2009) used the arithmetic mean of individual LGs to determine an appropriate scaling factor for each LG. Here, we compared the genetic distances between consecutive markers in individual genetic maps to the same genetic distance as estimated in the consensus genetic map. The most stable estimate was found by dividing the arithmetic mean of these genetic distances in individual genetic maps by that of the consensus genetic map, with a scaling factor of 0.612 ± 0.062.
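The scaling computation can be sketched as follows, assuming each map is available as a dictionary of marker name to position in cM. The function names are illustrative, and details such as the handling of cosegregating markers are assumptions rather than a record of the exact procedure used here.

```python
from statistics import mean


def scaling_factor(individual_maps, consensus):
    """Ratio of mean adjacent-marker distance in individual maps to the
    distance between the same marker pairs in the consensus map.

    individual_maps -- list of dicts, marker name -> position (cM)
    consensus       -- dict, marker name -> position (cM) in the consensus map
    """
    individual_d, consensus_d = [], []
    for positions in individual_maps:
        ordered = sorted(positions, key=positions.get)
        for a, b in zip(ordered, ordered[1:]):
            if a in consensus and b in consensus:
                individual_d.append(abs(positions[b] - positions[a]))
                consensus_d.append(abs(consensus[b] - consensus[a]))
    if not consensus_d or mean(consensus_d) == 0:
        raise ValueError("no informative marker pairs shared with the consensus")
    return mean(individual_d) / mean(consensus_d)


def rescale(consensus, factor):
    """Apply the scaling factor to every consensus position."""
    return {marker: pos * factor for marker, pos in consensus.items()}
```

With the factor reported above, a consensus interval of 10 cM before normalization would correspond to about 6.1 cM afterward.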
Plant Material for Flow-Sorted Chromosomes
Seeds of WBTAL (21″ + t″) carrying chromosome arms 2HS, 2HL, 3HS, 3HL, 4HS, 4HL, 5HS, 5HL, 6HS, 6HL, 7HS and 7HL were obtained from the collection maintained at Kyoto University, Japan.
Preparation of Material for Chromosome Sorting
Chromosome preparation and sorting was performed according to Suchánková et al. (2006). Briefly, metaphase cells were accumulated by treatment of root tips with 2 mM hydroxyurea (18 h), recovery in hydroxyurea-free medium (6.5 h), treatment with 2.5 μM amiprophos-methyl (2 h), and overnight incubation in ice cold water. Chromosomes were released by mechanical homogenization after mild formaldehyde fixation (for details see Vrána et al. ). Chromosome suspensions were stained by 2 μg ml−1 DAPI (4′,6-diamidino-2-phenylindole) and analyzed using a FACS Vantage SE flow cytometer (Becton Dickinson, San Jose, CA). Preparation of material for chromosome 1H was reported previously by Šimková et al. (2008). Chromosome arms were sorted from corresponding WBTAL at a quantity of 25,000 each and placed into 20 μL of double distilled H2O in a 0.5 mL polymerase chain reaction tube.
Amplification of Chromosomal Arm DNA
Flow-sorted arms were processed and amplified according to Šimková et al. (2008). Chromosome arms were treated with proteinase K (3 μg per 25,000 arms) for 36 h at 50°C in 70 μL (chromosomes) or 90 μL (arms) of buffer consisting of 2.5 mM Tris (pH 8.0), 1.25 mM ethylenediaminetetraacetic acid (EDTA) (pH 8.0), and 0.125% (w/v) sodium dodecyl sulfate. Half of the original amount of proteinase K was added after 20 h. Proteinase K was removed using a Microcon YM-100 column (Millipore Corporation, Bedford, MA) in four rounds of centrifugation (for details see Šimková et al. ). Chromosomal DNA was amplified using an illustra GenomiPhi V2 DNA Amplification Kit (GE Healthcare, Chalfont St. Giles, UK) in 20 μL reaction for 90 min according to manufacturer's instructions. Amplified DNA was lyophilized and subsequently diluted to a final volume of 100 μL by 10 mM Tris-HCl and 0.1 mM EDTA (pH 8.0). 50 μL were then purified using MicroSpin G50 columns (GE Healthcare).
Assignment of Genes to Chromosomes and Arms using Flow-Sorted Material
Flow-sorted chromosome 1H or arms, following amplification and purification by gel filtration, were applied to BOPA1 and BOPA2 to determine the location of each gene. Two independently prepared samples were used as replicates for all samples except 2HS, 3HL, and 5HS, for which only a single sample of each was applied to BOPA1. The location of each gene was determined by comparing the signal intensities (SIs) from all flow-sorted samples to the SIs from barley genomic DNA samples as positive controls (Morex, Betzes, and Akcent) and to negative controls, either salmon (Oncorhynchus keta) sperm DNA for BOPA1 or Escherichia coli DNA for BOPA2. The proportion of flow-sorted chromosomes or arms in each sample was adjusted by mixing with negative control DNA to achieve two to three times the relative concentration as would be in complete barley genome DNA. The final total DNA concentration was 80 ng μL−1, of which 250 ng was applied to the GoldenGate assay. The BeadStudio software (Illumina, 2008) for BOPA1 (data generated fall 2007) and Genome Studio (Illumina, 2010) for BOPA2 (data generated winter 2010) were used to cluster the data points. In general the data could be partitioned into signals that clustered with the positive or negative controls (Supplemental Fig. S1). In nearly all cases the data clusters required manual adjustment because the default clustering algorithm is intended to first seek heterozygotes and then identify homozygotes whereas in our case the distinction was simply gene presence or absence in the DNA sample. For some SNPs the data could not be adequately partitioned into gene-negative or gene-positive clusters, and in these cases the data were not used for further analysis. The SNP locus was assigned to a chromosome or arm if all replicate samples for that arm or chromosome provided the same interpretation; otherwise the SNP locus was considered to be unassigned.
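The presence or absence logic can be illustrated with a small sketch. It is not the manual cluster adjustment performed in BeadStudio or Genome Studio: the relative-intensity rule, the `margin` parameter, and both function names are assumptions introduced only to make the decision steps concrete.

```python
from statistics import mean


def call_presence(si, positive_controls, negative_controls, margin=0.2):
    """Classify one signal intensity relative to the control samples.

    Returns True (gene present), False (gene absent), or None (ambiguous).
    `margin` is a hypothetical half-width of the ambiguous zone around the
    midpoint between the control means.
    """
    pos, neg = mean(positive_controls), mean(negative_controls)
    if pos <= neg:
        return None  # controls do not separate, so no call is possible
    relative = (si - neg) / (pos - neg)
    if relative >= 0.5 + margin:
        return True
    if relative <= 0.5 - margin:
        return False
    return None


def assign_arms(replicate_calls):
    """Arms on which every replicate calls the gene present.

    replicate_calls -- dict, arm name -> list of calls (True/False/None)
    Returns [] (unassigned) if the replicates of any arm disagree or are ambiguous.
    """
    for calls in replicate_calls.values():
        if None in calls or len(set(calls)) != 1:
            return []
    return sorted(arm for arm, calls in replicate_calls.items() if calls[0])
```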
RESULTS AND DISCUSSION
Data Analysis and Curation
To achieve uniformity in the analysis of each individual population contributing to the consensus map, we reexamined the previous genotyping data corresponding to the OWBH.b., SM1, MB, and HO populations. Due to the more stringent criteria for SNP inclusion in the present work (see Materials and Methods), fewer polymorphic SNPs were considered for individual map construction compared to Close et al. (2009) (Table 1). In particular, 5.9, 4.3, and 4.7% of the polymorphic markers included in the previous OWBH.b., SM1, and MB genetic maps, respectively, were not included in the present study. In contrast, for the HO population, the number of SNPs considered was increased by 3.4% with respect to the previous study, due to the addition of genotypic data from five more lines of this population (Table 1). The removal of less-reliable markers from these populations was intended to help reduce conflicts in the marker order among component maps and therefore assist in the construction of a consensus map (Jackson et al., 2008). The loss of markers in some individual maps was sometimes compensated by their presence in other maps such that in total, 116 markers that were included in the Close et al. (2009) consensus map were not included in the new consensus map produced in this work. A list of those 116 markers, along with their consensus map LG positions and neighboring markers, is provided in Supplemental Table S2.
The same criteria were followed for inspecting the SNP data corresponding to the six additional populations included in this work and the 54 additional lines from the Steptoe × Morex population that had been genotyped with a subset of POPA markers. Although this cannot result in a higher number of mapped markers, their inclusion increased the number of recombination events in the SM population and hence the marker resolution. We were also able to use a new OWB population of 94 lines developed by anther culture (OWBA.C.) (Cistué et al., 2011), which, as expected due to its high degree of phenotypic variation (Costa et al., 2001), contributed a high number of polymorphic SNPs (1271; Table 1). A high percentage of polymorphism (48%; Table 1) was also found in the Japanese HA population developed by Sato et al. (2011a) from crossing the malting cv. Haruna Nijo with the food landrace Akashinriki. The fact that both parents were expressed sequence tag donors from which oligonucleotide pool assay (OPA)-SNPs were identified increased the likelihood of the platform to detect polymorphisms (Sato et al., 2011a). The other four new mapping populations, which included parents from Japan (Mikamo Golden), Australia (Vlamingh and Buloke), the United States (Harrington, Foster, and CIho 4196), and Europe (Igri and Dobla), had lower numbers of polymorphic SNPs (Table 1). Their lower polymorphism rate was probably due to the similarity of the parental genotypes or perhaps their absence from the SNP discovery panel (Close et al., 2009; Moragues et al., 2010).
Single nucleotide polymorphism data were examined afterward to identify identical individuals as well as problematic lines. With the development of high-throughput genotyping technologies, it is becoming easier to detect identical lines in mapping populations, resulting in removal of redundant genotyping information that can cause bias in the linkage analysis. To identify duplicate lines, we compared the genotype calls between all pairs of individuals. The presence of 11 and 14 duplicated individuals had been observed previously in the OWBH.b. and HA mapping populations, respectively (Chutimanitsakun et al., 2011; Sato et al., 2011a). In addition, we found and removed one duplicated individual from the OWBA.C., SM1, and ID populations, two duplicated lines from SM2, MH, and FC, five duplicated individuals from HO, and 42 duplicated lines from VB. Lines with an excessive number of heterozygous SNP calls and/or “No Calls” were also identified and removed from the data set, in particular one line from both MB and HO populations, two lines from the SM2 and MH populations, three individuals from FC and ID, and 16 lines from the VB population. Table 1 shows the final numbers of lines from each population that were considered for further analysis while the specific lines removed from each population can be found in Supplemental Table S3.
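The two line-level checks described above, pairwise comparison of genotype calls and screening for excess heterozygous or missing calls, can be sketched as follows. The call coding ("A", "B", "H" for heterozygous, "-" for "No Call") and all three thresholds are hypothetical values chosen for illustration; they are not the criteria applied in this study.

```python
from itertools import combinations


def identity(calls_a, calls_b, missing="-"):
    """Fraction of identical calls among markers scored in both lines."""
    shared = [(a, b) for a, b in zip(calls_a, calls_b)
              if a != missing and b != missing]
    if not shared:
        return 0.0
    return sum(a == b for a, b in shared) / len(shared)


def find_duplicates(genotypes, threshold=0.99):
    """Pairs of lines whose calls agree at or above `threshold` (assumed value)."""
    return [(p, q) for p, q in combinations(sorted(genotypes), 2)
            if identity(genotypes[p], genotypes[q]) >= threshold]


def flag_poor_lines(genotypes, max_het=0.05, max_missing=0.10):
    """Lines with too many heterozygous ('H') or missing ('-') calls.

    Both limits are illustrative assumptions.
    """
    flagged = []
    for name, calls in genotypes.items():
        if not calls:
            continue
        het = calls.count("H") / len(calls)
        miss = calls.count("-") / len(calls)
        if het > max_het or miss > max_missing:
            flagged.append(name)
    return sorted(flagged)
```

In practice one member of each duplicated pair would be retained, which is why the number of lines removed as duplicates can be smaller than the number of lines involved in duplicated pairs.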
Generation of Individual Linkage Maps
After curation of the SNP data, we constructed component maps from the 11 high-quality datasets. We chose the software tool MSTMAP (Wu et al., 2008; Wu, 2008a) to develop all the individual genetic maps due to its good performance compared to other available tools, especially in the speed and accuracy of map construction (Cheema and Dicks, 2009). The resulting linkage maps were also compared with those produced by JoinMap 4.0 (Van Ooijen, 2006). Maps generated by both programs were identical in marker order, probably due to the quality of our genotyping data, although MSTMAP assembled maps significantly faster than JoinMap. Since not all the individual maps had the same SNP coverage, we preferred not to force the number of LGs to match the number of chromosomes and to use a set of stringent parameters with MSTMAP, taking advantage of the wealth of genetic map information to link the disjointed LGs. Specifically, we used the Close et al. (2009) consensus map to join and orientate LGs.
All constructed individual maps were then validated by visualizing with CheckMatrix from MadMapper (West et al., 2006; Kozik, 2006). First, we confirmed the high quality of the genetic maps by generating two-dimensional heat plots, which show all pairwise recombination values for nonredundant markers. An example of a heat map from one of the individual maps is shown in Supplemental Fig. S2. Second, we generated a graphical genotyping plot from each map to easily identify all double crossovers. Double recombination events can be real or indicative of genotyping errors. We manually inspected all double recombinants that were not supported by large centimorgan distances between markers. In total, 98 singletons were replaced with missed calls. Most of these were identified in FC (49) and VB (34) mapping populations, probably because of their lower marker density compared to other maps, although in the case of FC some of these 49 double crossovers might be real, given the higher opportunity for recombination of RILs than DHs. We preferred to be conservative and err on the side of caution. The remaining rare singletons occurred in SM1, SM2, OWBA.C., HA, MH, and ID mapping populations.
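The quantity displayed in such a heat plot is the matrix of pairwise recombination values. A minimal sketch of its computation for a DH population is given below, assuming each marker is stored as a string of "A", "B", or "-" calls across lines. This illustrates the principle only and does not reproduce the MadMapper implementation; for RILs, the observed fraction of recombinant lines would additionally need to be converted to a per-meiosis estimate.

```python
def recombination_fraction(marker_1, marker_2):
    """Fraction of DH lines carrying different parental alleles at two markers.

    Lines with a missing call ('-') at either marker are ignored.
    Returns None when no line is informative.
    """
    informative = [(a, b) for a, b in zip(marker_1, marker_2)
                   if a in ("A", "B") and b in ("A", "B")]
    if not informative:
        return None
    return sum(a != b for a, b in informative) / len(informative)


def pairwise_matrix(markers):
    """Square matrix of recombination fractions, in the order of `markers`.

    markers -- dict, marker name -> string of calls across lines (map order)
    """
    names = list(markers)
    return [[recombination_fraction(markers[i], markers[j]) for j in names]
            for i in names]
```

In a well-ordered map, values increase smoothly with distance from the diagonal; a misplaced marker stands out as a row and column that break this gradient.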
We generated each of the 11 component maps from the filtered genotype datasets, and both the individual maps and the genotyping data used for their construction are presented in Supplemental Table S4. A total of four markers could not be placed into individual genetic maps: marker 3_1434 from OWBH.b., marker 2_0029 from MB, and markers 1_0739 and 1_0780 from FC. The rest of the SNPs were distributed among the seven barley chromosomes in each of the component maps (Table 2), with average densities ranging from one SNP per 2.52 cM in the MB genetic map to one marker per 5.02 cM in the ID genetic map. Genetic map sizes varied among the different populations, from 954.1 cM for FC to 1257.8 cM for OWBA.C. (Table 2).
A higher number of loci exhibiting segregation distortion were detected in the OWBA.C., HA, and VB genetic maps, but segregation distortion loci were present in almost every population and regions affected by distortion were not always coincident among individual maps (Supplemental Fig. S3). It is unclear whether or not the method for population development (RIL or doubled haploidization via H. bulbosum or anther or microspore culture) is associated with a greater degree of segregation distortion.
Development of an Integrated Consensus Map
Individual genetic maps were merged into a consensus map using MergeMap (Wu et al., 2011; Wu, 2008b), a freely available software tool that implements an algorithm based on graph theory (Jackson et al., 2005, 2008; Yap et al., 2003) to integrate linkage maps. Although JoinMap (Van Ooijen, 2006) has been one of the most commonly used software packages for building consensus maps, MergeMap outperforms JoinMap in marker order accuracy and speed of operation (Wang et al., 2011; Wu et al., 2011) and has been successfully used to generate previous SNP consensus maps of barley (Close et al., 2009) and cowpea [Vigna unguiculata (L.) Walp.] (Muchero et al., 2009). Given the high number of individual maps and differences in population size, which affect the accuracy of the marker positioning, a few ordering conflicts were found in all chromosomes except 6H and 7H. MergeMap resolved most of these conflicts by deleting the smallest set of marker occurrences necessary to remove the conflicts. However, in chromosome 5H there was a case of equal probability of marker removal between the two maps in conflict (SM1 and SM2). We then used the option of assigning “weights” to individual maps that the software offers (Wu et al., 2011) to give priority to the marker order of SM2, due to the greater number of lines in this population (Table 1). Since genetic distances in the consensus map were expanded relative to the individual maps, which is an algorithmic anomaly of the coordinate system used in MergeMap, chromosomal lengths were normalized after consensus map construction (see Materials and Methods).
Although a comparison of the consensus genetic map to the individual component maps showed a good consistency in the locus order between the populations, a total of four markers were found to map twice in the consensus map due to their different chromosomal position in the component maps. In particular, markers 1_0349 and 1_0716 mapped on both 1H and 3H, marker 2_1055 mapped on 1H and 6H, and 2_0029 was found to map on both 5H and 6H (Supplemental Table S5). Map data from flow-sorted chromosomes (see below) were used to manually curate two of these markers (1_0349 and 2_0029), for which the 3H and 6H map positions were retained, respectively, while the second position was removed. Removal of one of the map positions for SNPs 1_0716 and 2_1055 was done based on population consistency and rice synteny, with the 1H and 6H map positions retained for these two markers, respectively.
The resulting consensus genetic map contained 2994 SNP loci in 1163 marker bins (unique loci) in an aggregate map size of 1137 cM (Table 2; Supplemental Table S6), providing an average marker bin density of 0.99 cM. The map has only one large gap of 11 cM in the long arm of chromosome 4H (Fig. 1; Supplemental Table S6) with the remaining gaps smaller than 5 cM. Although the genotyping of most of the additional individual populations with a subset of the total number of available OPA-SNP markers (Table 1) limited the mapping of new markers on the consensus map, 167 new SNPs were placed into the new consensus map compared to the Close et al. (2009) map, most of them mapping to chromosomes 2H, 3H, and 4H (Supplemental Table S7). This is an increase of 51 SNPs after subtracting 116 SNPs that were used previously but not included in the new map (Table 3). However, the resolution of the consensus linkage map was clearly higher with the inclusion of the seven additional individual maps, as shown by the increased number of bins in all chromosomes, with a total increase in map resolution of almost 20% (Table 3). Chromosomes 1H, 4H, and 6H had the smallest number of markers and bins and were also the smallest in size (Table 2). Close et al. (2009) also showed similar results, although we were able to increase both the number of markers and marker resolution for the three LGs, especially in chromosome 4H (Table 3). In general, small rearrangements were observed when comparing the two consensus maps, with the largest near pericentromeric regions. Since both maps were generated using the same software, these differences mainly reflect the greater resolution of the current SNP consensus map as a result of the addition of populations that have informative recombination events between closely linked markers or regions with little recombination. 
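The marker and bin accounting in this paragraph can be verified with a few lines of arithmetic, using only the counts reported here and for the Close et al. (2009) map. The variable names are illustrative.

```python
# Counts reported for the Close et al. (2009) map and for the present map.
previous_snps, previous_bins = 2943, 975
added_snps, removed_snps = 167, 116
current_bins = 1163

net_gain = added_snps - removed_snps       # net change in mapped SNPs
current_snps = previous_snps + net_gain    # SNP loci on the new consensus map

# Relative increase in the number of marker bins (map resolution).
bin_increase_pct = 100.0 * (current_bins - previous_bins) / previous_bins
```

The net gain of 51 SNPs gives 2994 loci, and the 188 additional bins correspond to an increase of about 19.3%, consistent with the "almost 20%" stated above.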
This improved consensus genetic map is publicly available at HarvEST:Barley (version 1.82 and higher; HarvEST, 2011) and GrainGenes 2.0 (USDA-ARS, 2011a).
|Close et al. (2009)||New consensus map|
|Number of individuals||373||1133|
|Number of SNPs||1H||341||345|
|Number of bins||1H||125||145|
To assess the impact of individual populations on the integrated map, we performed a leave-one-out analysis in the construction of the consensus genetic map. As shown in Table 4, the MB population had the greatest impact on the number of markers, with 9.0% reduction in the number of mapped SNPs if it is removed from the integrated map. This can be associated with the application of the three POPAs for genotyping this population and the use of both cv. Morex and Barke in the design of the GoldenGate assays (Close et al., 2009). As expected, leaving out SM2 had no impact on marker number, since SM2 markers are included in SM1. Of the new populations used in this study, FC added the most markers to the updated consensus map (1.4%; Table 4). Regarding their contribution to the consensus map resolution, OWBH.b. was the population that had the highest impact in the number of unique bins followed by OWBA.C. (6.5 and 5.2% bin reduction, respectively; Table 4), which reveals the importance of including this anther culture-derived population in the development of the new SNP-based consensus map. The inclusion of VB, FC, and the additional individuals of the SM population (SM2) also had a relevant influence on increasing map resolution, with numbers of marker bins reduced by 3.9, 2.4, and 2.1% due to their absence from the consensus map, respectively (Table 4). Surprisingly, the exclusion of HA increased the number of bins by 0.9% (9 bins) and is likely a direct result of the small population size (54 lines). Regardless of the negative impact on map resolution, the HA population contributed to an increase in the number of SNP markers on the consensus map by 0.6% (Table 4), which led us to keep it as a component population of the integrated consensus map.
|Population left out†||Consensus genetic map||Percent reduction in the number of bins||Percent reduction in marker inclusion|
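The leave-one-out procedure summarized in Table 4 can be outlined as below. Rebuilding the consensus map requires rerunning MergeMap, so the sketch takes the map-building step as a user-supplied function; `toy_summarize`, which simply counts distinct markers and distinct positions, is a hypothetical stand-in included only so that the example is self-contained.

```python
def percent_reduction(full, reduced):
    """Percent loss relative to the full consensus (negative means a gain)."""
    return 100.0 * (full - reduced) / full


def leave_one_out(maps, summarize):
    """Drop each population in turn and report the effect on the consensus.

    maps      -- dict, population name -> individual map
    summarize -- function(maps) -> (number of markers, number of bins)
    Returns dict, population -> (percent bin reduction, percent marker reduction).
    """
    full_markers, full_bins = summarize(maps)
    report = {}
    for left_out in maps:
        remaining = {name: m for name, m in maps.items() if name != left_out}
        markers, bins = summarize(remaining)
        report[left_out] = (percent_reduction(full_bins, bins),
                            percent_reduction(full_markers, markers))
    return report


def toy_summarize(maps):
    """Illustrative placeholder for consensus construction (not MergeMap)."""
    markers, bins = set(), set()
    for positions in maps.values():  # positions: marker name -> cM
        markers.update(positions)
        bins.update(positions.values())
    return len(markers), len(bins)
```

A population whose markers are all present in another map, as with SM2 and SM1, yields a marker reduction of zero, while a negative bin reduction corresponds to the situation described for HA.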
For reference to previous data sets, historical markers were integrated into the SM and OWBH.b. individual populations (Costa et al., 2001; Kleinhofs et al., 1993) and the consensus map was reconstructed (Supplemental Table S8).
Gene Mapping using Flow-Sorted Chromosome and Arms
The BOPA1 and BOPA2 platforms were applied to amplified, flow-sorted material to rough map genes to chromosome 1H and the chromosome arms of 2H to 7H. For the purpose of anchoring markers to individual chromosomes or arms, SI was more important than the clustering results as the chromosome location is independent of the allele. This approach was extremely robust, as 2930 genes were mapped with BOPA1 and BOPA2, which represents 96.1% of the genes surveyed (Table 5). An overlap of 2560 genes was mapped with both flow-sorted chromosome or arms and genetic maps, with an agreement of 99.4% (2545 genes) between mapping approaches. A clear correspondence observed between the number of genes mapped using both approaches indicates no significant bias based on chromosome or arm in mapping. An additional 370 genes were mapped using flow-sorted materials, which were not genetically mapped in any of the 10 populations (Supplemental Table S9).
|Method of mapping|
|Chromosome||Arm||Flow-sorted chromosomes or arms||Genetic map||Unique to flow-sorted chromosome or arms||Correspondence between flow sorting and genetic mapping|
An advantage of gene mapping with flow-sorted material is that it accurately assigns genes to a physical chromosome arm. Applying this mapping information from the arms of chromosomes 2H to 7H permits the definition of the pericentromeric region when coupled with the consensus genetic map. Bins in the genetic map were evaluated for an admixture of genes mapped to both the short and long chromosome arm. The pericentromeric region was defined as the set of bins still containing this mixed state of physically mapped genes from both arms. These regions are shown in Table 6 for chromosomes 2H to 7H. A unique characteristic of these regions is the significant increase in gene density, likely caused by severely suppressed recombination. This is clearly observed in Fig. 1, where pericentromeric regions were used to anchor chromosomes on the horizontal axis.
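The definition of the pericentromeric region given above can be expressed as a short sketch. The data layout (a position-sorted list of bins, each with the arm assignments "S" or "L" of its genes, or None where no flow-sorting result exists) and the function names are assumptions made for illustration.

```python
def mixed_bins(bins):
    """Positions of bins holding genes assigned to both chromosome arms.

    bins -- list of (position_cm, arm_calls) sorted by position, where
            arm_calls lists 'S', 'L', or None for each gene in the bin
    """
    return [position for position, arms in bins
            if "S" in arms and "L" in arms]


def pericentromeric_bounds(bins):
    """First and last mixed-bin positions, or None if no bin is mixed."""
    mixed = mixed_bins(bins)
    if not mixed:
        return None
    return min(mixed), max(mixed)
```

Because chromosome 1H was flow sorted as a whole chromosome rather than as arms, no arm calls exist for it and such a function would return None, in line with the regions being reported only for chromosomes 2H to 7H.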
Conflicts between the consensus genetic map position and the flow-sorted chromosome or arm assignment were observed for 15 genes (Supplemental Table S5), and these may be explained by low-level error in the mapping of genes with the flow-sorting approach or the mapping of paralogs. Reevaluation of the 22 SNP loci detected on both short and long arms of a chromosome (Supplemental Tables S5 and S6) found four in the pericentromeric region (Supplemental Table S5). Flow-sorted chromosome or arm mapping supported the improved quality of SNPs included in the consensus genetic map. Of the 167 new SNPs included in this consensus genetic map, 86 were mapped using flow-sorted material. In contrast, only 32 of the 116 SNPs removed with respect to Close et al. (2009) were mapped using flow-sorted material. An attempt was made to allocate SNP loci to chromosomes using DNA isolated from disomic wheat–barley addition lines. However, a much higher incidence of marker position conflicts than with flow-sorted materials indicates that flow sorting with subsequent amplification is the more robust approach for OPA-based gene mapping (data not included).
Supplemental Information Available
Supplemental figures and tables associated with this manuscript are located at http://www.crops.org/publications/tpg.