Soil Health Indicators Do Not Differentiate among Agronomic Management Systems in North Carolina Soils

This article has supplemental material available online. Soil Sci. Soc. Am. J. 81:828–843 doi:10.2136/sssaj2016.12.0400 Received 7 Dec. 2016. Accepted 11 Apr. 2017. *Corresponding author (dosmond@ncsu.edu). © Soil Science Society of America. This is an open access article distributed under the CC BY license (https://creativecommons.org/licenses/by/4.0/). Soil Fertility & Plant Nutrition

and time (Buol et al., 2011). Soil-forming factors are well known and soil properties are measurable, but the current emphasis on soil health requires an integrative assessment of how intrinsic soil properties are affected by soil management.
The inherent complexity within soil systems complicates soil health evaluations because temperature, precipitation, and other climatic factors can interact with intrinsic soil properties in variable ways. These effects are not equal for all soils, which indicates that soil properties vary because of combinations of different physical, chemical, and biological processes. Several soil indicators have been suggested by researchers attempting to define soil health (Idowu et al., 2008; Morrow et al., 2016), but the effectiveness of combining analyses of different soil indicators into a comprehensive interpretation of soil health remains elusive.
Crop and soil management practices such as crop rotations and reduced tillage can significantly influence soil productivity (Arshad and Coen, 1992), and soil evaluations are used as guides for appropriate application of similar beneficial management practices. Because the soil health philosophy emphasizes interactions between soil properties in agronomic systems, it is increasingly important to make both qualitative and quantitative assessments of the soil properties considered to be significant for soil evaluations and also to note how they are affected by agronomic management. It has been shown that no-till planting and cover cropping may add organic matter to soil, which improves physical structure (Tisdall and Oades, 1982), stimulates biological activity (Varvel et al., 2006), and enlarges the pool for C and nutrient cycling (Campbell et al., 1996), but there is limited research about the soil health indicators that are most representative of soil properties under different soil conditions.
A single soil health indicator is unlikely to provide a complete assessment of soil productivity (Liebig et al., 2001), so soil tests have been developed to analyze several soil properties simultaneously and interpret their use for a desired management objective. Some soil tests, like the Mehlich-3 soil test used by the North Carolina Department of Agriculture and Consumer Services (NCDA&CS), are regularly used to provide agronomic recommendations for plant nutrient applications (Hardy et al., 2014). There is a long history of agronomic recommendations from NCDA&CS soil testing for mineral nutrients, but their recommendations do not include biological or physical soil evaluations. Some soil tests integrated biological soil health indicators into their analyses as the importance of more comprehensive soil testing was recognized. The Haney Soil Health Test (HSHT) includes CO₂ respiration as a soil health indicator in addition to nutrient testing (Haney et al., 2006), whereas the Comprehensive Assessment of Soil Health (CASH) developed by the Cornell Soil Health Testing Laboratory includes multiple physical, chemical, and biological soil health indicators in its evaluation (Moebius et al., 2007). Although the HSHT and CASH analyze more soil properties than traditional soil tests, there is no confirmation that assessments of soil health indicators will guide soil management recommendations to improve soil health for different soils.
Because soil health tests are promoted as guides for soil management practices that can improve soil productivity, there is an increasing need to ensure that the soil health indicators used in testing are capable of differentiating among the effects of various soil management practices on diverse soils. Soil tests that are not calibrated to quantify responses to management may provide misleading recommendations that do not improve environmental functions or agronomic productivity because of intrinsic soil limitations. We used three distinct long-term agronomic management trials to: (i) compare results from different soil tests, (ii) assess the ability of soil health indicators to differentiate the soil management effects on soil properties in North Carolina, and (iii) assess the relationships between soil health and crop yields.

METHODS AND MATERIALS
Experimental Plots and Design
There are three physiographic regions in North Carolina (coastal plain, piedmont, and mountain) that vary in soil characteristics (e.g., texture, organic matter, mineralogy), rainfall, temperature, and other climatic factors. To capture this variability, a research trial from each region was used in the analyses. These long-term agronomic trials are located in Goldsboro (17 yr), Reidsville (32 yr), and Mills River (22 yr), which represent the coastal plain, piedmont, and mountain physiographic regions of North Carolina, respectively (Fig. 1). Each trial has a unique management history that broadened the scope of the soil health test evaluation (Table 1).
The Goldsboro (coastal plain) research trial was established in 1999 at the Center for Environmental Farming Systems research farm (35°22′59.9808″N, 78°2′19.6722″W) on soil mapped as Wickham sandy loam (fine-loamy, mixed, semiactive, thermic Typic Hapludults) with interspersed sections of soil mapped as Tarboro loamy sand (mixed, thermic Typic Udipsamments). The trial began with four management treatments with asynchronous crop rotations, which meant that different crops were planted for different treatments depending on the year (Mueller et al., 2002). There were two treatments with chemical management using synthetic pesticides and fertilizer. One treatment included tillage (conventional tillage chemical, CTC) and the other was no-till (no-till chemical, NTC). They had 3-yr rotations of corn (Zea mays L.), peanut (Arachis hypogaea L.), and cotton (Gossypium hirsutum L.) from 1999 to 2005 and changed in 2006 to 3-yr rotations of corn, a hybrid of sorghum [Sorghum bicolor (L.) Moench] and sudangrass [Sorghum × sudanense (Piper) Stapf] (sorghum-sudangrass), and double-crop soybean [Glycine max (L.) Merr.] with winter wheat (Triticum aestivum L.). One organic treatment (conventional tillage organic 1, CTO1) included conventional tillage to produce soybean, sweet potato [Ipomoea batatas (L.) Lam.], and cabbage (Brassica oleracea L.) with winter wheat from 1999 to 2001. Corn and soybean were planted for CTO1 from 2002 to 2007, then sorghum-sudangrass in 2008 followed by two fallow years. Since 2011, CTO1 has had a 3-yr rotation with clean-tilled corn, soybean, and a year of a stale seedbed with a sorghum-sudangrass cover crop. The other organic treatment (CTO2) included conventional tillage to produce continuous corn from 1999 to 2002. From 2003 to 2007, CTO2 had corn, soybean, and organic forage followed by a fallow period from 2008 to 2010. In 2011, CTO2 changed to a 3-yr rotation of corn, soybean, and sunflower (Helianthus annuus L.). For CTO2, a rye (Secale cereale L.) cover crop was also planted before soybean, and a rye and legume cover crop mixture was planted before corn and sunflower. Changes to the organic systems were intended to better manage pests and other obstacles detrimental to crop yields. The four treatments were replicated three times for a total of 12 plots ranging in size from 0.43 to 2.30 ha (1.08-5.68 ac). The plots were organized in a randomized complete block design.
Corn was planted in rows that were 76 cm (30 in.) apart and had varying row lengths depending on plot sizes. Approximate corn planting density was 67,500 seeds ha⁻¹ (27,000 seeds ac⁻¹) in conventional systems and 75,000 seeds ha⁻¹ (30,000 seeds ac⁻¹) in organic systems. After planting, N was added to corn crops at a rate of 170 kg N ha⁻¹ (150 lbs ac⁻¹) as suggested by the realistic yield database for soil series in North Carolina (North Carolina Interagency Nutrient Management Committee, 2014). Fertilizer N for chemically managed plots was in the form of urea-N; for organically managed plots, a combination of raw poultry litter and predicted N mineralization from cover crop residue was used to meet the 170 kg N ha⁻¹ rate. Salt fertilizers of P and K were applied in amounts recommended by the NCDA&CS soil test in chemically managed plots; organically managed plots received P and K based on the amount of raw poultry litter applied as N fertilizer. Before 2011, yield was estimated by harvesting 12 m (40 ft) of two rows with a combine. After 2011, yield was estimated from hand-harvesting 6 m (20 ft) of two rows and manually shelling the corn. All harvested corn was adjusted for uniform moisture content at 15.5% and yield was extrapolated to a per-hectare harvest. Unharvested crop residue was left to decompose on the soil surface until the next cropping season. Monthly average temperatures for the April to October growing season ranged from 10 to 22°C (50-72°F) for lows and 24 to 33°C (75-91°F) for highs. Monthly average rainfall for the rain-fed plots ranged from 86 to 145 mm (3.4-5.7 in.) during the same period.
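The moisture adjustment and per-hectare extrapolation described above can be illustrated with a minimal Python sketch. The function name, the example masses, and the assumption that the harvested area equals row length × number of rows × row spacing are illustrative and are not taken from the trial records.

```python
def yield_kg_per_ha(wet_mass_kg, moisture_pct, row_length_m, n_rows,
                    row_spacing_m, standard_moisture_pct=15.5):
    """Adjust harvested grain to a standard moisture and scale to one hectare.

    Assumption (not from the source): the harvested area is
    row_length_m * n_rows * row_spacing_m.
    """
    # Dry matter is conserved; only the water fraction changes.
    dry_matter_kg = wet_mass_kg * (1.0 - moisture_pct / 100.0)
    adjusted_mass_kg = dry_matter_kg / (1.0 - standard_moisture_pct / 100.0)
    harvested_area_m2 = row_length_m * n_rows * row_spacing_m
    # 1 ha = 10,000 m2
    return adjusted_mass_kg * 10_000.0 / harvested_area_m2


# Hypothetical example: 20 kg of grain at 20% moisture harvested from
# 12 m of two rows spaced 0.76 m apart (the pre-2011 coastal plain layout).
example_yield = yield_kg_per_ha(20.0, 20.0, 12.0, 2, 0.76)
print(f"{example_yield:.0f} kg/ha at 15.5% moisture")
```

The same function would apply to soybean by passing a standard moisture of 13%.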
The Reidsville (piedmont) research trial at the Upper Piedmont Research Station (36°23′2.1372″N, 79°42′6.8436″W) has soil mapped as a Toast coarse sandy loam (fine, kaolinitic, mesic Typic Kanhapludults). In the first 5 yr from 1984 to 1989, the field was continuous corn and from 1989 to 2015, there was a corn and soybean rotation (except from 2005 to 2007, when corn was planted for three consecutive years), all with a 96.5-cm (38 in.) row spacing (Cassel et al., 1995; Meijer et al., 2013). There were nine tillage treatments replicated four times for a total of 36 plots in a randomized complete block design. Each plot had six rows and an area of 5.8 by 15.2 m (19 by 50 ft). All piedmont plots were managed with chemical pesticides and fertilizer and included NTC, disking in spring (DS), in-row subsoiling in spring, chisel plowing in spring (CPS), chisel plowing in fall (CPF), chisel plowing and disking in spring (CPDS), chisel plowing and disking in fall, moldboard plowing and disking in spring (MPDS), and moldboard plowing and disking in fall (MPDF). Tillage treatments were selected to represent minimal soil disturbance (no-till) to severe disturbance (moldboard plow) and tillage traffic was restricted to the same rows every year.
Plot maintenance included yearly fertilizer applications in spring according to NCDA&CS soil test recommendations. Fertilizer N applications to corn were split throughout spring to a total of 179 kg ha⁻¹ (160 lbs ac⁻¹); no fertilizer N was applied to soybean crops. Planting density was approximately 75,000 seeds ha⁻¹ (30,000 seeds ac⁻¹) for corn and 343,894 seeds ha⁻¹ (137,500 seeds ac⁻¹) for soybean. Piedmont crop yields were calculated by harvesting 15.2 m (50 ft) of the middle two rows of a plot with a combine, adjusting the mass to a uniform moisture content of 15.5% for corn and 13% for soybean, and extrapolating the yield to a per-hectare harvest. Crop stover remained in the field to decompose after harvest. For tilled treatments only, crop residues were tilled into the soil before planting the next crop. Average monthly temperatures for the piedmont area range from 12 to 19°C (54-66°F) for lows and from 21 to 31°C (70-88°F) for highs during the April to October growing season. Average monthly rainfall ranged from 100 to 120 mm (3.9-4.7 in.) during the same period.
The Mills River (mountain) research trial at the Mountains Horticultural Crops and Research Extension Center (35°25′39.126″N, 82°33′24.7068″W) began in 1994 on soil mapped as Delanco silt loam (fine-loamy, mixed, semiactive, mesic Aquic Hapludults). The study included five treatments: no-till with organic management (NTO), no-till with chemical management (NTC), chisel and disk tillage with organic management (CTO), chisel and disk tillage with chemical management (CTC), and chisel and disk tillage with no fertilizer or pesticide inputs (CTX) as a control treatment (Hoyt, 2005; 2007). The five treatments were arranged in a completely randomized design with four replications and a total of 20 plots, each sized 24.4 by 12.2 m (80 by 40 ft) with 16 crop rows. Organic plots were managed as such, but only received USDA organic status in 2010.
Sweetcorn was planted at approximately 65,000 seeds ha⁻¹ (26,400 seeds ac⁻¹). Yield was measured by hand-harvesting sweetcorn ears from 10-m (39-ft) lengths of the two center rows of each plot and extrapolating it to a per-hectare harvest. Unharvested crop residue was left on the soil surface to decompose until being incorporated during tillage for the next cropping season. Average monthly temperatures in the mountain area range from 9 to 16°C (49-61°F) for lows and 19 to 29°C (66-84°F) for highs during the April to October growing season. Average monthly rainfall ranged from 110 to 125 mm (4.4-5.0 in.) during the same period.

Soil Sampling
Soil samples for the HSHT were collected from piedmont and mountain trials as part of a USDA-NRCS research project in fall 2014. For NCDA&CS and CASH analyses, soil samples were collected in November 2015 from piedmont trials and in December 2015 from coastal plain and mountain trials. Soil sampling procedures for the HSHT and NCDA&CS were similar: 8 to 10 subsamples totaling 473 cm³ (2 cups) were collected from the top 15 cm (6 in.) of soil in each plot using a soil probe (Plated Soil Probe with Slide Hammer, AMS, American Falls, ID) with a 2-cm (0.78 in.) diameter. The soil was mixed, dried at 60°C for 24 h, and submitted according to recommendations from the NCDA&CS and the HSHT lab. Soil samples submitted to Cornell were collected according to a slightly modified protocol listed in the CASH manual (Gugino et al., 2009) because of the size limitations of our research plots. In brief, three to five subsamples were collected with an auger to a depth of 15 cm (6 in.) to obtain approximately 1400 cm³ (6 cups) of soil from each plot. After sampling, the soil was stored in sealed plastic bags and refrigerated at 4°C until being shipped to the Cornell Soil Health Testing Laboratory in January 2016. Three penetrometer (Field Scout SC-900, Spectrum, Aurora, IL) measurements to 45 cm (18 in.) depth were conducted in each plot while soil moisture was approximately at field capacity. The Cornell Soil Health Testing Laboratory was contacted to verify that this approach was acceptable for measurement at the plot scale, since their recommended procedure is for 10 measurements. The greatest pressures from the 0 to 15 cm (0-6 in.) and 15 to 45 cm (6-18 in.) depths were included in penetrometer data submitted to Cornell.

Soil Testing and Soil Health Scores
The NCDA&CS uses the Mehlich-3 soil extractant (Mehlich, 1984a) to analyze soil for essential plant nutrients and the Mehlich buffer pH method (Mehlich, 1976) for lime recommendations. A separate procedure used by the NCDA&CS to measure humic matter (HM) includes a colorimeter and a 650 nm light filter to detect the percentage of HM in the soil solution after extracting humic and fulvic acids using an alkaline solution containing NaOH (Mehlich, 1984b). The nutrient results of the NCDA&CS soil test are based on soil volumetric concentrations and are converted into weight per area based on a 20-cm (7.9-in.) depth (Hardy et al., 2014). The NCDA&CS soil test's recommended nutrient applications are provided on a per-area basis and do not include guidelines for tillage, cropping, N fertilizer, or other management needs beyond plant nutrients and lime.
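The conversion from a volumetric concentration to a weight per area over a 20-cm depth is simple arithmetic, sketched below in Python. The input unit (mg dm⁻³) and the function name are assumptions made for illustration; the source states only that volumetric concentrations are converted using a 20-cm depth.

```python
def mg_per_dm3_to_kg_per_ha(conc_mg_per_dm3, depth_cm=20.0):
    """Convert a volumetric soil test concentration to kg per hectare.

    Assumption (not from the source): the concentration is in mg per dm3
    of soil, and the nutrient is spread uniformly through the depth.
    """
    area_m2 = 10_000.0                       # 1 ha
    soil_volume_m3 = area_m2 * depth_cm / 100.0
    soil_volume_dm3 = soil_volume_m3 * 1000.0  # 1 m3 = 1000 dm3
    total_mg = conc_mg_per_dm3 * soil_volume_dm3
    return total_mg / 1e6                    # mg -> kg


# Under these assumptions, 1 mg/dm3 over 20 cm corresponds to 2 kg/ha,
# so a hypothetical reading of 60 mg/dm3 would be reported as 120 kg/ha.
print(mg_per_dm3_to_kg_per_ha(60.0))
```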
The HSHT for essential plant nutrients relies on the Haney, Haney, Hossner, and Arnold (H3A) soil extractant that contains the organic plant root exudates typically associated with plant nutrient uptake from soil (Haney et al., 2006; 2010). The HSHT tool requires user inputs of expected crop yield and it provides fertilizer and cropping recommendations based on user inputs and soil analyses. Nutrient recommendations from the HSHT are based on concentrations of nutrients extracted from 4 g of soil, which is converted into units of weight per area. In addition to the nutrient extractions is a measurement of CO₂-C released (in mg L⁻¹) during a 24-h incubation of wetted soil (Franzluebbers et al., 2000). An HSHT score is calculated using the amount of CO₂-C release in 24 h along with a separate procedure from the H3A extract to measure soil concentrations (in mg L⁻¹) of water-extractable organic C (WEOC) and water-extractable organic N (WEON) as follows:

$$\text{Soil Health Score} = \frac{\text{CO}_2\text{-C}}{10} + \frac{\text{WEOC}}{100} + \frac{\text{WEON}}{10}$$

The overall health score of the CASH is a combination of 12 quantifiable soil health indicators (Gugino et al., 2009; Moebius-Clune et al., 2016) of the most pertinent soil properties considered to affect soil health (Idowu et al., 2008). Several different analyses of physical, chemical, and biological soil properties are included in the CASH, as described in the Cornell Soil Health Assessment Training Manual (Gugino et al., 2009). Soil health indicators for a soil sample are scored based on a database of results from processed soil samples that were used to create separate cumulative normal distribution curves normalized to values from 0 to 100 for each indicator. The soil health indicator curves are adjusted for fine, medium, and coarse soil textures, with the exceptions of surface hardness, subsurface hardness, soil respiration, and P, K, and "minor elements" (plant micronutrients Fe, Mg, Mn, and Zn combined as one index), which are not adjusted for soil texture. Soil health indicator results from soil samples submitted for CASH analyses are compared with the cumulative normal distribution curve of the indicator and the reported health score is representative of where the measurement value of the soil indicator is located on the normalized cumulative distribution curve. For example, if a soil indicator measurement from a soil sample is better than 80% of all the samples used to create the curve, the soil sample would receive a score of 80 out of 100 for that indicator. This method is used to score all 12 indicators, and the overall CASH index score for a soil sample is reported as an unweighted mean of the 12 indicators, which is supplied along with soil management recommendations.
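The CASH scoring procedure described above (position on a cumulative normal curve, then an unweighted mean) can be sketched in Python as follows. The reference means and standard deviations, the indicator names, and the `more_is_better` flag are hypothetical placeholders; the actual Cornell scoring curves and their texture adjustments are not reproduced here.

```python
from statistics import NormalDist, mean


def indicator_score(value, ref_mean, ref_sd, more_is_better=True):
    """Score one indicator from 0 to 100 on a cumulative normal curve.

    ref_mean and ref_sd describe a reference database of samples
    (hypothetical values in this sketch). For indicators where lower
    values are better (e.g., hardness), the curve is reversed.
    """
    fraction_below = NormalDist(ref_mean, ref_sd).cdf(value)
    if more_is_better:
        return 100.0 * fraction_below
    return 100.0 * (1.0 - fraction_below)


def cash_index(indicator_scores):
    """Overall index: the unweighted mean of the indicator scores."""
    return mean(indicator_scores)


# Hypothetical measurements and reference curves for three indicators.
scores = [
    indicator_score(3.2, ref_mean=3.0, ref_sd=0.8),            # organic matter, %
    indicator_score(0.6, ref_mean=0.5, ref_sd=0.2),            # respiration
    indicator_score(250.0, ref_mean=200.0, ref_sd=60.0,
                    more_is_better=False),                     # surface hardness
]
print(round(cash_index(scores), 1))
```

Because every indicator carries equal weight in the mean, one very low score (such as aggregate stability) lowers the overall index by the same amount regardless of how relevant that property is to a given soil.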
The CASH chemical assessment involves the modified Morgan extractant (McIntosh, 1969) to extract essential plant nutrients from the soil. The results are used to calculate four separate indices for pH, P, K, and minor elements. The physical soil health indicators are available water capacity (AWC) measured by volume (Reynolds and Topp, 2008), surface and subsurface hardness measured with a penetrometer (Duiker, 2002), and aggregate stability measured with a Cornell sprinkle infiltrometer (Ogden et al., 1997). Biological soil health indicators measured for the CASH are organic matter (OM) by loss on ignition (LOI) (Broadbent, 1965), active C or permanganate oxidizable C (Weil et al., 2003), soil respiration of CO₂ after 96-h incubation (Zibilske, 1994), and soil protein content via spectrophotometry (Wright and Upadhyaya, 1996), which are all related to soil microbial activity.

Statistical Analysis
A separate statistical analysis was conducted for each trial and soil testing method. SAS software (version 9.3, SAS Institute, Cary, NC) was used to perform an ANOVA for individual soil test parameters as they may have been affected by soil management treatments within the location. Differences in soil test parameters were analyzed using the GLIMMIX procedure with block as a random effect. A similar ANOVA was conducted for each location to compare mean crop yields among treatments, for which the GLIMMIX procedure was used with block and year as random effects in the model. Differences among treatments for each parameter were tested for significance at the 95% confidence level using the Scheffé means comparison test. For statistical comparisons of fertilizer recommendations from NCDA&CS and HSHT, the analysis was between the two tests instead of the treatments. Correlations between the most recent crop yields and CASH index scores were calculated using the graphing features of Microsoft Excel 2013 software (Microsoft Corporation, Redmond, WA).
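As an illustration of the treatment F-test in a randomized complete block design, the following Python sketch computes the classical ANOVA F statistic for balanced data. It is not the GLIMMIX analysis used in the study and it omits the Scheffé comparisons; for balanced, complete data the treatment F-test generally coincides with a mixed model that treats block as random. The data values are hypothetical.

```python
def rcbd_f_statistic(data):
    """Treatment F statistic for a balanced randomized complete block design.

    data[i][j] is the response for treatment i in block j (one value each).
    Returns (F, df_treatment, df_error).
    """
    n_treat = len(data)
    n_block = len(data[0])
    grand_mean = sum(sum(row) for row in data) / (n_treat * n_block)

    treat_means = [sum(row) / n_block for row in data]
    block_means = [sum(data[i][j] for i in range(n_treat)) / n_treat
                   for j in range(n_block)]

    # Partition the total sum of squares into treatment, block, and error.
    ss_treat = n_block * sum((m - grand_mean) ** 2 for m in treat_means)
    ss_block = n_treat * sum((m - grand_mean) ** 2 for m in block_means)
    ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_treat - ss_block

    df_treat = n_treat - 1
    df_error = (n_treat - 1) * (n_block - 1)
    f_value = (ss_treat / df_treat) / (ss_error / df_error)
    return f_value, df_treat, df_error


# Hypothetical soil test values: 3 treatments (rows) x 3 blocks (columns).
example = [[10.0, 12.0, 11.0],
           [14.0, 15.0, 16.0],
           [9.0, 11.0, 10.0]]
f_value, df1, df2 = rcbd_f_statistic(example)
print(f"F({df1}, {df2}) = {f_value:.1f}")
```

The F value would then be compared with the critical value of the F distribution at the chosen significance level (0.05 in the study) to decide whether treatment means differ.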

RESULTS AND DISCUSSION
The NCDA&CS Soil Test
Soil test results from NCDA&CS included several soil properties and elemental analyses, but those pertinent for this study were HM, cation exchange capacity (CEC) estimated via summation, pH (1:1 soil/water by volume), and Mehlich-3 P and K. All soil test parameters considered in the analyses are summarized with F-test p-values organized by soil test (Table 2). Other plant nutrients were not considered for statistical analysis because they are not typically limiting factors in North Carolina soils.
Soil test P concentrations were different among the mountain treatments only, with NTO having greater Mehlich-3 P concentrations than CTC (Table 3) because it received poultry litter that provided excess P beyond critical soil test values. Although the critical P concentration suggested by NCDA&CS is 120 kg P ha⁻¹, previous research on mountain soils (Cahill et al., 2013) and soil test P research (Cox, 1992) have suggested that the critical levels are between 31 and 47 kg P ha⁻¹, and thus P was sufficient for all treatments at all locations. Soil P fertility has been linked to tillage because soil-adsorbed P tends to be depleted by tillage with greater erosion potential (Alberts and Moldenhauer, 1981; Gaynor and Findlay, 1995). The fact that tillage was not a significant factor controlling Mehlich-3-extractable soil P at any location was expected, because all plots had been continuously fertilized according to NCDA&CS recommendations for crop nutrients.
Both the coastal plain and mountain trials had different Mehlich-3-extractable K concentrations among treatments. The organic treatments (CTO1 and CTO2) of the coastal plain had greater K concentrations than the CTC and NTC treatments. Likewise, at the mountain location, NTO and CTO had greater K concentrations than CTC, but NTO had similar K concentrations to CTC and CTX. No differences in K availability were observed in the piedmont, although the average soil test K of NTC and CPF was slightly below the 195 kg K ha⁻¹ recommended by NCDA&CS, again demonstrating that the fertility of major plant nutrients was not a significantly limiting factor in the trials.
Treatments of the coastal plain and piedmont trials had similar CEC, but the CEC of mountain treatments was statistically greater in NTO than in NTC, CTC, and CTX. Because the CEC reported by NCDA&CS is a summation of the number of cations detected in the soil solution, it is not a direct measure of CEC. Cations added with organic fertilizers and liming agents are subsequently detected and measured as CEC, but they are not completely representative of the soil's capacity to hold cations. The mountain NTO treatment may have had a statistically greater CEC than the other treatments because it received additional cations in the form of organic fertilizer that was not incorporated into the soil, not because of the inherent ability of the soil to hold cations.
Soil HM, an organic C fraction sometimes correlated to soil organic matter (Kononova, 1966), was not different among the treatments (Table 3). The piedmont NT and MPDF treatments represented two tillage extremes that could influence OM content, but they did not differently affect soil HM as measured by the NCDA&CS soil test. Reduced tillage was expected to allow organic substances to accumulate in the soil, but the NCDA&CS analysis showed no consistent trend of increasing HM with reduced tillage. Neither tillage nor management produced statistical differences in HM among the treatments in the trials, which meant that HM was not a differentiating parameter in these management systems. Soil management effects on soil C content can vary, but one recurring theory is that tillage does not reduce soil organic C but instead redistributes it among tillage depths, which means that sampling depth could greatly affect soil C analyses (D'Haene et al., 2009). Although this is a possible explanation for why HM may not have differentiated among tillage treatments, some research indicates that the HM analysis itself may not be a reliable indicator for soil C content (Lamar and Talbot, 2009) and may not accurately represent soil OM.

Haney Soil Health Testing
The HSHT, as mentioned previously, was only used in the mountain trial and three treatments at the piedmont. The piedmont treatments (NTC, CPDS, MPDS) were selected to represent varying degrees of soil tillage; fall tillage treatments were not included because no yield differences between spring and fall tillage systems occurred during the trial (Meijer et al., 2013). Soil nutrient availability detected by the HSHT was similar to NCDA&CS soil testing results, with the mountain treatments NTO and CTO having more P and K than the conventional treatments (Table 4). Organic treatments probably contained more nutrients because of the excess P and K typically found in organic fertilizers applied to satisfy N requirements (Smith et al., 1998; Larsen et al., 2014; Edgell et al., 2015).
Many soil extractants cannot simultaneously measure N and other plant nutrients (Holford, 1980), but the H3A extractant does not have this limitation (Haney et al., 2006). Measured values of NO₃-N and NH₄-N from the H3A extractant as well as predicted mineralized organic N represent the total N measurement of the HSHT. Both the CTC and CTO treatments of the mountain trial had greater total soil N than CTX, but not significantly more than the CTC and CTO tillage treatments (Table 4).
The amount of WEOC in soils was not statistically different among treatments at either location (Table 4). The amount of WEON measured in the mountain treatments was greater in soil from NTO than from CTO and CTX, but not CTC (Table 4). In the piedmont trial, more WEON was found in soil from NTC than in MPDS but not CPDS. Plant residue contributes more organic C and N when left on the soil surface than when incorporated by tillage (Smith and Sharpley, 1993; Chen et al., 2014), and the treatments had increasing WEON with decreasing tillage intensity, but the differences among tillage effects were not great enough at either location to differentiate no-till management from all other tillage. The piedmont and mountain trials were conducted for over two decades before this analysis, but research suggests that it may take centuries for crop residue deposition to significantly increase soil OM in some areas (Poeplau and Don, 2015; Strickland et al., 2015).
The amounts of CO₂-C released from soil of the mountain treatments ranged from 48 to 158 mg L⁻¹ on average, but large variability in CO₂-C release resulted in no statistical differences among treatments (Table 4). In the piedmont trial, CO₂-C was statistically greater in NTC than in CPDS and MPDS. Microbial release of soil C can greatly vary depending on management, soil composition, soil collection methods (Rochette et al., 1991), and laboratory methodology (Parkin and Kaspar, 2004). Measurements of CO₂-C released from soil have been correlated to soil biological activity (Franzluebbers et al., 2000; Wang et al., 2003), and CO₂-C is likely to evolve more quickly in soil with more organic C and N being available to microbes (Touratier et al., 1999; Nguyen and Marschner, 2016). Although there was more CO₂-C from NTC in the piedmont trial, variability in the soil CO₂ released from treatments in the mountain trial meant that soil management was not consistently influencing soil biological activity as measured by microbial respiration.
The HSHT uses CO₂-C, WEOC, and WEON measurements to calculate soil health scores for soils. The average HSHT scores for mountain treatments ranged from 7 to 21, but these scores were statistically similar because of the large variability among replicate samples within treatments (Table 4). Piedmont HSHT scores ranged from 5 to 16 on average, and NTC scored higher than both CPDS and MPDS. Because a score of 7 or greater from the HSHT is considered to be good soil health, all treatments except MPDS in the piedmont trial were considered to be "healthy" on average. Interpreting the practicality of HSHT scores is difficult, however, because CO₂-C release from soils as well as soil organic C and organic N pools can have significant temporal and spatial variability (Rochette et al., 1991). Soil texture also influences CO₂-C release (Franzluebbers et al., 2000), but it is not a factor considered in the HSHT scoring.
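A minimal Python sketch of the HSHT score calculation is given here for illustration, assuming the commonly published form of the score (CO₂-C/10 + WEOC/100 + WEON/10, all inputs in mg L⁻¹) and the threshold of 7 stated above. The function names and the input values are hypothetical and are not measurements from the trials.

```python
def haney_score(co2_c, weoc, weon):
    """HSHT soil health score from 24-h CO2-C, WEOC, and WEON (mg/L).

    Assumes the commonly published weighting: CO2-C/10 + WEOC/100 + WEON/10.
    """
    return co2_c / 10.0 + weoc / 100.0 + weon / 10.0


def is_healthy(score, threshold=7.0):
    """A score at or above the threshold is interpreted as good soil health."""
    return score >= threshold


# Hypothetical sample: 50 mg/L CO2-C, 200 mg/L WEOC, 10 mg/L WEON.
score = haney_score(50.0, 200.0, 10.0)
print(score, is_healthy(score))
```

With this weighting, CO₂-C usually contributes most of the score, so the replicate variability in respiration noted above carries directly into the score.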

The NCDA&CS and HSHT Nutrient Recommendations
The NCDA&CS does not use its standard soil test to provide soil-specific N fertility recommendations. Instead, N recommendations in North Carolina are made by using the realistic yield expectation (RYE) tool, which is regularly updated to provide a realistic N rate based on soil mapping units and crop species (North Carolina Interagency Nutrient Management Committee, 2014). The HSHT samples were collected in the fall of 2014 and NCDA&CS samples were collected in the fall of 2015, but it was not expected that soil properties, including fertility, would significantly vary over the course of a single year. Expected yields from the RYE tool were entered into the HSHT tool, which recommends fertilizer based on expected yields and concentrations of soil nutrients extracted via H3A. The RYE tool recommended 129 kg N ha⁻¹ for the piedmont corn crop and 203 kg N ha⁻¹ for the mountain corn crop, whereas the HSHT recommended, for the same crops, an average of 121 to 139 kg N ha⁻¹ for piedmont corn and 205 to 246 kg N ha⁻¹ for mountain corn; the recommendations from the two sources were statistically different for most of the treatments (Table 5). Fertilizer N recommendations from the HSHT are based on the amount of water-extractable NH₄-N, water-extractable NO₃-N, WEON, and microbial activity contributing to total N, which means that the HSHT fertilizer N recommendations are more dynamic than those in the RYE database. Microbial activity and soil organic N are considered by the HSHT to be significant factors contributing N to crops. The amount of N fertilizer recommended for corn production in the mountain trial was probably greater than the piedmont N fertilizer recommendation because the cooler mountain climate is favorable for greater corn yields, and the RYE database reflects this. Soil test P was variable in both tests, which resulted in only MPDS having a greater P recommendation from the HSHT than from the NCDA&CS (Table 5). The amount of P fertilizer recommended by the HSHT is based on the amount of extracted inorganic P and organic P predicted by soil organic C to N ratios, which means that more P will generally be in soils receiving more organic material. The recommendations for K fertilizer differed for several treatments, with NCDA&CS recommendations generally being greater than the HSHT recommendations. The differences in recommendations may be because the H3A extract lacks calibration for diverse soil conditions, whereas the Mehlich-3 extractant used by NCDA&CS has been tested and calibrated to predict plant uptake of P and K during a growing season (Indiati et al., 1997).

Comprehensive Assessment of Soil Health
Data from measurements of the soil health indicators are provided along with the CASH soil health index scores (Supplemental Table S1). There are 12 separately rated soil health indicators included with the CASH, and their index scores are averaged to calculate an overall soil health index score. Index scores range from 0 to 100, with 0 described as "very low" soil health by the assessment, 100 described as "very high" soil health, and intermediate descriptors of "low", "medium", and "high" between them (Moebius-Clune et al., 2016). The indicators are grouped into chemical, biological, and physical categories. Like the HSHT and NCDA&CS soil tests, plant nutrients were sufficient in soils at each location (Table 6). There were no statistical differences among treatments for any of the chemical soil health indicators (pH, P, K, or minor elements). The CASH index scores for pH ranged from 2 to 23 for coastal plain, 10 to 58 for piedmont, and 6 to 46 for mountain soil, but the results were too variable to differentiate among treatments. The NCDA&CS recommends a soil pH of 6.0 for corn and soybean crops grown in North Carolina, which many of the treatments averaged (Table 3), but the CASH index scores for treatments with soil pH < 6.0 indicated that soil pH was too low in these soils. Soil pH in these trials is not likely to limit crop growth, as the values are considered appropriate for southern soils, which indicates that the scale at which the CASH is evaluating pH is not well adjusted for North Carolina soil conditions. Concentrations of nutrients extracted by the modified Morgan extractant have a good correlation to the concentrations of nutrients extracted via Mehlich-3 (Ketterings et al., 2002), but there is no reliable conversion of soil test results between the two methods when different soil textures are considered. Soil extractants should be matched to the geographic area of the soils for which they have been developed and calibrated; otherwise, additional considerations will be needed for evaluating soil health across different regions (Ketterings et al., 2002; Herlihy et al., 2006).
Physical soil health indicators of the CASH were AWC, surface hardness, subsurface hardness, and aggregate stability. Tillage effects on soil penetration were inconsistent and there were no statistical differences among treatments at any location despite the large ranges in treatment means (Table 6). Subsurface hardness scores were also similar among treatments. Even no-till treatments failed to obtain better surface and subsurface hardness ratings from the CASH, which may indicate that the hardness test is not sensitive or consistent enough to differentiate management effects on the soils in the trials. Surface crusting typically develops in intensive tillage systems without residue cover (Pagliai et al., 2004), but crusting has also been observed in no-till systems to varying degrees (Rosa et al., 2013). In the piedmont trial, more soil crusting was observed in soil from tilled treatments without residue cover, but bulk density from 0 to 10 cm was lower when soil was tilled (Cassel et al., 1995), which complicates interactions between soil physical properties and their potential benefits to soil health. Subsurface hardening can result from compaction by machinery or plowing (Torbert and Reeves, 1995; Birkás et al., 2004), but with no statistical differences seen among physical soil health indicator scores for tillage treatments, there was no measured soil management effect on subsurface hardness even after many years of cropping on the mountain, piedmont, and coastal plain soils in North Carolina.
All treatments at each location scored "very low" for aggregate stability (Table 6), which is in the 0 to 40 range of the CASH index. No-till management is typically associated with improved soil structure (Carter, 2002; Bronick and Lal, 2005), so it was expected that no-till treatments would perform significantly better than tillage treatments, but management was not differentiated by the aggregate stability test. Index scores for AWC were all within the medium (55-70) to high (70-85) range of the CASH index (Table 6), but were not statistically different among treatments within any of the three locations. Conservation tillage was implemented in each of the trials, but after many years of consistent management, this was still not differentiated from conventional tillage according to the physical soil health indicator measurements of the CASH.
The biological soil health indicator component of the CASH comprises OM, soil protein, soil respiration, and active C. No location had treatments with statistically different OM scores (Table 6) and, regardless of tillage or management, all treatments had "very low" (0-40) or "low" (40-55) soil health for OM according to the CASH index. Soil protein scores had no statistical differences among treatments within the coastal plain and mountain trials, but the soil protein scores of the piedmont soils were greater for NTC than for CPS, CPDS, MPDS, and MPDF (Table 6).
Soil respiration scores among treatments were statistically different only at the piedmont trial, where soil respiration scores for DS and CPS were greater than that for MPDF (Table 6). Active C is different from soil respiration because it measures easily accessible C regardless of the current microbial activity in the sample. Unlike soil protein and soil respiration scores, there was no statistical difference among active C scores for piedmont treatments, which varied among replications of each treatment (Table 6). There were, however, statistical differences among active C scores for treatments in the coastal plain and mountain trials. Active C scores for coastal plain treatments were greater in NTC than in CTO1 and CTC, but not CTO2.
For treatments in the mountain trial, NTO had more active C than CTC and CTX, but not NTC or CTO. The overall trend in biological soil health indicator scores showed that biological activity increased with reduced tillage and often with organic soil amendments. There was no consistent differentiation among management systems across the three trials in North Carolina, however, and biological soil indicators for all treatments had average ratings of "very low" (0-40) or "low" (40-55) on the CASH index scale.
Overall soil health index scores from the CASH averaged from 38 to 46 for the coastal plain, 35 to 46 for piedmont, and 44 to 55 for mountain soil (Table 6). Very few of the individual soil health indicator component scores were statistically different among treatments, which is probably why overall CASH soil health index scores for the treatments were not statistically different for two of the three trials. All treatments in the coastal plain and piedmont trials were rated "very low" (0-40) or "low" (40-55) by the CASH index, but NTO in the mountain trial received a "medium" score of 55, which was statistically greater than the scores for NTC and CTX.
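The scoring arithmetic described above can be illustrated with a minimal Python sketch. It assumes the overall score is the unweighted mean of the individual indicator index scores and uses the rating bands quoted in this section; the 85-100 "very high" band and the treatment of boundary values (lower bounds inclusive, so that a score of 55 rates as "medium") are assumptions, and the function names are hypothetical rather than part of the CASH itself.

```python
# Rating bands as quoted in this section. The 85-100 "very high" band and the
# inclusive lower bounds are assumptions made for illustration.
BANDS = [
    (85, "very high"),
    (70, "high"),
    (55, "medium"),
    (40, "low"),
    (0, "very low"),
]


def overall_score(indicator_scores):
    """Return the unweighted mean of the indicator index scores (0-100 each)."""
    if not indicator_scores:
        raise ValueError("at least one indicator score is required")
    return sum(indicator_scores) / len(indicator_scores)


def rating(score):
    """Map a 0-100 index score to its descriptor (lower bounds inclusive)."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
```

Under these assumptions, `rating(overall_score(scores))` returns the descriptor for a list of the 12 indicator scores of one treatment.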
The only CASH index scores with statistical differences among management treatments were soil biological health indicator scores, but all of them received no better than "low" (40-55) scores on average, regardless of management. Because long-term trials were used in the research, the soils were assumed to be in a stable state that is representative of the long-term effects of soil management on the soil properties used to assess soil health, and the scores received from the CASH are likely to be representative of the equilibrium of soil management effects on soil health indicators.
Included with the CASH were soil management recommendations to implement practices that could ideally improve soil health scores. The recommended practices were already implemented in treatments that received less than ideal scores (NTC, NTO), so there was little more that could be done to improve the scores of individual treatments if the management followed only the soil health assessment guidelines. Soils in all the trials were managed to satisfy fertility requirements, so there was no concern about deficiencies in chemical soil health indicators. Physical soil health indicators did not differentiate among soil management systems, despite there being varying degrees of tillage, and biological soil health indicators were inconsistently affected by management. As a consequence of the chemical, physical, and biological ratings, the overall soil health scores did not differentiate among the agronomic systems. The CASH did not provide more soil management information than that gained from soil tests designed for nutrient recommendations. If many years of conservation tillage did not result in discernible differences in soil health scores among different types of soils and agronomic management, the current standards for soil health indicators may encompass too broad a range of soils or lack the sensitivity to differentiate between the effects of recommended soil management practices.
Soils formed in different climates have intrinsic limitations in their ability to improve under favorable conditions (Sanchez et al., 2003), which implies that different standards may need to be considered to evaluate soil health. Researchers continue to evaluate soil indicators to determine how they can best be used to improve recommendations (Morrow et al., 2016), and some research is already being conducted to improve the CASH system to account for soil variation because of regional factors (Congreves et al., 2015; Moebius-Clune et al., 2016). Changes to soil health evaluations should consider that soil indicators vary in their significance for soils in different regions. For example, North Carolina soils are typical of Ultisols in the southeastern USA that have low organic matter content and an acidic pH, and have been irreversibly weathered to retain fewer nutrients than soils in the northeastern USA climate (Buol et al., 2011), where the CASH was developed. By adjusting soil health assessments for regional soil properties, recommendations for improving soil management may better differentiate among agronomic management systems and also consider their practical implications for soil productivity.

Implications of Soil C as a Soil Health Indicator
A common parameter of each soil test is a surrogate measurement of soil C content. Soil C is a vital component of soil systems because C cycling is the driving force of many soil ecological interactions (Goto et al., 1994). In many soil systems, C content is strongly correlated with microbial activity (Lavahun et al., 1996), which favors ecosystem processes. A soil C component included in a soil test can capture this important link between soil properties, but the measure of soil C availability that best differentiates among management systems has not been determined. Although soil C related to soil biological activity has been used to evaluate soil health, measurements of the same soil management systems tend to differ because of spatial and temporal variability of soil C content and microbial activity (Rochette et al., 1991).
In the case of soil C measurements used by soil tests considered in this study, HM (Table 3), CO₂ respiration (Table 4; Table 6), LOI (Table 6), and active C (Table 6) were the tests chosen to represent soil C. The value of the HM analysis used by NCDA&CS has been questioned in other research because of its inconsistent correlation with other soil C measurements (Lamar and Talbot, 2009). No statistical differences in HM content were observed for treatments in any of the trials, and this may be because HM is not a strong indicator of soil organic C. The OM content determined by the CASH also did not differ among treatments at any location. The LOI technique used by the CASH can be skewed because LOI generally lacks precision to quantify the organic C content of soils with less than 15% OM, which may cause soil organic C overestimations arising from heating temperatures and durations that can volatilize other soil constituents along with soil C (Szava-Kovats, 2009; Huang et al., 2012; Hoogsteen et al., 2015).
When other soil C measurements were conducted, however, some separation among soil management practices was observed. Measurements of WEOC by the HSHT did not differentiate among treatments of the piedmont and mountain trials; however, the HSHT analysis of CO₂-C release from soil managed by NTC in the piedmont trial was greater than that in soil managed via CPDS and MPDS (Table 4). Previous research from the piedmont trial revealed a greater abundance of fungal biomarkers in NTC (Muruganandam et al., 2009), and fungal activity is also associated with decomposition of recalcitrant forms of C that may be converted to CO₂ (Strickland and Rousk, 2010; Malik et al., 2016). Although the HSHT CO₂-C measurements did not show differences among the treatments in the mountain trial, previous research from there revealed differences in laboratory measurements of field-moist soil CO₂ respiration among treatments (Overstreet and Hoyt, 2008). Respiration previously measured from soil in the mountain trial was observed to be greater in the crop rows of organically managed plots than in crop rows of plots with synthetic chemical management, even though total C from the treatments was similar. It was suspected that additional C added from organic fertilizers stimulated biological activity. The CASH measurement of soil CO₂ respiration was not different among mountain treatments (Table 6) but did reveal that piedmont tillage treatments DS and CPS had greater CO₂ release than MPDF. Spatial and temporal variability in soil CO₂-C evolution may be the reason why soil tests are inconsistent in detecting differences between management systems.
By using an elemental analyzer and chloroform fumigation-extraction, another soil organic C experiment conducted for the mountain trial detected more total soil organic C and microbial biomass C in NTO than in other treatments (Wang et al., 2011). A later experiment also revealed that NTO had more total C, high and light fraction particulate organic matter, and microbial biomass C than other treatments (Larsen et al., 2014). The analyses matched expectations that a combination of no-till and organic production would have more microbial activity and labile organic C than conventional tillage systems (Huggins et al., 1998; Zuber and Villamil, 2016). Active C results from the CASH analysis, however, showed that although active C was greater in NTO than in CTC and CTX (Table 6), it was not different from that in NTC and CTO.
The NCDA&CS soil test results showed no differences in HM among treatments (Table 3). The HSHT revealed more CO₂-C was released from NTC than from tillage treatments at the same location (Table 4), but the CASH CO₂-C analysis suggested otherwise (Table 6). Across all three soil tests, the only C measurement that was different among treatments of the mountain trial was active C (Table 6), but when other soil C research was conducted on the mountain trial soils there were statistical differences among treatments for other C pools. Although most statistical differences among treatments were found in soil C analyses, patterns of differentiation using soil C analyses were inconsistent in their ability to differentiate among management systems. Inconsistency in the ability of soil C measurements to differentiate among soil management practices is problematic for soil health tests because of the importance of soil C to ecosystem functions of soil properties associated with C pools. If measurements of soil C availability are inconsistent for soil management, then management recommendations are also likely to be inconsistent. Other factors may therefore need to be considered to improve soil health in those systems.

Crop Yields
The NCDA&CS, HSHT, and CASH soil tests focus only on soil properties, but crop yields could be integrated with soil health indicators to provide a measure of sustainable agronomic management. Fields of the coastal plain and piedmont trials were typically used for producing corn and soybean, whereas sweetcorn was the predominant crop of the mountain trial.
Corn yields from the years 2002-2013 of the coastal plain trial were used in the analysis. Because corn was grown asynchronously across treatments, year-to-year variability may not be equally reflected in treatments, though overall yields during the period are still representative of management effects. Overall mean yields from treatments in the coastal plain trial were not statistically different during the period 2002-2013 (Table 7). Mean corn yields ranged from 5463 kg ha⁻¹ corn for CTC to 7105 kg ha⁻¹ corn for CTO2. The RYE for the Wickham sandy loam soil series is 9604 kg ha⁻¹ corn (North Carolina Interagency Nutrient Management Committee, 2014), which indicates that long-term soil management for corn yield from the coastal plain trial is not as productive as it could be. This is partially because soil management in the coastal plain trial has been hindered by challenges such as pest pressure and flooding. Average rainfall did not significantly vary over the course of the trial, so it was not expected that rainfall provided an advantage for yields in any particular year.
Corn yields from the piedmont trial from 1987 to 2015 were considered because the first 3 yr of yield data were not available. Although the trial involved an annual rotation of corn and soybean, there were some years when corn was grown in successive crop years for various reasons. In total, there were 17 corn harvests in the piedmont trial, and mean corn yield was different among treatments (Table 7). The greatest yielding treatment was NTC, which yielded 6516 kg ha⁻¹ corn; the lowest yielding treatment was MPDF, which yielded 3374 kg ha⁻¹ corn. Corn yield from NTC was similar to that of in-row subsoiling in spring and CPF, but was greater than that for all other tillage treatments. The Toast coarse sandy loam soil series within the piedmont is predicted by RYE to yield 7846 kg ha⁻¹ (North Carolina Interagency Nutrient Management Committee, 2014). Although long-term yields have averaged less than RYE, yields in individual years of the trial have exceeded RYE (data not shown). Thus it is possible for management conditions in the piedmont trial to produce the yields expected for the soil series, despite the soil receiving CASH index scores that could imply otherwise (Table 6).
Soybean yield from the NTC treatment at the piedmont trial, averaged over the 10 soybean crop years from 1990 to 2014, was 2832 kg ha⁻¹ (Table 7). Other tillage treatments had soybean yields that were statistically similar to that of NTC, with the exception of MPDS and MPDF, which yielded 1991 and 1942 kg ha⁻¹ soybean, respectively. The Toast coarse sandy loam soil series is predicted by RYE to yield 2556 kg ha⁻¹ soybean (North Carolina Interagency Nutrient Management Committee, 2014), which means that long-term soybean yield from the NTC management is exceeding expectations for the soil series, despite the soils receiving low soil health scores from the CASH.
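The comparisons of long-term mean yields with RYE can be expressed as a simple ratio. The following Python sketch is illustrative only: the function name `percent_of_rye` is hypothetical, and the yield and RYE values are those quoted in the text above for the highest-yielding coastal plain corn treatment (CTO2) and for NTC corn and soybean in the piedmont trial.

```python
def percent_of_rye(mean_yield, rye):
    """Express a mean yield as a percentage of the realistic yield expectation.

    Both arguments must be in the same units (here, kg per hectare).
    """
    if rye <= 0:
        raise ValueError("RYE must be positive")
    return 100.0 * mean_yield / rye


# (mean yield, RYE) pairs quoted in the text, in kg per hectare.
cases = {
    "coastal plain corn, CTO2": (7105, 9604),
    "piedmont corn, NTC": (6516, 7846),
    "piedmont soybean, NTC": (2832, 2556),
}

for label, (mean_yield, rye) in cases.items():
    print(f"{label}: {percent_of_rye(mean_yield, rye):.0f}% of RYE")
```

A value below 100 indicates a long-term mean below the expectation for the soil series, as for corn in both trials, whereas the NTC soybean mean is above 100.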
Sweetcorn was first planted in 1997 and 1998 for the mountain trial, but was not planted again until the continuous sweetcorn crop that lasted from 2007 to 2013. Mean sweetcorn yields for the trial ranged from the lowest at 9200 kg ha⁻¹ sweetcorn for NTO to the highest at 17,283 kg ha⁻¹ sweetcorn for NTC (Table 7). Long-term average yields from NTC and CTC were statistically greater than those from NTO and CTO. CTX plots were not included in the analysis because the lack of management resulted in no yield. No RYE exists for sweetcorn grown in this soil series, so there is no direct yield comparison. Yield differences in the mountain trial were attributed to difficulties with preventing weed competition and pests, which decreased yields in organic treatments, which had limited management options (Larsen et al., 2014; Edgell et al., 2015). Management was especially challenging in the NTO treatment, which was the least productive treatment of the mountain trial. Although NTO management had the highest CASH overall soil health index score on average (Table 6), because of management issues, the measured health of the soil was not reflected in crop yields.
The soil health philosophy promotes sustainable soil management practices that are believed to be beneficial for environmental health and agricultural productivity. Soil management in the trials included conservation practices such as long-term no-till agriculture and cover cropping, both of which are recommendations to improve soil health as defined by CASH (Gugino et al., 2009; Moebius-Clune et al., 2016). Based on the scores the soils received from the CASH, however, there was no correlation between the ability of long-term soil conservation to achieve acceptable soil health scores and high agricultural productivity (Fig. 2). Recent corn yields from plots of the piedmont nine-tillage trial did not significantly correlate with CASH soil health scores (r² = 0.01) and neither did sweetcorn yields from plots of the mountain trial (r² = 0.10). The coastal plain data were not used to correlate recent crop yields and recent soil health scores because of the asynchronous cropping design of the trial. Even though soil health assessments did not clearly differentiate among agronomic management systems, the yield results showed that conservation tillage frequently produces greater yields when used in combination with proper fertilizer and pesticide inputs. Given that these results are consistent with the recommended best management practices, it may be useful to consider potential crop yields in soil health assessments for agronomic management.
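For readers who wish to reproduce this type of comparison with their own plot data, the coefficient of determination can be computed as the squared Pearson correlation between per-plot soil health scores and yields. The Python sketch below assumes a simple linear relationship, which may not match the exact procedure used for Fig. 2; the function name `r_squared` is hypothetical and no data from the trials are included.

```python
def r_squared(x, y):
    """Return the squared Pearson correlation between two equal-length lists.

    For a simple linear regression of y on x, this equals the coefficient
    of determination.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r squared is undefined when a variable is constant")
    return sxy ** 2 / (sxx * syy)
```

Calling `r_squared(scores, yields)` with one overall soil health score and one yield per research plot returns a value between 0 and 1; values near 0, such as those reported above, indicate that the scores explain little of the variation in yield.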

CONCLUSION
We submitted soil samples to NCDA&CS, the HSHT, and the CASH to assess how their analyses and soil management recommendations related to each other and whether soil testing could differentiate among soil health indicators in diverse agronomic systems and regions in North Carolina. The tests used different extractants and methodologies to evaluate soils, and none returned results with practical or quantifiable differences among management practices used on different soils. Only no-till organic management received an adequate soil health score from the CASH, and all but moldboard plowing had average scores of good soil health with the HSHT, which places these two methods at odds with each other in their soil health evaluations. There may be substantial variability in soil properties regardless of agronomic management, or it could be that measurements of soil health indicators are not sensitive enough to differentiate among soil management effects on properties of soils with different compositions. Because soil management effects on soil health indicators used in current soil health testing are limited by intrinsic soil properties, a weighted system of soil health indicators depending on soil and climate may make soil health assessments more practical for various soil uses. Most recommendations for improving soil health focus on conservation tillage practices like no-till and cover cropping, but the yields we observed from our conservation tillage systems were not consistently different from conventional tillage. Soil health management recommendations for agronomic systems need to be adjusted to account for differences in intrinsic soil properties that are currently incapable of supporting a broad standard for soil health.

Fig. 1. Map of agronomic trials located across North Carolina. The trial locations are indicated by stars. To the left is Mills River in the mountains, in the middle is Reidsville in the piedmont, and to the right is Goldsboro in the coastal plain.

Table 3. Abridged soil test results from the North Carolina Department of Agriculture and Consumer Services soil testing laboratory organized by location. Soil management treatments include no-till chemical (NTC), no-till organic (NTO), conventional tillage chemical (CTC), conventional tillage organic (CTO), conventional tillage without fertilizer or pesticides (CTX), in-row subsoiling (IRS), disking in spring (DS), chisel plowing in fall (CPF) or spring (CPS), chisel plowing and disking in fall (CPDF) or spring (CPDS), and moldboard plowing and disking in fall (MPDF) or spring (MPDS). Treatments are compared within each site using the Scheffe means comparison test at p = 0.05. Numbers in parentheses are SD of the means and letters indicate groupings for statistical differences.
2 respiration to calculate a health score for soils. Treatments are compared within each site using the Scheffe means comparison at p = 0.05. Numbers in parentheses are SD of the means and letters indicate groupings for statistical differences.

Table 6. Cornell Comprehensive Assessment of Soil Health (CASH) soil indicator index scores for each parameter included in the standard analysis grouped by location. Soil management treatments include no-till chemical (NTC), no-till organic (NTO), conventional tillage chemical (CTC), conventional tillage organic (CTO), conventional tillage without fertilizer or pesticides (CTX), in-row subsoiling (IRS), disking in spring (DS), chisel plowing in fall (CPF) or spring (CPS), chisel plowing and disking in fall (CPDF) or spring (CPDS), and moldboard plowing and disking in fall (MPDF) or spring (MPDS). Treatments are compared within each site using the Scheffe means comparison test at p = 0.05. Numbers in parentheses are SD of the means and letters indicate groupings for statistical differences.
† AWC, available water capacity. ‡ The score for minor elements is the mean of subscores for Fe, Mg, Mn, and Zn soil concentrations. § The overall score is an unweighted average of the 12 individual indicator index scores.

Fig. 2. Correlation between the Cornell Comprehensive Assessment of Soil Health (CASH) overall soil health scores and recent crop yields (Mg ha⁻¹) for soils of the piedmont (a) and mountain (b) trials. Each solid circle on the graph represents an individual research plot.

Table 2. The p-values for the F-test statistics of select soil parameters of soil analyses from the North Carolina Department of Agriculture and Consumer Services (NCDA&CS), Haney soil health test (HSHT), and Cornell Comprehensive Assessment of Soil Health (CASH).
† WEOC, water extractable organic C; WEON, water extractable organic N; NA, not assessed.