About Us | Help Videos | Contact Us | Subscriptions
 

Journal of Environmental Quality - Surface Water Quality

Watershed Regressions for Pesticides (WARP) Models for Predicting Stream Concentrations of Multiple Pesticides

 

This article in JEQ

  1. Vol. 42 No. 6, p. 1838-1851
    unlockOPEN ACCESS
     
    Received: May 09, 2013
    Published: June 25, 2014


    * Corresponding author(s): wwstone@usgs.gov
 View
 Download
 Alerts
 Permissions
Request Permissions
 Share

doi:10.2134/jeq2013.05.0179
  1. Wesley W. Stone *a,
  2. Charles G. Crawforda and
  3. Robert J. Gilliomb
  1. a U.S. Geological Survey, 5957 Lakeside Blvd., Indianapolis, IN 46278
    b U.S. Geological Survey, Placer Hall, 6000 J St., Sacramento, CA 95819

Abstract

Watershed Regressions for Pesticides for multiple pesticides (WARP-MP) are statistical models developed to predict concentration statistics for a wide range of pesticides in unmonitored streams. The WARP-MP models use the national atrazine WARP models in conjunction with an adjustment factor for each additional pesticide. The WARP-MP models perform best for pesticides with application timing and methods similar to those used with atrazine. For other pesticides, WARP-MP models tend to overpredict concentration statistics for the model development sites. For WARP and WARP-MP, the less-than-ideal sampling frequency for the model development sites leads to underestimation of the shorter-duration concentration; hence, the WARP models tend to underpredict 4- and 21-d maximum moving-average concentrations, with median errors ranging from 9 to 38% As a result of this sampling bias, pesticides that performed well with the model development sites are expected to have predictions that are biased low for these shorter-duration concentration statistics. The overprediction by WARP-MP apparent for some of the pesticides is variably offset by underestimation of the model development concentration statistics. Of the 112 pesticides used in the WARP-MP application to stream segments nationwide, 25 were predicted to have concentration statistics with a 50% or greater probability of exceeding one or more aquatic life benchmarks in one or more stream segments. Geographically, many of the modeled streams in the Corn Belt Region were predicted to have one or more pesticides that exceeded an aquatic life benchmark during 2009, indicating the potential vulnerability of streams in this region.


Abbreviations

    ALB, Aquatic-Life Benchmark; ARP, Acetochlor Registration Partnership; CCC, Criterion Continuous Concentration; CMC, Criteria Maximum Concentration; EPTC, S-ethyl N,N-dipropylcarbamothioate; LTMDL, long-term method detection limit; NASQAN, National Stream Quality Accounting Network; NAWQA, National Water-Quality Assessment; NCWQR, National Center for Water Quality Research; PI, prediction interval; pR2, pseudo coefficient of determination; RF1, River Reach File 1; SWMI, surface-water mobility index; WARP, Watershed Regressions for Pesticides; WARP-MP, Watershed Regressions for Pesticides for multiple pesticides

Pesticide concentrations in streams vary widely across the United States because factors influencing pesticide transport to streams, such as pesticide use, application practices and timing, climate, and watershed characteristics, vary geographically (Gilliom et al., 2006). Direct measurement of pesticide concentrations in all the streams in the United States would be ideal, but the high cost of monitoring and analysis has resulted in only a small number of streams having direct measurements. The lack of adequate data on pesticide concentrations in most streams creates a need for tools that can be used to predict pesticide concentrations for streams that are not sufficiently monitored. Modeled predictions cannot replace direct measurements when reliability requirements are high; however, they are useful for screening-level assessments and preliminary analyses, identifying vulnerable watersheds, and guiding more intensive monitoring when greater reliability is required.

Pesticide concentrations in unmonitored streams can be estimated with process-based models, but development of these models for the many unmonitored streams in the United States would be very time and resource intensive, and performance would be difficult to evaluate. Empirical regression models, based on relations between (i) pesticide concentrations observed at water-quality monitoring sites and (ii) selected watershed characteristics for those sites, are less time and resource intensive and can be readily evaluated because they are based on direct measurements of pesticide stream concentrations. Larson and Gilliom (2001) developed empirical regression models, called Watershed Regressions for Pesticides (WARP), for five herbicides (alachlor, atrazine, cyanazine, metolachlor, and trifluralin). Larson et al. (2004) and Stone and Gilliom (2009, 2012) further developed the WARP models for atrazine by updating and expanding potential explanatory variables and adding regional modifications. The WARP models can be used for predicting selected concentration statistics for unmonitored streams (including confidence bounds on predictions), for predicting the probability that concentrations will exceed a specified level at an unmonitored stream site, and for identifying vulnerable watersheds for more intensive study in areas with inadequate direct measurements of concentrations (USEPA, 2007).

Since the report by Larson and Gilliom (2001), most of the attention for WARP development has been directed toward refining the atrazine WARP models; however, there is a continuing need for the same type of models for other pesticides. The development of WARP models for each individual pesticide, such has been done for atrazine, is confounded by the very large percentage of censored (nondetections) stream concentration data for the various pesticides. Frequently, more than 75% of the stream concentration data for a pesticide are censored. When the majority of the stream concentration data are censored, the relation between stream concentrations and watershed characteristics is dominated by the censored stream concentration data and may produce biased results for the relatively few uncensored stream concentration data. An alternative approach to multiple single-pesticide WARP models is to develop a single multipesticide model that includes one or more adjustment factors that characterize the physical and chemical properties of each compound that may affect transport of the pesticide from land surface to streams.

The primary objective of this paper is to describe the development and application of WARP models for multiple pesticides (WARP-MP) that can be used to predict selected concentration statistics for a wide range of pesticides in unmonitored streams. The approach taken was to develop the multipesticide model as an adjustment to the atrazine WARP model by including factors that incorporate the properties of the individual pesticides. A second objective, therefore, was to update the national atrazine WARP models by including additional stream concentration data and new potential explanatory variable data so that the atrazine model was refined before being applied as part of the multipesticide model. The last review and update of the national atrazine WARP models was in 2009 (Stone and Gilliom, 2009), and the models need to be reevaluated periodically to include new, potentially significant watershed characteristic information, new monitoring sites, and new data for previous model development sites. The advancement of ancillary data collection and processing techniques over time provides new or updated watershed characteristic information that may not have been available for evaluation in the previous models. For example, Stone and Gilliom (2012) included the depth to a soil restrictive layer as a potential explanatory variable (not available for Stone and Gilliom [2009]) in the atrazine Corn Belt WARP models, and this variable was found to be highly significant.


Materials and Methods

Pesticides Selected for Modeling

Pesticides with nonagricultural use estimated to be greater than 10% of their national total use (USEPA, 2012a) were excluded because nonagricultural pesticide use information was not available. Estimated agricultural use and concentration data for 30 pesticides were used in model development, and six additional pesticides were used to evaluate model performance (Table 1).


View Full Table | Close Full ViewTable 1.

Pesticides used in development and evaluation of WARP-MP.

 
Common name Chemical name
Acetochlor 2-chloro-N-(ethoxymethyl)-N-(2-ethyl-6-methylphenyl)acetamide
Acifluorfen 5-[2-chloro-4-(trifluoromethyl)phenoxy]-2-nitrobenzoic acid
Atrazine 6-chloro-N-ethyl-N′-(1-methylethyl)-1,3,5-triazine-2,4-diamine
Bentazon 3-(1-methylethyl)-1H-2,1,3-benzothiadiazin-4(3H)-one 2,2-dioxide
Bromoxynil 3,5-dibromo-4-hydroxybenzonitrile
Butylate S-ethyl N,N-bis(2-methylpropyl)carbamothioate
Carbofuran 2,3-dihydro-2,2-dimethyl-7-benzofuranyl N-methylcarbamate
Chlorimuron ethyl ethyl 2-[[[[(4-chloro-6-methoxy-2-pyrimidinyl)amino]carbonyl]amino]sulfonyl]benzoate
Cyanazine 2-[[4-chloro-6-(ethylamino)-1,3,5-triazin-2-yl]amino]-2-methylpropanenitrile
EPTC S-ethyl N,N-dipropylcarbamothioate
Ethalfluralin N-ethyl-N-(2-methyl-2-propenyl)-2,6-dinitro-4-(trifluoromethyl)benzenamine
Ethoprop O-ethyl S,S-dipropyl phosphorodithioate
Fluometuron N,N-dimethyl-N′-[3-(trifluoromethyl)phenyl]urea
Fonofos O-ethyl S-phenyl P-ethylphosphonodithioate
Linuron N′-(3,4-dichlorophenyl)-N-methoxy-N-methylurea
Methomyl methyl N-[[(methylamino)carbonyl]oxy]ethanimidothioate
Metolachlor 2-chloro-N-(2-ethyl-6-methylphenyl)-N-(2-methoxy-1-methylethyl)acetamide
Nicosulfuron 2-[[[[(4,6-dimethoxy-2-pyrimidinyl)amino]carbonyl]amino]sulfonyl]-N,N-dimethyl-3-pyridinecarboxamide
Oxamyl methyl 2-(dimethylamino)-N-[[(methylamino)carbonyl]oxy]-2-oxoethanimidothioate
Pebulate S-propyl N-butyl-N-ethylcarbamothioate
Phorate O,O-diethyl S-[(ethylthio)methyl] phosphorodithioate
Pronamide 3,5-dichloro-N-(1,1-dimethyl-2-propynyl)benzamide
Propachlor 2-chloro-N-(1-methylethyl)-N-phenylacetamide
Propanil N-(3,4-dichlorophenyl)propanamide
Propargite 2-[4-(1,1-dimethylethyl)phenoxy]cyclohexyl 2-propynyl sulfite
Terbacil 5-chloro-3-(1,1-dimethylethyl)-6-methyl-2,4(1H,3H)-pyrimidinedione
Terbufos S-[[(1,1-dimethylethyl)thio]methyl] O,O-diethyl phosphorodithioate
Thiobencarb S-[(4-chlorophenyl)methyl] N,N-diethylcarbamothioate
Triallate S-(2,3,3-trichloro-2-propen-1-yl) N,N-bis(1-methylethyl)carbamothioate
Trifluralin 2,6-dinitro-N,N-dipropyl-4-(trifluoromethyl)benzenamine
Pesticides used for model evaluation
Alachlor 2-chloro-N-(2,6-diethylphenyl)-N-(methoxymethyl)acetamide
Benomyl methyl N-[1-[(butylamino)carbonyl]-1H-benzimidazol-2-yl]carbamate
Methyl parathion O,O-dimethyl O-(4-nitrophenyl) phosphorothioate
Metribuzin 4-amino-6-(1,1-dimethylethyl)-3-(methylthio)-1,2,4-triazin-5(4H)-one
Norflurazon 4-chloro-5-(methylamino)-2-[3-(trifluoromethyl)phenyl]-3(2H)-pyridazinone
Oryzalin 4-(dipropylamino)-3,5-dinitrobenzenesulfonamide

Concentration Statistics Selected for Modeling

The annual mean and maximum moving-average durations of 4, 21, 30, 60, and 90 d were selected to correspond to exposure durations used to estimate environmental concentrations for comparison to water-quality benchmarks for human health or aquatic life. These comparisons are intended to be used in the identification of vulnerable watersheds for more intensive study. The observed annual maximum concentration, typically used for comparison to acute benchmarks, was not selected for modeling. Observed annual maximum concentrations from sites with sampling frequencies of those used in the development of the WARP and WARP-MP models are consistently biased low compared with observations from more frequently sampled sites (Stone et al., 2008). In addition, annual maximum concentrations derived from discrete observations at a site are likely underestimates of the true annual maximum concentrations because of the limitations of any sampling frequency. The degree of uncertainty in relation to the difference between the measured and actual annual maximum concentration was considered to be too large to include this concentration statistic in the modeling.

Site Selection

Pesticide concentration data from stream sites sampled by the U.S. Geological Survey (USGS) National Water-Quality Assessment (NAWQA) and National Stream Quality Accounting Network (NASQAN) programs were used for model development. Data were collected between 1992 and 2007. This study was limited to stream monitoring sites with pesticide concentration data suitable for estimating concentration statistics, such as the annual mean and maximum moving-averages (4-, 21-, 30-, 60-, and 90-d durations). Ideally, a candidate site would have been sampled daily, at least during runoff events throughout the period of high pesticide use, to accurately characterize short-term concentrations, such as the annual maximum 4- and 21-d moving-average (USEPA, 2010). However, there are few sites with such high sampling frequencies. The selection criteria for model development sites balanced the goals of (i) restricting sites to those with sufficient concentration data determined adequate to estimate annual concentration statistics and (ii) retaining a large number of sampling sites. Site selection criteria were based on Larson et al. (2004) and focus on watershed area and sampling frequency during different pesticide runoff periods (Fig. 1).

Fig. 1.
Fig. 1.

Site selection criteria.

 

The pesticide runoff periods (Table 2) used to select sites were determined in consultation with USGS personnel familiar with different regions of the country. Each sampling site was assigned to one of seven pesticide runoff groups (Table 2). The same group was used for all sites in a NAWQA study unit (USGS, 2013). Low-, medium-, and high-runoff periods were chosen to reflect qualitatively the likelihood of pesticides being applied and transported by streams during a given period in a given area of the country. The groupings in Table 2 differ slightly from those used in previous work (Larson et al., 2004).


View Full Table | Close Full ViewTable 2.

Pesticide runoff periods used to screen sampling sites for developing regression models.

 
Pesticide-runoff group Low period Medium period High period
A Nov.–Dec. Jan., Feb., Aug.–Oct. Mar.–July
B Oct.–Dec. Jan., Sept. Feb.–Aug.
C Oct.–Jan. Feb., Aug., Sept. Mar.–July
D Oct.–Feb. Aug., Sept. Mar.–July
E Oct.–Mar. Aug., Sept. Apr.–July
F Oct.–Mar. Apr., Aug., Sept. May–July
G Aug.–Apr. May–July none
H Sept.–May June–Aug. none

More samples were required to calculate concentration statistics for small streams than large streams because of the greater temporal variability in small streams (Crawford, 2004). The maximum gap allowed between samples was longer for large streams than for small ones and for periods when pesticide concentrations were expected to be lower and less variable (for example, during the winter in northern states when pesticide use and runoff are low). During high pesticide runoff periods, the average gap between sampling dates was about 13 d for the WARP and WARP-MP model development data. The average gap between sampling dates for small and medium drainage area model-development sites was between 9 and 10 d, whereas the gap for large and very large drainage area model-development sites was about 16 d. The sampling frequency for the model development sites is generally expected to provide estimates of short-duration maximum moving-average concentrations, such as the 4- and 21-d models, that underestimate the actual stream concentrations; however, concentration statistics of longer duration are less likely to be underestimated (Crawford, 2004; Stone et al., 2008). Maximum moving-average concentrations calculated from samples collected as frequently as every 4 d during the high pesticide runoff period can be biased low when compared with estimates calculated from data collected more frequently and targeting runoff events (Lerch et al., 2011; USEPA, 2007; Crawford, 2004).

To illustrate the potential underestimation of atrazine concentration statistics calculated for model development sites, the sampling frequency simulation analysis described in Stone et al. (2008) was modified to evaluate 4- and 21-d maximum moving-average concentration estimates based on a 9- to 11-d sampling interval during the high pesticide runoff season. This analysis used four National Center for Water Quality Research (NCWQR) of Heidelberg University, Tiffin, Ohio sites that typically had a sampling frequency of 16 to 20 samples per month during the high pesticide runoff season. In addition, NCWQR collected multiple samples during selected days to characterize runoff events. The simulation procedure described in Stone et al. (2008) was followed with the exception of the interval between samples. For this analysis, the interval was from 9 to 11 d, randomly, between sampling dates during the high pesticide runoff season. This interval was chosen to represent the sampling frequency of small and medium drainage area model development sites.

The difference between the “true” concentration statistic derived from linearly interpolated NCWQR data for each year and site compared with the concentration statistic derived from a subsample of the linearly interpolated data is expressed as percent error. The drainage areas for the four NCWQR sites used in the simulation range from 90 km2 for Rock Creek to 16,400 km2 for Maumee River. For sampling frequencies typical of small drainage area model development sites (represented by Rock Creek), atrazine concentration statistics used in developing WARP were underestimated by a median error of 38% for the 4-d maximum moving-average and by a median error of 17% for the 21-d maximum moving-average (Supplemental Table S1). For some streams and rivers, the 4- and 21-d moving-average concentration statistics may be underestimated by as much as 84 and 73%, respectively. The sampling frequency simulation analysis provides only a general indication of the possible amount of underestimation that may be present in the model development data because four streams were used in the analysis, and these sites are not likely to be representative of all the stream conditions for the WARP and WARP-MP model development sites.

Atrazine data from 114 NAWQA and NASQAN sampling sites met the selection criteria and were used for development of the WARP models (Fig. 2). Data for pesticides other than atrazine from 122 NAWQA and NASQAN sampling sites were used in development of the WARP-MP models (Fig. 2). A complete list of sampling sites used in model development and evaluation is provided in Supplemental Table S2. The number of usable analyses per site for each pesticide varied because of missing data due to analytical problems; thus, the same number of observations for each pesticide was not available at all sites. Not all NAWQA and NASQAN sites that met the sampling frequency requirements were used because some sites represented the same area as another site. The NAWQA and NASQAN sites meeting selection criteria, but not included in the model development data, were used with model evaluation.

Fig. 2.
Fig. 2.

Sampling sites used in development and evaluation of WARP and WARP-MP.

 

Sites used for evaluation of the WARP and WARP-MP models include 10 sites sampled by the Acetochlor Registration Partnership (ARP), six sites sampled by Bayer Crop Science, nine sites sampled by the NCWQR, five sites sampled by the Monsanto Company, five sites sampled by the Syngenta Crop Protection Atrazine Voluntary Monitoring Program, and 28 sites sampled by the USGS NAWQA and NASQAN programs (sites not used as part of the model development data). Sites sampled by ARP and industry were targeted at community water supplies and are not identified by name in this article. The sampling frequency at model evaluation sites sampled by all data sources except the ARP satisfied the criteria shown in Fig. 1. For the ARP sites used for model evaluation, 14 to 15 samples were collected during the 1-yr period. Samples were collected at ARP sites about once every 2 wk during the growing season and once every 2 mo during the rest of the year.

Calculation of Concentration Statistics

The annual mean and maximum moving-average pesticide concentrations of varying durations were calculated following the methods described in Larson et al. (2004) and Stone et al. (2008). Any annual concentration statistic calculated to be less than the long-term method detection limit (LTMDL) for the specific pesticide was considered censored at the LTMDL.

The annual mean concentrations were calculated by weighting each observed concentration according to the amount of time it was used to represent the pesticide concentration in the stream. Specifically, the weights were computed as the amount of time extending from one half the time interval between a value and the preceding value and one half the time interval extending from the value to the subsequent value divided by the total time in 1 yr. If less than 10% of the weighted data for a station and sampling year were censored, censored values were replaced by one half the censoring threshold reported by the laboratory (USEPA, 2000). If more than 10% of the weighted data were censored and there were at least 20 annual observations with at least 10 uncensored observations and at least 33% of the sample weights were represented by uncensored observations, then the log-regression method (Gilliom and Helsel, 1986; Helsel and Gilliom, 1986) was used to approximate the mean concentration. Otherwise, the mean was considered to be censored at the LTMDL.

For determining annual maximum moving-average pesticide concentrations, hourly pesticide concentrations were estimated for the entire period of record for each site through linear interpolation of actual observations. Censored observations were assigned a value of zero for the process of linear interpolation. The hourly concentration estimates were averaged to obtain an estimated daily concentration. The hourly estimates facilitated computations for days with multiple samples but were not used for other purposes. Moving-average concentrations for the selected durations (4, 21, 30, 60, and 90 d) were computed for each day. The annual maximum moving-average concentrations for each duration were then determined for each site–year combination meeting the selection criteria.

To include temporal aspects of pesticide use, climate, and stream pesticide concentrations, all years of pesticide stream concentration data meeting the selection criteria for a site were included in the analysis. Most sites (73%) had 3 yr or less of pesticide concentration data meeting the selection criteria, and only a few sites (3%) had >10 yr of data meeting the selection criteria (Supplemental Fig. S1). For the atrazine WARP models, there were 347 site and year combinations used for model development. To reduce the influence of unknown nonagricultural pesticide use on the WARP-MP models, combinations of pesticide, site, and year with zero estimated pesticide use were removed. For the WARP-MP models, there were 6235 site–pesticide–year combinations used for model development and evaluation.

Comparisons between predicted concentration statistics and concentration statistics computed from observations are made frequently in the discussion of model performance. Terms used for these comparisons are defined here for clarity. Concentration statistics generated by the WARP and WARP-MP models are referred to as “predicted concentration statistics,” and concentration statistics computed from observations are referred to as “observed concentration statistics.”

Watershed Characteristics Used as Explanatory Variables

The WARP model uses an empirical, multiple-regression modeling approach. A large number of variables that could reasonably affect pesticide transport and runoff are considered as potential explanatory variables. Detailed descriptions of the potential explanatory variables are provided in Supplemental Table S3. Potential explanatory variables representing pesticide use, land use and population, agricultural management practices, soil properties, physical watershed characteristics, weather and climate characteristics, and hydrological properties were considered.

Statistical Analysis

Some atrazine stream concentration statistics computed for some sites were less than the LTMDL and were considered censored. In the context of an explanatory model, a censored observation is one for which the value of the response variable corresponding to values of the explanatory variables is not observable but is known to be below a specified level. Conventional least-squares methods for estimating parameters of this model using the entire sample or the subsample of complete observations yield biased and inconsistent estimates (Judge et al., 1985). One way of expressing the regression model in the presence of censored observations is with the Tobit model (Judge et al., 1985). When the regression model residual errors are independent and are identically and normally distributed with mean zero and variance σε2, the maximum likelihood method can be used to obtain parameter estimates of a censored linear model (Maddala, 1983). The survreg procedure (Therneau, 1999) in the statistical analysis program S-PLUS (TIBCO Software Inc., 2008) was used to obtain maximum likelihood estimates of the parameters of the atrazine WARP models for this study.

Measures of goodness of fit, such as the standard deviation of the residual error (commonly referred to as the root mean square error) or the coefficient of multiple determination (R2) used for conventional least-squares regression analysis, cannot be computed for the Tobit regression model. The standard deviation of residual error is alternatively referred to as the “scale parameter” in maximum likelihood estimation. Estimates of the scale parameter from the maximum likelihood procedure provide only asymptotically unbiased estimates of the standard deviation of the residual error when estimated from sample data (Aitkin, 1981). These values, on average, underestimate the true standard deviation. The bias is a function of the sample size and degree of censoring, and it is expected to be lower for models based on data with lower percentages of censored observations. In the remainder of this article, biased estimates of the standard deviation of residual error are referred to as “scale” in figures and tables. Several pseudo R2 (pR2) measures suitable for use with the Tobit regression model have been proposed in the literature as alternatives to R2. For this study, pR2 was calculated by using the method of Laitila (1993). As with conventional R2, pR2 ranges from 0 to 1 and is an estimate of the proportion of the variation in the response variable explained by the regression model (0 indicates no variation is explained; 1 indicates all variation is explained).

The maximum likelihood methods used for estimating the parameters of the regression models require that the relation between the variables be linear in the parameters and that the residual error be identically and normally distributed. Departures from these requirements can result in flawed estimates of model coefficients. One means of addressing departures from model assumptions is through transformation of the response or the explanatory variables or both (Neter et al., 1985).

Various transformations were considered to minimize departures from the requirements of the maximum likelihood methods used. The logarithm of concentration was used as the response variable throughout model development. For the explanatory variables, logarithmic, square-root, and fourth-root transformations, as well as the untransformed value, were considered during development of the regression models.

Because the response variable is a logarithmic transformation, concentrations predicted by the model (after retransformation) are the median concentrations expected for sites that have a given set of explanatory values, rather than the mean concentrations. Predicted concentration statistics were not adjusted for transformation bias because estimates of median values of the statistics were considered appropriate for the purposes of this study.

Selection of Explanatory Variables for the Atrazine WARP Models

A stepwise approach was used for the initial selection of explanatory variables to be included in the atrazine WARP models. The StepAIC procedure (Venables and Ripley, 1999), implemented in S-PLUS, was used for the initial selection of explanatory variables to be included in the regression models. The StepAIC procedure, based on Akaike’s Information Criterion (Akaike, 1974), balances model goodness of fit with the number of parameters needed to achieve that fit. The procedure attempts to quantify the concept of model parsimony by choosing simpler models over complex ones unless a complex model substantially improves the fit of the model.

An incremental procedure was used to eliminate potential variables because there were too many potential explanatory variables to be considered simultaneously by the StepAIC procedure. First, each potential explanatory variable and its transformations were considered as a group. Variables selected by the StepAIC procedure were retained for further consideration. Second, small groups (groups based on their general type such as land use, soil properties, or hydrologic properties) of the variables retained after the first step were selected and evaluated. Variables selected were retained for further evaluation. Third, variables selected from the second step were combined and evaluated. For all the selection steps, the StepAIC procedure may select two or more variables that are similar in terms of what they represent (redundant), highly correlated with one another, or dependent. In these cases, the variable with the lowest p value was retained, and the other redundant, highly correlated, or dependent variable was eliminated from consideration. Fourth, to see if they improved model fit, variables eliminated in earlier steps were tested with variables retained after the third step. Fifth, the selected set of explanatory variables was evaluated through cross-validation. The model development data were divided into two subsets: a calibration data set (containing 70% of the data) and an evaluation data set (containing 30% of the data). Each model was fit to the calibration data subset, and predicted values and residuals were computed for the calibration data and the evaluation data subsets. Then the process was reversed, and the models were fit to the evaluation data set. Residuals were compared between the two fitted models and evaluation data subsets to cross-validate the final selection of explanatory variables. Finally, explanatory variables were evaluated subjectively for reasonableness (e.g., the models predict increasing concentrations with increasing pesticide use) and their overall contribution to explaining the variation in the concentration statistics.

Analysis of Model Fit

Diagnostics for censored regression (Escobar and Meeker, 1992) available in the survreg procedure were used to assess influential observations and to aid in variable selection. In particular, the deviance residuals described by Escobar and Meeker (1992) were used to assess the appropriateness of transformations used for explanatory variables. Variance inflation factors were used for detecting the presence of multicollinearity among explanatory variables (Neter et al., 1985). Box and whisker plots (Tukey, 1977) were used to qualitatively assess model uncertainty. Boxplots were used for displaying the distribution of model residuals (the logarithm of the observed concentration minus the logarithm of the predicted concentration). For the purpose of creating boxplots and evaluating model performance, the residual for a prediction with a censored observed concentration, ec, was computed (Gregory Schwarz, USGS, written communication, 2004) aswhere σε is the standard deviation of the residual error, Φ is the cumulative normal integral, u is a random number drawn from a uniform (0,1) distribution, CT is the censoring threshold concentration, and C is the predicted concentration.

Uncertainty in the prediction of a concentration statistic can be expressed in terms of a prediction interval (PI) for a specified confidence level. Conceptually, each predicted concentration statistic is the median estimate of the particular concentration statistic (e.g., annual mean or annual maximum moving-averages) for all the stream sites that have the same combination of values for the explanatory variables. The PI is the range of values for a concentration statistic within which 95% of the actual concentration-statistic values are expected to occur for all stream sites with the same values of explanatory variables. In addition, the PI can be interpreted as the range within which the actual concentration statistic for an individual site and year is expected to fall 95% of the time. Prediction intervals and the probability that a predicted concentration will exceed a specific value can be estimated for WARP and WARP-MP models by the methods described in Stone and Gilliom (2012). For the WARP-MP models, the uncertainty in the model predictions is estimated by summing the variance from the atrazine WARP models and the variance from the adjustment factor optimization. However, the variance from the adjustment factor optimization was based on uncensored observations for all concentration statistics, not on individual concentration statistics or censored observed concentration statistics. Consequently, these uncertainty estimates may underestimate the true effect of censoring on the prediction intervals and on the probability of exceeding a specified concentration.

Adjustment Factor Estimation

The adjustment factor in the WARP-MP models is a multiplicative factor applied to the concentrations predicted by the atrazine WARP models to estimate concentration statistics for other pesticides. Pesticide physical properties, chemical properties, and primary application method were evaluated to determine adjustment factors for the WARP-MP models (Supplemental Table S4). Application method (surface applied or soil incorporated) was not found to be a significant variable to use in the adjustment factors; however, only 20% of the data used in evaluating the adjustment factors represented soil incorporated pesticides. Vapor pressure (Vp) and surface-water mobility index (SWMI) of the specific pesticides were found to be the best variables to use in the adjustment factors. Chen et al. (2002) developed the SWMI to characterize the likelihood that a pesticide would be transported to surface water based on the pesticide’s soil degradation half-life and organic carbon sorption coefficient (koc). Pesticide chemical properties were obtained from Norman et al. (2012) and the Pesticide Properties DataBase (University of Hertfordshire, 2013). The adjustment factor used in the WARP-MP models includes the ratio of the SWMI for a given pesticide to the SWMI for atrazine and the ratio of Vp for atrazine to the Vp for a given pesticide (when the Vp for a pesticide is less than the Vp for atrazine, this ratio is considered equal to 1). The adjustment factor has the form:where SWMIpest and SWMIatra are the surface water mobility index for the pesticide and atrazine, respectively, and Vppest and Vpatra are the vapor pressure for the pesticide and atrazine, respectively.

The exponential coefficients for the SWMI ratio (0.5288) and Vp ratio (0.0996) were estimated by optimizing the following relation:where Cobs is the observed concentration statistic for a given pesticide; Cuse is the predicted concentration statistic for the given pesticide from the atrazine WARP model with the reported pesticide use in the watershed (specific to each pesticide); Czero is the predicted concentration statistic for the given pesticide from the atrazine WARP model, assuming zero pesticide use in the watershed; Sswmi is the coefficient for the SWMI ratio; and VVp is the coefficient for the Vp ratio. All uncensored observations for all concentration statistics were used to determine these coefficients; data for atrazine were excluded. The residual standard error for the optimization was 0.61 (about 245%).

Both of the terms in the adjustment factor for pesticides with properties similar to atrazine would be near 1, and, thus, predicted concentrations would be similar to those for atrazine, given similar use. The adjustment factors for the 36 pesticides ranged from 0.1303 for bromoxynil to 1.0774 for chlorimuron ethyl; the median was 0.5412. This reflects the fact that atrazine is generally more mobile and more persistent than most other pesticides that were used during the time of this study, and the adjustment factor reflects this: 32 of the 35 pesticides had adjustment factors <1.

Application to Streams in the United States

The WARP-MP models were used to predict pesticide concentrations in streams for 2009. The network of streams was defined by the USEPA River Reach File 1 (USGS, 2002). For every River Reach File 1 (RF1) watershed, geospatial analysis tools were used to compute values of the explanatory variables required by the WARP-MP models. This application was limited to pesticides with chemical properties within the range of those used in WARP-MP model development and to pesticides with established Aquatic-Life Benchmarks (ALBs) (Supplemental Table S5) but was not limited to pesticides included in model development.

The application of WARP-MP to RF1 streams uses the USEPA Office of Pesticide Program’s Aquatic-Life Benchmarks (USEPA, 2012b) to illustrate how these models can be used to guide additional and more intensive monitoring. When there were multiple formulations of a pesticide and associated ALBs, this analysis selected the lowest ALB to be used in comparison to model predictions. To represent acute plant ALBs, the lower value between the nonvascular and vascular plant benchmarks was used in the comparison to model predictions. The acute fish/invertebrate and acute plant ALBs were compared with the WARP-MP–predicted 4-d maximum moving-average concentrations. The predicted 4-d maximum moving-average concentration underestimates the highest stream concentrations and, thus, when compared with the acute ALBs, underestimates the number of potentially vulnerable streams for these benchmarks. For the USEPA Office of Water Aquatic-Life Criteria, the Criterion Continuous Concentration (CCC) was used in the comparison; however, if the CCC was not available and the Criteria Maximum Concentration (CMC) was, then the CMC was used in the comparison. WARP-MP–predicted 4-d maximum moving-average concentrations were used in the comparison to the CCC or CMC values. The chronic fish and Aquatic Community ALBs were compared with the WARP-MP–predicted 60-d maximum moving-average concentrations, and the chronic invertebrate ALB was compared with the predicted 21-d maximum moving-average concentrations.


Results and Discussion

Atrazine WARP

The WARP models for the annual mean and maximum moving-average atrazine concentration statistics have the following form:where USEINTL is atrazine use intensity, the annual agricultural use in a watershed (kg), derived from EPest-low use estimates (Thelin and Stone, 2013) divided by watershed area (km2); SRL25 is the percentage of the watershed agricultural land with a soil-restrictive layer within the top 25 cm of the soil surface; PMJN is the total precipitation (in mm) during May and June of the year of sampling; R is the rainfall-erosivity factor from the Universal Soil Loss Equation; and PERDUN is the percentage of total streamflow derived from runoff caused by precipitation on saturated soil (Dunne overland flow). Statistics and coefficients for the atrazine WARP models are shown in Table 3.


View Full Table | Close Full ViewTable 3.

Statistics and coefficients for atrazine WARP models.

 
Model Regression coefficients
pR2 Scale
Intercept (USEINTL)^1/4 SRL25 PMJN log10R PERDUN
4-d† −3.34 (<0.001)‡ 1.19 (<0.001) 0.0131 (<0.001) 0.0020 (<0.001) 0.5489 (<0.001) −0.1088 (<0.001) 0.80 0.50
21-d† −3.38 (<0.001) 1.15 (<0.001) 0.0129 (<0.001) 0.0021 (<0.001) 0.5499 (<0.001) −0.1154 (<0.001) 0.81 0.47
30-d† −3.42 (<0.001) 1.14 (<0.001) 0.0128 (<0.001) 0.0021 (<0.001) 0.5615 (<0.001) −0.1219 (<0.001) 0.81 0.47
60-d† −3.44 (<0.001) 1.11 (<0.001) 0.0130 (<0.001) 0.0021 (<0.001) 0.5382 (<0.001) −0.1215 (<0.001) 0.81 0.44
90-d† −3.47 (<0.001) 1.08 (<0.001) 0.0128 (<0.001) 0.0020 (<0.001) 0.5449 (<0.001) −0.1233 (<0.001) 0.82 0.43
Annual mean −3.86 (<0.001) 1.01 (<0.001) 0.0124 (<0.001) 0.0016 (<0.001) 0.6278 (<0.001) −0.1432 (<0.001) 0.82 0.40
Annual maximum moving-average concentration.
Values in parentheses are P values.

The current atrazine WARP models are nearly identical to the previous atrazine WARP models (Stone and Gilliom, 2009) in terms of goodness of fit measures. However, the current models incorporate expanded stream concentration data and updated explanatory variable data. The current atrazine WARP models do not include the K-factor or drainage area variables that were significant in the previous models but include a new variable, SRL25, as described below.

The current models include a variable characterizing the amount of a soil restrictive layer presence in the watershed (SRL25). Lerch and Blanchard (2003) found that the presence of a layer impeding the downward movement of surface water to groundwater was associated with higher vulnerability of a watershed for pesticide transport to streams. SRL25 was highly significant and likely replaced the function of the K-factor variable that was significant in the previous national atrazine WARP models. SRL25 was not available for the United States at the time Stone and Gilliom (2009) updated the national atrazine WARP models. Stone and Gilliom (2012) found that SRL25 was a highly significant and important variable in the Corn Belt WARP models. SRL25 is likely a better indicator of watershed vulnerability for pesticide transport to streams than the K-factor.

WARP performance was not biased for model development sites. Median error ranged from 5% underprediction for the 60-d maximum moving-average concentration to 10% overprediction for the 4-d maximum moving-average concentration. However, the concentration statistics used to develop WARP are expected to be biased low for shorter-duration concentrations, such as the 4- and 21-d maximum moving-averages (median error of 38 and 17%, respectively).

WARP-MP

The WARP-MP models for the annual mean and maximum moving-average atrazine concentration statistics have the following form:where Cpred is the concentration predicted by the WARP-MP model for the given pesticide; Cuse is the concentration predicted by the atrazine WARP model for the given pesticide, using the reported pesticide use in the watershed; Czero is the concentration predicted by the atrazine WARP model, assuming zero pesticide use in the watershed; SWMIpest and SWMIatra are the surface-water mobility indices for the pesticide and atrazine, respectively; and Vppest and Vpatra are the vapor pressure for the pesticide and atrazine, respectively.

Considering model development and model evaluation data, which showed generally similar results, over 75% of the WARP-MP model predictions are within a factor of 10 of the observed concentration statistic (Table 4), and over two thirds of WARP-MP predictions are within a factor of 5. The performance of the WARP-MP models is better with uncensored observed concentration statistics, which are more reliable than estimates from censored data (Table 4).


View Full Table | Close Full ViewTable 4.

Summary of WARP-MP model performance for model development and evaluation sites.

 
Model Development sites, factor
Evaluation sites, factor
10† 5 2 10 5 2
All observed concentration statistics
4-d‡ 78 62 31 84 68 32
21-d‡ 81 66 32 86 70 35
30-d‡ 80 66 33 87 69 36
60-d‡ 83 68 35 88 72 36
90-d‡ 84 69 35 88 72 38
Annual mean 87 72 37 91 76 41
Only uncensored observed concentration statistics
4-d‡ 89 76 41 85 68 33
21-d‡ 91 79 43 87 72 37
30-d‡ 91 80 44 87 73 37
60-d‡ 92 82 45 87 74 38
90-d‡ 93 83 46 88 74 40
Annual mean 97 90 55 97 87 49
Percent of predictions within a factor of 10, 5, or 2.
Annual maximum moving-average concentration.

Atrazine is a herbicide primarily used on corn (Zea mays L.), sorghum (Sorghum spp.), and sugarcane (Saccharum officinarum L.) crops. The atrazine WARP models include a seasonal variable—total precipitation in May and June—that is reflective of the corn planting/growing season throughout much of the Midwest. It is expected that the use of the WARP-MP models with pesticides that differ greatly from atrazine application timing relative to the May–June period may result in predictions with greater uncertainty. Figures 3 and 4 show the performance of the WARP-MP models in terms of the difference between the observed concentration statistic and the predicted concentration statistic; therefore, differences <0 indicate overprediction, and differences >0 indicate underprediction. To generalize model performance tendencies, the location of the median of the differences is evaluated relative to zero, which would indicate exact agreement between the observed and predicted concentration statistics. Acetochlor, alachlor, bromoxynil, butylate, cyanazine, S-ethyl N,N-dipropylcarbamothioate (EPTC), metolachlor, and nicosulfuron are herbicides that are primarily used on corn crops. The WARP-MP models perform very well for metolachlor and nicosulfuron (Fig. 3 and 4). The WARP-MP models perform well for acetochlor, alachlor, bromoxynil, and cyanazine; however, there is a tendency for the models to overpredict the concentration statistics for these pesticides. The 4-d maximum moving-average concentrations for these pesticides were overpredicted, in terms of median errors, by 76 to 146%

Fig. 3.
Fig. 3.

Residuals for application of the WARP-MP models to model development sites. Pesticides with an asterisk are model evaluation pesticides not used in the determination of adjustment factor coefficients.

 
Fig. 4.
Fig. 4.

Residuals for application of the WARP-MP models to model evaluation sites. Pesticides with an asterisk are model evaluation pesticides not used in determination of adjustment factor coefficients.

 

Butylate and EPTC are also used on corn crops; however, the WARP-MP models tend to more consistently overpredict concentrations for these pesticides at model development and evaluation sites (Fig. 3 and 4). Butylate and EPTC differ from atrazine in that the typical field application method for these pesticides includes soil incorporation rather than soil-surface application. Soil incorporation during field application of a pesticide is expected to reduce the potential for transport to streams. The WARP-MP model overpredictions for these corn herbicides at model development and evaluation sites may be related to the difference in field application method between these pesticides and atrazine. Soil incorporation during field application was considered in the determination of the adjustment factors and was not found to be significant; however, soil-incorporated pesticides were likely underrepresented in the data used to determine the adjustment factors.

The remaining herbicides, not discussed above and shown in Fig. 3 and 4, are used primarily on a variety of crops other than corn. The WARP-MP models perform well for acifluorfen, bentazon, chlorimuron ethyl, fluometuron, linuron, metribuzin, norflurazon, oryzalin, and terbacil; however, there is a tendency for the models to overpredict the concentration statistics for these pesticides. The 4-d maximum moving-average concentrations for these pesticides were overpredicted, in terms of median errors, by 35 to 149% The WARP-MP models tended to more consistently overpredict concentrations for ethalfluralin, pebulate, pronamide, propachlor, propanil, thiobencarb, triallate, and trifluralin at model development and evaluation sites. Herbicides such as ethalfluralin, pebulate, triallate, and trifluralin are often soil incorporated when applied to fields. As expected, the differences between atrazine and these other pesticides in terms of cropping practices, application timing relative to the May–June period, and application method likely contributes to the WARP-MP overprediction of stream concentrations for these pesticides. For example, propanil and thiobencarb are used on rice (Oryza sativa L.). Herbicide application timing and methods for rice differ from that for corn, which likely contributes to the WARP-MP overprediction for these two pesticides.

For insecticides and fungicides other than benomyl, methomyl, and propargite, the WARP-MP models tend to overpredict stream concentrations at model development and evaluation sites (Fig. 3 and 4). The WARP-MP model performance for insecticides and fungicides does not exactly follow the same pattern in relation to cropping practices as that seen with the herbicides. Methomyl is primarily used on corn crops, but benomyl is not, and both of these pesticides performed reasonably well with the WARP-MP models. The timing of insecticide and fungicide applications to fields is often very different from that of corn–herbicide application timing. Because of this application timing difference, it is expected that the WARP-MP models may overpredict stream concentrations for insecticides and fungicides.

The interpretation of WARP-MP model performance should be tempered by the potential underestimation of the model development concentration statistics. The sampling frequency analysis based on atrazine concentrations measured at NWQCR sites in Ohio shows that the sampling frequency for sites used in WARP and WARP-MP model development may underestimate the actual 4- and 21-d maximum moving-average concentrations by median errors of 38 and 17% , respectively. For some streams, these concentration statistics may be underestimated by more than 80%. The underestimation of the model development concentration statistics may buffer the tendency for WARP-MP to overpredict many herbicides, insecticides, and fungicides. However, WARP-MP predictions for corn herbicides similar to atrazine may be underpredicted because of the underestimation of the short-duration concentration statistics. For example, the WARP-MP model predictions of metolachlor 4-d maximum moving-average concentrations are overpredicted with a median error of 16%; however, the sampling frequency median underestimation bias for the 4-d maximum moving-average atrazine concentration was 38%. Therefore, the WARP-MP model predictions for this pesticide concentration statistic and some streams may be biased low.

Application of WARP-MP to Streams in the United States

The WARP-MP models were applied to RF1 stream segments for 112 pesticides that had properties within the ranges of those used in model development and had ALBs (Supplemental Table S5). Predictions were not made for watersheds without estimated agricultural pesticide use or for lakes and reservoirs. The comparison of WARP-MP–predicted stream concentrations to ALBs illustrates how WARP-MP can be used for a screening-level assessment to identify vulnerable watersheds and to guide monitoring to where it is most needed. When possible, the determination or classification of a stream concentration as exceeding or not exceeding an ALB should be done with direct measurements of stream concentrations, but when this is not possible, the model can be used to estimate the probability of exceeding a benchmark. Assuming unbiased estimates, when a predicted concentration statistic is equal to or greater than the ALB, there is a 50% or greater probability that the predicted concentration statistic exceeds the ALB (Stone and Gilliom, 2012).

Figure 5 shows a summary of the WARP-MP predictions compared with ALBs in terms of the stream miles predicted to have a 50% or greater probability of exceeding any benchmark during 2009. Of the 112 pesticides used in the WARP-MP application to RF1 stream segments, 23 were predicted to have concentration statistics exceeding any benchmark (Fig. 5). Atrazine, chlorpyrifos, acetochlor, fipronil, terbufos, and metolachlor were predicted to have >2% of the modeled stream miles exceeding ALBs. For atrazine and acetochlor, the majority of stream miles predicted to have stream concentration statistics above an ALB were the result of the acute plant ALB comparison (Fig. 5, inset). For fipronil, metolachlor, and terbufos, the majority of stream miles predicted to exceed an ALB were the result of the chronic invertebrate ALB comparison. The percentage of modeled stream miles predicted to exceed ALBs for chlorpyrifos was the result of a comparison to acute fish/invertebrate, chronic invertebrate, and CCC/CMC benchmarks.

Fig. 5.
Fig. 5.

Summary of comparison between WARP-MP model predictions and Aquatic-Life Benchmarks in 2009 (only pesticides predicted to exceed a benchmark are shown). AL, aquatic life; CCC, Criterion Continuous Concentration; OW, USEPA Office of Water.

 

The WARP-MP model can be used for screening-level assessment to identify the geographic areas with vulnerable watersheds. Figure 6, for example, shows the modeled streams by the number of pesticides with predicted concentration statistics that have a 50% or greater probability of exceeding any benchmark during 2009. Geographically, many of the modeled streams in the Corn Belt Region are predicted to have one or more pesticides that exceed an ALB during 2009 (Fig. 6), with isolated areas predicted to have streams with more than four pesticides predicted to exceed an ALB, indicating the potential vulnerability of streams in this region. When resources for monitoring pesticides in streams are limited, application of WARP-MP, as illustrated in Fig. 6, can guide monitoring decisions to streams that are more potentially vulnerable than others.

Fig. 6.
Fig. 6.

Application of the WARP-MP models to RF1 streams and comparison to Aquatic-Life Benchmarks in 2009.

 

Model Application and Limitations

Application of the regression models described in this article to streams with watershed parameters outside the range of the sites used to develop the models will result in increased uncertainty. In particular, application of the models to lakes or reservoirs would likely result in variably biased predictions because the models were developed with data from flowing-water systems. Furthermore, the sampling frequencies of the model-development sites may not be sufficient to reliably characterize the highest moving-average concentrations during a year, particularly for the shorter-duration averages. For some streams, the short-duration moving-average concentrations used in model development may be underestimated by as much as 84%, as estimated for atrazine. Thus, application of the models to predict the maximum moving-average concentrations for short durations, such as the 4-d moving-average, is expected to generally underpredict the actual concentrations for pesticides that performed well with the model development and evaluation sites. Application of these models to pesticides that are used more heavily during months that differ from the high pesticide runoff period months used in model development may result in increased uncertainty and potentially biased results. Also, the pesticide-use intensity variable was limited to agricultural-use intensity because use data for other purposes were not available. Application of these models to pesticides with significant nonagricultural use may result in increased uncertainty and potentially biased results. Applications of the WARP-MP model to pesticides with properties outside the range of those used in development of the adjustment factor have greater uncertainty.

Conclusions

Additional potential explanatory variable data and an expanded set of atrazine stream concentration data were used to update atrazine WARP models. The updated atrazine WARP models include a new explanatory variable, SRL25, which characterizes the areal extent of a soil-restrictive layer present within the watershed. K-factor and drainage area, which were significant in the previous WARP models, were found not to be significant in the current WARP models. Other explanatory variables, such as use intensity, remain the same. The goodness of fit measures are generally similar between the current and previous WARP models; however, the current WARP models incorporate expanded stream concentration data and updated watershed characteristic data.

The development of single-pesticide WARP models was adversely affected by the very high percentage of censored data. The approach taken in this study incorporated a pesticide-specific adjustment factor, based on chemical and physical properties, with the atrazine WARP models to create WARP-MP models.

Adjustment factors for the pesticides used in WARP-MP ranged from 0.1303 for bromoxynil to 1.0774 for chlorimuron ethyl (median, 0.5412). Over three quarters of the WARP-MP predicted concentration statistics for the model development and the evaluation sites in the United States were within an order of magnitude of the observed concentration statistic. Over two thirds of predicted concentration statistics were within a factor of five of the observed concentration statistic for model development and evaluation sites. The WARP-MP models performed well with metolachlor and nicosulfuron and tended to overpredict the other herbicides used on corn that have application timing and methods similar to that of atrazine. However, the overprediction for these pesticides (the median error for 4-d moving-average concentration ranging from 76 to 146%) may be counterbalanced by the low bias of the concentration statistics used with model development, which may be underestimated by as much as 80%. The WARP-MP models tended to consistently overpredict concentrations for pesticides that have application timing and methods that differed from that of atrazine. Subsequent refinements of the adjustment factor may benefit from additional data for soil-incorporated pesticides and information about pesticide application relative to the May–June period.

Predicted concentration statistics from application of the WARP-MP models are not intended to replace concentration statistics computed from direct measurements of stream concentrations when high reliability is required. However, it is not possible to monitor pesticide concentrations in every stream. Of the 112 pesticides used in the WARP-MP application to RF1 stream segments, 25 were predicted to have concentration statistics with a 50% or greater probability of exceeding any ALB in one or more stream segments. Geographically, this application of the WARP-MP models in the conterminous United States showed that many of the modeled streams in the Corn Belt Region were predicted to have one or more pesticides that exceeded an ALB during 2009, indicating the potential vulnerability of streams in this region. This application of the WARP-MP models illustrates how these models could be used to identify vulnerable watersheds based on comparisons between predicted concentration statistics and ALBs.

 

References

Footnotes



Files:

Comments
Be the first to comment.



Please log in to post a comment.
*Society members, certified professionals, and authors are permitted to comment.