Selecting Reliable Models for Total Maximum Daily Load Development: Holistic Protocol

Computer models and simpler methods must be used to reliably determine total maximum daily loads (TMDLs) for impaired waters of the US. Models are also useful to develop implementation plans that allocate load reductions to meet water quality standards. During model selection, practical considerations sometimes override the fundamental requirements of parsimony and defensibility. For the first time, a rationale for efficiently selecting reliable models for TMDL determinations is presented in this paper, based on deliberations of the TMDL Analysis and Modeling Task Committee of the Environmental and Water Resources Institute of ASCE. This protocol is based on technical criteria (waterbody, water quality, and data requirements) and management constraints (state plans and priorities, stakeholder and expert engagement, and state authority to manage pollution sources and state allocation priorities). Uncertainties due to changes in economic activity, population, land use, climate, sea level, and policies are also included. To reduce wasteful trial-and-error iterations during model selection, an optimal conceptual modeling framework is identified and subsequently refined in a flowchart based on practical considerations. The proposed protocol holistically defines the state of the science in model selection and is anticipated to soon define the state of the practice for TMDL determination. DOI: 10.1061/(ASCE)HE.1943-5584.0002102. This work is made available under the terms of the Creative Commons Attribution 4.0 International license, https://creativecommons.org/licenses/by/4.0/.


Introduction
A total maximum daily load (TMDL) defined in Paragraph 303(d) of the United States Clean Water Act (CWA) of 1972 (33 US Code § § 1313 et seq.) is the maximum allowable pollutant loading or waste assimilative capacity for one or more water quality constituents such as nutrients, pathogens, algae, dissolved oxygen, heavy metals, emerging contaminants, pH, heat, and trash that enter a water segment, so that water quality standards are met (USEPA 2018c). The allocation of the TMDL is the basis of controlling and managing both point-source and nonpoint-source pollution. The control of pollution using the TMDL approach, if carefully considered, can provide socioeconomic and environmental benefits that help revive a sagging economy, bring back cultural vibrancy, and resurrect communities or even whole cities (e.g., Cropper and Isaac 2011). Nevertheless, the determinations of TMDLs typically require sophisticated hydrologic and water quality models for watersheds and hydrodynamic and water quality models for receiving water bodies, and intensive synoptic or water segmentwide and watershedwide data collection to calibrate and validate the models and reliably relate impairments to specific pollutant loads. These TMDL determinations typically require an optimal balance of focused modeling expertise, financial support, and sufficient time. For the first time, in this paper, we holistically integrate the fundamental scientific principles of model selection with a comprehensive set of practical criteria to rationally provide guidance using an efficient model selection protocol.
The objective of this study was to derive a carefully structured protocol and a corresponding flowchart to guide model selection for the determination and implementation of TMDLs. Model selection often has been heuristic and ad hoc as a result of conflicting state and federal policies that occur despite the cooperative federalism required by the CWA on TMDL determination and implementation (Brudney 2017). Moreover, in many instances, the dynamic and spatial variability in physical, chemical, and biological characteristics of the impaired waterbody and contributing watershed have not been adequately defined a priori in order to support a definitive and unique model choice. 1 We manually searched through all of the 315 USEPA-approved reports describing the TMDLs determined for 5,523 listed water segments published between 2015 and 2020 (Table 1) to look for the types of models used and the rationale for model selection. We found that there is significant variability in the use of different types of models [ Fig. 1(a)]. We also found that in 109 reports (35% of the reports), there was no justification given for the use of the model that was applied [ Fig. 1(b)]. In 31 reports (10% of the reports), we did not find that there was a data collection plan to support the TMDL process [ Fig. 1(c)]. These findings clearly indicate that systematic guidance for model selection is necessary.
In this model selection protocol, we recommend testing an optimal model satisfying the fundamental principles of parsimony (Thomann and Mueller 1987) and defensibility (Martin and McCutcheon 1999), along with various practical criteria, in order to organize conflicting information and minimize resource wastage when selecting a TMDL model. The heuristic and ad hoc selection of TMDL models can be overly resource-intensive and inefficient if the calibration of the initial model choice proves unreliable. Due to the early proliferation of models developed for purposes other than TMDL determination that have rarely been peer reviewed and tested (McCutcheon 1981(McCutcheon , 1983a, this process becomes severely iterative. Additionally, TMDL implementation planning may require resource-intensive watershed-scale simulations to distinguish manageable classes of nonpoint sources and background loading (Stow et al. 2007).
Therefore, a critical professional judgement in model selection is whether to start with the more intensive model necessary for implementation planning or to use a simpler model to determine and allocate the TMDL and then switch to the more elaborate model for implementation planning. In this protocol, we emphasize open stakeholder and expert collaboration to expend the available resources that are consistent with the collaborative federalism required by the CWA.
This protocol is unique in organizing both technical criteria such as the type of waterbody, impairment and synoptic data requirements and management criteria such as state plans and priorities, stakeholder and expert engagement, and state authority to manage pollution sources and state allocation priorities in the model selection protocol. The protocol is based on the lean design principle, an extension of the Toyota Way (Ballard and Zabelle 2000). In this approach, we recommend first logically defining the model selection tasks. Subsequently, we structure the workflow to avoid redundant flows of information. Finally, we recommend structuring the model selection process in a manner that minimizes repetition in the workflow to reduce value loss during model selection (Poppendieck and Poppendieck 2003).
We do not enumerate a comprehensive list of models useful for determining, allocating and implementation planning of TMDLs in this paper. The TMDL Analysis and Modeling Task Committee of the Environmental and Water Resources Institute of ASCE (ASCE TMDL Analysis and Modeling Task Committee 2017; Borah et al. 2019b; ASCE TMDL Analysis and Modeling Task Committee, forthcoming) has compiled a list of models that have been used to determine TMDLs. The ASCE TMDL Analysis and Modeling Task Committee has also covered other vital steps in the TMDL determination process including model calibration, validation (also known as confirmation), uncertainty analysis, and determination of the margin of safety; described data sources used in TMDL modeling including in situ data, geographical information systems (GIS), and remote sensing; and the state-of-the-practice in modeling for TMDL determination, allocation and implementation planning (ASCE TMDL Analysis and Modeling Task Committee 2017, forthcoming). Additionally, the ASCE TMDL Analysis and Modeling Task Committee has also published a collection of 15 research articles on the state-of-the-art in these topics in the Journal of Hydrologic Engineering (Borah et al. 2019b).
At the same time, no attempt has been made in the past to review the more than 70,000 USEPA-approved TMDL documents and build a comprehensive catalog of models and analytical procedures . To the best of our knowledge, our compilation of TMDL reports in Table 1 and derivation of statistics shown in Fig. 1 are the first attempts to closely look at and analyze models used in TMDLs and use these findings systematically in the model selection process. Very few receiving water quality models have been actually peer-reviewed, and in most cases, the stage of model development ranging from research models to fully developed models optimized for the practice of TMDL determination has not been defined (e.g., NCASI 1982;McCutcheon 1983b). More recent evaluations include selections of watershed and receiving water quality models that were heuristically chosen for TMDL determination (Martin and McCutcheon 1999;Bera 2003, 2004;Shoemaker et al. 2005;Muñoz-Carpena et al. 2006;Moriasi et al. 2012Moriasi et al. , 2015Martin et al. 2015; ASCE TMDL Analysis and Modeling Task Committee 2017; Borah et al. 2019a;Camacho et al. 2019;Quinn et al. 2019;Zhang and Quinn 2019;Yuan et al. 2020).
Modeling for TMDL determination is seldom a straightforward process, with a trade-off between fundamental principles of model selection and practical considerations. Specific types of waterbodies and particular impairments require site-specific domain expertise that may result in the fundamental model selection principles being overridden. Just selecting the best possible model based on fundamental criteria of parsimony and defensibility may not be a viable option given modeling skill and resource constraints. At the same time, prevailing practice, lack of requisite data, ease of justifying a model, and other such logistical issues should not be the only factors in selecting an appropriate TMDL model. To balance the trade-off between these aspects of model selection in a systematic manner, a protocol such as the one presented here is required. When used with modeling experience and site-specific knowledge, this protocol will be useful to researchers and practitioners alike in selecting a reliable TMDL model that can withstand scientific scrutiny. Implementation Planning The USEPA (2018c) defines the TMDL for an impaired waterbody in the US as follows: where WLA = waste load allocation for each point source; LA = load allocation for each category of nonpoint sources, as well as the difficult-to-manage background loads that are not directly caused by human activities, and a reserve capacity for reasonably anticipated point-source and nonpoint-source load increases; and MOS = margin of safety that conservatively accounts for simulation and measurement uncertainties. The need for a reserve capacity is due to local changes in human population, economic activity and land use and land cover (LULC), as well as changes in climate, sea level, and federal and state environmental policy. The CWA requires the states or other jurisdictions to determine a TMDL for each impaired waterbody. Even though private contracting firms or universities have developed about 20% of the TMDLs over the last 5 years, it is still the responsibility of the states  or other jurisdictions to submit the TMDL in a water quality management document to the USEPA. This document must provide details on the TMDL determination and load reduction/allocations. The USEPA also recommends that this document contain a TMDL implementation plan, a monitoring plan to track the performance of the TMDL, reasonable assurances on the reduction of nonpointsource loads, and adequate stakeholder engagement (USEPA 2002). The submission must include a signed submittal letter by the state or other jurisdictional authority that states whether the TMDL is being submitted for a technical review. This indicates to the USEPA the intent of the state to determine a TMDL in the future or for final approval (USEPA 2002). If the USEPA does not approve the TMDL in a reasonable time, the responsibility subsequently falls upon the USEPA itself to determine and allocate the TMDL (Birkeland 2001;USEPA 2018c).

Groups Involved in Determining the TMDL
To determine a TMDL, states or other jurisdictions (or a combination thereof for interstate waters) draft a water quality management document (USEPA 2018c) that defines the scope of the problem. In some cases, the problem may involve significant management challenges. For example, where the impaired waterbody and contributing watershed cross multiple jurisdictional boundaries, jurisdictional cooperation is vital to allocate reductions to all loads consistent with the laws and regulations of each jurisdiction. The Chesapeake Bay TMDLs are exceptional examples of this type of cooperation (USEPA 2010). Sometimes, differing political priorities and legal requirements of the jurisdiction, the USEPA, and other federal agencies may need to be considered. These issues can hamper model selection because conflicting guidelines on modeling needs or disparate modeling approaches between different jurisdictions can lead to confusion. Engagement with stakeholders and experts helps in resolving these issues. In this paper, we delineate the following groups: 1. States or other jurisdictions including territories and approved tribes of the US. These are the delegated water quality management agencies that are required under the CWA to list all impaired waterbodies and develop the TMDL. If the USEPA does not approve a submitted TMDL document, then the USEPA is itself responsible for developing the TMDL. When cooperative effort is required to address multijurisdictional watersheds or other complex TMDL determinations, the relevant jurisdictions, interstate commissions, associations, and other federal agencies are classified by the USEPA as partners (USEPA 2020b). 2. Modeling team. This is the team of practitioners specialized in watershed hydrologic and water quality and receiving water hydrodynamic and water quality models who may work for the state or other jurisdiction, the USEPA, or an independent contractor who perform the TMDL model selection and application for the state or other jurisdiction. 3. Stakeholders. These include local communities, environmental groups, nongovernmental organizations, economic development groups and interstate organizations, industry, and trade associations representing dischargers who all benefit from the waterbody and watershed. 4. Experts. These should include environmental and industry groups, research institutions, and universities who have experience and expertise developing and using TMDL models. The state or other jurisdictions must have a water quality management document, a monitoring plan, a synoptic data collection plan, and modeling necessary to develop the TMDL. These plans benefit immensely from the support of the key experts and stakeholders (Fig. 2). Engagement with experts during the model selection improves the reliability of the TMDL, and engagement with stakeholders ensures transparency and public acceptance of the TMDL. Engagements occur in four stages as follows ( Fig. 2): 1. Compilation of background information. Typically, most of the available monitoring and synoptic data, LULC patterns and nonpoint-source loads, load-reduction priorities, performance of best management practices (BMPs), and critical water quality conditions are available to the modeling team through federal and state water data portals and the National Pollutant Discharge Elimination System (NPDES). Nonetheless, the modeling team may benefit from early and continuous expert engagement if there are other domain-specific data sources. Citizen science monitoring can also help in this data collection step (Fig. 2). Surveys of the general public, local civic bodies, domain experts, information clearing houses and water managers allow the beneficial uses of the waterbody to be determined (Fig. 2). This allows a priori agreement on the types of models to be considered as well as the synoptic data collection and quality assurance plans needed for model development. 2. Consultation. Federal and state regulations require that public comment opportunities be provided at different stages of the TMDL determination, allocation, and implementation planning. If stakeholders are engaged during model selection to reach a consensus on the TMDL process, above and beyond the legally required public comment requirements, the number of public comments and controversies regarding the interpretation of simulation results can be significantly reduced. During model selection, experts such as members of the scientific community, practicing engineers, domain experts, policymakers, and system managers should be consulted to determine whether any water quality and watershed models have been calibrated and applied to the impaired waterbody and contributing watershed (Fig. 2). Even if such models are developed or calibrated for processes not necessary for TMDL determinations, the associated hydrodynamic and water quality observations should be useful to calibrate simpler or more practical models or methods. The modeling team could engage with focus groups of experts to understand the limitations of specific models and prepare a list of candidate models. During this stage, the modeling team can also develop model sensitivity, calibration, and confirmation tests. These strategies would help reduce time overheads, eliminate repetition, and pare down unviable options. For example, when determining the Chesapeake Bay TMDLs, the USEPA Chesapeake Bay Program Office and the Chesapeake Bay Program initially convened four expert modeling workshops. During the determination, allocation, and implementation planning of the Chesapeake Bay TMDLs, follow-up expert panels were convened to evaluate progress and provide technical guidance (USEPA 2010). 3. Evaluation and feedback. During model selection, the modeling team should brief stakeholders such as practitioners, interested parties, and regulatory agencies on their findings (Fig. 2). The modeling team can also acknowledge the feedback from the state or other jurisdictions and all interested stakeholders in order to build trust and achieve community acceptance. 4. Outreach. For transparency and increased cooperation, the modeling team can make codes, model results, and documentation publicly available to the scientific community and engineering practitioners to facilitate independent review (Fig. 2). Proprietary codes used by some contractors and federal and state agencies may be selected under unusual circumstances, such as when no publicly available code reliably simulates the pollutant and impairment. In these circumstances, some executable code must be made available with the simulations used to determine, allocate, and plan implementation of the TMDLs. With ongoing scientific advances, maintaining a good outreach program with experts will allow the modeling team to continuously update the model (Fig. 2).

Modeling for TMDL Determination, Allocation, and Implementation Planning
Models and methods that establish a cause-and-effect relationship between loads and an impairment must be used to determine the TMDL and allocate load reductions, and to develop the TMDL implementation plan. Models used to develop TMDLs have ranged from simple empirical models such as regression equations and load-duration curves, or analytical models such as mass balances (McCutcheon 1989;Zhang and Quinn 2019), to more reliable numerical models of critical condition focusing on specific hydrologic and impairment conditions (Zhang and Padmanabhan 2019) and complex and/or sophisticated numerical hydrologic, hydrodynamic, and water quality models (Martin and McCutcheon 1999 Yuan et al. 2020). In this paper, we define complex models as those involving many physical, chemical, and biological processes, and sophisticated models as those representing process with a high degree of integrity. Models are used throughout the TMDL development to initially understand the cause and effect of loading on impairment and the impacts of BMPs on the water quality (ADEC 2017), in TMDL determination (ADEM 2017), and on computation of MOS (Ahmadisharaf et al. 2019). Models can also be used as decision support tools to evaluate load reduction/allocation strategies (ADEM 2017;Ahmadisharaf and Benham 2020). Models are also suitable after implementing the TMDL for real-time or near-realtime forecasting of water quality (Leisenring and Moradkhani 2012).
All of the model types have been applied to determine TMDLs in contemporary applications. As elaborated in the next section and listed in Table 1, 5,523 TMDLs have been listed in 315 USEPAapproved reports between 2015 and 2020. As listed in Table 2, regression equations and load duration curves have been used in 125 instances (about 40% of the reports), simple analytical mass balance models have been used in 42 instances (about 13% of the reports), hydrologic watershed models have been used in 68 instances (about 22% of the cases), and hydrodynamic and receiving water quality models have been used in 80 instances (about 25% of the reports). Typically, receiving water quality models have been applied when hydrodynamics drive the impairment process for pollutants such as temperature and dissolved oxygen (Table 2). For pollutants such as nutrients, pathogens, pH, dissolved oxygen and metals, watershed models and integrated models have been typically used (Table 2). For emergent contaminants, heavy metals, and pathogens, empirical models have been used, whereas analytical model use has largely been restricted to the analysis of nutrients and pathogens (Table 2).

Protocol for Model Selection
The availability of a wide variety of models makes the task of appropriate model selection a challenging one. The more important listings of some TMDL models (Shoemaker et al. 1997;Martin and McCutcheon 1999;Bera 2003, 2004;Shoemaker et al. 2005 Water Environment Federation 2020). Thus, a principal motivation for this paper was to develop a comprehensive, systematic, and holistic selection protocol that was previously unavailable for TMDL modeling in general. We defined this motivation based on in-depth reviews of USEPA-approved TMDL reports during a recent 5-year period and the available literature on model selection as elaborated in the following sections.

Search for Existing Model Selection Protocol
To determine what, if any, model selection protocols were being used regularly in TMDL development, we undertook the task of assessing the state of the practice in TMDL model selection in the US. As mentioned previously, we scanned 315 TMDL reports with 5,523 TMDL listings approved by the USEPA between 2015 and 2020 in 44 states to identify the types of models being used, the rationale for model choices, and the synoptic data collection efforts  in support of the modeling (Tables 1 and 2). These TMDLs covered many types of waterbodies, including rivers, lakes, estuaries, and beaches, and impairments due to biotic, abiotic and emerging contaminant pollutants, trash, physical alterations, and habitat degradation within the water segments. For compiling Tables 1 and 2, we searched through the Assessment, Total Maximum Daily Load (TMDL) Tracking and Implementation System (ATTAINS) database, an exhaustive online catalog maintained by the USEPA (2020a), the USEPA's Region 1 database of TMDL reports (USEPA 2020c) and TMDL report databases of 26 states as listed in Table 1. We could not find data for six states during this period.
We found that although states predominantly use numerical models of varying degrees of complexity and sophistication, on several occasions, empirical and analytical models have also been used. On rare occasions such as in the case of a trash and debris TMDL (ADEC 2017), there have even been TMDLs approved that did not have any modeling [ Fig. 1(a)]. No rationale was specified for model selection in 109 reports (35% of the reports). In 18 reports (about 6% of the reports), the rationale was based on the prevailing practice in the state. In 51 reports (about 16% of the reports), perceived data limitations dictated model choice. In an additional four reports (about 1% of the reports), the model choice was based on the appropriateness to the available data [ Fig. 1(b)]. In five reports (about 1% of the reports), resource constraints were cited as the reason for using a particular model. Technical reasoning behind model selection was provided in only 128 reports (about 40% of the reports). In 100 reports (35% of the reports), there were either no synoptic data collection, or only sporadic ambient data on which to base TMDL determination [ Fig. 1(c)]. These patterns strongly indicate the need for a systematic protocol on model selection and associated synoptic data collection.
To determine what model selection guidelines are available in the literature, a list was compiled of several publications on metaanalyses of TMDL models and their applications. Specifically, we looked for studies based on surveys of state agencies and practitioners to identify challenges in model selection. Very few such studies exist (Neilson and Stevens 2002;Cabrera-Stagno 2007;Benham et al. 2008;Clark and Vanrolleghem 2010;Topp et al. 2020). We also studied documents describing the modeling and data needs of various commonly used watershed and water quality models (Deas and Lowney 2000;Muñoz-Carpena et al. 2006;Keisman and Shenk 2013;Quinn et al. 2019;Water Environment Federation 2020). In addition to these documents, we found guidance documents on TMDL model selection (Oregon Department of Environmental Quality, n.d.; Deas and Lowney 2000;Imhoff 2003;USEPA 2018d). Unfortunately, in the best circumstances, these documents typically were restricted to specific types of applications and impairments, were nonexhaustive catalogs of models, were practical guides for model selection, or were engineering guides that did not include practical considerations. Based on the survey papers, meta-analyses, guidance documents, and TMDL reports, we found that TMDL models are currently selected largely on an ad hoc basis. Borah and Bera (2004) and Borah et al. (2006) also reached this conclusion in their review of applications of nutrient and sediment transport and watershed models.

Proposed Model Selection Protocol
When developing our proposed model selection protocol, we had three considerations: 1. The model selection must be guided by fundamental principles of parsimony and defensibility when applicable as well as by practical criteria (discussed subsequently).
2. The model selection process must be inclusive. By this, we mean that the process must be open and incorporate expert opinions in such a manner that the choice of an optimal model selected by applying the fundamentals of parsimony and defensibility can be appropriately adapted at every stage of the model selection process. By identifying practical criteria and engaging with experts, the modeling needs and resources can be accounted for and the original model choice can be adapted suitably. 3. The model selection protocol must be designed in a manner that minimizes redundant information and process flows and allows for concurrent evaluation of all the alternatives. Typically, modelers iteratively select and explore modeling choices or choose a model in an ad hoc manner. By engaging with experts and evaluating all the options against an optimal model choice as more data on practical criteria become available, much of the repetition can be avoided. There may still be unavoidable repetition owning to situations in which the model documentation is unclear, or in which a model does not produce the expected results. In these circumstances, the modelers would have to re-evaluate the choice. However, by integrating the lean design principles (Ballard and Zabelle 2000) with continuous expert engagement, model selection can be carried out in an efficient manner.
McCutcheon (1989), Ambrose et al. (1990), McCutcheon et al. (1990), Shoemaker et al. (1997), Martin and McCutcheon (1999), the National Research Council (2001), Shoemaker et al. (2005), Martin et al. (2015), and Camacho et al. (2019) have presented extensive information on receiving water quality model selection. However, this information is not holistically organized into a general protocol. Very few evaluations (Borah et al. 2019a) have focused on selection of auxiliary watershed models to determine a TMDL or plan implementation. When changes in human population, economic activity, LULC, climate, sea level, and policies are to be considered, there is very limited guidance on model selection. We discuss and present a holistic organization of TMDL model selection tasks in Fig. 3 and focus on the technical criteria and management constraints to be considered in model selection subsequently in Figs. 4 and 5.
In the flowchart in Fig. 3, we prescribe a general protocol for model selection by carefully moving from initially listing an impaired waterbody to deploying an appropriate TMDL model through a series of decision points (gray boxes). We recommend considering both technical criteria and management constraints (Figs. 4 and 5) that influence model selection (solid and hatched parallelograms, respectively, in Fig. 3) along with an optimal model identified using fundamental modeling principles (choices and outcomes in the upper-left quadrant of Fig. 3). We also recommend continuous and open engagement with stakeholders and experts (dashed arrows in Fig. 3). By adopting these practices, it is possible to optimize the flow of information (thin arrows in Fig. 3) and reduce unnecessary iterations of the model selection tasks (thick numbered arrows in Fig. 3). These tasks are: 1. gathering information on factors influencing model selection, 2. applying fundamental modeling principles, 3. identifying an optimal model, 4. evaluating the data and resource requirements for using this optimal model and planning and budgeting for the synoptic data collection that will be required, 5. weighing practical considerations and selecting an appropriate TMDL model, and 6. deploying the TMDL model.
The selection of the appropriate TMDL model requires evaluating the modeling needs, the data, and the dynamics of the impairment (choices and outcomes in the lower part of Fig. 3). The protocol is general enough that practitioners can apply it to specific problems with relatively minimal modification.

Fundamental Model Selection Principles
The best practice when selecting models should be to ensure that only models that have undergone rigorous review including line-by-line evaluation of the code be considered in order to avoid models with serious hidden systematic errors (McCutcheon 1983b). However, it may not always be practical to satisfy this prerequisite. Several technical and management factors further influence the selection process. Model selection is thus largely a qualitative tradeoff between fundamental principles and practical considerations. Ideally, model selection should be guided by the following fundamental principles: • Parsimony. The simplest model with process integrity should be selected (Thomann and Mueller 1987). At the same time, an oversimplified model should not be used. Similarly, a model with too many calibration parameters that overfits the data should also not be used. • Defensibility. A model containing a process must only be included if the water quality forecast is significantly improved by including that process (Martin and McCutcheon 1999).
With emergent water quality problems and modern tools to tackle them, the fundamental principles of model selection can be applied in a somewhat flexible manner. There are some applications when the fundamental principles of model selection may be bypassed when selecting candidate models for further evaluation: • When a model has undergone rigorous peer-review and is shown to be scientifically defensible. Over the last century, since the first development of the Streeter-Phelps model in the 1920s, our understanding of many underlying physical, chemical, and biological processes has advanced (Ambrose et al. 2009). Alongside these fundamental scientific advances, the rapid improvements in scientific computing (Ambrose et al. 2009), and in situ (Smith 2015) and remotely sensed data collection (Topp et al. 2020) over the last 50 years has tremendously improved our ability to characterize processes and represent them in models to calibrate and validate receiving water quality and watershed models. When models with a sound scientific basis undergo a rigorous update cycle that synthesizes improved process understanding, better data, implementation of refinements, and rigorous peer review throughout their development, they eventually reach a point at which they can be trusted to represent the underlying physical, chemical, and biological processes through a parsimonious formulation. Such models have invariably been shown to be defensible in numerous real-world applications. Models typically reaching this stage of maturity include hydrologic models simulating runoff as a function of rainfall, or hydrodynamic and water quality models simulating soil erosion, sediment transport, and transport and fate of pollutants with runoff and pollutants adsorbed onto sediments [e.g., the models of Borah and Bera (2003) and Borah et al. (2019a)]. Such models can be selected as candidate models for further consideration. • When there is sufficient uncertainty in the quality of synoptic data or when the underlying processes are poorly understood. There might be circumstances such as when evaluating hydrologic processes with limited data coverage, when the effects of multiple processes cannot be determined individually, or when current monitoring and data collection limitations renders it difficult or even impossible to eliminate systematic errors in data collection. In such circumstances, either resource constraints limit the ability to gather synoptic data, or the science itself has not advanced to a degree to which appropriate and adequate data have been collected (Beven 2018). These situations can include emergent contaminants, complex coupled processes such as habitat degradation and bioaccumulation and amplification in aquatic foodwebs, and contaminants such as trash for which physical processes are either not well-defined or have not yet been translated to process-based models. In such cases, empirical models such as parametric or black-box regressions, which have been hypothesis-tested to establish cause-and-effect relationship between loads and impairment, could be considered for additional screening. In some cases, such as when useful synoptic data have not yet been collected or there are severe resource constraints faced by the modeling team, the prevailing practice may be to use statistical relationships such as load-duration curves.
In all instances, model reliability for the specific application must be tested by objective and scientifically defensible calibration and validation using adequate synoptic data. If the final model selection cannot be reliably calibrated and validated, the model selection process must be repeated if there are other candidate models that can be considered. If there are no other candidate models, then the TMDL could be implemented in a phased manner while allocating resources to collect the synoptic data required to develop a more reliable model (USEPA 2006).

Key Factors in the Model Selection Process
Both technical criteria and management constraints must be considered in the TMDL model selection process apart from fundamental principles (Benham et al. 2008). After applying the fundamental model selection principles, technical criteria and management constraints must be applied in a rational sequence. The major technical criteria and management constraints in TMDL model selection are presented in the following two sections (Fig. 3).

Technical Criteria
Once a waterbody is 303(d) listed, technical criteria provide the basis for applying fundamental principles to select the optimal model. These include four main criteria.
The first main criterion is identifying an impaired waterbody [Figs. 3 and 4(a); Table 3]. A TMDL may be determined for a single stream segment only when all the loading is from point-source discharges into the watercourse. When there are nonpoint sources or distributed point sources that must be managed, then the TMDL may be determined for an entire basin that includes tributary watersheds. Estuarine and lacustrine waters require evaluation of both upstream and downstream loads. Therefore, the type of watershed is an important consideration in identifying the impaired waterbody [first box in Fig. 4(a)].
In some cases, such as in the Willamette River in Oregon, there may be distributed loads in tributary watersheds (Annear et al. 2004). In other cases, such as in Florida, there may be significant surface water and groundwater interaction due to infiltration and delayed seepage through a high-water-table aquifer (Borah et al. 2019a). Hydraulic BMPs may also impact impairment. In all these cases, it is important to consider the relationship of the impaired waterbody with other waterbodies [second box in Fig. 4(a)].  In perennial waterbodies, hydrometeorological variability on hourly to annual timescales can cause changes in the flow regimes that must be considered [third and fourth boxes in Fig. 4(a)]. Additionally, about 3% of the gauged streams in the US are ephemeral, intermittent, nonperennial, or nonfunctional ecosystems (Zimmer et al. 2020) and do not currently need to be listed [fourth box in Fig. 4(a)]. However, many of these streams are concentrated in the Midwest and along the seaboards, and the water quality in these types of waterbodies is a concern for communities living in these regions. Ephemeral, intermittent, and nonperennial streams can change from flowing to stagnant. This can lead to altered water chemistry and ecology that should be accounted for (Skoulikidis et al. 2017). Nonfunctioning ecosystems may take years to revive, and, in such cases, ecological considerations drive modeling for hydraulic and water quality restoration in these waterbodies (Hall et al. 2014). When considering both perennial and nonperennial waterbodies, it is important to engage with stakeholders to understand the beneficial use of the unimpaired water to the community [fifth box in Fig. 4(a)].
The second criterion is identifying the pollutants causing impairment [Figs. 3 and 4(b); Table 3]. This is the focus of the TMDL. The types of loads must be enumerated, and their levels must be quantified [first and second boxes in Fig. 4(b)]. Understanding the nature of the impairment and the cause-and-effect relationship between the loads and the impairment is a necessary step in model selection [third box in Fig. 4(b)]. For example, the fecal coliform counts in the rivers of a rural watershed may be so large as to render the water unfit for potable use or recreation (Riebschleager et al. 2012). Another example may involve fish eggs in a segment of a river which have been rendered sterile by high water temperature (Oregon Department of Environmental Quality 2000). Although a load-duration curve or a fate and transport model would be appropriate for the former (Riebschleager et al. 2012), a heat budget model would be a better option for the latter (Daniels et al. 2018).
The nature of the impairment also influences model selection [fourth box in Fig. 4(b)]. If a persistent water quality problem such as eutrophication due to agricultural runoff needs to be addressed, a one-dimensional (1D) hydraulic model with a series of steady-state solutions used in conjunction with a water quality model may be considered. However, if an event-based problem such as a spike in total suspended solids due to a spatially localized storm needs to be addressed, then a two-dimensional (2D) or three-dimensional (3D) hydrodynamic model may be considered. The choice of model would also be influenced by the need to perform critical condition modeling (Zhang and Padmanabhan 2019). The selected TMDL model must be able to incorporate the water diversion, water treatment, and BMP management actions that are currently implemented and are being planned for the future to adequately simulate the water quality within the waterbody [fifth box in Fig. 4(b)].
The third criterion is data requirement [Figs. 3 and 4(c); Table 3]. Reliable data are required to parameterize, calibrate, and test models. Ambient monitoring for various water quality management objectives usually does not provide this data (Martin et al. 1991). Various sources of data [boxes in the bottom row in Fig. 4(c)] produce discrete, continuous, or episodic information [third row in Fig. 4(c)] on different physical, chemical, and biological processes with varying degrees of uncertainty and accuracy at multiple spatial and temporal resolution [second row in Fig. 4(c)]. The data obtained from water quality monitoring, GIS layers, and remote sensing of the watershed's LULC patterns, topographic and bathymetric surveys, soil surveys, soil maps, and stream gauges can be used to develop and force TMDL models [left side of Fig. 4(c)]. If the TMDL models are part of an integrated system, then the results of other models may also be incorporated [right side of Fig. 4(c)].
Infrequently, states collect and archive synoptic data every 5 years to calibrate and test WLA models (Mills et al. 1988) for NPDES permit renewal, and research institutions may also collect synoptic data (McKenzie et al. 1979). Hence, an independent synoptic data collection plan must be formulated. The collection of synoptic data would allow for a cause-and-effect understanding of the load-impairment relationship in the watershed and the physical, chemical, and biological processes that affect impairment within the waterbody. As indicated in Task 4 in Fig. 3, if the evaluation of the synoptic data requirements and the formulation of the data collection plan are integrated within the model selection process, a high-quality TMDL model can be developed. Along with the data itself, metadata on quality, accuracy, and precision allow the modeling team to determine whether the data are usable and evaluate the observational biases and errors.
Additionally, the spatial and temporal resolution, boundary conditions, and loads should be evaluated to determine if they are adequate to resolve the dynamic nature (steady state, cumulative, periodic, or episodic) of the impairment. Boundary conditions and loads should have a temporal resolution as close as possible to the model time step to maintain process integrity (Sridharan et al. 2018b). Results from climate, ocean, LULC, and population models may also feed into the TMDL model. The resolution, accuracy, and precision of these results must be quantified to ensure that biases and uncertainty are correctly accounted for.
The fourth criterion is uncertainty about the future [Figs. 3 and 4(d); Table 3]. Anticipated changes in a region contribute to the P LA term in a TMDL determination (Hoyer and Chang 2014). These changes can include sea-level rise in coastal watersheds and altered seasonal temperature and water quality trends and increasing frequency of natural disasters everywhere due to climate change [first box in bottom row in Fig. 4(d)]. The continuing exploitation of navigable waterways leads to altered channel morphology and hydraulics and impaired water quality [second box in bottom row in Fig. 4(d)]. LULC modifications due to rezoning, shifts in cropping patterns, rapid urbanization, natural disasters, and human population growth can also change the loads within watersheds [third and fourth boxes on the third row in Fig. 4(d)]. These changes can either directly influence the TMDL load allocation [first box on second row in Fig. 4(d)], or may result in changing policies, new management practices, and novel vulnerabilities and opportunities that managers have to consider [boxes on the third row in Fig. 4(d)] that can be incorporated into the reserve capacity [second box on second row in Fig. 4(d)].
For example, loads from point sources due to human population growth and commercial activities can change over time (LimnoTech 2015). In urban areas, increases in impervious surfaces tend to intensify the peakedness of stormwater loads, whereas agricultural development may result in nonpoint-source loads, increasing at a slower pace than from equivalent areas of urban development. Climate change can intensify stormwater loads during floods and exacerbate water quality problems during droughts. In coastal regions, sea-level rise can increase tidal inundation and salinity intrusion and degrade littoral vegetation and coastal infrastructure. A good TMDL model should be able to evaluate the role of these stressors and planned BMPs on the water quality (Herron 2017).

Management Constraints
Due consideration of practical criteria after selecting the optimal model allows the modeling team to select an appropriate TMDL model. There are three main key management constraints to be considered.
The first main constraint is due to available resources [Figs. 3 and 5(a); Table 4]. States or other jurisdictions and sometimes select stakeholders use the characteristics of the impairment (i.e., the waterbody type and key water quality constituents) to tentatively determine the type of TMDL models to consider when making funding requests. Water quality management plans approved by the USEPA a priori and annual state budgets and other funding from federal [e.g., USEPA 319 grants and Natural Resources Conservation Service (NRCS) watershed restoration funds] and state programs, interstate compacts, and nongovernmental entities define the priority and the resources available for TMDL determination. These resources influence the type of TMDL model that can be selected. Although process-based models can provide highresolution watershed-wide coverage, they are also budget-, timeand modeler skill-intensive [boxes in Fig. 5(a)]. The second main constraint is level of stakeholder collaboration [Figs. 3;5(b) and Table 4]. Voluntary stakeholder participation that extends well beyond the legally required public comment periods is an important factor in model selection. Ideally, stakeholders should be involved in each step of the TMDL determination and implementation as well as model selection so that trust can be built about the model simulations. For example, in California, oversight by the Little Hoover Commission (Levinson 1993) encourages well-defined ties between various agencies and state water boards. This oversight also ensures regularly updated models to produce scientifically developed and well-managed TMDLs (Little Hoover Commission 2009). With strong stakeholder participation, the USEPA, NRCS, other federal agencies, and other state and local agencies are likely to supplement state funding and provide in-kind services for TMDL determination and implementation of load reductions for nonpoint sources [first box in Fig. 5(a)]. Some collaborating agencies may provide additional modeling expertise [second box in Fig. 5(b)].
Polluters may voluntarily increase point-source sampling to supplement synoptic data collection or be compelled to more accurately quantify loads by modifying the NPDES permits for point sources and industrial stormwater nonpoint sources [third box in Fig. 5(b)]. Nongovernmental organizations such as environmental groups may deploy citizen water quality monitoring networks to supplement synoptic data collection [fourth box in Fig. 5(b)]. Citizen sampling might also be coordinated for fact finding and impairment assessments after the TMDL implementation.
The third major contraint is the scope of the TMDL model [Figs. 3 and 5(c); Table 4]. Although a TMDL model is useful to determine the TMDL, it should ideally also be useful for evaluating changing water quality and beneficial uses. When used for TMDL determination, ideally, the state or other jurisdiction must deploy models of suitable complexity and sophistication required to establish the cause-and-effect relationship between loads and impairment. In challenging flow regimes, this could even involve the use of research codes [first box in Fig. 5(c)]. Depending on the state authority to manage pollution sources and the state allocation priorities, sometimes the TMDL model may also be used to conduct comprehensive evaluations of multiple load reduction allocation or waste assimilation alternatives. Additionally, the model may also have to be used to evaluate the impact of BMPs on nonpoint-source load reduction and water quality. In these instances, these models may be used as decision support tools.
Typically, scalable models which require fewer computational resources to run with user-friendly interfaces are preferable as decision support tools than very sophisticated 2D and 3D numerical models [second and third boxes in Fig. 5(c)] (Huber et al. 2006;Sparkman et al. 2017;USEPA 2018b). When the model is to be used primarily as a forecasting tool, then multiple model instances with slight perturbations of the model parameters is required (Reichle 2008). In such cases, a good data assimilation technique would compensate for a simple cause-and-effect model (Daniels et al. 2018).

Model Selection
Identifying the impairment and flow regimes over which a TMDL is applicable allows for model selection fundamentals to be applied to formulate an optimal model (Fig. 3). Formulating this optimal model early ensures that iterations in the TMDL model selection will be minimized. An appropriate TMDL model can subsequently be adapted from the optimal model choice based on other criteria. Several options exist for TMDL determination. We subsequently list the major classes of TMDL models.

Types of TMDL Models
In this model selection protocol, we strongly advocate for the use of cause-and-effect models for reliable TMDL determination, allocation, and implementation planning. Such models systematically link all significant loads to impairments and are useful for understanding the underlying process mechanisms within the system. Practical and reliable cause-and-effect models provide the best opportunity to extrapolate TMDLs beyond the range of data available to calibrate and confirm the models.
The type of cause-and-effect model selected is based on the nature of the waterbody, flow regime, pollutants causing impairments, loading conditions, and other technical and management considerations. In this paper, we discuss three types of models to guide the selection process (bold boxes in Fig. 3).
Numerical process models are the first type. These are mechanistic models of the receiving waterbody and contributing watershed in which the laws of conservation of mass, momentum, and heat and some empirical or semiempirical kinetic relationships must be numerically simulated to relate loads to impairment (Bowie et al. 1985;Martin and McCutcheon 1999). Typically, when flows and loads do not vary rapidly over time, steady-state models may be used. Pseudodynamic models may be used when flows and loads change slowly over time or are likely to occur at distinct levels.
For example, such models are useful when diurnal eutrophication violates dissolved oxygen and nutrient standards for part of the day (Brown and Barnwell 1987). Dynamic models are typically necessary when rapidly changing episodic and periodic impairments cannot be simulated as a critical condition during which some averaging is possible to simplify simulations. If extensive monitoring of an impairment is not available to adequately define a critical condition, then a practical, reliable dynamic model must be selected (Zhang and Padmanabhan 2019).
The number of spatial dimensions necessary to reliably simulate an impairment and distinguish loads is another vital model selection criterion (Martin and McCutcheon 1999). For unstratified ponds, small lakes, or short river reaches, zero-dimensional or completely mixed models may be used when the violation of standards are consistent over the entire water segment. For long river reaches and shallow, unstratified, run-of-the-river reservoirs, 1D models that are resolved in the along-stream direction may be adopted. For deep lakes and estuaries in which stratification and vertical mixing processes are important, 1D models with vertical resolution may be selected. When wide, shallow impaired waters are laterally and longitudinally variable, 2D models should be selected. In impaired waters with complex hydrodynamics and physical, chemical, and biological processes varying longitudinally, vertically, and laterally, 3D models are strongly recommended.
Often pollutants enter impaired waters as a complex mixture from point and nonpoint sources. If the receiving water model domain cannot be extended into tributaries to distinguish and separate the water quality effects of each point source and classes of manageable nonpoint sources, auxiliary watershed models are necessary to distinguish manageable nonpoint-source loads during TMDL implementation (Effler et al. 2002;Annear et al. 2004;Liu et al. 2008). These watershed models can include dynamically varying loads. They can also include annualized loads that are approximated as annual averages, seasonally varying loads, or constant yield loads that are steady). Depending on the spatial resolution of lumped parameterizations within the watershed, these models lose process integrity or realism to some degree.
Numerical models may be further classified into • Receiving water quality models. These models simulate hydrodynamic and water quality processes within the waterbody to link loads to impairments (McCutcheon 1989;Martin and McCutcheon 1999;Camacho et al. 2019). Receiving water models are selected without watershed models when all point and nonpoint sources discretely discharge directly into the receiving water segments and are not attenuated between source areas and the receiving water (McCutcheon 1989;Sparkman et al. 2017). These models can also be used to evaluate the assimilative capacity of an impaired waterbody (Chapra 2003). More recently, the use of receiving water quality models has started enabling the management of nonpoint sources ). • Watershed models. These auxiliary models of landscape hydrology may be necessary to distinguish all significant, manageable point-source and nonpoint-source loads including manageable natural background loads, and if pollutants are significantly attenuated between source areas and the receiving water impairments during overland flow, interflow, and groundwater recharge of streams (Huber et al. 2006). Watershed models require specifications of precipitation, land use, impervious areas, slopes, soil types, drainage area, and other information (Borah et al. 2019a). These watershed models simulate the impact of BMPs on the water quality of runoff and stream recharge using semiempirical and fully empirical approximations (Huber et al. 2006;Sparkman et al. 2017;Borah et al. 2019a). Over the last 3 decades, the development of improved watershed models has faciliated more accurate modeling and managing of point sources. • Integrated models. Integrated models can either be models in which multiple processes are represented within one model or a combination of different models . Such models may be dynamically linked such that models communicate information with each other throughout the simulation. For instance, coupled surface water and groundwater models fall under this category. Such models may also be loosely linked, such that the outputs of one model are used to drive another model. Coupled water resources and socioeconomic models, or hydrodynamic and ecological models, are examples of such loose linkages . These are used when there are many water quality constituents of interest with multiple chemical and biological cycles, or when multiple waterbodies have to be included. These approaches may also be used when the models are used to achieve multiobjective management goals throughout the watershed . Numerical models may be further classified on the basis of complexity and sophistication. In this paper, we define complexity as increasing with the number of modeled processes and sophistication as increasing with improving process integrity. Models such as Hydrological Simulation Program-Fortran (HSPF) (Bicknell et al. 2001) that can be executed in a user-friendly environment such as better assessment science integrating point and nonpoint sources (BASINS) (USEPA 2019a) can include many physical, chemical, and biological submodels that are relatively simple. When nonpoint-source loads throughout the watershed are major drivers of impairment, such models may be considered. On the other hand, hydrodynamic models such as MIKE11 (Havnø et al. 1995) or Stanford Unstructured Nonhydrostatic Terrain-following Adaptive Navier-Stokes Simulator (SUNTANS) (Fringer et al. 2006) are more sophisticated. When transport and mixing processes within the water column are dominant drivers of impairment, and when waste assimilation in the waterbody is to be represented, these models may be considered. Sometimes, models such as the Environmental Fluid Dynamics Code (EFDC) (Hamrick and Wu 1997;Tetra Tech 2002), National Center for Computation Hydroscience and Engineering 2D (CCHE 2D) Jia 2013), or Delft-3D (Deltares 2012) suites can include sophisticated hydrodynamics as well as a complex set of chemical and biological process models.
The number of dimensions of the waterbody to be represented and the skill of the modeling team determine the sophistication of numerical models. Skilled practitioners can use models that resolve hydrodynamic processes but also require advanced computing and programming ability [e.g., SUNTANS, Semi-implicit Cross-scale Hydroscience Integrated System Model (SCHISM) (Zhang et al. 2016), unstructured three dimensional model (UnTRIM) (Cheng and Casulli 2002), Delft-3D]. These models typically have higher process integrity but can be less stable than simpler hydraulic or hydrologic models [e.g., Hydrologic Engineering Center-River Analysis System (HEC-RAS) (USACE 2016)]. The latter class of models allow large numerical diffusion to stabilize the model results in coarse resolution domains but seriously compromise process integrity as the resolution is increased (Sridharan et al. 2018b).
Analytical process models are the second type. These models include mass balances and solutions of the transport equation that mechanistically link loads to impairment (Martin and McCutcheon 1999;Zhang and Quinn 2019). In very rare cases, such models might be used for TMDL implementation planning in small tributaries or subwatersheds to account for nonpoint-source loads that do not vary significantly over time (Zhang and Quinn 2019).
Empirical models are the third type. These are models that relate loads to impairment in a structured but not mechanistic way. Paradoxically, empirical models are among the most data-intensive as very large monitoring datasets are required to build reliable models. These datasets must comprise of long time series of multiple water quality constituents and physical, chemical, biological, hydrological, pedological (related to the soil), meteorological, and land use characteristics. In principle, these empirical relationships are only applicable within a specific range of variable values that were investigated to develop them. These methods are useful when TMDLs are being determined for novel pollutants or when transport and fate processes are unknown. Empirical models include the following types: • Parametric models. These relationships between loads and impairment are developed by regression. For example, the spatially referenced regressions on watershed attributes (SPARROW) model relates loads in discrete areas within a watershed parametrically to water quality in receiving waters and has been used to establish TMDLs for total nitrogen and phosphorous in New England's rivers (Moore et al. 2004). • Stochastic models. These are models that include estimates of the variability in the water quality due to variability in the loads. For example, the Source Loading and Management Model (SLAMM) and Water Erosion Prediction Project (WEPP), respectively, include variability in runoff and meteorological inputs to predict a range of water quality levels (Shoemaker et al. 2005). • Statistical relationships. These include flow-duration and loadduration curves (USEPA 2007) that describe the statistics of the observed water quality. Load-duration curves can be used when flow dilution dominates water quality in gauged streams (USEPA 2007). The load-duration curve has been used to establish a TMDL for fecal coliforms in the Pee Wee River Basin in South Carolina (SCDHEC 2005). • Black-box models. These include models built using machine learning and artificial intelligence techniques such as genetic algorithms, artificial neural networks, and deep learning, among others. These relate loads to impairment via complex nonlinear hidden functions that cannot be easily described. Although these approaches have been shown to have improved performance in predicting water quality when compared with regression models when the loads, hydrology, meteorology, and BMPs are known a priori (Tufail et al. 2008), they typically require high degrees of expertise in data sciences to implement, understand, and justify. The ASCE TMDL Analysis and Modeling Task Committee (forthcoming) found no evidence that these methods have actually been used to determine TMDLs or that they will begin to be used frequently in the near future. There is likely to be much greater potential in applying these techniques in conjunction with numerical or analytical approaches to update specific model assumptions based on the data (e.g., Whigham and Crapper 2001).

Optimal Model Formulation
First, the hydrodynamic processes in the waterbody need to be understood (first choice in Fig. 3). For a single water segment or a whole basin, a receiving water quality or a receiving water quality model and an auxiliary watershed model may be optimal, respectively. If different types of waterbodies are interconnected, such as a series of lakes and rivers, then integrated models may be optimal . When selecting the modeling approach, the complexity and sophistication of the optimal receiving water quality model must also be determined . For instance, if the objective is to understand and mitigate eutrophication in a deep lake associated with seasonal stratification and overturn dynamics, a steady state lake eutrophication model such as BATHTUB (Walker 2006) or a vertical 1D model such as Dynamic Reservoir Simulation Model-Water Quality (DYRESM-WQ) Schladow and Hamilton 1997) may be appropriate. However, in a lake or reservoir where significant point-source and nonpoint-source pollution contribute to lake pollution (e.g., agricultural and animal husbandry activities on the shores), a 2D model such as CE-QUAL-W2 (Cole and Wells 2018) or a 3D model such as EFDC (Hamrick and Wu 1997;Tetra Tech 2002;Zou et al. 2006), which allows these sources to be explicitly represented, should be considered.
Next, the complexity of the impairment needs to be assessed (second choice in Fig. 3). Here, consideration of the impairment failing the designated use criteria may require simulating additional impairments that influence the system. For example, in California, a TMDL to improve dissolved oxygen conditions in a 32-km-long (20-mi-long) deep shipping channel for endangered migrating salmon in the Sacramento-San Joaquin Delta required a significant reduction in algal loading from contributing watersheds. Algae in the shallow river system settled in the deeper, wider ship channel, turning to periphyton and exerted an oxygen demand. However, a policy to reduce agricultural return flows to the lower San Joaquin River to reduce nutrient loads that stimulated algae growth caused an increase in salinity at a monitoring station upstream of the estuary. In this instance, control actions that were best suited to cost-effectively address one TMDL made it more difficult to achieve another.
In this case, salinity and phytoplankton growth in both the river and estuary were simulated with the Watershed Area Resource Management Framework (WARMF) (Herr and Chen 2012). The framework allowed a simple 1D hydraulic model in the river to interface with a 1D hydrodynamic model in the Delta. WARMF also includes a consensus module to support informed management decisions, which better presents the engineering module simulations for technical and nontechnical stakeholders. An integrated approach comprising of a cascade of models was thus necessary to simulate the fate and transport of more than one pollutant responsible for impairment.
Alternately, a TMDL may have to be developed for ecological objectives that require an understanding of the biology of a single species or an entire food web or ecosystem. In such cases, agentbased (Sridharan et al. 2018a) or individual-based models (Railsback et al. 2013) for the former or ecosystem models such as Ecopath with Ecosim (EwE) (Steenbeek 2016) for the latter may be selected.

Choice of TMDL Model
The actual TMDL model chosen depends on both the optimal model and practical criteria. Based on the complexity of the physical, chemical, and biological processes, the data required to parametrize those processes (e.g., high-resolution river-bottom substrate maps to set roughness values in a 3D hydrodynamic model), the expected changes in the region over time, and modeling skill at the state or other jurisdiction, different model types may be selected (third and fourth choices in Fig. 3). The time-varying nature of the flows and loads then determines whether the chosen model is steady-state, pseudodynamic, or dynamic (fifth choice in Fig. 3).

Model Inputs and Outputs
The type of model chosen dictates what type of inputs and parameters are required to run the model. For example, watershed models play an important role in linking nonpoint-source loads to impairment, and require meteorological, land use, hydrologic, topographic, pedological and morphological data inputs (Borah et al. 2019a). On the other hand, receiving water quality models require higher frequency inputs from point and nonpoint sources.
Outputs from numerical models are typically time series of flow and water quality constituents at discrete locations. Analytical solutions are functions that describes the evolution of water quality variability continuously in space and time. The results of empirical models are relationships among flows, loads, and impairment at one or more locations. To correctly understand these results and quantify the MOS, both the spatial and temporal resolution of the model must be matched to that of the synoptic data. This ensures that representation errors are avoided (Oke and Sakov 2008). For example, a continuous temperature sensor at a location may sample several parcels of flowing water in each sampling window. Therefore, model results should be averaged over the same time window to reduce representation error. Correctly representing the model results in more reliable water quality forecasting, TMDL determination, MOS estimation, and implementation modeling.

Discussion
TMDL model selection involves the interaction of various factors at different phases of the TMDL. We identified seven factors that impact model selection. The relative importance of factors shown in Figs. 3-5 is affected by the interplay among the practical criteria for successful TMDL modeling (Cabrera-Stagno 2007; Benham et al. 2008). By carefully considering these factors, modeling teams can develop parsimonious and defensible TMDL models chosen in a cost-effective manner. In Table 3, we list various benefits accrued with careful consideration of the technical factors, as well as drawbacks without due consideration of these factors on model selection and ultimate success of the TMDL. Similarly, in Table 4, we list the potential benefits of considering management factors versus not considering them. We compiled these tables using information in several targeted guidance documents on TMDL development and surveys of TMDL applications (Deas and Lowney 2000; Neilson and Stevens 2002;Imhoff 2003;Cabrera-Stagno 2007;Clark and Vanrolleghem 2010). With a synergy of these factors, the modeling team can choose among process-based models with different levels of complexity and sophistication to provide an appropriate level of detail for analysis. The type of model ultimately chosen will influence the effectiveness of the TMDL. When nonpoint-source loading is to be represented accurately and when processes within the water segment are important, receiving water quality models of appropriate sophistication and complexity may be selected . When loads are distributed throughout the watershed and there are many mechanistic processes that influence impairment, additional auxiliary watershed models may be selected to model the loads (Borah et al. 2019a). Owing to climate change and rapid urbanization, the frequency and intensity of natural disasters are increasing everywhere (Ramaswami et al. 2018). This will result in increasingly more common occurrences of critical conditions. Therefore, the selected TMDL model must be able to represent steady-state or dynamic critical conditions (Zhang and Padmanabhan 2019). As the interfaces between different sciences blur and solutions to challenging human problems become transdisciplinary, there will be increasing dependence on integrated or linked models .
In all instances, every effort must be made to select processbased cause-and-effect models. Parsimonious process-based numerical models developed with high-quality data are necessary to reliably forecast water quality and allow feasible load-reduction allocations and waste limitations. More complex models can resolve many physical, chemical, biological, and ecological pathways in nature and can be useful to link nonpoint-source loads within the watershed to impairment. More sophisticated models can simulate the water column with greater process integrity. Even though TMDLs have been determined using analytical and empirical methods in a limited number of cases (e.g., Moore et al. 2004;SCDHEC 2005;Shoemaker et al. 2005), the state or other jurisdiction should always work toward collecting synoptic data to build high-quality numerical models.
When selecting an appropriate TMDL model, it is crucial to budget for synoptic data collection. At the same time, limitations in data collection due to social and cultural taboos or pushback from local communities may hinder requisite data collection. For example, in the Sacramento-San Joaquin Delta, one of the most significant sources of uncertainty in flow and nonpoint-source loading is the volume of pumped diversions and agricultural runoff (Sridharan et al. 2018b). In such cases, the state or other jurisdiction should try to actively engage with local stakeholders to augment synoptic data collection through increased self-reporting. In certain instances, the signal of certain processes may be very difficult to isolate. For example, in macrotidal estuaries, subtidal density-driven circulation may not be discernable within the tidal fluctuations in flow (MacCready and Geyer 2010). These limitations are not easily overcome with even a well-designed synoptic data collection effort. In such cases, the most parsimonious and defensible model must be chosen even if there might be a more complex or sophisticated model that can represent such undiscernible processes.

Conclusions
We have developed and presented a comprehensive protocol and associated flowchart (Fig. 3) to aid researchers and practitioners in selecting a parsimonious and defensible model for determining TMDLs given practical criteria. This protocol includes seven classes of factors that affect the water quality management program: (1) types of waterbody impairment, (2) pollutants that cause the impairment, (3) data requirements, (4) uncertainty about the future, (5) resource constraints, (6) level of stakeholder collaboration, and (7) scope of the TMDL model.
It is clear from the TMDL reports published within the last 5 years that systematic model selection processes based on welljustified rationale and supported by a synoptic data collection plan are not currently being extensively applied (Fig. 1). Several attempts have been made in the past to catalog some models that can be adapted for determining TMDLs. More recently, a series of papers appeared in a special collection of the ASCE Journal of Hydrologic Engineering on TMDL analysis and modeling in which the applicability of numerous state-of-the-art models were discussed in relation to the types of impairment problems. In this paper, we build on this body of work by providing a comprehensive scientific basis that includes both technical and management criteria for appropriate model selection.
Throughout this paper, we list several examples of TMDL model selection. The simplest empirical and analytical models are not very useful in determining TMDLs but can be used for an initial qualitative assessment of the impairment. Statistical, stochastic, and black-box models could be also useful in initial assessment and in certain flow-dilution-dominated impairments be used to determine the TMDL. However, these models are data intensive and not very useful in establishing cause-and-effect relationships between loadings and impairment. At the same time, about 13% and 40% of the USEPA-approved reports on TMDLs developed between 2015 and 2020 have relied on analytical and empirical models, respectively.
However, there are encouraging indications that in the case of pathogen, nutrient, algae, sediment, temperature, and heavy-metal impairments, numerical process-based models are being increasingly used to determine TMDLs (Table 1). One and two-dimensional numerical process-based models are the most versatile and should ideally be used to determine TMDLs but have significant data and resource requirements. Sophisticated three-dimensional numerical models are very useful for targeted studies of a small part of a watershed or a single waterbody but are extremely resource-intensive and typically do not translate into versatile decision support tools. The rate of computational capacity increase in the last 3 decades has not kept pace with the extremely resource intensive nature of sophisticated 3D numerical models (Fringer et al. 2019). Therefore, it appears that 1D and 2D or even coarse-resolution 3D models on with simple computational algorithms complemented by a strong synoptic data collection program will continue to support management actions for the foreseeable future. We formulated the model selection protocol with lean design principles, so that (1) minimal looping of iterative model selection tasks is ensured, (2) information siloing and suboptimal interactions of cross-functional teams are minimized by specifying protocols of engaging with stakeholders and experts during the TMDL model selection process, and (3) optimal resource utilization is ensured by incorporating the data collection requirements of the optimal model into the practical criteria affecting the state or other jurisdiction. Urban sprawl means that over the coming decades, sustainable cities of the future will form a significant fraction of watersheds. The use of more complex and sophisticated integrated models that link water quality with environmental, ecological, and socioeconomic models will become increasingly necessary to determine and implement meaningful TMDLs. We hope that the application of this protocol and the associated flowchart will result in optimal TMDL model choices under a nimble management landscape that will rise to meet the challenges of an uncertain, yet exciting future.