Capturing ecological processes in dynamic forest models: why there is no silver bullet to cope with complexity

. Dynamic forest models are a key tool to better understand, assess, and project decadal- to cen-tennial-scale forest dynamics. Despite their success, many questions regarding appropriate model formulations remain unresolved, and few models have found widespread application, for example, across a whole continent. We aimed to scrutinize the representation of ecological processes in dynamic forest models so as to rigorously test core assumptions underlying forest dynamics and the consistency of their interplay, taking the ForClim model as a case study. We developed a set of alternative representations for the main ecological processes, that is, tree establishment, growth, and mortality, and light extinction through the canopy, based on diverse sources of empirical data. We applied a pattern-oriented modeling (POM) approach to test all combinations of the standard and alternative formulations (>500 model versions) against a comprehensive set of patterns for diverse model applications across a wide range of site conditions. We found that adapting one process in isolation can improve model performance for one speci ﬁ c application. However, the best model versions typically included more than one alternative formulation. Importantly, the best version for an individual application was generally not the best across multiple applications, emphasizing the varying in ﬂ uences of ecological processes. We conclude that the behavior and performance of complex models should not be analyzed for a few speci ﬁ c applications only. Rather, multiple applications, system states, and dynamics of interest should be scrutinized across a wide range of site conditions. This allows for avoiding over ﬁ tting and detecting and eliminating structural shortcomings and parameterization problems. We thus propose to make use of the ever-increasing data availability and the POM framework to challenge the core processes of dynamic models in a holistic manner. For model applications, we propose that a set of alternative formulations (ensemble simulations) should be used to quan-tify the impacts of structural uncertainty, rather than to rely on the projections from one single model version.


INTRODUCTION
Simulating decadal-to centennial-scale forest dynamics is of high interest for manifold purposes, from testing assumptions about fundamental ecological processes (Botkin et al. 1972b, Berzaghi et al. 2018 to projecting future stand development, forest composition, and ecosystem service provisioning under scenarios of forest management and/or climate change (Biber et al. 2015, Irauschek et al. 2015, Shugart et al. 2018. Dynamic, process-based vegetation models (DVMs), particularly dynamic global vegetation models (DGVMs; Foley et al. 1996, Friend et al. 1997) and forest gap models (FGMs; Botkin et al. 1972a, Bugmann 2001, Shugart et al. 2018, are important tools for assessing long-term forest dynamics because they incorporate the key processes of tree regeneration, growth, and mortality. Yet, the development of DVMs is challenging since many processes are involved and interact across temporal and spatial scales (Bugmann 2001, Keane et al. 2001, Price et al. 2001). Thus, large uncertainties remain regarding the quantification of the ecological processes that govern forest dynamics, including both structural and parameter-related uncertainties (Fortin et al. 2009, Hl asny et al. 2014, Horemans et al. 2016, Reyer et al. 2016, Neumann et al. 2017, Huber et al. 2018. Not surprisingly, a plethora of model formulations are used in DVMs, but they are rarely benchmarked rigorously. Yet, alternative process formulations have the potential to strongly alter simulation results, particularly when DVMs are applied for climate change impact assessments (Nishina et al. 2015, Horemans et al. 2016. Since the introduction of DVMs in the late 1960s (Siccama et al. 1969, Botkin et al. 1972a), many efforts have aimed at improving process representations and enhancing model applicability (Larocque et al. 2016). Yet, these most often focused on single processes such as regeneration (Didion et al. 2011), light extinction through the canopy (Courbaud et al. 2003), phenology (Chuine andBeaubien 2001, Schaber andBadeck 2003), or mortality (Manusch et al. 2012). However, the focus on a single process during model development may lead to the implementation of a formulation that improves model performance through counterbalancing shortcomings in other parts of the model, instead of increasing the overall ecological reality (i.e., getting the right patterns for the wrong reasons). This risk is particularly high if only few highly aggregated model outputs are used for model evaluation (e.g., basal area for forest models). Despite pleas to jointly revisit core processes (i.e., tree establishment, growth, and mortality; e.g., Bircher et al. 2015, Courbaud et al. 2015, H€ ulsmann et al. 2018), this has-at least to our knowledge-not been done so far. Clearly, such an effort requires a comprehensive approach considering process interactions at multiple spatial and temporal scales.
Pattern-oriented modeling (POM) provides a bottom-up framework to support such model assessment and improvement (Grimm et al. 1996). Pattern-oriented modeling is a transparent approach to evaluate dynamic models by formulating alternative assumptions that are (1) treated as hypotheses within the same model architecture and (2) tested against multiple patterns observed from real systems (Grimm et al. 1996, Wiegand et al. 2003, Grimm and Railsback 2012. Therefore, a pattern denotes a characteristic, clearly identifiable structure in the nature of the modeled system itself or in field data (Grimm et al. 1996). By testing formulations with different levels of complexity (i.e., representing the real system at different levels of resolution), POM facilitates the identification of optimal model complexity (i.e., referred to as Medawar zone; Grimm et al. 2005). Pattern-oriented modeling assumes that a model from which a diverse array of observed patterns emerges contains all relevant mechanisms with an appropriate level of complexity for the model's purpose (Grimm et al. 2005, Grimm andRailsback 2012).
While the original formulations of DVMs were sometimes developed completely independent of field data-simply due to the lack of relevant data (Botkin et al. 1972a)-increasingly available large datasets (Lichstein et al. 2014, Schelhaas et al. 2018, new techniques for the acquisition (Hyyppa et al. 2008, Choat et al. 2018) and analysis of ecological data (Hartig et al. 2012) provide a large potential for better ecosystem understanding.
We use such data in a POM approach to rigorously test the representation of core processes underlying forest dynamics and the consistency of their interplay with a focus on European temperate forests. Due to the fact that different processes may be relevant to different applications (cf. Huber et al. 2018) and interactions between them are likely important, we hypothesize that the simultaneous scrutinization of all core processes is needed to improve the robustness of model projections for a broad set of applications in time and space. Therefore, we revisit the formulation of all core processes (i.e., tree establishment, growth, and mortality, and light extinction through the canopy) and rigorously test the underlying ecological assumptions against multiple patterns characterizing the structure (e.g., basal area), composition (e.g., dominant species), and succession (e.g., stand development over time) of forest ecosystems. We consider a broad set of model applications including long-and short-term simulations, monospecific and mixedspecies forests, and different disturbance regimes at 16 sites representing a cross-European gradient. In particular, we address the following questions: (1) Are modifications in more than one process needed to improve model performance across applications? (2) Which combination(s) of process formulations yield(s) the highest model performance? And (3) what can we infer from these insights regarding the long-term dynamics of temperate forests?

METHODS
Forest gap models are a class of DVMs that simulate the establishment, growth, and mortality of individual trees (or cohorts) on small patches, approximating the size of a canopy gap created by the death of a dominant large tree, as a function of biotic and abiotic factors (Bugmann 2001). Based on the original JABOWA model (Siccama et al. 1969, Botkin et al. 1972a, several hundred FGMs have been developed worldwide across all major biomes (Kimmins 2004, Larocque et al. 2016. Despite numerous development efforts, many FGMs and other DVMs maintain several of the original assumptions of JABOWA (Bugmann 2001, Keane et al. 2001, Price et al. 2001, Kimmins 2004, Larocque et al. 2016. In JABOWA, trees are established as saplings and grow according to a diameter-related growth equation considering limitations due to shading and site insolation, climate (represented by patch-specific temperature), and soil quality (represented by patch-specific soil moisture and nutrient availability). Tree death is modeled as a random process and includes a mortality probability that is constant, depending on speciesspecific maximum tree age, and a stress-related mortality probability that occurs if tree growth falls below a species-specific minimum (Botkin et al. 1972b).
ForClim is a state-of-the-art JABOWA-type FGM that was initially developed to predict tree species composition and forest dynamics in the European Alps (Bugmann 1994, Bugmann 1996. The model has been used for numerous applications to address both fundamental ecological (Morin et al. 2014, Bircher 2015, H€ ulsmann et al. 2018) and applied questions such as future forest development under climate change (Gutierrez et al. 2016. ForClim consists of four submodules: The WEATHER and WATER submodules calculate the bioclimatic inputs for the PLANT submodule. The latter simulates speciesspecific establishment, growth, and mortality of trees at yearly time steps on independent patches (usually 200) as a function of biotic and abiotic conditions (Bugmann 1996), while the MAN-AGEMENT submodule allows for the simulation of a wide range of planting, cutting, and thinning techniques (Rasche et al. 2011). The model is parameterized for >30 central European tree species, whereby the species parameters were estimated independently of the process representations (e.g., based on relative classifications, field or inventory data; Bugmann 1994, Wehrli et al. 2007, Rasche et al. 2012). An indepth analysis and discussion of parameter sensitivity was performed by Huber et al. (2018)  To simultaneously scrutinize the formulations of the core processes tree establishment, growth, and mortality, and the light regime, we compared different assumptions on the representation of these processes in ForClim. We formulated alternative, putatively improved process and parameter representations and compared them to the previous, that is, standard formulations. In addition, we revised the values of four species-specific parameters related to these processes based on expert knowledge and literature (Appendix S1: Table S1).
An overview of all formulations (i.e., the standard and alternative formulations) is provided in Tables 1, 2, and their rationale is explained below (section Ecological processes: identification of alternative formulations; full details are provided in Appendix S1). To assess the standard vs. alternative assumptions of all processes simultaneously, we created a comprehensive set of model versions consisting of a factorial combination of the standard and all alternative formulations (Tables 1, 2), amounting to a total of 504 model versions.
Since the model's sensitivity with respect to the different processes is likely to vary under different applications and environmental conditions (Huber et al. 2018), we ran all model versions at 16 sites representing a cross-European gradient for five applications considering long-and shortterm forest dynamics, monospecific and mixedspecies settings, and different disturbance regimes (natural or anthropogenic; see section Study sites and simulation protocol). To benchmark the performance of the model versions, we used the POM approach (see section POM approach: patterns and grading system).

Ecological processes: identification of alternative formulations
Establishment.-Establishment determines whether a species is present at a given site (Bazzaz 1979) and thus strongly influences forest structure and species composition (Grubb 1977, Kobe et al. 1995, Price et al. 2001. Simulation results from DVMs are highly sensitive to the assumptions regarding tree establishment (Courbaud et al. 2015, Huber et al. 2018, thus rendering the accurate modeling of establishment a key research need (Price et al. 2001). Yet, this has not been picked up by the research community to date, with few exceptions (Lischke and L€ offler 2006, Wehrli et al. 2007, Seidl et al. 2012, and in surprisingly many DVMs, the original and highly simplistic assumptions of the JABOWA model (Botkin et al. 1972a) are maintained.
Many DVMs, including ForClim, capture regeneration in a highly aggregated manner by establishing trees with a diameter at breast height (dbh) of typically 0.5-1.5 cm and an initial height >1.5 m (Bugmann 2001, Price et al. 2001. Thus, the production, dispersal and germination of seeds and the growth and mortality of seedlings are typically not represented. This is mainly due to limited quantitative knowledge on the underlying processes, especially with regard to the very high mortality rates in the first years of tree life (Houle 1994, Germino et al. 2002, Castanha et al. 2013. When establishing trees of a certain dbh, the routines of DVMs have to answer the following questions: (1) Which species are able to establish given the currently prevailing conditions? and (2) How many trees can establish per species and time step? In ForClim and other FGMs, it is assumed that (1) seeds of all species are always available (Price et al. 2001) and environmental filters determine the site-specific set of species available for establishment, including bioclimatic (e.g., temperature and soil moisture) and competition factors related to the overstory (e.g., light; Bugmann 2001, Price et al. 2001. Interactions within the establishment layer are typically not considered. Further, it is often assumed that (2a) shade-tolerant species establish fewer saplings than shade-intolerant ones (Botkin et al. 1972a, Risch et al. 2005 and that (2b) the number of saplings to be established in a given year per species is determined using a uniformly distributed random number with a low maximum value (Botkin et al. 1972a, Bugmann 1996, thus effectively assuming recruitment limitation (cf. Clark et al. 1999).
This set of assumptions is consistent with the fact that shade-intolerant species generally produce a large amount of small seeds, while shadetolerant species tend to be larger-seeded, which generally comes along with comparably low average seed numbers (Salisbury 1942, Smith and Fretwell 1974, Messina and Fox 2001, Grubb 2016). Yet, producing more seeds does not necessarily imply producing a higher number of saplings. Rather, larger-seeded species usually feature higher competitive ability among seedlings and higher tolerance of hazards during establishment than small-seeded species (Kollmann et al. 1998, Leishman 2001, Moles and Westoby 2006, Grubb 2016. Thus, a trade-off exists with regard to seed number and seed size, that is, the so-called seed number-seedling survival (SNSS) trade-off (Coomes and Grubb 2003). Ignoring the SNSS trade-off can severely bias simulated species composition, both after disturbance and in long-term simulations, by disproportionally favoring shade-intolerant species. Furthermore, the standard assumptions can lead to recruitment limitation of shade-tolerant species because maximum sapling number is quite low. For species-poor or monospecific stands, the extent of recruitment limitation of shade-tolerant species as implemented in ForClim is not supported by recent empirical findings of post-disturbance forest dynamics (Collet et al. 2008, Petritan et al. 2013, B ılek et al. 2014, Bolte et al. 2014, Brang et al. 2015, Rozman et al. 2015. For the alternative establishment routine, we thus relaxed the recruitment limitation at the level of the individual species by introducing a fixed, but liberal, maximum tree number (kTrMax), that is, the maximum number of all trees per forest patch, from which the number of new saplings is deduced and subsequently distributed across the species that actually establish (Appendix S1: Fig. S2). This routine takes species interactions within the seedling and sapling layer • The maximum number of saplings that can be established is restricted per species; therefore, shade-tolerant species establish distinctly fewer saplings than shade-intolerant species (E 0 ) • The establishment probability and the number of saplings established of a species are independent of the establishment of other species (E 0 ) • A species is able to establish at a given site if speciesspecific requirements are fulfilled (i.e., light availability, winter temperature, degree-days, soil moisture, and browsing pressure) • The maximum number of saplings across all species depends on the number of existing trees, site conditions (E 1 -E 6 ), and establishment conditions (E 3 -E 6 ); no species-specific restriction • The establishment of a species is influenced by the establishment of other species (reduction in the number of saplings, E 1 -E 6 ) • Set of options resulting in E 1 -E 6 : Table 1 G • kG values based on the parameterization from JABOWA (G 0 ; Botkin et al. 1972a, Kienast 1987 • kG values parameterized based on data from the Inter- Notes: Abbreviations are E, establishment (E 0 and E 1 -E 6 ); G, growth-rate parameter (G 0 and G 1 ); R, growth-reduction factor (R 0 and R 1 ); A, allocation of volume growth to height and diameter growth (A 0 , A 1 , and A 2 ); M, mortality (M 0 and M 1 ); and L, light regime (L 0 , L 1 , and L 2 ). The subscript 0 always refers to the standard formulation, while values >0 refer to alternative formulation(s).
into account and thus renders the a priori definition of species-specific maximum sapling numbers obsolete. To define which species are able to establish at a specific site, environmental filters are applied accounting for site-specific conditions. These were the same as in the standard routine (i.e., light availability at forest floor, maximum and minimum winter temperatures, degree-days, soil moisture, and browsing pressure; Bugmann 1996, Risch et al. 2005, Didion 2009, Didion et al. 2011. Furthermore, an establishment probability parameter (kEstP) is used to determine whether establishment occurs in a given year or not.
Within the structure of the alternative establishment routine, we formulated a set of options that can be combined with each other, resulting in six alternative formulations of increasing complexity (E 1 -E 6 ; Table 1). This was done to identify an optimal level of model complexity, as a simple formulation may neglect essential mechanisms, while a complex formulation may not be needed for the model's purpose (Grimm et al. 2005, Arnold et al. 2015. In the following, a short overview is provided, while a detailed description is available in Appendix S1. First, for the derivation of the establishment probability parameter (kEstP), option a 1 features a universal establishment probability (kEstP), while option a 2 employs a site-specific establishment probability as a function of seasonal temperature and drought conditions. Thus, option a 2 considers that establishment often fails at cold sites due to frost, wind abrasion, desiccation, or snow movements (Kullman 2007), and at dry sites due to drought or overheating (Kolb andRobberecht 1999, Kitzberger et al. 2000).
Second, to derive the maximum number of new saplings, maximum tree number (kTrMax) is reduced by site-specific limitations (drought, cold, and number of existing trees; option b 1 ). Option b 2 takes additionally into account the actual establishment conditions (such as light availability at the forest floor, browsing pressure, or chilling requirement) by reducing the number of new trees as conditions are approaching the species' limits. Thus, option b 2 considers increasing stress-induced mortality that acts to reduce the number of seeds, germinants, seedlings, and saplings when confronted with increasingly harsh environmental conditions (Germino et al. 2002, Gomez-Aparicio et al. 2008. Third, for the allocation of the new saplings across species, option c 1 assumes that the number of saplings is distributed equally across the species, while the environmental conditions acting upon each species are considered in option c 2 . Thus, in option c 2 , species for which establishment conditions are comparably good experience a competitive advantage with respect to sapling number (Rigling et al. , B ılek et al. 2014. In summary, the alternative formulations (E 1 -E 6 , cf. Table 1), which first deduce the overall number of new saplings based on site and stand characteristics and subsequently distribute them across species, are expected to result in more realistic establishment patterns than the standard formulation (E 0 ), for which the overall number of saplings is strongly influenced by the number of species that are used in the simulation and the species' shade tolerances. Differences in simulation results are expected to be most pronounced for species-poor forests, and especially for shadetolerant species.
Growth.-Even though in most DVMs, tree growth is modeled in much more detail than other processes such as establishment or mortality (Bugmann 2001, Keane et al. 2001, recent studies have shown that the representation of tree growth can still be improved in many DVMs (Guiot et al. 2014), for example, by accounting for intra-annual variability in the growth response to drought (Dufrene et al. 2005 or by considering lag effects (Misson 2004, Guillemot et al. 2017. Virtually, all FGMs, including ForClim, use the approach of constrained optimum growth (Botkin et al. 1972a) to model the annual dbh growth of individual trees: The realized growth of an individual is derived from species-specific potential growth under (theoretical) optimum conditions, which is reduced by the effects of suboptimal conditions (Larocque et al. 2016), typically including bioclimatic constraints (e.g., summer warmth and soil moisture), biotic constraints (e.g., competition and crown length), and other limitations (e.g., nitrogen availability; Bugmann 2001).
The growth equation of ForClim (Eq. 1) adheres to this form and is based on a modified version of the carbon budget model by Moore ❖ www.esajournals.org (1989). Optimum annual growth is reduced by the growth-reduction factor (GRF) that considers the following constraints: degree-day sum, soil moisture, nutrients, light availability, and crown length (Moore 1989, Bugmann 1996, Didion et al. 2009a. Growth is then allocated to diameter and height growth by accounting for competition effects (i.e., the amount of resources invested in height vs. diameter growth depends on light availability; Pothier andMargolis 1991, Naidu et al. 1998; details are provided below) and site-specific climatic conditions (Bugmann 1996, Rasche et al. 2012: where kG is the growth-rate parameter, H is tree height, kHMax is maximum tree height, ƒ h is a function that allocates volume growth to diameter and height growth, and GRF is a function that reduces maximum growth according to environmental constraints (Moore 1989, Bugmann 1996, Didion et al. 2009a, Rasche et al. 2012. Stem volume, biomass of branches, and leaf area are derived via allometric equations, while root growth is not considered explicitly (Mina 2016).
In the context of such growth equations, the challenge of modeling annual radial tree growth can be decomposed into three parts: (1) the estimation of maximum growth rate (in the case of ForClim, kG); (2) the combination of multiple environmental constraints into one function (GRF, cf. Eq. 1) that serves to calculate actual annual radial growth; and (3) the allocation of volume growth to height and diameter growth. They are discussed in detail below.
Growth-rate parameter (kG).-The growth-rate parameter kG, which serves to capture maximum growth rate, strongly influences model behavior (Huber et al. 2018) while being difficult to estimate because it would require observations under optimal conditions (e.g., no competition and no water stress), for which data are excessively rare, especially across a broad range of species.
For JABOWA, kG was determined manually so that a tree growing under optimum conditions reaches two-thirds of its species-specific maximum dbh at half of its species-specific maximum age (Botkin et al. 1972a, Kienast 1987. However, since this parameterization assumes that maximum dbh is not reached before maximum age, it is forced to distribute diameter growth over a relatively long period, resulting in excessively low optimum diameter growth at intermediate tree size (Bragg 2001;Fig. 1).
To obtain an empirically based kG parameterization, we analyzed >50,000 individual tree-ring chronologies of 15 European species from the International Tree-Ring Data Bank (ITRDB), surprisingly revealing that (1) the 15 species fell into merely two groups of maximum radial increment and (2) differences between species within each group were negligible (Appendix S1: Table S2): Species able to reach a height of at least %30 m are characterized by maximum ringwidth increments of~10 mm/yr (99% percentile), while species with smaller maximum height reach maximum increments of~5 mm/yr. These maximum values are in line with other empirical studies (Bragg 2001, Lee et al. 2004, Fyllas et al. 2017) and a later analysis based on a pan-continental tree-ring width database comprising mostly other species M. Cailleret, personal communication). Thus, we fitted the kG parameter for every species so that maximum diameter increment (i.e., twice the maximum radial increment) under optimum conditions reached 20 or 10 mm/yr according to the group (Fig. 1). This alternative parameterization resulted in distinctly higher kG values, that is, roughly a doubling when averaged across all species (Appendix S1: Table S1).
We expect that this alternative kG parameterization (G 1 ) results in more realistic individual and stand growth, for example, with regard to the time needed to reach a certain dbh, thus resolving previously observed deficiencies in the representation of growth (Bragg 2001).
Growth-reduction factor. -The approach of constrained optimum growth is used in virtually all FGMs, and thus, the question how to combine multiple growth-reducing factors (GFs) into one annual GRF continues to be subject to debate (Bugmann 2001, Niinemets 2010, and different approaches are being used (Pretzsch et al. 2002, Fyllas et al. 2007.
In ForClim, a modified geometric mean is applied to combine the individual growth-reducing factors, GFs (Bugmann 1994, Bugmann 1996: This results in a more mellow reduction of growth compared to the strict multiplicative approach used, for example, in JABOWA (Botkin et al. 1972a), and it considers all GFs in contrast to, for example, Liebig's law of the minimum (von Liebig 1855). Yet, uncertainty exists with regard to the exponent of this equation, whose influence on model output has never been assessed but which seems to rather under-than overestimate the effect of unfavorable conditions on growth. Thus, we kept the general form of Eq. 2 for the alternative formulation (R 1 ) but used the square root instead of the third root (Table 2) to reflect a generally more pronounced growth reduction than the standard formulation (R 0 = Eq. 2; Appendix S1: Fig. S5). We acknowledge that this approach is arbitrary (since not directly measurable), but the comparison between the standard and the alternative formulation enables us to assess model sensitivity to this structural uncertainty.
Allocation of volume growth to height and diameter growth. -Traditionally, most FGMs have treated tree height growth as a variable that depends on dbh via a fixed allometric relationship, whereas in reality, the ratio of height to diameter increment is not constant, but depends on environmental conditions in a species-specific manner (Pothier and Margolis 1991, Naidu et al. 1998, Crecente-Campo et al. 2009).
The standard formulation of ForClim features a light-dependent allocation to height vs. diameter growth that was introduced by Lindner et al. (1997) for the FORSKA model (Appendix S1: Eq. S3). It is based on two species-specific parameters (Lindner et al. 1997, Lindner 1998. In ForClim, parameter values were derived using a linear regression between the values in FORSKA and the species' shade tolerance (kLa; Appendix S1: Eqs. S4, S5; Rasche et al. 2012). Although the assumption that light availability influences the partitioning of growth into height and diameter growth is theoretically sound and qualitatively correct (Forrester et al. 2017), inspection of model behavior revealed issues with the parameterization, that is, an overestimation of diameter growth of shade-tolerant species at the expense of height growth, thus questioning the approach by Rasche et al. (2012).
We introduced two alternative formulations (A 1 and A 2 ) to improve the partitioning of height vs. diameter growth for all species, and thus to better reflect basic ecological principles of species-specific allocation strategies. For A 1 , we used the original set of parameter values provided by Lindner (1998) for the FORSKA model. For A 2 , we retained the dependency on shade tolerance ❖ www.esajournals.org introduced by Rasche et al. (2012) but added a scaling factor (Appendix S1: Eqs. S7, S8) to diminish the very strong differences in the allocation to diameter growth for shade-tolerant vs. shade-intolerant species arising from the standard parameterization (A 0 ).
Mortality.-Tree mortality has large impacts on short-and long-term forest dynamics and ecosystem services (Franklin et al. 1987, Millar and Stephenson 2015, Anderegg et al. 2016), yet tree mortality patterns in time and space are hard to study and even harder to explain (Monserud 1976, Hawkes 2000, Eid and Tuhus 2001, Keane et al. 2001, Choat et al. 2018, H€ ulsmann et al. 2018. Even though mortality formulations in DVMs have been criticized for a long time (Keane et al. 2001), the rigorous testing of alternative formulations has only recently received increasing attention and revealed that DVMs are sensitive to mortality formulations, particularly when applied for long-term simulations (Bigler and Bugmann 2004, Manusch et al. 2012, Holzwarth et al. 2013, H€ ulsmann et al. 2018. To model individual tree mortality, DVMs either apply theoretical (Bugmann 1996, empirically derived (Pretzsch et al. 2002, Hl asny et al. 2014 or in a few cases highly mechanistic and detailed (McDowell et al. 2013) formulations. The original mortality formulation of JABOWA (Botkin et al. 1972a) and the standard formulation in ForClim (Bugmann 1996) belong to the first category. In JABOWA-type FGMs, the annual mortality probability is composed of (1) a constant background mortality, (2) a stress-induced mortality, and (3) a disturbancerelated mortality (Keane et al. 2001), as explained below.
The background mortality reflects individual tree death due to small-scale disturbance agents (e.g., windthrow, lightning, or fungal attacks; Bugmann 1994), which are not represented explicitly in the model, and complex interactions among those. In most models, background mortality is species-specific (e.g., depending on species maximum age or shade tolerance) but independent of actual tree age, stand density, or the environment (Keane et al. 2001). ForClim applies a constant background mortality derived from species-specific maximum age (Botkin et al. 1972a), scaled so that 1% of the trees survive to maximum age in the absence of other sources of mortality (Shugart 1984, Bugmann 1996. The stress-induced mortality is based on the findings that (1) radial growth is an integrative indicator of tree vitality (Dobbertin 2005) and (2) mortality events are typically preceded by periods of low radial growth (Bigler and Bugmann 2003). In most DVMs, this mortality is linked to growth limitations such as low light availability, drought, and other constraints (Keane et al. 2001, Larocque et al. 2016. In ForClim, it is acting when diameter increment falls (1) below 0.03 cm/yr or (2) below 10% of maximum diameter increment (i.e., GRF < 0.1) for at least three consecutive years (Bugmann 1996).
The disturbance-related mortality, also referred to as exogenous mortality, represents major landscape-level disturbances such as fire, pest outbreaks, avalanches or severe wind (Keane et al. 2001), or mortality due to management. Major landscape-level disturbances are captured via the parameter kDistP, representing the probability of disturbance killing all trees independent of species or tree height on a patch simultaneously (Busing and Solomon 2005). In addition, a wide range of planting, cutting, and thinning techniques are available in ForClim to simulate management interventions explicitly (Rasche et al. 2011).
The combination of a background and a stressinduced mortality formulation is often claimed to result in a U-shaped mortality function over tree diameter (Bugmann 2001, Keane et al. 2001). Such a relationship results from ecological theory because of high competition-induced mortality of small trees and amplified mortality of large trees due to their exposure to a high number of mortality agents (Goff and West 1975, Lorimer and Frelich 1984, Franklin et al. 1987, Harcombe 1987). Yet, empirical studies have not revealed a consistent pattern but either U-or reverse Jshaped mortality over tree size (Monserud and Sterba 1999, Hurst et al. 2011, Holzwarth et al. 2013, Ruiz-Benito et al. 2013, H€ ulsmann et al. 2016, H€ ulsmann et al. 2018). In the case of For-Clim, detailed inspection of model formulation and behavior revealed that high competition-induced mortality rates of small trees are represented by the stress-induced mortality formulation indeed. However, the increasing mortality rates of large trees are not captured by the standard formulation, but tall trees are able to persist for a very long time, especially at sites featuring good growing conditions, thus representing a reverse J-shaped relationship between dbh and mortality probability. Furthermore, trees of short-lived species feature a distinctly higher background mortality probability than trees of long-living species (Bugmann 1996; Appendix S1: Fig. S8). Especially in the context of small trees, it has been questioned (Keane et al. 2001) whether species-specific differences in simulated mortality probabilities originate from processes related to small-scale disturbances (e.g., windthrow, lightning, or fungal attacks).
To test the implications of U-shaped vs. Jshaped mortality for long-term forest dynamics, we formulated an alternative theoretical background mortality (M 1 ) that is not constant throughout tree life but increases exponentially with dbh, based on a modified version of the relationship proposed by Manusch et al. (2012;Fig. 2; Appendix S1: Figs. S6-S8 and Eq. S10). Instead of an alternative theoretical mortality formulation, other types of formulations could have been used, as, for example, empirical formulations. Such formulations have however been explored in other recent studies (H€ ulsmann et al. 2018, Thrippleton et al. 2020. In this study, when comparing the alternative to the standard formulation, we expect the alternative formulation to yield more realistic forest structure for long-term simulations of unmanaged stands, where mortality of large trees is particularly relevant (Worrall and Harrington 1988, Nagel and Diaci 2006, Trotsiuk et al. 2012, Holzwarth et al. 2013, Hobi et al. 2015. In addition to the increasing mortality probability with tree size, the alternative formulation assumes a similar, speciesindependent background mortality probability for small trees. In line with empirical findings (Niinemets and Valladares 2006, Petritan et al. 2007, Collet et al. 2008, Holzwarth et al. 2013, species-specific differences in simulated mortality probabilities of small trees thus result primarily from the species-specific susceptibility to environmental stress instead of background mortality.
Light regime. -Light availability is one of the most important constraints on temperate forest dynamics, codetermining both tree growth and the abundance and composition of the regeneration layer (Bazzaz 1979, Canham et al. 1994, Kobe et al. 1995, Valladares and Niinemets 2008. Consequently, DVMs are highly sensitive to light-related processes (Leemans 1991, Beaudet et al. 2002, Pappas et al. 2013, Huber et al. 2018.
To represent the forest canopy, global DVMs have typically used a big-leaf approach that describes the entire canopy as one single vegetation element (Bonan 1996, Sellers et al. 1996. In contrast, FGMs have typically applied an opaque blanket approach (Kimmins 2004) at the level of the individual tree that assumes that all its leaves are located in an infinitesimally thin layer at the tree top and are spread evenly across the entire patch (typical size of 100-1000 m 2 ; Botkin et al. 1972b, Bugmann 2001. In contrast to the big-leaf approach, the opaque blanket canopy accounts for vertical canopy heterogeneity, yet it does not consider horizontal heterogeneity. Thus, it may be appropriate for spatially uniform forests composed of species that feature a plastic crown architecture and can grow toward areas with reduced competition in response to enhanced light availability (Canham 1988, Young and Hubbell 1991, Muth and Bazzaz 2003, Kimmins 2004). Yet, with all leaves being concentrated at the top of the tree, the assumption implies that trees of a certain height shade all smaller trees without being shaded by them at all (Fig. 3b), which leads to sharp asymmetry of competition (Bugmann 2001). This may be problematic particularly for (1) stands that feature small canopy gaps (Kimmins 2004) because in the opaque blanket approach, no direct light reaches the forest floor, or (2) stands that are composed of trees of similar height, such as even-aged stands because the few dominant trees (i.e., the tallest cohort; Assmann 1970) are assumed to fully shade all other trees. The resulting bias is expected to be most pronounced in conifer-dominated forests because conifers generally feature comparably narrow crowns (B€ usgen and M€ unch 1929, Burger 1952, Assmann 1970. Various approaches have been proposed to capture light attenuation more accurately, which come along with much larger complexity, for example, by modeling three-dimensional tree crowns (Pacala et al. 1996, K€ ohler and Huth 1998, Courbaud et al. 2003, Strigul et al. 2008. For ForClim, we decided to adhere to the original canopy representation, that is, applying the Lambert-Beer law (Monsi and Saeki 2005) as a function of leaf area index to simulate light availability across the canopy profile (Fig. 3b-c ;Bugmann 1996). However, with the aim of increasing realism while keeping complexity low, we relaxed the restrictive assumption of the horizontal homogeneity of tree crowns by simulating crown projection area (cpa) as a function of dbh (Pretzsch 2014). Thus, the fraction of the patch that is shaded by tree crowns is determined for each canopy layer. In the shaded part, the Lambert-Beer law is applied, whereas light penetrates to the lower layers without any extinction in the remainder of the patch (Fig. 3d). To simulate individual tree cpa, species-specific allometries (cpa = e a + dbh b ) were derived from long-term experimental plots (Pretzsch 2014). We used two parameterizations, resulting in the alternatives L 1 and L 2 , to assess model sensitivity to the additional species-specific parameters (a and b): While L 1 was based on the standard linear regression of the allometry, a 95% percentile regression was used for L 2 (cf. Pretzsch 2014; Appendix S1). Thus, L 2 represents an in-between formulation of the standard (L 0 ) and L 1 formulation because it distributes the leaves across wider Fig. 3. Light regime: (a) Schematic representation of an idealized forest patch, (b) opaque blanket representation of the canopy for this patch, whereby ForClim applies a cohort approach (i.e., trees of a species with identical age belong to one cohort), (c) standard light formulation L 0 , and (d) alternative formulations (L 1 and L 2 ) considering cpa. Note that for the alternatives L 1 and L 2 , two parameterizations were used (i.e., the standard and 95% regression allometry). The gray areas in (c) and (d) represent the fraction of the patch area that is filled with leaves, while the yellow arrow represents the light that is penetrating through the canopy. crowns than L 1 , but not necessarily across the whole patch. Note that total foliage area per tree and per patch is the same for all formulations; it is merely the horizontal pattern that differs.
Overall, the alternative formulations that take into account species-and diameter-specific cpa (L 1 and L 2 ) are expected to provide more realistic results of light penetration across the canopy and light availability at the forest floor.

Study sites and simulation protocol
Since the model's sensitivity to the different core processes, that is, tree establishment, growth, and mortality, and light extinction through the canopy, varies under different applications and environmental conditions (Huber et al. 2018), we assessed all model formulations at 16 sites along a cross-European gradient (Fig. 4) for both long-term unmanaged and short-term managed applications of mixed-species and monospecific stands, as described in detail below.
The long-term unmanaged simulations refer to natural forest dynamics excluding any management interventions. We used nine Swiss sites from a frequently applied environmental gradient (Bugmann and Solomon 2000, Didion et al. 2009b, Rasche et al. 2012, covering a wide range of temperature and drought conditions ( Fig. 4; Appendix S1: Table S5). For the climate data, long-term monthly means of temperature and precipitation sums, their standard deviation, and cross-correlation were calculated from an interpolated gridded database representing historical climate (resolution of 100 9 100 m; Thornton et al. 1997). Based on the long-term climate data, the weather generator of ForClim (Bugmann 1996) created monthly weather data for each of the 200 forest patches that were simulated per site. To analyze natural forest dynamics, we ran simulations for four applications differing in species composition (mixed vs. monospecific) and the exogenous disturbance regime, that is, frequency of major landscape-level disturbances (old-growth vs. shifting mosaic). For the mixedspecies simulations, all species were allowed to establish via natural regeneration, while for the monospecific simulations, the species set was restricted to the most dominant species of the Fig. 4. Study sites in environmental space spanned by mean annual temperature and mean annual precipitation sum. The biomes (boreal, temperate, and Mediterranean) are based on the Whittaker (1975) biomes modified by Ricklefs (2010). Europe-wide climate data were derived from WorldClim (resolution = 2.5'; extent: latitude 35°-80°, longitude À10°to 40°). Note that forests are not able to persist across the whole European temperatureprecipitation range but are replaced by grassland and shrubland under cold or hot and dry conditions, and that no Mediterranean tree species were considered in this study. expected potential natural vegetation (PNV; Appendix S1: Table S5). For the old-growth simulations, mortality due to major landscape-level (stand-replacing) disturbances was excluded, while we applied an average disturbance period of 300 yr (Schumacher 2004, Janda et al. 2017 for the shifting-mosaic simulations. For all long-term applications, we started the simulations from bare ground and ran them for 1500 yr (i.e., to a dynamic equilibrium state).
For the short-term managed simulations, we used seven sites featuring managed and predominantly even-aged planted forests of the PRO-FOUND reference dataset (Reyer et al. 2019;Fig. 4; Appendix S1: Table S6), which provides harmonized climate, soil, and forest inventory data. We initialized ForClim with the individual tree data from the first inventory and ran simulations to the last inventory (13-66 yr) using the measured weather data from local meteorological stations to drive the model. Mixed-species and monospecific simulations were run for two and five sites, respectively, according to the actual species composition of the respective site. Historical management was deduced from forest inventory data. Apart from management, no disturbances were considered.

POM approach: patterns and grading system
We adopted the POM approach to assess and simultaneously evaluate the standard and alternative formulations of the three processes establishment, growth, and mortality, and the light regime. POM seeks model versions that reproduce a diverse set of patterns, that is, characteristics of the complex system of interest. In the case of forests, such observable patterns may include tree number, size-class distribution, tree height, growth rates, basal area, biomass, species composition, or the amount and spatial distribution of dead wood (Wiegand et al. 2003, Grimm et al. 2005, Grimm and Railsback 2012. These patterns should emerge from the mechanisms believed to be important for the model's application, characterize the modeled system, and be observed at multiple levels and scales (Grimm and Railsback 2012).
We defined two comprehensive sets of patterns, that is, one for the long-term unmanaged and one for the short-term managed application (Table 3). For the long-term unmanaged applications, we derived site-specific lower and upper values for each pattern (referred to as plausible range) based on published data from old-growth forests across Europe and site-specific literature (for details, see Appendix S1: Table S7). We defined two sets of values, depending on the exogenous disturbance regime: For basal area (BA7), the share of giant trees (BATrs80), and the minimum age of trees of the mature timber stage (MinAge50), the ranges of values were assumed to be higher for the oldgrowth (no major landscape-level disturbances) than the shifting-mosaic applications, while for tree numbers (Trs and Trs7), it was vice versa. For the short-term managed application, inventory data were used for benchmarking (see Appendix S1: Table S6; Reyer et al. 2019).
To benchmark the 504 model versions and identify combinations of model structures performing particularly well or poorly, we applied a grading system. The performance of the model versions was first assessed for each pattern individually (pattern performance) by comparing the simulated with the expected/observed values at each site and for each application. Depending on the difference between simulated and expected/ observed values, the model versions gained or lost points for each pattern. In order to identify model versions with unacceptably low performance for at least one pattern, we applied a strong negative penalty that could not be compensated by good performance for other patterns. We applied the following grading system: (1) Each model version got +1 point for every pattern whose simulated value was within the specified site-specific plausible range (long-term simulations) or within AE20% of the observed value of the last inventory (short-term managed simulations); (2) each model version got a penalty of À10 points for every pattern whose simulated value was equal/larger than twice the value of the upper threshold or equal/smaller than half the value of the lower threshold (long-term simulations) or outside AE50% of the observed value (short-term managed simulations); and (3) 0, otherwise. For the dominant species (SpDom), each model version got +1 point if the simulated most dominant species corresponded to the expected most dominant species of the PNV (Appendix S1: Table S5). If this was not the case, 0 was assigned if the percentage of basal area (BA7) of the expected most dominant species of PNV was >5% of total BA7, and a penalty of À10 points was assigned otherwise because the species was either not able to establish at all or able to establish but not able to grow and persist.
The derived pattern performances were aggregated to a site performance per version and application by summing up the points assigned to each pattern. Subsequently, a version performance was derived for each model version and application by summing up the site performances of the respective model version across all sites. Based on the version performances, we finally derived three sets of model versions per application: (1) a set of the 10% most promising (i.e., best-performing) model versions; (2) a set of model versions performing better than the standard model (i.e., ForClim v3.3.4); and (3) a set of the 10% worst model versions, that is, those with the lowest performance. A sensitivity analysis confirmed that these sets were robust regarding different grading systems, especially regarding the penalty, and also robust when considering one or two criteria more or less (results not shown). Moreover, the correlations of the pattern performances were mostly weak (|r| < 0.3; maximum = 0.46), indicating that the patterns addressed different characteristics of the modeled system and thus were not redundant.

RESULTS
The best model version per application (i.e., across patterns and sites) always performed better than the standard model (Fig. 5), which was not included in the set of 10% best model versions of any application. The performance of the best model version for a specific application was generally lower when used for another application, and in some cases, version performance then was even below the performance of the standard model (Fig. 5). Seven model versions performed better than the standard model for all five applications (Appendix S2: Table S1), but none of these versions was included in all five sets of 10% best model versions.
The sets of 10% best model versions were predominantly composed of versions including more than one alternative process or parameter formulation (cf. PmAF in Fig. 6). Yet, depending on the application, the best model versions consisted of different combinations of standard and alternative formulations. For example, the alternative mortality formulation, assuming an increasing mortality hazard for large trees (M 1 ), was included in most of the best model versions of the two old-growth applications (Fig. 6a, c). Thus, in combination with certain formulations of the other processes (e.g., L 2 or an alternative formulation of establishment), M 1 matched the patterns used to assess model performance more often than the standard formulation. Yet, this was not the case for those applications featuring exogenous disturbances (Fig. 6b, d, e). While the combination of formulations for the 10% best model versions depended strongly on the application, the five sets of 10% worst model versions were structurally quite similar (Appendix S2: Fig. S1): Almost all of them contained one of the alternative establishment formulations, which were most often combined with L 1 and M 0 . Therefore, the combination of the alternative establishment with L 1 was problematic because L 1 strongly overestimated light availability (cf. Results: Light regime), inducing high establishment rates in the alternative formulations for establishment, which had been set up to relax the recruitment limitation at the level of the individual species.
Across all sites and applications, 18-77% of the simulations passed the individual patterns; that is, simulated values were within a plausible range (Appendix S2: Fig. S2). The percentage of simulations penalized was small for most Fig. 5. Relative version performances for each application (n = 504 in each box plot). Relative version performances were derived by dividing version performances (see POM approach: patterns and grading system) by the maximum number of points the model versions could achieve per application (i.e., a relative performance of 1 indicates that all patterns were met at all sites). Abbreviations are OG, old-growth; SM, shifting mosaic; mono, monospecific; and MAN, short-term managed. For mono and mixed OG, the same model version performed best (resulting in only four-instead of five-blue asterisks per application). Further, some of the seven versions performing better than the standard model for all applications (red open circles) achieved nearly identical performances for certain applications, and are thus not discernible in the figure. patterns (0-31%; mean = 8.6%), with basal area and tree numbers (Trs7 and TrsPole) representing the strictest patterns (Appendix S2: Fig. S2; 4-19%, 0-25%, and 12-31% of the simulations were penalized for BA7, Trs7, and TrsPole, respectively). For the long-term applications, the patterns were matched more frequently at sites with good growing conditions (i.e., Huttwil, Bern, Schaffhausen and Basel), while cold and dry sites (i.e., Bever, Grande Dixence, Davos, Adelboden and Sion) obtained more penalties (Appendix S2: Fig. S3a-d; 4-24% (min-max) vs. 1-5% of the simulations were penalized at cold and dry sites vs. sites with good growing conditions, respectively). For the short-term managed application, most of the penalties occurred at the two Solling sites and at Peitz (i.e., 39%, 21%, and 15% of the simulations were penalized for Solling (Beech), Solling (Spruce), and Peitz, respectively; Appendix S2: Fig. S3e).

Establishment
Assumptions regarding tree establishment were relevant to all long-term applications (Fig. 6a-d), but not to short-term simulations of managed stands (Fig. 6e). For the long-term Fig. 6. Combination of model formulations that constitute the 10% best model versions for each model application. If a process is not relevant to a specific application or if the standard and alternative formulations perform equally well, we expect a 50:50 distribution and 33:33:33 distribution for 2 and 3 formulations, respectively. Abbreviations are E, establishment; G, growth-rate parameter (kG); R, growth-reduction factor (GRF); A, allocation of volume growth to height and diameter growth; M, mortality; and L, light regime. For establishment, all alternative formulations were summarized as Alternative 1. PmAF denotes the percentage of model versions including more than one alternative process or parameter representation. For a description of the standard and alternative formulations, see Table 2. applications, the alternative establishment routine (including E 1 -E 6 ) relaxing the recruitment limitation at the level of individual species and considering the SNSS trade-off was included distinctly more often in the 10% best model versions than the standard formulation (E 0 ; Fig. 6a-d).
Relaxing the assumption of recruitment limitation affected simulated patterns more strongly for the monospecific than the mixed-species applications (Fig. 7) by increasing tree numbers (Fig. 7b) and thus basal area (Fig. 7a), which often led to an overestimation of these two patterns. Moreover, relaxing the assumption of Fig. 7. Performance of selected patterns of the standard (E 0 ) and alternative (E 1-6 ) establishment formulations for the long-term applications (see Appendix S2: Figs. S4, S5 for the short-term managed application and the full set of patterns, respectively). Note the modulus-transformed y-axis (John and Draper 1980), where the transformation represents a modification of the log transformation that however preserves zero and the sign of the data (i.e., transforms both negative and positive values). The lower and upper values of the plausible ranges defined for each site and pattern were scaled to À0.2 and +0.2, respectively, to allow for a comparison across sites. High positive or negative values represent simulations that strongly over-or underestimate the value of the respective pattern. (a) BA7, basal area of all trees with dbh ≥7 cm; (b) Trs7, tree number of all trees with dbh ≥7 cm; (c) Min-Age50, minimum age of the mature timber stage; and (d) SpDom, percentage of BA7 of the expected most dominant species of PNV on total BA7. For SpDom, the pattern performance is shown: 1, the simulated most dominant species corresponded to the expected most dominant species of the PNV; À10, the percentage of BA7 of the expected most dominant species of PNV was <5% of total BA7; and 0, else. For a description of the standard and alternative formulations, see Tables 1, 2. Fig. 8. Performance of selected patterns of the growth-rate parameter (panels a-d), growth-reduction factor (panels e-h), and background mortality (panels i-l) for the long-term applications. Other details are as in Fig. 7. BA7, basal area of all trees with dbh ≥7 cm; Trs7, tree number of all trees with dbh ≥7 cm; BATrs80, percentage of BA7 of giant trees (trees with dbh ≥80 cm); MinAge50, minimum age of the mature timber stage; and TBA50, time to reach 50% of BA7. For a description of the standard (referred to with 0 ) and alternative formulations (referred to with 1 ), see Table 2. See Appendix S2 for the short-term managed application and the full set of patterns, respectively. recruitment limitation increased the minimum age of trees of the mature timber stage (Min-Age50; Fig. 7c) due to increased competition, and it speeded up basal area growth during early succession. However, due to the increase in equilibrium basal area, this effect was blurred in the criterion TBA50 representing the time to reach 50% of the equilibrium basal area (Appendix S2: Fig. S5f).
With regard to species composition, the alternative establishment routine passed the SpDom criterion more often than the standard formulation (Fig. 7d) because the expected dominant species of PNV are mostly shade-tolerant, which were thus strongly recruitment-limited under the standard formulation. Moreover, the alternative routine is not sensitive to the number of species participating in the succession; that is, differences in basal area and tree numbers between mixed and monospecific simulations were lower (Fig. 7a, b).
With regard to the patterns analyzed, differences in the performance were most pronounced between the standard and all alternative formulations, but rather small among the six alternatives ( Fig. 7a-d; Appendix S2: Fig. S5). Within the alternative formulations, two groups were evident, with E 1 and E 2 behaving slightly differently than E 3-6 , especially with regard to tree numbers (Fig. 7b): Tree numbers were higher and thus more often overestimated for E 1 and E 2 than for those alternatives considering the actual establishment conditions when calculating maximum sapling number (E 3-6 ).

Growth
We found diverging patterns for the growthrelated processes across the five applications (Fig. 6). In general, the parameterization of the growth rate (kG) had the strongest effect on the emerging ecological patterns, followed by the GRF.
Growth-rate parameter (kG).-While the alternative parameterization of the growth rate (kG) derived from tree-ring data was clearly preferred over the standard parameterization for the shortterm managed applications (Fig. 6e), its effect on model performance was more complex for the long-term applications (Fig. 6a-d).
On the one hand, higher maximum growth more often overestimated basal area (BA7; Fig. 8a) and particularly the share of giant trees (BATrs80; Appendix S2: Fig. S6e) because trees grew faster and thus reached higher dbh. On the other hand, increased maximum growth accelerated basal area growth and reduced tree numbers (due to stronger competition), which in turn improved the patterns for tree numbers (Trs7; Fig. 8b), the minimum age of the mature timber stage (MinAge50; Fig. 8c), and early successional behavior (TBA50; Fig. 8d).
Growth-reduction factor.-The alternative GRF was included more often in the 10% best model versions than the standard GRF, especially for the long-term mixed-species applications (Fig. 6). It decreased basal area and the share of giant trees, two patterns that were matched more often by the alternative than the standard GRF ( Fig. 8e-f). It further increased the age of the mature timber stage (Fig. 8g) and retarded basal area growth (Fig. 8h), two patterns that were not matched more often by the alternative than the standard GRF, however.
Allocation of volume growth to height and diameter growth. -None of the formulations allocating volume growth to height and diameter was preferred in all five applications (Fig. 6). While all three formulations appeared about equally often among the best versions in the long-term mixedspecies applications (Fig. 6a-b), A 0 and A 1 were typically preferred for the short-term managed (Fig. 6e) and the long-term monospecific applications (Fig. 6c, d), respectively.
Even though the formulations allocating volume growth to height and diameter affect dbh and thus basal area directly, the differences between the formulations were blurred in the aggregated results due to species-specific lightdependent growth patterns (Appendix S2: Figs. S4, S8). Yet, analyzing the predicted dominant height (HDom) of individual species in short-term managed simulations (Fig. 9) revealed that considering the species' shade tolerance (formulations A 0 and A 2 ) resulted in better predictions. This was especially the case for stands dominated by Norway spruce (Picea abies; Bily Kriz, Solling Spruce, Kroof) or Scots pine (Pinus sylvestris; Hyyti€ al€ a), while stands dominated by European beech (Fagus sylvatica; Solling) benefited from the A 2 formulation, which relaxed the high priority of allocation into diameter. It should be also noted that the drought-related reduction of site-specific maximum tree height (Rasche et al. 2012) resulted in very low HDom in two sites (Peitz and Sorø).

Mortality
For the old-growth applications (Fig. 6a, c), the U-shaped mortality formulation (M 1 ) was distinctly preferred over the reverse J-shaped formulation (M 0 ). It reduced the number of giant trees compared to M 0 and thus basal area, which in turn reduced the time needed to reach half of the equilibrium basal area. Thus, compared to M 0 , M 1 improved the representation of several patterns (BA7, BATrs80, and TBA50; Fig. 8i, k, l) and prevented the model from simulating excessively high absolute values of basal area (results not shown).
In contrast, M 0 appeared slightly more often in the best model versions of the mixed shiftingmosaic and the short-term managed applications (Fig. 6b, e). In case of the mixed-shifting-mosaic application, differences with regard to the patterns analyzed were rather small between the two mortality formulations (except for the share of giant trees; Fig. 8k). On the contrary, the mortality formulations strongly influenced basal area and tree numbers of the short-term managed application (Appendix S2: Fig. S4). This is because M 1 features a lower mortality probability compared to M 0 for the small-to-medium dbh classes ( Fig. 2; Appendix S1: Fig. S9), which are quite important in managed stands. Thus, compared to M 0 , M 1 allows more trees to survive as long as they are not subject to the stress-induced mortality. This often resulted in a better estimate of basal area, which however came at the cost of overestimated tree numbers (Appendix S2: Fig. S4f), which in turn explains why M 0 appeared more often in the 10% best model versions of the shortterm managed application than M 1 .

Light regime
Assumptions on the representation of vertical light attenuation across the canopy strongly Fig. 9. Observed (red asterisks) and predicted (box plot, n = 168 per plot) dominant heights (HDom) for the three formulations (A 0 , A 1 , and A 2 ) allocating volume growth to height and diameter growth at the short-term managed sites for the last inventory (except for Kroof, where the first inventory was used). Abbreviations are PAbi, Picea abies; PSyl, Pinus sylvestris; and FSyl, Fagus sylvatica. For a description of the formulations and the study sites, see Methods and Appendix S1, respectively. Note that the model was initialized based on the dbh values of the individual tree inventories, but not based on observed height. For a description of the standard and alternative formulations, see Table 2. influenced simulated forest structure (Appendix S2: Fig. S10), yet no formulation was clearly preferred over the others across all applications (Fig. 6). Of the two alternative formulations considering cpa, L 2 met most patterns distinctly more often than L 1 (Appendix S2: Fig. S10a-c). Thus, L 1 hardly appeared in the 10% best model versions of any application (Fig. 6), while L 2 was preferred over L 0 , which assumes an opaque blanket canopy, for three of the four long-term applications (Fig. 6a-c).
Yet, a supplementary comparison of the underlying simulated relative light availability across the canopy with empirical data (Fig. 10) revealed that L 0 yielded the most realistic light extinction, while light availability at the forest floor was strongly overestimated for the two alternatives, although they consider cpa (L 1 and L 2 ).

DISCUSSION
We scrutinized the representation of key ecological processes in dynamic forest models to revisit, assess, and improve their formulation and the internal consistency of the models as a whole, taking ForClim as a case study. We developed a set of alternative process and parameter representations for the three core processes of establishment, growth, and mortality of trees, and for the light regime as the central limiting resource in forests based on a diverse array of empirical data. Below, the results are discussed for each process individually and from an overall modeling point of view.

Establishment
Our results demonstrate that the assumptions underlying tree establishment are of key importance for shaping forest dynamics under any circumstances, except for short-term simulations (<50 yr) of managed, even-aged stands. Therefore, the alternative routine relaxing the assumption of recruitment limitation (Clark et al. 1999) at the level of individual species and considering the SNSS trade-off (Coomes and Grubb 2003) is clearly preferable over the standard formulation. The bars represent diameter distribution (referring to right y-axis). Note that the amount of leaves (foliage weight and foliage area) was the same for all formulations. Empirically derived relative light intensity at forest floor was found to be 2-15% and 2-8% for mature European beech (BA >20 m 2 ; Szwagrzyk et al. 2001Szwagrzyk et al. , G alhidy et al. 2005 and Norway spruce (BA >35 m 2 ; V ezina 1961, Mitscherlich et al. 1967, Johansson 1987) stands, respectively.
Although often overestimating basal area and tree numbers, especially in combination with increased light availability (L 1 ), the alternative establishment routine was able to simultaneously match multiple patterns, such as tree numbers, the share of giant trees, basal area growth during early succession, or species composition. The former three patterns were particularly improved for monospecific stands (i.e., stands where no successional species replacement took place), while the latter was exclusively related to mixedspecies stands, where the alternative routine resulted in a higher share of late-successional species with respect to both tree number and stand basal area. This change in patterns is consistent with known properties of forest dynamics, as shade-tolerant species contribute large shares of stand basal area in many old-growth forests (Bolte et al. 2014, Hobi et al. 2015, while they were clearly underestimated by the standard formulation. Moreover, the alternative routine was distinctly less sensitive to the number of species participating in the succession, and thus was able to reproduce observed regeneration patterns better than the standard formulation (Collet et al. 2008, Petritan et al. 2013, B ılek et al. 2014, Bolte et al. 2014, Brang et al. 2015, Rozman et al. 2015. Overall, our results suggest that assuming recruitment limitation for shade-tolerant species in contrast to shade-intolerant species does not accurately represent long-term forest dynamics, particularly in favorable habitats. Among the alternative establishment formulations (E 1 -E 6 ), the assumptions on total sapling density (i.e., option b; Table 1) influenced forest structure more strongly than those on the allocation of saplings to individual species (i.e., option c; Table 1). This is in line with empirical findings indicating that forest structure is strongly determined by general establishment conditions (Germino et al. 2002, Kuijper et al. 2010, Castanha et al. 2013, Rozman et al. 2015 and that competition among saplings is one of the most important filters determining the later dominance by a few species in the upper canopy, despite very high species diversity in the seedling layer (Collet et al. 2008, Petritan et al. 2009, Hobi et al. 2015. Thus, we recommend to focus further research efforts on better understanding the processes that determine overall sapling density (Huber et al. 2018) rather than the actual species composition of the regeneration layer. In a modeling context, such efforts should not be undertaken independently of the remainder of the model. In particular, shading by the overstory should be considered and verified simultaneously, since light conditions strongly determine the abundance and also the composition of tree regeneration.

Growth
We revisited the growth formulation of For-Clim by re-parameterizing the maximum growth rate (kG), by developing an alternative for the function that combines multiple environmental constraints into one factor reducing optimum growth (GRF), and by introducing two alternative formulations for the allocation of volume growth to height and diameter growth. On the one hand, our results underline the importance of growth formulations in dynamic forest models. On the other hand, they clearly demonstrate that substantial improvements in model performance across multiple patterns can be achieved by small, data-or theory-based changes to the formulations that do not necessitate adding structural complexity.
Of the growth-related process representations, the parameterization of the growth rate (kG) had the strongest effect on the emergence of ecological patterns. The alternative parameterization derived from a large set of individual tree-ring chronologies was included more often in the best model versions than the standard parameterization. Although the database is not representative for all growing conditions (Klesse et al. 2018), the alternative kG produces considerably higher diameter increments than the standard parameterization, consistent with the empirical method by Bragg (2001) for estimating size-specific optimal growth. This suggests that the standard parameterization underestimates individual tree growth, especially under favorable conditions. Unexpectedly, the analysis of the maximum absolute increments suggested that temperate European tree species form just two groups of different maximum growth potential, with only minor differences between the species within a group: Tall and typically single-stemmed species are separated from medium-sized or short species forming one or several stems. This suggests that trees growing under optimum conditions face fundamental physiological and climatic constraints, such as growing season length, as well as allocation trade-offs, which constrain maximum diameter growth across species. Yet, different size-specific optimum growth patterns across species arise when the alternative kG values are implemented in ForClim's growth equation (Eq. 1) due to species-specific allocation strategies and maximum height. This is in line with empirically derived optimum tree growth (Bragg 2001), although further investigations regarding simulated size-specific optimum and realized growth would be useful. Still, we are confident that the alternative kG provides a more accurate characterization of optimal radial growth rates that is, after all, empirically based rather than relying on theoretical, qualitative reasoning (Botkin et al. 1972a).
The alternative GRF, which has a stronger effect under suboptimal conditions than the standard GRF, was found to be more appropriate especially in combination with the alternative kG parameterization. Yet, our results suggest that the growth of trees that are constrained by one environmental factor only (e.g., suppressed trees growing under low-light conditions in an otherwise favorable habitat) may be overestimated, which results in an underestimation of stress-induced mortality. Thus, further investigations are needed regarding the growth-reducing factors and their combination into one single annual scalar and the variability of dbh increments between individual trees, years, or sites, which is known to be substantial (Bleicher 2013, Weber et al. 2013, for example, by exploiting inventory and tree-ring data along environmental gradients (Weber et al. 2013, Ojeda et al. 2018. The representation of the allocation of volume growth to height and diameter growth influences intra-and interspecific competition and has a direct effect on basal area. Among the three formulations, A 2 , which assumes a weaker dependency on shade tolerance than posited by the standard formulation (Rasche et al. 2012), turned out to simulate species-specific height growth most appropriately. The assumption that interspecific differences regarding the allocation of growth to height and diameter are related to shade tolerance is supported by recent empirical studies (Forrester et al. 2017), yet using wood density instead of shade tolerance may be more robust as it can be quantified better (Forrester et al. 2017) and does not need a discretization into tolerance classes. Nevertheless, more detailed analyses and comparisons with empirical data of different species growing under various light conditions are needed, especially in the context of mixed-species stands (Petritan et al. 2009, Forrester et al. 2017).

Mortality
We tested two formulations of the background mortality that, in combination with the stress-induced mortality, either represent a reverse Jshaped or a U-shaped relationship between dbh and mortality probability. For the old-growth simulations, the U-shaped mortality formulation was clearly preferred over the reverse J-shaped one, suggesting that at some point, large trees become susceptible to additional mortality agents (Canham et al. 2001, Nagel and Diaci 2006, Perivancich 2010, Trotsiuk et al. 2012, Holzwarth et al. 2013. This is in line with ecological theory postulating high competition-induced mortality of small trees and amplified mortality of large trees due to their exposure to a high number of mortality agents (Goff and West 1975, Lorimer and Frelich 1984, Franklin et al. 1987, although the empirical evidence for an increasing mortality probability for larger diameters is not conclusive (Harcombe 1987, Monserud and Sterba 1999, Holzwarth et al. 2013, Ruiz-Benito et al. 2013, H€ ulsmann et al. 2016. Unfortunately, even though this pattern may be present in reality it is often not detectable due to the comparably low number of large trees in almost any dataset (Holzwarth et al. 2013(Holzwarth et al. , H€ ulsmann et al. 2016) and the possibly long right tail of the dbh distribution.
Furthermore, in contrast to the reverse J-shaped formulation, the U-shaped formulation is in line with the within-species trade-off between growth rates and life span (Loehle 1987, Black et al. 2008, Bigler and Veblen 2009, R€ otheli et al. 2012, Bigler 2016. Under the U-shaped formulation, a slowly growing tree may survive longer than a rapidly growing tree (provided that it does not die due to stress-induced mortality), as the latter is subject to the increasing background mortality probability at younger age than the former. Thus, overall, our results indicate that a U-shaped relationship better represents long-term forest dynamics than a reverse J-shaped relationship.
However, the mortality probability of small trees may be underestimated by the combination of a stress-induced and an exponentially increasing background mortality as formulated here, for two reasons: First, the alternative formulation for the background mortality assumes that the mortality probability increases exponentially with dbh, starting from a baseline of zero (Fig. 2). A mortality baseline larger than zero may be more appropriate to account for the death of small trees due to multiple causes not accounted for in the model, such as fraying by ungulates (Kupferschmid and Bugmann 2005) or falling boles (Holzwarth et al. 2013); and second, the stress-induced mortality is likely underestimated for suppressed trees growing under low-light but otherwise good growing conditions (cf. previous discussion on GRF). Hence, these issues likely explain why the reverse J-shaped mortality formulation was preferred over the U-shaped formulation for some applications considering exogenous disturbances (e.g., management).

Light regime
Model outcomes were highly sensitive to the assumptions regarding the representation of light attenuation through the canopy, which is a pivotal process determining the living conditions of sub-canopy trees (Feldmann et al. 2018, Rutenbeck et al. 2018. With the two alternative formulations that consider the horizontal extent of tree crowns, we expected to improve the accuracy of the estimation of light availability for sub-canopy trees. Yet, these alternative representations did not perform consistently better than the simple standard formulation assuming an opaque blanket canopy, at least not for the right reason, but they rather led to a considerable overestimation of light availability. In fact, even though one of the alternative light representations (L 2 , simulating wider crowns than L 1 ) was often found among the sets of best model versions, this good performance needs to be interpreted cautiously because it is due to the compensation of structural and parameter-related deficiencies in other parts of the model: The simulated higher light availability compared to the standard formulation increases diameter growth and sapling numbers, which tend to be underestimated by the standard model (cf. discussion sections Establishment and Growth). Thus, for this specific example, introducing more alleged realism did not improve internal model consistency but rather compensated for other shortcomings (getting the right answer for the wrong reason).
Assuming that tree crowns fill the horizontal space across a patch (L 0 ) leads to a more efficient light absorption than assigning the leaves to individual tree crowns (L 1 and L 2 ) because leaf overlap in the vertical profile is minimized. Despite the simplicity of this approach, it thus considers optimized resource-use efficiency of plants. This is ecologically reasonable because trees feature numerous strategies to avoid self-shading and improve light absorption (e.g., Valladares et al. 2002), such as leaf orientation (e.g., Pearcy and Yang 1996), leaf form, size, and morphology (Kimmins 2004), branching patterns (e.g., Yang 1996, Valladares andPearcy 1998), and phototropism (e.g., Canham 1988, Young and Hubbell 1991, Muth and Bazzaz 2003. Improving the representation of light attenuation in the canopy would most likely require a more detailed representation of crown structure considering the vertical crown extension as well (K€ ohler andHuth 1998, Goreaud et al. 2006). Furthermore, the estimation of crown projection area (cpa) could be improved by considering the geometry of stand structure and species composition to account for crown plasticity (Pretzsch 2014, Forrester et al. 2017, in spite of the fact that species-specific cpa is primarily related to dbh (Forrester et al. 2017). In particular, cpa tends to be underestimated for large trees (particularly for European beech; Appendix S1: Fig. S9), which are absorbing most of the incident light (Binkley et al. 2013). Moreover, the model ignores light absorption by stems, branches, and other structures (Vose et al. 1995, Binkley et al. 2013), which is not or only partly incorporated even in highly complex three-dimensional crown models (Canham et al. 1994, Pacala et al. 1996, Courbaud et al. 2003. Including a wood area index to account for the effect of woody parts on light absorption is likely to be promising (Dufrêne andBr eda 1995, Propastin andPanferov 2013).
Using POM to improve our understanding of forest dynamics Dynamic vegetation models are key tools to improve the mechanistic understanding of the ❖ www.esajournals.org factors determining forest ecosystem dynamics, and to project potential future forest trajectories (Shugart et al. 2018). Rather than developing one single set of model equations that represents a specific hypothesis of forest dynamics (Shugart 1984), we applied POM to test a large set of competing, similarly plausible process and parameter representations that were treated as distinct hypotheses regarding the quantification of the underlying ecological processes. Therefore, POM enabled us to revisit the model's core processes jointly for different model applications and, importantly, to study the interactions between these processes.
Depending on the application, distinct process formulations performed best. This points at varying influences of ecological processes with stand structure and site conditions as also found by studies analyzing the sensitivity of DVMs through time, along ecological gradients, or with differing climate change scenarios (Pappas et al. 2013, Huber et al. 2018, Collalti et al. 2019). The short-term managed application was mainly sensitive to growth-related process formulations, which is little surprising (Pretzsch 2009). In contrast, long-term unmanaged applications were very sensitive to the formulations of mortality and establishment. A better understanding of mortality has been repeatedly claimed to be pivotal for improving the robustness of projections of future forest dynamics , Ammer et al. 2018, Vanoni et al. 2019.
The sensitivity of future projections to the formulation of natural regeneration processes, however, has not been picked up by the broad forest modeling community so far (but see Lischke and L€ offler 2006, Wehrli et al. 2007, Seidl et al. 2012). Yet, a better understanding of establishment-related processes is clearly of utmost importance for more robust projections of future forest dynamics, as underlined by our simulation results. This is particularly the case in view of climate-induced changes of disturbance regimes , as establishment strongly determines post-disturbance forest recovery and species range shifts. An increasing number of empirical studies on postdisturbance dynamics of forest ecosystems provide insights into the interactions of overstory mortality and tree regeneration (Brang et al. 2015, Redmond and Kelsey 2018, Roccaforte et al. 2018, Fettig et al. 2019, Tard os et al. 2019). This continuously increasing empirical database can provide the foundation for further POM-based model development, as done in the present study, thus increasing our understanding of long-term forest dynamics.
Yet, we would like to stress that the behavior of process formulations (based on theoretical considerations) should always be accompanied by a thorough check of variables at intermediate scales where this is feasible, rather than focusing on properties emerging at the stand scale only. For example, accounting for the horizontal extent of tree crowns in the representation of the light regime requires to not only focus on the resulting stand structure, but more importantly to also analyze the underlying light availability profile. Otherwise, there is a risk of accepting an implementation that may be erroneous but compensates for shortcomings in the formulation of other processes.

CONCLUSIONS
We applied POM to simultaneously assess the representation of the main ecological processes in a dynamic forest model, that is, tree establishment, growth, and mortality, and light extinction through the canopy. We scrutinized multiple ecological assumptions underlying the model, which partly date back to the first generation of FGMs and have often been developed in the absence of empirical data. We confronted these standard assumptions with revised, data-constrained alternative formulations, aiming to rigorously test the representation of core processes underlying forest dynamics and the consistency of their interplay.
By confronting individual process formulations and multiple model outputs with data, respectively, we identified structural shortcomings and parameterization issues in the representation of the core processes. Furthermore, we derived alternative formulations, most of which were able to resolve deficiencies in model behavior without significantly increasing the model's complexity. In particular, our results demonstrate the huge potential of large datasets (e.g., International Tree-Ring Data Bank [ITRDB] or PRO-FOUND database; Reyer et al. 2019) for refining process-based formulations that have been partly developed completely data free into data-informed process-based formulations while achieving a better understanding of the nature of the core processes underlying long-term forest dynamics.
Further, we showed that simultaneously considering multiple core processes is key for revealing internal inconsistencies in the model framework. Moreover, we were able to highlight the model's sensitivity to different processes and thus its dependency on the simulation context (e.g., temporal scale, forest composition, or disturbance regime). This challenges the traditional focus on simplifying the daunting complexity of model development by improving a single process in isolation. Although conceptually appealing, focusing model development on a single process in isolation comes at the high risk of running into a dead end, where an improvement may come at the detriment in another field of model application. Hence, in line with other studies, we conclude that the behavior and performance of complex models should not be analyzed for few specific applications only, but for multiple applications, system states and dynamics of interest across a wide range of site conditions. Using such a comprehensive approach, we can build structurally realistic models, reduce parameter uncertainty, avoid overfitting, and bypass the need of site-specific model calibration prior to application. We thus conclude that the forest ecology community should make good use of the ever-increasing data availability and the POM framework to challenge the core processes of dynamic models in a holistic manner.
In view of the many uncertainties regarding appropriate process representations and model parameterization, we suggest that ecologists should abandon the use of one single model version for doing simulation studies. A more promising approach may be offered by developing and using a set of meaningful alternative model versions (ensemble modeling; Dormann et al. 2018) and rigorously confronting them with multiple data sources and applications. Although followup studies are needed, we have shown that this approach has a high potential to deliver conceptual insights on the internal consistency of a model while at the same time improving our understanding of core ecological processes from an ecosystem perspective, and also enabling more robust projections of future forest dynamics.

ACKNOWLEDGMENTS
We are grateful to all data providers: We thank Peter Brang, Meinrad Abegg, and Hans Pretzsch for sharing long-term data of regeneration on storm areas, of individual tree mortality from close-to-nature and unmanaged plots of the Swiss National Forest Inventory, and of species-specific allometries of crown projection area, respectively. Further, we would like to thank the contributors of the International Tree-Ring Data Bank (ITRDB) and Christopher Reyer and the contributors of the reference dataset assembled by TG2 in the context of the COST Action PROFOUND ("Towards robust PROjections of European Forests UNDer climate change"). Moreover, we thank Andreas Rudow for his expertise when revising the species-specific parameters, and Dominic Michel for his support in all IT-related questions. Furthermore, we thank Maxime Cailleret, Gunnar Petter, Martin Br€ ullhardt, and Timothy Thrippleton and two anonymous reviewers for their helpful input. This work was partially funded by the Swiss State Secretariat for Education, Research and Innovation SERI (grant no. C14.0046) in the context of the COST Action PROFOUND, and by the EU Project FP7 IMPRESSIONS (grant no. 603416).