Soil respiration in a northeastern US temperate forest : a 22-year synthesis

To better understand how forest management, phenology, vegetation type, and actual and simulated climatic change affect seasonal and inter-annual variations in soil respiration (R-s), we analyzed m ...

R s was highly variable at all spatial (respiration collar to forest stand) and temporal (minutes to years) scales of measurement. The response of R s to experimental manipulations mimicking aspects of global change or aimed at partitioning R s into component fluxes ranged from −70% to +52%. The response appears to arise from variations in substrate availability induced by changes in the size of soil C pools and of belowground C fluxes or in environmental conditions. In some cases (e.g., logging, warming), the effect of experimental manipulations on R s was transient, but in other cases the time series were not long enough to rule out long-term changes in respiration rates. Inter-annual variations in weather and phenology induced variation among annual R s estimates of a magnitude similar to that of other drivers of global change (i.e., invasive insects, forest management practices, N deposition). At both eddy-covariance sites, aboveground respiration dominated R e early in the growing season, whereas belowground respiration dominated later. Unusual aboveground respiration patterns-high apparent rates of respiration during winter and very low rates in mid-to-late summer-at the Environmental Measurement Site suggest either bias in R s and R e estimates caused by differences in the spatial scale of processes influencing fluxes, or that additional research on the hard-to-measure fluxes (e.g., wintertime R s , unaccounted losses of CO 2 from eddy covariance sites), daytime and

INTRODUCTION
Terrestrial ecosystems exchange ~120 gigatons of carbon (Gt C) with the atmosphere annually through photosynthesis and respiration (R e ), equivalent to one-sixth of all C present in the atmosphere, making R e one of the largest fluxes in the global C cycle (Prentice et al. 2001). R e is dominated by soil respiration (R s ), the sum of belowground autotrophic (roots and associated mycorrhizae) and heterotrophic (mainly microbes, microfauna, and mesofauna) respiration. Estimates of global R s range from 68 to 98 Gt C yr -1 (Raich and Schlesinger 1992, Schlesinger and Andrews 2000, Bond-Lamberty and Thomson 2010a, or about two-thirds of all of the C emitted to the atmosphere by terrestrial ecosystems. The amount of C emitted through R s is ~10 times more than that released through fossil fuel combustion and cement manufacturing (IPCC 2007, Peters et al. 2012, although, for the most part, R s is closely coupled to a large photosynthetic uptake, leading to a much smaller net C exchange with the atmosphere (Schlesinger and Andrews 2000). R s varies substantially across space and time (Norman et al. 1997, Rayment and Jarvis 2000, Drewitt et al. 2002, implying that long-term measurements over a large area are required to constrain flux estimates. Multiple environmental factors affect R s . For example, R s exhibits a seasonal pattern that is generally positively correlated with temperature (Davidson and Janssens 2006). R s peaks under optimal soil moisture conditions and becomes depressed in soils that are too wet or too dry . Nitrogen additions can reduce R s , in part because they cause declines in plant belowground C allocation (Janssens et al. 2010). Also, N additions can decrease microbial respiration by inhibiting lignolytic enzyme activity (Berg and Matzner 1997).
Alternatively, where plant photosynthesis is strongly limited by low N availability, N additions can lead indirectly to increased R s (Janssens et al. 2010) by increasing root respiration and 7 organic matter production that fuels litter (leaves and roots) decomposition. The availability and chemistry of carbon substrates also influence the apparent temperature sensitivity of R s Janssens 2006, Gershenson et al. 2009). Furthermore, environmental drivers influence the residence time of C, and a change in the drivers can induce a transient change in R s as the carbon pool adjusts to a new steady state (e.g., Bradford et al. 2008). R s also varies as a function of biotic drivers, including vegetation type (Raich and Tufekcioglu 2000, Hibbard et al. 2005, Roehm 2005) and phenology (Curiel Yuste et al. 2004).
Both of these are related to photosynthesis, which has an important effect on R s because large amounts of photosynthates (C compounds) are allocated to roots and associated mycorrhizal fungi (Högberg et al. 2001, Janssens et al. 2001, Tang et al. 2005a, Drake et al. 2012, Hopkins et al. 2013, Savage et al. 2013. The efflux of low-molecular-weight organic compounds from roots (via rhizodeposition and root exudation) also impacts R s by enhancing microbial activity and soil organic matter decomposition Cheng 2007, Kuzyakov 2010).
The Harvard Forest, located in north-central Massachusetts, USA, is one of the most intensively studied forests in the world (Foster and Aber 2004). In particular, carbon cycling has been extensively studied: more than 100,000 measurements of R s have been made during the 25 years of Harvard Forest's involvement in the Long Term Ecological Research (LTER) program. R s has been measured in several different forest types and in response to 15 experimental manipulations simulating various aspects of global change. The Harvard Forest is also home to the world's longest-running eddy covariance (EC) system, measuring whole-ecosystem CO 2 exchange in a deciduous forest-the Environmental Measurement Site (EMS)-and a second EC site located in a mature hemlock stand. The rich datasets provide a unique opportunity to synthesize diverse data sources into a better understanding of how actual climatic change, forest 8 management, phenology, vegetation type, and simulated global change together affect seasonal and inter-annual variations in R s . We also compared the data on R s to tower-based estimates of R e to examine seasonal variations in the partitioning of above-vs. belowground respiration in mature hardwood and hemlock forests.

Site Description
The Harvard Forest is a 1200-hectare LTER site located in Petersham, Massachusetts, USA ( Fig. 1). Elevation ranges from 220 m to 410 m above sea level. Mean annual air temperature (1964-2010) is 7.5°C; January is the coldest month (−6.1°C) and July and August, the warmest (20.1°C and 19.3°C, respectively). Precipitation (rain and snow water equivalent) averages 1119 mm yr -1 and is well distributed throughout the year. Background nitrogen deposition is 0.66 g m -2 yr -1 (Munger et al. 1998). Throughout Harvard Forest, soils are predominantly Typic Dystrochrepts-sandy loams overlying a glacial till. Poorly drained forested swamps are also found in some areas; in the well-surveyed Prospect Hill Tract, about 3% of the surface area is covered by peat deposits and 22% is poorly or very poorly drained (Foster and Motzkin 2003). Because of the presence of glacial till, rocks are ubiquitous, covering 7.2% of the surface area (Foster and Motzkin 2003). Rocks also represent up to 25.8% of the soil volume from 0-to 50-cm depth in the Prospect Hill Tract (Raymer et al. 2013). The most common dominant tree species are red maple (Acer rubrum L.), red oak (Quercus rubra L.), and eastern hemlock (Tsuga canadensis (L.) Carr.).
The Harvard Forest has a history of agricultural use, mainly as pastures and woodlots, dating back to the mid-18 th century; current land cover is heavily influenced by prior land use as 9 well as natural disturbances (Foster 1992). Slightly more than half of the originally forested areas were cleared, but remote areas and locations where swamps or steep rocky sites predominate were cut only selectively for a variety of wood products. Beginning in the mid-19 th century, large areas of farmland were abandoned and forests regrew. Logging, mainly of white pine (Pinus strobus L.), increased in the late 19 th and early 20 th century. Large swaths of the forest experienced extensive hurricane damage in 1938 when as much as 75% of the timber-mostly white pine stands, older hardwood forests, and conifer plantations-was blown down (Foster and Boose 1992). Hardwood forests of oaks, maples and birches (Betula spp.) often replaced the pine stands. More than 10% of the trees were damaged, but not killed, in a 2008 ice storm (Yao 2011).

Soil respiration measurements
In this paper "soil respiration" (R s ) refers to total soil CO 2 efflux, the respiration of soil fauna, roots, and mycorrhizae and other microbes. We compiled data from 23 studies of R s , identified herein as S1 to S23 (Tables 1-2). In all cases, R s was measured in fixed locations on a given sampling day, generally where PVC or aluminum collars had been inserted and left in the soil, usually for the duration of the study. Because comprehensive descriptions of the methods of measurement have already been published, we summarize them only briefly here.
Four methods were used to measure R s , in order of increasing measurement frequency: (1) soda-lime systems where pellets were left beneath a closed chamber for 24 hours to absorb CO 2 emitted from the soil, (2) static chamber systems where a chamber was placed on each collar and headspace air samples were taken at fixed intervals over 15 to 30 minutes and subsequently analyzed with an infrared gas analyzer (IRGA) or a gas chromatograph, (3) dynamic chamber systems in which a chamber was placed on each collar, chamber air was circulated to and from a portable IRGA system, and the rate of increase in CO 2 concentration was measured in situ for a period of five minutes, and (4) automated chamber systems (herein autochambers), in which a datalogger-controlled system closed one chamber at a time and circulated the headspace air through an IRGA. Although criticized early on, the soda-lime technique, used in one of the oldest studies, has shown good correspondence with other methods (e.g., Raich et al. 1990, Keith andWong 2006). Importantly, the use of soda-lime data was restricted to the analysis of experimental treatments on R s ; they were not used in comparisons with R e . The four methods were never used side-by-side, so we cannot formally analyze whether there were systematic, method-based biases, although Savage and Davidson (2003) found no significant differences in seasonal flux estimates and in fluxes measured within one hour using the autochamber and dynamic chamber methods at Harvard Forest. For >70% of collars, soil temperature was measured at 10-cm depth. In all other instances, soil temperature was measured between 2-and 8.5-cm depth.
Fourteen of the 23 studies were observational in nature and hence measured R s in untreated or "control" plots only (Table 1). These studies covered a broad range of ecosystem types-natural and planted forests, wetlands-and times since most recent disturbances. The remaining data were collected from field experiments. In these studies, R s was measured in control plots as well as in treated plots. Wetlands data were from accessible wetlands only; no measurements were made in flooded areas.
The experimental treatments (Table 1) were as follows: in S2, half of the plots were subjected to a simulated drought. Translucent roofs and rain gutters were used to prevent rainfall from reaching the ground. In the experimental warming studies S15, S19, and S20, soil was 11 heated to 5°C above ambient temperature using underground heating cables. S20 also included a soil disturbance control in which heating cables were inserted in the ground but not activated. S15 is a soil warming × N fertilization (5 g N m -2 yr -1 ) factorial. In S16, nitrogen fertilizer was applied at two levels (5 g N m -2 yr -1 and 15 g N m -2 yr -1 ) for 20 years to assess the impact of long-term N amendment on adjacent hardwood and red pine stands.
S22 assessed the impact of plant inputs on R s . Treatments included the doubling of annual aboveground litterfall, excluding aboveground litter, excluding root inputs by trenching, excluding aboveground litter and root inputs, and replacing the organic and A horizons with Bhorizon soil.
Located in hemlock-dominated areas, studies S3 and S14 examined the impacts of harvesting or of an invasive insect, the hemlock woolly adelgid (Adelges tsugae Annand), on R s .
The treatment plots consisted of girdled hemlock trees (i.e., the removal by chainsaw or knife of a strip of bark and cambium that kills the tree without cutting it down) or hemlock logging simulating a management decision to harvest trees before adelgid infestation. Finally, S21 was located in a selectively harvested deciduous stand in which 27% of the tree stems and basal area was removed for the production of saw timber and firewood.

Ecosystem-scale CO 2 measurements
The EMS tower ( Fig. 1) has been in operation since 1990. It uses EC to make nearly continuous measurements of CO 2 , H 2 O, and energy fluxes between the surrounding forest and the atmosphere (Wofsy et al. 1993, Goulden et al. 1996, Urbanski et al. 2007). Air and soil temperature, photosynthetic photon flux density (PPFD), net solar radiation, and other environmental measurements are taken concurrently.
Red oak and red maple trees dominate the 75-to 110-year-old forest surrounding the tower (Urbanski et al. 2007). Small stands of eastern hemlock, white pine and red pine (Pinus resinosa Aiton) are also present. In 2006, an extensive survey found that the basal area of trees and shrubs (>1 cm DBH) was 38.7 m 2 ha -1 around the EMS tower (Goldman et al. 2006).
Eddy covariance measurements were used to calculate net ecosystem exchange (NEE), the difference between the amount of CO 2 fixed by the ecosystem and the amount released to the atmosphere. Power outages, equipment failures, and invalid or out-of-range data caused gaps in the two series of half-hourly or hourly NEE used in this study (Fig. A1). These factors caused the loss of 60% of the NEE data at the EMS tower, and 81% at the HEM tower. At EMS, 31% of the lost data was caused by gaps less than 24 hours long, 34% by 1 to 7-day long gaps and 35% by gaps longer than 7 days. At HEM, 25% of the lost data was caused by gaps less than 24 hours long, 50% by 1 to 7-day long gaps and 25% by gaps longer than 7 days. A larger proportion of the dataset had to be discarded at the HEM site because only EC measurements for winds from the southwest are representative of the hemlock stand; observations for other wind directions were not used. Generally, gaps were evenly distributed throughout the year at both sites. 13 We used the method and algorithm of Urbanski et al. (2007) to partition NEE into gross ecosystem exchange (GEE) and ecosystem respiration (R e ) and to gap-fill the EMS-tower dataset. Gaps in HEM data were filled using non-linear regression . For those times when neither partitioned nor gap-filled NEE data were available for the HEM tower, we used the Fluxnet-Canada Research Network (FCRN) gap-filling procedure (Barr et al. 2004, Amiro et al. 2006 to estimate R e because it gave good agreement with available gap-filled values from HEM (Fig. A2). For both EMS and HEM data, we summed the gap-filled halfhourly or hourly averages of R e to obtain daily and monthly fluxes.
It is important to keep in mind that R e values determined from eddy covariance are a model-based estimate of ecosystem respiration assuming that observed NEE at night can be scaled to the daytime using its relationship to temperature. Calm periods are excluded to avoid a low bias in the fluxes due to advective losses and transport not associated with turbulent eddies.
For EMS tower data, we fit a linear dependence of nighttime NEE against the difference in temperature from the mean over short (10-20 day) intervals. Ecosystem respiration during daylight was predicted by assuming that the nighttime dependence of R e on temperature applied equally to daytime R e .
We estimated the spatial extent of the flux-tower footprints using inverse Lagrangian modeling  to estimate the proportion of the footprint area represented by the different vegetation cover types. Because the footprint varies with season, we computed it separately for the snow-free, intermittent, and permanent snow cover seasons. For each flux tower in each season, we computed the average footprint contributing 90% of the measured fluxes.

Tree phenology
To link soil and ecosystem respiration to annual aboveground phenology, we used phenological data collected at Prospect Hill from 1992-2010 (O'Keefe 2011). The date of bud break was defined as the first day when at least 50% of the buds on a tree had recognizable leaves. Full leaf out was estimated as the day when >90% of the leaves on a given tree reached at least 95% of their final size. In autumn, the process of leaf abscission was noted as "leaf coloration" and was estimated as the day when at least 20% of the leaves on a given tree had changed color. We computed the average date of occurrence of bud break, leaf out, and leaf coloration for four red oak trees and five red maples, the two dominant tree species present in the EMS-tower footprint, or for five hemlock trees, the dominant species in the HEM-tower footprint, and averaged the results across years.

Snow cover
The presence or absence of a snow cover was used in our analyses as a potential driver of seasonal R s patterns and to adjust the extent of the flux-tower footprint. Since snow depth or cover was not routinely measured before 2010, we identified days with snow cover by calculating the daily ratio of daytime upward to downward PPFD measured at the top of the EMS tower (Coursolle et al. 2012). This method is based on the principle that snow has a higher albedo than the soil surface, so snow cover increases the ratio of upward-to-downward PPFD.
Data were available for 1992-2007. Because PPFD data are noisy, we combined data across years and identified a day of year (DOY) as having "persistent snow cover" when the daily ratio of upward to downward PPFD was at least 0.06 in at least 8 out of the 16 years. We identified the DOY as having "intermittent snow cover" when 4-7 years were above the threshold, and as "snow-free" when 3 years or less were above the threshold.  Fig. 2A). Temporal resolution was also widely variable; the studies using autochambers produced R s measurements every 30 minutes (S1) or 4 hours (S23) but R s measured manually was usually available only once every few days/weeks for any given study. Similarly, very few measurements were made during winter because of the difficulty of measuring R s through a snowpack. Wintertime estimates of R s were based on the apparent temperature sensitivity of R s derived from snow-free periods. Besides temperature, we did not include other drivers, such as soil moisture content, in the extrapolation because these data were not available for all studies.
We assessed the response of R s to T s using the linearized Q 10 function in Humphreys et al. (2005): The logarithmic transformation yielded a linear function with homoscedastic errors. The parameters of this equation were then used to estimate the apparent Q 10 of soil respiration, which measures the factor of increase in soil respiration associated with an increase of 10°C in soil temperature: We then calculated R 10 , the rate of soil respiration at 10°C: Because Eq. 1 represents the apparent, and not actual, temperature response when used on data at the seasonal to annual time scale (Davidson et al. 2006b, Subke andBahn 2010), we used the temperature model to capture seasonal trends in R s . In this case, T s is effectively used as a driver of R s but also as a proxy for other drivers that correlate with temperature, such as plant phenology, soil water content and substrate supply, among others.

Estimation of R s at eddy-covariance sites
The exceptionally large number of observations in the R s dataset created the opportunity to compare seasonal variations in the magnitude and timing of R s to that of R e . We created an 18- year time series of R s for the EMS and HEM tower sites (R s EMS and R s HEM , respectively) using a "reference" soil temperature at 10-cm depth (herein T s ref ), the depth of most T s measurements.
Because no single site has 18 years of continuous measurements of soil temperature at 10-cm depth, we combined T s data from all available sources. We first computed T s ref and then calculated site-specific temperature records for each vegetation type of each study listed in Table   1 based on measured relationships between T s ref and site-specific T s .

17
The longest time series of soil temperature at 10-cm depth in a forested area was We recreated Bahn et al.'s (2010) approach by producing a relationship between R s annual and modeled R s MAT at the Harvard Forest and tested its validity using observed R s data.

Calculation of treatment effects
To compare the effects of experimental treatments on R s within and among studies through time, we computed a "response ratio", or effect size, of R s in the treatment plots (R s treatment ) and the corresponding control plots (R s control ) of each study. We used an approach similar to the one described in the section Estimation of R s at eddy-covariance sites to obtain series of estimated R s , but the temperature response of R s was calculated for each measurement year (instead of all years together) to avoid masking variability that might have been caused by interannual variations in environmental conditions. Since very few R s measurements were made during winter in any study, we used data from April to October only, the period with the best data coverage. To estimate the uncertainty in the response ratio, we randomly removed 20% of the collars from each experiment and treatment. Using the new dataset, we rescaled the T s ref series to T s of the study, recomputed the relationship between R s and T s using Eq. 1 and used the resulting Q 10 and R 10 coefficients to produce series of estimated R s control and R s treatment for each measurement year. We then computed the new response ratio. This process was repeated 200 times and we used the results to calculate non-parametric confidence intervals.

Spatial variability of R s
To study the spatial variability of soil respiration at Harvard Forest, we examined the correlation of small-scale fluctuations at neighboring collars. For example, we analyzed R s data from a transect of nine collars that were measured on multiple days. On each measurement day, every collar was measured. A linear model of log(R s ) on log(T s ) was fit separately at each collar, and correlation among collars was examined with a scatterplot matrix of residuals (Fig. A5). We observed no more correlation among neighboring collars than among distant collars, suggesting that non-modeled effects are not spatially correlated at this fine scale.

Statistics
We analyzed variations in R 10 and Q 10 among vegetation types using one-way ANOVA with post-hoc comparisons using Tukey's Honestly Significant Difference test in R (R 20 Development Core Team, V0.96.230). In this analysis, the unit of replication was the soilrespiration collar (Fig. 3). Data from all years collected at a collar were used to fit the linearized Q 10 function, Eq. 1. Gap-filling and regression analyses were conducted in Matlab V7.11.0 (The MathWorks Inc., Natick, Massachusetts, USA).
The apparent temperature sensitivity of soil respiration (Q 10 ) estimated for each collar varied from about 1 to 9, with most values between 2 and 5 ( Fig. 3C-D). In general, the Q 10 estimates were more broadly distributed than the R 10 estimates (Fig. 3B, D). Mean Q 10 was lowest at 2.53 ± 0.11 for red pine, intermediate at 2.93 ± 0.11 and 3.04 ± 0.12 for mixed and hemlock stands, respectively, and was highest at 3.83 ± 0.09 in deciduous and 3.97 ± 0.15 in wetland sites (Fig. 3C). The mean Q 10 of deciduous stands was slightly higher and that of hemlock stands and slightly lower than the average calculated from Bond-Lamberty and Thomson (2010b; global means = 3.46 ± 0.10 and 3.44 ± 0.16).
Among the five different vegetation types, annual soil respiration (R s annual ) varied from 469 to 951 g C m -2 yr -1 (Table 3). On average, R s annual of wetlands was almost 200 g C m -2 yr -1 lower than for other forest types.

Annual soil respiration estimation
We found a robust relationship between R s annual and modeled R s MAT when data from all studies and vegetation types were used (Fig. 4A)

Respiration and phenology
There was large inter-annual variability in soil and ecosystem respiration (Table 4; Fig.   2A (Table 4).

22
To compare the annual cycle of R e and R s , we computed the median daily fluxes using our 6-year HEM dataset and the last 14 years of our EMS dataset (Fig. 2C-D). We did not use the first 4 years of the EMS dataset because only one study took place during these years and using data from later years to estimate R s in the first four years resulted in unrealistic fluxes such as higher monthly R s than R e (Fig. 2A).
Although both R s and R e followed annual cycles, in which respiration was lowest during winter and highest during the warmest months of the year, there were marked differences in fluxes within and between tower sites. The largest relative differences between daily R e EMS and R s EMS occurred during winter, the first half of the growing season, and late in the fall (Fig. 2C).  (Fig. 2D). R e HEM and R s HEM both peaked in early August, whereas R e EMS attained its maximum approximately four weeks earlier than R s EMS (Fig. 2C-D).
At the EMS site, aboveground plant respiration (R aboveground EMS )-the difference between R e EMS and R s EMS -was ~0.8 g C m -2 d -1 during winter and started increasing immediately before snowmelt (Fig. 2C). R aboveground EMS reached its highest value between late May and early July, when leaf development reached completion, and decreased thereafter, attaining its minimum in August. R aboveground EMS later slowly increased, reaching its winter average rate of ~0.8 g C m -2 d -1 at the beginning of October, coincident with the emergence of leaf coloration in the autumn. At the HEM site, R aboveground HEM was near zero during winter, increased rapidly during snowmelt, and reached its peak in June, at the time of full leaf out (Fig. 2D). R aboveground HEM declined 23 slightly during mid-summer, increased again until it reached almost as high as the annual maximum in early September, and decreased sharply thereafter until it reached zero at the onset of the winter snowpack. The annual cycle of R aboveground presented some oddities that will be discussed in detail in sections Seasonal variation in R s is linked to temperature and phenology and Methodological advances are needed to reduce uncertainty in R s , R e and NEE.

Effect of experimental treatments on soil respiration
Two main categories of experimental treatments have been used: treatments mimicking different aspects of global change and experiments aimed at partitioning R s into component fluxes. Some of these manipulations had large effects on annual R s ranging from −70% to +52% of that in control plots (Fig. 5). Not surprisingly, the direction of the effect was generally related to the change in C inputs, with girdling, logging, trenching, diminution of litter inputs, and removal of O and A horizons resulting in lower CO 2 emissions, while increasing litter inputs caused an increase in emissions. N additions and soil warming also increased CO 2 efflux, with the largest effects of these manipulations in the first few years of the treatment.

Variability of soil respiration
To assess variation in R s due to spatial variability, interannual variations, and experimental treatments, we calculated the coefficient of variation (CV) of soil respiration totals from April to October (R s Apr-Oct ) in all studies ( The impact of spatial variability on R s was represented by the CV of the average R s Apr-Oct in control plots for each vegetation type in each study. The effect of spatial variability was similar to the variability introduced by interannual variations in climate and biological processes in control plots for the three studies with at least 11 years of measurements (Table 5).

DISCUSSION
We analyzed more than 100,000 individual measurements of soil respiration (R s ) from 23 observational or experimental studies executed over nearly a quarter century in five different forest types at the Harvard Forest. These data were coupled with 24 site-years of net ecosystem exchange (NEE) data collected using eddy covariance (EC) measurements-including the longest time-series of EC data in the world-that allowed us to examine in detail the relationship between R s and ecosystem respiration (R e ). These data and the relationships they reveal are especially valuable in light of a recent analysis suggesting that measurements of R s are among the most important data for reducing the uncertainty of process-based models of forest carbon dynamics (Keenan et al. 2013).
We draw five main observations from our analyses and synthesis: 3. Seasonal variations in R s and R e are linked to variations in temperature and vegetation phenology, with the majority of R e driven by aboveground respiration from bud break through leaf out followed by the continued increase in soil respiration and its dominance of R e throughout the remainder of the growing season. On average, the peak in aboveground respiration occurs 38 days earlier than the peak in belowground respiration.

Variations in R s caused by inter-annual variations in weather and phenological events are
of the same order of magnitude as responses caused by experimental manipulations. Thus it appears that climatic controls over R s are of similar importance as other drivers of global change (i.e., invasive insects, forest management practices, N deposition).
5. It remains difficult to partition R e into above-and belowground components, in part because of the different spatial scales of R s and R e measurements and possible errors associated with the two techniques. Progress in making the "hard" measurements, such as R s during winter, properly dealing with stable conditions in eddy covariance measurements, daytime vs. nighttime canopy respiration and its impacts on estimates of R e , and independently verifying the partitioning of NEE into R e is likely to lead to increases in the confidence of estimates of R s , R e , and NEE.

Experimental manipulations appear to influence R s through substrate availability
Overall, R s increased in response to soil warming, nitrogen fertilization and doubling of litter inputs, and declined because of simulated drought, logging, girdling, trenching, diminution 26 of litter inputs, and removal of O and A horizons (Fig. 5). Both responses appear to arise from experimentally induced changes in substrate availability that are caused by changes in the size of soil C pools (e.g., addition of labile litter, removal of soil horizons), belowground C fluxes (e.g., N fertilization, trenching, girdling or logging), or environmental conditions (e.g., drydown, warming  Rustad et al. 2001). In some cases (e.g., logging, warming studies S19 and S20), experimental effects on R s were clearly transient, but in other cases the duration of observations following single or repeated manipulations ("pulse" and "press" experiments, respectively, sensu Bender et al. 1984) were too short to distinguish between transient dynamics and permanent change in R s . It remains difficult to pinpoint the cause of differences in the effects of similar manipulations in different studies (e.g., the greater impact of warming in study S15 than in S19 and S20, Fig. 5) since the experiments were not all designed to be compared to each other and not all environmental parameters and carbon pools were measured.

Small-scale spatial variation in R s can exceed variation among forest types
Our data clearly illustrate that R s is highly variable at all spatial and temporal scales of measurement (Figs. 2-3; see also Norman et al. 1997, Rayment and Jarvis 2000, Drewitt et al. 2002. Variability among R s measurements made at collars within a single observational or experimental study was as large as or larger than inter-annual variability in estimated R s annual (cf. Raich et al. 1990). For example, R s measured between July 1-10 during a single year on unique collars in undisturbed plots varied by up to 1426% (median: 31%, mean: 99%) over those 10 27 days, whereas R s annual varied by a maximum of 127% within studies (median: 39%, mean: 47%) and 197% among all studies and vegetation types. This suggests that unquantified heterogeneity in substrate or activity by roots or microbes is a critical factor that needs to be explored in more detail. Deciduous and hemlock forests, the main types of vegetation studied, had similar R 10 rates ( Fig. 3A) that were lower than predicted from Bond-Lamberty and Thomson's global soil respiration database (version 20120510a) for mature deciduous and coniferous temperate forests (2.04 ± 0.05 and 2.61 ± 0.11 µmol C m -2 s -1 , respectively; data available at http://code.google.com/p/srdb; Bond-Lamberty and Thomson 2010b). R 10 and R s annual were lowest in wetlands, likely because of lower plant productivity and reduced C inputs to the soil ) and lower decomposition under anoxic conditions (Skopp et al. 1990).
Although spatial variation in fluxes is large, it does not preclude understanding and statistically resolving important temporal variations in R s at sub-seasonal to inter-annual time scales or variations in response to properly designed experimental treatments.

Predicting annual rates of soil respiration
Similar to Bahn et al. (2010), we were able to predict annual soil respiration (R s annual ) from soil respiration at mean annual temperature (R s MAT ), but only when we used a large number of R s measurements taken over a wide range of temperatures to estimate R s MAT (Fig. 4A), not when using only actual R s measurements made at MAT (Fig. 4B). Our analysis suggests that estimates of R s annual based on only a small number of measurements of R s at MAT will have high uncertainty, probably driven by spatial and temporal variations in R s .

Seasonal variation in R s is linked to temperature and phenology 28
At both the deciduous EMS and hemlock-dominated HEM sites, R s was correlated with phenological events driven by abiotic factors such as soil and air temperature (Fig. 2C-D). When estimated at the seasonal or annual time scale, the response of R s to temperature using Eq. 1 represents the apparent rather than intrinsic temperature sensitivity (Davidson and Janssens 2006). This occurs because field-based measurements of R s provide an integrated measure of various factors including the intrinsic temperature sensitivity of the various C pools metabolized by microbes and plant roots in addition to the effects of substrate supply and diffusion, plant phenology and C allocated belowground (e.g., Davidson et al. 2006a, b, Subke andBahn 2010).
Plant phenology drives seasonal R s rates through above-and belowground litter inputs, root respiration, and root exudates. Hence, seasonal variations in R s are correlated with both aboveground plant phenology and seasonal temperature changes (Curiel Yuste et al. 2004, Savage et al. 2013, as our analysis reiterates (Fig. 2C-D). R e and R s were at their lowest in winter, when deciduous trees are leafless and soil temperature (T s ) is lowest. As soon as snow started melting, T s increased rapidly, leading to a sharp increase in respiration (Fig. 2C-D).
Furthermore, the rapid fine-root growth, which occurs mainly in April and May in Harvard Forest's hardwoods and red pine stands (McClaugherty et al. 1982), also contributed to the increase in R s at that time of the year.
Not surprisingly, R s and R e in conifer-and hardwood-dominated stands responded differently to climatic drivers as suggested by the varying rank order of the R s /R e ratios at the two sites (Table 4). At both sites, R s followed changes in soil temperature (see also Davidson et al. 1998, Subke and Bahn 2010. The peak of R e , however, seemed to better correspond with the timing of maximum air temperature than that of soil temperature (Fig. 2C-D). The earlier peak in R e EMS was apparently the result of earlier and greater quantity of leaf and 29 shoot development compared to the growth of new shoots and leaf biomass in the conifer site (cf. Phillips et al. 2010). Indeed, bud break and complete leaf expansion occurred two weeks earlier in the deciduous stand compared to the hemlock forest ( Fig. 2C-D). The timing of maximum ecosystem and soil respiration at the HEM site is comparable to C-flux measurements from a spruce-hemlock forest in Maine (Davidson et al. 2006b).
At both sites, R aboveground started increasing just before snowmelt (Fig. 2C-D). At the HEM site, it reflected increasing metabolic activity in conifers, as has been reported elsewhere (Davidson et al. 2006b). At the hardwood-dominated EMS site, R aboveground was more likely initially driven by pre-leaf out metabolic activity associated with bud break, branch elongation, and wood production in ring-porous species such as oak that dominate this site (Hadley et al. 2009). We estimated aboveground metabolic activity and growth during the early growing season represents ~60% of R e EMS but only ~33% of R e HEM from snowmelt until the end of May.
At EMS, after full leaf expansion, the relative contribution of R aboveground to R e decreased rapidly and substantially until it was ~10% in August. Thereafter, R aboveground slowly increased until leaf-fall in late September and October (Fig. 2C), possibly reflecting increasing metabolic activity associated with the breakdown and translocation of carbohydrates, nucleic acids, and nutrients during the senescence process (Chapin and Kedrowski 1983). In addition to the surprisingly low mid-to late-summer values of R aboveground at EMS noted above, we also observed surprisingly high R aboveground in winter at EMS, a time when most respiration is expected to occur in the soil with little coming from aboveground vegetation (Davidson et al. 2006b).
Despite the large volume of data brought to bear in this analysis, we cannot clearly attribute the unusual patterns in respiration to uncertainty in R e or spatial and temporal extrapolations associated with the measurements of R s . The inability to attribute uncertainty may result from "irreconcilable differences" in methodology (sensu Strand et al. 2008); the spatial and temporal scales of measurements may simply not allow for robust cross comparisons. The substantial differences between R aboveground at HEM and EMS may also reflect methodological challenges and uncertainties in the dataset and estimates (see Methodological advances are needed to reduce uncertainty in R s , R e and NEE).
Finally, it appears that spatial variability and temporal variations in weather and phenology induced variation among annual R s estimates that was similar to differences in R s among the experimental treatments, with the exception of some of the R s partitioning manipulations (Table 5). Our results imply that R s is regulated simultaneously by several biotic and abiotic factors, and that any factor can have a large impact on R s at a given time through its direct or indirect effect on substrate availability.

Methodological advances are needed to reduce uncertainty in R s , R e and NEE
The dataset analyzed here includes 109,444 measurements of R s taken over two decades in different vegetation types found within two EC tower sites, and 24 site-years of EC data.
Before computing seasonal or annual estimates of R s , data were adjusted to account for soil surface area covered by trees or rocks and seasonal variation of tower footprint size. Despite these adjustments, we observed unusual patterns in R aboveground at the EMS site that cannot be explained by ecosystem processes and physiology alone, and differences between our observations and estimates of R s and R e at other temperate sites.
For example, from December through March, a time of year when the ground is generally covered by snow, mean daily R e EMS ranged from 0.79 to 2.70 g C m -2 d -1 depending on the year (median: 1.47; mean: 1.48; SD: 0.40 g C m -2 d -1 ) and was on average more than twice as high as 31 R s EMS (median: 0.59; mean: 0.58; SD: 0.05 g C m -2 d -1  Schmid et al., 2000), which is similar to the aboveground biomass at the EMS site (~10 kg C m -2 ; Urbanski et al., 2007). Aboveground biomass at UMBS is 7.23 kg C m -2 (AmeriFlux online database, http://public.ornl.gov/ameriflux), approximately 25% lower than EMS, but the difference in wintertime R e was much larger than that. We did not find biomass data for the Park Falls and Creek (AmeriFlux online database, http://public.ornl.gov/ameriflux). Given that aboveground biomass is essentially the same at the EMS and Morgan Monroe sites and that temperature is higher at the latter, EMS should not show much higher wintertime R e . Park Falls, Willow Creek, and UMBS are all colder than EMS during the winter, but at these low temperatures the exponential relationship between temperature and respiration is almost flat-an increase of 2-32 4°C in low temperatures does not induce a large absolute change in respiration. In conclusion, we did not find a satisfying explanation of why wintertime R e is higher at the EMS site than elsewhere.
Underestimation of R s EMS or overestimation of R e EMS could explain the high apparent rate of R aboveground EMS during the winter period. Uncertainties in the estimates of both fluxes make it difficult to determine which process is contributing more to the high estimate of wintertime ecosystem respiration values for the southwest sector, which is entirely uplands, were at most 10% lower than R e estimated for the entire dataset that includes wetlands in the northwest sector. Goulden et al. (1996) previously noted enhanced R e during periods of high wind in winter at Harvard Forest, and this accounts for some of the extremely large values of R e , but excluding them does not bring R e estimates down to the range of R s .
Our estimates of R e are based on excluding periods of low turbulence (u*<0.2 m s -1 ) based on the premise developed for summer that CO 2 fluxes are biased low during stable atmospheric conditions due to advective losses (Staebler andFitzjarrald 2004, Barr et al. 2013).
Whether or not a friction velocity (u*) filter is appropriate during wintertime may need to be reexamined. Deep snowpacks are a diffusion barrier allowing CO 2 from soil respiration to accumulate. If high winds are ventilating the snowpack where CO 2 has been accumulating, then averaging the high and low u* data together, which would bring down the estimate of R e , may be necessary to get an unbiased estimate of R e .

33
On the other hand, R s EMS could be systematically underestimated because there are very few wintertime R s measurements through a snowpack. As such, the wintertime R s estimate is based on an extrapolation of data beyond the range of values measured: the temperature-response relationship used to estimate R s for cold soil during winter was established using data collected mainly when the soil was warm, and the influence of snow cover is not accounted for. The response of R s EMS to temperature during winter might differ from that during the warmer months of the year due to shifts in soil microbial assemblages that have higher temperature sensitivity at cold temperatures than growing-season-adapted microbial communities (Monson et al. 2006, Bradford et al. 2008. Importantly, however, modeled R s generally overestimated the available wintertime measurements (Fig. 6). Another possibility is that scaling R s to the landscape level introduced a bias. Since there is large variation in R s within each vegetation type (Fig. 3), predominance of a given vegetation type within an EC footprint does not mean that R s is uniform within the footprint. The deciduous stands located to the south and west of the EMS tower are on a soil series different from (deeper and less rocky, with higher R s ) than beneath the deciduous stands to the north and east. When the footprint includes stands south and west of the tower, actual R s within the footprint may be higher than our weighted R s . As fine-tuned as our scaling of R s to the EC tower footprint is, it remains difficult, if not impossible, to scale it perfectly. Hence, a mismatch between the footprints of R e and R s cannot be ruled out.
R s accounts for the majority of R e late in the growing season when soils reach their maximum temperature ( Fig. 2; see also Curiel Yuste et al. 2005, Davidson et al. 2006b, Bergeron et al. 2009. The majority of the soil respiration measurements at Harvard Forest were made during the growing season when soil temperature was between 5-20°C, suggesting that the estimate of R s EMS during summer is robust (Fig. 7). In contrast, during summer, mean wind speed and friction velocity decline substantially from that observed in the other seasons (Fig. 8).
Although our EC estimates were based on fluxes when u* was >0.2 m s -1 , the minimum value when EC fluxes are considered valid at the EMS site (Urbanski et al. 2007), low wind speeds during the summer are likely to exacerbate advective losses of CO 2 at this site even when friction velocity is above the minimum threshold (Staebler and Fitzjarrald 2004). Furthermore, NEE values are dependent on the u* threshold selected (Barford et al. 2001, Barr et al. 2013. A bias in NEE would bias the estimate of R e . The net effect may be low estimates of R e and seemingly very low R aboveground EMS during the late summer months. Intermittent transport of CO 2 or its transport too fast or too slow to be captured by the EC system may also result in the underestimation of R e (Staebler and Fitzjarrald 2004). It has been suggested that the HEM site may be less subject to advection than the EMS site because of the site topography .
Another important issue is that the NEE partitioning method assumes that nighttime NEE when u* is high can be used to define the dependence of R e on temperature and predicts daytime R e . However, if ecosystem or soil respiration is not adequately predicted by temperature alone, the daily sums may be incorrect. It might be the case, for example, if canopy dark respiration is inhibited during the day as recent studies suggest (e.g., Heskel et al. 2013), implying that daytime R e is overestimated when the nighttime relationship between NEE and temperature is used to do the partitioning. Although the observation scales may not always be well matched, comparisons between R s and estimated R e provide a useful constraint for evaluating the validity of NEE partitioning models.
Additional research on the hard-to-measure fluxes (e.g., wintertime R s , non-turbulent transport of CO 2 ) and independent measurements confirming flux partitioning (e.g., aboveground plant respiration, isotopic partitioning of NEE) might yield the greatest insights into partitioning R e between above-and belowground components. Such an approach may be necessary to both resolve current uncertainties as well as to link remotely sensed products of vegetation phenology (e.g., satellite-and tower-based camera observations) with fluxes of C on the ground (see also Keenan et al. 2013).

CONCLUSIONS
Using one of the largest site-specific collections of R s measurements in the world, we found strong seasonal and inter-annual variations in R s that were linked both to temperature and vegetation phenology and that experiments intended to simulate aspects of global and environmental change influenced R s to the same extent as that found at seasonal to annual time scales. We then used this robust dataset to partition R e into above-and belowground fluxes.
Given the number of R s and R e observations brought to bear, our partitioning estimates of abovevs. belowground respiration are as robust as currently possible. We found a distinct pattern of ecosystem respiration dominated by aboveground processes early in the growing season and belowground processes after the time of full canopy development in deciduous and conifer forests. While the absolute magnitude of the partitioning above-vs. belowground remains in question, the temporal variation is clear. This analysis suggests a greater emphasis be placed on accurately characterizing wintertime R s fluxes, the size of eddy-covariance tower footprints, the scaling up of the soil respiration chambers measurements, and accounting for C flux bias during stable periods throughout the year and particularly in the late summer. An in-depth evaluation of C flux partitioning is also needed, possibly based on a comparison with reliable and representative soil and aboveground plant respiration measurements.         indicates the 25 th percentile and the boundary farthest from zero, the 75 th percentile. Whiskers above and below the box indicate the 10 th and 90 th percentiles while the black points above and below the whiskers indicate the 5 th and 95 th percentiles. The horizontal dashed line represents a ratio of 1. A ratio above 1 indicates an increase in R s caused by the treatment, while a ratio lower 58 than 1 indicates a decrease. Red asterisks denote years when data were available for the treated plots before/after the treatments were applied. Boxes without asterisks represent years during which the plots were treated. The categories of treatments are indicated on the x-axis. Treatments codes are as in Table 1. In study S16, treatments were applied to hardwood (HW) and red pine (P) plots.