Seasonal grassland productivity forecast for the U.S. Great Plains using Grass-Cast

. Every spring, ranchers in the drought-prone U.S. Great Plains face the same dif ﬁ cult challenge — trying to estimate how much forage will be available for livestock to graze during the upcoming summer grazing season. To reduce this uncertainty in predicting forage availability, we developed an innovative new grassland productivity forecast system, named Grass-Cast, to provide science-informed estimates of growing season aboveground net primary production (ANPP). Grass-Cast uses over 30 yr of historical data including weather and the satellite-derived normalized vegetation difference index (NDVI) — combined with ecosystem modeling and seasonal precipitation forecasts — to predict if rangelands in individual counties are likely to produce below-normal, near-normal, or above-normal amounts of grass biomass (lbs / ac). Grass-Cast also provides a view of rangeland productivity in the broader region, to assist in larger-scale decision-making — such as where forage resources for grazing might be more plentiful if a rancher ’ s own region is at risk of drought. Grass-Cast is updated approximately every two weeks from April through July. Each Grass-Cast forecast provides three scenarios of ANPP for the upcoming growing season based on different precipitation outlooks. Near real-time 8-d NDVI can be used to supplement Grass-Cast in predicting cumulative growing season NDVI and ANPP starting in mid-April for the Southern Great Plains and mid-May to early June for the Central and Northern Great Plains. Here, we present the scienti ﬁ c basis and methods for Grass-Cast along with the county-level production forecasts from 2017 and 2018 for ten states in the U.S. Great Plains. The correlation between early growing season forecasts and the end-of-growing season ANPP estimate is > 50% by late May or early June. In a retrospective evaluation, we compared Grass-Cast end-of-growing season ANPP results to an independent dataset and found that the two agreed 69% of the time over a 20-yr period. Although some predictive tools exist for forecasting upcoming growing season conditions, none predict actual productivity for the entire Great Plains. The Grass-Cast system could be adapted to predict grassland ANPP outside of the Great Plains or to predict perennial biofuel grass production.


INTRODUCTION
Operational flexibility by grassland owners and managers in their ability to match animal demand with available forage is limited within and across years by the lack of predictive tools that accurately forecast aboveground biomass for the growing season (Kachergis et al. 2014). This flexibility is most limited during dry/drought conditions as adaptive management decisionmaking is constrained (Derner and Augustine 2016), and livestock production and net revenue decline with increasing precipitation variability (Hamilton et al. 2016, Bastian et al. 2018, Irisarri et al. 2019) which influences decision-making by land managers. Advances in remote sensing (Gaffney et al. 2018) and modeling (Derner et al. 2012, Fang et al. 2014, Del Grosso et al. 2018 to predict plant productivity, and multisite analyses of productivity responses to precipitation across decadal scales (Chen et al. 2017, Petrie et al. 2018, provide more synthetic knowledge and understanding to advance site-and regionallevel forecasting of aboveground biomass. Practical and functional applications of forecasting efforts, however, are still limited and may be improved by integrated, multidisciplinary approaches to fuse near real-time remote sensing, short-term and seasonal weather forecasts, process models, uncertainty, and web technology to visually display spatial and temporal patterns of grassland productivity. Although some predictive tools exist for forecasting upcoming growing season conditions or estimating grassland productivity, each has limitations for practical and functional applications by grassland managers for decision-making. The U.S. Drought Monitor (https://droughtmonitor. unl.edu/) has current drought intensity conditions (from none to exceptional drought) across the United States, and associated U.S. Monthly and Seasonal Drought Outlooks (https://www.c pc.ncep.noaa.gov/products/Drought/) provide depictions of large-scale geographical trends based on derived probabilities guided by shortand long-range statistical and dynamic climate forecasts that are partially subjective. These tools are limited for decision-making by land managers at relevant spatial and temporal scales by (1) lack of translation of drought intensity conditions into reductions of grassland productivity, and (2) the challenge of detecting flash droughts (sensu Mo and Lettenmaier 2016) such as occurred during 2012 (Otkin et al. 2016). The National Weather Service Climate Prediction Center (CPC; https://www.cpc.ncep.noaa.gov/) provides a series of weather/climate probability outlooks ranging from 6 to 10 d to three months for both temperature and precipitation across the United States. Like the U.S. Drought Monitor, these weather/climate probability outlooks are limited for decision-making by grassland managers by the lack of translation of these probability outlooks to grassland productivity responses. Third, the South Dakota Drought Tool (https:// www.nrcs.usda.gov/wps/portal/nrcs/main/sd/ technical/landuse/pasture/) projects peak forage production (1 July) in South Dakota as percent of normal forage production using current drought conditions and future average precipitation (30-50 yr average). Limitations of the South Dakota Drought Tool include (1) geographical constraints beyond South Dakota, (2) projected peak production based only on assumption of future average precipitation, (3) upper ceiling of projections is 100% of normal forage production, and (4) lack of verification and accuracy of the projected peak production. Fourth, the Australian Grassland and Rangeland Assessment System (Aussie GRASS; Carter 1996, Carter et al. 2000) is a spatial modeling framework for Australia, maintained by the Queensland State Government Department of Environment and Science (https:// v www.esajournals.org www.longpaddock.qld.gov.au/aussiegrass). The forecasted pasture growth is derived using a point-scale pasture growth model known as GRASP (GRASS Production;McKeon et al. 1990, Day et al. 1997) and climate inputs dictated by historical year-types aligning with the current state of the climate system (e.g., El Niño Southern Oscillation [ENSO] and Pacific Decadal Oscillation [PDO] phase). GRASP calculates grassland production on a 0.05 degree (~5 km) grid across Australia each month, following the updating of interpolated climate inputs (rainfall, minimum and maximum temperature, solar radiation, evaporation, and vapor pressure deficit). Satellite imagery is used both to assess tree cover (Landsat imagery) and to calibrate grass production parameters (normalized vegetation difference index (NDVI) imagery), the latter by assessing model calculated green and dead cover against NDVI. Limitations of applying Aussie GRASS to the U.S. Great Plains include (1) the required data input intensity, and (2) the lack of PDO/ ENSO links to productivity across the entire Great Plains region.
Overcoming aforementioned limitations in the existing predictive tools for grassland productivity could be addressed by an approach with end user considerations incorporated to produce a visually appealing and informative product that has scientific underpinnings based on established ecological relationships. Moreover, the end product for use by grassland managers in decisionmaking needs to be updated frequently during the growing season. To accomplish this, we developed an innovative new grassland productivity forecast system, named Grass-Cast, to provide science-informed estimates of growing season vegetation productivity in the U.S. Great Plains. This decision tool for grassland managers is underpinned by statistical relationships derived from over 30 yr of historical data including spatially continuous daily weather and satellite-derived NDVI (Tucker 1979) across the Great Plains, combined with county-specific growing season precipitation forecasts and actual evapotranspiration estimated by the DayCent model (Parton et al. 1998). Fusing this information provides the potential to generate estimates every 10-14 d of vegetation productivity in the spring (beginning in April) and early summer, with three scenarios: below-normal, near-normal, or above-normal amounts of precipitation from the forecast date to the end of the primary growing season (end of July). Focus group interactions with ranchers in the Great Plains suggested that Grass-Cast provide these three rangeland production forecasts because of uncertainties associated with seasonal weather forecasts. Visual representation of the predictions (http://grassca st.unl.edu/) provides a regional perspective of grassland productivity to inform decisionmakers geographically where forage supplies are limited, near-normal, or in excess.
Grass-Cast is based on previous work where we established that cumulative April-July actual evapotranspiration (iAET) was the precipitation-related variable most highly correlated to cumulative growing season (May-September) NDVI (iNDVI) and aboveground net primary production (ANPP; Del Grosso et al. 2018, Chen et al. 2019. Grass-Cast predicts ANPP as a function of simulated iAET using previously established county-level regressions between simulated iAET and iNDVI and regional statistical relationships between iNDVI and measured ANPP (Chen et al. 2019).
Grass-Cast was developed in 2017 for use in Northern Plains states (Montana, North Dakota, South Dakota, Wyoming, Nebraska, and Colorado). In 2018, the southern Plains states-Kansas, New Mexico, Oklahoma, and Texas-were added. For both 2017 and 2018, the spatial scale of grassland productivity estimates occurred at the county level; for 2019, the spatial scale was reduced to a 10 × 10 km grid level following stakeholder requests to have the predictions at finer spatial scales.
Here, we showcase the development and refinement of Grass-Cast for U.S. Great Plains grassland ecosystems. We provide the ecological and statistical relationships that provide the underpinning to Grass-Cast. We present countylevel production forecasts from 2017 and 2018 that include grassland regions for ten states in the U.S. Great Plains. We assess the accuracy and skill of the Grass-Cast grassland productivity estimates at the county scale using satellitederived NDVI from independent data. Further, we show that near real-time 8-d NDVI in the early part of the growing season can also be used to predict cumulative growing season NDVI and ANPP to supplement Grass-Cast and reduce uncertainty with seasonal precipitation forecasts.

Weather and remotely sensed data sets
Grass-Cast requires a multi-decadal history of daily observed minimum and maximum air temperatures, and precipitation at the geographic scale of grassland productivity estimation (county level in 2017 and 2018). 1-km resolution Daymet weather (Thornton et al. 2016) was used for 1980-2016 for historical purposes using the latitude and longitude of the scale of productivity estimate. Weather data for 2017 and 2018 were obtained from the gridded climate dataset (GridData) available from Applied Climate Information System (ACIS) Web Services (http://data. rcc-acis.org/). GridData provides users with access to several gridded datasets, including daily observed temperature and precipitation fields generated using the methods of DeGaetano and Belcher (2007) for temperature and DeGaetano and Wilks (2009) for precipitation.
To compute April-September iNDVI (described in Eq. 1 below), we used bimonthly, 8-km resolution third-generation Global Inventory Modeling and Mapping Studies (GIMMS) NDVI3g data for 1982-2015 (Pinzon and Tucker 2014), which we downloaded from NASA's ECO-CAST archive (https://ecocast.arc.nasa.gov/data/ pub/gimms/3g.v1/); in this paper, this will be referred to as AVHRR NDVI. The effects of possible snow and/or cloud cover were accounted for by removing all pixels that were not classified as either good (QA = 0) or marginal (QA = 1) quality in the QA layer. We further isolated grasslands by excluding pixels identified as barren, forest, or crop using MODerate resolution Imaging Spectroradiometer (MODIS) land cover data (Friedl et al. 2010). For retrospective evaluation of Grass-Cast ANPP estimates, we used the 16-d composite 1-km resolution MODIS NDVI (MOD13QA2 V006) data product available via the NASA Land Processes Distributed Active Archive Center (https://lpdaac.usgs.gov/). The 16-d pixel reliability data layer that accompanied the MODIS NDVI data was used to identify and remove lower-quality data. Following the methodology we used to compute AVHRR iNDVI, we calculated MODIS-based cumulative NDVI, used for Grass-Cast evaluation, by isolating grasslands, aggregating grassland NDVI to the county level, and integrating over a two-month period when peak biomass occurs (June-July; described in Eq. 6 below).

DayCent model description
The Grass-Cast system computes cumulative April-July AET for the grasslands in each county from simulated daily AET using the DayCent model (Parton et al. 1998, Del Grosso et al. 2001, 2011. DayCent is an ecosystem-level biogeochemical model that simulates the fluxes of water, carbon, and nitrogen (or nutrients) among the atmosphere, plants, and soil in crop/grassland, forest, and savanna ecosystems. The primary inputs to DayCent include site location, daily weather data (e.g., maximum and minimum temperature, precipitation), soil profile texture and hydraulic properties, vegetation parameters, and land management practices (e.g., grazing intensity). The functionality of Day-Cent is implemented in four primary submodels: plant growth and production, soil organic matter cycling, land surface hydrology, and trace gas emissions (Parton et al. 1998, Del Grosso et al. 2001, 2011. DayCent was calibrated for grasslands and croplands in 476 Great Plains counties in previous work (Hartman et al. 2011, Parton et al. 2015.
In DayCent, daily AET is simulated in its land surface hydrology submodel (Parton et al. 1998). AET from two sources was simulated. First, water inputs (i.e., precipitation and melt snow) intercepted by plants and litter were evaporated at the rate of potential daily evapotranspiration rate (PET). Then, remaining water either passes through soil layers as saturated or unsaturated flow or runs off the soil surface. During the processes, water is evaporated from the top of the soil profile at the potential soil evaporation rate. In addition, the transpiration at each soil layer is simulated as a function of plant live leaf biomass and root distribution. Previous work (Chen et al. 2016(Chen et al. , 2017 demonstrated that the DayCent simulated AET agreed with the observations in both the semiarid shortgrass steppe and the mesic tallgrass prairie.

Grass-Cast equations
The annual observed integrated growing season NDVI for a county for each year 1982-2015 (iNDVI obs.yr ) was calculated from eleven biweekly county-level AVHRR NDVI v www.esajournals.org 7values (NDVI obs,wk ) that occurred from 1 May to 30 September of that year. A county-specific base NDVI value (NDVI base ) that represents NDVI at the very beginning of the growing season was subtracted from each biweekly measurement. iNDVI obs,yr ¼ ∑ Sept:30 wk¼May1 ðNDVI obs,wk À NDVI base Þ: (1) NDVI base was a function of longitude. It was 0.10 is the western-most county and 0.15 in the eastern-most county. NDVI base values for other counties were linearly interpolated between these values as a function of longitude. This pattern of base NDVI was typical of NDVI during the non-growing season (November--March). Although we defined 476 Great Plains counties in previous work (Hartman et al. 2011), grassland NDVI was not available for all the counties. For some counties, the grassland area was so small that grassland observed NDVI could not be determined. For other counties, particularly ones that were close to mountainous areas, the NDVI signal was influenced by tree or shrub biomass and determined to be too high to be a grassland ecosystem. Filtering these out resulted in 366 counties for the Grass-Cast forecasts.
The annual cumulative April-July AET value for a county (iAET yr ) was the sum of all daily AET values simulated by DayCent from 1 April to 31 July during year (yr): The DayCent-derived values of iAET yr for years 1982-2015 and the values of AVHRR iNDVI obs,yr for the same 34 yr were calculated prior to running the Grass-Cast model in order to establish a county-specific linear regression of iNDVI obs,yr to iAET yr , giving slope m county and intercept b county.
During the forecast procedure, we ran Day-Cent 36 times for each county to compute 36 values of iAET for the current growing season based on daily weather to date and 36 scenarios of weather for the remainder of the growing season. For the weather scenarios, we used historical weather records from the most recent 36 yr to get a long-term sampling of recent meteorological conditions in the Great Plains. The scenarios were ranked from 1 to 36 (from driest to wettest) according to the amount of precipitation in the remainder of the growing season. We then calculated 36 predictions of integrated growing season NDVI. The predicted integrated growing season NDVI for a county and weather scenario i, where i = 1, 2, . . ., 36) (iNDVI pred,i ) was based on the county-specific slope m county and intercept b county .
The predicted annual ANPP for each county and scenario (ANPP pred,i , g biomass m −2 ) was based on a regional quadratic regression with iNDVI pred,i . This relationship was determined in previous work that compared pasture-level iNDVI to measured ANPP for five sites in the Great Plains that span a precipitation gradient and had long-term ANPP datasets (Chen et al. 2019). During the forecast procedure, Grass-Cast calculated 36 values of ANPP from the i = 1, . . ., 36 predictions of cumulative growing season NDVI using the following relationship: ANPP pred,i ¼ 24:04ðiNDVI pred,i Þ 2 À 24:36 ÂiNDVI pred,i þ 16:24 (4) The equation above, used for the forecasts in this paper, was similar to that of Chen et al. (2019). Within the Great Plains, the correlation of ANPP to iNDVI was similar for the warm-season C 4 -dominated southern grassland counties and the cool-season C 3 -dominated northern grassland counties (Chen et al. 2019).

Executing the forecast
The 2017 and 2018 Grass-Cast ANPP forecasts were provided approximately every two weeks beginning in early April and continuing to late July. Year-to-date weather was used as an input into DayCent model runs as well as 36 weather scenarios for each county which were created for the remainder of the growing season using past observed weather (years 1981-2016 for the 2017 forecast and years 1982-2017 for the 2018 forecast) as analogs (Fig. 1).
Once DayCent model runs for each county and weather scenario i = 1, . . ., 36 were conducted, we (1) aggregated daily simulated AET to compute iAET i (Eq. 2), (2) predicted iNDVI i from iAET i (Eq. 3), and (3) computed ANPP i from iNDVI i (Eq. 4). Next, for each county, we computed three ANPP means, ANPP dry , ANPP normal, and ANPP wet , based on the 12 driest, 12 average, and 12 wettest precipitation scenarios. Finally, we computed the percent differences between ANPP dry , ANPP normal, and ANPP wet compared to ANPP mean (Eq. 5). These percent differences were displayed in three maps for Grass-Cast forecast date: percent change in ANPP compared to the mean ANPP when precipitation from the forecast date through 31 July is (1) below-normal, (2) near-normal, and (3) above-normal (see http://grasscast.unl.edu/). It is left to the user to choose the appropriate precipitation range, which may be selected according to the most likely county-level precipitation (for example, seasonal outlooks provided by the Climate Prediction Center (CPC) (http://www.cpc. ncep.noaa.gov/products/forecasts/). The end-ofgrowing season (31 July) ANPP forecast resulted in a single map since the calculation of any cumulative April-July AET value used only observed weather.

Accuracy assessments comparing Grass-Cast ANPP forecasts to MODIS NDVI
We used yearly cumulative June-July MODIS NDVI to evaluate Grass-Cast, first producing a time series of Grass-Cast compared to MODIS NDVI for 2017 and 2018, and secondly comparing Grass-Cast end-of-growing season ANPP forecasts against MODIS NDVI for years 2000-2018. Cumulative June-July MODIS NDVI data were selected for this evaluation since they were independent data that represent peak biomass during the growing season. Yearly values of cumulative June-July MODIS NDVI for each county is referred to as MODIS JJ,yr , (Eq. 6) and the mean of MODIS JJ,yr for yr = 2000, . . ., 2018 is referred to as MODIS JJ,mean. The values v www.esajournals.org of MODIS JJ,yr were calculated from biweekly MODIS NDVI (MODIS wk ) and the same base NDVI that was used in Eq. 1: For the 2017 and 2018 time series accuracy assessments, we compared all nine of the April through July Grass-Cast normal precipitation forecasts for each county to MODIS NDVI from the same year. For Grass-Cast, we computed the percent difference of the ANPP from normal precipitation forecast compared to ANPP mean , and for MODIS NDVI, we computed the percent difference between MODIS JJ,yr and MODIS JJ,mean . The percent differences for both Grass-Cast ANPP and MODIS NDVI were grouped into three categories: Percent differences <−8% were below the mean, percent differences >8% were above the mean, and percent differences between −8% and +8% were near the mean. The threshold of AE8% was chosen so that each of the three categories had approximately the same number of counties. When the county's Grass-Cast ANPP percent difference and the MODIS NDVI percent difference were in the same category, they were said to be the same. When they were one category apart, such as when one was near the mean and the other was either below the mean or above the mean, they were off by 1 category. When they were two categories apart, such as when one was below the mean and the other was above the mean, they were off by 2 categories. For each forecast date, we computed the percent of counties in the same, off by 1, and off by 2 categories. If the forecast accuracy improves over time, and if the final Grass-Cast forecast is similar to MODIS JJ,yr , then we would expect the counties off by 2 at the beginning of the growing season to transition to off by 1 then the same; similarly, we would expect the counties that were initially off by 1 to transition to the same. Ideally, the Grass-Cast ANPP and MODIS NDVI categories for all counties should be in the same at the end of the growing season.
In the second accuracy assessment, we compared the county-level Grass-Cast 31 July ANPP forecasts (ANPP GC,yr ) for years 2000-2018 to cumulative June-July MODIS NDVI (MODIS JJ.yr ) for the same years. Again, we calculated percent differences as described above and then the coefficient of determination (R 2 ) of Grass-Cast ANPP percent differences to the MODIS NDVI percent differences using all county value pairs (see Table 1). Using the categorical percent differences described above (the same, off by 1, and off by 2), we also computed yearly Heidke Skill Scores (HSS; Eq. 7) for 2000-2018 to quantify how well Grass-Cast ANPP corresponds to cumulative June-July MODIS NDVI. The HSS indicates how often the forecast category correctly matches the observed category, over and above the number of correct hits expected by chance alone. This score utilizes the number of correct and incorrect category hits. The values range from −50 to 100; a score of 100 indicates a perfect forecast and a score of −50 indicates a perfectly incorrect forecast. Scores greater than 0 indicate improvement compared to a random forecast and indicate skill. The equation for the score is: where H is the number of correct forecasts (the ANPP GC,yr percent difference and the MODIS JJ,yr percent difference were in the same category), E is the expected number of correct forecasts (1/3 of total when assuming equal chances), and T is the total number of valid forecast-observation pairs. The HSS results are summarized in Table 1.

Correlation between 8-d NDVI and growing season mean NDVI
We calculated Pearson's correlation coefficient (r) between 8-d NDVI and growing season cumulative April-October NDVI to look for relationships between early growing season NDVI and cumulative growing plant production. All processing steps for calculating NDVI correlations were performed in Google Earth Engine (GEE; Gorelick et al. 2017). NDVI values were calculated on the MOD09Q1 MODIS, 250 m, 8-d surface reflectance image collection (Vermote 2015) from year 2000 to 2018. The image collection was filtered to retain only images collected between the months of April and October and clipped to counties of interest. Quality control band bitmask values were used to retain only pixels with values of clear for cloud state and no cloud for the internal cloud algorithm flag. The USGS National Land Cover Database (NLCD) v www.esajournals.org image collection was used to create a grassland areas mask. A total of seven NLCD images (years: 2001, 2004, 2006, 2008, 2011, 2013, and 2016) were reduced to create a single grassland mask consisting only of pixels identified as the grassland land cover class in all NLCD images. The grassland mask was applied to all NDVI images.
A growing season total NDVI image collection was constructed from the NDVI images in which pixel-wise sums were calculated for each year and stored as an image in the collection (i.e., one image per year). Similarly, individual day-of-year (DOY) image collections were constructed for each DOY represented in the 8-d MODIS data that occurred between April and October (approximate DOY range of 96-296) and stored in a single image collection (i.e., a collection of collections). Each DOY collection consisted of 19 images representing the NDVI values for that given DOY in each year from 2000 to 2018.
As an example, the first resulting image collection contained images for DOY = 96, with one image for DOY 96 from each year. Thus, the first image in the collection was for DOY 96 in year 2000, the second image was for DOY 96 in year 2001, etc. Each image had two bands with the first band containing NDVI data for the given DOY, and the second containing the cumulative NDVI for the appropriate year. Pearson's correlation coefficient (r) was calculated for each DOY image collection to obtain the pixel-wise correlation between the NDVI and cumulative NDVI bands. This resulted in a single correlation image for each DOY. Mean r values were calculated for each county and exported as shapefiles from GEE for visualization as maps.

Biweekly Grass-Cast forecasts for 2017 and 2018
The primary product of each Grass-Cast ANPP forecast was three maps presented as the percent change in ANPP compared to the long-term mean ANPP (Eq. 5). Here, we show four Notes: Both Grass-Cast and cumulative June-July MODIS NDVI are expressed as percent difference from their means, and both are placed into the below the mean, near the mean, and above the mean categories using the AE8% threshold as described in the methods. The columns same category, off by one, and off by two are number of counties where Grass-Cast/MODIS pairs are in the same category, differ by one category, and differ by two categories, respectively. Columns H, T, E, and HSS (%) are described in Eq. 7. v www.esajournals.org forecasts during the year, approximately one per month starting in early May. Maps from all forecast dates for 2017 and 2018 can be found in the Supplemental Information (Appendix S1: Fig. S1, Appendix S2: Fig. S1).
Summary of 2017 Grass-Cast maps.-The three ANPP forecasts at the beginning of the 2017 growing season in April were very different and strongly related to the precipitation forecast (Appendix S1: Fig. S1). As the growing season progressed and more observed weather was incorporated into the ANPP forecast, the similarity between the three maps increased (Fig. 2). The final 31 July 2017 forecast showed a vast area of below-normal production (red and orange colors) in Montana, North Dakota, and South Dakota and highly productive regions (blue colors) in Colorado and Nebraska. The 8 May forecast did not indicate a potential for severely reduced production in the three northern states unless precipitation was below-normal. However, by 1 June, the NPP maps reflected the development of a flash drought (a dry period that has come on very suddenly) in the north. The 12 June maps showed that below-normal production was almost certain in Montana, North Dakota, and South Dakota despite the precipitation forecast (Appendix S1: Fig. S1). Indeed, the northern Great Plains suffered a severe flash drought in 2017 (Gerken et al. 2018, Otkin et al. 2018) as shown in the final 31 July map which was not indicated on the Climate Prediction Center seasonal forecasts (Appendix S3: Fig. S1a,b). The 1 June forecast also showed that production was likely to be abovenormal in Colorado and Nebraska regardless of the precipitation scenario, and those results were consistent with the final forecast.
For 2017, the percent differences in the initial growing season forecast (1 April) were poorly correlated with the final growing season percent differences (R 2 < 0.2; Fig. 3a). The R 2 values for the below-normal and near-normal forecasts surpassed 0.5 by 8 May, while the above-normal forecast surpassed 0.5 sometime between 22 May and 1 June. With each two-week forecast from 8 May to 12 June, the R 2 value increased about 0.1 to 0.3; after 30 June the increase in R 2 was less pronounced. The R 2 values for all three forecasts were 0.8 or greater by 12 June.
Summary of 2018 Grass-Cast maps.-The 22 April forecast showed that there was a high potential for the southern Great Plains to have below-normal production given growing season precipitation that was below-normal or near-normal; this forecast likely reflected dry soil conditions at the beginning of the growing season (Appendix S2: Fig. S1). On the other hand, the 22 April maps showed that western Montana should expect normal or above-normal grassland production despite the precipitation forecast; this forecast likely reflected wet soil conditions at the beginning of the growing season (Appendix S2: Fig. S1). The final 2018 forecast on 31 July showed mostly below-normal ANPP in the southern Great Plains (New Mexico, Texas, and Oklahoma; Fig. 4). For most counties in Colorado, Nebraska, and the Oklahoma Panhandle, ANPP was within AE 15% of normal with a few counties below and above those percentages. In eastern North Dakota and South Dakota, ANPP was near-normal (light green, AE5%). In Montana, Wyoming, western Nebraska, North Dakota, and South Dakota most counties showed ANPP >15% above-normal.
The regional dichotomies in the final 2018 forecast were predicted 6-10 weeks ahead of time. By the 21 May forecast, it was largely evident that the southern Great Plains should expect ANPP values to be below-normal even if nearnormal or above-normal precipitation was to occur in the coming two months (Appendix S2: Fig. S1). The 1 July forecast showed that Texas should expect below-average ANPP even if that area experienced above-normal precipitation during the next month (Fig. 4). In contrast, the 18 June forecast showed that the northwest section of the Great Plains should expect above-average ANPP for the growing season despite the amount of precipitation for the next six weeks (Appendix S2: Fig. S1). In contrast to spatial variation in Grass-Cast forecasts throughout 2018, the Climate Prediction Center seasonal precipitation forecasts showed near-normal precipitation for the vast majority of the growing season, with little regional variation (Appendix S3: Fig. S1c,d).
For 2018, the percent differences in the initial growing season forecast (1 April) were poorly correlated with the final growing season percent differences (R 2 < 0.4; Fig. 3b). The R 2 values for all forecasts surpassed 0.5 sometime between 21 May and 4 June. With each two-week forecast from 21 May to 1 July, the R 2 value increased v www.esajournals.org about 0.1-0.25; after 1 July the increase in R 2 was less pronounced. The R 2 values for all three forecasts were 0.7 or greater by 18 June.

Additional Grass-Cast output and historical productivity
Grass-Cast produces a number of results in addition to the percent differences presented in the three maps. First, for each forecast date Grass-Cast creates a summary spreadsheet that includes the following information for each county: mean 36-yr iNDVI and ANPP (lbs/ac/yr), iNDVI and ANPP estimated for each of the three precipitation categories (below-normal, normal, and above-normal) from the current forecast, and the standard deviation of ANPP within each  -yr (1982-2015) average. Left, middle, and right maps: expected percent change in grassland production if precipitation for the remainder of the growing season is below-normal, near-normal, or above-normal, respectively. v www.esajournals.org of the three precipitation categories for the current forecast. The standard deviation relates to the variability of ANPP outcomes among the 12 weather scenarios within each precipitation category. The summary spreadsheet (.csv file) from any previous forecast from 2017 to the present can be downloaded from https://grasscast.unl.ed u/Archive.aspx. Second, Grass-Cast provides a table of historical ANPP estimates from the previous 36 or more years (Appendix S7: Table S1). Using these historical productivity estimates, a land manager can compare the current forecasted percent differences in ANPP for each county to those of former growing seasons for a relative sense of how current forage amounts relate to prior years.

Time series accuracy assessments for 2017-2018
Using categorical classifications, the time series accuracy assessments showed how each of the ANPP forecasts that use near-normal precipitation compared to the MODIS JJ,yr (Fig. 5a). For 2017, the number of counties that differed by two categories was a maximum of 7% on 10 April declining to 1% by 31 July. For the first three ANPP forecasts of 2017 (10 April, 25 April, and 8 May), the number of counties in the same category and the number of counties that differed by one category were both just under 50%. On 22 May, the number of counties in the same category began to increase and subsequently the number that differed by one category began to decline. By 31 July 2017, the number of counties in the same category was over 70%. The 2018 forecasts started off with about 40% of counties in the same category and about 50% of counties off by one category (Fig. 5b). For 2018, the number of counties in the same category did not exceed other categories until 18 June, a month later than in 2017. The percent of counties that differed by two categories was >10% up to 18 June and afterward dropped to zero for the final three forecasts. After 18 June, the accuracy of the forecast climbed rapidly and reached almost 70% by 31 July.
Retrospective evaluation: end-of-growing season Grass-Cast ANPP predictions compared to cumulative June-July MODIS NDVI for [2010][2011][2012][2013][2014][2015][2016][2017][2018] Correlations between the end-of-growing season (31 July) Grass-Cast ANPP percent differences and MODIS JJ,yr percent differences for 2000-2018 are listed in Table 1. R 2 values ranged from 0.15 to 0.88 and were >0.50 for 12 of the 19 yr. Maps of Grass-Cast percent change in ANPP show very similar patterns to maps of percent change in MODIS JJ,yr (Fig. 6, Appendix S4:  Fig. S1). Bright red colors show where ANPP GC,yr or MODIS JJ,yr were >15% below-normal and dark blue colors show when they were >15% above-normal. In general, each year the color patterns for ANPP GC,yr and MODIS JJ,yr were Fig. 3. The coefficient of determination (R 2 ) between the percent difference in ANPP from each forecast during the growing season compared to the percent difference in ANPP for the final forecast for all counties in the Great Plains, (A) 2017, (B) 2018. Percent differences for any date are computed using the 34-yr average ANPP. The legend values R2_pct_below, R2_pct_avg, and R2_pct_above refer to the precipitation forecasts that were used at each date (below-normal, near-normal, or above-normal, respectively). strikingly similar for the majority of Great Plains counties. For unknown reasons, a few pairs of maps showed some regional discrepancies. For example, for 2014, Grass-Cast showed below-normal productivity in most counties in the southern half of the Great Plains where the MODIS data showed quite a few counties with above-average greenness. For 2015, Grass-Cast showed below-average productivity in northern Montana that was not seen in the MODIS data. For 2016, Grass-Cast showed an area of low productivity in the southwest portion of the Great Plains where the MODIS data did not. Overall, when the two maps in a given year showed a Fig. 4. Grassland Productivity Forecast (Grass-Cast) maps for 2018, produced on 1 May, 4 June, 1 July, and 31 July. The three maps for each date show the forecasted percent change in aboveground grassland production compared to a county's 34-yr (1982-2015) average. Left, middle, and right maps: expected percent change in grassland production if precipitation for the remainder of the growing season is below-normal, near-normal, or above-normal, respectively. v www.esajournals.org 13 November 2020 v Volume 11(11) v Article e03280 difference in patterns, it was usually because Grass-Cast predicted lower productivity than the MODIS NDVI indicated.

Heidke skill scores
The annual HSS from 2000 to 2018 ranged from 33% to 84% (Table 1). All annual scores were greater than 0 indicating an improvement compared to a random forecast; the 19-year cumulative score was 53.4. The four highest scores, where HSS were greater than 70, occurred when the vast majority of the Great Plains had below-average AET and ANPP (years 2002, 2006, and 2012), and when there was a strong dichotomy in productivity between the northern and southern Great Plains in 2011 (high in northern Great Plains and low in the southern Great Plains; Appendix S4: Fig. S1, Appendix S5: Fig. S1, Appendix S6: Fig. S1). The five lowest scores, where HSS were less than 40, occurred when AET, ANPP, and NDVI in most of the Great Plains was close to normal or above- normal with no large dry (bright red) regions (years 2001, 2003, 2005, 2010, and 2014; Appendix S4: Fig. S1, Appendix S5: Fig. S1, Appendix S6: Fig. S1). This suggests that Grass-Cast was most skilled at recognizing below-normal AET and ANPP associated with drought. The frequency of being off by two categories (when Grass-Cast predicted above-normal with MODIS NDVI showed below-normal, or vice versa) occurred at most 4.7% of the time (in 2014), while in 13 of the 19 yr, Grass-Cast and MODIS NDVI were in the same category >60% of the time. On average, Grass-Cast and MODIS were in the same category 69% of the time and were off by two categories 1.4% of the time (Table 1).

Correlation between weekly NDVI and growing season mean NDVI
We examined the correlation between mean 8d MODIS NDVI and the mean cumulative April-October MODIS NDVI at the county level to determine how soon early growing season NDVI was indicative of the growing season production (Fig. 7). Strong positive correlations (Pearson's r > 0.6) were first observed in the southern-most Great Plains by late April. This was interpreted as being due to warmer weather in the south. Correlations were weak, and even negative in more northern counties at this time, but as the growing season progressed, correlations gradually became more positive going Grass-Cast ANPP forecasts used here used near-normal precipitation forecasts. Both Grass-Cast and cumulative June-July MODIS NDVI are expressed as percent difference from their means, and both are placed into the below the mean, near the mean, and above the mean categories using the AE8% threshold as described in the methods. The blue line shows when Grass-Cast is in the same category (the same) as the MODIS cumulative June-July NDVI percent difference. The orange line shows when they differ by one category (Off by 1), and the gray line shows when they differ by two categories (Off by 2). northward. By the fourth week of May, the r values were >0.6 for more than half of the Great Plains counties. By the second week in June, the r values were >0.6 for more than two-thirds of counties. For a few counties in the northeast and southeast, negative correlations persisted throughout the growing season; we determined that these counties had very few grassland pixels and/or they had discrepancies between the NLCD grassland mask and actual land cover (e.g., irrigated fields and riparian areas). The strength of the correlations diminished by August as grasses senesced (not shown).

DISCUSSION
Grass-Cast provides ranchers and land managers with prediction of vegetation productivity in the upcoming growing season relative to their county's three-decade history. Grass-Cast grassland production forecasts incorporate the precipitation and evapotranspiration to date as well as scenarios of precipitation for the remainder of the growing season. Grass-Cast's accuracy improves as the growing season progresses (Figs. 3,5), so it should be consulted more than just once during the growing season. Confidence in Grass-Cast forecasts increased substantially by late May or early June, when forecasts had >50% correlation to the end-of-growing season forecasts and MODIS NDVI (Figs. 3,5).
By providing three forecasts per date based on a range of precipitation predictions, Grass-Cast can inform ranchers and land managers when grassland productivity will be highly sensitive to precipitation for the remainder of the growing season, or when the conditions to date have already determined a trajectory of below-or above-average ANPP. To determine which of the three ANPP forecasts is most appropriate for a region, we suggest consulting additional precipitation forecasts. One resource is the three-month seasonal precipitation forecasts for all of the United States from the Climate Prediction Center (CPC; http://www.cpc.ncep.noaa.gov/). However, the CPC three-month precipitation outlooks for May-July and June-August in 2017 and 2018 did not indicate below-normal precipitation anywhere in the Great Plains (Appendix S3: Fig. S1). Grass-Cast detected rapidly declining productivity related to drying soil conditions that were not evident from the seasonal precipitation forecasts. In early June 2017, Grass-Cast revealed belownormal production indicative of a flash drought in the northern Great Plains (Fig. 2), and in early June 2018, it revealed below-normal production in the southern Great Plains (Fig. 4).
Maps of Grass-Cast 31 July ANPP forecasts were compared to an independent spatial data set, namely cumulative June-July MODIS NDVI, and the two showed similar patterns (Appendix S4: Fig. S1). For years 2000-2018, the number of counties where Grass-Cast and MODIS NDVI were in the same category (below the mean, near the mean, or above the mean) ranged from 55.3% to 89.3% (average 69%) and the percent of counties where Grass-Cast and MODIS NDVI had opposing categories was small, ranging from 0.0% to 4.7% (average 1.4%; Table 1). Heidke Skill Scores based on Grass-Cast comparisons to MODIS NDVI also showed that Grass-Cast was more skilled at forecasting drought-induced declines in ANPP (Table 1; Appendix S4: Fig. S1).
Grass-Cast is one of several tools available for ranchers and land managers. The U.S. Drought Monitor (https://droughtmonitor.unl.edu/) and the South Dakota Drought Tool (https://www.nrc s.usda.gov/wps/portal/nrcs/main/sd/technical/la nduse/pasture/) provide assessments of drought intensity and duration instead of forecasts of grassland productivity, and the latter has a limited range within the Great Plains. Grass-Cast and the U.S. Drought Monitor are non-interactive, providing static maps for their assessments, while the South Dakota Drought Tool allows users to develop a customized drought contingency plan for livestock operations. The Aussie GRASS approach is fundamentally different to Cast ANPP forecast, paired with the maps of percent difference in cumulative June-July MODIS NDVI relative to the mean (2000-2017) June-July cumulative NDVI at the county level for years 2011-2014. Red (<−15%), pink (−15% to −5%); white (−5% to + 5%); light blue (+5% to +15%); dark blue (>+15%).
( Fig. 6. Continued) v www.esajournals.org that employed in Grass-Cast, which is far less input intensive, relying solely on estimates of accumulated NDVI for the calculation of near real-time pasture production. There is no way to provide direct comparisons among these tools since they vary in their application and spatial extents.
Grass-Cast outputs, like those based on the national-or county-scale monitoring and forecast tools, must be carefully interpreted at more local scales, particularly for decisions related to setting stocking rates, determining turnout dates, or other aspects of lease agreements, grazing allotments or grazing permits. In such cases, it is intended that ranchers and public land managers use Grass-Cast and precipitation forecasts as an aid to decision-making, and adjust Grass-Cast's county-level productivity estimates according to their local knowledge of soils, plant communities, topography, and management. For example, Grass-Cast does not directly account for local management practices such as grazing intensity in the previous year and does not differentiate between desirable forage species and undesirable species. Thus, it is important for ranchers and land managers to know what proportion of aboveground biomass is produced by non-forage or undesirable species through the growing season, including how those species respond to rain (or lack thereof) compared to the desirable forage species. Grass-Cast also assumes that these grazing lands are neither irrigated nor fertilized, and such amendments could encourage greater local grassland productivity compared to the largerscale productivity that Grass-Cast considers.
Grass-Cast incorporates in its three precipitation categories weather conditions that have occurred in the prior 36 growing seasons. Within-category uncertainty is represented by the standard deviations in the summary spreadsheet that each Grass-Cast forecast provides along with the set of three maps. For a given county, for example, subtracting two standard deviations from the ANPP estimate of the belownormal category would provide the extreme lowest ANPP estimate for the current growing season. Comparing growing season temperature and precipitation to date to the same metrics from prior years (data not yet provided by Grass-Cast) could provide additional perspectives on how the current growing season may unfold. Long-term trends in ANPP, provided in the historical productivity estimates, can assist land managers in relating current year Grass-Cast forecasts to their local site and experiential knowledge.
In 2019, the Grass-Cast methods described in this paper were applied at an improved spatial resolution (10 km × 10 km) which is more relevant for grassland managers; this resulted in increasing the number of individual spatial areas simulated in Grass-Cast by a factor of 40. This finer spatial scale resulted in heterogeneity of grassland production within counties. However, 2019 results are not presented here since validating Grass-Cast ANPP forecasts against MODIS NDVI at a scale finer than the county level was challenging, primarily because there may be few grassland pixels in any individual grid cell. The availability of frequent (8-d) NDVI measurements offers an opportunity to adapt Grass-Cast to provide closer to real-time ANPP forecasts. Given the low skill of long-term precipitation forecasts, the county-specific regressions between weekly NDVI and growing season mean NDVI are valuable in providing an early growing season production forecast that supplements Grass-Cast. MODIS NDVI data for the most recent 8-d period is available within 10 d following the last day of the 8-d period. Starting in May of each year, shortly after seasonal Grass-Cast forecasts commence, we can use the current 8-d NDVI measurements to predict if growing season NDVI (and therefore grassland production) is likely to be below-normal, normal, or above-normal for the coming growing season. For any county, the confidence in the prediction will be determined by the Pearson's r value for that county for the week being examined. We found strong correlations of mean 8-d MODIS NDVI to cumulative growing season NDVI by mid-April, mid-May, and early June for the southern, central, northern Great Plains, respectively. These results are quite promising and show the potential to use near real-time Google Earth Engine MODIS NDVI to predict growing season vegetation production.
We plan to expand Grass-Cast into the southwestern United States (Arizona and western New Mexico) in 2020, and possibly into grasslands/rangelands of the western United States (e.g., Utah, Nevada, and California) in 2022.
v www.esajournals.org However, due to differences in plant phenology and timing of dry and wet seasons, applying Grass-Cast to other regions will require us to correlate different precipitation-related variables to NDVI, or to adjust the accumulation periods for these variables. For example, the correlation of growing season precipitation to iNDVI is stronger than the correlation of April-July AET to iNDVI in an eastern tallgrass prairie of the Great Plains (Chen et al. 2019). In a California grassland with a rainfall pattern typical of a Mediterranean climate where precipitation begins in autumn, extends through winter, peaks in spring, and ends in summer, annual AET calculated on hydrologic year (1 October-30 September) is well-correlated with ANPP (Asao et al. 2018). In the southwest United States, where grassland productivity is influenced by both winter-time precipitation and monsoonal moisture in mid-to late summer, grassland ANPP is correlated with January-September precipitation and with June-September precipitation (Martin andSeverson 1998, Khumalo andHolechek 2005). Additionally, Grass-Cast methods could be adapted to predict production of perennial biofuel grasses in the Great Plains and other regions.

CONCLUSIONS
Grass-Cast uses observed weather data, scenarios of growing season precipitation, and correlations of simulated AET to more than 30 yr of satellite-derived NDVI to produce visually appealing and informative maps of forecasted ANPP for the Great Plains of the USA. Other predictive tools developed for the USA can address drought or are applied to limited spatial scales, rather than forecast grassland productivity for the entire Great Plains region. A retrospective evaluation of Grass-Cast results compared to MODIS NDVI over the past two decades showed that the two agreed 69% of the time and that Grass-Cast was skilled at forecasting occurrences of below-average grassland productivity, including those associated with flash droughts. By late May or early June, the ANPP forecasts have >50% correlation to the final 31 July forecast. The correlations using 8-d Google Earth Engine MODIS NDVI suggest that a spring forecast of cumulative growing season NDVI and ANPP could be made as early as mid-April in the Southern Great Plains and by mid-May to early June for the Central and Northern Great Plains. However, Grass-Cast cannot indicate the difference between desirable and undesirable forage species so ranchers and land managers using Grass-Cast will need to incorporate local knowledge on plant communities and precipitation forecasts to make adaptive management decisions during the growing season. Grass-Cast methods may be adapted to expand grassland productivity forecasts to rangelands beyond the U.S. Great Plains or to other perennial grasses such as bioenergy crops.