Sophia Walther✉ 0000-0003-1681-9304
Department Biogeochemical Integration, Max-Planck-Institute for Biogeochemistry
Martin Jung
Department Biogeochemical Integration, Max-Planck-Institute for Biogeochemistry
✉ — Correspondence possible via GitHub Issues
or email to
Jacob A Nelson <jnelson@bgc-jena.mpg.de>,
Sophia Walther <swalth@bgc-jena.mpg.de>.
Abstract
Estimation of global carbon and water fluxes via up-scaling from in-situ eddy covariance measurements is a key method for diagnosing the Earth system from a data-driven perspective. We describe the first global products (called X-BASE) from a new up-scaling framework, FLUXCOM-X. The X-BASE products integrate the most recent eddy covariance data, providing improvements in both bioclimatic coverage and data quality from over 12 million high-quality site-hours. X-BASE estimates global net ecosystem exchange at -6.4 \(Pg \, C \cdot yr^{-1}\), which represents a marked change from previous FLUXCOM versions and compares considerably better with independent estimates. This was only possible thanks to the international effort to improve the precision and consistency of eddy covariance collection and processing pipelines. In addition to net ecosystem exchange, X-BASE comprises estimates of several terrestrial carbon and water fluxes, including a novel, fully data-driven global transpiration product, at a higher spatial (0.05°) and temporal (hourly) resolution. Despite considerable improvements over the previous up-scaling products, many further opportunities for innovation still exist. The new FLUXCOM-X framework was specifically designed to have the necessary flexibility to explore, diagnose, and converge to more reliable products.
Introduction
Energy, water, and carbon fluxes between terrestrial surfaces and the atmosphere are key components of the Earth system with implications for weather, climate, water availability, and impacts on ecosystem services. Eddy covariance (EC) towers provide observations of fluxes at the ecosystem scale covering diurnal up to decadal variations [1]. However, as EC measurements are confined to individual locations in space and limited periods in time, broader analysis of regional and global patterns requires the coordination and consolidation of EC measurements in networks of EC towers and ultimately their up-scaling to the continental and global scales.
The basic flux up-scaling concept links predictor measurements at the tower level, particularly meteorological data or remote sensing, with corresponding globally gridded products via a machine learning model which is trained on the flux of interest at tower level and predicted globally based on the gridded input data. Early approaches focused on net ecosystem exchange of carbon (NEE) and utilized the growing flux networks in Europe [2] and North America [3], with other regions following shortly thereafter [4,5,6]. The release of the La Thuile Synthesis Dataset of harmonized EC data, as well as methodological improvements [7], led to the first global products of NEE at 0.5° spatial resolution and monthly temporal time step [8]. While good agreement with global GPP and energy fluxes [8] demonstrated the potential of the approach, remaining inconsistencies, in particular the difference between the global NEE compared to independent approaches, demonstrated a need to understand the key sources of uncertainty [9].
In an effort to better understand the uncertainties associated with up-scaling flux products, the FLUXCOM inter-comparison initiative built an ensemble of flux estimates as a type of factorial experiment. The ensemble consisted of multiple machine learning algorithms, data processing methods, model structures, and meteorological forcing data, resulting in 120 individual members. These were summarized in two key ensemble configurations (RS and RS+METEO), which differed in the set of predictor variables and in spatial-temporal resolution. Apart from the ensemble, the FLUXCOM evaluation included a consistent site-level cross-validation analysis as well as cross-consistency checks with independent global data streams. From a methodological point of view, the key insights from FLUXCOM were that: (1) the overall approach seems to be limited primarily by the input information given to the machine learning algorithms rather than by the ability of the algorithms to extract that information; (2) the largest qualitative differences among flux products were found between the two configurations; (3) the largest qualitative discrepancy with independent data was an unrealistically large tropical carbon sink that was shared among all ensemble members; and (4) cross-consistency checks with global independent data are essential for evaluation in addition to site-level cross-validation.
Acting on these lessons means enhancing the information content of the training data, both in the coverage and quality of EC measurements and in the quality, complementarity, and completeness of predictor variables. This, in turn, requires: the flexibility to explore a large methodological space related to data treatment, ingestion, and methodological configurations; integration with the in situ data collection networks and processing pathways; and the ability to assess and monitor progress for the global products of related experiments in parallel to site-level cross-validation. We term this path FLUXCOM-X, and here we report on first progress by presenting and evaluating the first set of products using this pathway, which we refer to as FLUXCOM-X-BASE (or X-BASE for short).
X-BASE products were generated based on the same principle as in the original FLUXCOM using qualitatively similar predictor variables, i.e. remotely sensed vegetation indices and land surface temperatures from MODIS along with meteorological variables. Conceptually, it differs from FLUXCOM in unifying the previous RS and RS+METEO configurations by using concurrent and inter-annually varying satellite observations together with meteorological information. We made efforts to increase the information content of the training data in X-BASE by learning on more than 12 million hourly flux observations with improved coverage and quality, and by improving the processing of satellite predictor variables [10]. As a key innovation we are producing X-BASE products at 0.05° spatial and hourly temporal resolution globally. Figure 1 illustrates the increase in spatial and temporal detail in X-BASE compared to RS (0.083°, 8-daily) and RS+METEO (0.5°, daily) using the example of net ecosystem exchange (NEE). In this manuscript, we show results for X-BASE NEE, gross primary productivity (GPP), evapotranspiration (\(ET\)), and for the first time transpiration (\(ET_T\)) for the period 2001-2021.
Since X-BASE serves as a baseline for future FLUXCOM-X developments, we focus here on the evaluation and cross-consistency checks of X-BASE with previous FLUXCOM products and independent data streams. Our specific objectives are: (1) to describe the production of X-BASE products; (2) to evaluate the X-BASE setup using site-level cross-validation; (3) to assess qualitative differences of global patterns compared to previous FLUXCOM products with reference to independent data where possible; and (4) to synthesize lessons learned from this exercise to guide future FLUXCOM-X developments.
Data and Methods
Eddy Covariance
All EC data were based on fully processed output from the ONEFlux data processing pipeline [11] for the period 2001-2020 and available under a CC BY 4.0 license. Based on these criteria, data for each site came from one of five different sources, chosen by most recent availability: FLUXNET 2015 [11], ICOS Drought 2018 [12], ICOS Warm Winter 2020 [13], or the most recent Ameriflux or ICOS release. Table 1 lists all sites included, together with the digital object identifier of the associated release. Meteorological data consisted of incoming shortwave radiation, air temperature, and vapor pressure deficit, all gap-filled using the Marginal Distribution Sampling method [14], as well as the computed potential incoming shortwave radiation for every hour. Carbon flux data consisted of gap-filled net ecosystem exchange (NEE, variable ustar threshold 50) and the corresponding gross primary productivity (GPP, nighttime partitioning method [14]). Water flux data consisted of evapotranspiration (ET, no energy balance correction), converted from the latent energy flux, and transpiration estimates based on the TEA algorithm [15,16]. All data were aggregated to a common hourly time resolution.
The machine learning models were trained only on hours where all targets and predictors passed quality control. The quality control procedure consisted of two levels. First, each hour had to contain at least one value that was either measured with good quality or gap-filled with high confidence (i.e. at least one half hour had a ONEFlux _QC flag of 0 or 1). Second, an empirical plausibility test removed entire days and/or entire site variables if the relationship of a daily aggregated variable with other site variables strongly deviated from conceptual understanding (Jung et al., 2023). The amount of data included in the training dataset varied between ~12 and 14 million site-hours depending on the target variable.
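The first quality-control level can be illustrated with a short sketch. It assumes that each hour is built from two half-hourly records whose quality flag is 0 for measured and 1 for confidently gap-filled values; the function names and the data layout are hypothetical and are not taken from the X-BASE code.

```python
def hour_passes_qc(half_hour_flags, good_flags=(0, 1)):
    """Return True if at least one half-hourly flag within the hour is good."""
    return any(flag in good_flags for flag in half_hour_flags)


def usable_hours(qc_by_variable):
    """Flag the hours in which every target and predictor passes the check.

    qc_by_variable maps a variable name to a list of hours, each hour being
    a tuple holding its two half-hourly quality flags (assumed layout).
    """
    n_hours = len(next(iter(qc_by_variable.values())))
    return [
        all(hour_passes_qc(flags[h]) for flags in qc_by_variable.values())
        for h in range(n_hours)
    ]


# Hypothetical flags for three hours of one target and one predictor.
example_flags = {
    "NEE": [(0, 0), (1, 3), (2, 3)],
    "TA": [(0, 1), (0, 0), (0, 0)],
}
mask = usable_hours(example_flags)
```

The second level, the empirical plausibility test, depends on relationships between daily aggregated variables and is not covered by this sketch.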
Global meteorological data used as the corresponding predictor to each site level meteorological variable was derived from ERA5 global reanalysis products [236]. Units were converted to correspond to the site level measurements and the data was re-gridded to a 0.05° resolution using bilinear interpolation for every hour.
Satellite Earth Observation
The exploitation of satellite Earth observations (EO) is constrained to variables available both in cutouts over EC stations and globally. FLUXCOM-X is set up to facilitate the flexible ingestion of a variety of EO data streams. The X-BASE products, however, exclusively use measurements of surface reflectance and land surface temperature from the MODerate resolution Imaging Spectroradiometer (MODIS).
Spectral vegetation indices
At site level we used surface reflectance in the first seven MODIS spectral bands from the MCD43A4 v006 reflectance data set (500 m and daily, where each day represents an average over all valid observations within a 16-day window [237]). The spectral vegetation indices EVI [238], NIRv [239], and NDWI with MODIS band 7 as reference [240] were computed from the reflectance data. We followed the procedure of the FluxnetEO data sets version 2 [10] for data acquisition from Google Earth Engine for all pixels in a cutout of 4 × 4 km around each EC station, as well as for quality checks in terms of snow cover, land cover, index values outside the defined ranges, and outliers. An iterative approach then determines both the set of pixels in a cutout that represents a given EC station and the strictness applied to the inversion quality of the bidirectional reflectance distribution function (BRDF, based on the MCD43A2 data [241]). The approach trades off data quality against data quantity, at the cost of inconsistent cutout sizes and BRDF inversion quality among sites. All technical details of the dynamic procedure are outlined in the supporting information section 0.7.1.1.
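As an illustration of the index computation, the sketch below uses the commonly published formulations of EVI, NIRv, and a band-7 NDWI. It assumes the standard MODIS band assignment (band 1 red, band 2 near-infrared, band 3 blue, band 7 shortwave infrared) and reflectances scaled to 0-1; the exact variants used for X-BASE are those given in [238,239,240] and in FluxnetEO [10], and the function names here are hypothetical.

```python
def evi(nir, red, blue):
    """Enhanced vegetation index with the standard MODIS coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def nirv(nir, red):
    """Near-infrared reflectance of vegetation, taken here as NDVI times NIR."""
    ndvi = (nir - red) / (nir + red)
    return ndvi * nir


def ndwi_band7(nir, swir7):
    """Normalized difference water index with MODIS band 7 as reference."""
    return (nir - swir7) / (nir + swir7)


# Hypothetical reflectances for a single vegetated pixel.
red, nir, blue, swir7 = 0.05, 0.40, 0.03, 0.10
indices = {
    "EVI": evi(nir, red, blue),
    "NIRv": nirv(nir, red),
    "NDWI": ndwi_band7(nir, swir7),
}
```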
Global data of BRDF-corrected surface reflectance stem from the MCD43C4 v006 data [237], available in a climate modelling grid of 0.05° with the same temporal sampling and subject to the same removal of snow and water pixels and outlier values as at site level. The BRDF quality control of the global data followed the same dynamic approach (see supporting information 0.7.1.1), which maximizes data availability, especially in tropical regions.
Values in data gaps were estimated consistently in both the average time series per EC station and in the global gridded data following the procedures of the FluxnetEO data version 2 [10].
Land surface temperature
Satellite observations of land surface temperature (LST) were based on the MODIS v006 TERRA morning and evening observation products, which are available every day at 1 km [242]. We selected the 1 km pixel containing a specific tower and treated the two MODIS LST data streams as independent predictor variables which represent clear-sky LST at a specific time of the day. Quality checks and gap-filling followed the procedure described in FluxnetEO version 2 [10].
For the global spatialization of the flux estimates we rely on climate modelling grid LST from the MODIS TERRA data sets [242] and apply the same quality control and imputation of missing values as at site level.
Land cover
Land cover information used the IGBP global vegetation classification. Site-level classification was as reported by the principal investigators. Global data were based on the MODIS MCD12Q1 product [243]. In order to ease the transition between site and global land cover classifications, an intermediate classification scheme was used which translated each classification into characteristics (e.g. trees, crops, needleleaf, deciduous, etc.) based on whether the classification has (value=1.0), might have (value=0.5), or does not have (value=0.0) a specific feature, or whether the feature is unknown (value=-1.0). A full description of this intermediate classification system can be found in Supplementary Section XX.
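The intermediate scheme can be pictured as a lookup from an IGBP class to a vector of feature values. In the sketch below only the four value codes come from the text; the feature list is an excerpt of the examples named above, and the per-class entries are hypothetical placeholders chosen for illustration, not the actual X-BASE table.

```python
HAS, MIGHT_HAVE, HAS_NOT, UNKNOWN = 1.0, 0.5, 0.0, -1.0

# Excerpt of characteristics named in the text.
FEATURES = ("trees", "crops", "needleleaf", "deciduous")

# Hypothetical entries for four IGBP classes, for illustration only.
IGBP_TO_FEATURES = {
    "ENF": {"trees": HAS, "crops": HAS_NOT, "needleleaf": HAS, "deciduous": HAS_NOT},
    "DBF": {"trees": HAS, "crops": HAS_NOT, "needleleaf": HAS_NOT, "deciduous": HAS},
    "MF": {"trees": HAS, "crops": HAS_NOT, "needleleaf": MIGHT_HAVE, "deciduous": MIGHT_HAVE},
    "CRO": {"trees": HAS_NOT, "crops": HAS, "needleleaf": HAS_NOT, "deciduous": MIGHT_HAVE},
}


def encode_land_cover(igbp_class):
    """Translate an IGBP class into feature values; unmapped classes are unknown."""
    row = IGBP_TO_FEATURES.get(igbp_class)
    if row is None:
        return [UNKNOWN] * len(FEATURES)
    return [row[name] for name in FEATURES]
```

Encoding classes as shared characteristics rather than as one category per class lets a site-level label and a MODIS-derived label that differ in name still agree on most predictor values.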
Machine Learning Method
While FLUXCOM-X is set up to work flexibly with a variety of machine learning algorithms, all X-BASE products are based on gradient boosted regression trees using the XGBoost library [244]. XGBoost is known as a robust algorithm that is able to handle a variety of variable types (numeric, boolean, categorical). Training was conducted using a two-thirds training subsampling ratio and a learning rate of 0.05. Boosting was stopped once no model improvement (based on mean squared error) was seen in ten consecutive rounds, and the best-performing model was stored to generate predictions. The final number of rounds was between 80 and 230 depending on the flux.
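The stopping rule can be sketched independently of the library. The helper below is a hypothetical illustration that takes a sequence of validation mean squared errors, one per boosting round, and returns the round that would be kept and the round at which boosting would stop. In XGBoost itself the reported settings would presumably map to the `subsample` and `eta` parameters and to the `early_stopping_rounds` argument, but the exact configuration is not given in the text.

```python
def early_stopping_round(validation_mse, patience=10):
    """Return (best_round, stop_round) for per-round validation errors.

    Boosting stops once `patience` consecutive rounds bring no improvement
    over the best error seen so far; rounds are counted from zero.
    """
    best_round, best_mse = 0, float("inf")
    for i, mse in enumerate(validation_mse):
        if mse < best_mse:
            best_round, best_mse = i, mse
        elif i - best_round >= patience:
            return best_round, i
    # The sequence ended before the patience was exhausted.
    return best_round, len(validation_mse) - 1


# Hypothetical error curve: improvement for five rounds, then a plateau.
errors = [1.0, 0.8, 0.6, 0.5, 0.45] + [0.46] * 12
best, stopped = early_stopping_round(errors)
```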
Cross-validation
We split all sites with available good-quality predictor and flux data into ten folds for cross-validation, of which we used eight folds for training, one for validation, and the remaining one as the test fold on which the actual predictions were made. The folds were iterated such that each site was in the test set once. Two sites were assigned to the same fold if they were less than 0.05° apart, to reduce overfitting.
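A minimal sketch of such a spatially grouped split is shown below. It makes several assumptions that the text does not specify: distance is measured as the Euclidean distance in latitude-longitude degrees, groups of nearby sites are distributed over folds in round-robin order, and the validation fold is simply the fold following the test fold. All function names are hypothetical.

```python
import math


def group_nearby_sites(coords, min_distance=0.05):
    """Cluster sites closer than min_distance (degrees) using union-find."""
    names = list(coords)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(coords[a], coords[b]) < min_distance:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return list(groups.values())


def assign_folds(coords, n_folds=10):
    """Assign whole groups of nearby sites to folds in round-robin order."""
    fold_of = {}
    for k, group in enumerate(group_nearby_sites(coords)):
        for name in group:
            fold_of[name] = k % n_folds
    return fold_of


def split_roles(test_fold, n_folds=10):
    """Return (training folds, validation fold, test fold) for one iteration."""
    validation_fold = (test_fold + 1) % n_folds
    training = [f for f in range(n_folds) if f not in (test_fold, validation_fold)]
    return training, validation_fold, test_fold


# Hypothetical sites given as (latitude, longitude); A and B are neighbours.
sites = {"A": (50.00, 10.00), "B": (50.02, 10.01), "C": (45.00, 8.00)}
fold_of = assign_folds(sites)
```

Iterating `split_roles` over all ten test folds places every site in the test set exactly once, with eight folds left for training in each iteration.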
Upscaling
To train a model for the final upscaling to the global scale, we used nine of the ten folds for training and the remaining fold for validation. One model per flux was optimized for flux estimates of the whole globe, i.e. the training sites were not split according to plant functional type or similar; the only criterion was a minimum distance of 0.05° between any site in the training set and any site in the validation set.
Independent global flux estimates
Comparisons to FLUXCOM RS+METEO datasets always refer to the ensemble over multiple machine learning methods for all realizations driven by the ERA5 meteorology [236]. RS+METEO uses average seasonal cycles of MODIS v005 observations, and has a native daily resolution with 0.5° pixels. For the FLUXCOM RS set-up we use the ensemble over all machine learning methods at 0.083° every 8 days. Please note that both the previous RS runs and the X-BASE runs presented here are driven by data from MODIS v006, but the processing in terms of quality control and gap-filling has changed.
For evaluating X-BASE NEE globally, in particular its seasonal cycle and for different regions, we used estimates from the OCO-2 v10 model intercomparison project [245] consisting of 13 different ensemble members covering the period 2016-2020 with monthly frequency and 1° spatial resolution (https://gml.noaa.gov/ccgg/OCO2_v10mip/index.php). We used the LNLGISS experiment, which combines satellite-based XCO2 and station-based in-situ measurements as observational constraints in the assimilation. For comparisons of inter-annual variability, we also used the CarboScope inversion [246] version s99oc_v2022 [247] over the period from 2001 to 2020.
We compare temporal patterns of X-BASE GPP with global retrievals of sun-induced chlorophyll fluorescence (SIF) from the Sentinel-5P TROPOMI instrument [248]. For the comparison we average to a temporal resolution of 16 days and a spatial resolution of 0.5° for the common period 04/2018-12/2020.
X-BASE \(ET\) and \(ET_T\) were cross-compared with the corresponding evapotranspiration and transpiration estimates from the complementary GLEAM data sets v3.6a [249,250].
Results
Cross-validation and data space
One important innovation in FLUXCOM-X compared to the previous FLUXCOM ensemble is the extended training database, which shows an improved coverage of the environmental space. Taking daily NEE as an example, the distribution of training samples is considerably extended across the space spanned by VPD and incoming shortwave radiation in X-BASE compared to the FLUXCOM ensemble (Fig. SI 3; the training data were the same for both the RS and RS+METEO versions). Furthermore, the number of unique sites contributing to a given VPD-radiation bin has increased (Fig. 2). Sampling has improved in particular in the margins of the distribution, i.e. for high VPD along the full radiation spectrum, and vice versa for high radiation conditions along the full VPD spectrum. Remarkably, the number of sites contributing training samples for high VPD and high radiation has increased most, promising more, and more varied, information for dry conditions.
The results from the ten-fold cross-validation show an overall high performance, with most fluxes and scales of variability reaching a Nash-Sutcliffe model efficiency (NSE) above 0.6 (Fig. 3). In terms of scales of variability across all fluxes, the monthly mean diurnal cycle (“diurnal”) and the daily median seasonal cycle (“seasonal”) are very regular patterns that the trained models reproduce best. Between-site changes (“spatial”) and monthly aggregated fluxes (“monthly”) are also reliably predicted. Deviations from the median daily seasonality (“anom”) are only moderately reliable, with NSE between 0.3 and 0.6. The XGBoost models do not succeed in accurately reproducing between-year changes (“iav”). Consistently across all scales, the component fluxes (i.e. GPP and \(ET_{t}\)) show higher performance than their respective net flux (i.e. NEE and ET).
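For reference, the performance metric can be written out in a few lines. The sketch assumes that NSE denotes the Nash-Sutcliffe model efficiency in its usual form, one minus the ratio of the squared prediction error to the variance of the observations; the function name is hypothetical.

```python
def nash_sutcliffe_efficiency(observed, predicted):
    """NSE = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance


# Hypothetical observations and predictions.
obs = [1.0, 2.0, 3.0, 4.0]
nse_model = nash_sutcliffe_efficiency(obs, [1.5, 2.5, 2.5, 3.5])
nse_mean = nash_sutcliffe_efficiency(obs, [2.5, 2.5, 2.5, 2.5])
```

A value of 1 indicates a perfect prediction, while a value of 0 means the model is no better than predicting the mean of the observations, which is why the low values for “iav” indicate little skill at that scale.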
Note that the cross validation results from Figure 3 cannot be directly compared to previous cross validation results as the feature set and training data are not the same. However, qualitatively the accuracy gradient among fluxes as well as along scales of variability correspond to patterns identified previously in FLUXCOM and comparable diagnostic modeling activities, and relate to the magnitude of fluxes to be reproduced and the suitability and completeness of predictor variables for each flux [6,8,251,252].
Global flux estimates
Net Ecosystem Exchange (NEE)
Compared to both the FLUXCOM RS and RS+METEO products, X-BASE shows a much more realistic globally integrated NEE of -6.4 PgC/yr, primarily due to a substantially smaller tropical sink (Fig. 4). In the X-BASE products, large parts of the Amazon appear as approximately carbon neutral while tropical regions in Africa and southeast Asia show a sink. In contrast to both RS and RS+METEO, India and some regions in central Sahel show prominent patterns of a mean carbon source in X-BASE, corresponding mostly to crop designated areas (Fig. SI 4).
Comparison with OCO-2 and CarboScope inversions indicates a substantial improvement of the global mean seasonal cycle of NEE (Fig. 5) in X-BASE compared to RS and RS+METEO. The systematic bias present in RS and RS+METEO has essentially disappeared in X-BASE. The shape, and in particular the amplitude, of the global NEE seasonal cycle of X-BASE is more consistent with the inversions. The larger and more realistic seasonal cycle amplitude of global NEE in X-BASE originates primarily from improved and increased amplitudes in boreal regions. Interestingly, X-BASE suggests slightly larger NEE seasonal cycle amplitudes in temperate regions compared to the inversions. In seasonally dry regions, the timing of maximum uptake is consistent between X-BASE and inversions, while the peak of maximum release is larger and delayed in the inversions. In Australia, the peak of \(CO_2\) source to the atmosphere at the end of the year present in both inversions is not evident in X-BASE, which instead shows a relatively consistent \(CO_2\) source throughout the year. In tropical regions, the patterns of seasonal variations are qualitatively consistent between X-BASE and the previous RS and RS+METEO products. The seasonal patterns in tropical regions are relatively weak overall and seem inconsistent both between the inversions and X-BASE as well as among the inversions.
As seen in Figure 6, the X-BASE product shows the same large underestimation of globally integrated NEE inter-annual variance as the previous RS and RS+METEO products. Furthermore, the inter-annual variability exhibited by X-BASE has a relatively low correlation with the CarboScope inversions. In terms of temporal trends, the X-BASE products show almost no change in annual NEE over time, which contrasts with RS+METEO (slight positive trend) and RS (slight negative trend) and is more consistent with the CarboScope inversions (Table 2). However, as inter-annual variability was poorly reproduced even in the cross-validation (Fig. 3), trends in the X-BASE products should be interpreted with caution.
Table 2: Interannual variability of NEE. The column corr. is the Pearson correlation with CarboScope, linear trend is the linear trend in time (per year), and std. is the standard deviation after the trend is removed.

| | corr. | linear trend | std. |
|---|---|---|---|
| CarboScope | 1.000 | 0.007 | 0.828 |
| X-BASE | 0.316 | 0.017 | 0.313 |
| RS+METEO | 0.310 | 0.093 | 0.225 |
| RS | 0.314 | -0.126 | 0.550 |
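The three statistics reported for the inter-annual variability of NEE can be computed from a series of annual values as sketched below. The sketch assumes an ordinary least-squares trend against the year index and a sample standard deviation of the residuals; whether the reported correlation was computed before or after detrending is not stated in the text, so the Pearson helper simply operates on whatever series it is given. All names are hypothetical.

```python
import math


def linear_trend(values):
    """Least-squares slope and intercept against the index 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2.0
    v_mean = sum(values) / n
    covariance = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    t_variance = sum((t - t_mean) ** 2 for t in range(n))
    slope = covariance / t_variance
    return slope, v_mean - slope * t_mean


def detrended_std(values):
    """Sample standard deviation of the residuals around the linear trend."""
    slope, intercept = linear_trend(values)
    residuals = [v - (intercept + slope * t) for t, v in enumerate(values)]
    return math.sqrt(sum(r ** 2 for r in residuals) / (len(residuals) - 1))


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mx) * (b - my) for a, b in zip(x, y))
    x_var = sum((a - mx) ** 2 for a in x)
    y_var = sum((b - my) ** 2 for b in y)
    return covariance / math.sqrt(x_var * y_var)
```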
Gross Primary Productivity (GPP)
In terms of magnitude, the globally integrated GPP estimated by X-BASE (123 \(Pg C \cdot yr^{-1}\)) is slightly higher than RS+METEO (119 \(Pg C \cdot yr^{-1}\)) and considerably higher than RS (110 \(Pg C \cdot yr^{-1}\)) over the period 2002-2020. In terms of regional patterns, X-BASE GPP consistently exceeds both RS+METEO and RS in temperate, boreal, and most subtropical ecosystems, while the reverse holds in sparsely vegetated (semi-)arid regions like southwestern North America as well as in southeast Asian crop lands (Fig. 7). This qualitatively consistent pattern is broken only in the humid tropics, where X-BASE GPP is higher than RS but lower than RS+METEO.
Comparing the estimated trend over the last two decades, X-BASE GPP has a clear increasing linear trend of 0.35 \(Pg C \cdot yr^{-1} \cdot yr^{-1}\) which is slightly higher than the trend in RS (Table 3). In contrast, the RS+METEO product shows nearly no trend in annual GPP. The increases in both the X-BASE and RS products may be related to increases in surface greenness coming from variability in the remote sensing forcing data which are inter-annually dynamic in both products, whereas the remote sensing data were not inter-annually dynamic in the RS+METEO product which instead used only the mean seasonal cycle of the remote sensing data.
Table 3: Interannual variability of GPP. The column std. is the standard deviation after the trend is removed in \(Pg C \cdot yr^{-1}\), and linear trend is the trend in \(Pg C \cdot yr^{-1} \cdot yr^{-1}\).

| | std. | linear trend |
|---|---|---|
| X-BASE | 0.590 | 0.346 |
| RS+METEO | 0.241 | -0.052 |
| RS | 1.001 | 0.242 |
Comparing the temporal variability in GPP estimates against TROPOMI SIF as an independent proxy for GPP (Figure 8) shows that the variability of X-BASE GPP strongly agrees with that in TROPOMI SIF, with squared Spearman correlation values above 0.85 across most of the vegetated land surface (Fig. 8 top left). The only exceptions are regions with no or very small variability in both GPP and SIF, such as evergreen tropical ecosystems in South America, Africa, and southeast Asia, or areas that are sparsely vegetated to non-vegetated due to aridity (e.g. inner Australia and the Mexican and African deserts) or cold conditions (e.g. Canadian and Siberian subpolar regions). \(R^2\) values for the deviations from the average seasonality show the same qualitative spatial patterns (Fig. 8 top right), but are overall lower, with \(R^2\) values between 0.55 and 0.8. Anomalies of X-BASE GPP and SIF agree best in eastern European temperate forests as well as in grassy and shrub ecosystems in eastern South America. Overall, we find the most notable declines in \(R^2\) compared to the actual time series in the southwestern Amazon, in large parts of India, and in central Siberia.
Comparing the level of agreement between SIF and X-BASE with that between SIF and RS and RS+METEO, respectively, illustrates that X-BASE and RS GPP estimates have comparable accuracy both for the time series (global weighted mean \(R^2\) values of 0.72 and 0.73, respectively) and for the anomalies (global mean \(R^2\) values of 0.64 and 0.66, respectively), while the \(R^2\) between RS+METEO and SIF is lower in both cases (\(R^2\) values of 0.66 for the time series and 0.58 for anomalies). This is in contrast to the findings in [253], where RS+METEO agreed better with GOME2 SIF than RS. In that study, the average seasonality was compared at a monthly scale, whereas here we compare the actual temporal trajectory at a 16-daily scale. In both cases, however, the common data period comprised less than three years, resulting in limited representativeness overall. In addition, GOME2 and TROPOMI are affected by data quality issues to a different extent, e.g. related to cloud shielding or viewing geometry. X-BASE GPP shows a higher agreement with SIF than RS, both in terms of the actual trajectory and the anomalies, in evergreen tropical forests with no or only a very short dry season in the Amazon and Africa, as well as in fully humid parts of southeast Asia (Fig. 8 middle panel). Improvements in X-BASE GPP compared to RS are also consistent in the very continental and polar tundra areas in eastern Siberia, northern Canada, and Alaska. Conversely, in arid steppe climates globally, X-BASE GPP variability is consistently and considerably less accurate than in RS.
Compared to RS+METEO, improvements in the captured variability in X-BASE GPP are much more widespread, and most pronounced in arid to semi-arid ecosystems (large parts of the Caatinga and Gran Chaco regions in South America, steppe regions in Mexico, southern and eastern Africa, Australia, and central Siberia) as well as in global crop regions, especially for the deviations from the seasonality (although the magnitude of the \(R^2\) change is quite variable between regions, Fig. 8 bottom).
Water Fluxes
Figure 9 shows the spatial patterns of X-BASE mean annual ET and \(ET_{t}\), as well as the ratio of the two (\(ET_{t}/ET\)). The majority of areas show a dominance of transpiration with the highest \(ET_{t}/ET\) seen in the higher latitude regions of Europe and Asia, as well as in subtropical ecosystems. Arid regions with sparse vegetation show the lowest \(ET_{t}/ET\) overall, with values generally below 20%.
In arid regions with low vegetation (e.g. the Sahara region) the estimated annual ET from X-BASE exceeds annual precipitation (Fig. SI 5), indicating major overestimation in these areas, which is likely due to a lack of eddy covariance data from similar ecosystems. Constraining the X-BASE estimates with precipitation (data not shown) suggests that ET is overestimated by about 4-7 × 10³ km³ of water globally. In contrast, the \(ET_{t}\) estimates from X-BASE do not commonly exceed precipitation estimates. This could indicate that, because transpiration is more tightly coupled with vegetation, the model is able to learn that no vegetation corresponds to no transpiration, which is not generally the case for abiotic evaporation.
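One simple way to quantify such an overestimate is sketched below. Since the constraint actually applied is not described (data not shown), the sketch assumes that annual ET is capped at annual precipitation in every grid cell and that the excess is summed as a volume; the function name and inputs are hypothetical.

```python
def et_excess_volume(et_mm, precip_mm, area_km2):
    """Annual ET volume (km^3) in excess of precipitation, summed over cells.

    et_mm and precip_mm are annual sums in mm per grid cell and area_km2 is
    the cell area; 1 mm of water over 1 km^2 equals 1e-6 km^3.
    """
    excess = 0.0
    for et, precip, area in zip(et_mm, precip_mm, area_km2):
        excess += max(et - precip, 0.0) * area * 1e-6
    return excess


# Two hypothetical cells of one million km^2 each; only the first one
# has ET above precipitation.
excess_km3 = et_excess_volume([300.0, 500.0], [100.0, 600.0], [1e6, 1e6])
```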
In comparing the spatial patterns of differences between X-BASE and both the GLEAM and previous FLUXCOM ensembles (Fig. 10), apart from the general overestimation in arid regions, the X-BASE products show a consistently lower estimate of both ET and \(ET_t\) in the tropical and boreal regions. The pattern of differences is roughly similar across all compared products, with the highest degree of intensity for GLEAM and the lowest for RS+METEO. In the case of transpiration, X-BASE is consistently lower compared to GLEAM.
In comparing total global terrestrial ET (Fig. 11, upper panel), the X-BASE product is most similar to GLEAM, which is lower than RS and slightly higher than RS+METEO; when correcting for the ET overestimation, however, the value is closer to RS+METEO. The values for RS+METEO consist of estimates from the runs with ERA5, which are lower than those reported for the full ensemble in [254] (\(76 \times 10^{3} \, km^{3} \, yr^{-1}\) for the full RS+METEO ensemble compared to \(67 \times 10^{3} \, km^{3} \, yr^{-1}\) for the ERA5 members shown here). All these estimates tend to be higher than reported values from land surface models (TRENDY), which is consistent with other estimates (see references in [254]).
Global \(ET_{t}/ET\) from Fig. 11 (lower panel) shows X-BASE to be slightly lower (57 %) than both GLEAM (70 %) and isotope-based methods (65 %), while agreeing that transpiration is the dominant component of terrestrial ET (i.e. greater than 50 %). Correcting for the overestimation of ET gives slightly higher ratios of between 60 and 63 %, which is more in line with both the isotope-based methods and previous site-level up-scaling approaches [258,259].
Discussion
Key improvements
X-BASE is the first version of global flux estimates produced by FLUXCOM-X, and though the fundamental approach compared to FLUXCOM has not changed, we find considerable improvements to some of the key problems identified in the RS and RS+METEO products from the previous FLUXCOM [253,254]. In terms of technical advancements, the higher spatial (0.05°) and temporal (hourly) resolutions bring a richer information content, and the inclusion of transpiration gives insight into plant controls on hydrology and carbon-water interactions. When compared to the previous FLUXCOM products, the most pronounced improvements are the more consistent magnitude of the mean carbon sink and its average seasonality when compared to independent estimates from atmospheric inversions.
Global means
The improvement in the global mean sink magnitude is likely due to the differences in the eddy covariance based NEE observations used for training. Other differences in methods and setup are unlikely to explain this large qualitative difference because the severely overestimated global sink was consistently present in all ensemble members of the previous FLUXCOM ensemble independently of the predictor set (RS vs RS+METEO), of the meteorological forcing data set, of the temporal resolution (8-daily, daily for FLUXCOM, half-hourly in [260]), and of which machine learning model was used.
FLUXCOM was primarily based on an earlier collection of flux tower data, the La Thuile data set, which applied comparatively loose data quality criteria to maximize coverage. These looser criteria led to the inclusion of flux tower sites that likely show systematic biases in measured NEE, e.g. due to missing storage corrections. As discussed and speculated in [253], obtaining an accurate carbon budget is particularly difficult for some tropical sites, and together with the sparsity of data in the tropics this caused measurement bias to propagate into the FLUXCOM estimates. The lesson learned here emphasizes once more that it is crucial to control for and minimize systematic biases of in-situ eddy covariance measurements. However, the fact that bottom-up global eddy covariance based NEE and estimates from the atmospheric inversions are qualitatively consistent is a major achievement of the FLUXNET community. For context: 1 \(PgC \cdot yr^{-1}\) over the global vegetated area (\(145 \times 10^{6} \, km^2\)) corresponds to ~7 \(gC \cdot m^{-2} \cdot yr^{-1}\). Achieving this level of accuracy in mean NEE is challenging at any one flux tower site, much less across the entire network.
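The conversion behind this context figure can be checked with a short calculation, using only the vegetated area quoted above and standard unit factors; the function name is hypothetical.

```python
def pgc_to_gc_per_m2(flux_pgc_per_year, area_km2):
    """Convert a global flux in PgC per year to gC per square metre per year."""
    grams = flux_pgc_per_year * 1e15  # 1 Pg = 1e15 g
    area_m2 = area_km2 * 1e6          # 1 km^2 = 1e6 m^2
    return grams / area_m2


# 1 PgC per year spread over the global vegetated area of 145e6 km^2.
per_area = pgc_to_gc_per_m2(1.0, 145e6)  # about 6.9 gC per m^2 per year
```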
The current global land sink between 2012-2021 estimated by the global carbon budget amounts to around 3.1±0.6 \(PgC \cdot yr^{-1}\)[261]. The estimate of carbon uptake for the same period from X-BASE (5.9 \(PgC \cdot yr^{-1}\)) is larger, which may be explained by carbon source processes and fluxes that are not captured by the eddy covariance technique and/or measured in the network. The quantification of these secondary source fluxes such as VOCs, land use change and fire emissions, \(CO_2\) evasion from inland water bodies, respiration of crop harvest, etc. is individually very challenging and associated uncertainties are comparatively large.
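For scale, the difference between the two central estimates can be expressed per unit area. This is a back-of-the-envelope calculation that assumes the same global vegetated area of \(145 \times 10^{6} km^2\) as above (i.e. about 6.9 \(gC \cdot m^{-2} \cdot yr^{-1}\) per \(PgC \cdot yr^{-1}\)) and ignores the ±0.6 \(PgC \cdot yr^{-1}\) uncertainty of the budget estimate:

\[
5.9 - 3.1 = 2.8\,PgC \cdot yr^{-1} \approx 2.8 \times 6.9\,gC \cdot m^{-2} \cdot yr^{-1} \approx 19\,gC \cdot m^{-2} \cdot yr^{-1}
\]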
Global water fluxes (\(ET\) and \(ET_{T}\)) showed overall a convergence to ~70,000 \(km^3\) after taking into account the precipitation overestimation, which is similar to the estimates from GLEAM and other past estimates [255,262]. More intriguing are the \(ET_{T}/ET\) ratios of around 60 %, which are consistent with independent assessments both from isotope-based methods [257,263] and past up-scaling estimates [258,259], and higher than most land surface model-based estimates [256,264]. This consistency with more top-down estimates emerges from a fully data-driven, bottom-up approach without being imposed as a constraint.
Temporal patterns
Aside from the mean, X-BASE also shows overall improvements in the NEE seasonality, which likely originate from enhanced information content in the training data set, both in terms of spatio-temporal coverage and due to the hourly instead of daily temporal resolution. The latter likely plays an important role, since [260] found a similar improvement compared to FLUXCOM when training on half-hourly data while using the same underlying flux tower observations. The remaining discrepancies in NEE seasonality between X-BASE and OCO-2 inversions under dry conditions may reflect issues in accounting for interactions between moisture and respiration processes that are due to “memory effects”. For example, [265] showed for Australia that the onset of rain after the dry season triggers a respiration pulse that shapes Australia’s NEE seasonal cycle. Such processes are not reflected in X-BASE and may play important roles in semi-arid regions.
Outstanding issues
Apart from these improvements in the global mean sink and its seasonality, X-BASE products still suffer from several of the limitations identified in the previous versions, in particular the inability to accurately capture inter-annual variability (Fig 6) and the limited confidence in the spatial pattern of global mean NEE (as shown by the cross-validation, Fig 3). The conceptual and practical limitations remain unchanged compared to FLUXCOM (see [253] and [251] for discussion). For example, the regionally distinct patterns of agreement between X-BASE GPP and SIF compared to RS or RS+METEO clearly indicate the importance of including relevant and informative predictors in the model set-up.
A novel aspect of the X-BASE products is that it uses both EO and meteorological predictors, and both are fully dynamic in time.
The former are particularly informative in (semi-)arid ecosystems and crops where we most often find a decreasing order of GPP accuracy (RS > X-BASE > RS+METEO) compared to SIF. Flux responses may partly decouple from the immediate meteorological conditions – in crops due to management, and in arid steppe climates due to the fact that ecosystem functioning is spatially more unique than in temperate regions (as discussed e.g. in [252,266]) and strongly modulated by (deeper) soil moisture supplies [268]. The decoupling between flux response and meteorological conditions may explain the lower correlations for RS+METEO in these regions, especially for GPP anomalies. In addition, X-BASE only uses plant functional types (PFTs) as spatially static predictor variables that may help the models differentiate the more heterogeneous spatial responses in the drier regions. Using PFT alone to denote spatial variability is likely not sufficient, as shown by the lower correlations in X-BASE compared to RS, where in FLUXCOM, a number of static features have been engineered and included to better characterize variability in space. Interestingly, resolving the effects of surface water (in contrast to deeper moisture) through the use of inter-annually dynamic EO predictors instead of seasonal climatologies of EO variables may also be decisive for the consistent improvements in simulated X-BASE GPP in shrubby bog and swamp areas in central Siberia and northern Canada. While fully dynamic EO predictor variables may inform the model in water limited regions, the inclusion of meteorological data is especially informative and necessary for accurate GPP trajectories in evergreen tropical forests, as suggested by the higher agreement with SIF in X-BASE than in RS. In the case of the largely energy limited tropics, the presence of meteorological information likely brings crucial information to the X-BASE model when changes in greenness are largely absent.
Both hypotheses are corroborated by the fact that we do not find similarly strong improvements in the same areas when compared to RS+METEO (Fig. 8 bottom panel). Overall, there are clear indications that underline the importance of relevant and informative features in the model set-up from arid to fully humid ecosystems.
Outlook
By building from the ground up, the FLUXCOM-X framework is designed with the flexibility to mitigate and improve on the current limitations to up-scaling from the site to global scale. FLUXCOM-X allows rapid experimental cycles to explore the importance of key methodological settings and decisions, and to understand and minimize the uncertainties associated with global flux estimates. The flexibility of the new framework opens new possibilities to tackle current issues such as the incomplete data coverage, both in terms of feature space limitations from the EC network and the lack of predictor variables able to capture differentiated responses, for example to drought. Future work can incorporate novel spatial and/or EO predictor variables, as well as methodological developments regarding the joint exploitation of complementary EO data sets with different lifetimes. For X-BASE and all following product versions, the incorporation of new satellite products is an imminent challenge given the recent de-orbiting of the TERRA spacecraft, which has adverse consequences for observational consistency. While there exist new, and potentially better, satellite missions, these new products have less temporal overlap with the majority of available EC measurements, which span the last 30 years. Therefore, the inclusion of the most recently available and best quality EC sites and site-years is also a continuous effort in FLUXCOM-X and the eddy covariance community via FLUXNET. Going forward, FLUXCOM-X can facilitate a “ground up” approach in the most literal sense, bringing the ecological knowledge of experts on the ground directly to global problems. Aside from additional data inputs, the flexibility of the framework allows for development and testing of new machine-learning approaches able to better extract information as well as enforce more physically consistent constraints [269].
All approaches will further mitigate the information limitation in up-scaling which has been confirmed as the major bottleneck to accurate global flux estimates, leading to both more accurate flux estimates and increased understanding of the Earth system.
Data Availability
The data will be available in aggregated versions to ease data handling for common use cases, as well as in a full resolution version. The aggregated versions comprise monthly 0.05° and 0.5°, daily 0.25°, monthly mean diurnal cycle at 0.25°, and are available at the ICOS Carbon Portal under… The full resolution version can be accessed here…
Supplemental Information
Details on processing of Earth Observation Data
Dynamic quality control and cutout size
The conditions in the pixels around a given EC station should best represent the conditions of the land surface in the area where the actual fluxes originate. Given that the actual flux footprints are not generally available or computable for lack of critical information, we assume that the pixel containing the actual EC station (the ‘tower pixel’) is most representative of the dynamics of the area of influence on a tower. However, data availability and quality in the tower pixel are often insufficient. An iterative approach therefore selects both the cutout size and the strictness of the BRDF inversion quality from within defined bounds in a way that maximizes data availability and at the same time ensures representativeness of the spatially averaged time series for the given site.
In more detail, we start with a strict criterion for BRDF inversion quality (BRDF_Albedo_Band_Quality_Bandx flag in MCD43A2 <= 2, meaning only full inversions). Then three options regarding the cutout size are considered:
A. only the tower pixel,
B. those 20% of pixels within 4x4 km² around a tower that are best correlated with the tower pixel are linearly regressed against the tower pixel and subsequently spatially averaged,
C. the 25% of pixels within a 4x4 km² area that are closest to the tower are averaged with the inverse of the distance to the tower as weight.
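The inverse-distance weighting of the last option can be sketched as follows. This is a minimal illustration rather than the processing code used for X-BASE: the function name `idw_average`, the array layout, the handling of missing values, and the `min_distance` floor (which avoids division by zero for a pixel centred on the tower) are all assumptions made for the example.

```python
import numpy as np


def idw_average(values, distances, fraction=0.25, min_distance=1.0):
    """Inverse-distance weighted mean over the nearest `fraction` of pixels.

    values:    array (n_time, n_pixels), NaN marks missing/poor-quality data
    distances: array (n_pixels,), distance of each pixel to the tower
    fraction and min_distance are illustrative parameters (assumptions).
    """
    values = np.asarray(values, dtype=float)
    distances = np.asarray(distances, dtype=float)

    # Keep the pixels closest to the tower (at least one).
    n_keep = max(1, int(round(fraction * distances.size)))
    nearest = np.argsort(distances)[:n_keep]

    # Weight = 1 / distance, with a floor to avoid division by zero.
    weights = 1.0 / np.maximum(distances[nearest], min_distance)

    subset = values[:, nearest]
    valid = np.isfinite(subset)

    # Missing values contribute neither to the sum nor to the weight total.
    weighted_sum = np.where(valid, subset, 0.0) @ weights
    weight_total = valid.astype(float) @ weights

    with np.errstate(invalid="ignore", divide="ignore"):
        return weighted_sum / weight_total  # NaN where no pixel is valid
```

Time steps without any valid pixel stay NaN, so they still count as missing when the number of good quality observations is evaluated.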
The selection between options A-C is based on the number of available good quality observations n in the resulting spatial average time series per site, as follows:
if (n_A >= 60 %) & (n_B <= 70 %):
    select A
elif (n_A >= 60 %) & (n_B >= 70 %):
    select B
elif (n_A < 60 %) & (n_A > 15 %):
    select B
else:
    select C
If, after the previous steps, fewer than 40% of good quality observations outside of snow covered times are available in the resulting average time series for a given site and index, the BRDF inversion quality threshold is relaxed to also allow magnitude inversions (MCD43A2 BRDF inversion quality flag <= 3), and the procedure to select the pixels contributing to the average described above is repeated. Consequently, the size of the area that a MODIS reflectance time series represents varies between sites, and so does the BRDF inversion quality.
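The selection rule and the subsequent relaxation of the quality threshold can be sketched together in a few lines. The sketch below is illustrative only: `select_cutout`, `select_with_relaxation` and the `availability` callable are hypothetical names, and it assumes that the percentage of good quality observations for each option (A, B, C) can be computed for a given maximum quality flag.

```python
from typing import Callable, Dict, Tuple


def select_cutout(n_a: float, n_b: float) -> str:
    """Choose option 'A', 'B' or 'C' from the percentages of good data."""
    if n_a >= 60 and n_b <= 70:
        return "A"
    if n_a >= 60:          # reached only when n_b > 70
        return "B"
    if 15 < n_a < 60:
        return "B"
    return "C"


def select_with_relaxation(
    availability: Callable[[int], Dict[str, float]],
    min_good: float = 40.0,
) -> Tuple[str, int]:
    """Try the strict flag (<= 2) first, then relax to <= 3 if needed.

    availability(max_flag) is a hypothetical helper returning the
    percentage of good observations (outside snow-covered times) for
    each option, e.g. {"A": 55.0, "B": 72.0, "C": 80.0}.
    """
    for max_flag in (2, 3):
        n = availability(max_flag)
        option = select_cutout(n["A"], n["B"])
        if n[option] >= min_good:
            break  # enough data with the current quality threshold
    return option, max_flag
```

For instance, a site with `n_A = 30` and `n_B = 50` under the strict flag would be assigned option B, and the relaxed flag would only be tried if the resulting time series had fewer than 40% good observations.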
For the global gridded MODIS data, the BRDF inversion quality is consistently selected as <=2 or <=3 based on the number of available good quality observations in a pixel.
Additional cross-validation results
Large carbon source in tropical croplands
Comparison of X-BASE to RS+METEO ensemble
Potential overestimation of ET in dryland areas
References
1. How eddy covariance flux measurements have contributed to our understanding of Global Change Biology
Estimation of net ecosystem carbon exchange for the conterminous United States by combining MODIS and AmeriFlux data
Jingfeng Xiao, Qianlai Zhuang, Dennis D Baldocchi, Beverly E Law, Andrew D Richardson, Jiquan Chen, Ram Oren, Gregory Starr, Asko Noormets, Siyan Ma, … Margaret S Torn
New data‐driven estimation of terrestrial \(CO_2\) fluxes in Asia using a standardized database of eddy covariance measurements, remote sensing data, and support vector regression
Kazuhito Ichii, Masahito Ueyama, Masayuki Kondo, Nobuko Saigusa, Joon Kim, MaCarmelita Alberto, Jonas Ardö, Eugénie S Euskirchen, Minseok Kang, Takashi Hirano, … Fenghua Zhao
Statistical upscaling of ecosystem \(CO_2\) fluxes across the terrestrial tundra and boreal domain: Regional patterns and uncertainties
Anna‐Maria Virkkala, Juha Aalto, Brendan M Rogers, Torbern Tagesson, Claire C Treat, Susan M Natali, Jennifer D Watts, Stefano Potter, Aleksi Lehtonen, Marguerite Mauritz, … Miska Luoto
Global patterns of land-atmosphere fluxes of carbon dioxide, latent heat, and sensible heat derived from eddy covariance, satellite, and meteorological observations
Martin Jung, Markus Reichstein, Hank A Margolis, Alessandro Cescatti, Andrew D Richardson, MAltaf Arain, Almut Arneth, Christian Bernhofer, Damien Bonal, Jiquan Chen, … Christopher Williams
Reviews and syntheses: An empirical spatiotemporal description of the global surface–atmosphere carbon fluxes: opportunities and data limitations
Jakob Zscheischler, Miguel D Mahecha, Valerio Avitabile, Leonardo Calle, Nuno Carvalhais, Philippe Ciais, Fabian Gans, Nicolas Gruber, Jens Hartmann, Martin Herold, … Markus Reichstein
On the separation of net ecosystem exchange into assimilation and ecosystem respiration: review and improved algorithm
Markus Reichstein, Eva Falge, Dennis Baldocchi, Dario Papale, Marc Aubinet, Paul Berbigier, Christian Bernhofer, Nina Buchmann, Tagir Gilmanov, Andre Granier, … Riccardo Valentini
Jason Beringer, Lindsay Hutley, David McGuire, Paw U
FluxNet; Monash University; University of California Davis; Charles Darwin University; University of Alaska Fairbanks; University of Melbourne (2016) https://doi.org/gr2s6r
George Vourlitis, Higo Dalmagro, José De S. Nogueira, Mark Johnson, Paulo Arruda
AmeriFlux; California State University, San Marcos; Universidade de Cuiabá; Universidade Federal de Mato Grosso; University of British Columbia (2022) https://doi.org/gr2sz7
Francisco Domingo Poveda, Ana López Ballesteros, Erique Pérez Sánchez Cañete, Penélope Serrano Ortiz, Mª Rosario Moya Jiménez, Oscar Pérez Priego, Andrew S Kowalski
FluxNet; Estación Experimental de Zona Áridas (EEZA, CSIC) (2016) https://doi.org/gr2s69
Julia Boike, Sebastian Westermann, Johannes Lüers, Moritz Langer, Konstanze Piel
FluxNet; University of Oslo, Department of Geosciences, 0316 OSLO, Norway; Universität Bayreuth, Department of Earth Sciences, 95440 Bayreuth, Germany; Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research, Periglacial Research Unit, 14473 Potsdam, Germany (2016) https://doi.org/gr2s9k
National \(CO_2\) budgets (2015–2020) inferred from atmospheric \(CO_2\) observations in support of the global stocktake
Brendan Byrne, David F Baker, Sourish Basu, Michael Bertolacci, Kevin W Bowman, Dustin Carroll, Abhishek Chatterjee, Frédéric Chevallier, Philippe Ciais, Noel Cressie, … Ning Zeng
How does the terrestrial carbon exchange respond to inter-annual climatic variations? A quantification based on atmospheric \(CO_2\) data
Christian Rödenbeck, Sönke Zaehle, Ralph Keeling, Martin Heimann
GLEAM v3: satellite-based land evaporation and root-zone soil moisture
Brecht Martens, Diego G Miralles, Hans Lievens, Robin van der Schalie, Richard AM de Jeu, Diego Fernández-Prieto, Hylke E Beck, Wouter A Dorigo, Niko EC Verhoest
Scaling carbon fluxes from eddy covariance sites to globe: synthesis and evaluation of the FLUXCOM approach
Martin Jung, Christopher Schwalm, Mirco Migliavacca, Sophia Walther, Gustau Camps-Valls, Sujan Koirala, Peter Anthoni, Simon Besnard, Paul Bodesheim, Nuno Carvalhais, … Markus Reichstein
Evaluation of global terrestrial evapotranspiration using state-of-the-art approaches in remote sensing, machine learning and land surface modeling
Shufen Pan, Naiqing Pan, Hanqin Tian, Pierre Friedlingstein, Stephen Sitch, Hao Shi, Vivek K Arora, Vanessa Haverd, Atul K Jain, Etsushi Kato, … Steven W Running
Pierre Friedlingstein, Michael O'Sullivan, Matthew W Jones, Robbie M Andrew, Luke Gregor, Judith Hauck, Corinne Le Quéré, Ingrid T Luijkx, Are Olsen, Glen P Peters, … Bo Zheng
Soil respiration–driven \(CO_2\) pulses dominate Australia’s flux variability
Eva-Marie Metz, Sanam N Vardag, Sourish Basu, Martin Jung, Bernhard Ahrens, Tarek El-Madany, Stephen Sitch, Vivek K Arora, Peter R Briggs, Pierre Friedlingstein, … André Butz
Global distribution of groundwater‐vegetation spatial covariation
Sujan Koirala, Martin Jung, Markus Reichstein, Inge EM de Graaf, Gustau Camps‐Valls, Kazuhito Ichii, Dario Papale, Botond Ráduly, Christopher R Schwalm, Gianluca Tramontana, Nuno Carvalhais