Surface-mass-balance (SMB) and firn-densification (FD) models are widely used in altimetry studies as a tool to separate atmospheric-driven from ice-dynamics-driven ice-sheet mass changes and to partition observed volume changes into ice-mass changes and firn-air-content changes. Until now, SMB models have been principally validated based on comparison with ice core and weather station data or comparison with widely separated radar-survey flight lines. Firn-densification models have been primarily validated based on their ability to match net densification over decades, as recorded in firn cores, and the short-term time-dependent component of densification has rarely been evaluated at all. The advent of systematic ice-sheet-wide repeated ice-surface-height measurements from ICESat-2 (the Ice, Cloud, and land Elevation Satellite-2) allows us to measure the net surface-height change of the Greenland ice sheet at quarterly resolution and compare the measured surface-height differences directly with those predicted by three FD–SMB models: MARv3.11.5 (Modèle Atmosphérique Régional version 3.11.5) and GSFCv1.1 and GSFCv1.2 (the Goddard Space Flight Center FD–SMB models versions 1.1 and 1.2). By segregating the data by season and elevation, and based on the timing and magnitude of modelled processes in areas where we expect minimal ice-dynamics-driven height changes, we investigate the models' accuracy in predicting atmospherically driven height changes. We find that while all three models do well in predicting the large seasonal changes in the low-elevation parts of the ice sheet where melt rates are highest, two of the models (MARv3.11.5 and GSFCv1.1) systematically overpredict, by around a factor of 2, the magnitude of height changes in the high-elevation parts of the ice sheet, particularly those associated with melt events. This overprediction seems to be associated with the melt sensitivity of the models in the high-elevation part of the ice sheet.
The third model, GSFCv1.2, which has an updated high-elevation melt parameterization, avoids this overprediction.
Ice-sheet surface heights vary on timescales from hours (Amory et al., 2021; Lai et al., 2021) to millennia (Khan et al., 2016; NEEM Community Members, 2013). Repeated altimetry measurements can provide estimates of ice-sheet mass changes (Shepherd et al., 2020) and thus their contribution to sea-level change, providing clues to the mechanisms driving mass loss (Smith et al., 2020; Catania et al., 2020) based on spatial patterns and timing of the changes. On an ice sheet in steady state, whose volume and mass are constant in time, snow accumulation and ice ablation at the surface are balanced by ice-flux divergence in the ice–snow column (e.g. thinning of the ice column related to horizontal stretching of the ice) and by snow and firn compaction in the near-surface layers. Any deviation of the rate of one of these processes from its steady-state rate will result in a non-zero rate of surface-height change. We expect to see large variations in the net surface mass balance over the course of a year, and, over most of the ice sheet, we expect to see much slower, annual-to-decadal variations in the rate of ice-flux divergence driven by evolution of the local stress balance of the ice. Thus, even in a part of the ice sheet where the climatological mean surface mass balance exactly compensates for ice flow, we expect to see seasonal surface-height variations. Secular trends in the local net ice-sheet mass balance, such as thickening due to net annual SMB that exceeds the local flux divergence or thinning due to increased ice-flow speeds that are not balanced by additional snowfall, are superimposed on these seasonal signals.
Time series of ice surface height measured by altimeters cannot, by themselves, distinguish between the effects of surface-mass-balance changes and those of variations in ice flow or between surface-height variations caused by changes in the average firn density (e.g. due to imbalances between snowfall and compaction) and those caused by changes in the ice-column mass. This leads to two sets of challenges in the interpretation of altimetry records from the ice sheet: the first is in understanding the relationship between ice volume changes and ice mass changes, which is complicated by variations in near-surface density. The second is in understanding whether ice mass changes are driven by changes in ice dynamics, such as thinning driven by the acceleration of outlet glaciers, or by variability in surface mass balance. These challenges may be addressed in part using surface-mass-balance (SMB) and firn-densification (FD) models. SMB models provide estimates of the variability in accumulation, melt, and runoff, which allow estimates of the contribution of atmospheric processes to ice-sheet mass change. FD models are driven by information about heat and moisture flux variability provided by SMB models and provide estimates of variability in the firn air content (FAC) as a function of time and depth; the difference between the total measured volume change and the total FAC change gives the change in the ice-equivalent volume, which can be converted directly into ice mass change. In some of the most rapidly changing parts of the Greenland ice sheet (i.e. outlet glaciers and the regions immediately upstream), height variations are driven in large part by changes in ice velocity (thus flux divergence rate changes) (Moon et al., 2015). These areas, however, are limited to a zone near the coast extending a few tens of kilometres inland; over the majority of the ice sheet, ice velocity has been relatively constant since the first systematic measurements in the late 1990s.
In the absence of large variations in velocity, most ice-elevation changes should be SMB and FD driven, and all height-change components can, in principle, be described by a combination of FD and SMB anomalies.
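The partition of a measured height change into a firn-air component and an ice-equivalent component can be illustrated with a short calculation. This is a minimal sketch, not the processing used in this study: the function name, the cell size, the example height changes, and the ice density of 917 kg m^-3 are all assumptions chosen for illustration.

```python
RHO_ICE = 917.0  # kg m^-3; assumed density of glacier ice (not given in the text)


def ice_mass_change(dh_measured, dh_fac, area_m2, rho_ice=RHO_ICE):
    """Ice-mass change (kg) of one cell from height changes given in metres.

    dh_measured : total surface-height change seen by the altimeter
    dh_fac      : modelled change in firn air content, as a height
    """
    dh_ice = dh_measured - dh_fac          # ice-equivalent height change (m)
    return dh_ice * area_m2 * rho_ice      # volume change times density


# Hypothetical 1 km x 1 km cell: the surface lowers by 0.50 m,
# of which 0.30 m is modelled loss of firn air.
dm = ice_mass_change(-0.50, -0.30, 1.0e6)
print(f"ice-mass change: {dm:.3e} kg")
```

In this made-up case only 0.20 m of the 0.50 m lowering represents lost ice, which shows how an error in the modelled FAC change maps one-to-one into the inferred ice-equivalent change.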
A variety of models are available that can generate SMB (e.g. Gelaro et al., 2017; Fettweis et al., 2017; Noel et al., 2015) and FD (e.g. Stevens et al., 2020; Brun et al., 1989; Kuipers Munneke et al., 2015) estimates for Greenland and Antarctica, each with differing temporal and spatial resolutions, with different internal representations of the physical processes driving SMB and firn densification, and driven by different climate-forcing data. Some processes within SMB models (e.g. surface albedo evolution) can be tested by comparison with remote sensing data (e.g. Banwell et al., 2012), and SMB models have been tested by comparison with point measurements, such as automatic weather stations, ice cores, and ablation stakes (e.g. Noel et al., 2015); by comparison with accumulation estimates derived from layering observed in ground-penetrating radar data (e.g. Medley et al., 2014; Koenig et al., 2016); and, in bare-ice zones, by direct comparison with altimetry data (Sutterley et al., 2018). Densification in FD models has been tested and, in some cases, calibrated by comparison with ice core density profiles (e.g. Ligtenberg et al., 2011; Alexander et al., 2019; Lundin et al., 2017; Li and Zwally, 2015; Kuipers Munneke et al., 2015); by comparison with borehole measurements (Morris and Wingham, 2014; Hawley et al., 2020); and, for a limited set of measurements in Antarctica, by repeated radar measurements (Ligtenberg et al., 2015). We have identified one study (Verjans et al., 2021) that has used altimetry differences to validate combined SMB and FD models in Antarctica and a second (Kuipers Munneke et al., 2015) that used altimetry differences to evaluate trends in snow-surface heights predicted by models in Greenland.
Previous model evaluations, particularly those of the FD models, have been limited in their spatial extent and do not demonstrate how the accuracy of the models varies over the full range of ice-sheet surface conditions and seasons. In this paper, we present an evaluation of three SMB–FD models in Greenland based on height changes measured with NASA's ICESat-2 satellite between the autumn of 2018 and the end of 2020, a period that includes the substantially anomalous summer-2019 melt season (Tedesco and Fettweis, 2020). Although combined SMB and FD models can be evaluated at a regional scale in studies that evaluate ice-sheet mass balance based on multiple redundant datasets (e.g. Martin-Espanol et al., 2016) including gravimetry and altimetry, the coarse spatial resolution of these studies means that the effects of velocity changes are not as easily separated from SMB-driven changes in these data as they are in the altimetry measurements. The high (centimetre-level) vertical precision, 100 m spatial resolution, and quarter-annual temporal resolution of the ICESat-2 data allow us to make pointwise comparisons between the behaviour predicted by the models and the measured height differences. By selectively isolating groups of difference data in which the models predict different SMB processes to play a strong role in surface-height change, we evaluate the accuracy with which the models can predict these processes. Our results offer an ice-sheet-wide view of the accuracy of model processes driving surface-height changes.
Our results are based on altimetry data from ICESat-2, selected based on ice-surface velocity data. We compare these data with height-change predictions based on modelled SMB and FD changes from two atmospheric models, driving three different FD models. We describe each below.
Our altimetry data are derived from the ATLAS (Advanced Topographic Laser
Altimeter System) instrument on board NASA's ICESat-2 satellite. ATLAS
measures the height of the ice-sheet surface using six laser beams, which
measure three pairs of tracks, each separated from its neighbour(s) by 3.3 km. The central pair follows a set of 1387 reference ground tracks (RGTs),
which are separated by about 10 km in central Greenland (70
The beams within each of ICESat-2's beam pairs are separated by
Elevation-change data in this paper are based on release 004 of the ICESat-2 ATL11 data product (Smith et al., 2021), which combines measurements from multiple cycles to correct for the spatial variation in surface height around each RPT. The limited precision of ICESat-2's repeat-track pointing introduces small apparent height differences between measurements from different cycles, with a magnitude approximately equal to the product of the across-track offset and the surface slope. At each of a set of reference points spaced every 60 m along the RPTs, the ATL11 algorithm solves for a reference surface that corrects for these offsets to give height estimates for each cycle with little or no contribution from the across-track offset. It uses the same correction for points where tracks from different cycles cross the RPTs (crossover points). For this study, height differences since the beginning of RPT pointing (April 2019) are calculated based on height measurements along the same RPT. Height differences from cycles 1 and 2 (prior to April 2019) are calculated based on crossover-difference measurements between the early non-RGT-pointed measurements and the cycle-3 (and later) RGT-pointed measurements.
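The size of the apparent height difference that the reference surface has to remove can be estimated from the product of offset and slope mentioned above. The snippet below is only an illustrative calculation: the 10 m across-track offset and the 0.5 degree surface slope are hypothetical values, not numbers taken from ATL11, and the function name is an assumption.

```python
import math


def slope_induced_offset(across_track_offset_m, slope_deg):
    """Apparent height difference (m) caused by sampling a sloping surface
    at a point displaced across-track from the reference track."""
    slope_gradient = math.tan(math.radians(slope_deg))  # rise over run
    return across_track_offset_m * slope_gradient


# Hypothetical case: 10 m across-track offset on a 0.5 degree slope.
apparent_dh = slope_induced_offset(10.0, 0.5)
print(f"apparent height difference: {apparent_dh:.3f} m")
```

For these assumed values the apparent difference is several centimetres, which is comparable to the centimetre-level precision of the measurements and illustrates why the correction matters on sloping terrain.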
ATL11 provides two kinds of error estimates. Per-point estimates
(
We present the ICESat-2 data as eight epochs of height differences – all except the first made up of differences between subsequent 91 d cycles. Because cycles 1 and 2 were not collected on the RGTs, the first two epochs use crossover differences relative to cycle 3; thus, the first epoch is made up of differences between the fourth quarter of 2018 (18.Q4) and the second quarter of 2019 (19.Q2), and subsequent epochs are made up of differences between adjacent quarters (e.g. the second epoch is 19.Q1 to 19.Q2).
This study is intended to evaluate the accuracy of the representation of
SMB-driven processes on the Greenland ice sheet. In parts of the ice sheet
where the ice-flux divergence is out of balance with SMB, we expect to see
surface-height changes due to a combination of ice-dynamic changes and SMB
changes. Because we cannot accurately predict the magnitude of height
variations associated with surface velocity variations, we restrict our
analysis to areas of the ice sheet for which the temporal ice-flow
variability is small (less than 20 m yr
We use our altimetry data to evaluate three state-of-the-art SMB and FD models, which were chosen for this paper because of their low temporal latency, which made them available for the same time period as the recently released ICESat-2 data. Two of these models (MARv3.11.5 and GSFCv1.1) have different surface-mass-balance forcing and a different firn model but share a similar melt-rate forcing. The third model (GSFCv1.2) updates the surface-melt forcing from GSFCv1.1.
Each model provides estimates of the height change due to surface mass balance and that due to firn-air-content change. The sum of these together represents the model estimate of surface-height change due to atmospheric and firn processes. Table S1 in the Supplement gives the internal model variables and the abbreviations used in this study for each.
The Modèle Atmosphérique Régional (MAR) (Fettweis et al., 2017; Amory et al., 2021; Tedesco and Fettweis, 2020) is a coupled surface–atmosphere regional climate model forced at 6 h intervals at the lateral boundaries and ocean surface with climate reanalysis data (here ERA5). It includes detailed snow and firn evolution based on the CROCUS snow model (Brun et al., 1989, 1992), an atmospheric model (Gallee and Schayes, 1994), and a land-surface energy-balance model (DeRidder and Schayes, 1997). MAR has been extensively validated over the Greenland ice sheet, showing generally good agreement with weather station and SMB measurements, with some local biases (Alexander et al., 2019; Fettweis et al., 2020, 2017; Montgomery et al., 2020). MAR simulates the top 25 m of snow, firn, and ice, including energy and mass transfer between 30 layers of variable thickness, and incorporates the process of liquid water retention and refreezing. A physically based scheme is used to simulate snow densification as a function of the weight of overlying snow (Alexander et al., 2019). Here we use MAR version 3.11.5 (MARv3.11.5) (Amory et al., 2021), which contains modifications to the previous (Fettweis et al., 2020) versions, including updates to the cloud parameterization and bare ice albedo adjustments, but without the blowing-snow module. The simulations presented here are run at a spatial resolution of 10 km, forced with the ERA5 reanalysis over 1950–2020 (Hersbach et al., 2020).
We generated two sets of SMB and FD products (the GSFC model, after the Goddard Space Flight Center, where the modelling was carried out) using output from a global atmospheric model as input to an open-source FD model. Improvements between the initial (v1.1) and updated (v1.2) versions of this modelling scheme allowed us to explore some of the model processes that can lead to errors in SMB–FD models.
Both versions of the GSFC model used atmospheric variables from the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2), developed at the Global Modeling and Assimilation Office (GMAO) at the NASA Goddard Space Flight Center (Gelaro et al., 2017). Atmospheric variables including snowfall, total precipitation, evaporation, 2 m air temperature, and skin temperature were downscaled to 12.5 km spatial resolution using an offline, high-resolution MERRA-2 replay, in which an atmospheric model (a nonhydrostatic version of the Goddard Earth Observing System model, version 5, GEOS-5) was nudged to match the MERRA-2 reanalysis. In other studies, the improved resolution due to this downscaling technique has led to improved agreement in skin temperature and SMB with other state-of-the-art models over the Greenland ice sheet (Cullather et al., 2014). One complication in the use of the MERRA-2 model output to drive the firn model was that MERRA-2 does not provide melt as an output. To derive a consistent melt-rate field, we used the MERRA-2 2 m temperatures as input to a degree-day model calibrated to MARv3.5.2 annual melt; the updates between MARv3.5.2 and the MARv3.11.5 model evaluated in this study did not have a major effect on temperature or melt-rate estimates in Greenland, so we assume that the melt-rate calibration for the GSFC models is consistent with MARv3.11.5.
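The general form of a degree-day melt estimate calibrated against a reference annual melt can be sketched as follows. This is a generic illustration under stated assumptions rather than the GSFC implementation: the use of daily mean temperatures, the 0 °C threshold, the function names, and all numerical values are hypothetical.

```python
import numpy as np


def positive_degree_days(t2m_celsius, threshold=0.0):
    """Sum of daily temperature excess above the threshold (degree-days)."""
    t = np.asarray(t2m_celsius, dtype=float)
    return np.clip(t - threshold, 0.0, None).sum()


def calibrate_factor(reference_annual_melt, t2m_celsius):
    """Factor (kg m^-2 per degree-day) that reproduces a reference melt total."""
    pdd = positive_degree_days(t2m_celsius)
    return reference_annual_melt / pdd if pdd > 0 else 0.0


def degree_day_melt(t2m_celsius, factor):
    """Melt (kg m^-2) predicted from temperatures and a calibrated factor."""
    return factor * positive_degree_days(t2m_celsius)


daily_t2m = [-3.0, -0.5, 1.0, 2.5, 0.5, -1.0]  # made-up daily means, deg C
factor = calibrate_factor(12.0, daily_t2m)     # made-up reference melt, kg m^-2
melt = degree_day_melt(daily_t2m, factor)
print(factor, melt)
```

Because the factor is a ratio of melt to positive degree days, it becomes poorly constrained wherever the positive-degree-day total is very small, which is one way a per-cell calibration can yield extreme values in cold regions.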
These atmospheric products were used as forcing for the Community Firn Model (CFM; Stevens et al., 2020), which provided ice-sheet-wide simulations of the variations in FD through time (Medley et al., 2022a). The configuration of the CFM included several firn processes comprising densification, heat transport, grain-size evolution, meltwater percolation and refreezing, and sublimation. The combination of atmospheric variables from MERRA-2 and output from the CFM produces total firn-column height variations from January 1980 to December 2020 at 5 d time steps. The total thickness simulated depends on the ambient climate, typically varying between 118 and 298 m (lower and upper fifth percentile).
Between the GSFCv1.1 and GSFCv1.2 model versions, the positive-degree-day
model used to generate melt estimates was refined, and a more complicated
model was used to derive the near-surface density (Medley et al., 2022a). For
the GSFCv1.1 model, the factor relating the model positive degree days to
the melt estimates was derived based on a calibration for each 12.5 km grid
cell between the MARv3.5.2 annual melt production and the MERRA-2 2 m
temperature, and the calibration factor for each cell was applied to derive
melt estimates from the MERRA-2 temperatures. This calibration yielded
calibration factors that were consistent for cells with elevations up to
around 1500 m but increased sharply to large, likely unrealistic values at
higher elevation. Thus, to help avoid overestimation of surface melt in the
GSFCv1.2 model, any grid cell with a surface elevation higher than 1500 m was assigned a calibration factor equal to the minimum of the
calibrated value and 0.13 kg m
For GSFCv1.1, the surface density was assigned based on a linear function of
several climatic parameters (wind speed, specific humidity, accumulation
rate, and temperature), with coefficients chosen to match the observed
surface densities at 151 core sites (Medley et al., 2022a), while for GSFCv1.2,
the surface density was assigned based on a Gaussian process regression
model relating similar parameters to density and trained based on a larger
set of 187 core sites. This resulted in modestly higher surface densities in
GSFCv1.2: the 5 %–95 % range of surface densities for GSFCv1.1 was 247–364 kg m
For our models, the SMB and temperature data provided do not reflect a steady-state climate, and we do not expect the results to converge to any particular equilibrium state. Instead, we choose the period between 1980 and 1995 as a reference period and calculate the anomalies in surface-height change relative to the mean height change over this period. This is equivalent to assuming, first, that the mean vertical velocity of the ice at the bottom of the firn column is equal to the mean ice-equivalent SMB over this reference epoch and, second, that any change in the FAC over the reference period reflects a systematic error in the model, whose effects are corrected by subtracting a linear interpolation of the modelled FAC at the beginning and end of the reference period (so that there is no net modelled FAC change during that period) and by extrapolating the same linear relationship to later times. This is consistent with the way in which firn models have been used in correcting altimetry time series (e.g. Smith et al., 2020) and is useful in this study because it allows us to compare model-predicted changes with less potential influence from long-term drifts in the SMB rate or the total FAC, so that any differences between the models in this study reflect differences relative to the reference period, with less potential influence from the spin-up processes used to initialize the models. Although the spin-up of the FD model and our assumption of zero change during the reference period may result in errors in the detrended FD model results (e.g. Helsen et al., 2008), we expect these errors to result primarily in errors in the modelled height change that are steady over long (decadal) periods of time. The quarter-annual height changes that are the main focus of this study may experience a temporally uniform shift (i.e.
might all be too positive or too negative at a particular location) as a result of these errors, but we do not expect the temporal variability of height changes to be significantly affected.
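The detrending against the 1980–1995 reference period can be sketched as subtracting the straight line that passes through the series at the start and end of that period, extended to all later times. The code below is a minimal illustration under stated assumptions: the function name, the annual sampling, and the synthetic FAC series are hypothetical and are not model output.

```python
import numpy as np


def detrend_to_reference(t, series, t0=1980.0, t1=1995.0):
    """Remove the line through the series values at t0 and t1.

    The line is extrapolated beyond t1, so the result has no net change
    over the reference period and shows later changes as anomalies.
    """
    t = np.asarray(t, dtype=float)
    series = np.asarray(series, dtype=float)
    y0 = np.interp(t0, t, series)          # series value at start of reference
    y1 = np.interp(t1, t, series)          # series value at end of reference
    rate = (y1 - y0) / (t1 - t0)           # reference-period trend
    return series - (y0 + rate * (t - t0))


# Synthetic FAC (m): slow drift of 0.01 m/yr plus a 0.05 m/yr loss after 2010.
years = np.arange(1980.0, 2021.0)
fac = 20.0 + 0.01 * (years - 1980.0) - 0.05 * np.clip(years - 2010.0, 0.0, None)
fac_anomaly = detrend_to_reference(years, fac)
print(fac_anomaly[[0, 15, -1]])  # values at 1980, 1995 and 2020
```

In this synthetic case the steady drift is removed entirely and only the post-2010 loss remains in the anomaly, which is the behaviour the reference-period assumption is meant to produce.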
For each model, we reduce the full set of 57 million height-difference
measurements from ICESat-2 to a more compact sample with a more even spatial
distribution by calculating a block-median set of height differences for
each cycle-to-cycle (
After applying the block median to the height differences, there are still substantially more measurements in the northern part of the ice sheet than in the south. Without a correction for this measurement-density bias, regressions and other block statistics on the residuals would overrepresent the north of the ice sheet and would be less sensitive to the south of the ice sheet, where some of the most dramatic changes have happened. To help correct for this sampling variability, we calculate the density of measurements (i.e. the number of measurements per square kilometre) on 10 km cells over the ice sheet and smooth the calculated measurement-density values with a 100 km square averaging kernel. This gives a map of measurement density for the whole ice sheet, which we then interpolate to the difference-measurement locations. We then calculate a weight value for each difference measurement that is equal to the inverse of its interpolated measurement-density value. Because the data gap in June/July 2019 leads to about 50 % fewer difference measurements in all epochs that included the second quarter of 2019, we also reduce the weights for all epochs later than Q3–Q4 2019, inclusive, by a factor of 2. The resulting weighting ensures that a weighted average of measurements assigns approximately the same weight per unit area to regions in the south of Greenland that it does to regions in the north in addition to accounting for changes in sampling density over time.
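The inverse-density weighting can be sketched with a gridded count of measurements followed by a box average. This is a simplified illustration, not the code used in this study: it uses an 11-cell (110 km) kernel so that the box is centred on each cell, pads the grid edges with zeros, and looks up the nearest cell instead of interpolating; the function name and the synthetic point distribution are assumptions.

```python
import numpy as np


def inverse_density_weights(x, y, cell=10e3, kernel_cells=11):
    """Weight each point by the inverse of the smoothed local point density."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ix = np.floor((x - x.min()) / cell).astype(int)   # column index of each point
    iy = np.floor((y - y.min()) / cell).astype(int)   # row index of each point
    counts = np.zeros((iy.max() + 1, ix.max() + 1))
    np.add.at(counts, (iy, ix), 1.0)                  # points per cell
    density = counts / (cell / 1e3) ** 2              # points per km^2
    half = kernel_cells // 2
    padded = np.pad(density, half, mode="constant")   # zero padding at edges
    smooth = np.zeros_like(density)
    for dy in range(kernel_cells):                    # square box average
        for dx in range(kernel_cells):
            smooth += padded[dy:dy + density.shape[0], dx:dx + density.shape[1]]
    smooth /= kernel_cells ** 2
    return 1.0 / smooth[iy, ix]                       # nearest-cell lookup


# Synthetic sampling: four times as many points in the "north" half.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 500e3, 5000)
y = np.concatenate([rng.uniform(500e3, 1000e3, 4000),   # north
                    rng.uniform(0.0, 500e3, 1000)])     # south
w = inverse_density_weights(x, y)
north = y >= 500e3
weight_ratio = w[~north].sum() / w[north].sum()
print(f"south/north count ratio 0.25, weight ratio {weight_ratio:.2f}")
```

With equal areas in the two halves, the summed weights come out much closer to equal than the raw counts do, which is the intent of the weighting; the remaining imbalance comes from smoothing across the boundary between the two sampling densities.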
To help describe the relationship between modelled and measured
height-change estimates, we calculate weighted regressions using components
of the models' height changes as independent variables. Our goal in these
regressions is to identify how the modelled height changes differ from the
measured height differences over the ice sheet. These regressions estimate
the scaling(s) for the model parameters that minimize the variance between
the measured height differences and the sum of the scaled model parameters:
To isolate the effects of different processes on the data–model misfit, we calculate regressions against groups of model parameters for a few different spatial and temporal subsets of the data. Because regressions can be sensitive to points with large residuals that are not representative of the statistics of the data, for each subset of the data, we first remove large outlying difference values caused by, among other things, complex and steeply sloping ice surfaces, as well as blunders in the ATL06 data underlying the ATL11 data. To identify these, we calculate the robust spread of the height-difference distribution (here defined as the half width of the central 68 % of the distribution) and remove from the analysis any outlying points whose difference values are more than 12 times this spread away from the mean. This editing strategy is applied iteratively until subsequent means are identical or until 10 iterations are complete. The final regressions and their residuals are calculated after these outlying points are removed. For the ice sheet as a whole, this editing procedure removes about 1 % of points, of which the standard deviation is equal to 2.3 m.
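The editing and regression steps can be sketched as follows. This is a minimal illustration, not the analysis code of this study: the function names are hypothetical, the regression is assumed to have no intercept term, the spread is taken from the 16th and 84th percentiles, and the data are synthetic (a model that overpredicts by a factor of 2, plus noise and a few large blunders).

```python
import numpy as np


def robust_spread(values):
    """Half the width of the central 68 % of the distribution."""
    lo, hi = np.percentile(values, [16.0, 84.0])
    return 0.5 * (hi - lo)


def edit_outliers(dh, n_spread=12.0, max_iter=10):
    """Boolean mask of points kept after iterative outlier editing."""
    dh = np.asarray(dh, dtype=float)
    keep = np.ones(dh.shape, dtype=bool)
    last_mean = None
    for _ in range(max_iter):
        mean = dh[keep].mean()
        if last_mean is not None and mean == last_mean:
            break                                   # successive means identical
        keep = np.abs(dh - mean) <= n_spread * robust_spread(dh[keep])
        last_mean = mean
    return keep


def weighted_scaling(dh, components, weights):
    """Weighted least-squares scale factors for columns of `components`.

    components has shape (n_points, n_components); no intercept is fitted.
    """
    sw = np.sqrt(np.asarray(weights, dtype=float))
    coeffs, *_ = np.linalg.lstsq(components * sw[:, None], dh * sw, rcond=None)
    return coeffs, dh - components @ coeffs


# Synthetic test: the measured change is half of the modelled change.
rng = np.random.default_rng(1)
model_dh = rng.normal(0.0, 0.3, 5000)
measured = 0.5 * model_dh + rng.normal(0.0, 0.05, 5000)
measured[:20] = 50.0                                # blunders
weights = rng.uniform(0.5, 1.5, 5000)

keep = edit_outliers(measured)
coeffs, residual = weighted_scaling(
    measured[keep], model_dh[keep][:, None], weights[keep])
print(keep.sum(), coeffs)
```

Passing two columns (for example separate SMB and FAC height changes) in place of the single column returns one scale factor per component.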
We perform regressions for the total model change (
Properties of subsamples of the models. The elevation and subset
columns indicate the subsample of data for which the statistics were
calculated. The elevation column indicates “low” for elevations below 2000 m, “high” for elevations above 2000 m, or “all”. The subset column indicates
“strong melt” for data for which
Spatial distribution of elevation and melt categories. Panel
To help identify the processes at work in determining the model–data misfit,
we divided the data into low- and high-elevation subsets (
Table 1 gives some general properties of the model outputs for all of the
data together and for each subset. Based on these values, we can see that
melt was considerably stronger in 2019 than it was in 2020 for all three
models, with nearly 3 times as much of the data weight falling into the
strong-melt category (compare the f_melt statistics for sp-su
2019 with those from sp-su 2020). The rms statistics of the SMB variables
reflect the strong surface-melt signal in the summer of 2019 and the large
melt signals associated with the lowest-elevation part of the ice sheet. FAC
variability is largest in 2019 and for both strong-melt subsets of the
data. The fraction-of-weight column indicates that the largest fraction of
the data (by weight) fell into the high-elevation, weak-melt category, while
the smallest fraction fell into the low-elevation, strong-melt category.
MARv3.11.5 and GSFCv1.1 had similar distributions of data weight and
variance among the subsamples, but GSFCv1.2 had less weight in the
strong-melt category, particularly in the high-elevation region, and had
smaller FAC variance within the high-melt subsamples. Note that because of
the outlier editing applied to each subset, the superset of the high- and
low-elevation, weak- and strong-melt subsets contains a slightly larger
(
Data and model estimates for three locations indicated in Fig. 1.
Figure 2 shows height-change measurements, model data, and measurement-model
residuals for three
Measured height differences in Greenland from ATL11
Figure 3a–h show maps of height differences (
Figure 3i–z5 show the corresponding height differences expected
solely due to modelled SMB–FD processes for each epoch (
Residuals between measured and predicted height changes from
MARv3.11.5
Figure 4 shows the residuals between the measured height differences and the
changes predicted by the models, which we term the “corrected” height
changes (
Height-difference and residual histograms for the whole ice sheet,
for three time periods. The curves in each plot represent
measured height change (black,
Figure 5 shows histograms of observed height differences, model-corrected
height differences, and regression residuals for the full time series, as well as
for subsamples of the data spanning the spring-to-summer and
summer-to-autumn epochs of 2019 and 2020. For each subset of the data, we
plot the histograms of the data (measured height differences), of the
residuals between the unscaled model and the data (equivalent to the
residuals for
Distributions of measured height differences (Fig. 5a–c) have a standard
deviation of 0.25 m and a mean of
The data from spring–summer 2019 (Fig. 5d–f) show substantial (
Residuals between measured height differences and rescaled
MARv3.11.5
The spring–summer 2020 statistics and histograms are similar to the full-time-series statistics, and the rescaling coefficients and the improvements in residuals due to the rescaling are identical within a few percent to those of the full time series.
Maps of the residuals to the rescaled MARv3.11.5 and GSFCv1.1 models (Fig. 6a–p) show that the model overcorrections that were apparent as a blue-tinged rim around the ice sheet in the summer epochs in Fig. 4 (panels c, k, g, and o) are much less prominent, although for some points immediately adjacent to the margin during the summers, the rescaled model under-corrects for the measured height differences, resulting in locally larger residuals. For GSFCv1.2, the maps of residuals to the rescaled model (Fig. 6q–x) are not visibly different to those of the unscaled model and are visually similar to the rescaled residuals from the other two models.
Height-difference and residual histograms for subsamples of the
data based on height and model estimates of melt and SMB. The curves in each
plot represent measured height change (black,
The behaviour of the ice sheet and the models was evidently substantially
different between the spring–summer subsample of 2019 and the rest of the
model domain. To explore the role of melt in the data–model differences, we
subdivide the full time series of data into four groups of difference
measurements based on model melt and elevation: one division splits the data
between strong-melt (
For the weak-melt subsamples of the data (Fig. 7a–f), the MARv3.11.5 and
GSFCv1.1 models perform well with no rescaling, reducing the high-elevation
residuals from 0.12 to around 0.09 m and the low-elevation residuals from
0.25 to
For the strong-melt, high-elevation subsample of the data (Fig. 7g–i) the
results are markedly different. The data in these subsamples show
substantial mean height loss (
For the strong-melt, low-elevation subsample of the data (Fig. 7j–l), the histograms of uncorrected height differences have a near-zero peak, with a large negative tail of values indicating strong summer drawdown, a substantial negative mean, and standard deviations of around 0.65 m. All three models correct for a large fraction (71 %–75 %) of the variance in the data, and all have optimal rescaling values close to unity (between 0.79 for MARv3.11.5 and 1.13 for GSFCv1.2), which make very small improvements in the residuals over the unscaled models.
To further explore the importance of different components of modelled height
change, we perform regressions in which we allow different scaling factors
for individual components of the SMB and FD models. In these experiments, we
solve for the coefficients,
Unscaled-model residuals and regression residuals for individual
model components of the MARv3.11.5
For the high-elevation data and MARv3.11.5 (Fig. 8a), rescaling of the FAC
makes a notable reduction in residuals relative to the unscaled model, with
only small improvements due to rescaling the SMB. The combination of the FAC
rescaled by 0.45 and the full SMB leaves a residual standard deviation of
0.11, while the SMB rescaled (by
We see similar results for the high-elevation data and the GSFCv1.1 model
(Fig. 8b), where the combination of the FAC rescaled by 0.51 and the full
SMB correction leaves residuals with a standard deviation of 0.13 m,
approximately the same as seen for rescaling of the complete model (Fig. 7h). The optimal scaling for the SMB alone is
For the low-elevation, high-melt subsample, none of the rescalings result in large reductions in the residual standard deviations. The largest reduction is for MARv3.11.5 (Fig. 8c), where the FAC scaled by 0.57 plus the unscaled SMB yields a residual standard deviation of 0.30 (which should be compared to 0.35 for the unscaled model) and scaling the SMB (by an optimum value of 0.89) makes little or no improvement over the unscaled model. For GSFCv1.1 (Fig. 8d), the high-melt, low-elevation regressions all produce residual standard deviations at most 1–2 cm smaller than those from the unscaled model.
For the GSFCv1.2 model, the optimal rescalings do not make any notable improvement in the residuals over the unscaled model for either subsample of the data (see Fig. S4). The spread of residuals to the unscaled model for the high-elevation subsample of the data (0.15 m) is comparable to that of the fully rescaled GSFCv1.1 (0.13 m) but slightly larger than that of the rescaled MARv3.11.5 (0.11 m). For the low-elevation subsample, the GSFCv1.2 model, like the other two, has a residual spread of around 0.33 m for the unscaled model, and none of the rescalings improves the spread by more than 0.01 m.
Our results show that all three models considered here account for a significant portion of the cycle-to-cycle variance in the measured height change, particularly in the low-elevation, strong-melt subsamples of the data. However, the MARv3.11.5 and GSFCv1.1 models both tend to overpredict total height changes by factors of up to around 2, depending on the subsample of the data considered. These overestimates are most prominent in the spring–summer period in 2019 and in the strong-melt, high-elevation subsamples of the data, suggesting that melt processes play an important part in the overestimates. The updates to the melt model between GSFCv1.1 and GSFCv1.2 appear to reduce these overestimates.
Likewise, the rescaling experiments on the SMB and FAC showed that systematic rescaling of the magnitude of the SMB processes in the model alone produced much smaller reductions in the residuals than systematic rescaling of FAC changes did, and for both MARv3.11.5 and GSFCv1.1, rescaling of FAC alone produced residual improvements approximately equal to those due to rescaling the total model or to rescaling the SMB and FAC separately. This points to melt of snow as the process most strongly driving the models' overestimates of height changes. In both MARv3.11.5 and the GSFC models, runoff is small over most of the ice sheet. This means that the SMB component of detrended height change is approximately the sum of a positive contribution equal to the ice-equivalent snowfall and a negative contribution equal to the long-term average SMB rate that we subtracted to detrend the SMB. This component has relatively small temporal variability and cannot explain much of the variance in the height-change rate. In contrast to the SMB component, the FAC component has large temporal variations: when the surface of a snowpack begins to melt, the meltwater flows downward into the pore space in the snowpack, and until that pore space is full, the SMB change due to the melt is zero (because none of the melt runs off the ice sheet) and the total model height change is equal to the FAC change. This means that any overestimate of melt over snow translates directly into equal overestimates of surface-height change and FAC change. The MARv3.11.5 and GSFC models used different FD models, but the melt for the GSFC models was based on a degree-day parameterization of the MARv3.5.2 melt.
We expect GSFCv1.1 to share the MARv3 models' overestimates of height change, but in GSFCv1.2, the positive-degree-day scalings were limited for the high-elevation part of the ice sheet, which results in less total melt in this part of the ice sheet and makes a notable improvement in the model's performance during times when melt is large. We observe, however, that GSFCv1.2 does not fare better than the other models and in fact has marginally larger residuals than MARv3.11.5 for the weak-melt subsamples of the data. Changes between GSFCv1.1 and GSFCv1.2 also include a different calculation of the initial surface density, which likely slightly increased the sensitivity of GSFCv1.2 total height change to melt events in the high-elevation interior of the ice sheet and decreased it at low elevations. The improved model performance in regions where GSFCv1.2 was likely more sensitive to melt events than GSFCv1.1 points again to better representation of melt in GSFCv1.2 as the major improvement between the GSFC models. The small reductions in residual spread that result from rescaling the SMB alone in the high-elevation part of the ice sheet for MARv3.11.5 and GSFCv1.1 (Figs. 7a–b, 8a–b) might provide weak evidence that the models overestimate SMB variability in this region, but the reductions in spread are much smaller than those associated with rescaling the FAC, suggesting that our analysis is not strongly sensitive to SMB scaling in this area.
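The argument that retained melt changes surface height only through the FAC can be written out as a short worked relation. The symbols here are introduced for illustration and are assumptions, not notation from the models: $h$ is surface height, $M$ is the mass per unit area of the column, $\rho_i$ is the density of ice, $m$ is the melt amount, and $k$ is a hypothetical factor by which a model overestimates melt. The last line further assumes that the height response is linear in the melt amount, which is only an approximation.

```latex
\begin{align}
  h &= \frac{M}{\rho_i} + \mathrm{FAC}
  &&\text{(ice-equivalent thickness plus firn air content)} \\
  \Delta h &= \frac{\Delta M}{\rho_i} + \Delta\mathrm{FAC} \\
  \Delta M &= 0
  \;\Rightarrow\;
  \Delta h = \Delta\mathrm{FAC}
  &&\text{(all meltwater retained, no runoff)} \\
  m_{\mathrm{model}} &= k\, m_{\mathrm{true}}
  \;\Rightarrow\;
  \Delta h_{\mathrm{model}} \approx k\, \Delta h_{\mathrm{true}}, \quad
  \Delta\mathrm{FAC}_{\mathrm{model}} \approx k\, \Delta\mathrm{FAC}_{\mathrm{true}}
\end{align}
```

Under these assumptions, a melt overestimate appears with the same factor in both the modelled height change and the modelled FAC change, while the SMB component is unaffected. This is consistent with rescaling the FAC alone recovering most of the residual reduction.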
The analysis in this study has focused on the variability of surface height
at quarter-annual timescales. Any long-term differences between the modelled
SMB–FD and the combined SMB, FD, and ice-flux divergence in the ice sheet
will appear in our results as non-zero mean residuals, caused by the
regional mean of the differences, and as extra spread in the residuals,
caused by spatial variability in the differences. Without additional
information about the state of the ice sheet, we cannot distinguish the
extent to which FD model errors (e.g. Helsen et al., 2008), SMB model
errors, and errors in our assumption that the ice sheet was in balance
between 1980 and 1995 contribute to the means and spreads in the residuals we
measure. Despite this, the spread of the residuals to the best-fitting
regressions (e.g. Fig. 7) bounds the spatial variability in any of these
errors to
This study demonstrates one of the first applications of altimetry-difference data to the validation of surface-mass-balance and firn-densification models (and, to our knowledge, the first in Greenland). It demonstrates that the three models evaluated account for a large fraction of the observed height change in the low-elevation, high-melt areas of the ice sheet, but two of the three do not accurately account for the observed changes in higher-elevation areas where melt is less common.
The results presented here are based on only 2 years of data, and we do not attempt to distinguish model errors from long-term ice-sheet mass imbalances. Consequently, we cannot reach firm conclusions about whether these models correctly represent long-term volume change rates for the ice sheet. In MARv3.11.5 and GSFCv1.1, the largest model–data differences appear to be associated with the representation of FAC changes in high-elevation parts of the ice sheet during melt events, which, for two reasons, should not necessarily imply errors in the long-term behaviour of the model. First, if the model densification is too rapid near the surface, the densification rates in the excessively dense firn should be slower at a later time, and the long-term mean densification rate may be largely correct. Second, until recently, high-elevation melt events were rare (Trusel et al., 2018), so longer-term studies that use SMB–FD models to investigate decadal ice-sheet mass changes (e.g. Smith et al., 2020) should see relatively small errors due to these events. Conversely, studies seeking to interpret ICESat-2 time series at seasonal timescales will need to account for errors in FD models to obtain accurate mass-change estimates. We note that for MARv3.11.5 and GSFCv1.1, residuals to the rescaled models tend to have means that are closer to zero than those to the unscaled models, suggesting that model errors may lead to more extreme (larger in absolute value) estimates of ice-sheet change due to ice dynamics.
Our results give little or no evidence for substantial errors in SMB rates in any of the models. Notably, rescaling SMB rates (Fig. 8) produces only marginal improvements in misfits beyond those from rescaling FAC. This is consistent with studies that have compared mass balance from ice-discharge and SMB models with gravimetric estimates of mass changes, which show consistent seasonal and interannual mass variations (e.g. Sasgen et al., 2020; Fettweis et al., 2020). We note that high-temporal-resolution gravimetric estimates of ice-sheet mass change have been available for validation of SMB models for at least a decade, while seasonal altimetric measurements are relatively new, so it should not be surprising that the SMB models are better calibrated than the FD models. At the same time, the most significant deficiency that we infer in MARv3.11.5 and GSFCv1.1 is in the estimation of melt rates in the interior of the ice sheet, where meltwater is absorbed by the firn and makes no contribution to runoff. If the same problem were to be present in models used to predict ice-sheet SMB in the future, when the climate is warmer and runoff is more prevalent in the ice-sheet interior, we would expect them to predict excessively negative SMB rates.
Considered as a direct comparison of model accuracy, our results suggest that the most recent of the three models considered here, GSFCv1.2, has substantially smaller errors in representing surface changes in the high-elevation part of the ice sheet during melt events. Model improvements between GSFCv1.1 and GSFCv1.2 include changes in the initial density of new snow and a limitation in degree-day scaling factors in the melt model for high-elevation grid cells, and based on the improvement in unscaled model residuals at high elevations, we suggest that the latter made a substantial improvement in the model representation of surface-height change. Because GSFCv1.1 derives scaling factors based on melt and temperature data from an earlier MAR version (v3.5.2) and both models show similar behaviour during melt events in high-elevation regions, our observations confirm the suggestion that MARv3.11.5 likely overpredicts melt for the white-snow surfaces prevalent at high elevation in Greenland.
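As an illustration of what limiting degree-day scaling factors at high elevation means in practice, the sketch below applies a generic positive-degree-day melt rule with a cap on the degree-day factor above a threshold elevation. This is not the GSFCv1.2 implementation: the function name, the threshold elevation (`cap_elev_m`), the cap value (`ddf_cap`), and the units are all hypothetical choices made only to show the mechanism.

```python
import numpy as np


def degree_day_melt(temp_c, ddf, elevation_m, cap_elev_m=2000.0, ddf_cap=3.0):
    """Daily melt from a generic positive-degree-day rule.

    temp_c      : daily mean air temperature (deg C)
    ddf         : degree-day factor (mm w.e. per deg C per day)
    elevation_m : surface elevation (m)
    cap_elev_m  : HYPOTHETICAL elevation above which the factor is limited
    ddf_cap     : HYPOTHETICAL upper limit on the factor at high elevation
    """
    temp_c = np.asarray(temp_c, dtype=float)
    ddf = np.asarray(ddf, dtype=float)
    elevation_m = np.asarray(elevation_m, dtype=float)

    # Limit the degree-day factor only where the surface is high.
    ddf_eff = np.where(elevation_m >= cap_elev_m,
                       np.minimum(ddf, ddf_cap), ddf)

    # Melt occurs only for positive temperatures.
    return ddf_eff * np.maximum(temp_c, 0.0)
```

With these illustrative values, a grid cell at 2500 m with a factor of 8 and a temperature of 2 °C produces 6 mm w.e. of melt instead of 16 mm w.e., while a cell at 1000 m is unchanged. A cap of this kind reduces total melt only in the high-elevation interior.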
This study demonstrates a technique for directly evaluating surface-mass-balance model output using altimetry data. We propose that this has the potential to become a standard technique to allow modellers to test whether updates to model calculations or parameters improve model fidelity. Our study compared model versions that included a variety of different processes and parameters and were thus not designed to isolate the melt-driven height changes that we identified as needing improved representation in MARv3.11.5 and GSFCv1.1; a more targeted future study might include model experiments that change only a single parameter or process at a time. We would also hope to see future studies include a larger variety of models, including, potentially, the popular RACMO-driven IMAU firn model (Ligtenberg et al., 2018), and would hope to see the short-term densification information provided by altimetry studies fused with in situ data such as firn strain measurements (e.g. MacFerrin et al., 2022) and firn-density profiles (e.g. Montgomery et al., 2018) to produce holistic calibration of the short- and long-term evolution of models.
ICESat-2 ATL11 data are available from the National Snow and Ice Data Center
(NSIDC):
The supplement related to this article is available online at:
BES and MT originally proposed the study and conceived and directed the research. BES wrote most of the manuscript and developed the model–data comparison analysis. BM and XF provided model data and provided interpretations of model–data discrepancies. TS developed software for the study and advised on the handling of SMB data. PA formatted and analysed model data; together with DP, he provided insights into the behaviour of the models and suggested numerical experiments. All authors participated in discussions of the manuscript and played a substantial role in manuscript writing and editing.
The corresponding author, Ben Smith, is a volunteer editor for TC.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The authors would like to thank the editor, Bert Wouters, and three anonymous referees for their efforts in helping to shape and refine our manuscript. We would also like to thank the ICESat-2 project, the ICESat-2 science team, and ATL11 developers Ben Jelley and Suzanne Dickinson.
This research has been supported by the National Aeronautics and Space Administration (grant nos. NNX17AH04G and 80NSSC17K0351), the Heising-Simons Foundation (grant no. N/A), and the National Science Foundation (grant no. ANS 1713072).
This paper was edited by Bert Wouters and reviewed by Ian McDowell and two anonymous referees.