Sensitivity of geodetic glacier mass balance estimation to DEM void interpolation

Glacier mass balance is a direct expression of climate change, with implications for sea level, ocean chemistry, oceanic and terrestrial ecosystems, and water resources. Traditionally, glacier mass balance has been estimated using in-situ measurements of changes in surface height and density at select locations on the glacier surface, or by comparing changes in surface height using repeat, full-coverage digital elevation models (DEMs), also called the geodetic method. DEMs often have gaps in coverage (“voids”) based on the nature of the sensor used and the surface being measured. The way that these voids are 5 accounted for has a direct impact on the estimate of geodetic glacier mass balance, though a systematic comparison of different proposed methods has been heretofore lacking. In this study, we determine the impact and sensitivity of void-filling methods on estimates of volume change. Using two spatially complete, high-resolution DEMs over Southeast Alaska, USA, we compare 11 different void-filling methods on a glacier-by-glacier and regional basis. We find that a few methods introduce biases of up to 20% in the regional results, while other methods give results very close (<1% difference) to the true, non-voided volume 10 change estimates. Finally, we independently show using ASTER DEMs that some of best-performing methods are more robust than others, depending on the properties of the original DEMs, and therefore recommend that studies compare a few of these methods to estimate the uncertainty introduced by filling DEM voids.


IfSAR
As part of the Statewide Digital Mapping Initiative, the State of Alaska is producing an interferometric synthetic aperture radar (IfSAR) DEM of the entire state. The data are acquired from airborne radar operating in X-band and P-band, and are provided in a native resolution of 5 m mosaics. In our study area, flights were flown in summer 2012 and 2013. These data are available from the U.S. Geological Survey, see https://lta.cr.usgs.gov/IFSAR_Alaska. As of September 2017, 92% of the state has been 5 covered through this initiative, with 57% of the statewide data available for download (https://nationalmap.gov/alaska/).

Glacier Outlines
We use the Randolph Glacier Inventory v6.0 data as a base to mask glacier basins (RGI Consortium, 2017). As the IfSAR DEMs are only available over Alaska, and not adjacent areas in British Columbia and Yukon, we have selected only glaciers that fall 90% or more by area within Alaska. Additionally, we have removed any glaciers that fall 10% by area or more in both 10 collection years, in order to ensure that we are using temporally consistent data to estimate volume change. Finally, we remove any glacier basins that are smaller than 1 km 2 . This results in a total of 443 individual glacier basins used for the analysis.

Methods
We first calculate the "true" volume change by directly differencing the IfSAR and SRTM DEMs after co-registration following Nuth and Kääb (2011), and subsequently summing the elevation differences multiplied by pixel area within each glacier 15 outline. Ordinarily, using DEMs derived from radar of different bands, especially those acquired in different seasons such as the SRTM (February) and IfSAR (typically August/September), would require a consideration of the effects of differential radar penetration in snow and ice, as well as a temporal correction accounting for the difference in season, before converting elevation changes to a mass balance value (Haug et al., 2009;Kronenberg et al., 2016). In this region, the SRTM is known to have particularly high levels of penetration that cause significant biases when used in geodetic mass balance calculations 20 (Berthier et al., 2018). As our interest in this study is in isolating the effect of void interpolation methods on estimates of volume change, we ignore the differential penetration and temporal mismatch between our DEMs. We therefore highlight that biases will exist in the numbers provided in this study and do not recommend interpreting these relative estimates of glacier volume change.

25
In order to investigate the effects of filling voids, we first simulate voids in the IfSAR DEM to reflect the distribution and size of voids that might be expected in DEMs derived from optical stereo sensors. Correlation masks from 99 MicMac ASTER (MMASTER)-processed stereo scenes (Girod et al., 2017) provides the basis for void simulation as low correlation areas represent failure of the stereographic reconstruction and elevation determination. We thus use areas of low correlation in the We average and mosaic the 99 ASTER correlation masks together, and select a mean correlation threshold of 0.5 to serve as the lower bound for acceptable correlations. This choice of threshold is based on a visual inspection of the mask produced, and the desire to mimic the ASTER data as much as possible. To further investigate the effects of interpolation method on 5 the estimates of volume change, we also increase the threshold to 0.7, comparing the differences for a select few interpolation schemes. For each threshold value, we apply the resulting mask to the IfSAR DEMs, producing voids as shown in Fig. 2.

Void Filling
The following is a brief summary of the different methods used to fill the artificially-generated voids in the DEM and dDEM products. We have split the methods into three general categories, "constant" interpolation, "spatial" interpolation, and "hyp-10 sometric" interpolation.

Constant Methods
For the so-called "constant" interpolation methods, we calculate the mean (median) elevation differences of the non-void pixels for each glacier basin, then multiply this value by the area of the glacier basin, thereby obtaining an average volume change for the glacier basin. Examples of this method in the literature include Surazakov and Aizen (2006)

1.
Interpolation of elevation. This method, applied to the DEM containing voids (here, the IfSAR DEMs), interpolates raw elevation values of the surrounding pixels to fill voids. The resulting interpolated DEM is differenced from the second DEM, followed by calculation of the volume changes. Examples of this approach can be found in Kääb (2008) 2. Interpolation of elevation differences. Two original, unfilled DEMs are differenced to create a DEM difference (dDEM).
Then, the voids in the dDEM are filled using bilinear interpolation. An example of this approach can be found in Kääb 3. Mean elevation difference in 1 km radius. For each void pixel, we calculate the average elevation difference based on on-glacier pixels within a 1 km radius of the void pixel. Examples of this approach can be found in Melkonian et al. (2013Melkonian et al. ( , 2014.

Hypsometric Methods
The so-called "hypsometric" methods are based on the assumption that there is a relationship between elevation change and elevation. They can be further sub-divided into "global" and "local" approaches, depending on whether the mean is calculated using data from the entire region (i.e., "global") or for an individual glacier basin only (i.e., "local"). The global approach is often used by altimetry studies to extrapolate measurements from only a few glaciers to a regional scale (e.g., Arendt et al.,

Uncertainties
To estimate the uncertainties in the true volume changes, we first co-register each DEM (SRTM, 2012 and2013 IfSAR campaigns) to ICESat, using the method described by Nuth and Kääb (2011). We can then use the triangulation procedure described 20 in Paul et al. (2017) to estimate the residual bias ε bias after co-registering the DEMs to each other; i.e. the uncertainty in correcting the mean bias between the DEMs. We also estimate the combined random error in elevation, ε rand by calculating the RMS difference of the population of dDEM pixels on stable ground. For each glacier, the error in volume change ε ∆V can be estimated as: 25 with A the glacier area, ε A the error in glacier area (here assumed to be 10%), and ∆h the mean elevation change on the glacier. To account for spatial autocorrelation, as well as the two sources of uncertainty in the elevation differences (ε bias and ε rand ), ε ∆h can be written: where n is the number of pixels (i.e., measurements) that fall within the glacier outline, L is the autocorrelation distance (here assumed to be 500 m), and r is the pixel size (30 m). Finally, we can combine equations (1) and (2) to obtain: only a small number (8%, 36 glaciers) having more than 40% voids. Voids are distributed similarly to glacier area with respect to normalized glacier elevation, and most of the voids, as well as most of the glacier area, are found in the middle third of the glacier elevation range. These void and area distributions, along with the range of elevation changes, suggests that the 10 middle third of the glacier elevation range is the most important to ensure correct estimation; that is, uncertainties introduced by interpolating over voids in the upper and lower thirds of the elevation range will be muted, owing to the typically smaller areas and percentage of voids in these ranges.

Individual Glaciers
The initial, non-voided maps of elevation differences for the 2012 and 2013 IfSAR acquisition areas are shown in Fig. 4. In These contrasting patterns of elevation gain and elevation loss inform some of the patterns shown in Fig. 5. Generally speaking, the global hypsometric methods are the farthest from the true values, which is perhaps not surprising in a region with 25 a variety of elevation change patterns such as this one. Glaciers that are far from the average volume change of -0.11 km 3 will tend to be far from the true volume change when the volume change is estimated with the regional values, as the data used do not reflect conditions at that particular glacier. As a result, volume changes for glaciers losing much more volume than the average tends to be overestimated, while volume changes at glaciers that are losing less than the average, or even increasing in volume, tends to be underestimated. Methods which use data from a particular glacier outline, or in a small area close to the particular glacier outline, tend to do a much better job of reproducing volume changes over each of these glaciers than do these global methods.
We also see little overall difference between the two linear interpolation methods. Kääb (2008) estimated the difference between these two methods for glaciers on Edgeøya, Svalbard, to be 1±12 m RMS. If we convert the volume change estimates to a mean elevation change for the glaciated areas, we obtain a mean difference of 0.00±0.01 m for these two methods. This 5 difference is due to the fact that Kääb interpolated between contour lines over an entire ice cap, whereas we interpolate over much smaller areas that are confined by mountains, with much smaller differences on either side of a void. Thus, the effects of the different interpolation methods are muted. Additionally, Kääb used contour lines derived from aerial images with low contrast at higher elevations, which are likely biased as a result. We would most likely see similar results if we had used the NED DEM as reference, rather than the SRTM, as the NED was produced from similarly low-contrast aerial images.

10
The statistical summary for the difference in volume change estimates over all glaciers individually (Table 1) shows that on average, mean and median differences to the true values are generally low (<0.001 km 3 ), as are root mean square (RMS) values (typically <0.2 km 3 , with the exception of the global hypsometric methods). This pattern does not hold, however, for the larger glaciers, which tend to have a much larger spread in estimates of volume change, and therefore appear to be more sensitive to the various methods. The percentage of estimates that fall within the uncertainty range of the true volume change 15 estimates for most of the methods is quite high, around 95%. One notable exception is the median multiplied by glacier area method described in section 3.2.1, which aside from the global hypsometric methods, shows the fewest number of glaciers for which the interpolated value falls within the uncertainty (77.85%), shows the largest individual overestimation at 2.32 km 3 , the largest mean and standard deviation (0.02±0.15 km 3 ), and the largest RMS difference (0.15 km 3 ), and the worst agreement with the regional volume change estimate (8.3 km 3 overestimation). 20 Fig. 6 shows the elevation change over Taku Glacier, with holes filled in for the nine non-constant methods. As mentioned previously, the spatial methods ( Fig. 6b-d) and the local hypsometric methods (Fig. 6h-j) show the most similarity to the original elevation changes (Fig. 6a), with some subtle differences. The hypsometric methods have the effect of smoothing out the patterns of elevation change, whereas the spatial interpolation methods tend to preserve the original spatial patterns within elevation bands. Near the dividing lines between glacier basins, discontinuities can be seen in the local hypsometric maps, 25 compared to the gradual changes across dividing lines seen in the original elevation changes and the spatially-interpolated maps. This suggests that the choice of glacier basin outlines can have an impact on the resulting volume change estimates.
Finally, the global hypsometric methods (Fig. 6e-g), taking data from the region, do not faithfully reproduce the anomalous elevation change patterns for Taku Glacier.
For the largest 20 glaciers in the dataset (all >100 km 2 ), which represent 61% of the total glacier area for the glaciers 30 studied, as well as 68.8% of the volume change in the region (a total of -49.9 km 3 ), we see a number of patterns related to each of the methods. Fig. 7 shows that for these largest glaciers, most of the methods fall within ∼ ±0.5 km 3 of the true value, with significant outliers for some methods on some glaciers. For example, each of methods for glacier RGI60-01.27102 (Holein-the-Wall Glacier) are clustered quite close to the true value, with the exception of the global methods. Hole-in-the-Wall Glacier is directly adjacent to Taku Glacier, and is also slightly gaining mass, thus leading to the discrepancy with the regional glacier are relatively small overall (19% of the glacier area), and the elevation change pattern for this glacier is also in line with the regional trends (strong elevation loss at lower elevations, small gain at higher elevations). 5 Fig. 8 shows a box plot of the distribution for each method for the glaciers shown in Fig. 7. Again, we see that the bestperforming methods, based on the size of the interquartile range and then mean difference of each interpolated volume change estimate, are the spatial interpolation methods, the local hypsometric methods, and the mean dH constant method. Table 2 shows the differences to the true volume change for two glaciers with some of the largest deviations from the true values. The median elevation change estimate for Field Glacier has the largest overall change from the true value for the non-10 global methods, at +2.32 km 3 , while the global fits for Taku Glacier have the largest negative changes, all over -3.50 km 3 .
These differences are most likely for the reasons discussed above: the data being used to estimate volume change for Taku Glacier are far more negative than reality. For Field Glacier, only the median elevation change method and the global methods perform particularly poorly; the rest are all within ±0.36 km 3 of the true estimate of -3.02 km 3 (12%). As shown in Fig. 9, this is most likely because of the heavy slant towards very negative elevation changes in the elevation change distribution for Field 15 Glacier. Those positive values of elevation change found on the glacier are relatively small as compared to the negative values, and so the median is pulled heavily towards zero, greatly underestimating the volume change.
Based on the results for individual glaciers, the best methods (i.e., that introduce the least uncertainty/bias and the estimates closest to the original, non-voided estimates) appear to be linear interpolation of elevation change, the mean hypsometric approach, and the 1km neighborhood approach.

Regional Totals
While the differences when averaged over all glaciers tends to be close to zero, the differences in the regional estimates can vary substantially, as shown in Table 1. The methods that came closest to the "true" volume change for the region were local mean hypsometric method, linear interpolation of elevation differences, and the global mean hypsometric method, which all yielded estimates within 0.4 km 3 (0.8%) of the regional total. A form of the global mean hypsometric method is one that is often used 25 in altimetry-based studies to extrapolate measurements to unsurveyed glaciers, either using absolute or relative elevation (e.g., Arendt et al., 2002;Kääb et al., 2012;Johnson et al., 2013;Larsen et al., 2015), and this result would indicate that relatively little bias is introduced to the regional estimate through this form of extrapolation.
The next best estimates after the three closest were the local median and polynomial hypsometric methods, linear interpolation of elevation, and the 1 km average method, all coming within 2 km 3 (4%) of the regional total. One explanation for the 30 value for the elevation interpolation method is as discussed in Kääb (2008): elevations on the glacier surface are not necessarily self-similar in a given area, and elevations can vary greatly even on relatively small length scales. As for the 1 km neighbourhood method, it may be that 1 km is too large of an area to try to average over for some glaciers in this region, or it may be that the neighbourhood window used is including values from neighbouring glaciers that have very different patterns of elevation change, thus behaving more like a "global" method in some areas.
The methods that came the farthest from the regional total were the median elevation change method, as discussed above, and the global polynomial hypsometric method, both over/underestimating the regional total volume change by over 8 km 3 , well above the uncertainty of 5.21 km 3 . In contrast to this, estimating volume changes using the global median hypsometric 5 method overestimated the regional total volume change by 4.47 km 3 . While for an entire glacier basin, the median elevation change skews very heavily towards zero due to the asymmetry in positive and negative values of elevation change, this is not necessarily the case for an elevation bin. As noted by Kääb (2008), and borne out by the elevation change interpolation method, elevation changes tend to be rather self-similar on small spatial scales, and the median change for an elevation bin tends to be a more accurate reflection of the actual elevation change.

Increasing void area
By increasing the threshold used to induce voids from 0.5 to 0.7, we were able to estimate how sensitive the different methods are to the size of voids in the DEMs. As might be expected, the overall number of glaciers for which the interpolated volume changes fall within the uncertainty drops across all methods, though not completely evenly. The global methods are rather insensitive to the increase in voids, likely because the increase is muted on the regional scale as compare to the individual 15 glacier scale For the largest 20 glaciers shown in Fig. 7, Fig. 10 shows the difference in volume change estimate for each of the methods.
Overall, most glaciers do not have large differences for most of the methods (<0.15 km 3 ). The global methods tend to show the least change, while the constant methods and spatial methods tend to show larger, though still relatively small, differences.

ASTER differences 20
One caveat remains when using methods such as linear interpolation of elevation differences. To illustrate this, we used ASTER DEMs acquired on 13 August 2015 over a portion of the 2012 IFSAR acquisition area, and differenced these DEMs to the SRTM. Compared to the IfSAR DEM, the ASTER DEMs are quite noisy in the accumulation areas of glaciers, owing to the low contrast, and hence low correlation, between the original images in the ASTER scenes. As such, even after correlation masking, there is significant noise in the DEM difference map (Fig. 11a). When these values are linearly interpolated, the 25 resulting dDEM shows clear interpolation artefacts and elevation changes that differ greatly from the original IfSAR/SRTM differences, biasing the estimated volume changes (Fig. 11b). For the 91 glaciers covered by these ASTER DEMs, the other "best" estimates named above (local mean hypsometric and global mean hypsometric) provide a volume change estimate of ∼0 km 3 , whereas linear interpolation of elevation differences yields a volume change of 3.4 km 3 . Looking further at this, this discrepancy is almost entirely due to one glacier, Johns Hopkins -linear interpolation of elevation changes yields a volume 30 change estimate of 3.6 km 3 for this glacier, while the other estimates give only ∼2 km 3 . Thus, we caution against using a direct linear interpolation of elevation differences to fill voids without first filtering or otherwise removing potential outliers, or when the distances between known values are quite large in relation to the glacier width; that said, the local mean hypsometric approach used by many studies performs just as well as linear interpolation of elevation differences, appears to be more robust against this kind of noise, and is easily implemented in place of linear interpolation.

Conclusions
We have compared 11 different methods for filling voids in DEM difference maps over glaciers, and compared the effects of these different methods on estimates of glacier volume change. Three methods, linearly interpolating elevation changes, the 5 so-called local mean hypsometric method, and the so-called global mean hypsometric method, performed remarkably similarly in estimating the regional total volume change, differing from the true estimate by less than one percent; the first two methods also performed well on an individual glacier basis. For the input data we have used, linearly interpolating elevation differences tends to produce elevation change maps that look the most similar to the original maps. This may not hold, however, for voids that take up a larger portion of the glacier area, where the assumption that elevation changes are similar over small distances 10 may be violated, and interpolation artefacts would introduce larger uncertainties. Additionally, this may not hold for DEMs that are noisier, especially in low-contrast areas such as the accumulation zones of glaciers; as such, we caution against adopting this method without first considering the characteristics of the DEMs beings used. In terms of individual glacier estimates, using the mean elevation change per elevation bin multiplied by the glacier hypsometry performs quite well, which perhaps explains its widespread use in studies of glacier volume change and geodetic mass balance.

15
Taken on average, most of the methods perform well for the glaciers in the region, with low mean, median, and RMS differences for all methods, though large outliers skew the differences in the regional totals. Using the median elevation change for a glacier, multiplied by the glacier area, tends to work quite poorly, owing to the asymmetrical distribution of positive and negative elevation change values; i.e., the glaciers in the region tend to have significantly more negative values of elevation change than positive values. Unless there is good reason to think the distribution of elevation changes for a particular glacier 20 or region is more symmetrical, this method should be avoided; the same can be said for using a median hypsometric approach, which does not do as well as the mean hypsometric approaches, although this may not always be the case when there is significant noise in the original DEMs. As might be expected, using regional data to estimate the volume change of an individual glacier quite often does poorly, though the regional total volume change can be well-approximated in this way.
In summary, the effect of DEM voids on estimates of geodetic mass balance depends on the size of the voids, the magnitude 25 and spatial pattern of changes on the glaciers, as well as the nature of the DEMs used. The choice of void-filling method is important, and if not considered properly, biases many times the uncertainty of the volume change measurement can be induced, while on the regional level, biases of up to 20% can be induced. The choice of "best method" will depend on the goal of the study, as well as the nature of the voids in the DEMs and the changes of the glaciers. Interpolation methods using elevation differences from an individual glacier, or differences within a close proximity to an individual glacier, tend to be the 30 most accurate and robust. If the DEMs used have significant noise, or have large holes, however, linear interpolation may not be suitable. If attempting to estimate geodetic mass balance for unsurveyed glaciers, as is needed in many altimetry-based studies, only a global method will suffice, though the mass balance estimate for a given unsurveyed glacier should not be taken at