A minimal model for reconstructing interannual mass balance variability of glaciers in the European Alps

We present a minimal model of the glacier surface mass balance. The model relies solely on monthly precipitation and air temperatures as forcing. We first train the model individually for 15 glaciers with existing mass balance measurements. Based on a cross validation, we present a thorough assessment of the model’s performance outside of the training period. The cross validation indicates that our model is robust, and our model’s performance compares favorably to that from a less parsimonious model based on seasonal sensitivity characteristics. Then, the model is extended for application on glaciers without existing mass balance measurements. We cross validated the model again by withholding the mass balance information from each of the 15 glaciers above during the model training, in order to measure its performance on glaciers not included in the model training. This cross validation indicates that the model retains considerable skill even when applied on glaciers without mass balance measurements. As an exemplary application, the model is then used to reconstruct time series of interannual mass balance variability, covering the past two hundred years, for all glaciers in the European Alps contained in the extended format of the world glacier inventory. Based on this reconstruction, we present a spatially detailed attribution of the glaciers’ mass balance variability to temperature and precipitation variability.


Introduction
Glaciers are prominent features of the alpine landscape.As they integrate their surface energy and mass fluxes over multi-annual to centennial timescales (e.g.Jóhannesson et al., 1989;Oerlemans, 2001), the fluctuations of the glaciers' extent constitutes a naturally low-pass filtered signal of the atmospheric variability.Through this property, glaciers allow people to directly perceive slow changes of the climate system, that otherwise would be overwhelmed in human perception by short-term noise.Changes in glacier extent have therefore been discussed long before climate variability and change received the attention they do today (e.g.Walcher, 1773;Finsterwalder and Schunk, 1887).
But the low-pass filter comes without a manual.In order to understand what the fluctuations of glacier behavior imply for their atmospheric forcing, it is necessary to understand the interaction between glacier and atmosphere, and to attribute changes in glacier behavior to specific changes in the atmospheric forcing (e.g.Mölg et al., 2009).Without the ability to distinguish between different modes of change (e.g. between stochastically forced fluctuations and fluctuations caused by anthropogenic warming), the glacier fluctuations are meaningless to the observer interested in inferring atmospheric variability from them.
The surface mass balance of a glacier is closer related to the atmospheric forcing than changes in glacier length are.Because ice dynamics do not complicate the relation between surface mass balance and atmospheric forcing as they do with the relation between atmospheric forcing and length variations, the surface mass balance provides a more direct access point to understanding glacier-climate interactions.But measurements of the glacier mass balances exist only for much shorter time spans than observations of glacier length (Oerlemans, 1994(Oerlemans, , 2005;;Cogley, 2009).For detecting statistically robust connections between climate change and glacier extent on multidecadal and longer time scales, the period of direct measurements is too short (Roe and O'Neal, 2009;Roe, 2011).
Reconstructions of glacier mass balance time series are therefore desirable, and there are a number of approaches that have been followed.Schöner and Böhm (2007) present two hundred years of mass balances for two Austrian glaciers Published by Copernicus Publications on behalf of the European Geosciences Union.

B. Marzeion et al.: Reconstructing interannual mass balance variability of glaciers in the Alps
based on a statistical relation between mass balance and summer temperatures, minimum glacier elevation, area-weighted mean glacier elevation, winter precipitation, summer snow precipitation, and a measure of continentality.Huss et al. (2008) reconstruct spatially distributed mass balances of four Swiss glaciers back to 1865, based on a temperature index melt model (Hock, 1999).Nemec et al. (2009) employ a spatially distributed energy balance model to reconstruct the mass balance of Vadret da Morteratsch back to 1865.
Similarly, there are studies that attribute observed behavior changes of single glaciers to atmospheric forcing patterns (e.g.Hoinkes et al., 1968;Hodge et al., 1998;Rasmussen and Conway, 2004;Huybers and Roe, 2009).Nesje et al. (2000) find correlations between the mass balance of several glaciers in western Norway and the North Atlantic Oscillation (NAO), caused by the strong influence of the NAO on winter precipitation.Similarly, Reichert et al. (2001) compare a glacier in Norway (Nigardsbreen) to an alpine glacier (Rhonegletscher) and find that the decadal component of the NAO signal is driving the mass balances of both glaciers, with opposite signs between Nigardsbreen and Rhonegletscher.For the same pair of glaciers, Reichert et al. (2002) find that the recently observed negative mass balances exceed the variability of mass balances that can be explained by internal variability of the climate forcing.Steiner et al. (2008) use a nonlinear backpropagation network trained with reconstructions of temperature, precipitation, and glacier length in order to test the sensitivity of Lower Grindelwald Glacier to scenarios of future temperature and precipitation changes.Huss et al. (2010a) present 100-yr long reconstructions of the mass balances of thirty Swiss glaciers and detect links to North Atlantic multidecadal variability.
Here, we present a minimal model of the glacier surface mass balance, and its application to reconstruct two hundred years of mass balance variability in the Alps.The aim here is not to use this reconstruction to drive dynamical glacier models, which could serve to verify reconstructions of past climate variability based on a comparison of modeled and observed glacier extents.We rather intend to present a mass balance reconstruction that is based on objective measures of model robustness, that is spatially and temporally detailed, and that is extending over a time period long enough to allow for conclusions on how patterns of atmospheric variability influence glacier mass balance variability in the European Alps.In this contribution, we only present the reconstruction method and validation together with an exemplary simple application, as the analysis of the rich data set created is beyond the scope of this paper.
In Sect.2, the model is first derived for glaciers with existing mass balance observations.The model is trained for each glacier individually, and a detailed validation of the temporal robustness and skill of the parameter estimation is presented.We compare the results of the individually trained model to another model, which is based on seasonal sensitivity characteristics (a simplified version of the method of Oerlemans and Reichert (2000), hereafter referred to as the SSC model).Then, our minimal model is extended to glaciers without existing mass balance measurements, by applying the mean of the individually estimated parameters (hereafter referred to as the mean model).This is followed by a detailed validation of the mean model's skill, which allows for a quantification of how changing glacier geometry, and atmospheric flow regimes, impact the model error.The results of the reconstruction are presented in Sect.3, along with an simple example of how the model may be applied in order to perform spatially detailed studies that may detect the regional differences between the sensitivities of glaciers to climate forcing.The potential and limitations of the model are discussed in Sect.4, before we summarize the results and conclude in Sect. 5.

Consideration of the climatological mean
First, the model is established for a glacier that is in equilibrium with its climatological forcing.In this case, the annual mass gain 12 i=1 P i,clim , where P i,clim is the climatological monthly solid precipitation, integrated over the surface of the glacier, has to be lost to ablation (basically melt) within the same hydrological year.The production of melt is controlled by the energy budget of the glacier.While the determination of the exact energy budget of a glacier is quite intricate (see e.g.Kuhn, 1987;Oerlemans, 2000;Mölg and Hardy, 2004), at mid latitudes the air temperature is a reasonable proxy for the energy available to the glacier for producing melt (Sicart et al., 2008;Ohmura, 2001), and has been used to derive a whole set of minimal glacier mass balance models, also known as temperature index melt models (see Hock, 2003, for an overview).For our glacier in equilibrium, it is therefore reasonable to determine a temperature sensitivity µ such that 12 where T i,clim is the climatological monthly air temperature at the glacier terminus, T melt is the monthly mean air temperature above which melt at the glacier terminus occurs, and the maximum operator ensures that only months with mean temperatures above T melt contribute to the melting of ice.We use the terminus height of a glacier as the reference point because it provides a natural temperature threshold: if temperatures at the terminus are below freezing, it is reasonable to assume that no melt occurs at the entireglacier surface.Note that T melt does not necessarily have to be 0  above-freezing temperatures even if the monthly mean is below 0 • C.
The source of data for T i and P i used in this study is the HISTALP dataset (Auer et al., 2007) 1 , which provides monthly sums of solid (as well as total) precipitation and 2 m temperatures on a 5 × 5 min grid (approxmiately 9.5 × 6.5 km) of the greater alpine region, covering the years 1780 to 2008 (temperature) and 1801 to 2003 (precipitation).

Introducing variability
If the forcing of our glacier is variable, then 12 i=1 where MB is the annual specific mass balance of the glacier.
If there have been measurements of N years of MB on the glacier of interest, we can now in principle determine µ and T melt such that the mean square error mse = is minimized (least-squares regression).As it turns out, the mean of the mse of all glaciers considered in this study is quite insensitive to optimization of T melt (the mean of rmse = mse 1/2 is reduced by only 1 mm w.e. if we optimize our model for T melt ), and when we prescribe T melt , there is a minimum in the rmse at T melt = 0 (see Fig. 1).To obtain T i and P i , we first determine the HISTALP grid point whose center is closest to the glacier location, and extract temperature and precipitation time series from that grid point.However, the height z HISTALP of the location of the 1 Available at http://www.zamg.ac.at/histalpHISTALP temperature T i,HISTALP is usually different from the height z glacier terminus of the glacier tongue, which we obtain from Cogley (2009) 2 .We therefore correct the temperature for the height difference by assuming a constant lapse rate γ , such that In order to estimate γ , we run our model once for all glaciers considered in this study and require that any dependency of T melt on z glacier terminus − z HISTALP disappears.This yields γ = −0.0063K m −1 , which is reasonable.In the following, we ignore any changes of z glacier terminus that may result from mass balance variability (see Sect. 4 for a discussion), and because of the weak dependence of the model performance on the optimization of T melt , we prescribe T melt = 0 • C. While the modeled mass changes MB model have a high correlation with the observed mass changes MB measured (left panel in Fig. 2), there is a strong bias and an underestimation of the interannual variability of the mass balance (blue line versus green line in Fig. 3).The reason for this is likely an increase of precipitation with altitude (and potentially accumulation on the glacier from avalanches and aeolian transport), which leads to an underestimation of the mass balance (since typically, z glacier terminus > z HISTALP ).In order to compensate for this precipitation lapse rate, we introduce a scaling parameter a such that the final model for the annual mass balance of the glacier is to be minimal.

Model validation
In order to validate the model, we construct a time series of MB model,cross , where each value MB k,model,cross , k indicating the year, is independent of MB k,measured .The values of MB k,measured are area-integrated mass balances obtained from Cogley (2009).MB model,cross is determined by employing a leave-one-out cross validation routine (Michaelsen, 1987;Hofer et al., 2010): first, we determine which of the glaciers in the mass balance data set are situated within the HISTALP region (a total of 39 glaciers).We then determine the auto-correlation time lag t r,lag of MB measured for each of these glaciers by identifying the lag (in years) after which the auto-correlation drops below the 90 % significance interval.Finally, we optimize the parameters a and µ, leaving a moving window of one year ± t r,lag out of the data (the leftout value has to be buffered with the auto-correlation time lag in order to ensure that the remaining values are truly independent from the removed value).I.e. for a glacier with N measured mass balances, we perform N optimization routines, obtain N values for a optimized,cross and µ optimized,cross , and N values MB k,model,cross .This is an effective validation mechanism especially when only short observational time series are available, as it guarantees that for each time step, MB k,model,cross is independent of MB k,measured .The cross validation thus includes an assessment of the temporal model parameter stability, and at the same time allows for the usage of all available data for the training of the model.
Based on the time series MB model,cross we can derive an estimate of the model error The performance of N optimization routines also allows for an assessment of the robustness of the parameter optimization.Figure 4 shows the standard deviation of the parameters a and µ obtained during the cross validation.If only few measured mass balances exist, the values a optimized,cross and µ optimized,cross strongly depend on single measurements, and the standard deviations are high, indicating that the parameter optimization is not robust.The principle reason behind the increased parameter uncertainty for small numbers of measured mass balances is the potential of compensation between precipitation and melt in our model.If only one mass balance measurement would exist, there would be an infinite number of parameter combinations that reproduce the observed mass balance exactly.With increasing numbers of mass balance measurements, the parameters have to accommodate for an increasing number of different temperature, precipitation, and mass balance combinations, which constrain the parameter estimates.
Based on this evaluation, we determine a subset of all the glaciers with mass balance measurements by requiring at least 15 values of MB measured , leaving 15 glaciers in the final set (see Table 1). 3Values derived from glaciers rejected from this final set are omitted in any further analysis.In the figures, these values are shown in faded colors.
Figure 5 (black dots) shows the results of the cross validation in terms of rmse.There is no apparent dependence of rmse on the number of available MB measured , indicating that the parameter optimization is robust.
Based on the cross validation results, we then calculate the skill score of the model using the individually optimized parameters where mse reference,cross is the mean square error of a reference model.As the reference model, we determine MB k,measured,cross for each year k by averaging over MB measured leaving out MB k,measured , in order to ensure that also the reference model is independent of MB k,measured .If the reference model is based on the observed climatology of the modeled variable (as is the case here), the skill score can be understood to measure the correlation between modeled and observed values, with penalties for bias and under-(or over-)estimation of the variance (see Wilks, 2006, for a detailed discussion).
The correlation coefficient r model , determined as the correlation of MB model,cross with MB measured , is listed together with SS model in Table 1 for all glaciers in the final set, and Fig. 6 shows the relation between SS model and rmse model,cross .The model shows considerable skill for all the glaciers in the final set.
In a last step, we determine a optimized and µ optimized for each glacier as the means of a optimized,cross and µ optimized,cross (see Table 1).

The benefits of a parsimonious model
Models similar to the model discussed here have been presented before, e.g. by Oerlemans and Reichert (2000) who determine seasonal sensitivity characteristics (SSC) as a measure of a glacier's mass balance sensitivity to monthly temperature and precipitation anomalies.They determine the annual mass balance anomaly MB as where T j = T j −T j,ref is the monthly temperature anomaly, and P j = P j /P j,ref is the precipitation anomaly.In order to use such a model for mass balance variability reconstructions, a total of 24 parameters (12 for C T , and 12 for C P ) has to be determined, as opposed to 2 parameters in our model (i.e. a and µ).Here, we will compare the approach of using as few parameters as possible (our model) to an approach based on the attempt to mimic seasonal effects by introducing more parameters, or generally to strive for better model performance by including additional parameters (e.g.Schöner and Böhm, 2007;Fischer, 2010).First, the parameters C T ,j and C P ,j have to be determined for all the glaciers in the final set.Here, for reasons of simplicity, we determine these SSC model parameters by calculating a multiple linear regression of MB on T j and P j , using the climatological monthly means of T j and P j (calculated from HISTALP data at the glaciers' locations) as reference values T j,ref and P j,ref .Note that this is not the approach taken by Oerlemans and Reichert (2000), who, having substantially more information on the glacier available, determine the SSC model parameters C T ,j and C P ,j by fitting a more complex model of the mass balance to the observed mass balance profile for a climatological setting, and subsequently testing the sensitivity of that complex mass balance model to changes in monthly temperature and precipitation.We use the multiple linear regression approach solely as a straw man substitute for minimal glacier mass balance modeling approaches that try to improve model performance by including more parameters.
As above for our model, we assess the SSC model by a leave-one-out cross validation.Therefore, we first determine N sets C k,T ,j,cross and C k,P ,j,cross for each of the glaciers (where N is the number of available mass balance measurements of the year k).We ensure each of the sets is independent of MB k,measured , by leaving out the years k ±t r,lag during the multiple linear regression.
Figure 7 shows the resulting C T ,j and C P ,j , together with their uncertainty determined as the standard deviations of C k,T ,j,cross and C k,P ,j,cross , for two exemplary glaciers (Hintereisferner, with 51 measurements of MB within the HISTALP period, and Griesgletscher with 42 measurements).The uncertainty of the parameters increases strongly with decreasing number of available measurements.The negative values of C P , and positive values of C T found for some months indicate that using a multiple linear regression for parameter estimation without considering the physical meaning is problematic, as e.g.positive values of C T imply that higher temperatures lead to a more positive mass balance.
Based on this cross validation, analogous to the routine described above, we also determine the skill score and the correlation SS SSC and r SSC of the SSC model.The results for all glaciers in the final set are listed in Table 1.
Both SS SSC and r SSC are considerably lower than the corresponding values from our model for most glaciers (see Table 1).Statistically, this can be understood as a result of lesser parsimony (i.e. more model parameters) of the SSC model, which leads to stronger dependence of the values of C T ,j and C P ,j on single measurements, and which is detected by the cross validation procedure. 4Physically, it can be understood as the extra demand on the SSC model to reproduce information which is readily available in the forcing (T j and P j ), but which has been removed from the forcing before it is fed to the model.E.g.C T = 0 during the winter implies that a positive temperature anomaly during the winter, no matter how strong, will not produce any changes in MB.But this is only the result of the model not experiencing warm enough temperature anomalies in winter during the training period, a fact which is an expression of the low climatological winter temperatures.By removing the seasonality from the forcing (i.e. using T j and P j as input instead of T j and P j ), the SSC model has to reproduce not only the connection between temperature and precipitation anomalies and MB, but also the mean climate and the seasonality of the forcing.
The disadvantages of lacking parsimony go unnoticed if the model is not validated with scrutiny: for illustrative purposes, we have also calculated the correlation r SSC,fitted by using C T ,j and C P ,j instead of C k,T ,j,cross and C k,P ,j,cross to determine MB SSC .The results are listed in Table 1.Calculating r SSC,fitted instead of r SSC , i.e. omitting an independent model validation -as is frequently done (e.g.Oerlemans and Reichert, 2000;Wildt et al., 2003;Fischer, 2010) -leads to a significant overestimation of the model's skill.

Extending the model to unmeasured glaciers
While the optimized model parameters differ between the glaciers, their standard deviation is close to an order of magnitude smaller than the parameters (σ (a optimized ) = 0.71, and σ (µ optimized ) = 22), indicating that the model may have skill even if applied to glaciers without any measured mass balances available for training.In order to test this hypothesis, we perform a cross validation of the mean model (i.e. the model employing the arithmetic mean of the parameters optimized for all the glaciers in the final set).
For each of the glaciers in the final set, we build a mean model that is independent of that glacier's mass balance measurements by excluding from the calculation of the mean model parameters the optimized parameters from that glacier.We then calculate the rmse of the mean model for that glacier, the results are shown in Fig. 5 (red dots).While the rmse of the mean model increases compared to that of the individually trained model, the error is not excessive (rmse mean model,cross = 674 ± 246 mm w.e., up from rmse model,cross 395 ± 101 mm w.e., see Table 1).As above, we also calculate the skill score of the mean model5 , and the correlation based on the cross validation; the results are listed in Table 1 (see also red dots and lines in Fig. 6).As is to be expected, the skill score of the mean model is considerably lower than that of the individually trained model.However, the skill that can be expected from the mean model (SS mean model = 0.34 ± 0.23) is still considerably higher than that of the less parsimonious SSC model, which furthermore is limited to application on glaciers with measured mass balances.
Note that the correlation coefficient from the cross validation actually increases slightly for most glaciers when the mean model is applied, which can be understood as the benefit of a strongly enlarged data basis of the mean model compared to those of the individually trained model.
Since the mass balance variability reconstructed by the mean model is not notably different from that of the individually trained model (red dots and red line in Fig. 3), we can infer that the reduction in model skill is based on an increase in model bias.This is confirmed by the determination of the model bias (see Fig. 8): while the mean bias for all glaciers is quite small for both the individually trained model (24 mm) and the mean model (94 mm), the spread between glaciers is considerably larger for the mean model (standard deviation of 506 mm) than for the individually trained model (standard deviation of 73 mm).This relatively unconstrained bias of the mean model poses strict limits on the applicability of the results.This is especially the case for glaciers without existing mass balance measurements, and where absolute values of the mass balance are necessary, such as driving an ice dynamics model, or calculating contributions to sea level change.But it does not narrow the applicability for studies concerned with spatio-temporal mass balance anomalies (see discussion in Sect.4).exemplary case.6 Figure 9a shows the observed (green) and reconstructed (black) time series of the annual mass balance, together with the rmse derived from the cross validation.The visual inspection mirrors the results of the cross validation procedure: while there is generally good agreement between measured and modeled mass balances, the variability of the model is slightly smaller than that of the measurements.Figure 9b, c shows the annual anomalies of temperature and solid precipitation from the climatological mean at the site of the glacier, calculated from the HISTALP data set used as forcing.In order to estimate their respective influence on the modeled mass balance, we repeat the reconstruction twice: in the first case, we replace the variable precipitation of the HISTALP data with the climatological monthly mean precipitation.This yields the mass balance time series shown in Fig. 9d, a measure of how temperature variability imprints on mass balance variability.In the second case, we replace the variable temperatures of the HISTALP data with the climatological monthly mean temperatures.This yields the mass balance time series shown in Fig. 9e, a measure of how precipitation variability imprints on mass balance variability.In order to determine the relative importance of temperature and precipitation on the reconstructed mass balance, we then calculate a running 10-yr window correlation between temperature-driven and precipitation-driven mass balance and the full reconstruction (i.e.correlation between Fig. 9a and d, and between Fig. 9a and e).The results are shown in Fig. 9f.With a few exceptions, temperature variability contributes significantly to the mass balance variability, while the contribution from precipitation variability is generally less important and only intermittently significant.

Individually trained model
This reflects the typical situation of a glacier situated in the inner region of the Alps (see Sect. 3.3), and is different for other glaciers (see Supplement).

Mean model
The result of applying the mean model to Hintereisferner is shown in Fig. 9g.Again, the green line shows the measured mass balance, while the black line shows the reconstruction.
Note that here, opposed to Fig. 9a, the reconstruction is derived completely independent of any measurements of Hintereisferner.Note that with a skill score of −0.94, Hintereisferner is one of the glaciers where the mean model performs comparatively weak (see Table 1, and Supplement).But the relatively high mean skill score of the mean model, as well as the high mean correlation, indicate that it is indeed meaningful to apply the mean model to glaciers without existing measurements of the mass balance.

Spatial attribution of variability to temperature and precipitation
In order to demonstrate potential applications, we employ the mean model to reconstruct time series of annual mass balances for all glaciers in the European Alps contained in the extended format of the World Glacier Inventory (WGI-XF) data base (Cogley, 2005) 7 .We then repeat the procedure for the derivation of the time series in Fig. 9d and e for each of these glaciers, and calculate the correlation between temperature-driven and full variability, and precipitationdriven and full variability over the entire time series, and for each of the glaciers.The results are shown in Fig. 10.On the scale of the whole alpine region, temperature slightly dominates precipitation as the driver of mass balance variability.But a pattern emerges which indicates that the further west, and the closer to the margin of the Alps the glaciers are situated, the more important precipitation becomes.This confirms the analysis of Kerschner et al. (2000), who found that during the Younger Dryas, glaciers in the inner Alps responded weaker to precipitation changes than those closer to the margin of the Alps.It also reflects the finding of Oerlemans and Reichert (2000) that glaciers with high annual precipitation tend to have a stronger sensitivity to precipitation changes than glaciers in dry climates (see also Ohmura et al., 1992;Kaser, 2001).
Another pattern emerges in the vertical: the higher the glacier terminus is situated, the stronger the influence of precipitation becomes relative to the influence of temperature (see Fig. 11): the correlation of precipitation-driven variability itself correlates with terminus altitude with r = 0.33, and the correlation of temperature-driven variability with r = −0.44.This result can be understood as a consequence of the freezing threshold that affects temperature and precipitation in opposite ways: the generally colder temperatures of higher glaciers imply that temperature variability on a higher situated glacier has to be strong for a greater amount of months in order to bring temperatures above freezing.But at the same time, precipitation variability, also because of the colder temperatures, shows through in the mass balance for a greater number of months.

Discussion
A similar approach for estimating the monthly glacier melt water runoff was taken by Kaser et al. (2010), based on the assumption of the glaciers being in equilibrium with the climatological mean forcing.Additionally, they did not distinguish between solid and liquid precipitation onto the glacier surface, which implies that the negative side of the mass balance equation −µ(max(0,T i,clim − T melt )) not only has to include melting of ice, but the runoff of liquid precipitation from the glacier as well.
The annual runoff in their model can then be understood as the sum of ice melt and liquid precipitation onto the ice 12 i=1 µ(max(0,T i )) = 12 i=1 µ M (max(0,T i )) + µ LP (max(0,T i )) (10) where µ M + µ LP = µ are parameters for melting and liquid precipitation onto the glacier, and where 12 i=1 µ LP (max(0,T i )) = P liquid (11) is the annual amount of liquid precipitation.This approach has the advantage that less detailed knowledge on the climatic forcing of the glacier is necessary, since information on the total monthly precipitation is generally more readily available and accurate than information on solid precipitation alone.
In order to determine the implications of not distinguishing between solid and liquid precipitation, we repeated the entire model calibration and validation process, including liquid precipitation into the positive side of the mass balance equation.Remarkably, doing so reduces the rmse of the model by 11 mm to 384 mm.While it is counterintuitive to include liquid precipitation into the mass balance, this result indicates that the amount of liquid precipitation includes information that is relevant for the mass balance, and that is not included in the temperature data.There are several reasonable explanations how liquid precipitation may influence the mass balance (e.g.directly by percolation and refreezing, which would also change the subsurface energy balance, or by changing the ablation via altering the surface energy balance, or it may include relevant information on more indirect influences, such as cloudiness), but from our model it is not possible to conclude which is the most important influence.Taking only solid precipitation into account disables the model to take advantage of that information.If a pure reconstruction of the mass balance anomalies is the objective, because of the lower rmse it is reasonable to follow the approach Kaser et al. (2010).However, since the parameter µ corresponds not only to melting of ice in their approach, the sensitivities to precipitation and temperature variability may be flawed.Therefore, if an attribution of mass balance variability to variability in the forcing is of interest (as is the case in our study), one cannot take advantage of the information contained in the amount of liquid precipitation.Huss et al. (2009) show that the parameters of a temperature index melt  model similar to our model vary considerably throughout the 20th century for glaciers in the Swiss Alps.While the small standard deviations of the parameter estimates obtained during the cross validation (Fig. 4) indicate that the temporal variabilities of a optimized and µ optimized are small, the parameters may be impacted by e.g.changes in insolation, or snow albedo caused through the deposition of black carbon.However, such errors introduced by time-varying parameters are included in the error estimate obtained by the cross validation, as long as the range of parameter variability outside the period of mass balance observations is not larger than the range within the period of observations.
While we are able to quantify the model performance in depth for the time period where measurements of the mass balance exist, it is worthwhile to consider effects that might act to reduce the quality of the reconstruction outside of the period of mass balance observations.One source of uncertainty in the reconstructed mass balance which we are not able to quantify during the cross validation is the increasing uncertainty of the forcing, i.e. the HISTALP data, further back in time.Auer et al. (2007) provide a detailed description of the sources and methods used to create the homogenized data sets of precipitation and temperature used here.After 1880, the number of stations, and therefore the station network density is nearly constant at present day numbers.Before that, there is a significant decrease in the number of stations, and the mean distance between stations in 1800 is approximately three times that of today.However, the HISTALP data do not include estimates of the error range, and we are therefore unable to quantify the increase in uncertainty in the past.This means that the error estimates of the reconstructed mass balance time series obtained during the period of mass balance measurements are likely to underestimate the error before 1880.
If major reorganizations in the atmospheric circulation around the Alps occurred during the reconstruction period, but not within the observational period, the model errors derived by the cross validation may be to small.On the other hand, the glaciers in the final set resemble a wide variety of different climatic settings and glacier geometries.The cross validation of the mean model shows that it is applicable both to glaciers in the Western Alps, exposed to the maritime influence, as it is to dry glaciers in the center of the Eastern Alps.This indicates that such changes in the atmospheric flow regime may be a lesser source of error, probably connected to the fact that our model, depending on only two parameters, gathers most information on the setting of the glacier from the forcing, and not from model parameters.
Since we calculate reference-surface mass balances (i.e. the mass balance that would have been observed if the glacier's surface topography had not changed -see Cogley et al., 2011, for a detailed definition), another interesting question is the role of changes in the glaciers' extent during the reconstruction period.Since the observed mass balances are affected by changes in the glaciers' extent, but the modeled mass balances are not, errors caused by changes in extent that are comparable to those changes that occurred during the observational period are already included in the model error derived by the cross validation.But if the changes outside the observational period are substantially larger, the error is likely underestimated.This point has been subject of discussion regarding the influence of Atlantic multidecadal temperature variability on Swiss glaciers' mass balance (Huss et al., 2010a).As Leclercq et al. (2010) point out, long time scale changes in the forcing are damped in the specific mass balance of a glacier, because the glacier responds to long time scale changes in forcing by adapting its area.There are two ways in which this source of error may enter our model presented here: (i) the long term variability in the measured specific mass balances, which we use to estimate the model parameters, is potentially underestimated with respect to a constant reference surface area.(ii) Changes in glacier extent that include a vertical displacement of the glacier terminus, and thus a change in temperature, are ignored by our model.Regarding (i), Huss et al. (2010b) show that for the reconstruction in Huss et al. (2010a), this error is only of minor importance.Since the length of the measured mass balance time series used in our study is significantly shorter than the time series used by Huss et al. (2010a), we conclude that this way of entry of the error is of minor importance in our reconstruction, but we are not able to quantify it.Regarding (ii), errors caused by changes in altitude similar to the changes that occurred in the period of mass balance observations are included in the error estimate obtained during the cross validation.It is likely that this error will be larger on longer time scales, and we are not able to quantify it using our method.Since the height of the glacier terminus does not react instantly to changes in the forcing, this error will be more significant the longer the time scales of interest are, and care should be taken when interpreting the reconstructed mass balance variability on time scales that are as long or longer than the typical time scale of glacier advance and retreat.But note that the 15 glaciers used in this study represent a wide range of glacier geometries, and that the cross-validation of the mean model effectively assesses the impact of not representing this variability.
The derivation of the model does not lend itself for a correction of a model bias.Consequently, as shown above, the model exhibits a significant bias at some glaciers.This does not impede the application of the model in analyses relying on mass balance anomalies, such as correlation-based techniques which are insensitive to bias.However, it poses limits to applications that rely on absolute values, or timeintegrated mass balances, such as driving a dynamic ice model, or estimating sea level rise.Reconstructions of mass balance variability have been presented before for a number of glaciers, e.g. by Schöner and Böhm (2007); Nemec et al. (2009); Huss et al. (2010a).In order to illustrate how ours relate to their reconstructions, Fig. 12 shows a direct comparison of the reconstructions of Schöner and Böhm (2007) for two glaciers, Nemec et al. (2009) for one glacier, and Huss et al. (2010a) for five glaciers with our reconstruction from the mean model.As can be expected from the cross validation, the variability of the reconstructions is very similar, with a correlation between the different reconstructions that is comparable to the correlation between our mean model and the measured mass balances as obtained from the cross validations.As discussed above, the main difference between the reconstructions is that ours, based on the mean model, suffers from a non-negligible bias (that has been corrected for clarity of the figure, but its value is given in the figure, where e.g."bias corrected by 807 mm" indicates that the mean model produces too positive mass balances).
Finally, we find it remarkable how well the model performs even for glaciers with only few mass balance measurements.The gray dots in Fig. 5 indicate that the individually trained model performs comparably well for glaciers with only 7 or 8 measured mass balances as it does for glaciers with 40 or more measurements.The motivation to exclude these glaciers from the final set was therefore not based on the evaluation of their performance, but on the lack of robustness of the parameter estimation that is evident in the large standard deviations shown in Fig. 4 (gray dots).This result encourages applying the model also on glaciers with few mass balance records, as long as the model trained on such a glacier is not transferred to other glaciers.

Fig. 1 .
Fig. 1.Dependency of the mean of rmse model,cross of all glaciers on the prescribed value of T melt .

Fig. 2 .
Fig. 2. Modeled versus measured mass balances for all measured years.Left: results for a = const.= 1.Right: results for a = a optimized (see Eq. 5).Gray dots in the right panel indicate the values of those glaciers that were rejected from the final set based on the results of the cross validation procedure.Black numbers show the results for all the glaciers accepted into the final set, gray numbers show the results for all glaciers.Values of r are based on the cross validation described in Sect.2.2.

Fig. 3 .
Fig. 3. Standard deviation of the annual mass balance plotted against the area of the glaciers.Green dots show MB measured , black dots show the values of the glaciers selected based on the cross validation procedure.Blue dots show the results for a = const.= 1, red dots show the result of the mean model.Horizontal lines show the respective means; the red line is dashed so that the black line is visible.
Fig. 4.Standard deviation of a optimized,cross (top) and µ optimized,cross (bottom) as a function of the number of MB measured .Black dots are the values of the glaciers accepted into the final set, gray dots are those of the rejected glaciers.The gray vertical line indicates the minimum number of mass balances necessary to yield robust results for a optimized and µ optimized .

Fig. 5 .
Fig. 5. rmse model,cross as a function of the number of MB measured .Black (red) dots are the values of the glaciers accepted into the final set, gray (light red) dots are those of the rejected glaciers.Red dots show the result of the mean model.The gray vertical line indicates the minimum number of mass balances necessary to yield robust results for a optimized and µ optimized .

Fig. 6 .
Fig. 6.SS model as a function of rmse model,cross .Black (red) dots are the values of the glaciers accepted into the final set, gray (light red) dots are those of the rejected glaciers.Red dots show the result of the mean model.Vertical and horizontal lines show the respective mean values.

Fig. 7 .
Fig. 7. Parameters of the SSC model as derived by multiple linear regression for Griesgletscher (left) and Hintereisferner (right).(Light) shading indicates (double) standard deviations calculated from the cross validation.

Figure 9 Fig. 8 .
Figure 9 summarizes the results of the reconstructed mass balance time series for the glacier Hintereisferner as an

Fig. 9 .
Fig. 9. Timeseries of the reconstruction for Hintereisferner.(a) green: MB measured , black: MB model , light gray shading: ± 2•rmse model,cross , dark gray shading: ± 1•rmse model,cross .(b) annual temperature anomaly at the location of the glacier.(c) annual precipitation anomaly at the location of the glacier.(d) red: MB modeled using full variability in temperature, but climatological monthly precipitation, light red shading: ± 2 • rmse model,cross , dark red shading: ± 1 • rmse model,cross .(e) blue: MB modeled using full variability in precipitation, but climatological monthly temperatures, light blue shading: ± 2 • rmse model,cross , dark blue shading: ± 1 • rmse model,cross .(f) red: correlation in a running 10-yr window between d and a, blue: correlation in a running 10-yr window between e and a, correlations below the 95 % confidence intervall are omitted from the plot.(g) green: MB measured , black: MB mean model , light gray shading: ± 2 • rmse mean model,cross , dark gray shading: ± 1 • rmse mean model,cross .

Fig. 10 .
Fig. 10.Maps of correlation between reconstructed mass balance variability and mass balance variability driven only by temperature variability (a) and mass balance variability driven only by precipitation variability (b), for all glacier within the Alps included in the WGI-XF.Circled dots indicate the location and values of the measured glaciers included in the final set.Background shading shows the topography of the HISTALP data set.

Fig. 11 .
Fig. 11.Correlation between reconstructed mass balance variability and mass balance variability driven only by temperature variability (red) and mass balance variability driven only by precipitation variability (blue), for all glacier within the Alps included in the WGI-XF, as a function of the terminus altitude.Circles (crosses) show the values of the measured glaciers included in the final set, corresponding to red (blue) dots.

Table 1 .
Summary of skill scores, correlation coefficients, and parameters of the different models.The column SSC fitted shows the correlation coefficient of the SSC model when derived from the fitted model results instead of the cross validation.A negative skill score implies that the model has no skill over the reference model.For the calculation of the mean, negative skill scores have been replaced with 0. b The model has more degrees of freedom than mass balance measurements exist for this glacier; in the calculation of the mean correlation, these values were ignored. a