Anthropogenic climate change versus internal climate variability: Impacts on Alpine snow cover

Snow is a sensitive component of the climate system. In many parts of the world, water, stored as snow, is a vital resource for agriculture, tourism and the energy sector. As uncertainties in climate change assessments are still relatively large, it is important to investigate the interdependencies between internal climate variability and anthropogenic climate change and their 15 impacts on snow cover. We use regional climate model data from a new single model large ensemble with 50 members (ClimEX LE) as driver for the physically based snow model SNOWPACK at eight locations across the Swiss Alps. We estimate the contribution of internal climate variability to uncertainties in future snow trends by applying a Mann-Kendall test for consecutive future periods of different lengths (between 30 and 100 years) until the end of the 21 century. Under RCP8.5, we find probabilities between 15% and 50% that there will be no significantly negative trend in future mean snow depths over 20 a period of 50 years. While it is important to understand the contribution of internal climate variability to uncertainties in future snow trends, it is likely that the variability of snow depth itself changes with anthropogenic forcing. We find that relative to the mean, inter-annual variability of snow increases in the future. A decrease of future mean snow depths, superimposed by increases in inter-annual variability will exacerbate the already existing uncertainties that snow-dependent economies will have to face in the future. 25


Introduction
In large parts of the world water, stored in snow is a vital resource for water management with regard to agriculture and power generation. Snow cover extent and duration is also a premise for winter tourism. As part of the climate system, snow influences the energy balance and heat exchange and is therefore a crucial component for land surface-atmosphere interactions (Hadley and Kirchstetter, 2012;Henderson et al., 2018). At the same time, snow is very sensitive to changes in the climate system. 30 Several studies have analyzed trends in historical snow cover, but there is not a uniform pattern across the world. While there are many regions where snow cover and depth are decreasing, there are also areas that show no trend or even increasing snow depths (Dyer and Mote, 2006;Schöner et al., 2019;Zhang and Ma, 2018). These contrasting findings can be attributed to spatial and temporal climate variability, from global to local scales.
In addition to studies dealing with historical snow trends, many studies investigate the potential impacts of anthropogenic 35 climate change on snowpack. The vast majority of those studies conclude that man-made climate change will significantly reduce snow cover. In a global analysis, Barnett et al. (2005) find that reduced snow cover will lead to severe consequences for future water availability. On the continental scale, Brown and Mote (2009) simulate a serious decrease in seasonal snow cover in a future climate. On the regional scale, Marty et al. (2017) and Verfaillie et al. (2018) compared the impact of different emissions scenarios on future snowpack in the Swiss and French Alps respectively and found a significant reduction under all 40 scenarios and for all elevation zones. Ishida et al. (2019) and Khadka et al. (2014) found that climate change will lead to severe shifts in snow regimes in California and Nepal, respectively. https://doi.org/10.5194/tc-2020-84 Preprint. Discussion started: 14 April 2020 c Author(s) 2020. CC BY 4.0 License. However, the potential impacts of climate change on snow hydrology remain disputed, largely because of uncertainties attributed to future greenhouse gas emissions, model uncertainties and internal climate variability (ICV) (Beniston et al., 2018).
ICV is defined as the natural fluctuations in the climate system that arise in the absence of any radiative forcing (Hawkins and 45 Sutton, 2009). Typically, studies compare different emissions scenarios to tackle the uncertainties related to future greenhouse gas emissions and use a multi-model ensemble approach to estimate uncertainties related to model uncertainties (Frei et al., 2018;Marty et al., 2017). While future greenhouse gas emissions and model uncertainties are the subject of multiple studies (Kudo et al., 2017), only very few studies investigate the impact of ICV on snow (Fyfe et al., 2017).
When using a multi-model ensemble approach, it is difficult to quantify ICV impacts or separate contributions from ICV and 50 external forcing since it is very challenging to distinguish between model uncertainties and ICV. The reason for this is that inter-model spread is commonly derived from the complex coupling of different model structures, parameterizations and atmospheric initial conditions (Gu et al., 2019). Nevertheless, a few studies estimated the fraction of uncertainty in the hydrometeorological process chain ranging from different emission scenarios to the applied impact model and found that on shorter timescales, ICV represents the single most important source of uncertainty (Fatichi et al., 2014;Lafaysse et al., 2014). To 55 investigate the combined influences of ICV and man-made forcing (atmospheric concentration of greenhouse gases and aerosols), single model large ensembles, generated by small differences in the models' initial conditions, have been developed (Deser et al., 2012;Kay et al., 2015;Leduc et al., 2019). Those studies allow a probabilistic assessment of ICV. Deser et al.
(2012), for example, used a 40-member initial condition ensemble to estimate the contribution of ICV in future North American climate, or Fischer et al. (2013) used a 21-member single model ensemble to assess the role of ICV in future climate extremes. 60 Mankin and Diffenbaugh (2014) investigated the influence of precipitation variability on near-term Northern Hemisphere snow trends. Because of high computational costs, those ensembles are usually used on the scale of General Circulation Models; downscaled single model large ensembles are very rare. To our knowledge, only Fyfe et al. (2017) used snow water equivalent from a downscaled single model large ensemble to estimate the impact of ICV on near-term snowpack loss over the United States. Nevertheless, the combined effects of ICV and external forcing on snow remain insufficiently quantified and a single 65 model large ensemble has not yet been used to drive a snowpack model for regional impact studies.
While it is important to estimate the contribution of ICV to uncertainties in future snow trends, it is however as important to investigate the inter-annual variability (IAV), of snow itself (defined as the year-to-year deviation from a long-term mean (He and Li, 2018)), which is likely to change under a future climate (IPCC, 2013). While the response of IAV of snow depths to anthropogenic climate change can pose risks and increasing uncertainties for agriculture, power generation and winter tourism, 70 these processes are only inadequately studied. Again, a single model large ensemble can help answering how inter-annual variability might respond to changes in climatic forcing.
We state the following hypothesis (H) and aim at answering two research questions (RQ):  H1: ICV is a major source of uncertainty in trends of future Alpine snow depth.
 RQ1: What are the uncertainties in future trends in Alpine snow depth attributed to ICV? 75  H2: IAV of snow depth will change with anthropogenic climate forcing.
 RQ2: How does IAV of snow depth change with anthropogenic climate forcing?
To answer these questions, we use a, dynamically downscaled single model large ensemble to drive a state-of-the-art, physically based snowpack model for eight stations across the Swiss Alps. In the first part, we assess the ensemble mean change of snow depths in a future climate. In the second part, we assess the probabilities for a significant reduction in annual 80 mean and maximum snow depth in the presence of ICV. In the third part, we quantify how inter-annual variability of snow depth might change in a future climate. https://doi.org/10.5194/tc-2020-84 Preprint. Discussion started: 14 April 2020 c Author(s) 2020. CC BY 4.0 License.

Case studies
This study assesses the interdependencies between ICV and anthropogenic climate change for eight stations across the Swiss 85 Alps. The choice of these case studies was driven by the availability of long-term observations needed for model validation and bias correction. The selected stations are considered representative, as they spread over the whole ridge of the Swiss Alps and cover the northern and southern parts of the mountain range and cover elevations between 1040 m a.s.l. and 2540 m a.s.l.

The SNOWPACK model
SNOWPACK is a physically-based, one-dimensional snow cover model (Lehning et al., 1999). It was originally developed for avalanche forecasting (Lehning et al., 2002a), but is increasingly used for climate change studies (Katsuyama et al., 2017;Schmucki et al., 2015). SNOWPACK is highly advanced with regard to snow microstructural detail. The model uses a Lagrangian finite element method to solve the partial differential equations regulating the mass, energy and momentum 105 transport within the snowpack. Calculations of the energy balance, mass balance, phase changes, water movement and wind transportation are included in the model. Finite elements can be added through solid precipitation and subtracted by erosion, melt water runoff, evaporation or sublimation (Lehning et al., 2002a;Lehning et al., 2002b;Lehning et al., 1999;Schmucki et al., 2014). Recommended temporal resolutions range from 15 minutes (e.g. for avalanche forecasting) to three hours (e.g. for long-term climate studies). Minimum meteorological input for SNOWPACK is air temperature, relative humidity, incoming 110 long wave radiation, wind speed and precipitation. Due to the of lack measurements, incoming longwave radiation had to be estimated based on air temperature, incoming short wave radiation and relative humidity, using the parameterization by Konzelmann et al. (1994). Shakoor et al. (2018) and Shakoor and Ejaz (2019) applied this method for multiple sites and elevations and found that it gives reliable estimates. As wind induced gauge undercatch underestimates precipitation, especially for mixed-, and solid precipitation, we do not use the original measured precipitation data to run the model. As described in 115 Schmucki et al. (2014) precipitation was undercatch-corrected by applying a method developed by Hamon (1973), using a function of wind speed and temperature.
Soil layers are not included in our model setup. Therefore, ground surface temperature is determined as a Dirichlet boundary condition (Schmucki et al., 2014) and soil temperature is fixed at 0 °C. To account for site-specific characteristics, we calibrated roughness length and rainfall/ snowfall threshold temperature. For roughness length, we used values between 0.01 and 0.08. 120 For rainfall/snowfall discrimination, we used threshold temperatures between 0.2 and 1.2 °C, which lies well within the calibration ranges between -0.4 °C and 2.4 °C based on results from Jennings et al. (2018). The calibration was carried out individually for each site. A threshold of 50 % in relative humidity was set for all stations for rainfall/ snowfall discrimination. SNOWPACK usually operates on very high temporal resolutions. After an initial sensitivity analysis, to get better simulation results, the meteorological input was resampled within SNOWPACK to a resolution of one hour. Precipitation was evenly 125 disaggregated from a three-hourly time step to one hour, while the remaining parameters were linearly interpolated. The case studies were validated based on observed daily and monthly snow depths.

Simulation data
In this study, we use climate model data from a new single model large ensemble, hereafter referred to as the ClimEx large ensemble (ClimEx LE) (Leduc et al., 2019;von Trentini et al., 2019), to analyze the combined influence of ICV and climate 130 change on future snow depth. The ClimEx LE consist of 50 members of the Canadian earth system model (CanESM2) (Arora et al., 2011), which is downscaled for a European and North American domain by the Canadian regional model (CRCM5) (Šeparović et al., 2013). Each ensemble member undergoes the same external forcing and starts with identical initial conditions in the ocean, land and sea-ice model components, but slightly different initial conditions in the atmospheric model. The 50 https://doi.org/10.5194/tc-2020-84 Preprint. Discussion started: 14 April 2020 c Author(s) 2020. CC BY 4.0 License. members of the CanESM2 originate from five families of simulations, each starting at different 50-year intervals of a 135 preindustrial run with a stationary climate and range from 1850 to 1950. In 1950, small differences in the initial conditions are used to separate each family into 10 members. After applying small atmospheric perturbations in the initial conditions each member evolves chaotically over time. Therefore, the model spread shows how much the climate can vary as a result of random internal variations (Deser et al., 2012). This makes all 50 members equally likely, plausible realizations of climate change over the next century. Until 2005 the ensemble is driven by observed forcing, from 2006 to 2099 all simulations are forced with 140 concentrations according to the RCP8.5 scenario (Moss et al., 2010).
The 50 members are then dynamically downscaled over Europe using CRCM5 with a horizontal grid-size of 0.11 ° on a rotated latitude-longitude grid, corresponding to a 12 km resolution (von Trentini et al., 2019). Detailed information on the design of the experiment can be found in Leduc et al. (2019). The ClimEx LE provides meteorological data on a 3-hourly temporal resolution. 145 In an inter-comparison experiment by von Trentini et al. (2019), the ClimEx LE was compared to 22 members of the EURO-CORDEX multi-model ensemble. It was found that the ClimEx LE shows stronger climate change signals and the single model spread is usually smaller compared to the multi model ensemble spread. Our analysis and Leduc et al. (2019) found a substantial bias between the ClimEx LE and observational data aggregated to the 12 km grid. Especially over the Alps, a strong wet bias was identified. With regard to temperature, we found a cold bias for most grid points covering the Swiss Alps. The 150 bias between model data and observations justifies the application of a bias adjustment procedure, which is explained in Sect.

Bias adjustment
As simulations from General Circulation Models are usually too coarse to be directly used as input in impact models, CanESM2 was dynamically downscaled to a 12 km resolution, which better represents regional topography and therefore regional 155 climatology. As practically all regional climate models still have a systematic model bias, and as our impact model simulates snow for a particular one dimensional point in space, another downscaling/bias adjustment step is needed to adjust systematic model biases and to bridge the gap between RCM simulations and the impact model. There exist multiple methods for bias adjustment and downscaling, and approaches are dependent on the study purpose. While e.g. the delta change approach is robust and easy to implement, the method neglects potential changes in variability and is therefore not suitable for our study. 160 In recent years, many studies have concluded that Quantile Mapping (QM) performs similar or superior to other statistical downscaling approaches (Feigenwinter et al., 2018;Gutiérrez et al., 2019;Ivanov and Kotlarski, 2017). When multi-model ensembles are bias corrected, usually each simulation is corrected separately, whereas bias adjusting a single model large ensemble requires specific considerations, as the chosen method should not only correct the bias of each individual member, but should also retain the individual inter-member variability (Chen et al., 2019). We apply a distribution-165 based quantile mapping approach to bias correct and downscale the model data to the station scale in one step, based on the daily translation method by Mpelasoka and Chiew (2009). Several studies tested this method and confirmed a reasonable performance (Gu et al., 2019;Teutschbein and Seibert, 2012). A downscaling to station approach was also performed for the official CH2018 Swiss climate scenarios (Feigenwinter et al., 2018). We modified the approach in the form that it was applied to a transient climate simulation ranging from 1980-2099 and, in contrast to previous studies that use daily scaling factors it is 170 based on sub-daily scaling factors. The last-mentioned modification was necessary, as SNOWPACK needs sub-daily meteorological input. The method does not take into account inter-variable dependency. As SNOWPACK is a physically-based model, physical inconsistencies in the data can lead to serious model errors. For example, precipitation occurring simultaneously with low humidity will result in model error. As we use a univariate bias adjustment approach, we also had to test the results for these inconsistencies. 190

Validation of bias adjustment and SNOWPACK
The performance of the applied bias adjustment approach, as well as the performance of SNOWPACK, and the snow simulations using the ClimEx LE as driver were validated prior to continued analyses.
To evaluate the performance of the bias adjustment approach, we compared the statistical characteristics of observed meteorological variables to the bias corrected ClimEx LE for subdaily, daily, monthly and yearly values and compared the 195 diurnal and annual regimes of the corrected parameters and obtained good results in the calibration period 1984-2009.
In terms of distributional quantities and climatologies this is to be expected, as the bias correction scheme was calibrated for exactly this period. Due to this reason and as the performance of the chosen bias correction method is already subject of the studies by Chen et al. (2019) and Gu et al. (2019) we do not present the validation exercise in detail. Instead we focus on a validation that is highly relevant for snow accumulation, namely on the performance of the bias adjustment with regard to the 200 snowfall fraction assuming the calibrated SNOWPACK rain-snow temperature thresholds. This is of special interest, as a univariate bias adjustment approach, as employed here, does not explicitly correct for biased inter-variable dependencies, which could, among others, affect the snowfall fraction.
SNOWPACK itself was calibrated for the period 1985 to 1989 and cross-validated for different five-year periods between 1990 and 2009 using observed meteorological input, which was compared to measured snow depths. All periods provide 205 similar results. For a clearer visualization, we only show results for the period 2000 to 2004. To assess goodness of fit, we compared daily and monthly measured snow heights with modelled snow heights simulated with observed meteorological input using several performance indicators such as mean absolute error (MAE), Nash-Sutcliffe coefficiency (NSC), coefficient of determination (R2) and index of agreement (d) (e.g. Krause et al. (2005) and Legates and McCabe (1999)).

Statistical analysis of simulated snow depth 210
In the first part of our analysis, we estimate the ensemble mean changes for annual mean and maximum snow depth between In the second part of the analysis, we estimate the uncertainties in future snow trends (and their drivers) emerging from ICV.
We apply the Mann-Kendall (MK) trend detection test (Kendall, 1975;Mann, 1945), to test for statistically significant trends 215 of different lengths of time-series starting in the year 2000 and ending between 2029 and 2099. The Mann-Kendall test is https://doi.org/10.5194/tc-2020-84 Preprint. Discussion started: 14 April 2020 c Author(s) 2020. CC BY 4.0 License. frequently used in climatological-, and hydrological applications (Gocic and Trajkovic, 2013;Kaushik et al., 2020). The MK test statistic (S) is computed as: where and are the data points at times and ; represents the length of the time-series; ( − ) is a sign function 220 defined as: The null hypothesis of the MK-test is that a time-series has no trend (Libiseller and Grimvall, 2002). In this study, trend significance was tested for p-values of 0.01, 0.05 and 0.1. As air temperature below a certain threshold occurring simultaneously with precipitation is the most important prerequisite of snowfall (Morán-Tejeda et al., 2013;Sospedra-Alfonso 225 et al., 2015), we do not only test for future snow depth trends, but also for the drivers of future snow conditions. For that reason, we apply the Mann-Kendall test also to temperature, precipitation and snowfall fraction time-series.
In the third part of the study, we estimate the change in IAV. As measure for IAV, we use the standard deviations of annual mean/ maximum snow depths for the respective reference and future periods. The standard deviation is commonly used as a measure of IAV, such as in Siam and Eltahir (2017). All analyses were performed with the statistical software R (R Core 230 Team, 2017). All analyses in Sect. 3.3 to 3.5 are performed for mean and maximum winter snow depths. Winter is defined as the months October to March.

Validation of SNOWPACK
The validation results of SNOWPACK using meteorological observations as input are summarized in Fig. 2. From Table 2

Performance of bias adjustment and ensemble SNOWPACK simulations
Inter-variable dependence between precipitation and humidity, as well as incoming short wave radiation was not corrupted through the bias correction (e.g. there were no cases of precipitation and simultaneously low humidity, or precipitation and simultaneously high incoming shortwave radiation; not shown). We concentrate the presentation of our results on the impacts of bias adjustment on temperature and precipitation dependencies and consequently snowfall fraction, as this is an essential 255 part of the snow modelling process. Ulrichen by up to 16%. For the two stations Zermatt and Montana we find an even stronger underestimation by 33% and 50%, respectively. The results imply that there is no systematic error between observed and simulated snowfall fractions, but there are stations that show a significant underestimation of snowfall fraction compared to observations. The potential reasons for this underestimation are addressed in the discussion (Sect. 4). In Fig. 4 we show the long-term monthly mean snow depths for observations, simulations driven by observed meteorological input and simulations driven by the bias adjusted ClimEx LE. As already mentioned in Sect. 3.1, the comparison between observations and SNOWPACK simulations driven by observational data show a good model fit for all stations. For the stations 270 Adelboden, Engelberg, Davos, Weissfluhjoch and Scuol, the 50 ClimEx LE simulations enclose the observed snow depths for the calibration period of the bias adjustment. This implies a good performance of the bias adjustment. As shown above, the bias-adjusted dataset systematically underestimates snowfall fraction in Zermatt and Montana, resulting in a pronounced https://doi.org/10.5194/tc-2020-84 Preprint. Discussion started: 14 April 2020 c Author(s) 2020. CC BY 4.0 License. underestimation of simulated snow depths. The two stations are not excluded from further analyses, but we have to clarify, that those results must be interpreted cautiously and that we cannot consider absolute snowfall fraction or snow depth values 275 for those stations.

Mean climate change signal 290
Both, the impacts of ICV on future snow trends and the changes of IAV under a given emissions scenario must be put into perspective to the ensemble mean climate change signal. To do so, we present the absolute and relative changes in mean and maximum winter snow depth between our reference-, and future periods. Fig. 6 visualizes the mean and maximum winter snow depths for the reference and future periods and the respective percentage change. For the ensemble mean, we find a continuous and significant decrease of mean and maximum snow depth over all stations. While absolute changes are partly 295 stronger for maximum winter snow depths, the percentage decrease in mean winter snow depth is more severe for all case studies. All stations below 2000 m.a.s.l., except Ulrichen, show a similarly strong decrease in mean winter snow depth. The decrease in the near future period is relatively small and for all stations but Scuol, there is at least one member that simulates a small increase in winter mean/ maximum snow depth. For the mid-and far-future periods, decreases range from -40% to -60% until 2069 and up to -60% to -80% until 2099. For Ulrichen, we obtain a slightly smaller decrease of up to -60% until 300 2099. For the high-altitude station Weissfluhjoch, decreases are considerably lower, ranging from -30% until 2069 to -50% until 2099. The percentage decreases for maximum snow depths are considerably lower compared to those for mean snow depths, with an ensemble mean ranging from -30% to -40% until 2069 and -40% to -60% until 2099 for all stations but Weissfluhjoch. For Weissfluhjoch we observe an ensemble mean decreases of -10% until 2069 and -15% until 2099. In the discussion, we set these values into perspective to results from other studies. 305 https://doi.org/10.5194/tc-2020-84 Preprint. Discussion started: 14 April 2020 c Author(s) 2020. CC BY 4.0 License.

Significance of future snow depth trends
Despite the strong mean climate change signal described in Sect. 3.3, this section points out the important role of ICV for the detection of statistically significant trends in future time-series of snow depth, and its most important drivers temperature, https://doi.org/10.5194/tc-2020-84 Preprint. Discussion started: 14 April 2020 c Author(s) 2020. CC BY 4.0 License. precipitation and snowfall fraction. Note that, according to the poor validation results, we cannot draw conclusions on the absolute values of snowfall fraction and snow depth for the stations Zermatt and Montana. Nevertheless, we can apply the 315 trend test and compare relative changes. Figure 7 visualizes the results of the Mann-Kendall test for a positive trend in winter mean temperature and winter precipitation sums and a negative trend for winter snowfall fraction, as well as winter mean and maximum snow depth for the lowest station Engelberg, and the highest station Weissfluhjoch for 1%, 5% and 10% significance levels. Each time-series starts in the year 2000 and ends in 5-year intervals between 2029 and 2099. Figure 8 shows the percentage of members with a significant Mann-Kendall test result for all case studies, based on the 5% significance level. 320 With regard to temperature, all case studies show a rapidly increasing percentage of significant positive trends. For a 30-year period, already between 45% and 60% of members show significant results. Until the year 2049 more than 90% of members show a significant positive trend and by 2059 all stations show a 100% trend significance. Here we can clearly conclude that the man-made climate change signal is significantly stronger than the ICV of temperature. For precipitation, we can see a totally different picture. For all stations but Scuol, there is no clear sign towards an increase in future winter precipitation sums. 325 There is a tendency towards an increasing number of members with a significant positive trend in precipitation towards the end of the century, but the percentage is below 50% for all stations but Scuol. For Scuol, there is a clear sign towards an increase in winter precipitation sums. Here, more than 75% of members show a significant positive trend in winter precipitation sums. In summary, we cannot detect a clear climate change signal for precipitation, because of strong ICV. With regard to snowfall fraction, we observe a consistent increase of members with a significant negative trend in snowfall fraction over time, 330 but, as expected, the strong temperature signal does not translate into a similarly strong trend significance and there are significant differences between case studies. Over a 50-year period, most stations show a significance of only up to 50% of members. The stations Davos, Weissfluhjoch and Scuol, the stations with the highest snowfall fraction in the reference period, show significant reductions for 75% to 85% of members over this period. This emphasizes the huge contribution of ICV to future trends in winter snowfall. Over a period of 60 years there is still a 35% chance of not detecting a significant negative 335 trend in snowfall fraction for Montana due to ICV, but by 2069 all stations show significant decreases in snowfall fraction for more than 90% of members. Similar to snowfall fraction, we find a steady increase in the count of members with a statistically significant negative trend in winter mean snow depth, but compared to winter temperature this development starts considerably later. Generally, the trend significance is stronger compared to snowfall fraction, but there is still a 50% to 25% chance (for all stations but Scuol, where more than 90% of members show significant negative trends) that ICV will superimpose 340 anthropogenic climate change impacts on mean winter snow depth over a period of 50 years. By 2069, the percentage of significant members increases to more than 90%. The stronger significance compared to snowfall fraction implies the combined effects of less snowfall and a more rapid snow melt. Lastly, maximum winter snow depth shows a significantly different evolution than mean winter snow depth. Here, we also obtain the largest differences between the lower lying stations and the highest station Weissfluhjoch. For most stations but Scuol, there is a probability of more than 50% of no significantly 345 negative trend in future maximum snow depth over a period of 50 years. For Scuol this probability is only 15%, while it is 80% for Weissfluhjoch. Over a period of 80 years all stations but Weissfluhjoch show a significant decrease in maximum snow depths in more than 90% of the cases. For Weissfluhjoch high probability remains that ICV will superimpose anthropogenic climate change. By 2049 the percentage of negative trends is below 20%, by 2079 the probability is still below 50% and even over a period of 100 years there is a 25% chance of no significantly negative trend in maximum winter snow 350 depth. This emphasizes that also in the far future single snow peaks can be expected, especially in high elevations.
Our results underline the outstanding contribution of ICV to uncertainties related to future trends in snow depth and its drivers.   https://doi.org/10.5194/tc-2020-84 Preprint. Discussion started: 14 April 2020 c Author(s) 2020. CC BY 4.0 License.

Changes in inter-annual variability
In Sect. 3.4 we revealed the large contribution of ICV to uncertainties related to future trends in snow depth. However, the 365 variability of snow depth itself is likely to change with anthropogenic forcing. Here we investigate how IAV of mean and maximum snow depth, defined as the standard deviation of snow depth over a period of 30 years is likely to change under RCP 8.5. From Fig. 6 and Fig. 9 we can see that a gradual decrease of winter mean and maximum snow depth leads to a decrease in absolute IAV. Nevertheless, in relative terms (relative to the mean), IAV can strongly increase in the future, but there are differences between mean and maximum snow depth and at different stations. In the reference period, and at lower elevation, 370 relative IAV of mean snow depth lies between 30% (Scuol) and 70% (Engelberg) and relative IAV of maximum snow depth lies between 20% (Scuol) and 60% (Engelberg). For most cases, relative IAV of mean snow depth is larger compared to maximum snow depth. For Weissfluhjoch the overall variability is lower compared to the other stations (22% in reference period) and maximum snow depth has a larger variability than mean snow depth. For Weissfluhjoch neither for mean, nor for maximum snow depth an increase in relative IAV can be found. In contrast, an increase in relative IAV for the stations 375 Adelboden, Engelberg, Davos, Zermatt, Montana, Scuol and Ulrichen is projected. Larger increases of relative IAV are obtained for mean winter snow depth, while the increases for maximum snow depth are very small. For Davos, for example, we find an ensemble mean increase in relative IAV of mean snow depth from 35% to 55% until the end of the century. Increases in relative IAV of maximum snow depth ranges from 35% to 40%. For Montana and Scuol, IAV increases from 50% (40%) in the reference period up to more than 80% (70%) in the future 2 period can be observed. 380 https://doi.org/10.5194/tc-2020-84 Preprint. Discussion started: 14 April 2020 c Author(s) 2020. CC BY 4.0 License.

Discussion
Many studies have significantly improved our knowledge about the cryosphere in a future climate (Barnett et al., 2005;Beniston et al., 2018). Nevertheless, the predominant number of studies focuses on changes in the mean, while studies on the interdependencies of climate change and ICV are very rare. Our analysis is the first study that uses a single-model large ensemble as input for a physically based snow model. This allows a probabilistic uncertainty assessment of future snow trends 390 in the European Alps attributable to ICV. We further estimate how IAV might change in a future climate. https://doi.org/10.5194/tc-2020-84 Preprint. Discussion started: 14 April 2020 c Author(s) 2020. CC BY 4.0 License.

Uncertainties and limitations
While we are gaining important insights into the dynamics of mean, maximum and inter-annual variability of snow depth and the role of ICV under climate change conditions, a number of important uncertainties and limitations must be taken into account, which span over the whole modelling process. Important boundary conditions are that our results are highly dependent 395 on the choice of the emissions scenario and GCM-RCM combination, as well as the selected bias adjustment approach. First, it must be stated that the ClimEx LE is still unique regarding members and spatio-temporal resolution; like other currently existing single RCM initial condition large ensembles it is available under RCP8.5 only. Being aware of the extreme character of this GHG-concentration scenario and the high sensitivity of the GCM-RCM combination, the results obtained from the presented analyses are considered valid as they represent the expected dynamics and states of other emission scenarios, just 400 reach certain levels of change earlier in time.
Due to the single model approach, it is understood that the presented setup has limited capacity in providing a robust estimate of anthropogenic climate change; this is where multi model ensemble setups have clear advantages (Tebaldi and Knutti, 2007).
Comparing the detected climate change signals of the ClimEx LE with the EURO-CORDEX ensemble shows that the data used for this study provide a highly sensitive forced response, yet within plausible ranges (von Trentini et al., 2019). 405 Nevertheless, multi model ensembles make it very difficult to distinguish between model uncertainties and ICV, which is a major advantage of our approach, when the goal is to study ICV. Of course, it would be of interest to estimate model uncertainties and do a probabilistic analysis of ICV. To do so an ensemble of ensembles would be the preferred approach. Due to computational limitations such analyses are not yet feasible, especially on the regional scale.
Another source of uncertainty is the choice of the bias adjustment methodology. While quantile mapping was found to be 410 similar or superior to many other bias adjustment approaches (Gutiérrez et al., 2019;Ivanov and Kotlarski, 2017;Teutschbein and Seibert, 2012), it has some important drawbacks. QM assumes stationarity of the model bias structure, an assumption that is uncertain under changing climatic conditions (Maraun, 2013). Furthermore, QM cannot correct misrepresented temporal variability (Addor and Seibert, 2014). Therefore, temporal variability was validated in Sect. 3.1, yielding acceptable results.
When applied in a downscaling context, QM cannot reproduce local processes and feedbacks, as QM is a purely empirical 415 approach (Feigenwinter et al., 2018;Kotlarski et al., 2015). In contrast, QM can modify the raw climate change signal and simulated trends (Ivanov et al., 2018). This point is especially important in our study, as we have to correct each member based on the empirical distribution of the whole ensemble to retain the internal climate variability. Therefore, the climate change signals of the single members are modified. Cannon et al. (2015) developed a method that preserves the raw climate change signal, but applied to this study it would only preserve the ensemble mean signal. Further research is needed to develop 420 potential methods that preserve the climate change signal for single members from single model ensembles. As a last point, we employ a univariate bias adjustment approach, which treats all meteorological variables independently. While, intervariable consistency cannot be guaranteed (Feigenwinter et al., 2018), multiple studies show that QM generally maintains inter-variable consistency (Ivanov and Kotlarski, 2017). In the course of this work, inter-variable consistency was validated and we obtained good results for radiation, precipitation and humidity. Variable consistency with regard to snowfall fraction 425 was inaccurate for individual case studies. Prior to bias adjustment, a strong cold bias over most grid points caused snowfall fraction to be significantly too high. Therefore, bias correction generally improved the simulated snowfall fractions. The exploration of possible reasons for inaccurate snowfall fractions in some cases will be subject to future work. Schlögl et al. (2016) found that the uncertainties from the snow model itself account for approximately 15%. However, as we focus on investigating relative changes in snow depth, this source of uncertainty is of less concern. 430 Lastly, as most stations are situated in altitudes between 1320 m.a.s.l and 1640 m.a.s.l, a detailed analysis on altitude dependencies could not be performed. https://doi.org/10.5194/tc-2020-84 Preprint. Discussion started: 14 April 2020 c Author(s) 2020. CC BY 4.0 License.

Discussion of results in context to existing research and potential future research
Despite the above mentioned uncertainties and limitations, this study can provide important insights into the interdependencies between anthropogenic climate change and ICV and its impacts on snow depth. Its novelty stems from a true probabilistic 435 assessment of ICV. In the first part of our results section, we present the ensemble mean change between a reference period  and three future periods and found significant decreases of ensemble mean snow depths in the future. Schmucki et al. (2015) did a similar analysis for partly the same case studies using ten GCM-RCM model chains from the ENSEMBLES project under the IPCC A1B emissions scenario. Although the reference periods do not fully match (Schmucki et al. (2015) use 1984-2010), we can put the changes between the reference period and the mid-(2040-2069 in this study and 2045-2074 440 for Schmucki et al. (2015)) and far-future period (2070-2099) into perspective. For Weissfluhjoch, Schmucki et al. (2015) simulate a mean decrease of 28% (near future) and 35% (far future), which is close to our simulation results of -20% (near future) and -29% (far future) in mean winter snow depth and -30% (near future) and -42% (far future) for annual mean snow depth. Both studies show comparable decreases in mean snow depth at Weissfluhjoch, although our study uses the much stronger RCP8.5 compared to the A1B scenario in Schmucki et al. (2015). 445 In the near and far future, Schmucki et al. (2015) found, an ensemble spread of mean snow depth between 35-135 cm (near future), and 30-130 cm (far future), whereas we simulate an ensemble spread between 62-87 cm (near future) and 52-82 cm (far future) for winter mean snow depth and 48-70 cm (near future), and 39-60 cm (far future) for annual mean snow depth. While a regression-based analysis of different elevations is not possible due to the limited altitude ranges, we still find significant differences between stations, especially between the highest station Weissfluhjoch and the lower altitude case studies. With regard to trend significance, we can conclude that for all stations there is a non-negligible probability of hiatus 455 periods of mean and maximum snow depth of lengths up to 50 years. Still, those probabilities are highest for Weissfluhjoch, where we find a probability of more than 50% that there will be no significant reduction in future maximum winter snow depths over a period of 80 years.
We can observe an uneven response of different snow metrics to anthropogenic warming. Statistically significant trends are first detected for mean winter snow depth, followed by winter snowfall fraction and later still by winter maximum snow depth; 460 these results coincide with Pierce and Cayan (2013) and emphasize two points. First, also in the far future, there will be events of large snow accumulations, even under RCP8.5. Consequently, trend detections for maximum snow depths over periods of less than 50 years largely dependent on noise from ICV. Second, ICV remains the highest source of uncertainty over a short to medium range of time, but it can even hamper a statistically significant signal over periods of more than 50 years. On the other hand, with regard to future research, ICV cannot only reveal the possibilities of long hiatus periods, but it can also 465 illustrate even faster snowpack declines in the Swiss Alps.
In this study, we found that ICV does not only obscure the forced climate change signals, but that variability in terms of IAV itself is likely to change in the future. These findings not only support our understanding on the ranges of internal climate variability, they are particularly useful to distinguish the "noise" of climate variability from "real" climate change signals.
With regard to changes in IAV, Weissfluhjoch is the only station, where we cannot identify a change in the IAV relative to the 470 mean. For the remaining stations, we find that anthropogenic climate change has an impact on IAV. A thorough investigation of the causes of this change is beyond the scope of the resent work. We assume that snow rich and snow scarce winters are often dependent on general weather situations, such as large scale blockings (García-Herrera and Barriopedro, 2006), and also large scale oscillations caused by El Niño or the Arctic oscillation (Seager et al., 2010;Xu et al., 2019). These factors remain https://doi.org/10.5194/tc-2020-84 Preprint. Discussion started: 14 April 2020 c Author(s) 2020. CC BY 4.0 License. insufficiently studied and their identification could be subject of future research, which takes into account large scale synoptic 475 patterns from the ClimEx LE.

Summary and conclusions
In the present work, we analyzed the interdependencies between ICV and anthropogenic climate change and its impacts on snow depth for eight case studies across the Swiss Alps. For this purpose, we made use of a 50-member single model RCM ensemble and used it as driver for the physically based snow model SNOWPACK. The large number of members used in this 480 study allowed for a probabilistic analysis of ICV. We can confirm our first hypothesis that states that ICV is a major source of uncertainty in trends of future Alpine snow depth. By applying a Mann-Kendall trend test we estimate the trend significance of snow depth and its main drivers for time series of different length (i.e. different lead times). We present the probabilities of detecting significant trends caused by anthropogenic forcing in the presence of ICV and find that ICV is a major source of uncertainty for lead times up to 50 years and more. 485 We can also confirm our second hypothesis that states that IAV of snow depth will change with anthropogenic climate forcing.
To answer our initial research question, we compare inter-annual variabilities of snow between a reference period and three future periods, and find that relative to the mean, IAV of snow considerably increases in the future, for most cases but Weissfluhjoch.
Our results show how important it is to not only analyze changes in the mean snow depth, but also its variability, as it is a 490 dominant source of uncertainty, and because variability itself can change with man-made climate change. For all economies being directly dependent on snow, or being dependent on runoff from snowmelt, future climate impact assessments are, hence, subject to important uncertainties. On the one hand, climate change will significantly reduce snow cover, but to what extent remains disputed and ICV is one of the top sources of this uncertainty. On the other hand, in addition to a reduction of mean snow depths its variability is likely to change. This will additionally increase vulnerabilities of snow dependent economies in 495 the future.

Author contribution
All authors designed the research. FW performed the simulations and analysed the data. FW wrote the draft. All authors edited the paper.

Data availability
The ClimEx LE data analysed in this study is publicly available via the ClimEx project webpage (https://www.climexproject.org/en/data-access). The observational data sets, as well as the snow depth data for Switzerland are available for 510 research and educational purposes via the IDAweb by Meteo Swiss (https://www.meteoswiss.admin.ch/home/services-andpublications/beratung-und-service/datenportal-fuer-lehre-und-forschung.html). On request, the analysis code is available from the corresponding author.