A regional climate model hindcast for Siberia: analysis of snow water equivalent

This study analyzes the added value of a regional climate model hindcast with respect to snow water equivalent (SWE) for Siberia when compared to SWE estimates from forcing NCEP-R1. In addition, we examine the discrepancies of simulated SWE to several recent reanalysis products (NCEP-R2, NCEP-CFSR, ERA-Interim). We apply the regional climate model COSMO-CLM (CCLM) to a 50 km grid spacing using NCEP-R1 as driving force to obtain a 63 yr (1948 to 2010) gridded dataset of historical SWE. Simulated regional climate data is necessary because of the absence of station data in that region. To perform large-scale assessments we use the satellite-derived daily SWE product of ESA DUE GlobSnow from 1987 to 2010. Russian station SWE data is used for cross-checking the findings. In January (mid-winter), the SWE hindcast is in good agreement with GlobSnow, whereas it overestimates SWE during the melting season. CCLM shows a clear added value in providing realistic SWE information compared to the driving reanalysis. The temporal consistency of CCLM is higher than that presented by ERA-Interim and NCEP-R2.


Introduction
The main goal of the present study is to assess the additional information in the NCEP-R1 (National Centers for Environmental Prediction Reanalysis 1; also see Table A1 in the Appendix for the definition of acronyms) driven regional hindcast of SWE (snow water equivalent) compared to the SWE product of its driving reanalysis. Furthermore, we are interested in the quality of the hindcast compared to more recent reanalyses that are restricted to shorter time periods. The objective is to introduce an alternative multi-decadal climatology of SWE over six decades that can be used to investigate long-term changes and trends. Therefore, the widely used reanalysis NCEP-R1 was taken as forcing, which is the only reanalysis for downscaling purposes that extends back before 1979 and starts in 1948.
Terrestrial snow cover is a key component of the cryosphere and plays an important role in the entire climate system by modifying surface energy and water balance (Alexander et al., 2010; Cook et al., 2008). This extensive, rapidly and seasonally changing cryospheric variable (ACIA, 2005) is critical in shaping the land surface during the prolonged Siberian cold season (Bulygina et al., 2009).
The higher albedo of snow-covered compared to snow-free surfaces leads to an increased reflectance of solar radiation and a near-surface cooling (Stieglitz et al., 2003; Vavrus, 2007). Additionally, the low thermal conductivity of snow makes it a good insulator that limits the heat exchange between soil and the atmosphere. Changes in snow depth, extent, timing, duration and density have profound implications for soil temperatures and, therefore, for the permafrost thermal state (Shkolnik et al., 2010; Stieglitz et al., 2003; Zhang et al., 2005), ecology and biogeochemical cycles (Sturm et al., 2005). Moreover, snow cover plays an important role within the hydrological cycle that controls evaporation, water storage, soil moisture, river discharge and freshwater transport to the Arctic Ocean (Groisman and Amber, 2009; Troy et al., 2012; Yang et al., 2003). Numerous studies indicate that Siberian snow cover has the potential to influence large-scale atmospheric circulation (Allen and Zender, 2011; Cohen et al., 2012). Evidence that Eurasian snow cover may feed back to the Arctic and North Atlantic Oscillations was discussed by Alexeev et al. (2012). A review of recent studies of Arctic snow was published within the SWIPA (Snow, Water, Ice and Permafrost in the Arctic) report by Callaghan et al. (2011).
Currently, the analysis of long-term changes and trends of the snow cover characteristics of SWE for all of Russia is hampered because of the lack of reliable observational data (Ge and Gong, 2008; Bulygina et al., 2011). The availability of continuous, homogeneous in situ snow observations in Siberia is restricted because of a sparse meteorological network and incomplete data records (Brown et al., 2003; Khan et al., 2008; Serreze et al., 2003).
Regional hindcasts obtained using regional climate models (RCMs) are useful for filling the spatial gaps between sparse weather stations and deliver multi-decadal climatologies of various meteorological parameters, including SWE, on a uniformly spaced grid. These reconstructions provide dynamically consistent data that is continuous in time. Additionally, they offer greater spatial and temporal resolution than observations alone. To perform regional hindcasts, large-scale atmospheric fields of global reanalysis data are taken as initial and boundary conditions over a limited area (Giorgi, 1990; Giorgi and Mearns, 1999). This technique of dynamical downscaling allows a more detailed representation of regional aspects, e.g., land-sea contrast, local orography, land cover and small-scale atmospheric features. It is expected that this technique leads to a better description of regional climate than that presented by coarsely resolved global reanalyses.
The additional information leading toward a more realistic description of climate compared to the global driving data is called "added value" within the regional climate modeling community. Added value studies are crucial in the evaluation of dynamical downscaling techniques and assessment of relative skill of RCMs compared to their forcing data (Di Luca et al., 2012). An analysis must be undertaken to decide whether the additional computational effort of RCM simulation is justified. The higher resolution does not automatically result in more realistic detail because many variables are spatially quite homogeneous and are already well described in coarser reanalyses (Prömmel et al., 2010). Thus, there is no added value of RCMs per se. The value of RCMs depends on the physical parameterizations, experimental setup, the analyzed variable and location (Feser et al., 2011). Although a large number of studies have validated the RCM output and have demonstrated that RCMs can realistically simulate climate compared to observations (e.g., Früh et al., 2010), mostly they have not explicitly shown whether the capabilities of the RCM exceed those of global forcing data (Prömmel et al., 2010).
At present, there are only a few added value assessments of RCMs. These assessments primarily concentrate on temperature, precipitation, sea level pressure, wind or mesoscale atmospheric circulation systems. More realistic detail compared to driving reanalyses was achieved on regional scales, e.g., in cases of temperature with complex orography (Prömmel et al., 2010), orographically induced wind systems (Winterfeldt et al., 2010) or North Atlantic polar lows and East Asian typhoons (Feser and von Storch, 2008; Zahn and von Storch, 2008).
There have been several efforts to apply RCMs over Siberia. Most consider a pan-Arctic domain that includes northern parts of Siberia (e.g., Rinke et al., 2010). Within the SHEBA (Surface Heat Budget of the Arctic Ocean) project, an ensemble was evaluated to quantify the scatter among different RCMs and to assess the reliability of their Arctic simulations (Rinke et al., 2006). The Polar Weather Research and Forecasting (Polar WRF) model (e.g., Bromwich et al., 2009) was used to provide a high resolution (10 km) Arctic System Reanalysis for 2000-2011. Shkolnik et al. (2010) used the MGO (Main Geophysical Observatory) regional climate model for permafrost and snow cover studies. Furthermore, regional snow simulations over pan-Arctic or Siberia are performed using detailed snow pack models that are coupled to a land-surface scheme (Brun et al., 2012; Liston and Hiemstra, 2011) and are forced by global hydro-meteorological data.
Our study focuses on using the whole model system of the regional climate model CCLM (COSMO-CLM), including land-atmosphere interactions, to provide a hindcast for SWE over Siberia for the last decades. To obtain a climatology for the longest possible period, we use NCEP-R1 as driving global reanalysis. Other global reanalyses provide only noticeably shorter periods. We address the question of whether CCLM can provide an added value on a large scale because of its own model physics and finer resolution in space and time compared to the SWE estimate of the forcing. Because there were some issues with the snow information directly from NCEP-R1 that were partly related to an erroneous snow cover analysis (Kanamitsu et al., 2002), we also assess the hindcast quality relative to a set of newer global reanalysis products (NCEP-R2, NCEP-CFSR and ERA-Interim), even though they are not directly used as driving fields.
In our case, the dynamical downscaling technique using NCEP-R1 might derive multi-decadal gridded SWE fields covering a period from 1948 to present, for which no reliable evenly distributed SWE information exists. It is important to provide an alternative climatology to the existing datasets in that region, which exhibit considerable differences (Clifford, 2010; Khan et al., 2008). We focus on SWE as an important parameter within the hydrological cycle determining snowmelt runoff and, therefore, Arctic freshwater budgets. For this added value study, we use a satellite-derived SWE dataset as a reference to perform a large-scale assessment in areas in which in situ snow measurements are rare. ESA (European Space Agency) GlobSnow was chosen for the years 1987-2010 because it shows an improved accuracy of SWE compared to typical stand-alone passive microwave algorithms (Takala et al., 2011) and includes an estimate of the uncertainty of the SWE estimate per grid cell. To confirm the results, we use the SWE for the period 1979-1996 of the Former Soviet Union Hydrological Snow Surveys (FSUHSS) based on observations (Krenke, 2004).
Section 2 provides an overview of CCLM (the regional climate model), the hindcast simulation and reanalyses, the satellite-derived SWE product of ESA DUE (Data User Element) GlobSnow, and the SWE measurement data used. A description of the methods for the data analysis follows in the last part of this section. Results and discussion are presented in Sect. 3. The last section provides the summary and conclusion.

Regional climate model hindcast
To perform the climate hindcast simulation, we apply the nonhydrostatic regional climate model CCLM (http://www.clm-community.eu; Rockel et al., 2008). CCLM is the climate version of the numerical weather prediction model COSMO (Steppeler et al., 2003), originally developed by the Deutscher Wetterdienst (DWD).
Because the standard model setup was optimized for simulations over Europe, its application over Siberia implies some changes in its configuration, e.g., the reduction of the minimal heat diffusion coefficient, which results in reduced mixing of the atmosphere to better reproduce winter temperatures in the high pressure system of the Siberian High. In order to better account for vertical temperature changes in Siberian permafrost soils, we add soil layers from the standard 13 m up to a total soil depth of 92 m in the multilayer soil and land surface model TERRA-ML (Jacobsen and Heise, 1982; Doms et al., 2011). Because of the importance of snow cover in Siberia, a multilayer snow model within TERRA-ML, introduced in a preliminary version by DWD, is used.
Several aspects of cryospheric processes are considered in the CCLM, e.g., falling snow melting, rain freezing, water freezing in the interception reservoir, snow melting in the snow reservoir, and freezing and thawing of water and ice in the soil layers. For the hindcast simulation, we have chosen two snow layers (a greater number of layers did not improve the results in our case), each described by its own temperature, water content, and porosity, according to the snow density. The snow temperature $T_{sn}$ changes with time according to Eq. (1), where $\rho_{sn}$ and $C_{sn}$ are the density and specific heat capacity of snow, $\lambda_{sn}$ is the heat conductivity of snow, $L$ is the latent heat of freezing, $M$ and $F$ are the melting and refreezing rates, and $R$ is radiative heating. The time rate of change of the specific liquid water content, $W_{liq}$, is given by Eq. (2) and that of the specific total water (liquid and solid) content by Eq. (3), where $q$ is the rate of liquid water percolation and $P$ is the precipitation rate. Snow density may vary at any time step according to a density equation in which $\rho_w$, $\rho_i$ and $\rho_{fr}$ are the densities of water, ice and fresh falling snow, respectively, and $\sigma(t)$ represents gravitational compaction and compaction due to metamorphism.
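For orientation, the symbols defined for Eq. (1) are consistent with a one-dimensional heat balance of the generic form sketched below. This is an assumed textbook form, with $z$ introduced here as the vertical coordinate; it is not taken from the TERRA-ML documentation, and the exact expression used in the model may differ.

```latex
\rho_{sn} C_{sn} \frac{\partial T_{sn}}{\partial t}
  = \frac{\partial}{\partial z}\left( \lambda_{sn} \frac{\partial T_{sn}}{\partial z} \right)
  + L\,(F - M) + R
```

In this sketch, refreezing ($F$) releases latent heat and warms the snow layer, melting ($M$) consumes it, and $R$ acts as a radiative source term.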
The lower boundary condition of the soil model is prescribed by the climatological mean temperature of the lowest soil layer as the heat conduction equation is solved for the entire column consisting of snow and soil layers. A time-dependent snow albedo is used and gravitational compaction and compaction of snow caused by metamorphism are also described.
Spectral nudging is applied to prevent the regional model from deviating from the prescribed large-scale state within the entire simulation domain (von Storch et al., 2000). The horizontal resolution is 0.44° (approximately 50 km) in rotated coordinates with 40 atmospheric vertical layers. As global forcing for the initialization and the regional boundaries, we use NCEP-R1 (Kalnay et al., 1996) because it provides some of the longest temporal data coverage (from 1948 to present) among the reanalysis products. The regional hindcast of CCLM thus constitutes a dataset from 1948 onwards and provides an hourly output of the main meteorological variables.
Figure 1 presents the entire model domain of the CCLM hindcast simulation on a lat/lon grid. It covers a region in Siberia that spans from the Laptev Sea and Kara Sea to northern Mongolia and from the West Siberian Lowland to the border of the Sea of Okhotsk.

Reanalyses
The general method of performing an added-value study is to assess the relative skill of RCM output against the considered parameter (the SWE, in this study) of the driving global reanalysis, here NCEP-R1 (Kalnay et al., 1996; Kistler et al., 2001). For the intercomparison period from 1987 to 2010, we additionally compare the SWE hindcast to a set of SWE fields from recent reanalyses, including the following: the updated NCEP/DOE or R2 (Kanamitsu et al., 2002), the newest generation climate forecast system reanalysis (CFSR) (Saha et al., 2010) and ERA-Interim produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) (Dee et al., 2011).

NCEP-R1
NCEP-R1 is available in a grid spacing of 1.875° × 1.875° (∼210 km) and 6-hourly SWE is provided on a T62 Gaussian grid. A 3-D variational TS2 scheme is used as spectral statistical interpolation and various observations are assimilated, e.g., upper air rawinsonde observations of temperature, horizontal wind, and specific humidity, and operational Television Infrared Observation Satellite (TIROS) Operational Vertical Sounder (TOVS) vertical temperature soundings from NOAA (Kalnay et al., 1996; Kistler et al., 2001).
Snow cover is based only on a weekly Northern Hemisphere snow cover analysis without snow depth. Therefore, maximum snow depth was set to 100 mm in an empirical formulation (liquid water equivalent) and no prediction of the snow accumulation by the model was used. Further errors in the snow cover analysis have been detected, such as the usage of the 1973 data for the period 1974-1994, an incorrect snowmelt term that led to an overestimation of the conversion of snow to water by a factor of 1000, and an erroneous moisture diffusion leading to incorrect snowfall in winter over valleys in high latitudes ("spectral snow" problem) (Kistler et al., 2001). We use the SWE data of NCEP-R1 for comparison in spite of the aforementioned problems to highlight the ability to add realistic detail to the global reanalysis via the technique of dynamical downscaling of atmospheric forcing fields using CCLM.

NCEP-DOE/ NCEP-R2
The errors discussed above in the NCEP-R1 were eliminated in the updated version of the NCEP-DOE or R2 reanalysis, covering the time period from 1979 to the present. Additionally, different snow budget diagnostics were introduced (Kanamitsu et al., 2002). The procedure to compute snow depth was handled differently than in R1. The model-predicted snow depth was no longer ignored. Where the model agreed with snow cover observations (weekly Northern Hemisphere analyses of snow cover using satellite imagery), the snow depth of the model was used. Otherwise, the modeled snow depth was adjusted to the analysis. In that case, the snow was either deleted or added by applying the same empirical formulation as in R1. Using this scheme has the advantage of accumulating deep snowpacks (Kanamitsu et al., 2002).

CFSR
The climate forecast system reanalysis is available from 1979 to the present (Saha et al., 2010). This latest reanalysis of NCEP offers a coupled atmosphere-ocean-land surface-sea ice system with a spatial resolution of ∼38 km (T382) and 64 vertical levels for the atmosphere. Additional new features include the assimilation of satellite radiances and the integration of observed greenhouse gases, aerosols and solar variations. To produce daily analyses of snow depth over land, data from the Air Force Weather Agency's SNODEP model (Kopp and Kiess, 1996) and the NESDIS Interactive Multisensor Snow and Ice Mapping System (IMS) (Helfrich et al., 2007) were used. Since February 1997, both analyses of SNODEP and IMS have been used in combination for the Northern Hemisphere.

ERA-Interim
The ERA-Interim is the latest version of the ECMWF forecast system that is available for 1979-2010 and covers many years of the GlobSnow product. In addition to the higher horizontal resolution of ∼80 km (T255), it includes improvements such as a 4-D variational assimilation system, variational bias correction of satellite radiances, new humidity analysis and improved model physics compared to the former ERA-40 reanalysis (Dee et al., 2011). These changes are expected to provide a better quality and more homogeneous analysis than that of the ERA-40 forecast. Certain problems were documented with respect to the analyzed snow and data processing. Errors occurred in the Cressman-based interpolation scheme inducing snow-free patterns in periods in which only sparse observations were available. Since July 2003, the ERA-Interim snow analysis has been constrained with the satellite-derived NOAA/NESDIS daily IMS snow-cover dataset for the Northern Hemisphere. Shortcomings in the pre-processing of this dataset that led to mistaken locations of the data itself, in addition to the land-sea mask and the orography, were addressed. Dee et al. (2011) posited that this problem has caused errors in the snow analysis from July 2003 to February 2010.

ESA GlobSnow
We use ESA GlobSnow for assessing the skill and added value of CCLM relative to global reanalyses for Siberia because of its more sophisticated approach to retrieve SWE from passive microwave satellite data than given in standalone algorithms. An important point in selecting this product was the availability of an uncertainty estimate and the advantage of it being a gridded dataset for the entire Northern Hemisphere compared to single point station measurements.
The brightness temperature derived from different channels of passive microwave sensors on satellites makes it possible to provide daily information on SWE, snow depth and snow mass for the full spatial coverage under dry snow conditions beginning in 1978 (Derksen et al., 2012; Foster et al., 2005; Pulliainen, 2006). The SWE retrievals obtained by the space-borne passive microwave radiometer have the advantage of continuous wide-swath, all-weather monitoring capabilities and of being insensitive to cloud cover (Brown et al., 2010; Foster et al., 2005; Derksen et al., 2012).
However, standalone passive microwave SWE retrieval algorithms are highly uncertain, which limits the use of these datasets for model validation (Clifford, 2010; Takala et al., 2011). Certain snow properties affect microwave emission and scatter and make the extraction of SWE information difficult. For example, wet snow leads to increased microwave brightness temperature, whereas increases in snow grain size decrease the brightness temperature independent of any change in SWE. Additionally, vegetation cover, such as densely forested areas (e.g., the boreal forest in Siberia), can impact the accuracy of the SWE estimates and lead to underestimations (Foster et al., 2005). A common problem occurs with deep snow. The SWE retrievals tend to systematically underestimate the snowpack because of the changes in its microwave behavior (Foster et al., 2005; Pulliainen, 2006; Takala et al., 2011).
To overcome these problems, the GlobSnow consortium has introduced a new dataset of SWE that is based on an assimilation scheme that uses passive satellite microwave radiometer data and in situ measurements of snow depth (Pulliainen, 2006) in combination with a time-series melt-detection algorithm (Takala et al., 2009). The combination of these two algorithms yields information about SWE and the extent of snow cover. The passive microwave data includes radiometer information of SMMR (scanning multichannel microwave radiometer) for 1979-1987, SSM/I (special sensor microwave/imager) for 1987-2002 and AMSR-E (advanced microwave scanning radiometer EOS) for the period 2003-2009. Additional station data of snow depth collected by ECMWF from national observing networks were used. The Helsinki University of Technology (HUT) semi-empirical snow emission model was used to interpret the passive microwave radiometer data and to calculate the SWE estimates. A detailed description of these methods was published by Takala et al. (2011).
SWE estimates, the accuracy estimate and the information of snow extent were produced with a resolution of 25 km × 25 km grid cells in a Lambert's equal-area azimuthal projection for the Northern Hemisphere land surface. Mountainous regions were masked out because of poor algorithm performance in regions with strong orographic complexity (Takala et al., 2011).
A validation study of the GlobSnow SWE retrievals was performed for the years 1980-2010 (Takala et al., 2011). The SWE estimates were compared for Eurasia against INTAS-SCCONE snow course measurements (Kitaev et al., 2002). This study demonstrated RMSE values of 30 to 40 mm for SWE values below 150 mm. When the complete dataset was assessed, the RMSE for Eurasia increased to as much as 45 mm. Takala et al. (2011) also compared the performance of the SWE assimilation technique against the SWE retrievals of the NSIDC global monthly SWE climatology (Armstrong et al., 2007), which are obtained by a standalone passive microwave algorithm. They found a clear improvement in RMSE and bias error. In their study, they acknowledge that further improvement is needed to better account for land cover and forest properties and the effect of lakes.

FSUHSS Data
To cross-check whether the results are valid using reference data other than that of GlobSnow, we compare the SWE estimate of CCLM with in situ observations of SWE provided by Former Soviet Union Hydrological Snow Surveys (FSUHSS) (Krenke, 2004). This dataset provides SWE measurements over a snow course transect near World Meteorological Organization stations. The observations are available from 1966-1996, were taken 3 times per month and represent an average of 20 measuring points. However, the station-based comparison is restricted to single point measurements, thus being sparse (particularly, in the northern parts of the model domain), while snow measurements suffer from uncertainties as well, e.g., because of wind-induced redistribution. Additionally, the results can be affected by the grid box versus station comparison; one grid box represents a mean area of ∼2500 km². Therefore, no standard seasons are considered; the analysis is restricted to single months. We choose January and April as representative months of snow accumulation and the beginning of the melting period for the southern regions in which sufficient daily data over the long-term period of 1987-2010 is available. Unfortunately, no fall month representing the beginning of snow accumulation can be considered because of a shortage of daily data from GlobSnow over the considered years. Additional missing values occur over mountainous regions and water bodies.
The monthly mean values are calculated from daily SWE data. Missing values that occur in GlobSnow are excluded from all datasets before the monthly mean value of each dataset is calculated. In this study, we decide to use daily data from 1987 until 2010. The year 1987 was when the SSM/I began to operate and daily data became available. Using SSM/I ascending and descending data, it is possible with daily data to cover all land areas north of ∼20° N. The SWE product of GlobSnow also includes the information about the snow cover's extent (SCE), where 0 mm denotes snow-free areas and > 0.001 mm means areas with full snow cover (snow extent 100 %). A better choice would be to take a direct SCE dataset (e.g., the NOAA IMS SCE product) instead of using the GlobSnow SWE product to derive the SCE information because of the uncertainties in the wet/dry snow masking with a microwave radiometer time series. Because the NOAA IMS SCE dataset is used with the assimilation of ERA-Interim, for example, an independent intercomparison is not possible.
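The masking rule for monthly means can be illustrated with a minimal Python sketch. The function name and the use of `None` for missing days are assumptions made for illustration; the sketch also drops days on which the second dataset itself is missing, which goes slightly beyond the rule stated above.

```python
def masked_monthly_mean(reference_daily, other_daily):
    """Monthly means using only days on which the reference has data.

    Both arguments are equal-length lists of daily SWE (mm) for one grid box
    and one month; None marks a missing value. Days missing in the reference
    (e.g. GlobSnow) are removed from both series before averaging.
    Returns (reference_mean, other_mean), or (None, None) if no day is usable.
    """
    if len(reference_daily) != len(other_daily):
        raise ValueError("daily series must have the same length")
    # Keep only days that are valid in both series (assumption, see lead-in).
    pairs = [(r, o) for r, o in zip(reference_daily, other_daily)
             if r is not None and o is not None]
    if not pairs:
        return None, None
    reference_mean = sum(r for r, _ in pairs) / len(pairs)
    other_mean = sum(o for _, o in pairs) / len(pairs)
    return reference_mean, other_mean


# Hypothetical four-day example: day 2 is missing in the reference,
# so the value 50.0 of the other dataset does not enter its mean.
ref_mean, model_mean = masked_monthly_mean([10.0, None, 30.0, 20.0],
                                           [12.0, 50.0, 28.0, 26.0])
```

Applying the same mask to every dataset keeps the monthly means comparable, because each one is then based on the same set of days.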
In the first step, we consider the spatial distribution of SCE, here the frequency of snow-covered days during April averaged for 1987-2010 after masking all daily datasets according to the relevant criterion. With this information, we can compare whether similar grid boxes are covered by snow or are snow-free. To obtain a quantitative comparison, the differences in snow cover frequency for all datasets are calculated against GlobSnow. Thus, all datasets are interpolated on the same spatial resolution as CCLM on a geographical grid.
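A minimal Python sketch of this first step for a single grid box follows. It assumes that a day counts as snow-covered when SWE exceeds 0.001 mm and that `None` marks missing days; the function names are hypothetical.

```python
SNOW_THRESHOLD_MM = 0.001  # SWE above this value is treated as snow-covered


def snow_cover_frequency(daily_swe):
    """Percentage of valid days on which SWE exceeds the threshold."""
    valid = [v for v in daily_swe if v is not None]
    if not valid:
        return None
    snowy = sum(1 for v in valid if v > SNOW_THRESHOLD_MM)
    return 100.0 * snowy / len(valid)


def frequency_difference(dataset_daily, reference_daily):
    """Snow-cover frequency of a dataset minus that of the reference.

    Days missing in the reference are masked out of the dataset first,
    so both frequencies refer to the same days.
    """
    masked = [d for d, r in zip(dataset_daily, reference_daily)
              if r is not None]
    freq_dataset = snow_cover_frequency(masked)
    freq_reference = snow_cover_frequency(reference_daily)
    if freq_dataset is None or freq_reference is None:
        return None
    return freq_dataset - freq_reference


# Hypothetical five-day example: the reference has snow on 2 of 4 valid days,
# the dataset on 3 of the same 4 days.
diff = frequency_difference([3.0, 1.0, 4.0, 2.0, 0.0],
                            [5.0, 0.0, None, 2.0, 0.0])
```

A positive difference means that the dataset shows snow cover on more days than the reference for that grid box.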
To give an overview of the spatial distribution of SWE in that region, in a second step, we show spatial patterns of the mean monthly SWE for January and April of GlobSnow, CCLM and ERA-Interim. Because we want to account for the underlying uncertainty information of the satellite-derived SWE, we did not compare spatial monthly mean SWE fields of CCLM, ERA-Interim and reanalyses with GlobSnow directly. Instead, we compute spatial averages of SWE and uncertainty estimates for several subregions and compare the different datasets within the calculated uncertainty range. Regional averaged analysis of SWE data is undertaken by decomposing the model domain into seven subdomains representing the Arctic (northwards of the Arctic Circle), sub-Arctic regions and those of the mid-latitudes. The subregions are the following: Arctic-West (AW), Arctic-East (AE), Mid-West (MW), Mid-Mid (MM), Mid-East (ME), South-West (SW) and South-East (SE), as shown in Fig. 1.
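The subregion comparison can be sketched in Python as follows. The cosine-of-latitude weighting, the box boundaries and the function names are assumptions made for illustration; the text does not specify how the spatial averages are weighted.

```python
import math


def regional_mean(cells, lat_min, lat_max, lon_min, lon_max):
    """Area-weighted mean over a latitude/longitude box.

    cells: iterable of (lat, lon, value) tuples on a regular lat/lon grid,
    with value None for missing data. Weights are cos(latitude), which is
    an assumption of this sketch.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lat, lon, value in cells:
        if value is None:
            continue
        if lat_min <= lat < lat_max and lon_min <= lon < lon_max:
            weight = math.cos(math.radians(lat))
            weighted_sum += weight * value
            weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else None


def within_uncertainty(candidate, reference, uncertainty):
    """True if the candidate lies inside reference +/- uncertainty."""
    return abs(candidate - reference) <= uncertainty


# Hypothetical cells: two valid cells inside the box, one missing value,
# and one cell outside the box that must not influence the average.
cells = [(60.0, 100.0, 80.0), (60.0, 101.0, 100.0),
         (62.0, 100.0, None), (40.0, 100.0, 500.0)]
box_mean = regional_mean(cells, 55.0, 66.0, 95.0, 105.0)
```

Comparing `box_mean` of a model against the GlobSnow regional mean with `within_uncertainty` then corresponds to checking whether the model falls inside the uncertainty range of the reference.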
The GlobSnow SWE data is originally available in EASE-Grid projection and the SWE of CCLM is interpolated into the geographical coordinate system at 0.44° spatial resolution. The reanalysis data are kept on their original projection and spatial resolution and masks are used for selecting single regions. Multi-year monthly means, standard deviation and temporal correlation are calculated for all the months in which more than 20 yr of SWE of GlobSnow for the time period 1987-2010 are available. The time series presented here exclude the monthly mean if GlobSnow has more than 3 missing values. The uncertainty range of GlobSnow is calculated as the standard deviation of the accuracy estimates.
Two types of data analyses are conducted. On the one hand, the spatial patterns of snow cover frequency for April averaged over 1987-2010 are examined for all considered datasets. Additionally, we present the spatial distribution of mean monthly SWE for GlobSnow, CCLM and ERA-Interim. On the other hand, we consider area averages of subregions to evaluate the regional variations of all the different datasets for monthly, multi-year monthly and multi-year monthly standard deviation of SWE in January and April.
To determine a direct measure of association to GlobSnow, the temporal correlation of monthly SWE is calculated using the Kendall rank correlation coefficient. We have chosen this non-parametric correlation because the monthly SWE of January and April does not follow a normal distribution. The statistical significance of the correlation coefficient is defined at the 95 % level. No temporal correlation of monthly SWE for April or January among the years is evident from the autocorrelation function of the observed dataset.
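As an illustration of the rank correlation, the following Python sketch computes Kendall's tau-b together with a two-sided p-value. The choice of the tau-b variant and of the normal approximation without tie correction are assumptions; the text does not state which variant or significance test was applied.

```python
import math


def kendall_tau(x, y):
    """Return (tau_b, p_value) for two equal-length numeric sequences.

    tau_b accounts for ties. The two-sided p-value uses the normal
    approximation for S = concordant - discordant without a tie
    correction (an assumption of this sketch).
    """
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two value pairs are required")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both series: counted in neither term
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    s = concordant - discordant
    untied = concordant + discordant
    denominator = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    tau = s / denominator if denominator > 0 else float("nan")
    variance = n * (n - 1) * (2 * n + 5) / 18.0
    z = s / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return tau, p_value


# Hypothetical monthly SWE values (mm) of a model and a reference.
tau, p = kendall_tau([61.0, 70.0, 55.0, 80.0, 66.0],
                     [58.0, 75.0, 50.0, 77.0, 69.0])
significant = p < 0.05  # corresponds to the 95 % level
```

Because the statistic depends only on the ordering of the values, it requires no assumption about their distribution, which matches the motivation given above.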
To compare in situ observations of SWE given by FSUHSS, we consider 2 subregions of the middle domain where most of the station data is available. We select those stations that are within these subregions and extract the corresponding nearest neighboring grid boxes of CCLM and ERA-Interim. The other subregions are disregarded due to the limited station number. Only days with SWE measurements are extracted from the gridded datasets. Monthly averages are calculated and averaged over all available stations and grid boxes per subregion. The comparison starts at 1979, when ERA-Interim begins, and ends with the data availability of FSUHSS at 1996.
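The nearest-neighbor extraction can be sketched in Python as below. The great-circle (haversine) distance, the spherical Earth radius and the function names are assumptions made for illustration; the text does not state which distance measure was used.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_grid_box(station, grid_centres):
    """Index of the grid-box centre closest to the station.

    station: (lat, lon); grid_centres: list of (lat, lon) tuples.
    """
    return min(range(len(grid_centres)),
               key=lambda i: haversine_km(station[0], station[1],
                                          grid_centres[i][0],
                                          grid_centres[i][1]))


# Hypothetical station and three neighbouring 0.44-degree grid-box centres.
grid = [(60.0, 100.0), (60.44, 100.0), (60.0, 100.88)]
index = nearest_grid_box((60.4, 100.1), grid)
```

The returned index can then be used to pull the SWE time series of that grid box for the days on which the station reports a measurement.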

Spatial patterns of snow cover frequency
The ability of CCLM and the reanalyses to reproduce the large-scale distribution of snow extent over Siberian land areas is critical with respect to the surface albedo and, therefore, with respect to the amount of energy available to turbulent and radiant energy exchange. The onset and melting of snow during the transition seasons of fall and spring are of particular concern. With respect to the fall season, the data coverage of GlobSnow was not sufficient throughout the years, and we can only consider the spring period represented here through April. In Fig. 2 the absolute frequency of snow-covered days of April averaged over 1987-2010 is first illustrated for the whole model domain for GlobSnow. The other panels illustrate absolute differences in the frequency for the single datasets against GlobSnow. In some areas, e.g., the Sayan Mountains in the southwest, the white grid boxes indicate missing values because the GlobSnow product does not deliver data, which makes the evaluation for these regions impossible with this reference dataset. This is a significant disadvantage for this study because an RCM is expected to provide added value particularly in areas with strong orography.
During April, GlobSnow shows a snow-cover frequency of 90-100 % for more than half the region down to ∼55° N. West of Lake Baikal, the snow line lies farther north than east of the lake. This pattern is similar to the long-term monthly snow cover frequency of the Northern Hemisphere for April provided by the NSIDC (not shown here) that is derived from the Northern Hemisphere EASE-Grid Weekly Snow Cover and Sea Ice Extent dataset (Brodzik and Armstrong, 2013).
Absolute differences in the frequency of snow-covered days of CCLM to GlobSnow indicate that CCLM has up to 10 % more snow-covered days in the central section whereas it underestimates the frequency of snow-covered days by up to 10 % in the northwestern and northeastern parts of the Arctic and sub-Arctic regions. These features are also visible for all the reanalyses when compared to GlobSnow. In general, the spatial pattern of the differences of snow-cover frequency that the CCLM hindcast illustrates is similar to that shown by NCEP-R1. Compared to NCEP-R1 and NCEP-R2, CCLM shows more grid boxes with 20-40 % of days that are snow-covered south of 50° N. CCLM, NCEP-R1 and NCEP-R2 underestimate the frequency of snow cover in South Siberia, particularly in northern parts of Mongolia, south of Lake Baikal. Thus, the snow retreat takes place earlier than presented by GlobSnow.
A special feature of NCEP-R1 and NCEP-R2 becomes obvious for the coastal regions of Siberia, in which the frequency of snow cover in April is lower than presented by GlobSnow, and some grid boxes show an underestimation. Here, CCLM shows an added value, being in the same range of frequency of snow-covered grid boxes as GlobSnow. NCEP-CFSR and ERA-Interim, with higher spatial resolution, overestimate the snow cover frequency during April, most pronouncedly at approximately 48-55° N. The overestimation in large parts of that region is approximately 20-40 % and even 40-60 %; this indicates that the snow-cover extent persists longer at that latitude. In the southernmost parts, NCEP-CFSR and ERA-Interim show an underestimation, but it is less pronounced than in CCLM, NCEP-R1 and NCEP-R2. Here, snow-covered days are less frequent than in GlobSnow during April, which points to stronger melting.
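The frequency comparison described in this section can be sketched in a few lines. This is a minimal illustration and not the procedure used in the study: the function names, the 1 mm SWE threshold that defines a snow-covered day, and the convention of marking missing days with `None` are all assumptions, since the text does not state how a snow-covered day was defined.

```python
# Assumed threshold (not given in the text): a day counts as snow covered
# when SWE exceeds 1 mm.
SNOW_THRESHOLD_MM = 1.0


def snow_cover_frequency(daily_swe, threshold=SNOW_THRESHOLD_MM):
    """Return the percentage of valid days with SWE above the threshold.

    Missing days are passed as None and ignored. Returns None when no valid
    day exists (comparable to the white grid boxes of the reference data).
    """
    valid = [v for v in daily_swe if v is not None]
    if not valid:
        return None
    covered = sum(1 for v in valid if v > threshold)
    return 100.0 * covered / len(valid)


def frequency_difference(model_swe, reference_swe, threshold=SNOW_THRESHOLD_MM):
    """Return model minus reference frequency in percentage points.

    Only days that are valid in both series are used, so that gaps in the
    reference do not bias the comparison.
    """
    pairs = [(m, r) for m, r in zip(model_swe, reference_swe)
             if m is not None and r is not None]
    if not pairs:
        return None
    model_freq = snow_cover_frequency([m for m, _ in pairs], threshold)
    ref_freq = snow_cover_frequency([r for _, r in pairs], threshold)
    return model_freq - ref_freq
```

Applied to every grid box with all April days of 1987-2010, a positive difference would correspond to a dataset that keeps snow on the ground more often than the reference does.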

Spatial distribution of mean monthly SWE
Figure 3 presents the spatial distribution of mean SWE during 1987-2010 for January and April for GlobSnow, CCLM and ERA-Interim. The uncertainty range of GlobSnow is disregarded, i.e., the SWE fields just serve as a general overview. During January, almost the entire domain north of 50° N is covered by snow with more than 25 mm SWE. The highest values occur in the northwestern part of Siberia at up to 200 mm. Even higher values occur in that section during April with a slight northward shift. During April, the southern section with values of 0-25 mm extends farther north, which indicates that the melting period has started and is moving northward. It is evident in the GlobSnow data that the spatial patterns of SWE distribution are very smooth with low spatial detail despite the 25 km original resolution. Matias Takala, one of the authors of the GlobSnow product, comments (personal communication, April 2013): "Although one underlying reason for the smoothness is the kriging interpolation one has to bear in mind that assimilation algorithm is adaptive and thus other factors also do contribute the final result. In fact it is possible to add more spatial details by giving more weight to satellite interpretation of SWE but we spent a considerable time to adjust the parameters to get the most accurate estimates of SWE. We are in the process of developing next version of GlobSnow product and it will, among other, implement new version of snow emission model, taking into account different land use (taiga and tundra for example) and also take into account variable snow density. I am quite confident that we get some improvement in terms of spatial details too."
By contrast, CCLM is able to add spatial detail. This is evident at the mountain ranges, e.g., at the highest elevation of the Central Siberian Plateau east of the West Siberian Plain, the Stanovoy Range northeast of Lake Baikal, and the Verkhoyansk Mountains east of the Lena River basin. Here, Brown and Mote (2009), who used the SWE climatology derived from the daily global snow depth from the Canadian Meteorological Centre. A more detailed comparison is restricted because of varying analyses and SWE patterns with less regional detail. Less orographic detail within the SWE patterns is visible, for example, in the Lena River basin in ERA-Interim because of the coarser spatial resolution. During January, ERA-Interim has more pronounced maximum values in the northwest of the domain along the western border of the Central Siberian Plateau than GlobSnow. This peak of snow accumulation in that area coincides with the pattern of SWE climatology during January for ERA-40 presented by Clifford (2010). In April, the spatial pattern of CCLM shows higher SWE values in mountainous terrain than ERA-Interim. This might be explained by the effect that precipitation increases with higher resolution and improved representation of complex topographical features, as shown by Giorgi and Marinucci (1996) for Europe. Kunz and Kottmeier (2006) discussed the overestimation of precipitation with respect to orographic lifting. Rojas (2006) also found a large positive precipitation bias at high altitudes in South America. She suggests that this effect is related to better representation of steeper mountain slopes that influence the divergence of the horizontal wind flow, the vertical velocity, and the precipitation on the upward slope and at the top of the mountains.

Regional characteristics of SWE
To analyze the added value of the SWE hindcast relative to the forcing and the quality compared to more recent reanalyses, we consider spatial averages for several subregions. This makes it possible to analyze all the datasets together with the uncertainty range given by GlobSnow. We compare the long-term means of January and April given by CCLM, GlobSnow and the reanalyses averaged over 1987-2010, which represent a characteristic monthly SWE during the snow accumulation and melting periods. except the southern domains. No latitudinal variation occurs in NCEP-R1 for January and only marginally in April in the southern regions. GlobSnow (as a reference) shows the contrary, i.e., a north-south gradient from the Arctic to the southern subregions, with less SWE in the Arctic regions than in the middle domains and decreasing values southward.
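The subregion averages used in this section could be computed along the following lines. This is a sketch under stated assumptions rather than the method of the study: the cosine-of-latitude weighting, the rectangular box definition and the function name `subregion_mean` are illustrative choices, and the text does not specify how the spatial averaging was done.

```python
import math


def subregion_mean(swe, lats, lons, lat_bounds, lon_bounds):
    """Return the cos(latitude)-weighted mean SWE inside a lat/lon box.

    swe[i][j] belongs to lats[i] and lons[j]; missing values are None.
    lat_bounds and lon_bounds are (min, max) tuples in degrees.
    Returns None when the box contains no valid grid box.
    """
    total = 0.0
    weight_sum = 0.0
    for i, lat in enumerate(lats):
        if not (lat_bounds[0] <= lat <= lat_bounds[1]):
            continue
        # Grid boxes on a regular lat/lon grid shrink toward the pole.
        weight = math.cos(math.radians(lat))
        for j, lon in enumerate(lons):
            if not (lon_bounds[0] <= lon <= lon_bounds[1]):
                continue
            value = swe[i][j]
            if value is None:
                continue
            total += weight * value
            weight_sum += weight
    if weight_sum == 0.0:
        return None
    return total / weight_sum
```

Masking each dataset with the missing values of the reference before averaging would keep the regional means comparable across datasets.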
In January, the higher values of AW compared to AE (which can be observed in all datasets except NCEP-R1) match the climate conditions during winter in Siberia. From November to March, Siberia is dominated by the Siberian high pressure system centered southwest of Lake Baikal (Przybylak, 2003). The relatively infrequent eastward-propagating cyclones with moist air masses from the Atlantic occur mainly in the northern regions. The decreasing moisture supply explains the decline in snow and SWE east of AW and MW. MW presents the highest value of SWE, which is evident from GlobSnow, ERA-Interim and CCLM. This region is located in the West Siberian Lowlands, where the Central Siberian Highlands act as a barrier and favor orographically induced solid and liquid precipitation. In the direct dataset comparison, it is evident that CCLM reproduces SWE well for January compared to GlobSnow, whereas NCEP-R1 is clearly outside the uncertainty range given by GlobSnow, except for SW and SE. The poor performance of NCEP-R1 is related to an erroneous snow analysis, which was previously documented by Kanamitsu et al. (2002) and further discussed in the study by Khan et al. (2008). This shows the added value of CCLM compared to NCEP-R1 for January and the benefit of using this RCM to generate more realistic SWE than its driving reanalysis provides.
NCEP-R2 is in better agreement with GlobSnow than NCEP-R1, except for SE. ERA-Interim reproduces the regional SWE distribution of satellite-derived SWE but tends to overestimate SWE. The largest discrepancies occur for the MM region, where ERA-Interim is outside the uncertainty range of GlobSnow. Except for AW, ERA-Interim is in less agreement with GlobSnow than the CCLM hindcast is. In the subregion SE, NCEP-R2 presents higher SWE than ERA-Interim. This coincides with the results found for the Amur River basin in Khan et al. (2008), in which they compared the SWE of NCEP-R2 with ERA-40. NCEP-CFSR shows the regional variations of SWE but underestimates SWE for all subregions and is even outside the uncertainty range for the middle Siberian regions. Therefore, we conclude that CCLM for January provides an even more realistic dataset than ERA-Interim and CFSR.
In April, CCLM overestimates SWE in all subregions compared to GlobSnow data. CCLM is even outside the uncertainty range of GlobSnow for the subregions AW, MW, MM and ME, where it shows higher values than ERA-Interim. ERA-Interim is clearly in better agreement with the satellite-derived SWE data for AW, MW and SW, although SWE is overestimated as well. This overestimation is particularly pronounced in the regions MM and ME.
The best agreement between NCEP-R2 and ERA-Interim is evident for SE in April, whereas the differences were higher in January. This is similar to the study by Khan et al. (2008) comparing NCEP-R2 and ERA-40 for the Amur River basin.
NCEP-R1 again shows almost no regional variations of SWE and is outside the uncertainty range of GlobSnow, except that the regions SW and SE are close to the values presented by GlobSnow. In most regions south of the Arctic Circle, the beginning of the snowmelt period is indicated by decreasing SWE values of GlobSnow.
For all subregions, NCEP-R2 is in good agreement with GlobSnow and within the uncertainty range. For the regions MW, MM and ME, almost no variations are evident. Except for NCEP-R1, CFSR has the lowest SWE for all subregions and falls outside the uncertainty range of GlobSnow for MW and MM. Additionally, only small regional variations are obvious.
The overestimation during melting of the snow pack, which is the case in southern regions, is a common feature of climate models. Various state-of-the-art global climate models overestimate the snow mass of the Northern Hemisphere, particularly in the spring (Clifford, 2010; Raeisaenen, 2008; Roesch, 2006). As noted by Roesch (2006), the reasons for the surplus of snow amount and delayed melt in spring might be excessive snowfall rate, temperature biases and poor representation of the snowmelt processes. Another reason might be related to the absence of subgrid snow cover heterogeneities that lead to a snow cover that does not gradually abate (Liston, 2004). Deficiencies in the snow's melting processes because of missing fractional snow cover leading to overly high albedo and the overestimation of precipitation (not shown here) are reasons that are evident in CCLM.
Even though CCLM produces considerably more SWE for several regions than GlobSnow, an added value can be observed in terms of the regional variations, which are more realistically described in CCLM than in NCEP-R1. Nevertheless, the overestimation might also highlight shortcomings in snow cover simulations, especially during melting seasons, that must be addressed in future work. However, the overestimation may also be related to the tendency of GlobSnow to underestimate SWE under deep snow conditions, as discussed in Takala et al. (2011). This aspect will be considered in Sect. 3.4 with the comparison of the FSUHSS ground data.
To determine a measure of association for SWE between CCLM, the reanalyses and GlobSnow, we calculate the temporal correlation of the monthly mean SWE using the Kendall rank correlation coefficient. Statistically significant coefficients at the 95 % level of confidence are marked with black hachures in Fig. 4b. In January for all subregions, CCLM shows significantly high correlations with a maximum of approximately 0.8 in ME. Except for SW and SE, ERA-Interim shows higher correlations than CCLM. We find lower correlations of NCEP-CFSR for all subregions. NCEP-R1 even shows negative correlations of approximately -0.2 in MM.
In April, CCLM shows statistically significant correlations between 0.3 and 0.7 for all subregions. Except for AW and MW, higher correlations are given by ERA-Interim, NCEP-R2 and CFSR. It is notable that CCLM shows higher rank correlation coefficients than ERA-Interim with GlobSnow because of better agreement in rank orders, although CCLM overestimates the multi-year mean SWE for April for MW, MM, ME, and SW. Despite the low long-term mean, the April SWE of NCEP-R1 shows correlations between 0.4 and almost 0.6 in certain regions, such as MW, ME and SE.
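The rank correlation used above can be illustrated with a small standard-library sketch. It computes Kendall's tau-b between two series of yearly monthly-mean SWE values and a two-sided p-value from the normal approximation. The function name `kendall_tau` is hypothetical, and the p-value uses the variance formula for data without ties, which is a simplifying assumption; the text does not say which variant or significance test was applied.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Return (tau_b, p_value) for two equally long numeric sequences.

    tau_b corrects the denominator for ties. The p-value comes from the
    normal approximation with the no-ties variance n(n-1)(2n+5)/18, so it
    is only approximate for short or heavily tied series.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    concordant = discordant = ties_x = ties_y = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        dx = xi - xj
        dy = yi - yj
        if dx == 0:
            ties_x += 1
        if dy == 0:
            ties_y += 1
        product = dx * dy
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = n * (n - 1) // 2
    denominator = math.sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    if denominator == 0:
        raise ValueError("at least one sequence is constant")
    s = concordant - discordant
    tau_b = s / denominator
    z = s / math.sqrt(n * (n - 1) * (2 * n + 5) / 18.0)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return tau_b, p_value
```

A coefficient would be marked as significant at the 95 % level when `p_value < 0.05`. Because the measure depends only on rank order, a dataset with a constant bias can still correlate well with the reference, which is consistent with the April results for CCLM described above.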

Interannual variability of SWE
In the previous section, it was evident for the multi-year monthly means that CCLM is in good agreement with the remote-sensing-derived SWE during the cold season but overestimates SWE in April.
To assess the added value of CCLM in terms of interannual variations of these characteristic months, Fig. 4c provides the multi-year monthly standard deviation. In NCEP-R1 almost no interannual variations occur for January and April. The deviation of the long-term monthly mean in January is approximately 8-15 mm for GlobSnow. A good agreement to GlobSnow is given by CCLM, which tends to slightly overestimate the standard deviation, particularly for the MW region. CCLM captures well the regional characteristics of long-term monthly standard deviation in April compared to GlobSnow with an overall slight overestimation, particularly for AE, ME and SW. In terms of the two considered months (January and April), CCLM provides more realistic detail; thus, it provides an added value to NCEP-R1 and a higher quality than NCEP-R2. NCEP-R2 shows the highest discrepancy compared to GlobSnow, which is particularly pronounced for ME in April. High standard deviations are also
Because of the erroneous snow processing of NCEP-R1, as discussed in Kanamitsu et al. (2002), the SWE for the entire considered Siberian region can be regarded as unrealistic. This shows a clear added value of CCLM during the cold season, with GlobSnow as reference data. The approach of dynamical downscaling of NCEP-R1 reanalysis using CCLM can add realistic details in terms of SWE for 1987-2010 to NCEP-R1 because of its own model physics. Even compared to ERA-Interim for 2003-2010, NCEP-R2 and NCEP-CFSR, CCLM is in better agreement with GlobSnow. This clear added value cannot be observed in April. In contrast to January, the CCLM-simulated SWE for April shows an overestimation for all considered subregions. In most of the years, CCLM is even outside the uncertainty range of GlobSnow, except for the southeast region. Here, ERA-Interim is in better agreement with GlobSnow, whereas in certain subregions the sudden change in the presented time series is again visible after 2003, which leads to higher SWE values than estimated by GlobSnow. Even though CCLM overestimates SWE, this overestimation is consistent over time; this is in contrast to ERA-Interim and NCEP-R2, which show temporal inconsistencies in their SWE estimates.
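The multi-year monthly standard deviation discussed earlier in this section can be sketched as follows. The rule that a month is dropped when more than 3 daily values are missing follows the caption of Fig. 5; applying the same rule here, the use of the sample standard deviation, and the function names are assumptions made for illustration.

```python
import statistics

# From the Fig. 5 caption: months with more than 3 missing days are excluded.
MAX_MISSING_DAYS = 3


def monthly_mean(daily_swe, max_missing=MAX_MISSING_DAYS):
    """Return the mean of the valid daily values of one month.

    Missing days are None. Returns None when more than max_missing days
    are missing, so that the month is treated as a data gap.
    """
    valid = [v for v in daily_swe if v is not None]
    missing = len(daily_swe) - len(valid)
    if missing > max_missing or not valid:
        return None
    return sum(valid) / len(valid)


def interannual_std(yearly_means):
    """Return the sample standard deviation of one month across years.

    Years flagged as gaps (None) are skipped. Returns None when fewer than
    two valid years remain.
    """
    valid = [v for v in yearly_means if v is not None]
    if len(valid) < 2:
        return None
    return statistics.stdev(valid)
```

To compare datasets fairly, the gap years of the reference would be set to `None` in every dataset before calling `interannual_std`.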
As discussed in Sect. 2.3.1, the SWE estimate of GlobSnow might be problematic during the melting period. To check that the overestimation of CCLM during April is not caused by a potentially erroneous estimate of GlobSnow, we additionally compare the SWE hindcast and ERA-Interim against in situ measurements of the FSUHSS dataset. Station data of the two subregions MM and ME are considered because this is the part of the region where the overestimation of CCLM is pronounced and most of the station data are available. Unfortunately, the datasets end in 1996, i.e., the time series overlap only for 18 yr.
Figure 7 presents the differences of monthly mean SWE averaged from all available stations and the corresponding grid boxes selected for the subregions MM and ME for January and April 1979-1996. The number of stations with available data changes over time in MM (between 5 and 19 in January and between 8 and 19 in April) and ME (between 4 and 14 in both months). Throughout the considered time period, ERA-Interim mostly overestimates SWE in January, both in MM and ME, with a maximum in ME for 1990 of 70 mm. For the subregion ME, CCLM varies between an over- or underestimation within the range of −20 to 30 mm but is in general in better agreement with the FSUHSS data. These features are consistent with the time series presented in Fig. 4, in which GlobSnow was used as reference data. More notable is that the overestimation of CCLM during April is also visible when FSUHSS transect measurements are used as a reference. Except for 1990-1993 in MM and 1993 in ME, CCLM presents higher SWE with a maximum in 1996 of more than 100 mm. We can conclude that the overestimation of CCLM is a general feature and not dependent on the reference dataset of GlobSnow.
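The station comparison described above can be outlined in code. This is a hedged sketch only: choosing the corresponding grid box as the nearest grid point by great-circle distance is an assumption (the text does not say how grid boxes were matched), and the names `haversine_km`, `nearest_grid_index` and `mean_difference` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Return the great-circle distance in km between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_grid_index(lat, lon, grid_points):
    """Return the index of the grid point closest to (lat, lon)."""
    return min(range(len(grid_points)),
               key=lambda k: haversine_km(lat, lon, *grid_points[k]))


def mean_difference(stations, grid_points, grid_swe):
    """Return mean model SWE minus mean station SWE for one month.

    stations is a list of (lat, lon, swe) with swe None when missing;
    grid_points is a list of (lat, lon); grid_swe holds the model value
    of each grid point. Only stations with data are used, and the model
    is sampled at the matching grid boxes of exactly those stations.
    """
    station_values = []
    model_values = []
    for lat, lon, swe in stations:
        if swe is None:
            continue
        station_values.append(swe)
        model_values.append(grid_swe[nearest_grid_index(lat, lon, grid_points)])
    if not station_values:
        return None
    return (sum(model_values) / len(model_values)
            - sum(station_values) / len(station_values))
```

Sampling the model only where stations report data keeps the two averages comparable when the number of available stations changes from year to year.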

Summary and conclusions
A regional climate model hindcast of CCLM has been obtained over the past six decades by means of dynamical downscaling of NCEP-R1. The aim is to provide a better gridded dataset with enhanced spatial resolution and temporal availability than that provided by satellites and global reanalyses in the data-sparse region of Siberia. This study demonstrates the potential and limitations of the hindcast for the example of SWE as an important parameter in that domain. On the one hand, it contains an assessment of the added value with respect to the SWE estimate given by NCEP-R1 itself and, on the other hand, it provides an intercomparison between the CCLM data and further global reanalyses. The aspects examined in this paper include frequency of snow coverage, spatial distribution of mean monthly SWE, regional characteristics and interannual variability of SWE for January and April. As reference data, we choose a satellite-derived SWE product of ESA GlobSnow to perform the assessments of the regional SWE hindcast area wide.
In terms of the spatial distribution of the frequency of snow cover during April, CCLM is in good agreement with GlobSnow, presenting (particularly in the coastal areas) more days with snow cover than NCEP-R1 and NCEP-R2. The greatest discrepancies to GlobSnow occur at the southernmost extent of snow cover for ERA-Interim and NCEP-CFSR.
With GlobSnow as reference, CCLM shows a clear added value in representing more realistic information for SWE compared to NCEP-R1, according to the spatial distribution of the considered mean monthly SWE. CCLM provides more spatial detail along the Lena River basin, for example, than GlobSnow and ERA-Interim. During January and April, CCLM captures the location of maximum snow accumulation given by GlobSnow but tends to overestimate SWE during April, mainly along the Central Siberian Plateau. This might be related to the effect of increased precipitation at higher resolutions or to delayed melting and intensive snow accumulation because of the overestimation of snowfall rate or poor representation of the snow's melting processes. The absence of fractional snow cover within CCLM might also be a plausible cause.
The SWE product of NCEP-R1 does not represent any of the regional and temporal variations of SWE for the considered subregions, except in southern parts. We can provide more realistic historical SWE fields for the past 60 yr than NCEP-R1 offers, which justifies the computational effort in applying a regional climate model. This added value compared to NCEP-R1 was expected because of erroneous SWE fields of NCEP-R1 that are already well documented. It also shows, however, that the technique of dynamical downscaling of atmospheric forcing fields (e.g., pressure, wind, etc.) provided by NCEP-R1 can be used to derive SWE fields back to 1948 with more realistic information than the reanalysis product itself can present. This is possible because of the RCM's own model physics, e.g., snow parameterization, and finer resolved regional features, such as orography and land cover. This is evident for the entire SWE field even for mean values.
Because of the known poor quality of NCEP-R1 data, we additionally compare our output against the SWE fields of newer reanalysis datasets in order to see whether the hindcast can also compete with these datasets. We can show that the SWE of the regional hindcast is more homogeneous in time than that of ERA-Interim, which presents a spurious jump in 2003 that becomes obvious in certain subregions. A temporal inconsistency is also evident in NCEP-R2 near 1999-2001, which explains why it has the highest multi-year monthly standard deviation among the considered datasets. The CCLM hindcast of SWE can even compete with the newest generation of NCEP reanalysis (CFSR) at 38 km resolution, which underestimates SWE in many subregions. Particularly in periods of snow accumulation, the CCLM hindcast is in better agreement with GlobSnow. However, as clearly shown by the SWE overestimation of CCLM in April (both compared to GlobSnow and to the snow survey measurements of FSUHSS), there is still an obvious model deficiency that must be addressed to justify the RCM application even in snow-dominated cold regions such as Siberia.
It should be stressed that the results are dependent on the quality of the reference data of ESA GlobSnow. The GlobSnow product used here shows coarse SWE patterns, which were caused, among other factors, by kriging interpolation methods. The spatial detail might be improved in future products in which, e.g., variable snow density and different land use types will be taken into account (Takala et al., 2011).
Nevertheless, this study shows that the regional CCLM hindcast of SWE can add more realistic information to the global product of NCEP-R1 and provide a better quality in temporal consistency compared to many of the recent reanalyses for the years after 1987. It is important to demonstrate the discrepancies between existing global reanalyses and to propose an alternative climatology of historical SWE. Using atmospheric fields of NCEP-R1, it is possible to derive a regional dataset of historical SWE fields over the past six decades that is not provided by newer reanalyses. Potential temporal inconsistencies of NCEP-R1 before 1987 due to changes in the observing systems (e.g., the use of satellite observations since the late 1970s) and their impact on the regional climate model hindcast have to be assessed in a further study. In general, a regional multi-decadal dataset is necessary in the data-sparse region of Siberia to assess long-term changes and trends with more spatial detail. This may help to improve the understanding of snow-climate relations in a region in which snow has the potential to feed back to the climate of the whole Northern Hemisphere.

Fig. 1. Orography [m] of model domain of CCLM (colored area) for Siberia and considered subregions on a lat/lon grid.
Fig. 4. (a) Regional variations of multi-year (1987-2010) monthly mean of SWE [mm] for January and April for CCLM, NCEP-R1, NCEP-R2, NCEP-CFSR, ERA-Interim and GlobSnow; (b) temporal correlation of monthly mean of SWE (1987-2010) of each dataset versus GlobSnow, statistically significant coefficients defined at the 95 % level are marked with black hachures; (c) multi-year monthly standard deviation of SWE for 1987-2010. The gray shading with hachures represents the uncertainty range of GlobSnow. Locations of considered subregions (AW, AE, MW, MM, ME, SW and SE) are presented in Fig. 1.

Fig. 5. Time series mean January SWE [mm] (1987-2010) for all considered datasets for different subregions.The gray shaded area represents the uncertainty range of GlobSnow.Data gaps occur where GlobSnow provides SWE with more than 3 missing days per month.These months are excluded in all datasets.

Fig. 7. Differences of monthly mean SWE [mm] of CCLM and ERA-Interim compared to FSUHSS measurements for the subregions MM and ME for January and April. All available transect data and corresponding grid boxes are averaged over each considered subregion.

www.the-cryosphere.net/7/1017/2013/ The Cryosphere, 7, 1017-1034, 2013 1022 K. Klehmet et al.: A regional climate model hindcast for Siberia: analysis of snow water equivalent 2.4 Methods
To assure reasonable comparison with model data, we use the daily L3A-product (v1.2) as one of the available products with no postprocessing applied (e.g., a 7-day sliding time window aggregation). This daily product of GlobSnow has the disadvantage that it contains several missing days, and in certain months, e.g., May, June and September, data availability is reduced to single days because the assimilation algorithm was not able to produce good SWE output for certain dates. The reasons for erroneous retrievals include missing data of weather stations or unusable satellite data. Particularly in late spring and early autumn, problems with SWE retrieval occur because of difficulties in using radiometer data when a thin snow layer or wet snow is predominant.
Fig. 2. Absolute frequency of snow-covered days over land points of the model domain during April 1987-2010 for GlobSnow (upper row left) and differences for the remaining datasets against GlobSnow. White boxes within the model domain indicate missing values given by GlobSnow.