Evaluation of snow depth and snow-cover over the Tibetan Plateau in global reanalyses using in-situ and satellite remote sensing observations

The Tibetan Plateau (TP) region, often referred to as the Third Pole, is the world highest plateau and exerts a considerable influence on regional and global climate. The state of the snowpack over the TP is a major research focus due to its great impact on the headwaters of a dozen major Asian rivers. While many studies have attempted to validate atmospheric re-analyses over the TP area in terms of temperature or precipitation, there have been –remarkably– no studies 20 aimed at systematically comparing the snow depth or snow cover in global re-analyses with satellite and in-situ data. Yet, snow in re-analyses provides critical surface information for forecast systems from the medium to sub-seasonal time scales. Here, snow depth and snow cover from 4 recent global reanalysis products are inter-compared over the TP region, namely the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 and ERA-Interim re-analyses, the Japanese Re-Analyses (JRA-55) and the NASA Modern-Era Retrospective analysis for Research and Applications (MERRA-2). The 25 re-analyses are evaluated against a set of 33 in-situ station observations, as well as against the Interactive Multi-sensor Snow and Ice Mapping System (or IMS) snow cover and a satellite microwave snow depth dataset. The high temporal correlation

coefficient (0.78) between the IMS snow cover and the in-situ observations provides confidence in the station data despite the relative paucity of in-situ measurement sites and the harsh operating conditions.
While several re-analyses show a systematic over-estimation of the snow depth or snow cover, the reanalyses that assimilate local in-situ observations or IMS snow-cover are better capable of representing the shallow, transient snowpack over the TP region. The later point is clearly demonstrated by examining the family of re-analyses from the ECMWF, of which only the 5 older ERA-Interim assimilated IMS snow cover at high altitudes, while ERA5 did not consider IMS snow cover for high altitudes. We further tested the sensitivity of the ERA5 land model in off-line experiments, assessing the impact of blown snow sublimation, snow cover to snow depth conversion and, more importantly, excessive snowfall. These results suggest that excessive snowfall might be the primary factor for the large overestimation of snow depth and cover in ERA5 reanalysis. Pending a solution for this common model precipitation bias over the Himalayas and TP, future snow reanalyses 10 that optimally combine the use of satellite snow cover and in-situ snow-depth observations in the assimilation and analysis cycles have the potential to improve medium-range to sub-seasonal forecasts for water resources applications.

Introduction
Often referred to as the Third Pole, the Tibetan Plateau (TP) region is the world highest plateau, with an average elevation of 15 4000 m above sea level. Due to its spatial extent, elevation and geographical position in the mid-latitudes, it exerts a considerable influence on regional and global climate. The formation and variability of the Asian summer monsoon in particular, is affected by the TP through thermal and mechanical effects (Wu et al., 2012(Wu et al., , 2015Xiao and Duan, 2016), with remote impacts both downstream (e.g., Zhang et al., 2004;Xue et al., 2018) and upstream ( Lu et al., 2017). In autumn or winter, snow anomalies over the TP have been linked to wave trains extending downstream over East Asia and the Pacific 20 Li et al., 2018). Given its importance for climate and given that climate change strongly affects the region (Yang et al., 2014), the cryosphere over the TP is closely monitored in terms of glacier shrinking (Yao et al., 2012;Treichler et al., 2018), lake expansion (Zhang et al., 2017;Treichler et al., 2018) or change in the snowpack (Chen et al., 2017;Wang et al., 2017;Wang et al., 2018). The extent and variability of the snowpack over the TP has been a major focus of investigation because of the role of snow in the surface energy balance and in the hydrological cycle, as well as its potential 25 There is, nevertheless, a substantial number of meteorological stations over the TP region, operated by the Chinese Meteorological Administration (CMA), which have been used in many climatological studies (e.g., for recent studies, Basang et al., 2017;Li et al., 2018). Since most of the stations are located in inhabited valleys, below 4000 m, in the southeast TP, the representativeness of these in-situ data for the TP as a whole is questionable.
Advances in satellite remote sensing have however provided invaluable information on the state of this high-elevation 5 snowpack (Basang et al., 2017;Yang et al., 2015;Menegoz et al., 2013), often through blending data from multiple instruments to combine the high spatial resolution of optical data with the all-weather capability of microwave (MW) data.
Since 2004, NOAA has provided an operational, daily, multi-sensor snow cover product (the Interactive Multisensor Snow and Ice Mapping System or IMS) at a high-resolution (4 km), by combining optical, infrared and MW satellite data and surface observations (Helfrich et al., 2007). The accuracy of the IMS product compared to station data over the TP has been 10 evaluated by Yang et al. (2015) and Li et al. (2018), who found the pixel matching accuracy (snow or no snow) to be over 90 %. Another widely used satellite optical product is the Moderate-resolution Imaging Spectroradiometer (MODIS) 8-day snow cover (Riggs and Hall, 2015;Basang et al., 2017). Joint analysis of MODIS and station data by Basang et al. (2017) indicated that, despite large day-to-day snow cover fluctuations and the non-synchronicity of these two datasets, there is a high temporal correlation (0.77), if the station data is pre-processed in 8-day bins in a similar fashion as the MODIS data, 15 i.e., if the station data is considered snow-covered in the 8-day period when snow covered for at least one day. Nevertheless, Basang et al. (2017) also showed that the spatial distribution of snow cover is uneven: the snow cover fraction is less than 21 % for 70 % of the TP area, and yet it can be up to 40 % in the eastern part of the TP. They also indicated a discrepancy in the seasonality of the maximum snow cover between station data and MODIS: the former had higher snow cover in the spring, when precipitation increases and falls as snow in the southeast TP. On the contrary, MODIS had a higher winter snow cover, 20 more representative of the mean conditions over the plateau. These latter discrepancies point out at large spatial differences in the characteristics of precipitation. Snow depth products are also available from satellite remotely sensing. Snow depth retrieval from MW observations is difficult in regions with complex topography, hampered by non-heterogeneity of snow grain size and of vegetation cover. Few MW snow depth products have been thoroughly evaluated over the TP, a region characterised by a sparse and rapidly melting snowpack where, given the small snow grain size of fresh snow, the MW 25 retrievals can lead to large errors (Dai et al., 2017). Nevertheless, a long-term MW snow depth product has been developed to account for the special conditions of the TP by Che et al. (2008) and Dai et al. (2017).
Atmospheric re-analyses comprehensively assimilate in-situ, balloon-borne, aircraft and satellite observations into a forecast model, and form an essential tool in meteorological and climate research (e.g., Bronnimann et al. 2018;Fujiwara et al., 2017). Furthermore, re-analyses are sometimes used as initial conditions for seasonal hindcasts or reforecasts. A thorough 30 up-to-date description and intercomparison of atmospheric re-analyses is found in Fujiwara et al. (2017). On the other hand, data assimilation in land surface models is performed separately from the atmospheric data assimilation due to the different nature of observations and methodologies involved (Hersbach et al., 2018). Early land re-analyses were performed as offline simulations without any actual assimilation of observations, but land data assimilation is rapidly evolving. For example, the newest operational seasonal forecasting system 5 (known as SEAS5, Johnson et al. 2018) and the Integrated Forecast System (IFS) used for medium-range prediction at the European Centre for Medium-Range Weather Forecast (ECMWF), both rely on a land surface data assimilation system using a 2-dimensional optimal interpolation, which blends IMS snow cover and in-situ snow depth observations with the model background (de Rosnay et al., 2014(de Rosnay et al., , 2015. This evolution is partly motivated by the renewed interest in tapping into the potential of the land surface, and snow in particular, to improve the 5 prediction skill at the subseasonal-to-seasonal time scale (Orsolini et al., 2013). Concerning the TP region more specifically, Lin et al. (2016) assimilated snow cover fraction from MODIS and/or water storage from GRACE in a land model and showed improvement in seasonal atmospheric temperature prediction resulting from the improved land conditions. The impact of realistic land initialization over the TP region on the Indian Summer Monsoon (ISM) has also been investigated (Senan et al., 2016;Halder et al., 2017;Rai et al., 2015). In particular, the impact of springtime initialisation on predicting 10 the ISM onset was investigated by Senan et al. (2016) using the coupled ocean-atmosphere seasonal forecasting System 4 of the ECMWF. They found that, in years with anomalously high mean snow depth over the TP, as determined from the ECMWF land re-analyses, the ISM onset was delayed by about 8 days. Half of this delay was attributed to the initialization of snow over the TP region, highlighting the importance of land initialisation over that region for seasonal forecasting of the ISM onset. 15 While many studies have attempted to validate atmospheric re-analyses over the TP area in terms of temperature, precipitation or snowfall (Wang et al., 2012;Palazzi et al., 2013;Zou et al., 2014;Viste et al., 2015), there have been remarkably few studies aimed at evaluating and comparing the snow depth or snow cover in the re-analyses. Snow in longterm climate reanalyses over the plains of central and northern Russia have been evaluated against station data, showing surprisingly good performance (Wegmann et al., 2017). However, the terrain there is not as complex as over the TP. 20 Previous studies of snow evaluation over the TP have focused purely on the remote sensing snow products (Basang et al., 2017;Yang et al., 2015;Dai et al., 2017). We argue that evaluation of snow re-analyses is of interest per se, since these reanalyses provide initial, critical surface information for the forecast systems, and link surface conditions with atmospheric dynamics.
The aim of this study is to compare snow cover and snow depth over the TP region among a set of modern atmospheric and 25 land re-analyses, and to evaluate them against in-situ snow observations and selected satellite-based remote sensing products. A second goal is also to better characterise the snow biases in the re-analyses and to identify their origin.
Sensitivity experiments are then carried out with a land surface model to assess the potential relevance of snow processes in driving the identified biases.

2 Data and methods
Maps of our study area with the orography at the resolution of the atmospheric re-analyses are shown in Figure 1. We make use of 3 re-analyses produced by ECMWF, namely the latest generation of atmospheric re-analyses (ERA5) (Hersbach et al., 2018), its older counterpart (ERA-Interim, hereafter ERA-I) (Dee et al., 2011), as well a land re-analysis obtained by running the ECMWF land model (HTESSEL) off-line, forced by the ERA5 meteorological forcing. Here, we refer to the latter as ERA5L-CTRL since it is not yet the officially released version of ERA5-Land expected to become available online in 2019 (https://climate.copernicus.eu/climate-reanalysis). In addition, we also make use of the Japanese Re-Analyses (JRA-55) (Kobayashi et al., 2015), and of the NASA Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA2) (Gelaro et al., 2017). We did not carry out an exhaustive inter-comparison of all existing global re-analyses, but 5 we chose the aforementioned analyses because they belong to the latest generation (ERA5, MERRA-2, JRA-55), or else because they incorporate either local snow observations over the TP region (JRA-55) or satellite snow cover observations that encompass the TP (ERA-I, JRA-55). Some general characteristics of these re-analyses are shown in Table 1, and more detailed information about the treatment and assimilation of snow variables is provided in the Appendix. Further information about the re-analyses systems can be found in Fujiwara et al. (2017) and Wright et al. (2018). For the evaluation of re-10 analyses, we use station data, the multi-sensor IMS snow cover product and satellite MW data, as detailed below.

Station Data
We obtained five years (2009 to 2013) of in-situ daily meteorological observations from 33 stations over the TP from CMA.
The observations comprise minimum and maximum temperature, snow depth and total precipitation. The measuring time is 00:00 UTC (0800 Beijing time). The operational procedures dictate that, when more than 50 % of the ground surrounding 15 the weather station is covered by snow, then the snow depth (SD) is measured. When converting in-situ snow depth to a snow cover fraction (SCF), a 100 % SCF is assumed when snow depth exceeds 1 cm, according to the in-situ observing rules of CMA (CMA, 2003). The in-situ observations also provide the snow cover days (SCD), i.e. the number of days when the snow covers more than 50 % of the ground in the sight from the station. The geographical locations of the stations are shown on Figure 1, along with their altitudes in comparison to the topography resolved by the four atmospheric re-analyses. Most of 20 the stations are located in inhabited valleys, below 4000m, in the southeastern part and are not representative of the TP region as a whole. There is only one station in the western part of the TP, west of 85 E. In-situ observations have several sources of uncertainty. Here, we highlight two sources: (i) the stations might not be fully representative of their local surroundings due to the complex nature of the terrain, and (ii) the quality of the records could be affected by the harsh operating conditions. For example, strong winds limit the instrument ability to record the amounts of falling snow or solid 25 precipitation, a phenomenon called undercatch.

Interactive Multisensor Snow and Ice Mapping System (IMS)
We use SCF from IMS, a multi-instrument, near-real time daily product covering the Northern Hemisphere with a pixel resolution of 4 km. IMS provides a binary (1/0) snow cover information: either 1 if more than 50 % of the 4-km pixel is 30 covered by snow, or 0 (snow free) otherwise. Since, in this paper, we compare IMS data to re-analyses, we aggregate the 4km product to a 0.25-degree grid, comparable to highest horizontal resolution among the re-analyses. This is done by counting the number of pixels with a value of 1 in a grid-box, assuming they represent 100 % cover -this gives the "high estimate". If we assume that a value of 1 represents only 50 % of the 4 km pixel, we obtain the "low estimate", for which the maximum SCF possible is 50 %. These two estimates provide a range of values, reflecting the uncertainty inherent to aggregating the 4-km binary data, e.g., a value of 1 in a pixel means 50% to 100% snow coverage.
The choice of IMS snow cover is motivated by its use in the ECMWF analysis system. A key point is that it is used 5 differently in ERA-I (Drusch et al., 2004) than in ERA5 (de Rosnay et al., 2014Hersbach et al., 2018). First, the high-resolution 4 km product is used in ERA5, while the coarser resolution 24 km-product is used in ERA-I. More importantly, IMS data is not used above 1500 m in ERA5, i.e., in high altitude regions such as the TP, while it was still in use in ERA-I. When comparing the snow cover in IMS with the one in ERA-I or ERA5, one has to recall that these are not independent datasets. Nevertheless, the different usage of IMS data between ERA5 and ERA-I allows to highlight the 10 importance of snow cover analysis and assimilation over the TP. In that region, in-situ data available for the numerical weather prediction community for real time applications is still scarce.

Snow depth
Figure 2 presents a comparison of daily SD over the 5-year period (2009-2013) between the observations, comprising both in-situ station and MW data, and the 4 re-analyses collocated at the station coordinates. The comparison is an average over the 33 stations. The in-situ observations, even in the mean over all available stations, reveal a very thin and rapidly 25 fluctuating snowpack, with episodes of fast build-up and decay, followed by periods void of snow, even in mid-winter. This concurs with previous studies showing that large parts of the TP can remain snow-free even in winter (Basang et al., 2017;Li et al., 2018). A quantitative estimate of the discrepancy between in-situ daily observations and the other datasets as a rootmean-square error (RMSE) is presented in Fig. 4 (left panel). The RMSE is calculated for the two years 2009 and 2010 when satellite MW data is available. We also calculated the temporal correlation matrix between the different SD datasets (Table  30 2), using daily data year-round over the 5-year period, except for the satellite MW data where only the two years 2009 and 2010 are used. It is clearly apparent that, with the exception of MERRA-2, the re-analyses show a regular seasonal cycle, with a snowpack that grows nearly steadily during the cold season and culminates in February or March. This is unlike the in-situ observations, which show rapidly fluctuating snow increases with little snow accumulation throughout winter. In comparison with in-situ observations, MERRA-2 has the best performance among re-analyses for both the RMSE and the temporal correlation, closely followed by JRA-55. The ERA5 re-analysis over-estimates the seasonal maximum SD by a factor of 10. Among the ECMWF family of re-analyses, the older ERA-I is performing best in terms of RMSE.
Nevertheless, the temporal correlation is similar to the one obtained with ERA5, indicating that the newer, higher resolution 5 ERA5 similarly captures the snow variability despite its large positive bias. The MW satellite data is also overestimating SD, as noted by Yang et al. (2015) or Dai et al. (2017). Like the re-analyses, it shows a progressive snow accumulation throughout the cold season. It also poorly captures short-term SD variability (Fig. 2) and its temporal correlation with the insitu data is 0.32, poorer than for the re-analyses, even though calculated over only two years. The RMSE error is nevertheless comparable to ERA-I, and the MW data is able to estimate SDs smaller than 5 cm. 10 It is not surprizing that JRA-55 performs well, since it assimilates SDs from a dozen CMA stations over the TP area and snow cover from satellite MW data (see Appendix). Note that the assimilated MW data is not from the same instrument as the one used in the CAREERI MW data. A key factor in the relatively good performance of ERA-I is the fact that it assimilates SCF from IMS in high-altitude regions, hence also over the TP. On the contrary, as mentioned earlier, assimilation of IMS SCF above 1500m was discontinued in the production of ERA5. Among the differences in the snow 15 treatment between ERA-I and ERA5, the tendency to reduce or remove the snowpack imposed by the IMS observational constraint during assimilation appears the most likely key factor in bringing ERA-I SDs significantly closer to the in-situ observations.

Snow cover and snow maps
Figure 3 presents a comparison of daily SCF over the 5-year period (2009-2013) between the in-situ station observations and 20 the IMS satellite data and the 5 re-analyses, collocated at the station coordinates. Again, an average over the 33 stations is presented. The IMS data is represented by the range between the low and high estimates (see Section 2). The correlation matrix and the RMSE for SCF are provided in Table 2 Li et al. (2018) indicates that the satellite snow cover data readily captures the rapidly fluctuating snowy events. It also provides some confidence in the station data despite the harsh operating conditions for in-situ measurements, and the spatial degradation applied to IMS (from 4 km to 0.25 degree).
It is clearly apparent that ERA5 again considerably overestimates SCF. JRA-55 is much worse than MERRA-2 for SCF, 30 while their performance for SD was similar. As described in the Appendix, the a-posteriori conversion from SD to SCF varies among re-analyses. For JRA-55, a thin 2 cm layer is equivalent to 100 % SCF, hence JRA SCF is persistently high; on the contrary, in MERRA-2, SCF values are small due to the large SD required to have full snow cover (see Appendix for details). While the in-situ station data (e.g., Fig. 3) also displays thin layers of a few cm, they are very transient. Further validation is provided in terms of the annual-mean SCDs. Table 3 compares SCDs among the datasets, both for the 5-year average and for individual years, based on monthly-mean SCD values. The values of SCDs compare well between in-situ observation and IMS. On the other hand, ERA5 and JRA-55 re-analyses largely overestimate SCDs (ERA5 by nearly three times), while ERA-I is closer to the observed values. The temporal correlation coefficients for monthly SCDs tend to be higher than that for the daily SCF or SD. Note the values of SCDs in MERRA-2 are small due to the SCF being below the 5 0.5 threshold used in the SCD definition (see Fig. 3).
Moving to the geographical distribution over the TP region, Fig. 5 shows maps of the 5-year mean SD in January for each of the 4 re-analyses, with the in-situ data embedded in each map. January is chosen to illustrate a mid-winter month when snowfall is relatively common and differences between re-analyses are large. In the southeast TP, where the stations are located, MERRA-2, JRA-55 and ERA-I have SDs comparable to in-situ observations, as was shown in Fig. 2. In the western 10 station-void part, there are also large differences in the snow depth (up to factor 5) among the re-analyses, especially along the arc of the Himalayas and other high mountain ranges. Figure 6 is similar to Fig. 5 Figure 6 also reveals that ERA5 and JRA-55 have an excessive SD over the high mountains of the Western Himalayas compared to satellite MW data, unlike ERA-I with its high-altitude snow 15 cover assimilation. Figure 7 shows maps of the 5-year mean SCF in January for each of the re-analyses, for the two (low and high) estimates from IMS, again with the in-situ SCF embedded in the maps. While there is a good agreement between the in-situ data and the IMS low estimate, as shown earlier at station locations, the overestimation by ERA5 extends over the eastern and western parts of the TP. Nevertheless, some parts of Central Tibet and the Taklamakan desert to the north are snow-free in January in both IMS and most re-analyses. ERA-I agrees much better with the IMS data, while JRA-55 SCF is 20 too high, due to the SD vs. SCF conversion as explained above. On the contrary, the MERRA-2 SCF is exceedingly low and featureless.

Snow cover to snow depth conversion
A first important issue to mention is that the conversion of SD to SCF differs significantly among re-analyses. SCF is 25 perhaps the most important snow-related climate driver for the surface budget, since the short-wave snow albedo feedback is strong on the TP, especially in the spring. The snowpack is also quite thin, so that thermodynamical feedbacks linked to snow thickness are less important. A 100 % SCF might correspond to snow depths ranging from 26 cm to 2 cm, depending on the re-analysis considered. There may also be some dependence on snow density as in the cases of ERA-I and MERRA-2.
Hence, while JRA-55 has excellent performance among re-analyses for SD, the snow cover is exaggerated due to the 30 conversion to 100 % SCF obtained with a small SD threshold value of 2 cm. While MERRA-2 SCF compares best with station data among the among re-analyses (see Fig. 4, right), being close to the low IMS estimate, it is exceedingly low over the TP overall (Fig. 7), given the comparatively high value (26 kg m -2 ) used for snow mass to get a fully covered snow area (Reichle et al. 2017a).

Sensitivity in off-line ERA5-land simulations
We carried out simple sensitivity off-line experiments with the ERA5 land model to shed some light on potential explanations for the large biases in the ERA5 reanalysis, which might be partly due to a missing snow removal process or to 5 excessive snowy precipitation. Gordon et al. (2006) reported that blowing snow events are common in the Canadian Prairies and in the Arctic. As snow is carried out in suspension, it may sublimate. In a model study, Pomeroy and Li (2000) predicted that two-thirds of the snow transported from the surface to the atmosphere sublimates in the Prairie regions, and one-half in the Arctic regions. Brun et al. (2013) used Gordon's parameterization and argued that this was an important process in simulating snow dynamics in Northern Eurasia. Processes like the blowing of snow and the sublimation of the blown snow 10 might be important processes in the windy and dry conditions of the TP. To test the hypothesis that sublimation of blowing snow is an important missing process, a simple parametrisation of blown snow sublimation has been introduced, as described in the Appendix. It has been tested for two wind thresholds for blowing snow initiation. Furthermore, the impacts of changing the threshold in the snow depth to snow cover conversion and of reducing snowfall were tested separately. Other processes such as the impacts of blown snow on surface roughness or of snow drifts were not tested. In addition to the 15 ERA5L-CTRL run described in Sect. 2, we hence performed an additional set of 5 land-only experiments testing (i) the effect of snow sublimation due to blowing snow in BLW and in BLW_L with a lower threshold for blowing snow initiation, (ii) the 100 % SCF threshold, changed from 10 cm to 5 cm in SCF05, and to 20 cm in SCF20, and (iii) the excessive snowfall by reducing snowfall by 50 % in SNF50. These 5 experiments with ERA5-land, as summarized in Table 4 The evaluation of the different budget terms over the 33-station mean snow depth and cover (Table 5) as well as the root mean square errors and the temporal correlation with observations (Table 6). These results indicate that (i) the offline 25 simulation (CTRL) with the same configuration as ERA5 represents closely the errors found in ERA5 as expected, (ii) the effects of snow sublimation due to blowing snow and of the SCF threshold change are negligible, and (iii) artificially reducing snowfall by 50% reduces the systematic biases significantly, and increases the temporal correlation of snow depth as well. The temporal variability of snow cover is captured reasonably well by ERA5 and the other experiments, suggesting that the main dynamic features of snow accumulation and depletion are well represented. 30

Excessive precipitation issue
These results suggest that excessive snowfall might be a key responsible for the large overestimation of snow depth and cover in ERA5 reanalysis. Excessive snowfall over this region is in fact a common bias among climate and forecast models (Su et al., 2013). In fact, the largest relative climate model bias for precipitation over land, globally, is on the TP (Flato et al., 2013). In order to provide further evidence for the precipitation bias, we present in Fig. 8 the 5-year mean seasonal cycles of minimum temperature, maximum temperature, total precipitation rate (i.e. liquid and snowy precipitation), snow cover and 5 snow depth, based on monthly means and averaged over all the 33 stations. Figure 8 reveals that all re-analyses have a cold temperature bias compared to in-situ observations, especially in maximum temperature, which is largely consistent with their respective thick snowpack bias. For example, ERA-I is warmer and closer to the in-situ observations than ERA5, likely due the latter having a higher snow depth. The good performance of ERA-I (and also of the first-generation MERRA) against insitu station temperature data had been noted by Wang and Zeng (2012). All re-analyses have a precipitation excess (Fig. 8c), 10 except MERRA-2 which uses observed precipitation data, including over the TP region. Incidentally, this may explain the excellent performance of MERRA-2 in terms of mean snow depth over the stations (see Fig. 4, left). Palazzi et al. (2013) also reported the precipitation excess in ERA-I, for example. In fact, ERA-I displays a precipitation excess greater than the more recent ERA5, despite having smaller SDs. The improvement in precipitation in ERA5 is likely due to its higher horizontal resolution, allowing better representation of synoptic events. Note that the precipitation seasonal cycle is largely 15 dominated by summer monsoonal precipitation in the southeast TP. While our focus here is on the cold season, when precipitation is much smaller but likely falling as snow, there is still a positive bias in the re-analyses then, indicating excessive snowfall. It must be kept in mind though, that in-situ observations of snowfall are associated with a large uncertainty due to undercatch and other technical challenges.
Hence, a remaining question that must be addressed in future studies is the origin of the seasonal moisture transport to the TP 20 region leading to the excess precipitation in models. While the results of the off-line sensitivity tests do not support a role for sublimation due to blowing snow, it could be that the local turbulent surface winds are much higher than in the re-analyses.
Hence the importance of the blown snow sublimation cannot be fully discounted. We further note that there is a strong regional seasonality in the precipitation pattern over the TP. The wet season precipitation (May-September) accounts for more than 70 % of the total annual precipitation over the south and southeast TP (where stations are located), a region that 25 falls under the influence of the Asian summer monsoon (Maussion et al., 2014). Over the western and northwest parts of the TP on the other hand, the wet season precipitation is less than 300 mm. This region is influenced by strong winter and spring westerlies and transient migratory synoptic eddies embedded in them, the so-called westerly disturbances, which provide a significant portion of the annual mean precipitation, especially over the Western Himalayas (Tiwari et al., 2017). An effective barrier effect, inhibiting large-scale moisture transport over the region, is not captured by any of the models 30 underpinning the re-analyses used here. The impact of having higher model horizontal resolution to resolve the complex topography has to be considered. Lin et al. (2018) carried out simulations for the summer monsoon season with the weather research forecasting model (WRF) at resolutions of 30, 10 and 2 km, and found that the finest resolution (2 km) diminished the water vapour transport to the TP, and the precipitation bias. They also noticed a large improvement when model resolution changed from 30 to 10 km. The insufficient subscale orography variance and orographic drag seems to be a key factor (Zhou et al., 2019). Based on these results, it appears that the highest resolution in atmospheric re-analyses examined here (near 32 km in ERA5) remains insufficient, as demonstrated by the precipitation excess compared to station data in Fig.   8c. 5 Finally, we note that the particular relevance of the western TP snowpack for subseasonal-to-seasonal prediction, due to the limited snowpack persistence over the eastern and central TP, as was already emphasized by Xiao and Duan (2016). Yet, with only one station in our inter-comparison, the western TP snowpack remains the less constrained by in-situ data, and further studies on this issue are warranted.

Summary 10
We have shown that several recent, state-of-the-art re-analyses produce an over-extensive snowpack in autumn, winter and spring over parts of the TP. This is at odds with observational studies revealing that snowfall events are very transient, that the snow cover vanishes rapidly on time scales of days, and that large parts of the TP can remain snow-free in winter (e.g., Basang et al., 2017;Li et al., 2018). The reanalyses that assimilate local in-situ observations or satellite snow-cover or precipitation are better capable of representing the shallow, transient snowpack over the TP region. MERRA-2 and JRA-55 15 have the best performance among re-analyses for snow depth. Considering the family of re-analyses from the ECMWF, we surmise that the underperformance of ERA5 in terms of SD compared to its older counterpart ERA-I, is due to the lack of IMS data assimilation at high altitudes, including over the TP. This issue will be subject of a follow-up publication, using dedicated assimilation experiments. Pending a solution for the common model precipitation bias over the Himalayas and TP, which may require much higher horizontal resolutions than currently used in global re-analyses, improved snow initialisation 20 through better use of observations in the analyses may improve the medium-range to sub-seasonal forecasts.  Data availability. All re-analyses data are publically available: MERRA-2 from NASA Goddard Earth Sciences Data and Information Services Center, ERA5/ERA-I/ERA-land from the European Centre for Medium-Range Weather Forecasts (ECMWF), and JRA-55 from the Japanese Meteorological Agency. The station data has been made available through the Chinese Meteorological Administration (CMA). IMS snow cover is publically available from NOAA, and the long-term MW satellite data is available from CAREERI. 5 Geophys. Res. Lett., 44, 252-260, doi:10.1002/2016GL072033, 2017 Snow is represented as a single bulk layer on top of the soil column with prognostic temperature, mass, density and albedo.
Snow density is constant with depth, increasing exponentially with snow age, and decreasing after snowfall events assuming a constant snowfall density of 100 kg m -3 . Snow albedo reduces exponentially as snow ages over low vegetation or bare soil, but is constant in time for snow under high vegetation, and is reset to a maximum value of 0.85 after snowfall event.
A snow depth analysis is performed using a Cressman (1959) interpolation with successive corrections. Station observations 15 of snow depth are used, however there are no stations used over the TP. Gridded snow cover from IMS is also assimilated since 2004, using the coarser 24-km resolution product as detailed in Drusch et al. (2004). The snow depth analysis is relaxed toward a climatology when observations are unavailable.
Snow cover fraction is a diagnostic variable derived from the relation SCF = min(1, SWE/15) where SWE is the snow mass (kg m -2 ). A layer of 15 kg m -2 represents 100 % snow cover (15 cm 20 depth considering a snow density of 100 kg m -3 , or 5 cm depth if considering a snow density of 300 kg m -3 ).

ERA5
The snow model is also a bulk single layer as in ERA-I, with the same prognostic variables but with several modifications, in particular a diagnostics of snow liquid water content, a revised SCF formulation and snow density evolution (Dutra et al., 2010). A two-dimensional optimal interpolation snow analysis, including the water equivalent, temperature and density of 25 snow is performed using station observations of snow depth and the gridded SCF product from IMS 4-km resolution product (de Rosnay et al., 2014;. Unlike ERA-I, IMS data is not used above 1500 m, i.e. in high altitude regions, which includes the TP. No station data over the TP is used. Unlike ERA-I, the snow depth analysis is not relaxed toward a climatology when observations are unavailable. Snow cover fraction is a diagnostic variable derived from the relation 30 SCF = min(1, SD/10) where SD is the snow depth in cm so that a layer of 10 cm represents 100 % snow cover.

ERA5L-CTRL and dedicated experiments
We made several additional off-line simulations of the HTESSEL model forced by meteorological data from ERA5. The land model is hence the same as used in ERA5 but there is no assimilation of snow or land observations, contrarily to the ERA5 re-analyses. A simple parametrization of snow sublimation due to blowing snow, as proposed by Gordon et al. (2006) and used in Brun et al. (2013), is newly implemented in HTESSEL. The sublimation of blowing snow (Qs, kg m -2 s -1 ) is computed as in Gordon et al. (2006): 5 where A and B are dimensionless constants ( A=0.0018, B=3.6), γ=4, T0 is the water freezing temperature. The formulation requires the lowest model level (at 10 m height) fields of air temperature (Ta), air density (ρa), wind speed (Ua), relative humidity (Rha) and saturation specific humidity (qsa). Ut (m s -1 ) is the threshold for initiation of blowing snow also used by Gordon et al (2006) following Li and Pomeroy (1997) 10 = 6.98 + 0.0033( − 245.88) 2 In the experiments with reduced wind threshold, the first coefficient (6.98) is replaced by 5. This simple parametrization only represents the snow mass removal by sublimation due to the blowing of snow, but it does not account for the change in the energy budget. Hence, the energy required to sublimate the snow is not taken from the surface or from the lower atmosphere. While this approximation would not be valid in a coupled land-atmosphere simulation, it seems appropriate in 15 an off-line simulation. Other experiments tested the impacts of the threshold in snow cover conversion (see SD parameter in the description of ERA5 above) and of reducing snowfall by half. The experiments are summarized in Table 4.

JRA-55
The land surface model in JRA-55 is the Simple Biosphere model (SiB) (Sellers et al. 1986) which represents snow mass on the ground and its evolution by updating several parameters and calculations. A separate optimal interpolation-based snow 20 depth analysis is performed once per day (at 18UTC). The first-guess background state combines the land-surface analysis and Special Sensor Microwave/ Imager (SSM/I) and the Special Sensor Microwave Imager Sounder (SSMIS) snow cover satellite observations. The analysis also considers in-situ observations of snow depth over China from CMA archives, including a dozen stations over the TP (Onogi et al., 2007;Kobayashi et al., 2015).
Snow cover fraction is a diagnostic variable derived from the relation 25 SCF = (min(1,(SD)/2) where SD is the snow depth in cm, so that a layer of 2 cm represents 100 % snow cover.

MERRA-2
The evolution of snow mass, depth and heat content is simulated within each catchment using a three-layer snow model (Reichle, et al., 2017b). Density in each layer evolves via parameterized representations of compaction due to snow overburden and melting/refreezing (Stieglitz et al. 2001). Snow is redistributed among layers as necessary to keep the 30 surface layer shallow enough to respond to diurnal variability. The albedo of snow-covered land surface depends on snow density and vegetation type.         Table 3: Values of the annual means of monthly snow cover days averaged during the whole 5-year period (the 2 nd column) and during 2009 to 2013 (the 3 rd to 7 th columns), respectively (top), for each dataset. Temporal correlation 5 matrix between in-situ observations and re-analyses for monthly SCD during the whole 5-year period (bottom). The SCDs are defined as the days when the SCF is greater than 0.5, and are averaged over the 33 stations.

Experiment Description
ERA5L-CTRL Land-only simulation with the same configuration as ERA5 in the land-surface BLW Including the effect of sublimation due to blowing snow as in Gordon et al. 2006 BLW_L As BLW, but changing the critical threshold for initiation from 6.98 to 5 m s -1 SCF05 Using 5 cm instead of 10 cm threshold for 100% snow cover fraction SCF20 Using 20 cm instead of 10 cm threshold for 100% snow cover fraction SNF50 Reducing the snowfall rate by 50%  Table 5: ERA5 and off-line ERA5-land experiments mean annual snow and associated fluxes averaged over the 33 stations.
The fluxes are the annual mean accumulated: snow sublimation, snow sublimation due to blowing snow, snow melt, snowfall and rainfall intercepted in the snowpack. The last row (OBS) presents the mean snow depth derived from the station 5 data, and the mean snow cover from the IMS satellite data (low estimate).