The ERA5-Land Soil-Temperature Bias in Permafrost Regions

ERA5-Land (ERA5L) is a reanalysis product derived by running the land component of ERA5 at increased resolution. This study evaluates its soil temperature in permafrost regions based on observations and published permafrost products. Soil in ERA5L is predicted too warm in northern Canada and Alaska, but too cold in mid-low latitudes, leading to an average bias of -0.08 °C. The warm bias of ERA5L soil is stronger in winter than in other seasons. Diagnosed from its soil temperature, ERA5L overestimates active-layer thickness and underestimates near-surface (< 1.89 m) permafrost area. This is, in part, due 5 to the shallow soil column and coarse vertical discretization in the ERA5 land-surface model and to warmer simulated soil. The soil-temperature bias in permafrost regions correlates with the bias in air temperature and with maximum snow height. Review of the ERA5L snow scheme and a simulation example point to a low bias in ERA5L snow density as a possible cause for warm-biased soil. The apparent disagreement of station-based and spatial evaluation of ERA5L highlights challenges in our ability to test permafrost simulation models. While global reanalyses are important drivers for permafrost simulation, ERA5L 10 soil data is not well suited for directly informing permafrost research decision making. To alleviate this, future soil-temperature products in reanalyses would require permafrost-specific alterations to the land-surface models used.


Introduction
Permafrost regions occupy more than one fifth of the exposed land area in the Northern Hemisphere (Gruber, 2012) and are subject to important temperature-dependent processes (Cheng and Wu, 2007;Westermann et al., 2009;Schuur et al., 2015;Walvoord and Kurylyk, 2016). Research on permafrost is often impeded by sparse observations and difficult or costly access to study sites (e.g., Ran et al., 2018;Luo et al., 2020). Global simulation products have the potential to be an important source of insight if their suitability can be established. To this end, we investigate the accuracy of soil temperature from the new ERA5-Land (ERA5L) highresolution reanalysis with a focus on permafrost area.
Reanalysis consists of assimilating a broad range of observations into fully coupled process-based models (land, atmosphere, ocean, sea ice, and often biogeochemical components). It is a valuable source of data for permafrost science. Reanalysis products have been successfully used to analyze and simulate various permafrost phenomena at different scales, such as its spatial distribution (e.g., Cao et al., 2019b;Fiddes et al., 2015;Slater and Lawrence, 2013), thermal state (e.g., Guo and Wang, 2017;Koven et al., 2013), active-layer thickness (e.g., Tao et al., 2019;Qin et al., 2017), ground ice loss (e.g., Aas et al., 2019), and carbon release (e.g., Koven et al., 2015). These applications are mostly restricted to the use of atmospheric variables to drive models. Reanalysis-derived soil temperature is rarely used directly due to its coarse spatial resolution (50-150 km) and bias. For example, over the Qinghai-Tibetan Plateau (QTP), Hu et al. (2019) and Yang and Zhang (2018) reported that the root mean squared error (RMSE) of daily soil temperature from different reanalyses (i.e., ERA-Interim/Land, MERRA-2, and CFSR) ranged between 1.8 and 5.1 • C. This error is most often expressed as a cold bias.
ERA5 is the latest reanalysis product produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). Compared to ERA-Interim, it includes new observations and revised processes, such as surface runoff and snow thermal insulation (Hersbach et al., 2020;European Centre for Medium-Range Weather Forecasts, 2018). Cao et al. (2019a) proposed the suitability of ERA5 meteorological data as forcing for permafrost temperature simulation, and Graham et al. (2019) reported the improved performance of ERA5 in high latitudes relative to other modern reanalysis products. More recently, ERA5L was released as an improved land-only complement to ERA5. It incorporates new soil and snow hydrology (Balsamo et al., 2009;Dutra et al., 2010), revised soil thermal conductivity (Peters-Lidard et al., 1998), vegetation seasonality (Boussetta et al., 2013), and bare soil evaporation (Albergel et al., 2012). These improvements are expected to make ERA5L more accurate for many land applications; with a spatial resolution of 0.1 • , ERA5L is the first global reanalysis product at an intermediate spatial scale between Earth system land surface models (e.g., Melton et al., 2019;Chadburn et al., 2015) and statisticaland/or remote-sensing-based permafrost products (e.g., Obu et al., 2019;Karjalainen et al., 2019b).
Here, we evaluate the soil temperature of ERA5L in permafrost regions using observations and other published permafrost data products. We also investigate temperature bias using statistical analysis and numerical simulation at a wellinstrumented location. The objectives of this study are to (1) assess the accuracy of ERA5L soil temperature in permafrost regions and (2) discuss the usability of ERA5L for permafrost research in light of the revealed bias and its potential causes.

ERA5 and ERA5-Land
ERA5 is the latest generation atmospheric reanalysis produced by ECMWF. Data are currently available from 1979 onward, and availability from 1950 onward is planned. ERA5 is produced using four-dimensional variational data assimilation in ECMWF's Integrated Forecast System; it has a horizontal resolution of 0.25 • (31 km), a temporal resolution of 1 h, and a vertical resolution of 137 hybrid sigma model levels. The 37 pressure levels of ERA5 are identical to those of ERA-Interim (Noël et al., 2020). ERA5 assimilates improved input data that better reflect observed changes in climate forcing, as well as many new or reprocessed observations that were not available during the production of ERA-Interim. Unlike other reanalyses, ERA5 also includes an estimate of uncertainty based on a 10-member ensemble with a reduced temporal resolution of 3 h and a spatial resolution of 0.5 • (Albergel et al., 2018).
The new ERA5L product was created by forcing the land component of the model with the atmospheric models but without coupling them. It uses the Tiled ECMWF Scheme for Surface Exchanges over Land with a revised land surface hydrology (HTESSEL; CY45R1; Hersbach et al., 2020). ERA5L is forced by the atmospheric analysis of ERA5, and hence the assimilated observations indirectly influence the simulations. It is delivered at the same temporal resolution as ERA5 and with a higher spatial resolution of 0.1 • . ERA5L is currently available for the period 1981-2018 and will eventually extend back to 1950 and be updated to the present time with little delay. Note that at the time of writing only ERA5L data after 2001 had been released to the public, and so this evaluation is conducted using data between 2001 and 2018.

Snow scheme
A more realistic representation of snow is used in the ERA5 land surface model compared with its predecessor, ERA-Interim. ERA5L uses HTESSEL which treats snow as a single layer above the soil with independent prognostic temperature, mass, density, and albedo (Orsolini et al., 2019). The description of snow processes in HTESSEL is summarized by Dutra et al. (2010) as the following: (1) liquid water with phase changes coexists with ice in the snow pack and is diagnosed from its temperature, mass, and density (Appendix B1); (2) snow density changes according to overburden, thermal metamorphism, and retained liquid water following Lynch-Stieglitz (1994) (Appendix B1); (3) albedo changes exponentially with snow age and is adjusted by vegetation conditions; and (4) snow cover fraction depends on both snow water equivalent (SWE) and density (Appendix B2).

Soil scheme
Soil heat transfer in ERA5L is governed by the Fourier law. The thermal effects associated with latent heat are accounted for by following the method of Rouse (1984). However, soil thermal conductivity depends only on moisture content, and the influence of phase change is not represented. The upper boundary condition is given by a heat flux at the ground surface derived from a weighted average over eight subgrid fractions (or "tiles"). A zero heat flux is assumed at the lower boundary. The ERA5L soil column is discretized into four layers with node depths (layer boundaries) at 0.07 m  Boike et al. (2018Boike et al. ( , 2019a. The permafrost zone of each site was determined based on its location using the digitized circum-Arctic map of permafrost and ground-ice conditions (hereafter referred to as the IPA map; Brown et al., 1997). The observed mean daily soil temperature of these stations ranges from −42 to 38 • C, and the elevation of the sites ranges from 0 to 5500 m. An additional 931 stations in non-permafrost regions were also used for comparison. All the temperature time series were visually checked to remove obvious outof-range values. The mean annual temperature was calculated for sites with data completeness greater than 95 %. Observed active-layer thicknesses (ALTs) were obtained from Peng et al. (2018).

Existing permafrost maps
Four permafrost maps were used as benchmarks to evaluate permafrost area derived from ERA5L soil temperatures. They are (1) the IPA map, which is based on observations and mean annual air temperature (MAAT), (2) the heuristic 1 km global permafrost zonation index (PZI) map from Gruber (2012) (hereafter referred to as the PZI map), (3) the 1 km Northern Hemisphere permafrost map (Obu et al., 2019), which is based on the semi-physical temperature at the top of permafrost (TTOP) model (TTOP map) driven by Moderate Resolution Imaging Spectroradiometer (MODIS) land surface temperature that is filled by downscaled ERA-Interim air temperature, and (4) the 1 km circumpolar permafrost map (CP map), which is derived from a statistical model (Karjalainen et al., 2019a). Whereas ERA5L, TTOP, and CP maps represent permafrost as a boolean variable (i.e., present or absent according to soil temperature), the IPA map and PZI map represent permafrost using either a categorical variable (e.g., continuous, discontinuous, sporadic, or isolated permafrost) or a continuous index (0.01-1) as a proxy to approximately represent the proportion of an area underlain by permafrost (i.e., the permafrost extent). Following Melton et al. (2019), we apply a threshold of 50 % (corresponding to the continuous and discontinuous permafrost zones) and 0.5 for the IPA map and the PZI map, respectively, to allow for meaningful comparison with the other maps. Values greater than this are considered to represent permafrost areas. The mean annual ground temperatures (MAGT) from the TTOP and CP maps were also used to evaluate ERA5L.

Evaluation
For the purposes of evaluation, temperature observations were only used from depths between 0 and 2.89 m, corresponding to the range of the ERA5L soil column. Temperature values were grouped according to their depth in one of the ERA5L soil layers. When this mapping resulted in multiple depths being assigned to a single soil layer, the one nearest to the ERA5L grid center was selected. The ERA5L soil temperatures were nearest neighbor interpolated to each of the observation sites to avoid missing values caused by adjacent water bodies. The mean bias (BIAS), mean absolute error (MAE), and RMSE were used as metrics to compare observations to ERA5L at the station scale (see Appendix A). In the case where multiple sites were located in the same ERA5L grid cell, BIAS, MAE, and RMSE were calculated for each site individually and then aggregated by averaging all stations in each grid cell with equal weight. For the evaluation at ERA5L grid scale, these aggregate metrics (for example, weighted mean bias, wBIAS) were used.
MAAT bias and maximum snow depth (SD max ) were selected as candidate variables to be assessed as possible predictors of ERA5L soil temperature bias (see Eq. 1). SD max was defined as the median of annual maximum monthly snow depth during the period 2001-2018. The surface offset (SO), which quantifies the influence of surface conditions such as snow and vegetation cover (Smith and Riseborough, 2002), is defined here as the difference between MAAT and MAGT of the uppermost soil layer in ERA5L.
ERA5L ALT was derived by linearly interpolating the ERA5L soil temperature-depth profiles. The TTOP and CP map were derived using an equilibrium model, and MAGT is given as an average of the entire period (MAGT avg ). This corresponds to 2002-2014 for the CP map and 2002-2016 for the TTOP map without uniform or specific soil depth. To better evaluate, we aggregated all available observed MAGTs during the period by averaging and then comparing them against the MAGT avg of these two maps. Note that the performance of CP and TTOP maps may be lower here than reported in the original publications due to the fact that we Table 1. Summary of soil temperature observations in permafrost regions, including the total number of stations (N), the temporal coverage and range of temperatures (Coverage), the corresponding ERA5L soil layers and depth range in meters (SL), and a reference for each dataset when available. evaluate them with a different set of observations (different depths, periods, and proportion of sites in mountains). Permafrost in ERA5L is limited to the near-surface due to the shallow simulation depth; consequently, only sites with shallow ALTs (< 1.89 m) are evaluated here. The ERA5L near-surface permafrost area is evaluated using existing permafrost maps. An ERA5L grid cell is considered to be underlain by permafrost if either of the following conditions are true: (1) soil temperature in any of the four soil layers has an hourly temperature below 0 • C for 2 consecutive years (ERA5L H ) or (2) the MAGT of the fourth soil layer is below 0 • C for 2 consecutive years (ERA5L A ).

Detailed permafrost simulation example
Our results show remarkable bias of ERA5L soil temperature in winter that is thought to correlate with snow depth (Fig. 2). For this reason, the suitability of ERA5L soil temperature and the effect of the snow density bias are further investigated using a site-specific simulation example at a densely instrumented location near Lac de Gras (LdG), N.W.T., Canada (Fig. 1a). This simulation provides an opportunity to evaluate ERA5L soil temperature under different terrain (e.g., vegetation, soil properties) and snow conditions. We used GEOtop 2.0 (Endrizzi et al., 2014), a process-based numerical model, to simulate snow characteristics and soil temperature for 10 terrain types between September 2015 and August 2017, as described in detail by Cao et al. (2019a). Snow compaction due to wind effects is considered in onedimension for all terrain types except for the tall shrub site (Pomeroy et al., 1993). The snow correction factor (SCF) is used to scale modeled snow mass via precipitation. It is used as a lumped variable for representing precipitation bias in the driving reanalysis, as well as differences between terrain types that are caused by preferential accumulation and lateral transport by snow drifting. The ERA5 reanalysis and its 10member ensemble are used as forcing data for the simulation.    (Table 2). Soil temperature is found to have a warm bias in western Canada and Alaska but a cold bias in mid-low latitudes such as the QTP, leading to a near-zero wBIAS of −0.08 • C (Fig. 3). Among the 932 MAGTs from 331 ERA5L grid cells, 20.7 % have an RMSE less than 1 • C, 53.5 % have an RMSE less than 2 • C, and 68.9 % have an RMSE less than 3 • C. The following linear model was used to predict ERA5L soil temperature bias in permafrost regions using MAAT bias and snow depth as predictor variables: where wBIAS MAAT is the weighted bias of MAAT. The model was fit using 239 grid cells and has an R 2 of 0.47.
Both predictors were found to be statistically significant with p < 0.01. The result suggests that MAAT and snow depth both influence ERA5L soil temperature. An increase of 1 • C in MAAT wBIAS corresponds to an increase of 0.76 • C in ERA5L MAGT wBIAS, and an increase of 1 m in wSD max (weighted maximum snow depth) corresponds to an increase of 0.77 • C in wBIAS. The overall wRMSE of SO is 1.94 • C, and wBIAS is 0.21 • C. These results are comparable to those obtained for the land surface scheme (JULES) of the UK Earth System Model (UKESM) (Chadburn et al., 2015). Averaged MAGTs from the CP and TTOP map were bilinearly interpolated to the observed sites and compared against the observations of the deepest soil layer. We found that the performance of ERA5L is intermediate between the two maps (Table 2). Whereas Karjalainen et al. (2019b) found that the predictive accuracy of their statistical model was similar between permafrost and non-permafrost regions, our results show ERA5L and TTOP soil temperature agree less with observations in permafrost regions than in nonpermafrost regions (Table 2; Fig. 3). In addition to the worse performance of MAAT in these regions, the result suggests that HTESSEL may be less suitable for soil temperature simulation in areas with more prevalent snow and soil freezing. The large warm bias of ERA5L soil temperature during winter (Fig. 2) further supports this notion.

Active-layer thickness and permafrost distribution
While ERA5L is not capable of representing deep ALT, our results show that even for shallow ALT grids the mean ERA5L ALT (1.67 m) was more than twice the mean observed ALT (0.82 m) (Fig. 4). ERA5L ALT is substantially overestimated for most (72 of 79) of the grids with wRMSE values up to 0.98 m. Excluding glaciers, the mean nearsurface permafrost area of the Northern Hemisphere was estimated as 6.6±0.6×10 6 km 2 based on hourly soil temperature and 9.9 ± 0.5 × 10 6 km 2 based on MAGT during 2002-2018 (Table 2; Fig. 5). ERA5L underestimates permafrost area compared to previous estimations (e.g., Brown et al., 1997;Gruber, 2012;Obu et al., 2019;Karjalainen et al., 2019b). Near-surface permafrost area of ERA5L as defined in this study decreased at a rate of −0.11 (−0.08) × 10 6 km 2 yr −1 based on hourly (annual) mean soil temperature. This corresponds to a loss of 1.7 (1.4) × 10 6 km 2 of near-surface permafrost area since 2002. The snow characteristics of GEOtop are derived using SCF equal to 1. The range in parentheses represents SCFs between 0.30 and 1.62 depending on the exact value used for each different terrain type in Fig. 6.

Detailed permafrost simulation example
The detailed example simulation indicates that ERA5L soil temperature has a warm bias (from 0.95 to 5.48 • C) in all terrain types, whereas GEOtop forced by ERA5 and its 10 ensemble members show more reasonable results even when SCF equals 1 (Fig. 6). More specifically, ERA5L was only found to be suitable in terrain types with significant snow deposition (e.g., in snowdrifts, tall shrubs, and sedge fen). For all other terrain types, ERA5L showed a significant warm bias during winter and, consequently, in the annual mean. Although the ERA5L results for SWE agreed with GEOtop when driven by the same data (SCF = 1), the mean snow depth was approximately 1.53 times that of GEOtop, and the snow density was much lower (Table 4).

Suitability of ERA5L soil temperature
ERA5L has a number of advantages for permafrost research; it provides a long historical record (back to 1950, eventually), high spatial resolution, and global coverage. While it could be seen to provide an opportunity to study long-term changes in permafrost at an intermediate scale (∼ 9 km) without additional model simulation, our results indicate that significant bias in ERA5L soil temperature limits its utility for permafrost research.
Compared to a coarse-grid (∼ 2.8 • ) simulation (Fig. 4 from Melton et al., 2019), ERA5L often has more reasonable results in its deepest soil layer despite the fact that fewer permafrost-specific physics are included in the HTESSEL. The results of ERA5L are generally worse in the shallow soil layers (Fig. 2). ERA5L does not reproduce ALT well (Fig. 4) likely due to its shallow soil column, coarse vertical discretization, warm bias in soil temperature, and lack of phase-dependent thermal conductivity in soil. Furthermore, ERA5L shows a remarkable underestimation of total permafrost area (Table 3; Fig. 5) when compared with previous estimates. An explanation for this is that large ALTs (i.e., > 1.89 m) that frequently develop in midlatitude mountains (e.g., Zhao et al., 2010;Cao et al., 2017) cannot be represented by the shallow soil column of ERA5L. While this  could contribute to an underestimation of permafrost area on the QTP, where observed ALT is generally large, we observe a simultaneous cold bias in soil temperature which counteracts the first effect. Our results indicate that the cold bias of ERA5L in mid-low latitudes is highly aligned with the MAAT bias (Fig. 1). This is also suggested by the linear model (Eq. 1). On the other hand, ERA5L underestimates permafrost area in Canada and Alaska despite the observed ALT there being mostly low. This is because the ERA5L soil temperature in western Canada and Alaska is too warm with a wBIAS of about +1.5 • C.
Loss of permafrost is an expected consequence of a warming atmosphere. While the loss of near-surface permafrost area derived from ERA5L is similar to previous land surface model simulations (Lawrence et al., 2008;Slater and Lawrence, 2013), the absolute numbers and the rate of loss have little value for further interpretation because the permafrost area has a pronounced bias to begin with and because its temporal dynamics are known to be poorly represented with a shallow soil column and are likely affected by an inadequate representation of snow. Furthermore, because permafrost extent is a variable that cannot be observed, we fundamentally lack possibilities for proper validation (Gruber, 2012).

Model evaluation with sparse data
Looking exclusively at summary statistics from 242 sites in 209 grid cells would misleadingly show that ERA5L has a relatively good ability to represent the thermal state of permafrost. For example, consider that ERA5L outperformed the TTOP map in all evaluation metrics (Table 2). However, its simulated permafrost area is visibly low when plotted on a map (Fig. 5). These contradictory findings can be reconciled because of the warm bias at high latitudes and cold bias in midlatitudes which cancel each other out based on the available observations (Fig. 3). Clearly, an improvement in summary statistics alone is not a sufficient criterion for superior model performance. Notably, the International Permafrost Association action group "Specification of a Permafrost Reference Product in Succession of the IPA Map" reported in 2016 that, in order to make progress, we needed the capability to measure whether a new map or model output was of superior quality compared with an old one. For this, they recommended that the permafrost community develop and provide the necessary data, methods, and standards (Gruber, 2016).

Scale effects
Even for a small area within a single grid cell of Earth system models or reanalyses (10-100 km), evaluation with point observations remains difficult. This is demonstrated by our simulation example at LdG; within an area of about 20 km × 30 km, MAGT and SO can vary by almost 7 • C based on plot sizes on the order of 15 m × 15 m (Gruber et al., 2018). This is important for two reasons. First, the results from statistical evaluations of coarse-scale products such as ERA5L depend significantly on the local selection of observation sites. This issue is known as the spatial effect which is when the lack of spatially distributed measurements consistent with the size of model grid cells (i.e., 0.1 • in ERA5L) is a potential source of error for model evaluation (Gupta et al., 2006;Gubler et al., 2011). Second, ERA5L ground temperatures can only represent at best a small fraction of the area within each individual grid cell.
Consequently, their value as part of a permafrost climate services system for informing local decision making (e.g., for adaptation) is limited.

Snow densification and heat transfer
The seasonal ERA5L soil temperature deviance (Fig. 2a) and linear model (Eq. 1) show a remarkable bias toward high soil temperature in winter that is correlated with snow height. While we do not imply that the GEOtop simulations are correct or accurately represent metamorphism in Arctic snow (see Domine et al., 2019), they do demonstrate that simulations with snow cover of similar mass but different density are able to match ground temperature observations far better than ERA5L. Since snow thermal conductivity can be described as an exponential function of its density (Eq. B12), the low-biased snow density of HTESSEL would contribute to a much lower snow thermal conductivity. With the same SWE, a low bias in snow density implies a high bias in snow depth. In this context, the temperature gradient and hence the heat flux through the snow pack are both reduced. Using the mean snow density in Table 4 as an example, a snow density of 75 % would reduce ground heat loss through the winter to about 44 %. Even though this represents only one local case study at LdG, it sheds light on a possible cause of the ERA5L soil temperature bias in cold regions more broadly. Interestingly, HTESSEL and GEOtop both use the same exponential formulation of snow thermal metamorphism proposed by Anderson (1976) but with different parameters. HTES-SEL uses a value of 460 m 3 kg −1 for c ξ , a parameter controlling change in snow density due to thermal metamorphism (Eq. B5) (Dutra et al., 2010). This value is 10 4 times greater than the value for c ξ in GEOtop (Endrizzi et al., 2014) and Anderson (1976). Consequently, for snow densities greater than 150 kg m −3 , the change rate (s −1 ) related to thermal metamorphism remains near zero in HTESSEL. While this may explain, at least in part, the bias in ERA5L snow density and soil temperature, it is unknown whether the excessively high value for HTESSEL is merely an error in the publication cited or whether it reflects the value in the code. An additional contribution of GEOtop to higher snow densities in tundra environments may be the effect of blowing snow (cf. Pomeroy et al., 1993).

Implications
While global reanalyses provide urgently needed meteorological drivers for permafrost simulation, their soil data are not well suited for directly informing permafrost research or local adaptation decisions. As such, simulations using permafrost-specific land surface models driven by reanalyses (Cao et al., 2019a;Fiddes et al., 2015) will likely be increasingly important in the provision of permafrost climate services. Making future soil temperature products like ERA5L directly usable will require significant permafrost-specific al-terations in model design, especially with respect to snow cover and the total depth of the ground representation for the land surface models that are used. If indeed the value of the parameter c ξ in the snow metamorphism of HTESSEL is in error, then this would be an easy improvement.

Conclusion
Our results support five conclusions.
1. ERA5L soil temperature has a warm bias at high latitude and a cold bias in mid-low-latitude, high-elevation areas. The soil temperature bias in permafrost regions correlates with bias in air temperature and with maximum snow height. Seasonally, soil temperatures in winter are more strongly warm biased than in other seasons. With more prevalent snow and ice, ERA5L soil temperatures match observations less well in permafrostaffected regions than in non-permafrost conditions.
2. Permafrost area is strongly underestimated when derived from ERA5L soil temperature, and its temporal trend cannot be interpreted with confidence due to the bias in absolute area, as well as model limitations.
3. Active-layer thickness is overestimated when derived from ERA5L soil temperature. This is due to the warm bias in simulations, as well as the shallow soil column and coarse vertical discretization used.
4. ERA5L snow density is hypothesized as having a low bias, at least in high-latitude areas, explaining part of the warm bias in soil temperature.
5. Summary statistics comparing ERA5L with other spatial permafrost data based on their skill in reproducing observations do not agree with a geographic comparison of permafrost zones that are known to exist with some (albeit difficult to quantify) confidence. Whereas ERA5L performs well in the statistical evaluation, it severely underestimates permafrost area especially in Canada and Alaska. This highlights the remaining challenges in developing data and procedures for testing permafrost simulation models and products.
Appendix A: Evaluation metrics where T obs is observed soil temperature and T mod is the temperature from ERA5-Land soil temperature, GEOtop, or the literature.
Appendix B: Snow scheme of HTESSEL

B1 Snow densification
Snow density ρ s (kg m −3 ) is constrained to be between 50 and 450 kg m −3 . The compaction of snow density, or change rate (s −1 ), is parameterized as where the first term represents overburden, the second term is thermal metamorphism (Anderson, 1976;Boone and Etchevers, 2001), and the last term is the influence of liquid water in snow (L s ; kg m −2 ) following Lynch-Stieglitz (1994). W s (Pa) is the pressure of overlying snow mass or snow water equivalent (SWE; m) , and η (Pa s −1 ) is the viscosity coefficient of snow.
where g is the acceleration of gravity of 9.807 m s −2 . Snow viscosity is described as a function of snow temperature (T s ; K) and density following Anderson (1976): where η 0 = 3.7 × 10 7 Pa, a η = 0.081 K −1 , and b η = 0.018 m kg −3 . T D (K) is the depression temperature: The change rate in ρ s related to thermal metamorphism is parameterized as where a ξ , b ξ , and c ξ are constant values of 2.8×10 −6 s −1 , 0.042 (unitless), and 460 m 3 kg −1 derived or modified from Anderson (1976) and Jordan et al. (1999). β s (kg m −3 ) is given as where ρ ξ (kg m −3 ) is equal to 150 kg m −3 . L s is diagnosed from snow temperature, SWE, and snow density: where f (T s ) is the snow temperature function and L c s is the liquid water in snow capacity (kg m −2 ).
where T f is 273.16 K. L c s is parameterized as a function of SWE and β s : where r min l and r max l are constant values of 0.03 and 0.1, and C is given as where β l s is 200 kg m −3 .

B2 Snow cover fraction
Snow cover fraction (SCF) can be given as where SD cr , the minimum snow depth that ensures complete coverage of the grid box, is set as 0.1 m.

B3 Snow thermal conductivity
By following Douville et al. (1995), the snow thermal conductivity λ s is treated as a function of snow density: where λ i is the thermal conductivity of ice of 2.2 W m −1 K −1 and ρ i is ice density of 920 kg m −3 .
Code availability. The Python script for downloading ERA5-Land is developed and provided through an application programming interface (API) request by ECMWF Climate Data Store (CDS) service and is available from the Supplement.
Author contributions. BC carried out this study by analyzing data, performing the simulations, and organizing and writing the paper and was responsible for the compilation and quality control of the observations. SG proposed the initial idea and contributed to organizing as well as writing the paper. XL and DHZ contributed to the writing of the paper.
Competing interests. The authors declare that they have no conflict of interest.
Acknowledgements. The authors thank Joe Melton for his helpful comments. We thank Jaroslav Obu for the guideline of the TTOP map, Kang Wang for the introduction to soil temperature datasets over Alaska, Frank E. Urban and Gary D. Clow for providing access to the NPS datasets, and Xiaoqing Peng for providing the activelayer thickness datasets. ERA5-Land reanalysis data and the ESA CCI LC map are provided by the ECMWF.
Financial support. This research has been supported by the Strategic Priority Research Program of Chinese Academy of Sciences (grant nos. XDA20100000 and XDA20100104), the NSERC Permafrost Partnership Network for Canada (NSERC Permafrost-Net) and the China Postdoctoral Science Foundation (grant no. 2019M660046).
Review statement. This paper was edited by Chris Derksen and reviewed by three anonymous referees.