Articles | Volume 19, issue 12
https://doi.org/10.5194/tc-19-6547-2025
© Author(s) 2025. This work is distributed under the Creative Commons Attribution 4.0 License.
Assessing uncertainties in modeling the climate of the Siberian frozen soils by contrasting CMIP6 and LS3MIP
- Final revised paper (published on 05 Dec 2025)
- Preprint (discussion started on 10 Feb 2025)
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2025-389', Anonymous Referee #1, 13 Feb 2025
- AC2: 'Reply on RC1', Zhicheng Luo, 01 May 2025
- RC2: 'Comment on egusphere-2025-389', Anonymous Referee #2, 04 Mar 2025
- AC1: 'Reply on RC2', Zhicheng Luo, 01 May 2025
- RC3: 'Comment on egusphere-2025-389', Adrien Damseaux, 06 Mar 2025
- AC3: 'Reply on RC3', Zhicheng Luo, 01 May 2025
- AC4: 'Reply on RC3', Zhicheng Luo, 01 May 2025
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
ED: Reconsider after major revisions (further review by editor and referees) (17 May 2025) by Philipp de Vrese
AR by Zhicheng Luo on behalf of the Authors (28 Jun 2025)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (22 Jul 2025) by Philipp de Vrese
RR by Anonymous Referee #2 (08 Aug 2025)
RR by Adrien Damseaux (19 Aug 2025)
ED: Reconsider after major revisions (further review by editor and referees) (22 Aug 2025) by Philipp de Vrese
AR by Zhicheng Luo on behalf of the Authors (03 Oct 2025)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (22 Oct 2025) by Philipp de Vrese
RR by Adrien Damseaux (24 Oct 2025)
RR by Anonymous Referee #2 (05 Nov 2025)
ED: Publish as is (05 Nov 2025) by Philipp de Vrese
AR by Zhicheng Luo on behalf of the Authors (11 Nov 2025)
This manuscript presents an analysis of four key variables from the LS3MIP simulations in comparison to their counterpart CMIP6 fully coupled simulations. Detailed analysis of this very important MIP has been lacking so far, and it is very encouraging to see the simulations used. Restriction of the analysis to the eastern Arctic permafrost region is well argued with the availability of comprehensive observations. The manuscript presents sound methodological analysis, and can become an important contribution to understanding the intricacies of land model performance in coupled model setups.
The methodology used in the manuscript is appropriate, well applied and mostly well presented. However, the rest of the paper lacks structure, seems written rather carelessly (citations that don’t contain the information they claim to give, sentences that do not work) and does not present conclusions promised in the introduction in a comprehensive way. While I think in general this manuscript can become a valuable contribution to the interpretation of the LS3MIP simulations, it needs substantial improvement in many parts.
I have two general comments below, one on the use of reanalysis data and one on the lack of a proper discussion, followed by a large number of specific comments on the text.
General comments:
ERA5-Land: The presentation of the ERA5 data in the manuscript seems pointless. The purpose of the manuscript is not to evaluate ERA5-Land against station data. ERA5-Land is also not used for additional validation of the model simulations on a larger spatial scale than the station observations allow. I suggest removing the ERA5-Land contributions from the manuscript, or actually making use of them within their limitations (which would then warrant their evaluation against the station data).
Discussion: The manuscript lacks a proper discussion of its results in a comprehensive way, which also leads to conclusions that seem to have no basis. In the introduction, you state that: (1) "We will analyze the discrepancies between the same model in CMIP6 and LS3MIP to quantify the bias and uncertainty present in frozen soil regions, attributing them to land surface models versus those resulting from atmospheric forcings. With identical and more realistic atmospheric conditions, we anticipate that the LS3MIP models will more accurately simulate soil conditions. If these models fail to produce soil variable outputs that align better with observed data than the CMIP6 simulations, it is regarded as an error in the land surface models." (2) "We will discuss the variations among different models of LS3MIP and try to establish a connection between model performance and their specific features." However, the manuscript ends after the presentation of the results, without coming back to the analysis you promise in a comprehensive way. Since you do not have a discussion section in the manuscript, the conclusions need to contain this discussion (or you need to add a discussion section). Please come back to both points from the introduction, and establish conclusions for both points rooted in your results in an understandable way. Right now, some conclusions and discussion are scattered throughout the results, but it is hard to piece them together into a coherent picture.
Specific comments:
1 Introduction
The introduction loosely strings together statements on Arctic climate change and its impacts on the Siberian permafrost region, then jumps to factors that determine permafrost thermal state, and finally barely introduces CMIP6 and LS3MIP. Without knowing all of these things already, and how they are related to the uncertainties eg in the permafrost carbon feedback from climate model projections of the future, it does not tell the reader much, and does not coherently argue the importance of this study. Please outline a clear relation between the facts mentioned in the introduction and the statements about what the paper means to do from the last two paragraphs. I suggest rewriting the introduction completely. In addition, I have a number of specific comments on the introduction below.
Line 20: This is rather vague, please give specific numbers to the magnitude of Arctic Amplification, and cite their sources.
Line 20: While I don't doubt the numbers for Arctic climate change cited from the two papers, and I acknowledge that they are permafrost related publications, these aren't the papers that produced the numbers, and they are seriously outdated. Please cite more recent publications on climate change projections for the Arctic, and cite the direct sources.
Line 22: While it is certainly true that the most distinct impacts occur in the permafrost areas where temperatures are already close to zero, the statement seemingly has no relation to your manuscript, since you focus your detailed analysis on cold regions, not the warmer edge of the permafrost zone, so I don't see the relevance of that statement.
Line 27: Again, this is very vague and lacks an appropriate reference. Please clarify.
Line 28: The point about abrupt thaw is that, from models, we can usually only estimate the carbon emission effects of gradual thaw, but the effects of abrupt thaw are expected to be substantially larger than those of gradual thaw. However, instead of saying that, you simply line up facts with no connection or argument. Please rephrase, and cite appropriate sources.
Line 33: There needs to be at least one general, bridging sentence on how heat transfer through the soil is simulated, and that the following paragraph speaks about modelling.
Line 35: “There are differences in the time scales of major physical processes between the soil and the atmosphere.” Vague, please clarify what you mean.
Lines 33-42: These two paragraphs are a weird mix of processes and conditions controlling heat transfer through the soil, and how these are represented in models. Please separate clearly.
Line 43: Please state what CMIP6 means.
Line 46: There are a number of papers that describe these advances that should be cited here.
Line 47: Please state what LS3MIP means. Also, introduce what LS3MIP aims to do before you dive into the protocol.
2.1 CMIP6 and LS3MIP Simulations
Line 75: Land models treat input data differently, and may require different forcing data sets per se. A table would be nice, in particular since you look at tas, which can be close to/identical to the forcing, or quite different, depending on model setup.
Line 89: Ménard et al only show snow properties in their paper. The way you cite the paper implies all information in your table can be found there, which is not the case. Please clarify.
Line 89-90: This is very vague again, and the table misses some of the processes mentioned here. E.g., how is vegetation represented, are there Arctic-specific vegetation types, are there shrubs? Please clarify and expand your table.
Line 96: I find this sentence misleading, it implies that models that consider the impact of surface organic matter with a focus on hydro-thermodynamics don't include a carbon cycle, which is for example wrong for CLM5. Please rephrase.
Table 2: Power Function and Quadratic Equation: What does that mean? Either explain somewhere, or use a more descriptive term. What does snow conductivity depend on in these equations?
3 Results and Discussions
3.1 Winter 2-m Temperature in Target Area
Line 152: The definition used in Lawrence and Slater is the generally accepted definition of permafrost. Quite a number of the stations denoted as circles are actually situated on permafrost. Please explain potential reasons why they are not categorized as permafrost using this definition on the station data.
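To make the request concrete, the commonly stated form of this definition (ground remaining at or below 0 °C for at least two consecutive years) can be applied to station data as in the minimal Python sketch below. The function names, the dictionary layout and the 24-month run length are assumptions made for illustration, not the manuscript's implementation. The sketch also shows that the outcome depends on which measurement depths enter the test.

```python
def longest_frozen_run(monthly_temps_c, threshold_c=0.0):
    """Length (in months) of the longest run with temperature <= threshold."""
    longest = current = 0
    for temp in monthly_temps_c:
        current = current + 1 if temp <= threshold_c else 0
        longest = max(longest, current)
    return longest


def has_permafrost(monthly_tsl_by_depth, min_months=24, threshold_c=0.0):
    """True if any depth stays at or below the threshold for min_months in a row.

    monthly_tsl_by_depth: hypothetical layout {depth_m: [monthly mean tsl in deg C]}.
    """
    return any(
        longest_frozen_run(series, threshold_c) >= min_months
        for series in monthly_tsl_by_depth.values()
    )


# Synthetic three-year example: the shallow layer thaws every summer,
# the deep layer stays frozen throughout.
shallow = [-15, -14, -8, -2, 3, 8, 10, 7, 2, -4, -10, -14] * 3
deep = [-3.0] * 36

print(has_permafrost({0.2: shallow}))             # shallow depth only
print(has_permafrost({0.2: shallow, 3.2: deep}))  # with a deep sensor included
```

In this synthetic case the station is classified as permafrost only when the deep series is included, which is one way a station on permafrost could end up outside the category.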
Figure 1: The two triangle stations are very hard to see. In general, the figure would convey more information if the stations were colored by bias in comparison to the modelled data instead of their own mean states. Also, it would be useful to show the permafrost boundaries from either Brown et al or Obu et al in the map.
Figure 2: Bars need to be broader, median positions are not visible. Also, the labels have no positions, which makes them meaningless. For precipitation and tas, we could learn a lot from seeing where GSWP3 is, since it is the forcing data.
3.2 Model climatologies
Line 160: Looking at figure 2, I don't see that.
Line 162: This statement is only true for pr. The LSMs compute their own tas. How close that actually is to the forcing depends a lot on what forcing is used (eg temperature at a reference height, or 2m air temperature itself), and on how complex the calculation within the LSM is.
Line 169: What about snow, soil moisture, vegetation? There is a distinct difference between soil temperatures in general and TTOP, which refers to (1) mean annual temperatures and (2) the top of the permafrost table.
Line 171: What does model family mean? Is it based on similarity of the atmospheres, or based on the atmosphere and land components? In your example, both land and atmosphere components of the models you put into one family actually share code history, but since you do not even state if you refer to the LS3MIP or the CMIP6 simulations, the statement is unclear.
Line 179: What is the reason for this difference in snow? Precipitation is similar, at least for winter, and air temperatures differ, but are so far below zero that the difference seems irrelevant. What drives this? Precipitation and temperature in autumn? And why does it only occur for this one model?
Line 180: ±10 cm translates into a relative error of around 33%, which is massive! Please put into context.
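For context, the figure of around 33% corresponds to a mean snow depth of roughly 30 cm; that depth is an assumption inferred from the percentage, not a value quoted from the manuscript. A minimal Python sketch with a hypothetical helper name shows how the same absolute error scales with the mean depth:

```python
def relative_error_percent(abs_error_cm, mean_depth_cm):
    """Absolute snow-depth error expressed as a percentage of the mean depth."""
    if mean_depth_cm <= 0:
        raise ValueError("mean snow depth must be positive")
    return 100.0 * abs_error_cm / mean_depth_cm


# Assumed mean depths in cm; 30 cm is the value implied by the ~33% figure.
for depth_cm in (20, 30, 50):
    print(depth_cm, round(relative_error_percent(10, depth_cm), 1))
```

The shallower the mean snowpack at a station, the larger the share of the signal that a ±10 cm error represents.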
3.2.1 Relative Spread and Relative Bias
Figure 3: Caption states you show all seasons, yet there is only winter and summer in the figure. Also, why is snow in winter similar between L and C even though precipitation differs considerably? Because the medians are the same, and that is what drives snow variability? Please expand.
Figure 4: There is no shading. Correct the caption. Also, as Figure 3, this is not showing all seasons.
Line 186: In general, it is really hard to understand the summer parts of Figures 3 and 4 without an equivalent to figure 2. Maybe provide a summer version of figure 2 in the supplement. Specifically, I think this is meant to read "contrary to JJA where" or something similar. The sentence does not make sense as it is.
Line 193: “The pr in Group C exhibits more extensive group diversity than in Group L.” Which is because in group L, the only difference between the different models is different interpolation of the forcing data set, which makes this statement meaningless.
Line 198: “the model’s bias is considered relatively small” I would suggest to rephrase that into something like "the model's performance is considered adequate", because if the IQR is big enough, very big relative biases could still lead to RBs around 1. In terms of model performance, because you only look at 30 years of data, I agree that this means model performance is adequate, however, the bias would not be small.
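The point about the IQR can be illustrated with a small Python sketch. It assumes, purely for illustration, that the relative bias is the absolute difference of the medians divided by the interquartile range of the observations; the manuscript's exact definition may differ, and all names and numbers below are hypothetical.

```python
from statistics import median, quantiles


def iqr(values):
    """Interquartile range using the default quartile method of statistics.quantiles."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1


def normalized_bias(model_values, obs_values):
    """Assumed definition: |median(model) - median(obs)| / IQR(obs)."""
    return abs(median(model_values) - median(obs_values)) / iqr(obs_values)


# Two synthetic observation sets with the same median but different spread.
obs_narrow = [28, 29, 30, 31, 32]
obs_wide = [10, 20, 30, 40, 50]
model = [38, 39, 40, 41, 42]  # median is 10 units above both observation sets

rb_narrow = normalized_bias(model, obs_narrow)
rb_wide = normalized_bias(model, obs_wide)
print(rb_narrow, rb_wide)
```

The absolute bias is identical in both cases, yet the normalized value falls below 1 only for the wide observation set, which is why "adequate performance" is the safer wording.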
Line 200: “Almost all CMIP6 and LS3MIP models have a positive pr-bias but a smaller relative and non-systematic snd-bias in winter.” Since snow is not a pure winter phenomenon and snow build-up starts in autumn, I am not sure how much meaning this comparison has. This analysis needs to be extended to snow build-up in autumn.
3.2.2 Spatial Heterogeneity
Figure 5: I find this figure extremely irritating. Please fix the orientation and re-sort the panels, for example into eight rows and two columns, so that the figure can be read. Also, for tas, the spread in the CMIP ensemble is bigger than the spread in the LS3MIP ensemble. For tsl, it is the other way around. Why? Please expand in the manuscript text.
Line 218: Why would there be a compensating effect like that? The ensemble spread is not particularly strong in your figure. Please explain.
3.3 Permafrost Region
Figure 6: It is impossible to read the labels. If all variables are to be presented in one Taylor diagram, they need to be distinguishable.
3.4 Climate Dependency of Modeled Temperatures
Line 245: In the figure caption, it says 50th quantile, i.e. the median, instead of the mean, which actually makes more sense. Please check.
Line 268: I think this needs to read “Four models …”
Line 270: The reference is misleading, Dutch et al 2022 only discuss simulations with CLM. Please correct.
Line 272: “There is an excessively low tsl shown in Fig.8, possibly due to insufficient geothermal (functions as upward energy flux from the bottom of soil columns). As the decrease in tas has a limited influence on the tsl through high snd, the main source of error is likely from the other side of energy transportation (thermal conditions in the bottom of the soil column).” If that was true, models that consider a non-zero flux condition at the lower boundary would have to perform better than those with zero flux conditions, which is not the case. The depth of the column plays an important role here, as eg discussed in Alexeev et al, 2007 https://doi.org/10.1029/2007GL029536 and more recently Hermoso de Mendoza et al (2020), https://doi.org/10.5194/gmd-13-1663-2020.
Line 275/276: What about the strong underestimation of variability in summer? What is the reason for that?
Line 285: “In contrast, …” I don’t understand that sentence. Please reformulate.
Line 292: That is a really important statement, it should be explicitly taken up in the conclusion, and the implications should be discussed!
3.5 Snow Insulation
Figure 10: It would be really useful to have horizontal grid lines (maybe in light grey) in the figures so the reader can better understand how close to the observed values models are.
Figure 10: CESM: This actually looks a lot better than what is shown in Burke et al, 2020, for just winter. I wonder why.
Line 297: From your figure caption, I assume that you use monthly mean values from all months, not just the winter months, for your plot. However, I assume the classification is still based on the DJF 30 year average of the station?
Line 303: “under sufficiently thick snow, the tsl gradually convergences near 0 °C and is primarily impacted by tas in a limited manner.” I don’t understand that statement. Please reformulate.
Line 325: “UKESM1.0-LL consistently demonstrated similar snow insulation effects in both ensembles” From just looking at the figure, so does HadGEM, which is not surprising since the land models are similar. MIROC and IPSL also have very similar curves regardless of the forcing. Please quantify your distinction in model performance.
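One possible way to quantify this, sketched here in Python under stated assumptions: take the snow insulation curve as the mean difference tsl minus tas per snow-depth bin, and compare the curves of the same model under the two forcings with a root-mean-square difference. The bin edges, function names and sample values are hypothetical, and this is not necessarily the metric used in the manuscript.

```python
from statistics import mean


def insulation_curve(snd_m, tsl_c, tas_c, bin_edges_m):
    """Mean (tsl - tas) per snow-depth bin; None where a bin has no samples."""
    bins = [[] for _ in range(len(bin_edges_m) - 1)]
    for snd, tsl, tas in zip(snd_m, tsl_c, tas_c):
        for k in range(len(bins)):
            if bin_edges_m[k] <= snd < bin_edges_m[k + 1]:
                bins[k].append(tsl - tas)
                break
    return [mean(b) if b else None for b in bins]


def curve_rmsd(curve_a, curve_b):
    """Root-mean-square difference over bins that are populated in both curves."""
    pairs = [(a, b) for a, b in zip(curve_a, curve_b)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no snow-depth bin is populated in both curves")
    return (sum((a - b) ** 2 for a, b in pairs) / len(pairs)) ** 0.5


# Tiny synthetic sample: snow depth (m), soil and air temperature (deg C).
edges = [0.0, 0.2, 0.4]
curve = insulation_curve([0.1, 0.1, 0.3], [-10, -12, -5], [-20, -20, -25], edges)
print(curve)
print(curve_rmsd(curve, [12.0, 24.0]))
```

A single number per model of this kind would make it possible to state whether UKESM1.0-LL really differs from HadGEM, MIROC and IPSL in how similar its two curves are.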
Line 329: “Despite cold conditions, an increase in snd still affects the snow insulation effect of LS3MIP CESM2.” I don’t understand the statement. Please reformulate.
Line 336: I cannot follow this statement. Both in Wang et al 2016 and Burke et al 2020, previous versions of CLM5 (CLM45 stand alone in Wang et al, CLM4 in the CMIP5 analysis of CESM1 in the supplement of Burke et al) clearly outperform CLM5 with regard to the snow insulation curve. Please explain further what your statement is based on.
3.6 Impact of Land Model Features on Performance
Line 343: “show good performance in reproducing accurate snd” Actually, in Figure 2, the observed median value for snow depth is within the interquartile range of only one (!) model in the LS3MIP forced simulations that supposedly do not suffer from biased precipitation. I would not call that good performance. Please add context.
Line 345: “Although IPSL-CM6A-LR employs a simpler spectral averaged albedo scheme than other land surface models, it does not have an observable impact on its tsl simulation.” What data in your analysis is this statement based on?
Line 348: While vegetation is certainly important for accurately calculating albedo, in terms of the surface energy balance in general, the timing of snow cover is important. Please discuss the impact of a wrong timing of the onset of snow cover and melt.
Line 349: “Considering snow conductivity, the Power Function could be why CNRM-CM6.1 and CNRM-ESM2.1 have a negative bias of larger than -6 °C in the SON (figure not shown)” In Table 2, both the models with the best snow insulation performance (the versions of JULES) and the model with the worst performance (Surfex) employ a power function, so this seems unlikely as the reason for the difference in performance. Please explain your conclusion in more detail.
Line 352: Especially in autumn, this could also be an effect of incorrect timing in snow. If snow cover is late in the models, the soil will release heat to the atmosphere for a prolonged time, which could also explain an underestimation of soil temperatures. Since you have not looked at the timing of snow cover, and snow rmse is large for all models in autumn at least in comparison to the stations considered in Figure 6, I think you need to extend your statement.
Line 363: Since you cannot compare the performance of these models to versions that do not contain organic matter, I don’t see how you can draw that conclusion. Please explain further.
4 Conclusions
Please see my general comment on what the conclusion should contain. Specific comments below.
Line 373: “Except in summer months, inaccurate inter-annual variability in the simulation of soil temperature by CMIP6 models is mainly caused by deficiencies in the land surface models and less inherited from atmospheric components.” What is the reasoning behind this conclusion?
Line 378: “The largest model biases of tas and tsl are witnessed under -5 °C.” What does this refer to? Winter, summer, LS3MIP or CMIP6 models? And to what do the -5 °C refer? Climatological mean of winter temperature? MAGT? Please provide more context to explain the statement.
Line 379: “These indicate a weakness for models reproducing the tsl relationship with tas in freezing conditions” Which could point to deficiencies in soil moisture, which you have not discussed at all, even though it has a profound impact on latent heat during freeze and thaw. Please extend the discussion accordingly.
Line 381: “Land models tend to simulate lower tsl when overlying snow exists.” Do you mean lower than observed? Because as a general statement, that is wrong. Please explain more clearly.
Line 383: “Note that the scope of this study is limited to soil depths down to 0.2 m” You never state anywhere that all tsl metrics you show only refer to tsl at 20 cm. Since the RosHydroMet data provides temperatures at 20, 40, 80, 160 and 320 cm depth, I assumed all metrics referred to comparisons of all depths, and that only the snow insulation analysis is restricted to tsl at 20 cm depth as proposed in Wang et al., 2016. This would have to be clearly stated in the data description, but actually, I don't see any good reason for excluding the other depths from the general analysis, especially because you argue the relevance of the soil temperature analysis with the climate change impacts on permafrost, and 20 cm depth is above the active layer thickness in large parts of the northern hemisphere permafrost area. Please extend the tsl analysis using all depths from the station data.