Permafrost variability over the Northern Hemisphere based on the MERRA-2 reanalysis

Tao, Jing; Koster, Randal D.; Reichle, Rolf H.; Forman, Barton A.; Xue, Yuan; Chen, Richard H.; Moghaddam, Mahta

doi:https://doi.org/10.5194/tc-13-2087-2019

Articles | Volume 13, issue 8

© Author(s) 2019. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article | 01 Aug 2019

Download

Final revised paper (published on 01 Aug 2019)
Supplement to the final revised paper
Preprint (discussion started on 21 Jun 2018)

Interactive discussion

Status: closed

AC: Author comment | RC: Referee comment | SC: Short comment | EC: Editor comment

RC1: 'Some interesting new approaches to permafrost model evaluation', Anonymous Referee #1, 14 Jul 2018
- AC1: 'Response to RC1', Jing Tao, 20 Sep 2018
RC2: 'review on tc-2018-119', Anonymous Referee #2, 27 Jul 2018
- AC2: 'Response to RC2', Jing Tao, 20 Sep 2018
RC3: 'comment on tc-2018-119', Anonymous Referee #3, 14 Aug 2018
- AC3: 'Response to RC3', Jing Tao, 20 Sep 2018

Peer-review completion

AR: Author's response | RR: Referee report | ED: Editor decision

AR by Jing Tao on behalf of the Authors (11 Oct 2018) Author's response Manuscript

ED: Referee Nomination & Report Request started (05 Nov 2018) by Ketil Isaksen

RR by Anonymous Referee #3 (22 Nov 2018)

RR by Anonymous Referee #4 (25 Jan 2019)

RR by Anonymous Referee #5 (07 Feb 2019)

Suggestions for revision or reasons for rejection

The authors run a series of simulations of permafrost dynamics using the MERRA2 reanalysis and the MERRA land model. They compare the results to remotely sensed active layer thickness (ALT) and in situ measurements of ALT. The paper has the potential to be a solid model-data comparison of simulated permafrost dynamics. I recommend acceptance after major revisions.
I have three major comments:
1) The authors need to account for measurement uncertainty when comparing to observations and refine their statistical comparison techniques. I found a number of errors in the statistical comparisons that I identify below.
2) The authors need to clarify the role of the remote sensing data in this analysis. They spend as much space comparing the remotely sensed ALT with in situ data as with the model. Is the paper a means to validate the model or the remote sensing data?
3) The authors need to change their spinup procedure or drop the trend analysis. Repeating the full time period for spinup introduces a dynamic response that produces false trends aliased on top of real trends. This pretty much invalidates the trend analysis.
I have the following specific comments:
P2L17-20: Reword. This is a run-on sentence with two doubly nested parenthetical clauses, making it very difficult to understand.
P2L25: State how models are useful. This paragraph emphasizes resolution as a weakness of models.
P3L2-3: This is not difficult. One must account for representation error when comparing a point measurement to the area average of a model pixel.
P3L7: The resolution of these simulations is essentially the same as for many published simulations, so I am not sure this is the best claim to make.
P4L3: State or describe the improved performance.
P4L8: State or describe exactly what is inferior.
P4L10-12: Delete. Each reanalysis has strengths and weaknesses and I find it very difficult to believe that one version of MERRA is truly superior to another, especially considering the scarcity of measurements in the Arctic. I have no objection to using MERRA-2, of course, but claiming superiority is not warranted and best deleted from the manuscript.
P4L18-20: Here the authors state they will use the remotely sensed ALT to validate the model, but later they actually validate the remotely sensed ALT against ground observations. This makes the actual purpose of including remotely sensed data unclear in this paper.
P5L10: The vertical resolution seems too coarse to simulate ALT. The total depth is fine, but other models typically use much higher resolution to simulate ALT. The authors need to explain why this resolution will work.
P5L23: This is a good formulation. Models often use it, but rarely document it.
P6L17: A 180 year spinup is adequate for stabilizing soil temperatures, but not for soil carbon. Does this model include dynamic soil carbon pools? If yes, then a spinup of 1000-5000 years is more appropriate.
P6L20: The chosen spinup technique pretty much invalidates the trend analysis. The typical response time for soil temperature in a model such as this is 20-30 years, exactly matching the length of the MERRA forcing data. If they had spun up using only 1980-85 MERRA data, then the trend analysis makes sense. I suggest either changing the spinup or dropping the trend analysis.
P7L29-30: This means one can use the radar data only where one expects the ALT to be less than 60 cm. If the radar cannot penetrate below 60 cm, I question the utility of using it for validation. The authors need to supply a rationale for including it in the study.
P8L11: Please identify which site got covered with lava. This is so unusual that you have to tell the reader.
P8L28: The section on comparison with the radar ALT must include uncertainty. The best that a model can do is match the observations within uncertainty.
P8L28: The authors should include a description of the statistical comparison itself. There are many ways to do this, ranging from a cost function to a regression.
P8L28: The authors need to change the section title. The title covers only comparison with the radar data, but the text covers comparison with CALM data.
P9L7-16: The comparison of a point, in situ measurement to a model or remote sensing pixel must account for representation error. Representation error is the uncertainty when a point measurement represents an average. The standard deviation of the CALM grid measurements is a good estimate of representation error.
P10L25-30: The authors should state this is a standard degree-day model and find some references.
P11L5-8: This is a standard degree-day model for ALT with a snow adjustment. There are hundreds of variants of this model in the literature derived from the original thermodynamics equation, models, or empirically from in situ observations. The authors need some references here and text explaining that this is a degree-day model.
P11L8: The authors should explain why they included the a0 term. The a0 term is not often seen in a degree day model because one typically assumes the soil starts frozen (a0=0).
P11L8: The authors should explain why they chose Tcum rather than the square root of Tcum. One can derive the sqrt(Tcum) relationship directly from the original thermodynamics equation and the relationship appears many times in analyses of in situ measurements. Because of the strong theoretical basis of sqrt(Tcum), using plain Tcum is rare, so the authors need to justify its use.
P11L12-16: This description of comparing to CALM is out of place and should be moved to section 3.1.
P11L12-16: The authors need to account for uncertainty in the CALM measurements when comparing to the model output.
P11L18-24: The spinup technique invalidates the trend analysis. Either drop the trend analysis or modify the spinup.
P12L6-8: Delete. Unneeded.
P12L10-12: Move to methods.
P12L14: Figure 3 shows the difference between the model and observations, but is this difference within the uncertainty? If yes, then the two are statistically identical and thus a match. If no, then there is a statistically significant mismatch. The magnitude of the difference is unimportant if the difference is less than uncertainty. The authors need to account for uncertainty in this comparison.
P12L21: Explain here why soil type influences the result. The reader should not have to flip forward in the paper to get this answer.
P12L25: Agreement ‘to first order’ is too vague and carries no meaning. The authors need to quantify the agreement accounting for uncertainty.
P13L1-8: The authors need to expand their statistical analysis of the model-data comparison. All they have is correlation, which says nothing about magnitude. They should expand the residual analysis to include bias (mean residual), root mean square error (residual standard deviation), and chi-squared (standard deviation of residuals normalized by uncertainty).
P13L9: ‘Broadly consistent’ is too vague and carries no meaning. The authors need to quantify the agreement.
P13L20: ‘In general’ is too vague and has no meaning. Comparing modeled and observed trend with latitude is perfectly valid here. The limited number of in situ measurements will simply result in higher uncertainty.
P13L23: Again, ‘generally’ is too vague.
P13L23-33: Figure 5 is not the correct format to show the relationships described here. The reader cannot visualize the relationships and correlations from the simple time series plots in Figure 5. The authors should replace the time series plots with three plots to illustrate the relationships: ALT vs. latitude, ALT vs. organic matter content, and ALT vs. air temperature.
P14L20-23: Perhaps, but the authors’ argument is not convincing. Shading associated with higher LAI represents an equally valid explanation. Higher water content associated with higher organic matter content could also explain the difference. The authors have the full suite of model output on hand. They should do a statistical analysis of available output to track down exactly what explains the difference.
P14L31: The authors need to identify exactly what soil parameters changed. Porosity? Thermal conductivity? Volumetric water content? Also, the authors need to explain how they specify soil properties in the model. A sharp change as seen here is common when specifying properties by soil type, such as sandy loam defined in the USGS soil triangle. A sharp change would be unusual when specifying soil properties by maps of soil texture (sand, silt, and clay fraction).
P15L33: The reason simulated ALT is deeper is the same reason identified later in the manuscript: the model either has permafrost or it does not because it cannot represent sub-grid scale processes. When the model does simulate permafrost in sporadic regions, it is always greater than observed because it represents an area average of permafrost and non-permafrost areas.
P16L9-10: The authors need to either perform the analysis with air temperature or at least summarize and reference the results of other studies that did perform the analysis.
P16L12: The authors need to remove the trends in ALT, Tcum, and SWEmax before calculating the correlation coefficients. We see nice strong correlations because all three variables show strong trends over the time period of the simulation. Removing the trends will significantly change Figure 9 and its interpretation. If the authors want to isolate the effects of trends on the ALT, then they should include an analysis using the congruent trend fraction.
P17L1: The regions identified on the maps do not correspond to high mountains. Please clarify.
P17L5-6: Delete.
P17L22: ‘Geographically thin’ is too vague. Please reword.
P18L20: The authors cannot make this claim without an actual comparison with other models. Drop the statement.
P18L14 and P18L34: This is the first mention of representation error. The authors need to estimate the representation error of the in situ measurements and include this in the comparison with the modeled ALT. There are several ways to do this and I leave it to the authors to determine the most appropriate method for this paper.
P19L1-2: This statement is not true. A point measurement can represent an area average if one includes representation error in the point measurement.
P19L4: Either change the spinup or drop the trend analysis.
P20L15-16: Again I am confused about the motivation of including the remotely sensed ALT in this paper. This paper compares the model to the RS data, but also compares the RS data to the in situ measurements. Do the authors want to validate the model or the RS data?
P20-P22: Please reduce the summary to one page or less. The current summary is way too long and simply repeats material from the results section. What are the primary take-away points the authors want to convey? What are their most important or most interesting results? What are the broader implications of their results?
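
The residual statistics requested in the comment on P13L1-8 (bias, root mean square error, and an uncertainty-normalized chi-squared) can be sketched as follows. This is only an illustration, not code from the authors or the referee: the function name `residual_stats`, the sample numbers, the quadrature combination of measurement and representation uncertainty, and the use of the mean squared normalized residual as "chi-squared" are all assumptions made for the example.

```python
import math


def residual_stats(model, obs, sigma_meas, sigma_repr):
    """Return bias, RMSE, and a normalized chi-squared for model-minus-obs residuals.

    sigma_meas and sigma_repr are per-observation measurement and
    representation uncertainties in the same units as obs. Combining them
    in quadrature is an assumption of this sketch.
    """
    if not (len(model) == len(obs) == len(sigma_meas) == len(sigma_repr)):
        raise ValueError("all inputs must have the same length")
    if not obs:
        raise ValueError("need at least one observation")

    n = len(obs)
    residuals = [m - o for m, o in zip(model, obs)]
    # Total uncertainty per observation: sqrt(sigma_meas^2 + sigma_repr^2).
    sigma = [math.hypot(sm, sr) for sm, sr in zip(sigma_meas, sigma_repr)]

    bias = sum(residuals) / n                             # mean residual
    rmse = math.sqrt(sum(r * r for r in residuals) / n)   # root mean square error
    # Mean squared residual in units of the total uncertainty.
    chi2 = sum((r / s) ** 2 for r, s in zip(residuals, sigma)) / n
    return {"bias": bias, "rmse": rmse, "chi2": chi2}


if __name__ == "__main__":
    # Hypothetical ALT values in metres, for illustration only.
    stats = residual_stats(
        model=[0.60, 0.80, 1.10],
        obs=[0.50, 0.90, 1.00],
        sigma_meas=[0.06, 0.06, 0.06],
        sigma_repr=[0.08, 0.08, 0.08],
    )
    print(stats)
```

Under these assumptions, a `chi2` value near 1 would indicate residuals comparable to the stated uncertainty, while values well above 1 would point to a mismatch larger than the uncertainty can explain.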


ED: Reconsider after major revisions (25 Feb 2019) by Ketil Isaksen

Dear Dr Tao,

I am sorry for the time it has taken me to reach a decision on your manuscript tc-2018-119, entitled "Permafrost Variability over the Northern Hemisphere Based on the MERRA-2 Reanalysis".

We have heard back from the reviewers regarding your revised submission. They recommend acceptance after major revisions and have provided suggestions for improving the manuscript. I call your attention to all three reports, which are included below.

I have examined your revision and responses to the comments and suggestions from the previous round of review. I find that your aggregate response is satisfactory.

Please let me know if there is any news on the progress of the not-yet-published study by Chen et al. Note that works cited in a TC manuscript should be accepted for publication or published already.

I will examine the second revision; if it is indeed responsive to the reviewers’ suggestions, it will not need to undergo another round of review.

Once again, my apologies for the delay. I look forward to receiving your revision.

Best wishes,
Ketil Isaksen
Editor
The Cryosphere

Report #1
The problem with the not-yet-published study by Chen et al. (always cited as Chen et al. 2018 rather than as in review) remains. Large parts of the manuscript are based on it. According to the journal's webpage, The Cryosphere requires that 'Works cited in a manuscript should be accepted for publication or published already.' Results and other material referring to it should be removed.

Other
Table 4 - state source of observation in caption
Figure 8 - properly cite Brown et al.

Report #2
Permafrost Variability over the Northern Hemisphere Based on MERRA-2 Reanalysis
By Tao et al., 2019.
This study uses point measurements and airborne data, in combination with results from a global model driven by the MERRA-2 reanalysis, to analyze present permafrost conditions and extent. The authors compare datasets at different scales to study the match between them. The main problem is how to compare in situ data with data averaged to a 20 x 60 m² grid cell and then averaged to an 81 km² grid cell. The authors then touch on the problem of why the global model is unable to simulate permafrost in western Russia and eastern Canada. The global model fails to simulate permafrost in those regions because those areas represent ecosystem-protected permafrost zones (Shur and Jorgenson, 2007). This means that a thick organic layer, most importantly including a moss layer, protects the permafrost below from warm air temperatures. To achieve this, simply increasing the amount of the organic layer, as was done for example in global models such as CLM and SiBCASA (Nicolsky et al., 2007; Jafarov and Schaefer, 2016), is not enough. It is important to drive those regions with cold initial temperatures and with enough moss-organic insulation on top. In addition, a deep soil column should allow permafrost to be kept in those regions.
Overall, the paper presents some important and interesting analysis, including the effect of soil moisture on ground temperature and ALT. However, the current version of the paper needs major cleanup to improve clarity. I suggest cutting the number of figures, removing discussion from the conclusion, and creating a combined results and discussion section, since the results already contain a lot of discussion. Keep the conclusion straight to the point; do not summarize your work in the conclusion. Instead, suggest what improvements can be made to reduce the discrepancies in the ALT simulation in Mongolia, Russia, etc., and how the permafrost extent can be better modeled on the global scale.

Abstract
L27 …some permafrost areas… Be specific, spell out those areas.

Introduction
P3. L26. I suggest acknowledging the work done on ALT measurement using GPR as part of the pre-ABoVE campaign. Chen et al. (2016) documented extensive GPR ALT data collection near Toolik Lake, Alaska. Jafarov et al. (2018) documented extensive GPR ALT data collection near Barrow, Alaska. These datasets are unique because they represent spatial ALT collection as opposed to the point measurements by CALM. Both datasets are available for download from the ABoVE website. They can be extremely useful in this study because they give a better idea of the spatial variability of ALT at the meter scale. The standard deviation from those works can be used to better constrain the uncertainty in measured ALT at a finer spatial scale.
In addition, I highly suggest checking the most recent and most complete work on near-surface permafrost data in Alaska (Wang et al., 2018). That dataset provides wider coverage for Alaska and can be extremely useful for this study.
P4. L22-30. Does this freeze-thaw formulation allow multiple thaw zones, e.g., a talik and seasonal frost above, with existing permafrost at a deeper depth?
P5. L12. It is not clear why the model was spun up for 180 years. Typically, spin-up means total equilibrium.

The methods section needs better organization. For example:
1. In situ to AirMOSS comparison
2. In situ to CLSM comparison
P7-8. L30-12. The main point of those two paragraphs is the difference. I suggest plotting the difference between AirMOSS and CLSM at 81 km² resolution in just one figure instead of A, B, and C. It will then be clear where they do not match, and the discussion can focus on why they do not match.
P8. Paragraphs 3 and 4. Similarly, Figure 4A and 4B are not needed. In situ data have smaller uncertainty and variability; when scaled up, the variability is averaged into one grid cell. The question is what the uncertainty for CLSM should be, which is answered later in the manuscript by analyzing the effect of different factors (snow, organic layer, soil moisture). If you plot the CLSM uncertainty bars and they intersect the solid lines, the overall results look much better.
P9. L16-30. It mainly depends on the pixel size (grid cell) of the modeled ALT. The authors should think about how they can address the overall uncertainty in the global model, and how that uncertainty would change when they compare it with in situ or AirMOSS data.
P14. L6-20. Cite Shur and Jorgenson (2007) and draw the discussion from that work. Refer to my main comment.
P14. L31. There are many CALM sites within a CLSM grid cell. The variation among CALM sites is a standard deviation (std). Again, this deviation comes from a handful of sites, whereas the GPR measurements provide a wider range of possible std in the Barrow and Toolik Lake regions.
P15. L1-3. The soil characteristics in Mongolia might include rocky environments. In mountain areas the ALT along south-facing slopes might be quite deep. I wonder if that might explain the deep ALT in those regions.
P15. L30. Do you think that driving the model with different reanalysis data (ERA-Interim or similar) might give you better results?
P16. L19. I would drop unnecessary phrases such as 'at least to some extent' from the text.
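
The suggestion above to use the spread among in situ sites within one model grid cell as an uncertainty estimate can be illustrated with a short sketch. Everything here is hypothetical: the function names `cell_mean_and_spread` and `within_uncertainty`, the site values, and the one-standard-deviation threshold are assumptions for the example, not the procedure used in the manuscript.

```python
import statistics


def cell_mean_and_spread(site_alts):
    """Mean ALT and sample standard deviation across sites in one grid cell.

    The standard deviation is used here as a rough stand-in for the
    representation error of a point measurement relative to the cell average.
    """
    if len(site_alts) < 2:
        raise ValueError("need at least two sites to estimate a spread")
    return statistics.mean(site_alts), statistics.stdev(site_alts)


def within_uncertainty(model_alt, obs_mean, obs_sigma, k=1.0):
    """True if the modeled ALT lies within k standard deviations of the observed mean."""
    return abs(model_alt - obs_mean) <= k * obs_sigma


if __name__ == "__main__":
    # Hypothetical ALT observations (metres) from three sites in one grid cell.
    mean_alt, sigma_alt = cell_mean_and_spread([0.40, 0.50, 0.60])
    print(mean_alt, sigma_alt)
    print(within_uncertainty(0.58, mean_alt, sigma_alt))
    print(within_uncertainty(0.75, mean_alt, sigma_alt))
```

A denser survey, such as the GPR transects mentioned above, would simply supply more values to `cell_mean_and_spread` and so a better-sampled spread.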

References
Jafarov, E. and Schaefer, K.: The importance of a surface organic layer in simulating permafrost thermal and carbon dynamics, The Cryosphere, 10, 465-475, doi:10.5194/tc-10-465-2016, 2016.
Shur, Y. L. and Jorgenson, M. T.: Patterns of permafrost formation and degradation in relation to climate and ecosystems, Permafrost Periglac., 18, 7–19, doi:10.1002/ppp.582, 2007.
Wang, K., Overeem, I., Jafarov, E., Clow, G., Romanovsky, V., Schaefer, K., Urban, F., Cable, W., Piper, M., Schwalm, C., Zhang, T., Kholodov, A., Sousanes, P., Loso, M., Swanson, D., and Hill, K.: A synthesis dataset of near-surface permafrost conditions for Alaska, 1997-2016, https://doi.org/10.18739/A2KG55, 2018.
Nicolsky, D. J., Romanovsky, V. E., Alexeev, V. A., and Lawrence, D. M.: Improved modeling of permafrost dynamics in a GCM land-surface scheme, https://doi.org/10.1029/2007GL029525, 2007.
Jafarov, E. E., Parsekian, A. D., Schaefer, K., Liu, L., Chen, A. C., Panda, S. K., and Zhang, T.: Estimating active layer thickness and volumetric water content from ground penetrating radar measurements in Barrow, Alaska, Geosci. Data J., doi:10.1002/gdj3.49, 2018.
Chen, A., Parsekian, A., Schaefer, K., Jafarov, E., Panda, S., Liu, L., Zhang, T., and Zebker, H.: Ground-penetrating radar-derived measurements of active-layer thickness on the landscape scale with sparse calibration at Toolik and Happy Valley, Alaska, Geophysics, 81(2), H1-H11, doi:10.1190/geo2015-0124.1, 2016.

Report #3
This report repeats verbatim the referee report reproduced above under "Suggestions for revision or reasons for rejection".

AR by Jing Tao on behalf of the Authors (04 Jun 2019) Author's response Manuscript

ED: Publish as is (26 Jun 2019) by Ketil Isaksen

AR by Jing Tao on behalf of the Authors (02 Jul 2019) Manuscript

Short summary

The active layer thickness (ALT) in middle-to-high northern latitudes from 1980 to 2017 was produced at 81 km² resolution by a global land surface model (NASA's CLSM) with forcing fields from a reanalysis data set, MERRA-2. The simulated permafrost distribution and ALTs agree reasonably well with an observation-based map and in situ measurements, respectively. The accumulated above-freezing air temperature and maximum snow water equivalent explain most of the year-to-year variability of ALT.
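
The link between ALT and accumulated above-freezing air temperature described in the summary can be explored with a small sketch like the one below. It is not the analysis from the paper: the function names, the synthetic series, and the choice to remove a least-squares linear trend from both series before correlating them (as one referee requested) are assumptions made purely for illustration.

```python
import math


def thawing_degree_days(daily_temps_c):
    """Sum of above-freezing daily mean air temperatures (degC day)."""
    return sum(t for t in daily_temps_c if t > 0.0)


def detrend(values):
    """Remove the least-squares linear trend from an evenly spaced series."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to fit a trend")
    x_mean = (n - 1) / 2.0
    y_mean = sum(values) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(values))
    slope = sxy / sxx
    return [y - (y_mean + slope * (i - x_mean)) for i, y in enumerate(values)]


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    n = len(a)
    a_mean = sum(a) / n
    b_mean = sum(b) / n
    cov = sum((x - a_mean) * (y - b_mean) for x, y in zip(a, b))
    var_a = sum((x - a_mean) ** 2 for x in a)
    var_b = sum((y - b_mean) ** 2 for y in b)
    # Raises ZeroDivisionError if either series is constant.
    return cov / math.sqrt(var_a * var_b)


def detrended_correlation(alt, tcum):
    """Correlation between yearly ALT and Tcum after removing linear trends."""
    return pearson_r(detrend(alt), detrend(tcum))


if __name__ == "__main__":
    # Synthetic yearly series, for illustration only.
    anomalies = [5.0, -5.0, 5.0, -5.0, 5.0, -5.0]
    tcum = [100.0 + 10.0 * i + a for i, a in enumerate(anomalies)]
    alt = [0.5 + 0.01 * i + 0.002 * t for i, t in enumerate(tcum)]
    print(detrended_correlation(alt, tcum))
```

Detrending first means that the resulting coefficient reflects year-to-year co-variability rather than a shared long-term trend; maximum snow water equivalent could be handled in the same way as `tcum`.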