the Creative Commons Attribution 4.0 License.
Spatial patterns of snow distribution in the sub-Arctic
Greta Miller
Robert Busey
Min Chen
Emma R. Lathrop
Julian B. Dann
Mara Nutt
Ryan Crumley
Shannon L. Dillard
Baptiste Dafflon
Jitendra Kumar
W. Robert Bolton
Cathy J. Wilson
Colleen M. Iversen
Stan D. Wullschleger
Download
- Final revised paper (published on 17 Aug 2022)
- Preprint (discussion started on 18 Nov 2021)
Interactive discussion
Status: closed
RC1: 'Comment on tc-2021-341', Anonymous Referee #1, 25 Nov 2021
General Comments:
Bennett et al. perform a detailed experiment at a set of study sites in Alaska to answer the increasingly important question of “how much snow exists here”. Their work examines a suite of regression models of varying sophistication to model SWE based on a set of environmental predictors such as NDVI, elevation and wind. The paper is detailed and well written, with a novel methodology and promising model performance from the RF (suggesting follow-up work in this area of research). While portions of the paper are a bit verbose, I believe that after some edits this paper would be an important scientific contribution for the readers of The Cryosphere.
Major Comments/Revisions:
1. While the paper by Bennett et al. is generally well written, it can be overly detailed in certain places. With some restructuring, I believe the paper can be much more concise and effective. For instance, the Introduction from lines 50-95 is likely unnecessary content. I would much rather get right into the meat of the problem at hand starting on line 96, as information about the importance of snow etc. can probably be a sentence or two, with details left to references to previous literature. I have similar comments for Section 4 (specifically 4.1, 4.2, 4.3 and 4.4), which should not be in the results section and could likely be summarized in either the methodology or the introduction in a paragraph or two. The beginning of 4.4 answered questions I had about the model setup described in the methodology. These should absolutely be grouped together and the structure revised for clarity. Finally on this point, the discussion in Section 5 is again far too verbose and should likely be restructured, with some of the details moved to the results section or to the Appendix. I would recommend limiting the discussion to a summary of uncertainties, sources of error and questions left unanswered by the results.
2. Line 462, I am interested/potentially concerned about the extreme importance of Year on the accuracy of your RF. The authors mention that this Year variable is in some ways a proxy for temperature/precipitation differences between years, and I would ask why not explicitly test for this? While I agree that this conclusion is probably correct, incorporating temperature and precipitation data from a well-validated reanalysis product like ERA5 or MERRA2 could help evaluate this hypothesis. It would also help explain which of these two variables is the most important. Furthermore, a Year variable really limits the robustness of this product for applications outside of your current study and removing it would help in predictions elsewhere. For instance, what if you want to apply your model to data retrieved last year? Would the RF understand the year value of 2020 if fed into the model? However, it could, in theory, incorporate precipitation/temperature data from 2020 without issue.
Minor Comments/Revisions:
3. Can you speak to the different sampling distributions in Fig. 2? While I realize the Kougarok site is much larger than Teller, the spatial coverage of the samples display much more structure and consistency at Teller than Kougarok and I am curious if this sampling discrepancy impacts your results and why the sampling was so different.
4. Regarding model training, I had a few questions. First, what sort of hyperparameter tuning are you using? It appears to be a RandomSearchCV, but this isn’t explicitly mentioned. Why not use something a little more sophisticated, like a Bayesian search? Furthermore, why do you use an 80/20 train/test split instead of a k-fold CV like you do in the hyperparameter tuning step? That way you can operate on the full dataset.
5. A table in the appendix showing the different final models (like the RF) and all incorporated predictors would be helpful for clarity.
6. Regarding the title, is this truly an Arctic analysis? The sites all straddle the Arctic Circle, and some may consider this to be the sub-Arctic region.
7. I commend the authors for addressing the topic of complex terrain in different areas of the manuscript; however, I am curious how your model accuracy would change as a function of terrain complexity in mountainous regions. For instance, could this be applied to an alpine area? These regions don’t typically have much plant life, so I would expect a predictor like NDVI to be much less useful there; furthermore, the distribution of snow is extremely heterogeneous across these locations.
8. I am curious why you selected the RF over a method like a neural network? With such a large sample, I would expect a deep learning method like a multilayer perceptron would perform as well or better than the RF. This may be outside the scope of your paper, but something to consider.
9. The Fig. 3 caption should not include the definition of SWE.
10. Section 4.2 heading isn’t capitalized while the same words in 5.2 are? Just wondering for consistency.
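As an illustration of the point raised in comment 4, the difference between a single 80/20 hold-out split and k-fold cross-validation can be sketched in a few lines of plain Python. This is not the authors' code: the function name `k_fold_indices`, the seed and the sample count are hypothetical, chosen only to show that under k-fold every sample is used for testing exactly once, whereas a single split evaluates on 20 % of the data.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    n = 23  # hypothetical number of SWE samples
    for train, test in k_fold_indices(n, k=5):
        # Each iteration would fit the model on `train` and score it on `test`.
        print(len(train), len(test))
```

Averaging the score over the k folds gives an error estimate that draws on the full dataset, which is the advantage the comment alludes to.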
Citation: https://doi.org/10.5194/tc-2021-341-RC1
- AC1: 'Reply on RC1', Katrina Bennett, 21 Mar 2022
RC2: 'Comment on tc-2021-341', Anonymous Referee #2, 13 Dec 2021
Review of the manuscript “Spatial patterns of snow distribution for improved earth system modeling in the Arctic”
The research presented by Bennett et al. exploits a large dataset of manual snow observations in two sub-Arctic study areas to understand snow distribution (snow depth, snow density and snow water equivalent) with different statistical methods. The analyses applied are correct and rigorous, and the results would be of interest to the broad audience of this journal. Additionally, the database the authors have generated is highly valuable for the community. Congratulations on such a large effort (more than 23000 manual snow depth and 600 density acquisitions!).
Nonetheless, I think the manuscript still needs further work, which I am sure the authors will be able to carry out. I therefore recommend publication of this work after a major revision. My major concerns about the work are listed first; below them I provide a list of minor points that must be taken into account throughout the manuscript.
- Perhaps the most interesting finding is the importance of the NDVI in explaining the spatial distribution of SWE. However, the NDVI is an index obtained on a particular date (in July). The NDVI in late summer or early autumn might be very different. This point must be discussed, highlighting the importance (or not) of obtaining the maximum NDVI over the year. Please add references to support the NDVI evolution in sub-Arctic areas with a dominant presence of shrubs.
- The authors claim that the NDVI might describe shrub presence (“NDVI in our study likely reflected the taller, denser shrubs patterns present in the landscape.” or “NDVI, that we believe represents shrub patterning”). I think this is interesting, but the manuscript contains no analysis to support it. Why not correlate or compare the distribution of vegetation types (already used in the manuscript) with the NDVI? This point must be addressed properly, or all statements regarding the NDVI-shrub “relation” removed.
- The Methods section must be reorganized. It is too dense, and in some cases it is not easy to understand all the analyses performed. For instance, the Model Implementation section is too general; some paragraphs explain the random forest implementation and then the text comes back to GAMs. Please present it in a more organized way. Moreover, some analyses applied in the results section are not described in the methods section (i.e. correlations between snow depth, snow density and SWE).
- The writing is sometimes too repetitive. Many sentences can be removed, and some can be shortened without losing information. Some sections contain vague statements (“we believe”, “likely reflected”, …) that are not supported by the results. Sentences of this type must be rephrased or removed throughout the manuscript. In the same vein, several sections of the manuscript contain statements about ongoing work or future applications of the results. Some of them are repetitive and can be removed, or at least grouped together in a new “future work” section in the discussion.
Minor comments:
Title: In my view, the title is too broad. The manuscript does not analyze or work with earth system models. The research analyzes different features that control snow distribution (SWE, snow density and snow depth) with different approaches (GAMs, random forests, …). Please change it accordingly. Here are some suggestions: “Understanding of snow spatial patterns with statistical approaches in Arctic areas”, “Snow spatial patterns in Arctic areas analyzed with random forests”, …
Abstract: The authors state that both sites are sub-Arctic (line 17). Why do you then state that this is Arctic? I don’t see any problem with stating throughout the manuscript that you are characterizing snow distribution in sub-Arctic areas.
The sentence from line 27 to line 31 is unnecessary. This information will surely be used to improve other models and the understanding of hydrology, topography, …, but it is not needed in the abstract, as it does not summarize your study or the results you obtain. Remove it.
Keywords: remove “machine learning” and “permafrost” (it is interesting that permafrost is close to or even present in the study areas, but I don’t see this as a keyword), include “random forests” and change “Arctic” to “sub-Arctic”.
Line 75: Include more recent references (and maybe remove the oldest). Even if most of those suggested below are mountain area studies, the information and the results these obtained are highly interesting in this topic. Moreover these references may be useful in the subsequent paragraph where no references are included after the first sentence.
- Mendoza, P. A., Shaw, T. E., McPhee, J., Musselman, K. N., Revuelto, J., & MacDonell, S. (2020). Spatial distribution and scaling properties of lidar-derived snow depth in the extratropical Andes. Water Resources Research, 56(12), e2020WR028480.
- Mott, R., Vionnet, V., & Grünewald, T. (2018). The seasonal snow cover dynamics: review on wind-driven coupling processes. Frontiers in Earth Science, 6, 197.
- Revuelto, J., López-Moreno, J. I., Azorin-Molina, C., & Vicente-Serrano, S. M. (2014). Topographic control of snowpack distribution in a small catchment in the central Spanish Pyrenees: intra-and inter-annual persistence. The Cryosphere, 8(5), 1989-2006.
- Schirmer, M., Wirz, V., Clifton, A., & Lehning, M. (2011). Persistence in intra-annual snow depth distribution: 1. Measurements and topographic control. Water Resources Research, 47(9).
- Trujillo, E., Ramírez, J. A., & Elder, K. J. (2007). Topographic, meteorologic, and canopy controls on the scaling characteristics of the spatial distribution of snow depth fields. Water Resources Research, 43(7).
- Vionnet, V., Guyomarc’h, G., Bouvet, F. N., Martin, E., Durand, Y., Bellot, H., ... & Puglièse, P. (2013). Occurrence of blowing snow events at an alpine site over a 10-year period: Observations and modelling. Advances in water resources, 55, 53-63.
Line 99-101: Also cite other models such as FSM, SNOWPACK and Crocus. Although the article is mainly focused on Arctic (sub-Arctic) areas, it continuously refers to mountain-area works (which, in my understanding, are needed in this research), so I consider it worthwhile to cite these physically based models.
- Essery, R.: A factorial snowpack model (FSM 1.0), Geosci. Model Dev., 8, 3867–3876, https://doi.org/10.5194/gmd-8-3867-2015, 2015.
- Vionnet, V., Brun, E., Morin, S., Boone, A., Faroux, S., Moigne, P. L., ... & Willemet, J. M. (2012). The detailed snowpack scheme Crocus and its implementation in SURFEX v7. 2. Geoscientific Model Development, 5(3), 773-791.
- Bartelt, P., & Lehning, M. (2002). A physical SNOWPACK model for the Swiss avalanche warning: Part I: numerical model. Cold Regions Science and Technology, 35(3), 123-145.
Lines 134 to 140 are not needed in the introduction. Maybe you can include a brief reference to the validation you mention here in the discussion in a “future work” section. Nonetheless in the introduction it does not support the findings of this research.
Line 151: What is the distance to the airport (in km) from both sites?
Line 162 to 165: This information is not needed in this research. Remove it.
Line 182: Why didn’t you include the data from Teller in 2016? If you don’t use them, I think it is not necessary to provide this information here.
Line 188. Add Snow-Hydro reference :
- Sturm, M., & Holmgren, J. (2018). An automatic snow depth probe for field validation campaigns. Water Resources Research, 54(11), 9695-9701.
Line 190: SWE coring tube was evaluated in this study:
- López-Moreno, J. I., Leppänen, L., Luks, B., Holko, L., Picard, G., Sanmiguel-Vallelado, A., ... & Marty, C. (2020). Intercomparison of measurements of bulk snow density and water equivalent of snow cover with snow core samplers: Instrumental bias and variability induced by observers. Hydrological Processes, 34(14), 3120-3133.
This work must be cited here as SWE coring tube was evaluated here.
Line 200: You are using an interpolated snow density value for each snow depth acquisition in order to determine the SWE for each location. I encourage manuscript authors to include a statement about the error introduced with such an approach.
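The size of the error in question can be estimated with first-order error propagation. SWE is the product of snow depth and bulk density, so if the depth and density errors are assumed to be independent, their relative errors add in quadrature. The sketch below is only illustrative: the function names and all numerical values are hypothetical, and the independence assumption would need to be checked against the actual data.

```python
import math


def swe_mm(depth_m, density_kg_m3):
    """SWE in mm water equivalent.

    depth [m] * density [kg m^-3] / 1000 [kg m^-3 of water] * 1000 [mm per m],
    so the two factors of 1000 cancel.
    """
    return depth_m * density_kg_m3


def swe_uncertainty_mm(depth_m, density_kg_m3, depth_err_m, density_err_kg_m3):
    """First-order SWE uncertainty, assuming independent depth and density errors."""
    relative_error = math.hypot(
        depth_err_m / depth_m, density_err_kg_m3 / density_kg_m3
    )
    return swe_mm(depth_m, density_kg_m3) * relative_error


# Hypothetical values: 0.80 m of snow, interpolated density 300 +/- 30 kg m^-3,
# depth probe error of 0.02 m.
swe = swe_mm(0.80, 300.0)
err = swe_uncertainty_mm(0.80, 300.0, 0.02, 30.0)
print(f"SWE = {swe:.0f} mm, uncertainty = {err:.1f} mm")
```

With numbers of this kind the density term dominates, which is why the uncertainty of the interpolated density deserves an explicit statement.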
Line 211 Include this reference in which TPI was presented.
- Weiss, A. (2001). Topographic Positions and Landforms Analysis (Conference Poster). ESRI International User Conference. San Diego, CA, pp. 9-13.
Paragraphs from line 210-215 and 216-225: these paragraphs can be reordered and combined. Where the square windows surrounding each TPI cell are defined, it is hard to understand why distances between 15 m and 155 m are selected. This is explained in the next paragraph, but lines 213 to 215 are hard to understand as currently written. Maybe you can change the order of these two paragraphs.
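For readers less familiar with the index, TPI compares the elevation of a cell with the mean elevation of its neighbourhood, so the window size determines which landforms it responds to. The following is a minimal sketch on a toy grid, not the manuscript's implementation: the function name `tpi`, the exclusion of the centre cell from the mean, the clipping of the window at the grid edges and the elevation values are all assumptions made for illustration.

```python
def tpi(dem, row, col, half_width):
    """Elevation of one cell minus the mean elevation of a square window around it.

    The centre cell is excluded from the mean and the window is clipped at the
    grid edges (both are illustrative choices).
    """
    if half_width < 1:
        raise ValueError("half_width must be at least 1")
    n_rows, n_cols = len(dem), len(dem[0])
    neighbours = [
        dem[r][c]
        for r in range(max(0, row - half_width), min(n_rows, row + half_width + 1))
        for c in range(max(0, col - half_width), min(n_cols, col + half_width + 1))
        if (r, c) != (row, col)
    ]
    return dem[row][col] - sum(neighbours) / len(neighbours)


# Toy elevation grid in metres: a single bump in flat terrain.
dem = [
    [10.0, 10.0, 10.0],
    [10.0, 12.0, 10.0],
    [10.0, 10.0, 10.0],
]
print(tpi(dem, 1, 1, half_width=1))  # positive: cell sits above its surroundings
print(tpi(dem, 0, 0, half_width=1))  # negative: cell sits below its surroundings
```

Larger values of `half_width` smooth over small features and pick up broader ridges and valleys, which is why the choice of window distances needs to be motivated before the distances are listed.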
Line 231: here it is stated “…vegetation type was included in the model as…”. At this point of the manuscript models have not been described. I would rephrase this sentence removing the reference to “the model”.
Is the vegetation ranked by yearly SWE values and this information then used as a feature in the models? If so, I do not think this is appropriate. Why don’t you directly use the map from Konduri and Kumar, 2021?
Line 247: Add these references:
- Winstral, A., Elder, K., & Davis, R. E. (2002). Spatial snow modeling of wind-redistributed snow using terrain-based parameters. Journal of hydrometeorology, 3(5), 524-538.
- Mott, R., Schirmer, M., & Lehning, M. (2011). Scaling properties of wind and snow depth distribution in an Alpine catchment. Journal of Geophysical Research: Atmospheres, 116(D6).
Line 257-265: Is “W” the west aspect or the wind factor? Please clarify. The method used to define the aspect factor is not clear. It seems that other works have already followed this method (Dvornikov et al., 2015; Evans et al., 1989; Liston and Sturm, 1998); however, this manuscript would benefit from a more precise description of the method and how it is applied.
Random forests sections. Here you must cite:
- Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18-22.
Moreover, I encourage the authors to provide more details about random forests and how they work, using appropriate language (for example, I don’t think it is appropriate to talk about “votes”).
Line 309: Is there any reason to take the square root of the SWE? Please explain why and include references/arguments to justify this decision.
Line 310; include here the split sample %.
Line 313-319: The information detailed in this paragraph about all subsets used can easily be summarized in a table. Please do it and give names/acronyms to these subset models to then use them in results section.
Line 321: In line 303 the authors state: “feature importance can be difficult to interpret in comparison to linear modeling or GAM approaches”, and in this line it is stated that “random forest performed the best of the three models and has the most comprehensive feature importance metrics”. These two sentences are contradictory; please revise accordingly and keep the statements in the manuscript consistent.
Line 326: “we measured the value of each input feature in predicting”: do you mean feature importance or contribution to the model? One might expect “value” to refer to the variable values (NDVI value, TPI value, …). Please be consistent throughout the manuscript in how you define the contribution of features to the models’ predictive capability.
Line 329: Include this reference here:
- Louppe, G., Wehenkel, L., Sutera, A., & Geurts, P. (2013). Understanding variable importances in forests of randomized trees. In C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, & K. Q. Weinberger (Eds.), Advances in neural information processing systems 26 (pp. 431–439). Curran Associates, Inc.
Line 337: Add: both importance metrics (MDI and MDA).
Line 340 to 342: This sentence must be moved to before the description of all models (line 308), as variance inflation and correlation coefficients are computed beforehand in order to analyze collinearity before applying the models.
Line 361 to 364: This sentence must be removed; this information is already provided in section 2.2
Line 386: Where are the solifluction lobes observed? Please show them in Figure 4 or remove this sentence.
Line 389 to 392. Where is this information (wind and aspect) shown? It is shown in Figure 4, so please include it also here.
Line 402: How was the 20-80% split to train/validate the model determined? Why 300 trees? Please include references to previous works or explain why you chose those values.
Line 405: Several tests with same configuration except TPI distance? In which test site? With all years?? Please clarify.
Line 409: NDVI model tests. Same questions of previous comment must also be answered here.
Line 426: Figures A3 and A4 show the SWE distribution maps obtained with the different models, but the percentages of data used for validation and for training are not stated here. Are you using all data for training and then plotting the model results? This point must be clarified.
Line 430: Change this title. You are also doing a prediction of SWE in previous section.
Line 444: Figure 7 SWE maps are obtained with the model that included data from all years and sites. Clarify.
Line 447: The authors know where the stream saws are and where the permafrost slump is located. Readers who do not know these study sites, by contrast, are not familiar with these landforms. Show them in Figure 7 (mark them with lines, an arrow, …).
Line 455: Is there any reason to justify why higher SWE have higher errors? This point must be discussed later.
Line 462: The sentence “even though year is ranked…of our study.” In my view, this sentence must be moved to the discussion, ideally to a new section in which future work is explained.
Section 4.7, SWE correlation between years. I really like heatmaps. However, I don’t see why the authors didn’t include correlation coefficients (Pearson, Kendall, …). These would help in understanding the SWE correlations. Please compute correlation coefficients.
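The two coefficients named in this comment are inexpensive to compute alongside the heatmaps. The sketch below shows Pearson's r and Kendall's tau-a (the simplest variant, with no correction for ties) in plain Python; the function names and the SWE values for the two years are hypothetical and serve only to demonstrate the calculation.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs / all pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical SWE values (mm) at the same five locations in two years.
swe_year_a = [100.0, 150.0, 200.0, 250.0, 300.0]
swe_year_b = [110.0, 140.0, 230.0, 240.0, 330.0]
print(pearson_r(swe_year_a, swe_year_b))
print(kendall_tau_a(swe_year_a, swe_year_b))
```

Pearson's r measures linear agreement, while Kendall's tau only uses the ranking of locations, so reporting both indicates whether the inter-annual pattern is preserved in magnitude or only in rank order.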
Lines 473 to 480 are redundant. This must be detailed in the introduction, not here.
Line 503-504. I found the preceding lines discussing the correlation of snow depth and density very interesting. However, I have not found any reference in the manuscript to the “no relationship for shallow snow (<60cm)”. Figure 3b shows all snow depth values. Moreover, there are no previous references to the “60 cm” threshold. This result must be highlighted in the results section. I encourage the authors to add a new graph to Figure 3 that includes only snow depth values above 60 cm, to show the positive linear correlation between snow depth and density.
Line 512-516. As previously highlighted, I would appreciate references here to mountain-area works, where the inter-annual consistency of SWE (and snow depth values) has also been observed.
Lines 520 to 524: These lines must be included in a new section named “future work” (or similar).
Line 535 to 541: This is also future work. Moreover some parts of these lines are redundant with previous statements of the work and can be removed.
Line 543-545: Remove, already explained in results section.
Line 548: Add references to justify that “consistent with those in previous studies in terms of how those factors affected snow distribution in the”.
Line 549; Landforms are not features included in the models. This sentence must be rephrased in order to highlight that “stream bed, permafrost thaw slump edges” tend to accumulate more snow.
Line 551: NDVI values were obtained at a particular period of the year (in July). NDVI in late summer, early autumn might be very different. This point must be discussed, highlighting the importance (or not) of obtaining the maximum NDVI along the year. Please add references to justify NDVI evolution in sub-arctic areas with dominant presence of shrubs.
Line 560-564: This sentence is too long and redundant. Please split it and remove unnecessary statements.
Line 565: Is 300 m an elevation gradient? If yes, please state it; otherwise, clarify.
Line 568-570: move to Future work section,
Line 572-574: Showing the correlation between TPI and NDVI in these study areas might be interesting. Just a suggestion.
Line 574-575: Add references to justify that moisture accumulates here and this is associated with higher ecological productivity.
Line 580-581: Sentence for future work section.
Line 590: Are UAS drone observations? These devices are usually referred to as UAVs:
- Adams, M. S., Bühler, Y., & Fromm, R. (2018). Multitemporal accuracy and precision assessment of unmanned aerial system photogrammetry for slope-scale snow depth maps in alpine terrain. Pure and Applied Geophysics, 175, 3303– 3324.
- Harder, P., Pomeroy, J. W., & Helgason, W. D. (2020). Improving sub-canopy snow depth mapping with unmanned aerial vehicles: lidar versus structure-from-motion techniques. The Cryosphere, 14(6), 1919-1935.
- Revuelto, J., López-Moreno, J. I., & Alonso-González, E. (2021). Light and shadow in mapping alpine snowpack with unmanned aerial vehicles in the absence of ground control points. Water Resources Research, 57(6), e2020WR028980.
Nowadays, UAV acquisitions of snow depth are accurate and very dense in space. I would not cite López-Moreno et al., 2009, as the interpolation methods they described and evaluated are not suitable when working with UAVs. On the contrary, I would cite the articles referred to above.
Line 595: Remove this part of the sentence (which is not needed in the conclusion): “which is being undertaken in current work by the authors.”
Line 599-600: This affirmation has not been demonstrated in this study and must be removed: “this model may be used to estimate snow distribution beyond the study sites, work that is also ongoing by the authors”.
Line 600-603: This is future work, not a conclusion; I would remove it, or at least shorten it.
Line 610: “that we believe represents shrub pattern”. A belief is not a conclusion. The results have shown that the NDVI is the most important feature for explaining SWE with random forests; this is the conclusion. Similarly, TPI is a very important feature. What TPI represents is not an output of your research (“an index that represented the features in the landscape such as the stream bed and various topographic features including solifluction lobes” is an interpretation by the authors). Please revise this part of the conclusions accordingly.
Line 611-615. Somehow this is future work. I would remove this sentence or move it to the discussion.
Conclusions section: I am missing an interesting outcome of this research: the linear relation between snow depth and snow density for snow depth values above 60 cm. I would include it here.
Figures and tables:
Figure 2: In 2017, 2018 and 2019, were snow depth and density measured at the same locations? If not, I encourage showing three separate maps with the true locations for each year.
Figure 3 See comments of lines 503-504
Table 1. Several distances to compute the TPI are used in the manuscript. I would remove the 155 m distance of TPI here and state that several distances were tested.
Figure 9: change graphs background to white. This will help to interpret the light yellow areas showing lower points density.
Figure A3, The fact that you did not include year as a feature in the model must be explained in the text not in the caption.
Figures A4 and A5: it is a bit hard to get an idea of the spatial distribution of the errors. You could also include a frequency histogram (a small panel inside the figure) to provide a better overview of the errors.
Citation: https://doi.org/10.5194/tc-2021-341-RC2
- AC2: 'Reply on RC2', Katrina Bennett, 21 Mar 2022