Comment on tc-2021-156

Overall, the paper is a useful addition to microwave remote sensing and to the characterization of tundra snow. The two basic findings are that a) the coefficient of variation (CV) can be used to introduce sub-grid (<3 km) scale realism into microwave retrieval models, and b) using some form of a snow depth-to-depth hoar relationship, the extreme microwave scattering of the depth hoar (and implicitly, the self-emission of the wind slab) can be approximated when using these models. The authors claim an improvement in SWE retrievals (Tb) of 8 K. This finding would have been more useful if expressed in terms of depth or SWE improvement.

I found this to be an interesting and reasonably well-written paper. I will restrict my comments primarily to the nature of tundra snow and the remote sensing of tundra snow, as I am not sufficiently well-versed in Bayesian statistics to comment on those aspects of the paper.
What is missing from the paper is some deeper inductive reasoning that could take the work farther and make it more general (and less about two particular tundra locations). Personally, I found Figure 3 the most interesting result in the paper and found myself wondering why the CV (as a function of the area measured) appeared to be asymptotic to 1. It was not that I doubted the data, but I wondered whether that reflected some physical limit on CV or just limitations in the available data. The authors stated in their discussion that:
"However, the resolution of SWE products like GlobSnow are much larger (25km); future investigation of CV_sd values at those scales have the potential to help GlobSnow 3 (Pulliainen et al., 2020)."
I agree with this statement, but suggest we hardly need to wait for future investigations. The authors could address this issue more thoughtfully in this paper using the knowledge base they already have.
Let's start (Table below) by examining some extreme depth distributions using Excel. For a completely homogeneous snow depth field, the CV approaches zero. For more realistic heterogeneous snow, and certainly most tundra snow fields, the CV rises with area because (I believe) of snow drifts. For example, in a landscape of mostly very thin snow with a few very deep drifts I was able to produce values >4 (Case 7). This is exactly the type of situation that exists in tundra snow, particularly in the windier tundra areas (e.g. the Arctic Refuge in Alaska and the Barrenlands of Canada) where wind scour and drifting are most extreme. I suspect CV values over 2 are often realized, for example in the tundra landscape shown below (after the thin snow has melted):
(see WORD file for tables and figures)
But the authors need not deal with this CV issue only in a theoretical framework: they should have access to the TVC lidar maps we produced in 2012. They could readily run a Monte Carlo simulation, varying the location and area examined, then plot the resultant mean depths and CVs, thereby adding to the figure. Once that was done, they could move to a more general application of CV to the full range of tundra snow.
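To make the suggested Monte Carlo exercise concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the depth field is synthetic (a gamma-distributed background with three artificial drift bands) and stands in for a real lidar depth map, and the function name `window_stats`, the window sizes and the number of draws are all hypothetical choices, not anything taken from the paper.

```python
import numpy as np

def window_stats(depth, size, n_draws, rng):
    """Mean depth and CV for n_draws randomly placed size-by-size windows."""
    rows, cols = depth.shape
    means = np.empty(n_draws)
    cvs = np.empty(n_draws)
    for i in range(n_draws):
        r = rng.integers(0, rows - size + 1)  # random upper-left corner
        c = rng.integers(0, cols - size + 1)
        win = depth[r:r + size, c:c + size]
        means[i] = win.mean()
        cvs[i] = win.std(ddof=1) / win.mean()
    return means, cvs

rng = np.random.default_rng(0)

# Hypothetical stand-in for a lidar snow depth map (metres per pixel):
# thin, variable background snow plus three deep drift bands.
depth = rng.gamma(shape=4.0, scale=0.05, size=(500, 500))
for col in (60, 210, 380):
    depth[:, col:col + 15] += 1.5

# Vary window location (random) and area (size), then summarize.
results = {size: window_stats(depth, size, 200, rng) for size in (10, 50, 200)}
for size, (means, cvs) in results.items():
    print(f"window {size:>3} px: mean depth {means.mean():.2f} m, "
          f"median CV {np.median(cvs):.2f}")
```

In this toy field the small windows mostly miss the drifts and return the background CV, while the large windows always contain at least one drift and return a higher CV. As a side note on whether CV has a physical ceiling: in the idealized limit where a fraction f of the area is drifted and the rest is essentially bare, CV tends to sqrt((1 - f) / f), which exceeds 1 whenever f < 0.5 and exceeds 4 once drifts cover less than about 6% of the area.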
By the way, a quick look at Wikipedia indicates that for small samples, CV is low-biased.
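The small-sample bias is easy to check numerically. The sketch below is an illustration only: it assumes depths drawn from a gamma distribution with shape 1 (so the true CV is exactly 1), which is a convenient hypothetical choice and not a claim about the actual depth distributions at the study sites.

```python
import numpy as np

def mean_sample_cv(n, n_trials, rng, shape=1.0, scale=0.3):
    """Average sample CV over n_trials samples of size n from a gamma distribution."""
    x = rng.gamma(shape, scale, size=(n_trials, n))
    return float(np.mean(x.std(axis=1, ddof=1) / x.mean(axis=1)))

rng = np.random.default_rng(1)
true_cv = 1.0  # for a gamma distribution, CV = 1 / sqrt(shape)

# Compare the average estimated CV for small and large sample sizes.
estimates = {n: mean_sample_cv(n, 5000, rng) for n in (5, 20, 500)}
for n, cv in estimates.items():
    print(f"n = {n:>3}: mean sample CV = {cv:.3f} (true CV = {true_cv})")
```

With only a handful of depth measurements per domain, the average estimated CV falls noticeably below the true value, and the gap closes as the sample size grows. This is one more reason to be cautious about an apparent plateau in CV derived from sparse sampling.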

(see WORD file for tables and figures)
The other aspect of the paper that bears some thought, and is related to the above point, is how wind slab and depth hoar fractions must interact.
Step 1 in approaching this would be to explain in greater detail how those types of snow were identified in the snow pits in this study. I was struck by the relatively close density values reported in the study for depth hoar and wind slab (means 266 vs. 335 kg/m³). The former value is typical for mildly indurated tundra depth hoar, but the latter is quite low for tundra wind slab, which can exhibit values up to 550 kg/m³. Wind slabs of 300 kg/m³ are often soft and hardly wind-worked at all, and in addition, many less experienced field practitioners fail to note small and newly faceted grains in wind slab of this nature. Then there is the problem of "indurated depth hoar" (Sturm et al., 2008; Derksen et al., 2009; Domine et al., 2018), snow layers that were wind slab but have metamorphosed into depth hoar. Presumably the key reason for differentiating these textures for microwave remote sensing is that the ornate, hollow and plate-like depth hoar grains scatter microwaves far better than the wind slab, hence the need to subdivide the pack into those two fractions. The relatively similar values of SSA (Figure 4a) for slab and hoar suggest to me the authors were dealing with a continuum of properties rather than a truly distinct bimodal snow pack.
I went back to the paper the authors referenced for the two-component snow model they used (Saberi, N., Kelly, R., Toose, P., Roy, A., Derksen, C., 2017. Modeling the observed microwave emission from shallow multilayer Tundra Snow using DMRT-ML. Remote Sens. 9. https://doi.org/10.3390/rs9121327) and was pleased to see that a long-forgotten paper of mine (Sturm, Matthew, Thomas C. Grenfell, and Donald K. Perovich. "Passive microwave measurements of tundra and taiga snow covers in Alaska, USA." Annals of Glaciology 17 (1993): 125-130.) had been used in developing that model. That work showed that volume scattering by depth hoar was more than 6 times as effective as that by wind slab.
It should be possible to go beyond the findings of Rutter et al. (2019) for TVC, where the DHF was shown to stabilize at 30% for depths over 60 cm, but without an explanation of why. Figure 2 in this paper shows, for both study sites, long tails on the distributions out to 150 cm, while the mean depth appears to be about one third of that value. In a recent paper, Parr et al. (2020) defined a drift depth threshold as being approximately the mean plus 1 standard deviation, so that "extra" depth is statistically likely to be transported snow. A different way to look at Figures 5 and 6 is that for the mean snow depth half the pack is depth hoar; where the pack has been scoured (drift snow removed) that fraction is higher; where the snow is drifted, that fraction is lower. Perhaps the depth at which the fraction becomes lower would be the mean plus 1 standard deviation... I am not sure. But some attempt to understand the processes behind the statistics (Bayesian or otherwise) could help generalize the results beyond two very specific tundra locations.
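The drift threshold idea can be illustrated in a few lines. This is a sketch under assumptions: the threshold is taken as the mean plus one sample standard deviation, following the approximate description above, and the depth values are made up for illustration (they are not data from the paper or from Parr et al.).

```python
import numpy as np

def drift_threshold(depths):
    """Approximate drift threshold: mean depth plus one sample standard deviation."""
    d = np.asarray(depths, dtype=float)
    return d.mean() + d.std(ddof=1)

# Hypothetical snow depths (cm) with a long drifted tail.
depths_cm = np.array([12, 18, 25, 30, 34, 38, 41, 45, 52, 60, 95, 150], dtype=float)

threshold = drift_threshold(depths_cm)
drifted = depths_cm > threshold          # likely transported (drift) snow
scoured = depths_cm < depths_cm.mean()   # shallower than the mean depth

print(f"mean = {depths_cm.mean():.1f} cm, threshold = {threshold:.1f} cm")
print(f"drifted points: {drifted.sum()}, below-mean points: {scoured.sum()}")
```

Grouping pit or probe measurements this way (below the mean, between the mean and the threshold, above the threshold) and comparing the depth hoar fraction among the groups would be one simple way to test whether the drift process explains the pattern in Figures 5 and 6.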
In conclusion, I recommend this paper be published, but not before the authors go back and put more critical thought into what drives their results: the processes and nature of the tundra system. They should then try to make the results more applicable to all tundra snow. Whether that is "major" or "minor" revisions is, well, probably just semantics.

Minor Comments
For much of tundra snow, tussocks, rather than shrubs, are a control on the DHF. Also, since shrubs can be laid down under the snow (and frequently are), a relationship between depth hoar and/or wind slab and NDVI seems tenuous at best.
Line 335: "… while the mean depth (μ_sd) is dependent on precipitation at a larger scale…". This is categorically NOT true for much tundra snow, where I would contend that wind plays as strong a role as, and sometimes a stronger role than, the mean precipitation within a domain.
Line 344: "…potential underestimation of the CV_sd parameter." See above discussion of CV. The issue of what constitutes a representative domain (or snow landscape) is thorny. Clearly, if a domain fails to include, say, drifts, the CV will be too low. Likewise, if the domain is limited to a coupled drift and scour zone, it will be too high.
Language: Mostly clear and reasonably concise; a few minor awkward areas that could be improved.