Review of
 Estimating the snow depth, the snow-ice interface temperature, and the effective temperature of Arctic sea ice using Advanced
 Microwave Scanning Radiometer 2 and Ice Mass balance Buoys
 data
 
 by
 
 Kilic, L., et al. - REVISION 1 -
 
 Dear authors,
 
 You have done a good job of improving parts of the manuscript, so that many passages are now much easier to understand. However, I do not think it is ready for publication yet and, in my view, it should not yet be accepted. The main reason is that I am still missing critical reflections on A) the interdependence of the input data used, B) the uncertainty of the input data (namely the potential for OIB and IMB data to be biased) and its propagation into the final products, and C) some of the figures displaying the results (namely Figures 8, 9, and 12). In a regular paper, A) and B) would be discussed in the Discussion section, which here is instead used to present the "end product" with a little discussion of it. This does not replace a critical review of how reliable this suite of regressions on partly correlated and partly potentially biased input data is.
 I therefore recommend that the authors sit together for a further revision of the manuscript before it can be accepted for publication.
 
 Up front, I have to admit that I am not particularly happy with the manner and the degree of detail in which the authors have responded to my review. Many questions were ignored and hence not answered, neither in the reply to the comments nor in the manuscript. A considerable number of the questions and comments were addressed in the manuscript in a not very convincing manner. This applies, for instance, to citing literature that does not provide the basic information required (see the P2, L24 comment below).
 
 - Another example is this one: One of your replies to my review (original manuscript Page 6, Line 14-20) was: "Yes, I computed the snow depth using the AMSR2 Tbs at 19V and 37V following the equations/coefficients described in Markus and Cavalieri, 1998." and then later "We removed MandC98 comparison. MandC98 is not designed for Arctic. It was here as a reference for the comparison but we do not want to evaluate it." The way you reply to the comments is confusing. In addition, what you write is not correct. Yes, MandC98 is not designed for the Arctic. But one can of course use it over first-year ice. This has been done, there are papers about it, and there are even freely available data sets based on it. So please pay attention to what you write in the reply to reviewers' comments as well, particularly for a journal like "The Cryosphere" where the discussion can be seen by others.
 
 - And finally this one, where you seemingly ignored a few of my questions and comments. My comments:
 P5, L25 to P6, L6:
 - Please explain why you use the OIB product with the much better spatial coverage and hence representation of the satellite footprint conditions only for the forward selection. Would it have been more straightforward and logical to carry out both, the forward selection AND the regression using the OIB data? What is the added value using the IMB snow depth values?
 - Please provide an additional table in which the results of the statistical forward selection are summarized.
 - Please explain the statistical measures used in the forward selection. May I ask whether you tried all frequency and polarization combinations? How many in total did you try?
 - Please provide at least an example, e.g. a scatterplot or 2-dimensional histogram, in which you illustrate the relationship between the 3 channels used for the best retrieval and the OIB snow depth data. It would be very intriguing to see how much the measurements scatter around the regression lines.
 
 Your reply: "We want our snow depth algorithm to be optimized for IMB measurements. The IMBs also measure the Tsnow-ice which is one of our interest variable and the Teff is derived from the Tsnow-ice. So the OIB data were chosen for the forward selection only because the forward selection was not satisfactory with the IMB data as the snow depth variability is limited.
 It is a stepwise forward selection. To select the most relevant AMSR2 channels, the stepwise regression (Draper,N. R., and H. Smith. Applied Regression Analysis. Hoboken. NJ: Wiley- Interscience,1998. pp. 307-312.) was used. It is a sequential parameter selection technique designed specifically for least-squares fitting. The method begins with an initial model, at each step p-value are computed and predictors included in the model are adjusted. We can constrain the number of predictors (here AMSR2 Tbs at different channels) to as many as we want."
 
 But these are just reflections of my impression of how the authors dealt with the review. More important are the following concerns:
 
 What the authors still fail to provide, in my eyes, is a proper discussion of the uncertainties involved. The linear regressions are treated as if they provide the truth, and I have difficulty finding a critical review and discussion of the results that addresses this. Training and evaluating with independent parts of a data set that is potentially biased does not improve the result. It only shows that the results obtained potentially carry the same bias as the data used for "evaluation".
 - OIB data have a certain uncertainty, are known to underrepresent thick snow over deformed sea ice and also over MYI, and to have problems with a particularly thin snow cover, as evidenced in the literature of the past 5 years.
 - IMB snow depth estimates have an uncertainty which is briefly mentioned in the data/methods section but whose impact on the results is not discussed further.
 - The interdependence of the products and methods (see my first review), which arises from using the same channel combinations or derivatives of them in almost all steps of the production chain, is not discussed; I find the one added sentence a bit short.
 - A detailed discussion of Figures 8 & 9 is still missing, even though the authors started to give some information in the reply to my review. I want to encourage the authors to write more of this into their paper! It will give the reader the impression that the authors have thought critically about the results obtained and that they are aware of the limitations and caveats in input data and methods. Publishing a resulting data set on a web page is, by the way, not a quality marker and cannot replace an independent quality assessment.
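
The point about evaluating against a potentially biased reference can be illustrated with a minimal synthetic sketch. All numbers, variable names, and the single-predictor linear model below are assumptions made for illustration only; nothing here is taken from the manuscript or from the OIB/IMB data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: none of these values come from the manuscript.
n = 2000
true_depth = rng.uniform(0.05, 0.45, n)             # "true" snow depth in m
predictor = true_depth + rng.normal(0.0, 0.03, n)   # stand-in for a TB-derived predictor
bias = -0.05                                        # assumed bias of the reference data in m
reference = true_depth + bias + rng.normal(0.0, 0.02, n)  # biased reference measurements

# Independent training and evaluation halves of the same (biased) reference set.
train = np.arange(n) < n // 2
test = ~train

# Ordinary least-squares line fitted on the training half only.
slope, intercept = np.polyfit(predictor[train], reference[train], 1)
retrieved = slope * predictor[test] + intercept

mean_diff_vs_reference = float(np.mean(retrieved - reference[test]))
mean_diff_vs_truth = float(np.mean(retrieved - true_depth[test]))
print(f"mean difference to held-out reference: {mean_diff_vs_reference:+.3f} m")
print(f"mean difference to true snow depth:    {mean_diff_vs_truth:+.3f} m")
```

In this toy setting the held-out evaluation looks nearly unbiased, while the retrieval inherits the assumed reference bias relative to the true snow depth. An independent split alone therefore cannot reveal a bias shared by training and evaluation data.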
 
 Other than that I have the following remaining concerns:
 - A diagram illustrating the suite of methods used would still strongly aid understanding of the paper. It would also show much more clearly (and perhaps resolve my concern in this regard) which parameters and input TBs enter which part of the retrieval. Such an illustration could serve as a perfect starting point for better estimating the uncertainties of the retrieved parameters, which partly depend on each other, so that errors in one parameter propagate into the next.
 - I had hoped that the concept of a "centred" TB would be reformulated so that it is easier to understand, e.g. by writing about "deviations from a mean TB", "residual TBs", or similar.
 - Once again: The produced quantities rely heavily on the usability and applicability of IMB and OIB measurements for your purpose. Therefore, even though these data are taken from the RRDP, it is in my eyes not sufficient to refer to the documentation there, particularly in the case of the OIB data. This is a remote sensing product of the snow depth and not an in situ snow depth measurement.
 - It is still not clear what the error in the IMB snow depth measurements is and what its impact on the results is. Perhaps everything is in place and readily explained, but I did not find a statement like: "Deployment of the thermistor chains used in the IMBs is always such that one known thermistor is placed exactly at the snow-ice interface." *** With such a deployment, the 5 mm precision with which the acoustic sounder measures the distance to any surface (snow or ice) would allow a precise measurement of the snow depth, albeit at one single place only. This is unlike the AWI buoys deployed in the Southern Ocean, where 4 snow depth measurements are made and averaged to avoid biases due to snow drift or similar, which is another error source ignored by the authors. Any statement like the one marked *** would also help in the discussion of how accurately we know the snow-ice interface location and temperature.
 - I am still not particularly satisfied with the discussion of Figure 12, your main end product. One thing I have difficulty with is the strong gradient in Tsnow-ice in an area where the MYI concentration is high and where the snow depth is >= 0.4 m in both the January and April cases. In addition to the areas with a snow depth >= 0.4 m, which I comment on further down in this re-review, I am also wondering about the quite large areas with a snow depth below 5 cm and about the dynamics of the snow depth distribution over the course of the winter shown.
 
 P2, L24: Although one can find information about sea-ice emissivities in the paper by Spreen et al. (2008) I would definitely prefer citing an older reference, potentially one where these emissivities have actually been measured.
 
 P3, L8-10: I am very happy with the more detailed description of what Teff is. The only problem in understanding is the expression "integrated" being used for temperature as well as emissivity. What is the difference between the integrated and simply the mean temperature or emissivity of the respective layer?
 
 P4, L8-18: In Line 12 you write "The acoustic sounder measures the position of the snow and ice surfaces with a ... computed". In the response to the reviewers' comments you clearly state that there are two sounders: one measures the location of the snow (or ice) surface and the other one measures the location of the ice underside. Both together yield the total sea ice plus snow thickness. Why is the response more detailed than what you write in the text?
 - The accuracy of the snow depth (and ice-snow interface location) is unknown, right? - unless one of the thermistors is placed exactly at the ice-snow interface.
 
 Page 5, L4-5: Since the OIB data set is an important ingredient of your soup, it might make sense to be very careful (and critical) with statements about the resolution and uncertainty of the OIB snow depth products. In addition to (and since) Kurtz et al. (2013), quite some research has been conducted, and its results point to i) a vertical resolution rather twice the one quoted (see e.g. Kwok and Maksym, 2014) as well as ii) a minimum snow depth of about 8 cm required for unambiguous detection (again Kwok and Maksym, 2014; Holt et al., 2015). A summary of various approaches and their limitations has been given by Kwok et al. (2017, https://doi.org/10.5194/tc-11-2571-2017). This latter reference would also be perfectly suited, particularly for OIB, for Line 10, in addition to the reference given already.
 
 Page 8, Line 1-3: I suggest providing the months here, i.e. March/April for OIB and October (?) to April for IMB.
 In addition: What you write here leads me to conclude that the "forward selection" of the relevant channels mentioned earlier, which is based on OIB data, perhaps does not have that much influence on the results.
 
 Page 9, Line 9-12: Yes, thank you. Exactly. And the same is true for the snow depth because a surface measurement with 5 mm precision is worth nothing (harshly put) if the location of the snow-ice interface is only known with 50 mm accuracy.
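
As a rough worked example of this point: if one assumes that the two errors are independent and combine in quadrature (my assumption, not a statement from the manuscript), the combined snow depth uncertainty is dominated almost entirely by the interface term. The variable names below are purely illustrative.

```python
import math

surface_precision_mm = 5.0     # precision of the surface position measurement (from the text above)
interface_accuracy_mm = 50.0   # accuracy of the snow-ice interface location (from the text above)

# Assumption: independent errors, combined as the root sum of squares.
snow_depth_uncertainty_mm = math.hypot(surface_precision_mm, interface_accuracy_mm)

# Fraction of the combined variance contributed by the interface location.
interface_share = interface_accuracy_mm**2 / snow_depth_uncertainty_mm**2

print(f"combined snow depth uncertainty: {snow_depth_uncertainty_mm:.1f} mm")
print(f"variance share of interface term: {interface_share:.1%}")
```

Under this assumption the combined uncertainty is about 50.2 mm, with roughly 99 % of the variance coming from the interface location.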
 
 Page 12, Line 2: I doubt that Spreen et al. (2008) do make any statement about ice types in relation to 10V and 6V GHz channels of the AMSR-E instrument. Please correct and choose a correct reference.
 So, following your statement, it doesn't matter whether I have 3 m thick MYI or 50 cm thick FYI: with a 10V or 6V GHz channel both look the same, and there are no issues with different penetration depths and salinities?
 
 Page 15, L1 and 4/5: I would find it more logical to have the details of the MYI product mentioned directly after its first mention here, i.e. the sentence in Line 4/5 should move to Line 1, after the URL of the MYI concentration data set.
 
 Figure 12: Is the maximum retrievable snow depth 40 cm? I am asking because I am wondering about the areas where, according to the color scale, the snow depth is "exactly" 40 cm. Perhaps you could either shed light on this in the text or provide a histogram of the snow depth distribution for the three maps shown, so that a reader can discover more details.
 - In addition to the white dots denoting, e.g., the meridians, there is quite some other noise in the maps of the snow depth and possibly also in the maps of the snow-ice interface temperature. Can you comment on this? Or is this simply a low-quality figure used for peer review, and the final figure will be of enhanced quality without noise / gaps?
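
The kind of histogram check I have in mind for the snow depth maps could look like the following minimal sketch. The synthetic field, the 0.40 m limit, the tolerance, and the function name `share_at_upper_limit` are all hypothetical and only illustrate the idea; the actual retrieval output would be used instead.

```python
import numpy as np

def share_at_upper_limit(snow_depth_m, upper_limit_m=0.40, tol_m=0.005):
    """Fraction of valid grid cells within tol_m of (or above) the upper limit."""
    valid = snow_depth_m[np.isfinite(snow_depth_m)]
    return float(np.mean(valid >= upper_limit_m - tol_m))

rng = np.random.default_rng(1)

# Hypothetical snow depth field in m, clipped at 0.40 m to mimic a retrieval limit.
unclipped = rng.normal(0.25, 0.12, 10_000)
clipped = np.clip(unclipped, 0.0, 0.40)

# Histogram with 0.05 m wide bins from 0 to 0.40 m.
counts, edges = np.histogram(clipped, bins=np.linspace(0.0, 0.40, 9))
share = share_at_upper_limit(clipped)

for lo, hi, c in zip(edges[:-1], edges[1:], counts):
    print(f"{lo:.2f}-{hi:.2f} m: {c}")
print(f"share of cells at the upper limit: {share:.1%}")
```

A pronounced pile-up in the last bin, together with a large share of cells at the limit, would indicate that the retrieval saturates rather than that the snow depth is physically 40 cm over these areas.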