Inter-comparison and evaluation of Arctic sea ice type products

Ye, Yufang; Luo, Yanbing; Sun, Yan; Shokr, Mohammed; Aaboe, Signe; Girard-Ardhuin, Fanny; Hui, Fengming; Cheng, Xiao; Chen, Zhuoqi

doi:https://doi.org/10.5194/tc-17-279-2023

Articles | Volume 17, issue 1

© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article | 20 Jan 2023

Final revised paper (published on 20 Jan 2023)
Preprint (discussion started on 19 May 2022)

Interactive discussion

Status: closed

RC1:
'Comment on tc-2022-95', Anonymous Referee #1, 17 Jun 2022

Review of “Inter-comparison and evaluation of Arctic sea ice type products”, by Ye et al.

Summary

This paper compares different sea ice type products currently available to the community. The products are based on passive microwave data, scatterometer data (C or Ku band), or a combination of both. The products have been developed empirically via training data. The type fields are inter-compared and evaluated against a widely-used sea ice age product and SAR retrievals. The products perform better in mid-winter than in early or late winter when melt/re-freeze may occur. Ku-band scatterometer generally is better at type discrimination. Combination of passive microwave and scatterometer data can yield better performance at times, but not in all situations.

General comment

This is a fairly comprehensive review of the primary sea ice type products available. There are notable differences in how the products are assembled, the input source data, and their performance in different conditions. Thus, this paper is a valuable contribution to the community by providing such an assessment. The paper is quite thorough and overall it does a good job in presenting the inter-comparison and evaluation of the products.

Specific comments are below, but one overall comment is on the SAR data used for evaluation. In general, SAR is going to be the best “truth” for comparison. It is high resolution, so it can often delineate even individual floes. And it is all-sky, so retrievals of type are available anywhere the sensor collects imagery. However, the challenge with SAR is interpreting the imagery. The authors interpret the SAR imagery and classify various locations as a given ice type, but they don’t give a particular rationale or provide references for their classification basis. Often, expert ice analysts interpret SAR fields for operational ice charts. They have deep experience in understanding the imagery and properly defining features. It appears the authors here classify the imagery themselves. This is okay, but I would like to see more substantial justification for their classification.

Another weakness with the SAR comparison is that it is just a few scenes in selected regions and selected periods. And even within the SAR scenes, a few specific locations are picked out as “pure types” for comparison. Ideally, a full SAR image would be classified and compared. I know automated SAR classification algorithms for sea ice are troublesome, so I can understand the approach taken, but it results in a fairly ad hoc and qualitative evaluation. Since this paper is otherwise quite comprehensive, I won’t request more evaluation, but ideally (perhaps in a future paper), it would be good to get classified SAR images – perhaps from an expert ice analyst at an operational ice center – and conduct a more comprehensive and quantitative evaluation of the ice type products.

A final note is that there is a need for a thorough copy edit for English language style and grammar. The issues are mostly minor – in particular, there are numerous missing articles (“the”, “a”, “an”) – but they are widespread throughout the manuscript. I don’t bother to point them out individually as they are too numerous, but they need to be addressed before final publication.

I recommend publication after minor revisions.

Specific comments (by line number):

11: The authors define “sea ice type” as “SIT” here. This is fine and it is used consistently throughout the manuscript. However, as a sea ice scientist, “SIT” means “sea ice thickness” to me. And particularly with numerous thickness products coming out from altimeters, “SIT” is becoming quite common in the community to denote thickness. I can understand wanting to use an abbreviation and “SIT” makes sense for ice type, and the context is clear throughout the manuscript. So, I can’t say it needs to be changed, but it might be something for the authors to consider. For me, every time I saw it, “thickness” popped into my mind first until I recalibrated. I can’t think of another good abbreviation myself, but one could just use “type” or “Type” as a short-hand, instead of “SIT”.

28-30: I’m struck by the use of more than one author listed and then “et al.” in the citations – i.e., “Comiso, Parkinson, et al., 2008”. Generally, if there are more than two authors, just the first author is listed followed by “et al.” – i.e., it would be “Comiso et al., 2008”. In looking at The Cryosphere guidance for citations, I don’t see anything that indicates two authors should be listed, so I’m not sure of the rationale. This seems to be done throughout the manuscript. (If there are only two authors, you list both, e.g., if it were “Comiso and Parkinson, 2008”.) Not a big deal and I assume the copy editing will decide the proper citation format. I just haven’t seen this before and it struck me as odd.

31-32: Be careful about terminology. “Thin” and “Young” ice are standard stage of development classifications. I think here you mean “thinner and younger” for FYI, and then “thicker MYI”. I’m also not sure what you mean by “firm” in relation to MYI?

57: “ergodic” is an obscure word – I was not familiar with it. Based on my understanding after looking it up, I’m not sure it is used properly here. Regardless, I think a simpler word is appropriate here or I wonder if it is needed at all – “combined use of both data” is clear to me.

62-63: “While ice chart…” is a confusing sentence – not sure what it is saying. I would suggest revising.

72: Just one example of grammar/style issues: “…are detailed investigated.” – It should be “are investigated in detail.”

107-109: Is AMSR-E used in the product? The description indicates only AMSR2 is used. So, why describe AMSR-E characteristics? Why not just describe AMSR2 characteristics?

109: Maybe another grammar/style issue: “working” is okay, but typically when describing sensors or satellites, “operating” or “collecting data” are more common. “working” seems a bit colloquial here.

147: This goes for all products, but noting here because NSIDC products have specified references that should be used. For SIA, it is:

Tschudi, M., W. N. Meier, J. S. Stewart, C. Fowler, and J. Maslanik. 2019. . [Indicate subset used]. Boulder, Colorado USA. NASA National Snow and Ice Data Center Distributed Active Archive Center. doi: https://doi.org/10.5067/UTAV7490FEPB. [Date Accessed].

This should be cited in the manuscript text and listed in the references. I see that the dataset website is noted in the Acknowledgment section, but where a reference is provided, it should be included in the manuscript proper, including the dataset DOI. I know all datasets do not provide a formal citation and/or DOI – for example for OSI-SAF, their recommended citation is simply: “The type dataset shall be referred to as the .” If that is all that is provided, that is fine, though I would also say that the product ID (OSI-403-d) and version (if provided) should be included. The other datasets used should be cited to the extent they properly can be.

185-186: I think the potential for MYI increase could be explained better here. In practice, overall Arctic MYI cannot increase over the winter – it can only decrease via advection out of the Arctic. “Temporary” increases can happen within products due to divergence – e.g., a 100% MYI pixel diverging into two pixels with 50% ice each; if the threshold for detection is <50%, there will now be two pixels. And regionally, MYI can increase, both due to divergence or due to advection into the region from neighboring regions.
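
The divergence effect described in this comment can be illustrated with a small numerical sketch. The cell area (625 km², i.e. a 25 km grid) and the 40 % detection threshold are assumptions chosen only for illustration, and the function names are hypothetical; they are not taken from any of the products under review.

```python
# Minimal sketch: MYI *extent* can grow under divergence while MYI *area* is conserved.
# CELL_AREA_KM2 and THRESHOLD are illustrative assumptions, not product settings.

CELL_AREA_KM2 = 625.0   # assumed 25 km x 25 km grid cell
THRESHOLD = 0.4         # assumed detection threshold (below 50 %)


def myi_extent(concentrations, threshold=THRESHOLD, cell_area=CELL_AREA_KM2):
    """Total area of all cells whose MYI concentration reaches the threshold."""
    return sum(cell_area for c in concentrations if c >= threshold)


def myi_area(concentrations, cell_area=CELL_AREA_KM2):
    """Concentration-weighted MYI area, which divergence alone does not change."""
    return sum(c * cell_area for c in concentrations)


# One cell fully covered by MYI next to an empty cell ...
before = [1.0, 0.0]
# ... diverges into two cells with 50 % MYI each.
after = [0.5, 0.5]

extent_before = myi_extent(before)
extent_after = myi_extent(after)
area_before = myi_area(before)
area_after = myi_area(after)

print(f"extent: {extent_before} -> {extent_after} km2")
print(f"area:   {area_before} -> {area_after} km2")
```

Under these assumptions the extent doubles although the amount of MYI is unchanged, which is the kind of "temporary" increase a threshold-based product can show.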

191: This is discussed a bit more later, but this left me hanging: “why such a dramatic peak in the first half of winter?” Maybe provide a brief explanation and then say it will be discussed further later in the paper.

204: I would use “to” instead of a “-” because it looks like a minus sign. Or use an “em-dash” or “en-dash” with spaces on each side.

219: Figure 5 is mentioned quite cursorily here, but I notice the behavior of several products in BS during 2016-2017. That sticks out compared to other years and regions. Why was the performance so different?

224-225: Okay, the KNMI-SIT increase is mainly in the BS and ESS regions. But why? In general, this paragraph (223-229) feels like it needs to drill down a bit more and give more detail/explanation.

259: Kind of the same thing here. Okay, you have an overestimation of MYI, but that doesn’t specifically explain the abnormal increase in MYI during 2016-2017. Why was the MYI overestimated in the one year versus others?

265: How are cases selected? Were they ad hoc? Random? Was it simply availability of imagery? Or was there some physical rationale to select the scenes? I understand in general wanting different regimes and different time periods, but why those specific images on those specific days at those specific regions? In other words, what “different conditions” were you selecting for here?

268-271: Following from my general comment above, how were characteristics of the SAR images used for visual interpretation? What is the basis? There are no references here to justify the classifications.

273-278: This paragraph illuminates the previous comment. The text is very “squishy”. You say things like “appears to be MYI”, “more likely to be MYI”, “which could be interpreted as newly generated FYI”. This is very qualitative and seemingly tentative. I think maybe you just need to say why something “appears to be MYI” – what is in the SAR image that leads to that conclusion and what is the basis for that?

288-293: Same here. You have “could be identified as MYI”, “are a typical feature of FYI”.

299: I guess there is a thematic reason for the order – looking at early and late winter as “edge” cases, but it seems more logical to order these subsections chronologically: early winter, mid-winter, and then late winter.

315-319: Again, very qualitative.

391-446: I can see the logic of discussing the methods here – you are linking them to the performance assessed previously. However, to a large extent, this feels like it should go with the data product descriptions in Section 2.1. I guess moving this there would make that section rather long. But I kind of feel like I get to here and I finally understand how the type products are created – after all of the evaluation and comparison. I’ll leave it to the authors to decide if this fits better in 2.1 or should stay here. Or maybe, put some description in 2.1 and then the relation to the product evaluation here.

400: “[55]”? Is this a numbered reference?

434-436: Melt affects the performance in early and late winter. But melt also basically makes the algorithms ineffective in spring and summer. That is implied, but never really stated explicitly it seems.

Figure 5: Following up from above, I’m struck by the notable increase in many products in BS during 2016-2017. That is not noticeable in another region or year. There is some discussion, in relation to OSI-SAF in Figure 7, but the text doesn’t really discuss this. I think this deserves some explanation in the text.

Figure 5: What is the shaded green region that accompanies OSISAF? Maybe it is in the main text and I missed it, but regardless, it should also be included in the caption for the figure.

Figure 6-7: This is a style/aesthetics thing, but the beige/brown for the OW seems odd. Such a color is more commonly used for land, and definitely not for ocean. I would suggest considering a different OW color – just swapping the land color (light gray) for the ocean color, would be more logical to me. Of course, you want to make sure that the colors contrast and are clearly delineated. But a good solution other than beige/brown for OW seems possible.

Citation: https://doi.org/10.5194/tc-2022-95-RC1
- AC1: 'Reply on RC1', Yufang Ye, 25 Aug 2022
  
  The comment was uploaded in the form of a supplement: https://tc.copernicus.org/preprints/tc-2022-95/tc-2022-95-AC1-supplement.pdf
  
  Citation: https://doi.org/10.5194/tc-2022-95-AC1
RC2:
'Comment on tc-2022-95', Anonymous Referee #2, 21 Jun 2022

General comment

This manuscript presents the inter-comparison of various SIT products from microwave remote sensing data. The performance of the SIT products was evaluated and the causes of differences in the products were analyzed. SIT has been used as important information in research on global climate change and future prediction. Therefore, a comparative study on the performance of the operationally used SIT products is of high importance.

The manuscript is well written, and appropriate tables and figures are used to explain the results. It seems very meaningful to analyze the comparison results in time and space. However, in order for this manuscript to be published in The Cryosphere, more descriptions should be added about the data and methodology used in the study (requires a section on methodology). More discussion of the results is required.

Specific comments

Abstract: Specify the names of the SIT products (algorithms) analyzed in this study.

Line 27: Please state clearly why sea ice is a sensitive indicator of climate change.

Line 29: It would be nice if it quantitatively indicated how much the thickness and volume of sea ice decreased.

Line 35-37: Please specify how sea ice patterns affect Arctic and mid-high latitude regions and how they affect Arctic ecosystems.

Line 68: The authors posed three scientific questions, but the second question (how we choose SIT product for different applications) is lacking in discussion.

Line 83: How are the microwave scattering and radiometric characteristics of MYI and FYI different?

Line 77: Each product has a different grid size. It should be explained how it was dealt with in the comparative evaluation.

Line 78-143: Please describe in more detail which characteristics each SIT algorithm uses to distinguish FYI from MYI. For example, if a SIT product is produced based on PR and GR, an explanation is required for the differences between the values of PR and GR of ice types and why the differences occur.

Line 150: NSIDC-SIA was used as reference data. How accurate is NSIDC-SIA?

Line 153: How and what information was retrieved from the SAR images for the SIT products evaluation should be described.

Line 173: Is it the result of this study that different SIT distribution patterns were found in the regions selected by the authors?

Line 185-186: Is ‘divergent movements’ the only cause of increase in the MYI extent?

Line 263: The authors compared SIT daily products with the SAR images. It is necessary to discuss the comparison between the image captured at a specific time and the daily product.

Line 263: The authors identified the distribution of MYI by visually analyzing the SAR image. It would be better if MYI could be determined by quantitatively analyzing backscattering or textures from the SAR images.

Line 368-371: How are the input parameters affected by atmospheric factors and surface features? More discussion is needed.

Line 393: Explain clearly about the training dataset.

Technical comments

Line 157: SAR Wide B --> SAR Wide Beam

Line 199: What does (2000) mean?

Line 402: Is [55] a reference number?

Figure 4: Delete ‘Jan’ on the horizontal axis.

Citation: https://doi.org/10.5194/tc-2022-95-RC2
- AC2: 'Reply on RC2', Yufang Ye, 25 Aug 2022
  
  The comment was uploaded in the form of a supplement: https://tc.copernicus.org/preprints/tc-2022-95/tc-2022-95-AC2-supplement.pdf
  
  Citation: https://doi.org/10.5194/tc-2022-95-AC2
RC3:
'Comment on tc-2022-95', Anonymous Referee #3, 22 Jun 2022

Review of

Inter-comparison and evaluation of Arctic sea ice type products

by

Y. Ye, et al.

Summary:

A large, albeit shrinking portion of the Arctic Ocean sea ice cover is made of multiyear ice (MYI) that has survived at least one summer melt season. In order to more accurately assess the trend in Arctic Ocean MYI cover and the coverage of first-year ice, and to more reliably use these ice type fractions in other research areas, such as sea ice thickness retrieval, it is important to evaluate the existing sea ice type products. This study is an attempt in this direction. Nine different sea ice type products based on five different algorithms are compared with the NSIDC sea ice age data set and the MYI extent derived from it as well as with a set of five qualitatively interpreted satellite synthetic aperture radar (SAR) images. Time series of the MYI extent at daily and monthly temporal resolution are shown, inter-compared and discussed qualitatively in the light of the different algorithms, their potential limitations and post-processing steps. The performance of the different products is compared for specifically selected sub-regions of the SAR images.

I have a number of concerns with this manuscript which I summarize in my general comments and detail in my specific comments.

I also would like to note that the manuscript is difficult to read because of quite a number of strange formulations and problems with English grammar.

General comments:

GC1: As the authors state, this is one of the first (kind of) comprehensive evaluations of sea ice type products. This calls for provision of a solid physical background of the sea ice and its snow cover as relevant for its remote sensing using active and passive microwave instruments. This element is missing and jeopardizes the usefulness of the entire manuscript.

GC2: The description of the input satellite data and the algorithms used in the products as well as in the one major evaluation data set used is very heterogeneous and not complete for the understanding of the manuscript and its results. At least two products (NASA-Team MYI concentration and ECICE MYI concentration) are missing in addition.

GC3: The inter-comparison contains few, if any, quantitative results. The results often appear to be quite hypothetical. As I see it, there are two main reasons for that. First, the NSIDC sea ice age data set used as the main evaluation data set requires an evaluation that justifies its usage for the purpose of this manuscript. In addition, there is a methodological inconsistency behind comparing daily sea ice type products with weekly sea ice age data. Secondly, the SAR images used are only interpreted in a qualitative way. With that they can be used as a means for a consistency check of the general performance of the sea ice type products - but only within the error margin proposed by this manual interpretation. Both together clearly reduce the value of this manuscript, which has the character of a pure, qualitative inter-comparison study with little in-depth recommendations resulting from it for i) which product to pick and ii) how to improve which product in which way.

GC4: The discussion of the results is not well linked to the existing literature.

Specific comments (contain some typos / editorial comments):

Abstract:

- I recommend that you consider finding and using a different acronym for sea ice type because I find "SIT" very often used as an acronym for sea-ice thickness. A possible alternative could be SITY. Or, since "type" is not really that long compared to the words, e.g. thickness or concentration, you might also consider writing the full expression all the time. But "SIT" is a bit unfortunate.

- I also recommend that you very briefly describe the various products named in the abstract. Perhaps they can be categorized into those products that rely solely on C-Band or Ku-Band data and/or products that use both active and passive microwave data? Please check the maximum allowed length of the abstract and perhaps delete details towards the end for more clarity of what types of products you did compare.

- I recommend to state upfront that by "sea ice type" you merely refer to multiyear ice and first-year ice. As you know, there is a number of other sea-ice types which you, however, do not appear to take into account.

- L13/14: "towards sea ice ... images" --> "against a sea ice age product and compared with five Synthetic Aperture Radar images"

- While you write in Lines 14/15 about results found at daily and monthly temporal resolution it is not clear whether all products used come at daily temporal resolution. I also note that the sea ice age data set comes at weekly temporal resolution.

- L14/15: Please also see my over-arching comment to the conclusions.

- You might want to re-phrase "anomalous fluctuations" because it is not clear what you mean by that in the context of an underestimation (Line 17).

- Under (3) you write about details with respect to the classification (Line 23). Is the retrieval of all products investigated based on a classification approach?

- I have the feeling that the "Additionally, the change of separation pattern ... SIT method" (Lines 24/25) could be deleted for the sake of having more room for the above-mentioned suggestions.

Lines 41-57: I suggest to better structure this paragraph and in addition provide more background information. Specifically I recommend to

i) Tell the reader that by sea ice type discrimination you are referring to distinguishing between FYI and MYI;

ii) Write what the fundamental differences in the physical properties of these ice types are that allow us to separate them by means of their microwave signature (be it for active or passive microwave sensors);

iii) Explain more clearly - but still briefly - what the different retrieval approaches are. It is for instance not clear whether the main approach used is a classification. The NASA-Team algorithm (see below) does not use a classification, neither does ECICE.

iv) Move information about evaluation results obtained by others so far into the next paragraph (see Lines 47/48: "By comparing ... Kwok 2004)."

v) Mention that different methods exist which either provide a fractional MYI/FYI coverage or a binary classification (or assignment of one or the other ice class to a grid cell).

- In addition, I recommend to delete the Lomax et al. 1995 paper and instead include literature related to the NASA-Team algorithm and to the ECICE algorithm, which both permit computing FYI and MYI fractions and are both so far missing completely in your list. I am wondering why you are not considering these products as well in your inter-comparison. I am also wondering whether it would not make sense to get hands on the MYI data sets created by Ron Kwok and used in various publications by him and his group.

Lines 58/59: "Comparison ... methods" --> While an evaluation of products is per se an excellent idea and the improvement of the used retrieval methods a good motivation, I strongly suggest to provide 1-2 sentences that specify more clearly why it is important to (finally) provide a more comprehensive evaluation of these products. The first paragraph of your introduction only tells the reader that sea ice type is important. But requirements about the accuracy and a specific example where an error in the sea ice type distribution of, e.g. 50%, would have which implications is not yet given in a convincing way.

Line 61: Why "limited"? There are plenty of ship observations (see e.g.: https://www.cen.uni-hamburg.de/en/icdc/data/cryosphere/seaiceparameter-shipobs.html )

Lines 62-64: "... some MYI ... in ice charts." --> I don't understand this sentence; please consider to re-phrase it.

Line 67: "operational" --> please check what you mean here by operational. Do you mean existing? Or are you really referring to all sea-ice type products that are currently operationally (aka daily) produced and provided to the users?

Lines 83-86: "Microwave radiometer ... 2016)" --> As stated already in the context of the introduction, it would make a lot of sense to include a paragraph that clearly describes the relevant physical properties of the different sea ice types that are relevant for their discrimination in the different active and passive microwave signals. This is required to understand the algorithm details and to understand their limitations (also during the freezing and/or shoulder seasons) and would be important for the discussion section as well. Since this is the first paper of this kind it is certainly worth to dig into physics here.

Line 91: "on coarse resolution" --> Please write in the text more clearly the grid resolution of the data and, if relevant, also the native resolution of the data used as input.

Lines 93/94: I can understand that Tb measurements are corrected for the atmospheric influence because it disturbs the sea ice signal. I cannot understand why you need to correct the Tb measurements for sea ice concentration ... What I can imagine is that you use an additional sea ice concentration product to restrict the analysis of the sea ice type on the sea-ice covered area. If this is the case then please write it accordingly. However, admittedly this would contradict a bit the next sentence about the Bayesian approach to discriminate open ocean and sea ice. In short: You need to rephrase these statements.

Line 96: I suggest to remove the "further" and to also provide an equation of how the gradient ratio is computed.
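
For reference, the gradient ratio is conventionally defined in passive microwave sea ice retrievals as a normalized difference of brightness temperatures at two frequencies. A sketch for the vertically polarized 37 and 19 GHz channels follows; the symbol $T_B$ for brightness temperature is an assumption here, and the manuscript's own notation and channel choice may differ.

```latex
\mathrm{GR}(37\mathrm{V}/19\mathrm{V}) =
  \frac{T_B(37\mathrm{V}) - T_B(19\mathrm{V})}
       {T_B(37\mathrm{V}) + T_B(19\mathrm{V})}
```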

Line 98: Where are these "fixed target areas" located? How are these selected? How large are these? Do these change annually? And: Why are these fixed?

Lines 100-102: "In 2021 ... scheme" --> My impression is that you are not including data of this new version into your comparison. Therefore I recommend to move this announcement towards the end of your paper, e.g. into the discussion where it could fit with your outlook / description of which improvements are (already) underway. But perhaps 2021 was a typo ...?

Lines 103-115: I recommend to comment on / give more details on:

i) the fact that the OSISAF-SIT is based on a very heterogeneous set of input parameters and on changes in the training data set (L111/112), which both could have an impact on the sea ice type product in terms of its consistency over time;

ii) what a "sigma_nought" is (Line 112) and in which way this variable is used (is it corrected towards a common incidence angle? for instance); what is the incidence angle range used? (Compare the next paragraph where you are comparably detailed as far as it concerns the Ku-Band scatterometers.);

iii) what the native resolution of the scatterometer data is;

iv) what a "swath projection" is (Line 112).

- I furthermore find the introduction of AMSR2 and AMSR-E the way done confusing. AMSR2 is available since July 2012 but it is included since 2016; whether AMSR-E data were at all used is not clear but AMSR-E is introduced.

- You describe the different sensors used with different degrees of detail; for instance you do not mention that SSM/I and SSM/IS are multi-channel radiometers with a number of frequencies while you do so for AMSR-E. You refer to "coarse" (previous paragraph) and "medium-resolution" (Line 106) as well as "higher spatial resolution" (Line 109) without a specific motivation. Why is it important to know the spatial resolution? How does the product (also the other products) actually deal with input data being available at different spatial resolutions?

- What is the spatial resolution achieved by ASCAT and what is the polarization used?

- What are "given weights" (Line 113)? How are these defined?

- L113-115 you could rephrase for improved clarity along the lines: Both C3S-SIT and OSISAF-SIT provide, in addition to the pure ice type classes FYI and MYI, an ambiguous ice type class that represents an unknown mixture of both ice types, referred to as "Amb". The products are provided with ...

Lines 120/121: "ASCAT is ..." certainly belongs either to the paragraph where you introduce ASCAT data for the first time. Or, alternatively, you could think about adding a sub-section wherein which you introduce all sensors and their specifications as far as relevant for this paper. Table 1 provides not enough information.

Lines 116-126:

- I recommend that within this paragraph you underline more clearly that KNMI-SIT is actually a synonym for three different sea ice type products of which you include two into your evaluation. I would then also avoid speaking of "the KNMI-SIT" but in general speak about KNMI sea-ice type products and then define KNMI-Q and KNMI-A as those you are referring to henceforth.

- While you refer to swath and grid in the previous paragraph you don't do this here. In which form are the data of the different scatterometers used within the sea-ice type retrieval? What is the grid resolution? What is the native resolution of the OSCAT and QuikSCAT data? Please refer to Table 1 / Figure 1 for clarification in terms of the time periods the different satellite data and hence sea-ice type products are available. Reading the text it is not clear which time periods the different (?) products cover.

- L123/124: "In KNMI-SIT ..." --> Does this apply to all three products? Or is there a merged product? Is this classification done after FYI and MYI have been separated? What is the difference in the microwave signal that is exploited to separate SYI from older ice?

- L125/126: "In this study, backscatter ... SIT products." --> I don't understand this sentence; please re-phrase it.

Lines 127-131:

- Like for the previous paragraph it is not entirely clear whether IFREMER-SIT is again just a synonym for the two other products IFREMER-Q and IFREMER-A or whether these two are merged to form one product.

- I note that you give a few details about IFREMER-A but not about IFREMER-Q.

- I am not sure I understand what you mean by "series of time-varying thresholds" ... What is this "series"? Are you referring to a time series of backscatter data for several winters as written in Line 130? What do you mean by "seasonally consistent"? That the values agree with each other through the course of the freezing season?

- While we learn here that the product is gridded to a polarstereographic grid, there is no information about the grid in the previous paragraphs.

Lines 132-137:

- "employs adaptive" --> "employs an adaptive"

- "based on the thought of clustering" could possibly be re-phrased. What kind of clustering approach is used? K-means?

- For the other approaches listed above that utilize radiometer data you state that the gradient ratio of the 37 and 19 GHz channels with vertical polarization are used. Which channels are used here?

- Is it correct that the approach combines coarse resolution radiometer data (what is the resolution? How is the difference in spatial resolution between SSM/I / SSMIS and AMSR-E/2 taken into account?) with fine resolution scatterometer data? What kind of radiometer data are used? Daily gridded? Swath? Which grid?

- You write that QuikSCAT and ASCAT are used successively. Does this mean that you use QuikSCAT data until the very end of its nominal time with regular data provision in 2009 (?) and only afterwards ASCAT? How does the algorithm deal with the substantial difference in sensing geometry and coverage?

Lines 140/141: "climate consistent data record SIT products" --> Given the heterogeneity of the products described in terms of the spatial resolution of the input data and the various combinations of frequencies and potentially also polarizations used, I doubt that any of the above-mentioned products yet deserves to be assigned to the group "climate consistent data record". Therefore, personally, I would skip this whole last paragraph (Lines 138-142); I don't think it is relevant for the paper.

Lines 147-150:

- The description of the sea-ice age product should be revised according to the information given in the more recent paper by Tschudi et al., Cryosphere, 14, 2020. In particular statements like "tracking of ice trajectories" should be avoided as should be wrong information about how the data set is derived like "passive and active microwave observations". This ice age data set is derived from the NSIDC sea ice motion data set (which in some way is described in the same paper).

- The paper by Korosov et al, 2018, is about the deficiencies and limitations of the NSIDC sea ice age data set but should not be cited in the context of its description. I can kind of guess that you added this information "limited by the simple drift model and the oldest ice age assignment of grids" to illustrate that the NSIDC sea ice age data set may have its limitations but this would need to be explained in far more detail than in half a sentence. In fact, it is likely that the sea ice age product overestimates the presence of old ice and therefore is biased towards old ice. Whether this already applies to the discrimination between FYI and SYI I don't know; this I leave to you to think about.

Line 157: "Images are" --> "All five SAR images are ..."

- Was any filtering (speckle?) applied?

Lines 159/160: "the geolocations and acquiring dates of the SAR images" --> "the location of the five SAR images". There is no acquisition date given in Figure 2. Hence the acquisition dates are missing and the time difference with respect to the sea ice type products the SAR images are compared to is unknown. This needs to be included in the revised version of the manuscript.

Lines 161-165:

- "For better interpretation of SAR images" --> This motivation needs to be explained better. It is not at all clear why, for the interpretation of the few SAR images used, these two additional data sets are required. What is the problem with the SAR images that such data are needed?

- Why do you use the CERSAT/Ifremer product, which appears to be quite heterogeneous in terms of the input data, when you could have used the NSIDC sea ice motion product, which comes at 25 km grid resolution on an EASE grid and with daily temporal resolution?

- What is the grid resolution of the ERA5 data and how did you co-locate these data with the sea ice type products and/or the SAR images?

Line 172: The naming of the regions is partly wrong and needs to be corrected. What you call ESS is actually the combined area of the East Siberian Sea and the Laptev Sea. What you call BS is not the Barents Sea but the combined area of the Beaufort Sea and the Chukchi Sea.

Line 180:

- Here, in this line you write "extent is calculated by general extent of pixels", in line 176 you write "MYI extent is estimated as the integral of all pixels specified ..." . Both formulations are not to the point and not specific enough. I recommend to re-phrase in both cases along the lines "We computed the MYI extent as the sum of the area of all grid cells classified as MYI."

- In this context I have two questions. 1) Did you use a common land mask? Or is this not required because the region of interest that is delineated by the red line in Fig. 2 is clear of any land influence? 2) Did you take into account that the grid cell area is only a constant in the EASE grid projection while it changes with latitude for the products in polar-stereographic projection? If you did not take this into account yet you must correct your computations.
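To illustrate the second question, here is a minimal sketch of an area-weighted MYI extent on a polar-stereographic grid. It is not the authors' computation. It assumes a spherical Earth, a true-scale latitude of 70°N, and a nominal 12.5 km grid spacing; the function names and the "MYI" label are hypothetical.

```python
import math

def cell_area_km2(lat_deg, nominal_km=12.5, true_scale_lat_deg=70.0):
    """True area of one polar-stereographic grid cell (spherical approximation).

    The map scale factor k equals 1 at the true-scale latitude and drops
    below 1 towards the pole, so cells near the pole cover more true area
    than their nominal size suggests.
    """
    k = (1.0 + math.sin(math.radians(true_scale_lat_deg))) / (
        1.0 + math.sin(math.radians(lat_deg))
    )
    return nominal_km ** 2 / k ** 2

def myi_extent_km2(ice_type, lat_deg, myi_label="MYI"):
    """Sum the true area of all grid cells classified as MYI."""
    return sum(
        cell_area_km2(lat)
        for label, lat in zip(ice_type, lat_deg)
        if label == myi_label
    )

# Three hypothetical grid cells: two MYI cells at 70N, one FYI cell at 80N.
extent = myi_extent_km2(["MYI", "FYI", "MYI"], [70.0, 80.0, 70.0])
```

On an equal-area (EASE) grid the per-cell area is constant, so the same sum reduces to the number of MYI cells times the cell area.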

Lines 183++: I have a conceptual difficulty with comparing daily sea ice type maps with weekly sea ice type maps derived from the NSIDC sea ice age. The comparison would be much more meaningful if you would average all daily products over every single week also used in the sea ice type product derived from the sea ice age. After all, this is your main data product for the inter-comparison.
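One possible way to bring the daily maps to the weekly resolution is sketched below. Because ice type is categorical, the sketch uses a per-cell majority vote over the days of a week rather than an arithmetic average; that choice, the function name, and the class labels are assumptions made for illustration.

```python
from collections import Counter

def weekly_composite(daily_maps):
    """Per-cell most frequent ice type over the daily maps of one week.

    daily_maps: list of equally long lists of class labels, one list per day.
    Cells without a valid label on a given day are marked None and ignored.
    """
    n_cells = len(daily_maps[0])
    composite = []
    for i in range(n_cells):
        votes = Counter(day[i] for day in daily_maps if day[i] is not None)
        composite.append(votes.most_common(1)[0][0] if votes else None)
    return composite

# Three hypothetical days, three grid cells each.
week = [
    ["MYI", "FYI", None],
    ["MYI", "MYI", None],
    ["FYI", "MYI", None],
]
result = weekly_composite(week)
```

The weekly composite can then be compared against the weekly map derived from the sea ice age product on a like-for-like basis.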

Line 184: "decreasing trend" --> What you possibly mean is a decrease or an increase in the MYI extent (over time) or a positive or negative trend of the MYI extent (over time). A decreasing (or increasing) trend, in contrast, is a trend that changes its value with time or, in other words, if this was a linear trend then the slope of the trend line would decrease (or increase) with time. Therefore, please correct your writing accordingly throughout the manuscript.

Line 186: "the divergent movements" --> Which movements? Movements of what? Are you referring to divergent sea ice motion?

Line 194: "to the NSIDC-SIA extents ... 2-3 years" --> You stated earlier that you compute the MYI extent from the NSIDC-SIA data set by summing over all grid cells exhibiting a sea ice age of 2 years or older. Hence you can simply write "to the MYI extent derived from the NSIDC SIA."

Lines 199/200: Given the fact that the entire Arctic Ocean (i.e. approximately the region of your study) has a size of about 7 x 10^6 km2, and the fact that rarely the entire Arctic Ocean is covered with MYI, this difference is far above being reasonable and requires more explanation. It is a 100% error.

Line 200: Are these extent estimates for class "ambiguous" reasonable? How do these values relate to the entire MYI extent? Please be more critical about and more specific within your interpretations.

Lines 204-210: "This is expected ... summer and winter." --> These lines call for the more careful delineation of the physics behind the various retrieval methods which I asked for earlier. Without that physical background these statements all remain hypothetical and are not sufficiently backed up by existing knowledge and hence not in line with good scientific practice.

Line 214: What do you mean by "most distinct variations"?

Subsection 3.2.1:

- One could have expected that you dedicate a bit more time to comment on the details such as the drop of the MYI extent to zero in some winters, when looking at the NSIDC SIA MYI extent of region ESS.

- What explains, in your opinion, the observation that especially for BS and ESS NSIDC SIA MYI extent is often considerably larger than the MYI extent offered by all other products? This is less pronounced for region CAO where we can also see numerous cases where the other MYI extent products exceed the NSIDC SIA MYI extent.

Lines 220/221:

- "The former ... Stream" --> This is not entirely correct and requires re-phrasing. You have chosen your region CAO such that the Transpolar Drift Stream goes right through it ... from the Pacific side towards the Atlantic side. Therefore, what explains the decrease in MYI extent is i) the export through Fram Strait and, by smaller fractions, into the Barents Sea and through Nares Strait, and is ii) the export driven by the Beaufort Gyre towards the South along the Canadian Arctic Archipelago.

- Note, in the sentence before "BS keeps constant or increasing" should also be re-phrased. You want to state something like the MYI extent in the BS/CS region remains constant or is increasing.

- Finally, what explains the decrease in MYI extent in region ESS? This is not clear yet. If it is exported towards the CAO, then this is going to be a northward flow ... a direction not yet mentioned in your description.

In short: Please be more accurate in the description of the results of your work.

Line 223: What are "varying evolution trends"? Either the trends vary between the different winters or years; then this could be termed inter-annual variation of the trends describing the evolution of the MYI extent in the respective region. Or you want to comment that, within a season, the evolution of the MYI extent from month to month differs between different winters or years; then you need to specify that you are referring to the intra-seasonal variation of the MYI extent and need to drop the word "trend". Please be clearer in your writing.

Lines 238/239: "the discontinuous ... C3S-SIT" --> This statement is too general because it reads as if daily MYI extent fluctuations are always explained by this discontinuous FYI delineation. You should not forget that this is a scene at the verge of freeze-up and therefore one cannot expect that all of the MYI already has a "mature" microwave signature that would let the algorithms classify it as such. In addition, I'd say this is an issue that is possibly limited to the late October / early November cases and is not of general validity. Please correct your writing accordingly.

Line 245: "with exceptional MYI distributed ... as ESS." --> I can agree on the 2nd largest MYI extent but there is only a quite small part where the finger-like structure of MYI extends through Chukchi Sea into the ESS region. The other finger-like structure at Severnaya Zemlya can be observed in basically all products and is hence not exceptional. Perhaps you want to state that this protrusion from the Chukchi Sea into ESS is not in agreement with the NSIDC SIA sea ice type?

Line 256: Looking back at this paragraph and the top two rows of Figure 7 you could also state that this is a good example where assigning the ambiguous ice type pixels to MYI actually improves the agreement in the spatial pattern with NSIDC SIA sea ice type.

Lines 268-271:

- This information should be placed in a section about methodologies of the inter-comparison, where you describe how you co-located data, and how you computed the MYI extent from the different data sets (and grids).

- There you also should reflect upon why and how you selected the boxes as you did and why these have a different size.

- It would be furthermore more than beneficial if you would elaborate on the way how you decided, based on the SAR images, which part of the ice is FYI and which MYI. Your sentence "Characteristics of brightness, texture, gemometric shape and context ..." is not sufficient for a journal such as "The Cryosphere"; it rather reads like written for a public science magazine, I am sorry. You have decibel values at hand and by digging into published literature you can get a much better, even quantitative handle on the interpretation of the SAR images.

- I note that you used HV-pol data from Sentinel-1 SAR. Why did you use cross-polarized images instead of co-polarized images? What is the advantage using those? Can I assume that the RADARSAT-1 images were HH-pol? You could note this additional information in the respective figures.

Lines 273-278:

- The description of what is seen in terms of ice types in the SAR image appears to be hypothetical and descriptive. There are tables and publications from which you can learn about the typical signatures (sigma nought) of MYI and FYI at C-Band HH-polarization. You should find and use these to put your assumptions on solid ground. Otherwise also these SAR images cannot serve as an evaluation or even validation data set but rather represent a vague inter-comparison source. And with that you can by no means adequately draw conclusions about the quality of the sea ice type products you are investigating here. You then also need to change the title of the manuscript, leaving out "evaluation". Also the usage of the NSIDC SIA MYI extent does not warrant so because it is known to be biased (this is visible in your manuscript as well) and is not a good source for evaluation in the way carried out by you.

- Please add month and year of the scene to the text.

Lines 279++: I am wondering whether it would make sense to not comment on / discuss every product here in the figures showing the comparison to the SAR images. Perhaps the most striking discrepancies would be enough to mention.

Lines 284/285: "which might be caused ... the product" --> It is not clear how these two issues can lead to an overestimation of the MYI extent derived from the sea ice age product when the delineation relevant to state whether a grid cell contains MYI or FYI is between FYI and SYI, and is therefore only influenced by the time from the last fall until the date this example is from ... hence basically 6 weeks in this case. Only a small amount of FYI is grown until then in that region and one can be confident that the majority of the grid cells is in fact predominantly MYI as seen in the sea ice age product.

Lines 289-293:

- Same comment as for Fig. 8 with respect to how to assign features and/or brightness distributions to ice types. This is a purely qualitative inter-comparison and not an evaluation.

- When I was working with SAR data during my PhD days I was always urged to denote the sensor flight and look direction by arrows. You could do so as well so that it is more clear where the low and where the high incidence angles are located.

Lines 297/298: "The MYI underestimation ... weekly temporal resolution." --> Why? What is the physical process required to have large discrepancies between a weekly ice type map and a daily ice type map? Is there evidence in the additional data used by you about this physical process?

Lines 301-305: Again the interpretation of the SAR image (and the boxes zoomed into) appears to be very hypothetical and is not well backed up by what could be taken from published literature (if the authors would have considered to use HH images instead of HV images). This would also have resulted in less processing artefacts in the image.

Lines 306-311: I would say this is a classical example where the NSIDC-SIA is one of the more useful data sets here. Looking carefully it is clear that regions C and D are both FYI. Please check the minimum extent at the end of summer 2014 to see whether sea ice survived the summer melt in that area close to Severnaya Zemlya. I doubt so. Hence these two areas are located within the landfast sea ice (FYI) cover that usually develops there - as is also well backed up by a sea ice drift speed of zero. While I could agree that region B is in fact MYI I doubt that region A is MYI. This is certainly an area where i) deformation and ii) deep snow play a significant role in shaping the different microwave signals contributing to the (every) sea ice type classification.

Line 312: Is there any reason why you put mid-winter after late-winter?

Line 322: "Compared to the SAR image ... is overestimated ..." --> This is only part of the story. I would see this more differently and urge the authors to have another look to see that the agreement between NSIDC SIA and the supposedly FYI - MYI distribution in the SAR image is only acceptable in the bottom part of the SAR image whereas towards the top and top left there is both an underrepresentation of MYI and an overrepresentation of MYI, respectively.

Line 323: Please look at my comment to a similar statement made by you further above.

Lines 336-339:

- I don't see how your results underline or agree with the results of Korosov et al. and I also don't see how your results confirm that the NSIDC SIA data set is a cross-validation data set. It is at most a data set for consistency checks and inter-comparison. I will detail why below.

At first: none of the sea ice type products investigated has a finer temporal resolution than the NSIDC SIA product (and hence the MYI extent derived from it). Hence you cannot look at the sub-grid scale distribution of sea ice types (and age) in the NSIDC SIA maps and the information you claim to have at hand originates from the publication mentioned above and is not your own result.

Secondly, even though you have SAR images at hand you did not make the effort to first perform a high-level evaluation of the NSIDC SIA product BEFORE you use it as a data set for inter-comparison. You could have carried out a dedicated pixel-wise comparison between the NSIDC SIA product and the SAR images used. But this would require i) more SAR images covering the same region over the weekly period represented by the NSIDC SIA product (i.e. ideally one at the beginning and one at the end of the 7-day period) for ... say ... 50 cases (which is a big project) and ii) using SAR images in a quantitative way, i.e. using the sigma nought values to delineate FYI from MYI, and in addition taking carefully the drift and deformation history of the respective regions into account to ensure that areas with a bright signature caused by deformation are not misinterpreted as MYI. Only with such a comparison, looking at the sub-grid scale distribution of the different ice types within single NSIDC SIA 12.5 km grid cells, you can shed more light on the "cross-validation" potential of these maps.

Line 367: "stability of the sea ice types" --> what do you mean with that? FYI will not disintegrate spontaneously and MYI will not become FYI over night. Please rewrite.

Lines 368-370: "This parameter ... or high frequency channels" --> I agree to this statement; however, I am wondering what the magnitude of cloud liquid water values typically observed during winter in the Arctic would be and what the impact would be specifically on the GR. I am pretty sure you can dig out this information in the available literature and back up your statement adequately. There are sea ice concentration algorithms that specifically make use of the two channels that form this GR, e.g. the Comiso algorithm frequency mode; perhaps the paper by Andersen et al. from 2006 in Remote Sensing of Environment could enlighten you here. In short, unless the impact of atmospheric parameters such as cloud liquid water and water vapor on the GR at these two frequencies is really measurable I would remove this piece of information. If kept it needs to be backed up by adequate literature.

Line 368: I am surprised that one of the sea ice type algorithms uses this GR ratio the other way round, i.e. 19V minus 37V. Please check. This (again) calls also for a better and more comprehensive description of the algorithm behind the products inter-compared in this study.
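For reference, the conventional normalized form of the gradient ratio can be sketched as below. The brightness temperatures are hypothetical values chosen only to show that swapping the two channels changes the sign of GR and nothing else; no specific product's definition is implied.

```python
def gradient_ratio(tb37v, tb19v):
    """GR(37V, 19V) = (Tb37V - Tb19V) / (Tb37V + Tb19V)."""
    return (tb37v - tb19v) / (tb37v + tb19v)

# Hypothetical brightness temperatures in kelvin, for illustration only.
tb37v, tb19v = 200.0, 225.0

gr = gradient_ratio(tb37v, tb19v)           # 37V minus 19V in the numerator
gr_reversed = gradient_ratio(tb19v, tb37v)  # channels swapped: 19V minus 37V
```

Any threshold tuned for one ordering therefore has to be applied with the opposite sign for the other, which is why the ordering used by each product should be stated explicitly.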

Line 371: "ice layering" is one component of the snow properties and should not be mentioned as if it is a different thing.

Lines 373-375: "when air temperatures fluctuates around freezing point and triggers snow metamorphism" --> Apart from the fact that this is another example of bad English grammar this statement needs to be formulated in a less global way.

A) What you call snow metamorphism with a likely impact on brightness temperatures particularly at 37 GHz are melt-refreeze cycles caused by elevated solar radiation during spring (April); during these cycles the air temperatures do not necessarily fluctuate around the freezing point.

B) In October solar radiation is absent, hence cannot be the trigger for snow metamorphism. Melt-refreeze cycles are also absent. What can happen in October is advection of warmer air masses and precipitation falling as wet snow or freezing rain - which admittedly can have an impact on the microwave signature of the sea ice cover. But without working with the theory (missing in your manuscript) you cannot explain it properly. Possibly wet snow masks MYI underneath, letting it look like FYI. But you don't present evidence for this in your data / results. While warm air and hence wet snow might be the reason for the underestimation of the MYI cover in the CAO using C3S-SIT it is not sufficiently clear why snow metamorphism should lead to an overestimation in the BS and ESS in late winter. What is the physical process that drives which change in the relevant microwave properties that cause the microwave observations to trick the algorithms, leading to an overestimation in MYI?

C) Another issue you did not yet bring up is the fact that parts of the ESS but also the parts of the CAO facing the Atlantic may experience particularly thick snow loads. Since the GR used here is not only sensitive to the sea ice type but also to the snow depth, it is not surprising that a sea ice type algorithm that uses the GR at 37 and 19 GHz tends to classify FYI as MYI as a result of a thick snow cover.

Lines 375/376: Please explain to the reader what the effect of the temperature correction scheme and the "upgraded tuning of atmospheric correction for Tb" [better --> the improved correction of the Tb for the atmospheric influence] is on the GR used so that the reader gets a credible piece of information here which you again ideally back up with appropriate literature.

Line 377:

- "backscatter (sigma^o)" --> either "backscatter coefficient" or "sigma nought"

- "which has good separability between MYI and FYI." A backscatter coefficient cannot have a good separability between MYI and FYI. A backscatter coefficient might be suitable to separate MYI from FYI.

Lines 378-381: "In comparison ... Fig. 12)" --> Also these lines should be re-written and re-phrased investing more space to describe the issues behind.

- In addition, you might want to provide an explanation why Ku-Band scatterometer measurements appear to be less sensitive to the surface roughness than C-Band scatterometer measurements. How about the sensitivity to the crystal structure of the MYI compared to the FYI? Is the contrast in the backscatter coefficient between MYI and FYI larger or smaller at Ku-Band compared to C-Band? Does this depend on the polarization? Does this depend on the incidence angle? What is the role of the different penetration depths into the snow and into the sea ice?

Lines 388/389: "In Beaufort and ... classification ..." --> please also see my comment for Line 373 further above.

Line 394:

- Either: "employ a dynamic threshold" or "employ dynamic thresholds"

- What do you mean by "variability of [the] training dataset"? Do you mean the spread of values around a chosen threshold brightness temperature or backscatter coefficient?

- What do you mean by "seasonality"? I recognize that sea ice type retrieval is limited to the freezing season, hence one season; you should be more specific here. It is also not clear to what the seasonality refers to ... to the MYI extent? to the physical properties of the sea ice and its snow cover? to the thresholds used?

- I don't understand what you mean by "shift in sensor type". Could you please elaborate on this in the text? I can guess that you perhaps mean the shift between using SSM/I or SSMIS data or between using ASCAT C-Band and QuikSCAT / OSCAT Ku-Band. But to me this is not a shift in sensor TYPE because it is either radiometers or scatterometers. Please be more specific here.

Lines 398/399: It is not clear what you mean by "takes sea ice variabilities into account". What "sea ice variabilities"? Are you referring to the spatiotemporal development of the physical properties of the sea ice and snow cover that influence its microwave backscattering characteristics and/or the microwave emission? Then please write it specifically. Currently, "sea ice variabilities" can mean anything from variations in sea ice thickness or concentration, different ice drift patterns, floe-size distributions, degree of deformation whatsoever ...

Lines 409/410: "can be partly ... more obscure"

- "obscure" --> "difficult" or "problematic"

- The statement as written is not conclusive because you are not providing the key message that the Arctic Ocean has lost a lot of its oldest ice AND that the difference in the radiometric and microwave backscattering properties is usually more pronounced between FYI and these older ice types than, e.g. second and third year sea ice. This feeds back again to the missing description of the physical background behind the sea ice type retrieval earlier in your manuscript.

Lines 418/419: In all of the regions mentioned here MYI can occur once in a while and hence a MYI signature in these regions certainly is not unphysical. Apart from that, the Chukchi Sea is part of your region BS. This needs to be corrected in the text.

Lines 420-422: "Statistical thresholds ... the ice edge" --> Please provide a plot which illustrates how the PDFs of the respective parameters used in the retrieval (i.e. backscatter coefficient or Tbs or GRs) for MYI overlap with the PDFs of ice types typically encountered along the ice edge so that the reader understands what you are referring to. Ideally, you have this figure along with the revised description of the sea ice type algorithms earlier in the manuscript so that here you simply need to refer to that figure.
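Alongside such a plot, the degree of overlap could be summarized in a single number. The sketch below computes a simple histogram overlap coefficient between two samples of a retrieval parameter; the function name, the bin count, and the sample values are hypothetical and serve only as an illustration of the idea.

```python
def histogram_overlap(a, b, bins=20):
    """Overlap coefficient of two samples based on normalized histograms.

    Returns 1.0 when the binned distributions are identical and 0.0 when
    they share no bin at all.
    """
    lo = min(min(a), min(b))
    hi = max(max(a), max(b))
    width = (hi - lo) / bins or 1.0  # guard against identical constant samples

    def normalized_hist(x):
        counts = [0] * bins
        for v in x:
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        return [c / len(x) for c in counts]

    ha, hb = normalized_hist(a), normalized_hist(b)
    return sum(min(p, q) for p, q in zip(ha, hb))

# Hypothetical backscatter coefficients in dB for two well-separated classes.
separated = histogram_overlap([-20.0, -19.0, -18.0], [-10.0, -9.0, -8.0], bins=12)
```

A value close to 1 for MYI versus ice-edge ice types would support the statement that a statistical threshold cannot separate the two.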

Lines 422/423: "exclude ... distributions." --> I don't get what you want to state here. If the MYI extent in the above-mentioned peripheral seas and/or along the ice edge would be added to the MYI extent in your region of interest this would mean a considerable change in the overall SIT distribution. Therefore, please re-phrase your statement as it is currently not clear enough.

Lines 424/425: "reassign ... intrusions." --> Not sure what you want to state here. Do you mean "assign grid cells erroneously classified as FYI as the result of warm-air intrusion induced changes in the surface snow properties to the ice type MYI." ?

- How is this temperature based correction done?

- Aren't there other algorithms (published by one of the authors) that use this temperature based correction as well?

Line 425:

- What does an "ice motion confining procedure" do? I have no clue. Please explain it to the reader.

- "anomalous MYI overestimation" --> What is this? What is a "normal MYI overestimation" and what is the difference to an "anomalous" overestimation? Please re-phrase.

Lines 428/429: It is not sufficiently well described how the correction based on a median filter (spatial or temporal) works.
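As an example of the level of detail that would help, a temporal median filter on a per-cell MYI flag could look like the sketch below. This is not the procedure used in any of the products; the window length, the centred window, and the tie handling are assumptions.

```python
from statistics import median

def temporal_median_filter(flags, window=5):
    """Running median over a daily 0/1 MYI flag series for one grid cell.

    The window is centred on each day and truncated at the series ends.
    A median of exactly 0.5 (even-length edge windows) is resolved to 1.
    """
    half = window // 2
    filtered = []
    for i in range(len(flags)):
        lo = max(0, i - half)
        hi = min(len(flags), i + half + 1)
        filtered.append(int(median(flags[lo:hi]) >= 0.5))
    return filtered

# Hypothetical series with a one-day drop-out of the MYI flag on day 3.
smoothed = temporal_median_filter([1, 1, 0, 1, 1, 1, 0, 0, 0, 0], window=3)
```

Stating whether the filter is spatial or temporal, and its window size, would already answer most of the question.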

Line 432: "the five series SIT products ... are defined." --> I don't understand this sentence. Please re-write. It is possibly a problem of the grammar.

Line 434: "Typically ..." --> I don't see that your manuscript warrants yet to state the reason given for the larger spread in MYI extent during early and late winter as being typical.

Lines 440/441:

- It is sufficient to write "grid resolution", spatial can be omitted.

- "foot print" --> "footprint"

- What is the "true" spatial resolution of the ASCAT data? What is the "true" spatial resolution of the QuikSCAT data? You should please not forget that the finer grid resolution provided by the SIRF products (4.45 km) is the result of heavy smoothing and other signal reconstruction steps.

- Another issue that you did not take into account here are the different incidence angles of - especially - the ASCAT C-Band data compared to the microwave radiometer data and QuikSCAT / OSCAT.

Line 452:

- Add "five" in front of "SAR images".

- Any reason why you are not mentioning the NSIDC SIA product here?

Line 453 / the conclusions in general:

- I am not sure I would select a sea ice type product based on the maximum difference that a product might have compared to another independent data source. I would be interested in whether there are regions and time periods where there are systematic errors (and how large these are on average so that I might be able to correct them). In addition, I would be interested in the average performance of the product over a longer time period, i.e. whether there are artificial trends.

- I suggest to re-write your conclusions accordingly, focussing less on the individual products as you do in the list 1) to 4) (which should in any case contain 5 or even 9 entries according to what you write in Line 451), and instead concentrating on the larger picture provided by your qualitative results. It might help in this context to again take a look at your time series plots and focus less on the inter-comparisons with the SAR images.

- I like the bullet point list further down on the next page. That one looks good but could be written even better by including specific details and referring to the existing literature.

Line 477: "extensive misclassification with higher uncertainties" --> So, the misclassification in itself is highly uncertain? Please re-write.

Line 480: "Ku-Band ..." This statement is not new and has to be backed-up by existing literature.

Lines 488-490: "On the other hand ... become obscure." needs to be re-written. The meaning is not clear and the grammar is not correct.

Line 492:

- Apart from the fact that we still don't know how "ice motion confining" works, it is not clear what "accumulative errors" are. Consider re-phrasing for improved understanding.

- "These post- ..." --> "Any post- ..."

Line 491: What is meant by "should be accounted with caution"? Please consider re-phrasing for improved understanding.

Lines 495/496:

- "This study ... of SIT retrieval approach" --> I don't agree. This study does not contain an "evaluation"; it is an inter-comparison study, mostly involving qualitative results. It provides hints of the quality of the sea ice type products investigated RELATIVE to the NSIDC SIA data set (which in itself is not well evaluated) and relative to only five SAR images which are not interpreted quantitatively.

- I further object to the notion "most popular". Please consider re-phrasing.

- I cannot see the "hints for further improvement". While you state where some of the sea ice type products have deficiencies, you neither come up with specific suggestions about how to improve (e.g. use a SIRF-like product as an input to the OSISAF sea-ice type product to improve the grid resolution) nor does the nature of your results being based on an inter-comparison to qualitative data support to draw conclusions into this direction. I warmly suggest to tone down the value and potential impact of your results.

Line 497: Please share with us which two frequencies WindRAD is going to use.

- "the potential of scatterometer on ice type discrimination" --> "the potential of scatterometer measurements for ice type discrimination"

Lines 499/500: "low frequency microwave measurements" --> "low frequency microwave radiometer measurements" because ASCAT already has been using C-Band for 15+ years which is also a "low frequency microwave measurement"

Figure 3: I suggest to reduce the number of colors used by getting rid of the NSIDC sea ice age and instead show the MYI extent derived from it as a black line - like you do in Fig. 4. If you want to show examples of how the different sea ice type products deal with different sea ice age then I suggest to show just the respective year - ideally a year where almost all sea ice type products provide MYI extent so that you can compare between the products. Alternatively, you could consider showing only MYI extent differences. If you want to keep the sea ice age information then I recommend to use shades of grey instead of colors for the sea ice age.

Figure 4: Why are IFREMER-A data missing for January and April in 2014 & 2015?

Figure 6 and 7:

- It is very counter-intuitive to show open water in brown, land in light grey and FYI in blue. Please use a more intuitive coloring such that, e.g. land is brown, open water is blue and FYI, Amb, and MYI are perhaps medium grey, light grey and white; the observation gap at the pole can then be colored black.

- I recommend to enlarge the figure as a whole.

- In the caption you could cross-ref to Table 1 or Figure 1 to make clear why there is a different number of maps for the two dates shown. In addition you need to refer one more time to the meaning of the red line and you need to comment on the different coloring of the observation hole.

Figure 8:

- What is the motivation to show boxes A to D with a different size?

- I have the same comment with respect to colors as for Figures 6 & 7.

- I suggest to show the NSIDC-SIA map in a different color code as well. What is important for you is to discriminate FYI from older ice which currently is difficult to delineate because the colors used for FYI and SYI are quite similar.

- I recommend to rename "sea surface wind" to "10m wind" because I guess this is what it is. Also make clear that "air temperature" possibly is the "2m air temperature". The additional information that these are daily averages would be appreciated as well.

- I would replace the legends for those sea ice type products that do not provide the ambiguous ice class with a legend which only shows the two ice classes present. It might make sense - in general - to then also include the class open water in the legend.

Figures 9 to 12:

- I have the same comments with respect to colors, legends, and ERA5 data naming as I had for Figure 8.

- In addition, delineation of the boxes in the Sentinel-1 image in a different color than black would help to locate these better.

- It might make sense to not use a continuous color table for the legend of the ice drift field, as the values are increments of 0.1 km/day.

Table 2: Cases where there is a "+" and a "-" indicate that both performances exist?

Table 3: I guess the GR listed in the context of OSISAF SIT is not correct?

Typos / editorial comments:

Line 59; "indirect validation" --> perhaps better "inter-comparisons"?

Line 89: "KNMI-" --> "KNMI-SIT"

Line 144: "study, sea ... were used ..." --> "study, we used a sea ice age (SIA) product and five SAR images ..."

Line 154: "with SAR ... of HH" --> possibly better: "providing C-Band (5.3 GHz) SAR images at HH polarization."

Lines 155/156: "providing cross- ... ranging from" --> possibly better: "providing C-Band (5.4 GHz) SAR images at co- and cross-polarization (HV and HH) with incidence angles between"

Line 170: "polar hole of 87degN" --> "data acquisition gap north of 87degN centered at the pole"

Line 176: "within studied area" --> I picked this as one of the examples that underline the need for considerable English editing of the manuscript. The authors must check for usage of "the" and "a" which is often missing.

Line 181: You stated already in Line 170 that you excluded that area centred at the pole. Therefore you can delete this sentence.

Line 185: "The MYI" --> "However, the MYI"

Line 185: "regional" --> "regionally"

Line 225: "is mainly resulted from" --> check grammar.

Lines 239-241: This part does not belong to the top row of Figure 6, right? It belongs to the data from 2007 and should be placed into the next paragraph.

Line 280: Typo: "boarder" --> "border"

Line 288: Add the year.

Line 289: "... in the western part were higher than in the eastern part."

Line 294: "Slightly underestimation of MYI" --> check grammar.

Line 296: I would say that "thin" could be misinterpreted as "thin MYI" in terms of its thickness. You might want to consider using "narrow" or "filament-like" or "finger-like" or similar.

Line 297: "can partly be resulted" --> check grammar.

Line 300: "A Sentinel-1 SAR image covering the southern part of the ESS near the coast acquired on April 27, 2015 is shown in Fig. 10."

Line 313:

- "transit zones" --> "transition zones" or "zones of mixed FYI - MYI coverage"

- "steady discrepancies" --> re-phrase please.

Line 315: Delete "validation and"

Lines 324-325: "The MYI feature ... round MYI floe" --> please check grammar.

Line 364: "serial" ???

Line 378: "when using backscatter" --> "when using backscatter coefficient measurements of an active microwave instrument."

Line 382: "confirmed" --> "shown"

Lines 397/398: "vary ... ASCAT" --> "are different, especially at C-Band."

Line 402: "speculate" --> "hypothesize" ?

Line 405: Either "to a sea ice type distribution" or "to sea ice type distributions"

Line 406: Either: "An adaptive clustering algorithm is used" or "Adaptive clustering is used"

Line 408: "thin ... seas" --> "narrow MYI tongues in the peripheral seas"

Line 427: "continuous underestimation" of what?

Line 430:

- "over-correction problem" --> "over-correction".

- "thin MYI ... seas" --> We had that expression earlier. Please look up my comment there.

Line 436: "fully evaluated" --> "done"

Line 481: What is "small FYI in MYI pack"? Do you mean: "comparably small areas of FYI within a region dominated by MYI?"

Line 487: "deep " --> "mid-"

References: You need to check your reference list. For a considerable number of the entries the records are not complete; for instance, the year is missing quite often. At least one of the references appears twice.

Citation: https://doi.org/10.5194/tc-2022-95-RC3
- AC3: 'Reply on RC3', Yufang Ye, 25 Aug 2022
  
  The comment was uploaded in the form of a supplement: https://tc.copernicus.org/preprints/tc-2022-95/tc-2022-95-AC3-supplement.pdf
  
  Citation: https://doi.org/10.5194/tc-2022-95-AC3

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

ED: Reconsider after major revisions (further review by editor and referees) (03 Sep 2022) by John Yackel

AR by Yufang Ye on behalf of the Authors (15 Nov 2022) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (24 Nov 2022) by John Yackel

RR by Anonymous Referee #2 (12 Dec 2022)

RR by Anonymous Referee #3 (16 Dec 2022)

Suggestions for revision or reasons for rejection

Review of

Inter-comparison and evaluation of Arctic sea ice type products

by

Ye, Yufang, et al.

I am not providing a summary of this manuscript because I reviewed a former version of it.

The authors have improved the manuscript considerably and have taken into account quite a number of the concerns that were brought up during the previous round of reviews.

The readability of the manuscript and the credibility of the results presented do, however, still suffer from deficits in some parts of the description and from overrating of the mostly qualitative elements of the intercomparison carried out.

General comments:
GC1: The lack of proper references to the figures within the text and the numerous typos and strange formulations make this manuscript difficult to review and to read. Next time you submit a manuscript, please consider having it proofread by a native English speaker and check it for consistency. It cannot be the task of an editor or a reviewer to flag all of these. It is an immense workload and distracts from the scientific content of the manuscript.

GC2: While the consideration of the sea ice physics and its relation to microwave remote sensing of sea ice has been improved considerably, there are still elements that should be improved - please see my specific comments.

GC3: Please step away from considering this as an evaluation / validation or assessment study. It is an inter-comparison study, involving qualitative data sets for inter-comparison and products whose quality you aim to understand and report upon in this manuscript. Rather than attempting to rank the products, I recommend stating clearly that we require more and better specified and evaluated data sets for a quantitative evaluation of the sea ice type products.

Specific comments:

L43: Please check the content of Boisvert et al., 2016, with respect to whether this is indeed the paper you wanted to cite in this context. Perhaps the two papers by Boisvert et al. from 2015 fit better: "Increasing evaporation amounts seen in the Arctic between 2003 and 2013 from AIRS data" or "The Arctic is becoming warmer and wetter as revealed by the Atmospheric Infrared Sounder"?

L53-59: I suggest to add that these possibilities to discriminate MYI from FYI by means of the different microwave signatures work well (only?) in winter when the snow cover is dry. In summer / during melt events, MYI and FYI often reveal a similar microwave signature.

L66: I suggest to remove Brath et al as this is about helicopter-borne scatterometer measurements and not about satellite data classification.
I also suggest to remove Hughes as this is grey literature (and relatively old as well).

L75: Isn't the ECICE algorithm cited here using Shokr et al., 2008 belonging to the other type of algorithms (SITC)?

L83-85: "some areas ... of MYI in ice charts" --> I would be careful with this statement because after all, what is done in the binary assignment or classification of a grid cell as FYI or MYI bears the same potential for over- or under-estimation of the actual fraction of the respective ice type. Hence, in light of this retrieval uncertainty it is perhaps ok to use data for the inter-comparison that have a similar drawback?

L85/86: The sentence mentioning SAR could i) include the detail that SAR - like scatterometers - is an active microwave instrument but with a spatial resolution several orders of magnitude finer and ii) back up the information that SAR images are used for this kind of evaluation with literature.

L91: "for winters" --> just a comment: If you include the information that the discrimination between MYI and FYI works well by means of their different microwave signatures during winter suggested further up, then this notion of "for winter" is a logical consequence of what is physically possible.

Table 1: Please check the dates for SSMIS (2000 as the starting year is wrong) and for AMSR-E (12 as the end month is wrong).

L106: SMMR did not use a conical scan. Please check the respective documentation. One paper where you find helpful information about the series of SMMR, SSM/I and SSMIS sensors is this one: https://essd.copernicus.org/articles/12/647/2020/

L113: In view of the frequencies listed in Table A1, don't you think it makes sense to state in one sentence that for the generation of the SITY products introduced in section 2.2 merely the near 19 and near 37 GHz channels of these instruments are used? This would explain at the same time why you only list these two frequencies in Table A1.

L119: Note that there have been two different ERS satellites ERS 1 and ERS 2, both equipped with that AMI instrument, but only ERS1 started operations in 1991. You might want to make this clear in your text.

L122: OSCAT: There have been more than one OSCAT sensor. Which one is this?

L128: While I agree that there is a linear relation between TB and emissivity, I doubt that this statement is correct in the physical / mathematical sense. I would rather put it this way: TB is linearly proportional to the physical temperature of the object, and the emissivity is the proportionality factor, whose magnitude is determined by the physical / chemical properties of the material that are relevant for its microwave emissive behavior.
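Written out, the relation I have in mind is the following (a sketch in the Rayleigh-Jeans approximation that neglects atmospheric contributions; the symbols are my own notation and not taken from the manuscript):

```latex
T_B(f, p) = \varepsilon(f, p)\, T_{\mathrm{phys}}
```

Here \varepsilon(f, p) is the emissivity at frequency f and polarization p, and T_phys is the physical temperature of the emitting layer.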

L130-138: This is a good starting point, containing already parts of the relevant information. I suggest to
i) separate FYI better from MYI in your explanation and correct the statements of how salinity changes [brine can only be "expelled" towards the surface of the ice, a process that if at all occurs during very cold conditions and for rather thin ice; most of the change in the brine content is via gravity drainage during winter and via flushing out the brine by meltwater during summer - which is the process where brine pockets become air pockets (or air bubbles as you write). These processes should be explained more correctly.]
ii) try to make clear what the role of snow might be.
iii) try to make clear which of the geophysical properties determine changes in the emissivity on the one hand and in the backscattering properties on the other hand - also taking into account frequency dependence and different polarizations.

The last sentence where you cite Vant et al. is not clear. What is meant by "demonstrated" in this context and what are "high frequencies" in this context?

A very good review of such properties is given in the book by Carsey, F.D., "Microwave Remote Sensing of Sea Ice" from 1992 which is available online.

L140: Certainly GR is used in SITY products (I am not sure about PR but fine) ... but why? What is the advantage of using a GR over a TB? Ideally this is going to result from the revised section about emissivities and TBs before line 139.

L143/144: You need to state that the subscript "p" in Eq. 2.2 stands for polarization and can either be H or V. You also need to state clearly which of the two frequencies f1 and f2 refers to the higher frequency - unless this is not (why not?) important.
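For reference, the commonly used form of the gradient ratio is given below. That Eq. 2.2 is meant to express this form is an assumption on my side; the notation is mine:

```latex
\mathrm{GR}_p(f_1, f_2) = \frac{T_{B,p}(f_1) - T_{B,p}(f_2)}{T_{B,p}(f_1) + T_{B,p}(f_2)}, \qquad p \in \{\mathrm{H}, \mathrm{V}\}
```

With f1 as the higher frequency (e.g. near 37 GHz against near 19 GHz), winter MYI typically yields a more negative GR than FYI, which is why the order of the two frequencies matters for the sign.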

L147-152: "Meanwhile ... SITY product" --> Also this I rate as a good start but the description still needs to be rephrased to be clear enough. I recommend to i) clearly state what the normal radar backscatter behavior is for FYI and MYI and why and only then ii) point out in which way deformation would change this general view. Please work with expressions such as surface scattering (dominant for FYI) and volume scattering (dominant for MYI).
When it comes to differences between frequencies, polarizations, etc. it makes sense to introduce (perhaps again) penetration depth of microwave radiation into sea ice of different types. I again warmly recommend to take a look at the various chapters of the Carsey book mentioned above.

L153: I might be wrong, but your description above is rather qualitative and I did not really get which specific "signatures" characterize MYI and which FYI. Does FYI exhibit a higher or lower emissivity than MYI? What about the PR? What about the GR? What about the radar backscatter coefficient? I would say that there is enough literature out there to - for this first ever inter-comparison of SITY products based on algorithms that use these specific signatures - provide a summary table of relevant signatures at C- and Ku-Band for scatterometry and at near 19- and 37 GHz for microwave radiometry.

L154: "top layer" --> Do you have an estimate of the vertical dimension you are talking about? Are these millimeters? Centimeters? Does this behaviour depend on the frequency?

L176-178: Could it be that the Maaß and Kaleschke 2010 reference solely applies to the land-spill over correction and the Wentz 1997 one solely applies to the RTM based correction of the atmospheric noise? Please check and revise. Also "land spill-over due to the influence of land" should perhaps simply read "land spill-over effects on the measured TBs"

L184: "Tb observations" <----> "using the classification parameter GR" --> It is not entirely clear whether and if so why the training dataset comprises TB values or GR values.
Comment: The above-asked-for table with typical values of the various input parameters taken from literature would assist very well here in understanding that this is obviously a rather simple approach involving the GR and a GR-threshold value including its typical variability for MYI and FYI.

L201: "reassign misclassified FYI" --> Sorry, not entirely clear. Is this filter taking care of pixels that are erroneously classified as MYI but are in fact FYI? Or is this filter taking care of pixels that are erroneously classified as FYI but are in fact MYI?

L204: Which polar stereographic projection?

L214: The reader might like a hint why OSISAF-SITY uses a slightly different GR than most other products based on the GR at 37 and 19 GHz channels.

L220: Given the fact that C3S-SITY used quite a number of filters and corrections it would be good to confirm here in the text that OSISAF-SITY only uses the geographical mask to correct for eventual misclassifications of pixels as MYI and does not perform any other filtering.

L227+: I note that here and further down you do not specify the form in which the scatterometer data are input into the pre-processing stage - unlike for the radiometer data where it is clear whether these are swath data or not and at which stage of the processing you compute a daily map. Please therefore consider to also here, for the scatterometer products, specify whether these products are based on swath data or daily gridded data or whatsoever.

L229: I suggest to back up this statement about the incidence angle dependency with a reference.

L233: "data of March of each year" --> Would you mind to also tell us about the geographic region from which these thresholds are determined?

L234: I recommend to not speak about "MYI signatures" but of "grid cells" or "pixels erroneously classified as MYI". In addition I am wondering whether these pixels are really removed or whether they are set to ice type SYI or FYI. If not, then I assume the resulting ice type maps may have gaps?

L247: Is it correct to assume that all radiometer data come as daily gridded maps? Please mention this accordingly in the text.

L254: How is the re-gridding to the finest spatial resolution among the input data realized? Bilinear? Nearest-Neighbour? Other methods?

L267: I don't find the description of the data set sufficient. In particular, it is not clear how the discrimination into FYI (SYI?) and MYI is made. This information should be given as a minimum - together with the granularity (in time). What temporal resolution are we talking about here? Weekly? Monthly? Annually? It should become clearer that this data set offers the spatial distribution of different ice age classes (younger than 1 year, 1-2 years, 2-3 years ...).

L271/272: Not clear what you mean with "middle-of-the-road scheme". Also, the fact that satellite data are combined with buoy data (and data from atmospheric re-analysis by the way) does not apply to the SIA data set but applies to the sea-ice motion product that is used to derive the SIA data set. I suggest to rewrite this statement.

L278: I suggest to delete this sentence because this is based on observations in the Weddell Sea, Antarctica. If you want to provide information about the uncertainty of the kind of ice motion product used here I recommend to search for publications by Sumata et al., and take a look at the two publications by Lavergne et al., one in 2010 and one in 2021.

L279-282: While the statements provided here are certainly correct they apply to an older version of the ice motion product (v3) which in fact contained artifacts. The data you are using are based on version 4 and therefore I consider the discussion given in these lines as not relevant. These are also not connected to the issues Korosov et al. found; these point to methodological shortcomings of the derivation of the SIA product from whatever ice motion product.
What is lacking for the SIA product is an evaluation beyond the published inter-comparison study results. The reason for this is clear - a lack of proper data sets that could be used as a source for evaluation - but it should nevertheless be made clear that we are still waiting for an adequate evaluation of the NSIDC-SIA data set. This is something you could (and should?) mention here because it explains well why you do not consider the SIA data set as an evaluation data set, i.e. kind of a ground truth.

Section 2.4: In which form did you get and use the SAR data?
Which (pre-)processing was already applied before / has been applied by you?
What is the spatial resolution in these SAR images?
From the description in L284 it seems clear that the RS-1 images are at HH-pol; it is however not clear what the polarization of the used S-1 SAR images is.
Unless you refer to a much more detailed description of how the SAR images were "visually interpreted" further down in your paper, I strongly recommend to also here not use the term "validation". So far it again simply seems to be an inter-comparison of the SITY products with another unvalidated product whose accuracy is not clearly specified. Hence, in L285 and in L290 "validation" needs to be replaced with "intercomparison".

L290: While one can guess it - would you mind telling the reader the dates for which you used these additional data? Are these the same days for which you have SAR images?

L307-309: "To account ... " --> While this is an ok solution it introduces uncertainties from the re-projection. An easier and more straightforward way would have been to use the grid cell sizes published by NSIDC for several derivatives of this polar stereographic grid.
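To illustrate what I mean, here is a minimal sketch of summing the MYI extent with per-cell areas on the native grid. The function and array names are hypothetical, and the area array stands in for whatever per-cell area field is distributed with the grid definition:

```python
import numpy as np

def myi_extent_km2(is_myi, cell_area_km2):
    """Sum the area of all grid cells flagged as MYI.

    is_myi        : 2-D array, nonzero/True where a cell is classified as MYI
    cell_area_km2 : 2-D array of the true area of each grid cell (km^2),
                    assumed here to come with the grid definition
    """
    is_myi = np.asarray(is_myi, dtype=bool)
    cell_area_km2 = np.asarray(cell_area_km2, dtype=float)
    if is_myi.shape != cell_area_km2.shape:
        raise ValueError("ice type map and area grid must have the same shape")
    # Area-weighted sum on the native grid, so no re-projection is needed.
    return float(cell_area_km2[is_myi].sum())
```

Such a sum stays on the original grid and avoids the additional uncertainty from re-projecting the ice type maps.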

L323-330:
- This reads like a sufficiently good qualitative recipe to interpret SAR images visually. The only part that I find is missing is the discrimination between sea ice and open water - the latter also having a mixture of brightness levels in SAR images depending on i) the size of the openings with open water and ii) the wind speed. You may want to add this here to complete this description. Please take a look at my comments directly to Figure 3.
- Also, please add one sentence informing the reader that you are performing a binary classification here (in the next paragraph you write about Kappa coefficient and accuracy, so I assume you classify the SAR images into FYI, MYI and, if need be, open water).

L333-335: While the advantage of using HV over HH polarization is clear from the old literature cited, I strongly recommend to also take a look into the more recent literature and cite at least one paper where this has been applied to either RADARSAT SAR or Sentinel-1 SAR imagery.

L337: Please add 2-3 sentences explaining what the Kappa coefficient is and how you compute an "accuracy" from two binary classified images. What is the credibility of these two parameters when computed for a pair of non-evaluated data sets?
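As an indication of the level of detail I am asking for, here is a minimal sketch of how overall accuracy and Cohen's Kappa are commonly computed from two co-gridded binary maps. The function name and the encoding (1 = MYI, 0 = FYI) are my assumptions, and I do not know whether this matches the implementation used in the manuscript:

```python
import numpy as np

def agreement_metrics(map_a, map_b, valid=None):
    """Overall accuracy and Cohen's kappa for two binary maps (1 = MYI, 0 = FYI).

    valid : optional boolean mask selecting the grid cells to compare
    """
    a = np.asarray(map_a).astype(bool).ravel()
    b = np.asarray(map_b).astype(bool).ravel()
    if valid is not None:
        keep = np.asarray(valid).astype(bool).ravel()
        a, b = a[keep], b[keep]
    if a.size == 0:
        raise ValueError("no valid grid cells to compare")
    # Observed agreement: fraction of cells with the same class in both maps.
    p_o = np.mean(a == b)
    # Chance agreement expected from the class frequencies of each map.
    p_e = a.mean() * b.mean() + (1 - a.mean()) * (1 - b.mean())
    # Kappa is undefined when chance agreement is already perfect.
    kappa = (p_o - p_e) / (1 - p_e) if p_e < 1 else float("nan")
    return float(p_o), float(kappa)
```

Note that both numbers only measure agreement between the two maps; neither tells which of the two is closer to the truth.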

L339-342: Given the nature of the two kinds of data products you use for the intercomparison of the SITY products I strongly recommend to change the wording away from validation, evaluation, and reference towards inter-comparison.

L347: As noted in the context of Figure 4: I would get rid of the dashed lines and the daily data in this figure. It adds noise instead of value.

L351: "due to ice divergence" --> I am wondering whether it would be worth to include a schematic illustration which shows what you mean by this.

L356: "of FYI similar as that of MYI" --> It is exactly the other way round: the MYI signature becomes similar to the FYI signature. Melting snow, e.g., results in an increase of the microwave emissivity and hence an increase in the TB. Melting snow also results in a decrease of the penetration depth of microwave radiation so that a scatterometer does not sense the ice type underneath the snow anymore but only the snow.

L382-387: This paragraph targets the daily MYI extent values. I suggest to either delete it ... or keep it with the notion that these daily extents are not shown (see my comments with respect to Figure 4). Alternatively, you could show an example figure where you compare day-to-day MYI extent variations independent of the NSIDC SIA MYI extent.

L388/389: This sentence is reporting the maximum deviation - which is fine. You might want to include a statement about the average difference between the products during those periods of winter where all run comparably stable in their performance, i.e. November through March.

L396 / L402 / L413: Number of the figure is missing. --> GC1

L414/415: "The former" refers to which region? As written it refers to CAO and ESL and then your statement is wrong because there is no southward flow of MYI from region ESL.

L416: I suggest to delete the "following the Transpolar Drift Stream" because it is confusingly used in the context of what is stated here. For sure the TDS has nothing to do with MYI export through Nares Strait, and whether MYI is exported into the Barents Sea depends in the first instance on whether sea ice survived summer melt just north of the Barents Sea and is then pushed into it by northerly winds.

L422/423: "it is less pronouced in the CAO region." --> What is less pronounced? The MYI extent? Or the difference between NSIDC SIA MYI extent and the MYI extent of the SITY products investigated? Please be more specific in your writing.

L423-425: I am not sure your argumentation holds the way written. I recommend that you are more specific about the process that actually leads to a potential (...) overestimation of MYI extent in the NSIDC SIA product. And I also recommend that you try to relate this potential overestimation also to the actual MYI fraction instead of generally writing about a "mixture of MYI and FYI". You might ask yourself the question whether this potential overestimation is the same at 80%, 50% or 20% MYI concentration.

Subsection 4.2.2: Because of my inability to link the text of this subsection adequately to the figures because numbers are missing for the latter, I do not comment on this subsection. I am sorry. --> GC1

L468+: Just again the comment that the way the SAR images are interpreted does - in my eyes - not warrant to use the term validation or evaluation. It is an intercomparison between two kinds of ice type maps. I therefore once again recommend to get rid of the terms evaluation and validation and switch to intercomparison.

L478: It is a particular day in November. Therefore please mention it to avoid the impression that we are looking at a monthly average. The year given in the text does not match with the year given in the figure.

L478-483: The way you interpreted this SAR image makes a lot of sense, but it still bears the chance that i) the area you classified as FYI in the eastern parts of the image is actually MYI or that ii) a substantially wider fringe of the area north of the open water is actually compressed FYI in the form of brash ice, possibly indistinguishable from MYI under the conditions shown. What I would like to state here is that the intercomparison between your SAR image analysis results and the SITY products is very difficult to put into a reliable or credible quantitative measure as you try to do with the Kappa and OA values. I recommend that you at least mention that, because of the lack of contrast in backscatter in the SAR image shown, the actual border between FYI and MYI might vary considerably. While currently KNMI-Q looks ideal in comparison to your SAR ice type map, another interpretation of the SAR map might have resulted in a FYI / MYI distribution that resembles the C3S or OSISAF products more closely. There is a lot of ambiguity - not just in the products but also in your SAR image interpretation, which I recommend not to hide.

L508: Kappa and OA values for NSIDC-SIA are similar to Case 1 where you did not comment about the mobility of the ice even though that case is at the ice edge AND you have substantially higher winds. Here, in case 2 the MYI tongue is embedded into a matrix of growing FYI and I doubt that within the time frame of one week there was too much movement.

L518-519: The failure of these products to detect MYI is really strange and difficult to understand. It involves both, a product only based on QuikSCAT data and products based purely on radiometer data. It might be a very stupid question from my side but did you double-check whether the re-projections that were involved in one or the other product to do the intercomparison did not jeopardize your results? You know, if it would only be the radiometer based products or only the QuikSCAT based products I would understand this failure ... but we are talking about old ice (according to the NSIDC SIA map) which has a clear signature in passive microwave imagery during winter.

L521-523: Not sure whether inside the pack ice in winter the statement about mobility as a means to explain discrepancies holds.

L531-533: This failure to classify a large part of the ice as ice in the middle of winter is a no-go for such a product. Strange.

L536/537: "westward shift" --> This might be in fact one of the cases where the NSIDC SIA product "classifies" a grid cell as MYI ... even though the MYI fraction in some of the grid cells is certainly barely above the 15% threshold. Just a comment, no action required.

L537/538: Why do you highlight the ice age here but not for case 3?

L544/545: I am not overly convinced that the sea ice with this bright signature is associated with land-fast sea ice. This kind of ice is usually level ice with little deformation. In SAR images it often is represented as rather dark and homogeneous patches along the coasts / around Islands; actually the HH-SAR image of Figure 13 shows land-fast sea ice in the immediate vicinity of Severnaya Zemlya islands.

L550/551: Not sure this statement that this "lasts for the whole winter" is appropriate given that the SAR image is from April 27 and therefore on the verge of spring.

L599: When does precipitation appear "suddenly"? What is "thick precipitation"? Please revise your wording; it is not clear what you want to state here.

L605/605: "As a results ... is used." --> How about the other way round? Isn't it as likely that MYI is misclassified as deformed FYI?

L610: Have you looked at MYI ice cores? Did you find air bubbles of 2 cm diameter? I think this is an order of magnitude too large. I suggest that you dive into papers / book chapters describing ground- or air-borne active microwave measurements over different ice types and attempts to understand the reasons for the observed differences. The book Microwave Remote Sensing of Sea Ice by F. D. Carsey might be a good source - as well as papers dating back to the 1980s.

L612: While in the context of brightness temperatures you referred to melt-refreeze cycles and the role of snow wetness, this is missing completely here in this discussion. What happens to the radar backscatter signal if the snow gets wet?

L617:
- Why this add-on "that is ice-free during summer"? What is the problem here?
- Also: "does not help" appears to be too global a statement. For the case you chose, the combination does not reveal advantages but there are enough other cases where it works very well. I recommend to tone this statement down by writing, e.g., "does not always help" or similar.

L619: "the worst SITY classification" --> I can buy this statement for Zhang-SITY, but while OSISAF-SITY is admittedly not correct, IFREMER-A is clearly worse. Consider rephrasing please.
Note that April 27 is already a time of the year where solar radiation induced snow metamorphism can play a large role in shaping both passive and active microwave signatures of sea ice - even though 2m-air temperatures are still below zero.

L645/646: "thus smaller microwave signature differences between MYI and FYI" --> I don't think that the Tschudi et al paper is an appropriate reference for this change in the physical properties of the multiyear ice when becoming younger. You might look into earlier work published by Comiso et al. or others in this regard.

L675/676: "surface processes such as wet snow attenuation and changes in brine salinity" --> You mix two kinds of processes here, it seems. One is the interaction between surface properties and microwave radiation (the attenuation part) and one is a physical process (salinity change). I recommend to first give examples of the changes in physical properties and then write about the impact these changes could have on the microwave signature.
Are you sure that the salinity of the brine changes? Isn't it rather the bulk sea ice salinity or the brine volume?

L682: "finer resolution does not ..." --> You might want to bear in mind that the 4.45 km grid resolution of the Zhang SITY product is based on using satellite data that were resolution-enhanced using the SIRF technology. Such a procedure does not improve the information content ... hence it does not matter how fine the grid resolution is, if the relevant information is blurry - not resolved properly at the original coarse spatial resolution - then it will remain blurry.

L683-684: Why mentioning the near-90 GHz channels here? Is it relevant? Are these used somewhere for ice type discrimination? If not then I suggest to delete this sentence.

L690: "quantitatively" --> for me the inter-comparison of visually interpreted SAR images with the SITY maps is not a quantitative evaluation, also the computation of a Kappa coefficient does not help in this regard. I therefore again recommend to tone down your statement towards inter-comparison.

L693-695: Please make clear whether the ranges of years denoted together with the two products selected refer to the period for which these products are available or to the time period over which you have averaged the differences.

L701: I find the numbered list of paragraphs that follows ok in terms of which products seem to perform well or not so well, and where. I don't think, however, that you should mention your attempts to explain the differences in the performance because i) your inter-comparison is based on a small set (5) of qualitatively interpreted SAR images (which would be classified differently in an independent follow-on study), resulting in qualitative statements and because ii) you did not investigate / show evidence of misclassification of ice types due to the three mentioned main influencing factors. Also here you remain rather descriptive and do not go into depth. Hence most of the "explanation" given here is rather of hypothetical nature, the proof of which requires further work.

L723+ --> A lot of what follows here is a repetition of what is written in the discussion. Please condense and only provide 2-3 key points which you can also very well back up with the results you obtained. Make sure to highlight the nature of the results (qualitative / intercomparison) and give the outlook towards what would be needed to carry out a quantitative intercomparison or even evaluation. This is in my eyes much more important than, as you do at the end, to highlight which future satellites might be available. First we need to find procedures and well-evaluated data sets for a quantitative evaluation of the ice type products. Without these any novel ice type products from new satellites will be as useful (useless?) as the existing ones.
One obvious step would be to improve sea-ice motion estimates to improve the sea-ice age product - ideally with a smaller temporal resolution. And then: evaluate it adequately.
Another step would be to use well-evaluated ice type information from SAR images that underwent an unsupervised classification. Such information would make that part of the inter-comparison work involving SAR much more credible and the results could potentially even be interpreted quantitatively.

Typos / Editorial comments:
Abstract:
L17 / L18: Specify the QSCAT and the ASCAT periods as many readers will not be aware of these.

L17 / L18: "agree best" and "perform the best" --> Consider rephrasing such that you provide actual numbers of over- or underestimation.

L20: "their performances" --> What is "their" referring to? Products? Sensors? Algorithms?

L38: I suggest to add the reference to Tschudi et al., 2020 to the one of Maslanik et al.

L42: I suggest to delete "such as the Atlantic Meridional Overturning" because this is a circulation in the ocean.

L51: "forecasting(Jung" --> "forecasting (Jung"

L64: "... other on input ..." --> better "... other in terms of input ..." or "... other regarding input ..." or "... other with respect to input ..."

L73: "It is found the ..." --> "The ..."

L77/78: "there is rarely ..." --> suggest to delete this part of the sentence since it is in contradiction to the previous one and begin this sentence with "The performances ..."

L90: "among the SITY" --> "among some existing SITY"
and further: "and give comprehensive evaluations on the ... " --> "and to assess the quality of the"

L113: "SSMR" --> "SMMR"

L119: "measures" --> "measured" because it is not operating anymore.

L123: "for inner ... from ..." --> "of 48.9 degree and 57.6 degree for the inner HH-polarized beam and the outer VV-polarized beam, respectively, from ..."

L124: "antennas, whose incidence angles varies between 25 degrees and 65 degrees" --> "antennas, each measuring backscatter over the incidence angle range of 25 degrees to 65 degrees."

For consistency with the previous subsection you could add the periods of operation for QuikSCAT and ASCAT.

L146: "microstructure" --> such as density, grain size and orientation ...? Consider adding.

L153: "it" --> "these ice types"

L170/171: "and the climate record covered the period 1979-2020." --> "and was updated until 2021, covering the period 1979-2020."

L172: "is not be" --> "is not"

L180/181: "The swath data ..." Since this seems to be the final step of the pre-processing I would formulate it that way. For example: "As the last step of the pre-processing the corrected TB swath data are gridded into daily 25 km EASE2 grid TB maps." --> using which kind of gridding?

L194: "Water" --> "water"

L209: "is switched to AMSR-2" --> "was switched to AMSR2"

L210/211: Since the pre-processing of the C3S-SITY product is also done on the swath data I suggest to reformulate accordingly: "Unlike C3S-SITY, the core Bayesian ..."

L224: "KNMI-A respectively, available during" --> "KNMI-A, respectively, available for"

L268: "and active" --> delete, this is wrong. Such data are not used in that product.

L270: I would not call the work cited here as "assessments" and suggest to rephrase along the lines: "has been shown to provide very useful additional information about the changing Arctic sea ice cover because ..."

L274: "and ice motion data" --> "and the quality of the ice motion data."

L284: "Two" --> "two"

L288: "acquiring" --> "acquisition"

L297-299: "... Basin and limited ... . Note ... and analysis." --> I suggest to shorten this by writing: "... Basin excluding the area north of 87 degrees North with its observation data gap due to the satellites' inclinations (see Belmonte Rivas et al., 2018 and Fig. 2)."

L299: "as the integral extent of pixels" --> "as the sum of the area of all grid cells"

L303/304: "by the integral extent of pixels" --> "as the sum of the area of all grid cells"

L309-311: "Besides ... " --> "In order to compare the MYI extents at the same temporal resolution, SITY product MYI extents are averaged weekly to match the temporal resolution of the NSIDC SIA MYI extent."
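
To make the two wording suggestions above concrete (extent as the sum of the area of all grid cells, and weekly averaging of daily extents), here is a minimal sketch. The function names, the label coding (0 = open water, 1 = FYI, 2 = MYI), the constant 625 km² cell area and the toy arrays are assumptions for illustration only, not the authors' processing chain.

```python
import numpy as np

def myi_extent_km2(ice_type, cell_area_km2, myi_label=2):
    """MYI extent as the sum of the area of all grid cells labelled MYI."""
    return float(np.sum(cell_area_km2[ice_type == myi_label]))

def weekly_mean(daily_values, days_per_week=7):
    """Average consecutive 7-day blocks; a trailing partial week is dropped."""
    daily = np.asarray(daily_values, dtype=float)
    n_weeks = daily.size // days_per_week
    return daily[: n_weeks * days_per_week].reshape(n_weeks, days_per_week).mean(axis=1)

# Toy 2 x 3 grid of 25 km x 25 km cells (625 km^2 each); labels are assumed
ice_type = np.array([[1, 2, 2],
                     [0, 2, 1]])
cell_area = np.full(ice_type.shape, 625.0)
extent = myi_extent_km2(ice_type, cell_area)  # three MYI cells

# Fourteen daily values reduced to two weekly means
weekly = weekly_mean(np.arange(14))
```

Here `extent` is 1875.0 km² and `weekly` holds the two means 3.0 and 10.0. How the 7-day blocks are aligned with the weekly time stamps of the sea ice age product is a separate choice that this sketch does not address.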

L331: "to the UTM projection" --> "to the respective UTM projection" as this varies with longitude and latitude.

L364: "with bias" --> "with a bias"

L366 and L367: "bias" --> "biases"

L373: "scatteromter" --> "scatterometer"

L375: "product are" --> "products are"

L393: "difference" --> either "the .... difference" or "differences". Also note the wrong superscript "e" of the "10" in this line.

L398/399 and L400: "deviation of" --> Either it is a deviation of a quantity from another quantity ... which is not the case here, or it is the "deviation between" different products ... which is the case here. Please rephrase accordingly, because what I guess you want to say is that, e.g., in mid-winter the difference of the MYI extent derived from the different SITY products is small - or in other words: The deviation of the MYI extent between the different SITY products is small.

L404: "For the" --> "Regarding the" or "With respect to the"

L404: What do you refer to by "the latter"? Is it "the other products"? Or are you referring to OSISAF-SITY?

L405: "mild ... trend" --> weather can be mild ... how about "weak ... trend" or "small ... trend"; furthermore: "rapid ... trend" --> rapid has something to do with speed and time, hasn't it? How about "large ... trend"?

L414: "while that ... increasing". --> either: "while the trend in the BCS regions is either zero or positive" or: "while the MYI extent in the BCS region remains constant or is increasing."

L418: "torwards south" --> either: "towards the South" or "south"

L419/420: "In the BCS region, ... out of this region ... from the CAO region." --> consider rewriting this sentence. Certainly MYI drifts in the BCS region following the Beaufort Gyre. It enters the BCS from the North along the CAA and it eventually exits the BCS westward into ESL or back northward into CAO at the western borders of the BCS region - eventually entering the TDS. It also simply melts there (in summer).

L421: The statement of an increasing MYI extent in the BCS region should be supported by a notion of seasonality. Most likely MYI extent in this region is at a minimum in September and increases towards winter by MYI drifting into it from the North.

L493: Again, please enclose the full date.

L498: "As shown in ..." --> I guess the fact that Table 3 shows up here was not planned?

L505: "of SAR image" --> "of the SAR image"

L506: "to the case in" ... something missing here?

L525: "as" --> "is"

L549: "nearly" --> "near"

L552: "northeast of the image" --> Did you mean "in the northeastern part of the image"?

L559/560: What do you mean by "distinct"? Do you perhaps mean "different"?

L561: "as a cross-validation dataset." --> "as an inter-comparison data set."

L564: "o of" --> "of"

L594: "...capability to separate and physical ..." --> Please check this sentence; its meaning is not clear.

L598: Over sea ice I would speak of melt-refreeze cycles only and hence remove the "wet-dry cycles".

L602: "in C3S-2" can be deleted here.

L604: "disparate" --> I know, this comment comes somewhat late in my review but I recommend to check the meaning and usage of "disparate" versus the meaning of "different". I doubt that the microwave and/or scattering properties of MYI and FYI can be termed "disparate". They are different but they share similar (but different) basic scattering mechanisms. Unless you can be sure that the differences are so substantial that they exclude each other, i.e. absolutely no volume scatter in case of FYI or absolutely no surface scatter for MYI, or the like, I recommend to always rather speak of "differences" - here and everywhere (...) else in the paper.

L623: Either "on an a priori training dataset" or "on a priori training datasets"

L625: "dataset" --> "datasets"

L639: "variabilities in the" --> I guess "variabilities as in the" is better.

L642: "fails to identify narrow MYI tongue in peripheral seas" --> "fails to identify features such as a narrow MYI tongue often observed in the Arctic peripheral seas ..."

L662: "aims to reassign the erroneously classified FYI" --> Not clear, better: "aims to re-assign the ice type MYI to grid cells where MYI was erroneously classified as FYI"

L673: "the five series SITY products" --> ?? What series?

L686: "... especially the fraction of MYI. The change of the SITY ..."

L687: "... inter-comparisons and analyses of SITY products ..."

L689: Please state that the NSIDC-SIA product is a weekly one and that you averaged (?) all SITY products to the same temporal resolution before the comparison.

L719 "disparate" = completely different? Really? See my previous comment about usage of "disparate".

Table 1:

"SSMI/I" --> "SSM/I"; "AMSR-2" --> "AMSR2"; "ASMR-E" --> "AMSR-E"

In the text you describe OSCAT but I cannot see it used in any of the products listed here. Consider removing it in the text?

You state in the text that all SITY products provide daily estimates. You can therefore delete the column denoted "Frequency".

I suggest to change "grid size" into "grid resolution". Is the type of all grids the same (all EASE or all polar stereographic)? If not, it might make sense to add a column where this is specified.

Table 2 footnote: "to verify ... open water" --> better "to assess the correct discrimination of sea ice from open water."
You might want to explain also what "ice motion confining" means.

Fig. 3:

- Please add a date for all scenes.

- Do all scenes have the same spatial scale? If not please provide a scale along with every scene, if yes provide it once.

- I suggest to include in (b) that the bright features may also be due to openings in the ice cover under high wind speed conditions.

- I suggest to add in (f) that these are MYI floes in a matrix of younger, presumably FYI.

- Whether (c) indeed shows brash ice between ice floes depends a lot on the location and the season (which are unknown?).

Fig. 4:

- I suggest to delete the dashed line with the daily MYI extent. It does not add value to the figure now that you have the weekly average - rather it adds noise.

- "the shaded area represents" --> "the shaded area in the same color as the respective solid line represents"


ED: Publish subject to minor revisions (review by editor) (19 Dec 2022) by John Yackel

AR by Yufang Ye on behalf of the Authors (05 Jan 2023)

ED: Publish as is (05 Jan 2023) by John Yackel

AR by Yufang Ye on behalf of the Authors (06 Jan 2023)

Short summary

Arctic sea ice type (SITY) variation is a sensitive indicator of climate change. This study gives a systematic inter-comparison and evaluation of eight SITY products. The main results are that differences among the SITY products are significant, with differences in average Arctic multiyear ice extent of up to 1.8×10⁶ km²; that Ku-band scatterometer SITY products generally perform better; and that factors such as satellite inputs, classification methods, training datasets and post-processing strongly affect product performance.