Comment on tc-2021-21

The authors investigate the development of land-fast sea ice around Svalbard using a geographical random forest model. The model is built based on ice charts and freezing and thawing degree days. Results are then used to discuss fast ice extent in the period 1973-2019. The approach is interesting but the manuscript has major shortcomings both in the explanation of the method and the overall structure of the manuscript. I therefore cannot recommend publication in its current form.

Major concerns:
A proper results section is missing. Instead, the authors go straight from the description of the random forest model to a discussion, interspersing bits of results with interpretation. It is impossible for the reader to get an impression of the model performance, as the main results (the fast ice extent in Isfjorden and around Svalbard over time) are not shown anywhere. The conclusion section contains more results and little in terms of actual conclusions. I strongly recommend restructuring the manuscript. First, a basic results section is needed that shows the model results, both the spatial distribution of fast ice in the two model domains and its temporal development, and that provides a clear overview and summary of model errors. Second, a comparison of the model results with already published records of fast ice extent would strengthen trust in model performance before one tries to extend the time series of fast ice extent backwards in time. Finally, a proper discussion of the strengths and limitations of the model would be useful.
As far as I can see, the only variables included to predict fast ice formation are (positive and negative) freezing degree days. To derive freezing degree days, the authors use temperature records from Hopen, Isfjorden and Kongsfjorden, i.e. in the central Barents Sea and on the west coast of Svalbard. As the spatial distribution of the model performance is not shown, I cannot help but find this highly problematic: conditions differ vastly between the western side and the northern and eastern sides of Svalbard due to the different hydrographic regimes (periodic Atlantic Water inflow preventing ice formation in the west vs predominantly Arctic Waters in the north and east). In line with the comment above, I suggest improving the presentation of model results and errors, and including a discussion of the validity of the chosen predictor variables in the different regions of the model.
Line 46: "There has been less fast ice off the northern coasts": is this really what the authors mean? Or rather that there have been fewer studies on fast ice along the north coast? Fast ice is rather extensive in the north, and more so than in e.g. Kongsfjorden or Isfjorden, due to a more Arctic climate.

Reference: Skogseth et al. (2020), Variability and decadal trends in the Isfjorden (Svalbard) ocean climate and circulation: an indicator for climate change in the European Arctic, Progr Oceanogr, 187, doi:10.1016/j.pocean.2020.102394
Section 2: Please clarify: Is the record from Barentsburg used for Isfjorden? And the one from Ny-Ålesund for Kongsfjorden?
Section 3: I don't understand why the authors estimate missing TAVG from interpolated maximum and minimum temperatures. Why not interpolate TAVG directly?
Are FDD and TDD estimated for Isfjorden only or for the other locations as well? And how is this then combined in the model for the entire Svalbard archipelago?
Please provide a proper reference for the choice of T_f = 0 °C. While this is true for freshwater, it is not accurate for fjord waters; given the wide range of possible salinities in fjords, however, it probably wouldn't make much difference.
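To illustrate how the choice of T_f enters the predictors, here is a minimal sketch. It assumes that FDD and TDD are simple sums of daily departures of TAVG below and above T_f; the manuscript's exact definition may differ. The function name, the temperature values and the seawater-like threshold of -1.8 °C are illustrative assumptions, not taken from the manuscript.

```python
def degree_days(tavg, t_f=0.0):
    """Return (FDD, TDD) for a sequence of daily mean temperatures in deg C.

    Assumed definition (not confirmed by the manuscript):
    FDD sums the departures below t_f, TDD sums the departures above t_f.
    """
    fdd = sum(max(t_f - t, 0.0) for t in tavg)
    tdd = sum(max(t - t_f, 0.0) for t in tavg)
    return fdd, tdd


# Hypothetical daily mean temperatures (deg C), for illustration only.
daily = [-5.0, -2.5, 1.0, 3.0, -1.8]

# Freshwater freezing point, as chosen in the manuscript.
fdd_fresh, tdd_fresh = degree_days(daily, t_f=0.0)

# Approximate freezing point of saline water, as an assumed alternative.
fdd_sea, tdd_sea = degree_days(daily, t_f=-1.8)
```

Running both variants over the actual station records would show directly how sensitive FDD and TDD are to the threshold, and would support (or refute) the expectation that the choice makes little difference.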
There is a lot of unnecessary detail in the last part of Section 3 which could be shortened.
Line 120: In Figure 5, it looks like ICESD is used as a feature, not ICESN. Which one is correct?
Line 127-128: Here, it is stated that one of the model's shortcomings is that extrapolation is not possible. However, isn't that exactly what the authors are attempting? Please clarify, also regarding which range of values is meant (ice cover vs no ice cover, the range in FDD/TDD, or the time period?).
Line 134-135: Please explain what these results mean and how they are estimated.
Line 137: How is the limit for "satisfactory" set? And what about the Svalbard model? Since the aim is to model fast ice, is there a check whether a hexagon with ice cover is connected to land through other hexagons with ice cover?
Line 160-161: There needs to be a discussion whether this holds true for the entire period, given the known influence of Atlantic Water inflow periods on ice cover in western Svalbard fjords (e.g. Pavlov et al., 2013, Tverberg et al., 2019, Nilsen et al., 2008, Skogseth et al., 2020).
Line 203: I guess it's debatable whether some of the features along the east coast could be considered fjords… However, there is considerable fast ice in Storfjorden.
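Regarding the connectivity question raised under Line 137, the kind of check I have in mind could look like the following minimal sketch: a breadth-first search that starts from ice-covered coastal hexagons and spreads through neighbouring ice-covered hexagons. The hexagon identifiers, the neighbour table and the function name are hypothetical; I do not know how the authors store their grid.

```python
from collections import deque


def landfast_cells(ice, neighbours, coastal):
    """Return the ice-covered cells connected to land through ice-covered cells.

    ice        -- set of cell ids classified as ice covered
    neighbours -- dict mapping each cell id to its adjacent cell ids
    coastal    -- set of cell ids that touch land
    """
    start = [cell for cell in coastal if cell in ice]
    seen = set(start)
    queue = deque(start)
    while queue:
        cell = queue.popleft()
        for nb in neighbours.get(cell, ()):
            if nb in ice and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen


# Hypothetical chain of five hexagons A-B-C-D-E; only A touches land.
neighbours = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
ice = {"A", "B", "D", "E"}  # C is open water
coastal = {"A"}

fast = landfast_cells(ice, neighbours, coastal)
detached = ice - fast  # ice-covered cells with no ice path to land
```

In this toy case D and E are ice covered but separated from land by open water in C, so they should arguably not count as fast ice. Reporting how many hexagons fall into this category would help the reader judge the model output.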
Section 5 & 6: I'm confused by the different periods considered: How are they chosen? And why does the period 2014-2019 suddenly appear in the Conclusion? Please provide a justification for the different periods and then use them consistently throughout results, discussion and conclusion.