tc-2021-396

The authors present a novel method for emulating ocean-induced sub ice shelf meltrates based on image segmentation and an autoencoder network, which takes the nearby ocean state and ice shelf cavity information as inputs. The methodology shows promising results at emulating the meltrate produced by a full ocean model (NEMO) and appears to provide superior predictions when compared to the "medium-range complexity" PICO and PLUME parameterizations. A separate, idealized analysis indicates that this conclusion is appropriate, since the ML-based methodology reproduces what we would expect from theory despite being quite different (e.g. in terms of geometry, thermohaline forcing) from the training and validation data. Additionally, in the final discussion, the authors consider potential avenues by which this approach can be used as a baseline to develop more advanced sub ice shelf meltrate parameterizations.

The paper provides a valuable contribution to the field of ice-ocean modelling, as it establishes that a neural network architecture such as MELTNET can be used to provide computationally efficient, yet accurate, representations of basal melting to an ice sheet model, a critical external forcing mechanism. Moreover, it will be interesting and valuable to see how this framework can be used to aid our development of meltrate parameterizations in future work. The manuscript is methodologically sound, and it is a pleasure to read. I would like to compliment the authors on making figures that convey the main ideas of the methods (which involve many details) concisely and effectively. I have just a few relatively minor comments and suggestions that I think would help improve the paper.

Specific Comments
Making predictions with an ML surrogate is much cheaper than running an ocean model, but the training is not free and can be quite computationally demanding. The authors allude to this in Line 64: "Since the computational cost of a machine learning algorithm is insignificant once it has been trained," However, I think this caveat should also be mentioned in the abstract and earlier in the introduction, e.g. around Line 39, since model training can be a major computational expense in ML for high-dimensional problems such as those in the geosciences. Moreover, I think the paper would be strengthened by providing some estimate of the computational costs of training and making predictions with MELTNET. This could be as simple as a table with training and validation walltime for each network, along with the computing hardware used (e.g. was this on a laptop or run in the cloud? how many nodes/cores/threads were used?). Providing these details would help quantify the statement that predictions are almost free, and would help establish to the community that, generally speaking, ML-based emulators are worth pursuing.
Lines 101-105: I have a philosophical disagreement with using the temperature and salinity conditions at the icefront instead of using the open boundary conditions to NEMO. MELTNET is an emulator for NEMO (as far as I understand), and therefore it should not use anything that NEMO produces as an input. Rather, it should be given the same boundary conditions and then bypass NEMO altogether. It is fortunate and also useful to note that using either of these conditions provides essentially equivalent results, since this provides an exciting opportunity in the case where ample icefront T/S data are available and could be used as inputs to MELTNET. However, in this paper MELTNET is being presented as a NEMO emulator, so I recommend keeping this note but presenting results based on the same forcing for NEMO and for MELTNET.
Lines 262-270 and 309-314: The comparison to PICO and PLUME in this paper is entirely appropriate. However, in some sense the comparison is unfair, since these models are calibrated by tuning two global parameters while MELTNET has many degrees of freedom which are optimized during training. One could argue that to make the comparison as fair as possible, PICO and PLUME should have spatially varying parameters, which should be calibrated. PICO and PLUME are not used in this way, so I don't think this should be implemented, but it raises a couple of points that I think are worth discussing.
Some details on MELTNET should be included, such as: the degrees of freedom (i.e. number of nodes) in each layer, the number of layers in each stage of the model, and the cost function that is optimized during training (is it simply the norm of the model/data misfit? is there regularization used to penalize large weights?).

The PICO and PLUME parameters are not really optimized but "hand tuned", so I would suggest changing that wording, especially since MELTNET *is* optimized (trained). Additionally, I think the difference in degrees of freedom is worth mentioning. In a sense, one could argue that the neural network is a way of capturing the additional degrees of freedom that we would want to have in the PICO or PLUME models (for instance, spatially varying parameter fields) but that we don't know how to specify. All in all, I think some discussion of, or hypotheses for, why MELTNET outperforms these models would improve the paper.
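To make the cost-function request concrete, the kind of statement I have in mind is whether the training objective is a plain misfit norm or includes a weight penalty. Purely as an illustration (the symbols below are my own sketch, not taken from the manuscript):

```latex
\mathcal{L}(\mathbf{w}) \;=\;
\frac{1}{N}\sum_{i=1}^{N}
\left( \dot{m}_i^{\mathrm{NEMO}} - \dot{m}_i^{\mathrm{MELTNET}}(\mathbf{w}) \right)^{2}
\;+\; \lambda \,\lVert \mathbf{w} \rVert_2^{2}
```

Here $\mathbf{w}$ denotes the network weights and $\lambda$ a (hypothetical) regularization strength; simply stating whether a term like the second one is present, and which norm is used for the misfit, would answer my question.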

Minor/Technical Comments and Suggestions
Lines 35 and 37: Please fix citations: "e.g." comes after the citation but should come before.

Line 59: "lower complexity parameterizations": I suggest making the minor clarification that these are ice sheet model parameterizations.

Fig 1: I suggest adding a note in the figure caption mentioning that the GAN step is merely a method to generate many realistic T/S profiles for training, but is not necessary for making predictions once MELTNET is trained, with a reference to Section 2.3.2 (and possibly Appendix A). The training and prediction stages are clearly delineated in the figure, and your figure caption is well written, but I think adding a note like this will help a reader who is skimming through the figures as quickly as possible (which, of course, will be many people ...).
Line 120: I recommend being a bit more specific than "these filters are learned", for instance something like "the weights that make up these filters are learned".

Line 148: I recommend referencing Fig B2 before making the parenthetical note comparing swish and ReLU, since I went to Fig B2 looking for a comparison between the two, rather than a description of the normalisation and layers.

Lines 159-161: For the ocean modellers, could you provide some citations for the specific subgrid-scale parameterizations used? E.g. it sounds like the vertical mixing scheme is from Gaspar et al. (1990, https://doi.org/10.1029/JC095iC09p16179). What is the scheme for generating lateral viscosity coefficients (Smagorinsky, Leith, etc.)? What horizontal and vertical diffusivities are used? I think these details would be nice without being overwhelming.
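For reference, the comparison I went looking for in Fig B2 was between the activation functions themselves, which (to my understanding, using the common convention with the sigmoid slope set to one) are:

```latex
\mathrm{swish}(x) \;=\; x\,\sigma(x) \;=\; \frac{x}{1 + e^{-x}},
\qquad
\mathrm{ReLU}(x) \;=\; \max(0,\, x)
```

A small panel plotting these two curves side by side, or a one-line note on why swish was preferred, would resolve the ambiguity.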
Line 169: what is the vertical spacing for each of the 45 vertical levels?
Line 208: Wouldn't the constraint on ice shelf area be a maximum, rather than a minimum?

Fig 3: Have these ice shelf images been rotated to all be in the same orientation, since you mention that you provide ice shelves in all cardinal directions to MELTNET? If so, you may want to mention in the caption that some of these are rotated (e.g. "north isn't always up").

Line 291: I recommend putting this part of the paper (discussing the idealized geometry experiments) in its own subsection or even section. This would help the reader, since you are testing a new hypothesis.