Let me start by saying that the work, i.e. the generation of a stable tangent linear and adjoint sea ice model with EVP along with the twin experiments, represents a valuable contribution to the field and is definitely worth publishing. I still have some severe issues with the presentation of the material that need to be addressed.
In general the manuscript has improved greatly over the first version, e.g. most grammar and spelling errors have been removed and the text is much easier to read. Thank you! There are still small problems, some of which I found and marked in the text. I suggest another thorough internal review to solve these small problems.
The authors responded and reacted to most of my comments adequately. Thank you! There are some points where I disagree with the answers. Some of these answers can be interpreted as misunderstandings.
Major points of critique:
The motivation of the experiment choices is not clear (and when I said “somewhat random collection” I didn’t mean the principal twin-experiment setups as the reply of the authors implies, but the choice of parameters, configurations etc.):
The most important term in the sea ice momentum equations is the stress divergence term, which includes the rheology formulation. Finding appropriate parameters $P^*$ and $e$ is a fundamental problem in sea ice modelling and has been the subject of many previous papers, and future papers will follow. (Many of these papers are now cited; others that could have been used for further motivation are missing. Btw, I am sorry about the van Leuven comment, I didn’t realise that this was never published.) It is easy to motivate including these parameters in the control vector and the authors do so, but only ***after*** extensively motivating the inclusion of land-fast ice parameters. $P^*$ and $e$ are important and relevant for anyone working with sea ice models and should come first: in the motivation, in the presentation of the experiments, in the (basically absent) discussion, and in the conclusions.
The discussion about land-fast ice parameterisations is relatively new. I am aware of some work in the early 2000s, then Rozman/Itkin around 2014/2015, the cited papers by König-Beatty and Lemieux, and the uncited work of Einar Olason, who was the first to do a realistic simulation of land-fast ice following König-Beatty’s idea of adding tensile stress (the $k_T$ parameter) to a Mohr-Coulombic yield curve; btw, he did not need any tensile stress for the elliptical yield curve to get fast ice in the Kara Sea. To my knowledge, CICE and the sea ice model of the MITgcm are the only sea ice models with a larger user community that use these parameters. In this sense, it is totally unclear to me why this parameterisation receives so much attention in this paper, even before the more central rheology parameters $P^*$ and $e$. Further, while the tensile strength parameter $k_T$ is part of the “rheology parameters”, the other parameters $k_1$ and $k_2$ of Lemieux’s scheme are definitely not “rheology”, but parameters of an “ad-hoc” (and very effective) parameterisation of grounding. For comparison, I haven’t heard anyone calling the wind stress or ocean stress terms, which have the same form as the basal/bottom stress term for grounding, part of the rheology.
The generation of the TLA and regularisation scheme is now described in the appendix, but the description is in an incomprehensible and unacceptable form. Instead of throwing most/all of the available information at the reader, it would have been useful to sketch the derivation and the form of the TLA equations in a compact form. Then it would have also been possible to show and discuss the additional Newtonian damping term, which is now hidden somewhere in eq. A18-20 (I guess) with no discussion of the coefficient values etc. I would have tried to do this in a compact form, maybe more explicitly for a 1D example. In fact, if this were my paper, I would try to find a compact form with symbolic equations including the regularisation for the text in section 2 (the reason for this being the regularisation term, not the TLA derivation, which one could in principle find elsewhere), and then maybe have everything spelled out in detail in the appendix. There’s still text in section 2 that’s unclear (e.g. the transpose of the sparse model matrix of the TL operator implies that this matrix is explicitly formed, which I don’t believe is what the authors meant) and that is not supported by the information in the appendix. The description of the adjoint of the Lax-Wendroff scheme is confusing, because it suddenly involves non-linear terms that are not introduced in A9 and A10.
Again, I don’t think that there’s anything wrong with the presented material, but the presentation does not help much to understand the derivation of the model.
The model in this work is an EVP model that does not have much to do with CICE, except for the EVP scheme and the B-grid: (1) The default CICE strength parameterisation is different from what is discussed here (and yes, CICE does have the option of using the P* parameterization, but I have not seen many papers explicitly doing so except for the ECCC-group; also, the “C_f” parameter in the Rothrock 1975 formulation has a similar scaling function as P*, but that needs to be argued for in the text, if you want to relate to CICE), (2) there is not nearly as much complexity here as there is in CICE in all aspects, (3) the advection scheme in CICE is totally different from the Lax-Wendroff scheme used here. Relating to CICE so often makes very little sense and should be dropped in most places.
In summary, I think that the content of the manuscript is a very valuable contribution. The presentation still requires work, mainly (1) giving appropriate weight to the different parts of the manuscript, and (2) a proper description of TLA generation and regularisation in a comprehensible compact form that has enough details so that it can be reproduced, but not so much detail that it clutters the presentation.
Minor points and suggestions follow. Note that these are based on notes that I took during reading the manuscript and some of the points raised above are repeated here. I will also attach the manuscript pdf with (unrevised) comments for better context (to be viewed with Acroread or Skim).
page 1
ll7: “Taking into account that sea ice observations are available daily, the experiments are configured for a 3-day data assimilation window in a rectangular basin”
Unclear why this is connected. I would think a 1-day window would be the natural choice if there’s new data every day. Also, CryoSat-2 data is not available in gridded form every day. Needs a better explanation.
ll21: (e.g. Heimbach et al, 2010; Zhang and Rothrock, 2003; Vancoppenolle et al. 2009; Massonnet et al. 2015)
Not clear if you refer to models or to models with DA capacities. Heimbach et al. describe an adjoint sea ice model simulation (but not DA), Zhang and Rothrock and Vancoppenolle et al. describe forward models, and Massonnet et al. describe solutions of models with some sort of DA or state estimation.
page 2
l39: P* and \alpha are not part of a typical (default) CICE simulation. See major comment.
page 2
l45: (RPs) unnecessary abbreviation? I guess the authors have a different opinion.
page 3
ll82: “Note, that optimization of the RPs through the 4Dvar DA approach allows us to efficiently use all available sea ice observations including sea ice velocity, that are rarely used for assimilation in sea ice DA systems. The latter is due to weak sensitivity of the sea ice state with respect of the ice velocity (e.g. Kauker, et al., 2009). Roughly speaking, the 4dVar DA approach allows us to use sea ice velocity observations for adjustment of the RPs and/or atmospheric forcing in an appropriate manner resulting in a better sea ice forecast (Stroh et al, 2019).”
Not clear why the “weak sensitivity” can be brushed aside for this approach, which is essentially the same as that of Kauker et al 2009.
l89: I find OSSE a bit of an overstatement for the type of idealized experiments that are presented here. I would use this term OSSE only for realistic applications when the design of the observing systems is the subject, for example, resolution or accuracy requirements for a new satellite system to be designed, etc.
This work is not about observations, but about TLA model development and testing.
page 4
ll96: Currently, satellite sea ice observations are typically available daily with a reasonably dense spatial resolution. Analysis of the SAR images (e.g. Panteleev et al., 2019) indicates that in the marginal sea ice zone, the pancake/cake ice with floe sizes of ∼1-20 m may be easily replaced by floes exceeding 1 km in size in one week. As a consequence, we configured the OSSEs with a 3-day DA window assuming that such approach should have more impact on short term sea ice forecast.
Unclear reasoning.
page 5
I agree that Eq(6) is correct, but it’s unfortunate, because it hides the form of \Delta^2 as the sum of ice divergence + ice shear/e^2
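For reference, the form I have in mind is the standard one for an elliptical yield curve, written here with strain rate components $\dot\epsilon_{ij}$ (this notation is my assumption and may differ from the manuscript’s):

```latex
\Delta^2 = \left(\dot\epsilon_{11} + \dot\epsilon_{22}\right)^2
         + \frac{1}{e^2}\left[\left(\dot\epsilon_{11} - \dot\epsilon_{22}\right)^2
         + 4\,\dot\epsilon_{12}^2\right]
```

The first term is the squared divergence and the bracket is the squared shear, so that $e$ only scales the shear contribution.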
ll145: the explicit advective time step
-> an explicit time stepping scheme
why not say “a Lax-Wendroff time stepping scheme” here, as you don’t say more in the appendix, either. Linear or non-linear?
l150: “The adjoint code was obtained by transposition of the sparse matrix in the code simulating the action of the TL operator on a perturbed state vector”
In spite of the large amount of information in the appendix, this is still not clear. Is this matrix ever formed explicitly so that you can transpose it? Or is this done only symbolically by re-ordering the operations as in “automatic differentiation”?
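To illustrate the distinction I am asking about, here is a minimal sketch of the second option for a made-up 1D stencil operator (the operator, the coefficients `c` and all function names are hypothetical and have nothing to do with the authors’ code). The adjoint is obtained by transposing each assignment and reversing the loop, and no matrix is ever stored:

```python
# Hypothetical 1D example: a tangent linear (TL) operator applied as a stencil,
# and its adjoint obtained by re-ordering the operations, with no matrix stored.

def tl_apply(dx, c):
    """TL operator: y[i] = dx[i] + c[i] * (dx[i+1] - dx[i]); last point is copied."""
    n = len(dx)
    y = [0.0] * n
    for i in range(n - 1):
        y[i] = dx[i] + c[i] * (dx[i + 1] - dx[i])
    y[n - 1] = dx[n - 1]
    return y

def adj_apply(ay, c):
    """Adjoint: every TL assignment is transposed and the loop order is reversed."""
    n = len(ay)
    adx = [0.0] * n
    adx[n - 1] += ay[n - 1]
    for i in reversed(range(n - 1)):
        adx[i] += (1.0 - c[i]) * ay[i]
        adx[i + 1] += c[i] * ay[i]
    return adx

def dot(a, b):
    """Euclidean scalar product of two lists."""
    return sum(p * q for p, q in zip(a, b))

c = [0.1, -0.3, 0.25, 0.4, 0.0]   # made-up stencil coefficients (last one unused)
dx = [1.0, -2.0, 0.5, 3.0, -1.5]  # arbitrary perturbation
w = [0.7, 0.2, -1.1, 0.9, 2.0]    # arbitrary adjoint input

lhs = dot(tl_apply(dx, c), w)     # <M dx, w>
rhs = dot(dx, adj_apply(w, c))    # <dx, M^T w>
```

The two scalar products agree to rounding error, which is the usual dot-product test for an adjoint. A statement of this kind in the text would already make clear what “transposition” means here.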
page 6
Tab1: turning angle 0.4343 rad = 24.88 deg
why such a number?
page 7
ll165: “Martin Losch observed similar instability of the TL EVP solver in the MIT sea ice model (personal communication).”
Not sure if that’s an appropriate reference here? It’s also not necessary. If you really need this, the proper way would be something like this:
A similar instability of the TL EVP solver has been observed in the MITgcm sea ice model (M. Losch, personal communication).
l170: (e.g., Yaremchuk et al, (2009))
referencing scheme: too many parentheses? (e.g., Yaremchuk et al, 2009)
l180: more simple
-> simpler? or just simple?
l180: time scale T_d:
is this the same as in eq(2)? That doesn’t make sense
l182: a Newtonian
page 8
l200: MIT
the model code is called “MITgcm”, not MIT, please correct everywhere
l216: missing parentheses around https://icesat-2.gsfc.nasa.gov/ ?
page 9
l219: A similar error level was …
l225: “with spatial decorrelation scale of 150 km and temporal decorrelation scale of 7 days” two missing articles?
l228: where do these numbers come from? ( 600×8990 ≈ 5.4 · 10^6 )
Section 2.4 OSSEs
I maintain that the systematics between the first group and the rest remain unclear. These are two very different experiments and it’s not clear why the LF scheme of Lemieux et al 2016 receives similar attention as the optimisation of the more universal parameters P* and e.
page 10
ll247: The maximum number of control variables associated with the initial conditions (the number of ice model grid points occupied by the SIT, SIC and SIV fields) was about 9000. The RP control fields were defined on coarser (δxp=15 (or 7)δx) grids with bilinear interpolation on the model grid of the respective OSSEs. Thus, the maximum dimension of the RP control vector never exceeded 36 elements.
This remains very unclear.
261: of kT was set to 0.6
I maintain that this is an unusually large value, except for explicit land fast ice simulations. You wouldn’t use that universally in Pan-Arctic simulations (see Olason 2016, A dynamical model of Kara Sea land-fast ice, JGR, doi:10.1002/2016JC011638, where k_T=0 with the elliptical yield curve in spite of the focus on land fast ice). Tremblay and Hakakian (2006) report upper and lower bounds for compressive and tensile stress; I am not sure if this can be used to infer kT=0.6. In König-Beatty and Holland, the noisy EVP solver prohibited land fast ice for smaller values of kT; Lemieux et al (2016) discuss the value of kT and obtain better agreement with observations with small values of kT = O(0.1).
l270: “due to the absence of tensile strength in ice (kT =0)”
and due to the non-converged EVP solution, see König-Beatty and Holland
page 11
Caption of Fig2 does not refer to (e) and (f) explicitly (but to a-d)
page 13
l284: 3m -> 3 m? In fact, the use of a space before a unit is inconsistent throughout the text.
l295: and SIV
that’s not different than the “true solution”, where “initial velocities [...] were set to zero” (l288)
l304 the true solution
page 15
paragraph starting at l319:
Rather than using the numerical difficulties associated with the Heaviside function to waive the optimization of k1, one could use this analysis to argue for a smooth parameterisation. That would also be more physical, because it is very unlikely that in a grid cell of finite (usually large) extent of order (km) all ice grounds at the same time and instantaneously (just as in a cloud scheme not the entire grid cell of an AGCM will suddenly go through a phase transition). Instead, a smooth transition is more appropriate as we expect the physics of sea ice to be smooth.
I can see that this is beyond the scope of the paper, but I also think that the scope of the paper is fuzzy with these two types of experiments. It’s not clear why the optimization of the well established and very important parameters P* and “e” is juxtaposed with two parameters of a relatively new parameterisation that is hardly used.
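To make the suggestion of a smooth transition concrete, here is a minimal sketch of one possible choice, a softplus ramp in place of max(0, x). It is my own illustration, not Lemieux’s scheme or the authors’ code; the function names and the smoothing scale `width` are hypothetical:

```python
import math

def hard_ramp(x):
    """max(0, x): its derivative is a Heaviside step, discontinuous at x = 0."""
    return max(0.0, x)

def soft_ramp(x, width):
    """Softplus ramp width * log(1 + exp(x / width)); tends to max(0, x) as width -> 0."""
    z = x / width
    # Numerically stable evaluation of log(1 + exp(z)).
    return width * (max(z, 0.0) + math.log1p(math.exp(-abs(z))))

def soft_ramp_derivative(x, width):
    """Derivative of the softplus ramp: a logistic function, smooth everywhere."""
    z = x / width
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

width = 0.05  # hypothetical smoothing scale, e.g. in metres of ice thickness
```

Away from the threshold the smooth ramp is indistinguishable from the hard one, but its derivative passes continuously through 0.5 at x = 0, so a gradient with respect to a parameter that shifts the threshold is well defined.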
page 15
Section 4 “eccentricity” is not what “e” is. It has been defined correctly before.
l335: had the form of a Gaussian-shaped cyclone
ll339: the trace of the stress tensor Ptr = −trσ/2
this is commonly called $\sigma_I$, the first invariant of the stress tensor, or divergence of stress. Not sure why this needs a new name here.
l348: as most of the currently available observations.
ll348: In the experiments we did not introduce any bias to ice observations since it is a common assumption in the existing DA systems.
rephrase so that it says what you mean (bias free observations are a common assumption in DA systems)
page 18
l364: definition of $S_u$: why did you use this norm and not the more common RMSE? Or is this implied? Unclear
page 18
l368: std(hkopt − htrue)
In what sense is this form different from the definition of S_u? If it is different, why would you use different forms?
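The distinction matters whenever the error has a non-zero mean. Assuming that $S_u$ is an RMS-type norm (I cannot tell from the text), a standard deviation removes the mean error while an RMSE does not. A small sketch with made-up error values:

```python
import math

def mean(err):
    """Mean error (bias)."""
    return sum(err) / len(err)

def rmse(err):
    """Root-mean-square error: includes the contribution of the mean error."""
    return math.sqrt(sum(e * e for e in err) / len(err))

def std(err):
    """Population standard deviation: the mean error is removed first."""
    m = mean(err)
    return math.sqrt(sum((e - m) ** 2 for e in err) / len(err))

# Made-up thickness errors (optimised minus true), all of one sign.
err = [0.30, 0.10, 0.25, 0.15, 0.20]

# Identity linking the two measures: rmse^2 = std^2 + mean^2.
gap = rmse(err) ** 2 - (std(err) ** 2 + mean(err) ** 2)
```

For a biased error field the standard deviation is smaller than the RMSE, so the two measures should not be mixed without comment.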
ll369: I find it much more plausible, that the impact on velocities is largest by optimising rheology parameters that directly affect the momentum equations. The effect on the thickness field will take longer assimilation windows because the changed dynamics needs time to advect thickness (and concentration). [And later you say so, why not here?]
l380: “that spatial locations of extrema agrees well with true distribution”
fix grammar
page 19
ll387: “that inaccurate position of the atmospheric cyclone”
there’s an article missing
l389: “( 0.1 m/s)”
the extra space after “(“ happens very often, but not always. I am not sure if this is intentional or not, but I guess, the space should be removed here and everywhere else.
l396: the wind appears to dominate the optimisation. Without proper wind, the optimisation appears meaningless, pointing to severe problems in the sea ice model parameterisations (why should they depend on the wind?). I think that’s worth pointing out, maybe in the discussion?
page 20
l400: to a less degree than e.
fix grammar
ll403: “To mimic these conditions, we conducted another OSSE with spatially and temporally invariant sea ice concentration A=1. Numerically this was achieved by removing the advection equation from eq. (4), and removing initial A0 from the control vector C0.”
That changes the model and makes it difficult to understand the generality of the results. Why not construct an experiment where the ice strength is strong enough (relative to the wind) so that the system does not move? That would be more “realistic”.
l413 “about 0.5 N/m2”
that’s a lot!
l415: “SIT increases over almost everywhere”
this raises the question of volume conservation. Does the model conserve volume, and is this important for the optimisation?
l418: maximuma
-> maxima (a spell checker would have found this)
page 21
l431: “because sub-optimal Ptr distribution fails”
missing article
l431: maximums
maxima (used previously)
l435 “std[P 2 (opt) − P (true)]”
Ptr squared?
l437: in Figure 9e,f
ll439: “The effect could probably be attributed to the region with zero convergence along the western boundary where the rheology does not play a significant role.”
rephrase to be more specific.
page 23
“Section 5 Conclusions and discussion” promises the unusual order of conclusions at the beginning followed by a discussion (why not do it conventionally with discussion first, followed by conclusions?), but then starts with a long summary of the results where the actual conclusions are hidden in the details, and at the end there are two paragraphs of discussion.
I suggest rewriting this section entirely for better clarity.
l445 with respect to
l445: “all rheological parameters”
k2 (and k1) is not a rheological parameter and “k_T” is not a parameter that is commonly used (i.e. different from zero).
l448: “Lemieux et al., (2015, 2016) and Konig Beatty and Holland, (2010)”
(and elsewhere) remove “,” before “(“
l455: “Analysis of the TL approximation accuracy has shown that Newtonian stabilization has errors similar to the ones observed in the case of diffusion-based stabilization, and thus the Newtonian scheme can be successfully used in sea ice models based on the EVP solvers.”
This analysis/discussion would be very important to understand in detail, because it is potentially something that readers may want to apply themselves. However, the “analysis” is extremely short, leaving out important aspects, like a stability analysis based on the value of the damping coefficients, etc. It would be interesting to understand, why this term successfully stabilizes the system, etc.
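As an indication of the kind of argument I am missing, consider a scalar toy problem (my own simplification, not the manuscript’s equations): a perturbation $\delta\sigma$ that grows at rate $\lambda > 0$ in the undamped TL model, with an added Newtonian term $-\varepsilon\,\delta\sigma$:

```latex
\frac{d\,\delta\sigma}{dt} = (\lambda - \varepsilon)\,\delta\sigma
\quad\Longrightarrow\quad
\delta\sigma(t) = \delta\sigma(0)\,e^{(\lambda - \varepsilon)\,t}
```

In this toy problem the perturbation stays bounded only for $\varepsilon \ge \lambda$, while a larger $\varepsilon$ moves the damped TL solution further away from the linearisation of the forward model. A discussion of this trade-off for the actual coefficient values would help readers who want to apply the scheme.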
l471: Since you have cited Losch et al (2014) already, they could be cited here again, because there AD tools were already used to compute the matrix times vector operation necessary for the matrix free JFNK with FGMRES in a sea ice model, however without significant improvement over simpler finite difference schemes.
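For readers unfamiliar with the matrix-free setting, the matrix times vector operation in question can be sketched as follows. The two-variable residual is a hypothetical toy function, not a sea ice model, and the hand-written product stands in for what an AD tool would generate:

```python
def residual(u):
    """Hypothetical nonlinear residual F(u) for a two-variable toy problem."""
    x, y = u
    return [x * x + y - 1.0, x - y * y * y]

def jacobian_times(u, v):
    """Exact product J(u) v for the toy residual, written out by hand."""
    x, y = u
    return [2.0 * x * v[0] + v[1], v[0] - 3.0 * y * y * v[1]]

def fd_jacobian_times(F, u, v, eps=1e-7):
    """Matrix-free approximation J(u) v ~ (F(u + eps * v) - F(u)) / eps."""
    Fu = F(u)
    Fp = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(a - b) / eps for a, b in zip(Fp, Fu)]

u = [0.8, -0.5]   # arbitrary linearisation point
v = [1.0, 2.0]    # arbitrary Krylov vector

exact = jacobian_times(u, v)
approx = fd_jacobian_times(residual, u, v)
```

Both variants return the same product up to the truncation error of the finite difference, and neither forms the Jacobian.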
page 24
l480: “kT and k2, responsible for grounding and arching phenomena”
unless intended as a “chiasmus”, I would turn the order around so that the first symbol corresponds to the first explanation (k2: grounding, kT: arching)
ll491: “In the second group of OSSEs, we analyzed the possibility of reconstructing spatially varying sea ice strength P* and ellipse axes ratio e distributions.”
I would argue that this is the more important part of the paper and should come first.
l500: collocated in the regions of strong …
collocated with regions of strong …
I am also not sure if “collocate” is the correct word here. I would use “co-locate”
l502: provides a slightly more accurate reconstruction
l504: “Accurate forecasting of Ptr is very important because it better informs avoidance of regions with excessive compressive stress.”
not the right context.
page 25
ll523: This is a totally generic and even inaccurate statement (in spite of its generality) that should be removed or adequately modified. On the one hand, it is not clear to me how the MOSAiC observations, which are very local, should help to constrain a local model, where open boundaries and boundary conditions as control parameters should be an important aspect that is not even mentioned in this work. On the other hand, the MOSAiC observations are regionally too confined to serve as a serious data source for a large scale model. I do agree that MOSAiC will be substantial to future sea ice parameter studies, but mostly for slow thermodynamic processes and for fast stress parameters, because local drift and deformation as well as stress measurements are obtained. How these data will enter the presented system is not even addressed in this work.
page 26
l541: the slightly different form of \epsilon vs. \varepsilon is unfortunate, because both are “epsilon” and it’s easy to confuse this with strain rate tensor.
Are Eq A7+8 solved implicitly? u^s and v^s appear in both equations on the left hand side
l550: the Lax-Wendorff scheme
the scheme is called “Lax-Wendroff”!!
l559: MIT -> MITgcm (Heimbach et al 2010)
l560: TAMS -> TAMC
page 27
eq A18-24: I don’t understand, why the time stepping algorithm has been used here to illustrate how the TLA works. It makes the entire presentation far more complicated and difficult to understand.
The form of eq. A18-A24 is unacceptable. I don’t think that any of the reviewers meant this when asking for a better description of the TLA.
eq. A18-20 shouldn’t the damping term be highlighted somehow? I am not even sure if -eps delta sig1 is the term.
l584: “variational assimilation experiments we used the strong constraint state-space formulation of the problem, minimizing the cost function”
doesn’t the adjoint method start from the cost function defining the scalar product with respect to which the entire equations are derived? The presentation appears backwards through the eyes of the practitioner, but for understanding the principles of what has been done it’s not very useful.
page 28
l601: “with the additional range constraints for the selected control fields (Section 2.3)”
If you are using bounds, a better choice would have been an L-BFGS algorithm (as in m1qn3) with bounds, e.g. L-BFGS-B
Byrd, R. H.; Lu, P.; Nocedal, J.; Zhu, C. (1995). "A Limited Memory Algorithm for Bound Constrained Optimization". SIAM J. Sci. Comput. 16 (5): 1190–1208. doi:10.1137/0916069.
Zhu, C.; Byrd, Richard H.; Lu, Peihuang; Nocedal, Jorge (1997). "L-BFGS-B: Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization". ACM Transactions on Mathematical Software. 23 (4): 550–560. doi:10.1145/279232.279236
http://users.eecs.northwestern.edu/~nocedal/lbfgsb.html
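As an illustration of how little is needed to use such a bound-constrained algorithm, here is a minimal sketch with the L-BFGS-B implementation in SciPy. The quadratic cost, its gradient and the bounds are made up; in a 4D-Var setting the gradient would be supplied by the adjoint model:

```python
import numpy as np
from scipy.optimize import minimize

def cost(p):
    """Hypothetical quadratic cost with its unconstrained minimum at (3.0, -1.0)."""
    return (p[0] - 3.0) ** 2 + 10.0 * (p[1] + 1.0) ** 2

def grad(p):
    """Gradient of the cost; the adjoint model would provide this in practice."""
    return np.array([2.0 * (p[0] - 3.0), 20.0 * (p[1] + 1.0)])

# Made-up range constraints; the unconstrained minimum lies outside this box.
bounds = [(0.0, 2.0), (-0.5, 0.5)]

result = minimize(cost, x0=np.array([1.0, 0.0]), jac=grad,
                  method="L-BFGS-B", bounds=bounds)
p_opt = result.x
```

Because the unconstrained minimum is outside the box, the minimiser ends on the bounds, which are handled inside the algorithm rather than by penalty terms or clipping.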
l612: paralell -> parallel
Eq A27-31: the symbol $\mathcal{I}^T$ is not introduced. I have no idea what this is. It should be the “adjoint” model, i.e. the transpose of the tangent linear model matrix, which is never explicitly formed.
l623: adjont -> adjoint
l624: “the Newtonian damping given by the terms −εδσ1,2,3 in eq. (A18-A20)”
this information should’ve come much earlier …
page 32
l723: Lemieux,J.F.,
missing space