Interactive comment on “ Of the gradient accuracy in Full-Stokes ice flow model : basal slipperiness inference ”

Introduction Conclusions References


Introduction
The main available observations of the cryosphere are generally obtained from remotesensed techniques and are thus essentially surface observations.In other respects, the ice dynamic is known to be highly sensitive to the state of the bedrock and therefore to its modeling.The basal slipperiness is consequently a crucial parameter in the perspective of controlling ice flows.
This raises questions about, on the one hand, the ability for the surface data to encompass basal conditions and, on the other hand, the potential for inverse methods to recover information.
The first question has been treated by many authors through the problem of the effect of the bedrock topography on the surface.Balise and Raymond (1985) propose fluctuations in the case of a Newtonian fluid, using perturbation methods.The nonlocal aspect of the transmission of the variations of the friction coefficient at surface is established in Raymond (1996) depending on the slip ratio (ratio between mean sliding velocities and mean ice deformation velocities).These queries are extended in Gudmundsson (2003), still under the Newtonian hypothesis using perturbation methods.
One of the main conclusions is the increase of the transmission of basal variability at surface with an increased sliding.
The question of the representability of the friction coefficient through surface velocity observations (horizontal and vertical) using an inverse method is studied by Gudmundsson and Raymond (2008).The method, based on a Bayesian approach, is used to study the effect of density and quality of surface velocity data on the estimation of the friction coefficient for a Newtonian fluid and a linear sliding law.A limit wavelength in the reconstruction of small amplitude variations of the friction coefficient is found to be around 50 times the ice thickness.A similar method in the case of a non-Newtonian fluid and a non-linear sliding law is developed in Raymond and Gudmundsson (2009).
In other respects, the identification method based on Macayeal (1993) and widely used (see e.g.Larour et al., 2005;Joughin et al., 2004;Morlighem et al., 2010) makes the assumption that viscosity is independent of the velocity and a limited attention has been paid to the quality of the resulting estimations in terms of spatial variability of the friction coefficient (see Gudmundsson and Raymond, 2008).Comparisons with this so called "self-adjoint" method and the use of an exact adjoint are made by Goldberg and Sergienko (2011), based on a vertically integrated approximation.Limitations for the minimizing process are highlighted when using the "self-adjoint" method.To the best of our knowledge, the use of an exact adjoint for the non-Newtonian Stokes problem has been done only by Petra et al. (2012).A comparison to the results of Gudmundsson and Raymond (2008) on an academic problem bring them to conclude that the exact adjoint method is able to recover wavelengths in the friction coefficient of approximately 20 times the ice thickness in the case of a linear sliding law.The purpose of this study is the numerical evaluation of the limitations of the "selfadjoint" method compared to the exact method.The "self-adjoint" approximation is presented as a limit case of the reverse accumulation method used to compute the adjoint when obtained using source-to-source automatic differentiation.From a strictly numerical perspective, numerical tests on the reachable accuracy of the gradients for both methods are performed, demonstrating an important limitation for the gradient computed by the "self-adjoint" method.We then study the identifiability, for a non-linear sliding law, of high frequencies in the friction coefficient depending on the level of noise considered on synthetic data.The quality of the estimations provided by both methods is compared in the case of dense horizontal surface velocity observations for a quasiuniform flow and then for a realistic flow presenting an important spatial variability.The realistic case is eventually treated for less dense data.

Forward and backward model
In this section, we briefly present what shall be referred to hereafter as the forward model and describe the derivation of the adjoint model and the computation of the adjoint state.

Forward model
The flow model considered here is the bidimensional power-law Stokes model applied to a gravity driven flow, discretized using triangular Taylor-Hood finite-elements and solved on a given domain Ω:

Conclusions References
Tables Figures

Back Close
Full A velocity profile corresponding to the solution of the Stokes problem for a uniform steady flow in a flat channel with non-linear friction defined by Eq. ( 4) at the bottom is prescribed on the inflow boundary.This solution u = (u x, u z), expressed in the "meanslope" reference frame (x, z), writes (see e.g.Martin and Monnier, 2013): with θ defining the slope of the channel, H the height of the upper-surface and h the thickness of the channel.
A hydrostatic pressure is considered on the outflow.All the simulations are performed with an exponent m = 3 for the sliding law.The solution of the continuous forward problem is obtained using a classical fixed point algorithm. The

The basic principles of the adjoint model
The output of the forward model is represented by a scalar valued function j called cost function, depending on the parameters of the model, which evaluate a quantity to minimize.In presence of observations, part of the cost measures the discrepancy between the computed state and an observed state (through any type of data).
The parameters of interest are called control variables and constitute a control vector k.The minimizing procedure operates on this control vector to generate a set of parameters which allows a computed state closer to the observations to be obtained.In the sequel, the control vector includes only the spatially variable friction coefficient β.The optimal control problem we solve reads: This optimization problem is solved numerically by a descent algorithm.Thus, we need to compute the gradient of the cost function.This is done by introducing the adjoint model.

Cost function, twin experiments and Morozov's discrepancy principle
The cost function used for the identification is defined by: where the synthetic data u obs s are the horizontal surface velocities perturbed with a random Gaussian noise of varying level δ.The term T (∇β) called Tykhonov's regularization controls the oscillations of the control variable gradient.It is defined by:

Conclusions References
Tables Figures

Back Close
Full where the parameter γ quantifies the strength of the imposed smoothness.This term plays a role of convexification for the optimal control problem and thus restricts the region of admissible parameters to smoothly varying fields.The tuning of these weights can be achieved from various considerations generally related to the quality of the data (or the noise level) and the degree of smoothness sought on the control variable.
A classical approach, referred to as the Morozov's discrepancy principle (see e.g.Vogel, 2002;Martin, 2013), consists in choosing γ such that the final cost j (β; γ) matches the noise level on the data.The methodology of twin experiments is used as a first step towards real data processing and calibration model.Twin experiments are defined as follows: reference parameters of the model k ref are used to generate observations u obs .Then, the goal is to retrieve the set of parameters k ref starting from an initial guess k = k ref using the minimization of the cost function j .Thus, the algorithm compute at each iteration i a new set of parameters k i according to the gradient ∂j ∂k in order to make the cost Twin experiments are an ideal framework in which the only source of uncertainty comes from the ignorance of the parameters.It plays a role of validation of the minimization procedure and gives information on its efficiency and on the robustness of the inverse problem.

Derivation of the adjoint model
In order to efficiently compute all partial derivatives of the cost function j (k) with respect to the components of the control vector k, we introduce the adjoint model (see e.g.Lions, 1971).
In DassFlow software, the adjoint model is obtained by using algorithmic differentiation of the source code (see Honnorat et al., 2007;DassFlow Software).This last approach ensures a better consistency between the computed cost function (including all types of errors i.e. discretization errors, rounding errors, iterative algorithms etc.) and its gradient, since it is the computed cost function that is differentiated.A large part Introduction

Conclusions References
Tables Figures

Back Close
Full of this extensive task can be automated using automatic differentiation (see Griewank, 1989).In the case of DassFlow-Ice, the direct code is written in Fortran 95 and is derived using the automatic differentiation tool Tapenade (see Hascoët and Pascual, 2004).
Let K be the space of control variables and Y the space of the forward code response.In the present case, we have: where β is defined by Eq. ( 4).
Let us point out that we include both the state and the cost function in the response of the forward code.The direct code can be represented as an operator M : K −→ Y such that: The tangent model becomes ∂M ∂k (k) : K −→ Y.As input variable, it takes a perturbation of the control vector dk ∈ K, it then gives the variation dY ∈ Y as output variable: The adjoint model is defined as the adjoint operator of the tangent model.This can be represented as follows: Now, let us make the link between the adjoint code and the gradient dj dk we seek to compute.By definition of the adjoint, we have: It reads, using the relations presented above: If we set dY * = (0, 1) T and by denoting the perturbation vector dk = (δβ) T , we obtain: Furthermore, we have by definition: Therefore, the adjoint variable dk * (output of the adjoint code with dY * = (0, 1) T ) corresponds to the partial derivatives of the cost function j : A single integration of the forward model followed by a single integration of the adjoint model allow to compute all components of the gradient of the cost function.
The optimal control problem (Eq.9) is solved using a local descent algorithm, more precisely the L-BFGS algorithm (a quasi-Newton method), implemented in the M1QN3 routine (see Gilbert and Lemaréchal, 1989).Thus, these partial derivatives are used as input to the minimization algorithm M1QN3.The global optimization process is represented in Fig. 1.Introduction

Conclusions References
Tables Figures

Back Close
Full

Validation of the adjoint model
In order to control the validity of the adjoint code two tests are performed: the scalar product test and the gradient test.

Gradient Test
This test aim to control that the partial derivatives of the cost function are correctly computed by comparing it with a finite difference approximation (see e.g.Honnorat et al., 2007, for the detailed test procedures).
Let us consider the following order two central finite difference approximation of the gradient: This scheme leads us to define: According to Eq. ( 16), one must have: lim The gradient test consists in verifying this property.

Scalar product test
The scalar product test check that the adjoint code is actually the adjoint of the tangent linear code by verifying that the relation (Eq.12) holds numerically.Introduction

Conclusions References
Tables Figures

Back Close
Full 3 Exact adjoint, reverse accumulation and "self-adjoint" approximation The current model has been obtained using algorithmic (or automatic) differentiation of the source code.Automatic differentiation of a fixed point type iterative routine of the form y = Φ(y, u) (such as the solution of the non-linear Stokes problem using a Picard method) is carried out by reverse accumulation (see Griewank, 1989;Griewank et al., 1993).The reverse accumulation technique consists in building a computational graph for the function evaluation where the nodes of the graph represent every value taken by the function.To every node, an adjoint quantity containing the gradient of the function Φ with respect to the node, is associated.
The adjoint values are computed in reverse order.The final value of the gradient is given by the sum of the partial derivatives of the function of the nodes of the computational graph.This result is a consequence of the chain rule.This process a priori requires to store as many state of the system as iterations performed by the forward solver to reach the converged state.
It is shown by Christianson (1994) that, in the case of a forward computation carried out by a fixed point method, the adjoint quantity also satisfies a fixed point problem whose rate of convergence is at least equal to the rate of convergence of the forward fixed point.Based on this result, it is a priori necessary to retain every iteration of the forward run to evaluate the gradient.In practice, as further detailed in Sect.3.2, the number of reverse iterations required to obtain an adjoint state with the same precision of the forward state can be adjusted depending on the convergence speed of the direct construction.
The "self-adjoint" approximation in glaciology, applied to the shallow-ice system, has been proposed by Macayeal (1993).The approximation consists of deriving the adjoint equation system without taking into account the explicit dependency of the viscosity η to the velocity field u.It seems important to recall that the terminology "self-adjoint" only makes sense in the Newtonian case (n = 1) due to a resulting symmetrical stiffness matrix.Introduction

Conclusions References
Tables Figures

Back Close
Full In general, the procedure consists in calculating a mechanical equilibrium based on the complete non-linear system and to obtain the gradient by simply transposing the final computed state.The extension of this method to the non-Newtonian Stokes problem can be done without difficulty (see Morlighem et al., 2010).The mechanical meaning of this method for n = 1 is unclear.
In contrast, in the automatic differentiation context, the meaning of this approximation can be easily deduced.Indeed, the use of such a method comes to retain, in the reverse accumulation process, only the gradient computed from the final evaluation of the function Φ.The quality of such an approximation is thus questionable and will strongly depend on the problem one considers and the required accuracy on the gradient.
The quality of this approximation (compared to the exact adjoint state) for parameter identification is assessed by Goldberg and Sergienko (2011) for depth-integrated shallow-ice type equations but has never been treated for the full-Stokes equations.

Precision of the "self-adjoint" approximation
We consider the flow described in Sect.2.1.The domain is a channel with a flat bottom and constant thickness with an aspect ratio of 1/10 on a 10 % slope.The friction condition at the bottom is given by Eq. ( 4) with a constant β and an exponent m = 3.A stationary free surface flow, uniform with respect to x, is thus obtained.
The cost function is defined by: where the observations u obs are the horizontal velocities at the surface Γ s , (x, z) designates the mean-slope frame and the control vector k is restricted to the discrete basal slipperiness β.
The gradient tests carried out for the "self-adjoint" and exact adjoint methods, using cost function ( 18 precision of the forward problem ν = u k+1 − u k / u k in order to quantify the best attainable precision by the backward problem with respect to ν.This precision is explicitly given to the direct solver through a convergence threshold but can be seen as the available accuracy on the data u obs ; data presenting a noise of 0.01 % corresponds to a direct solution accuracy ν = 10 −4 .The gradient test compares the gradient computed by the adjoint code to a reference gradient.For these tests, the reference gradient is obtained using a centered finite difference approximation (of order 2) computed for a precision on the function evaluation of 10 −12 .This precision being considerably higher than those considered for the solution of the forward problem, the finite difference gradient plays the role of an "exact" value (see Sect. 2.5).
The exact adjoint shows the expected theoretical behavior.We recover the slope of 2 (in logarithmic scale) associated with the order of convergence of the finite difference approximation (16).Figure 2 thus shows that the precision of the adjoint state is of the same order as the one of the direct solver.
On the contrary, the precision of the gradient provided by the "self-adjoint" approximation is rather limited.The best reachable precision is slightly smaller than 1 irrespective of the direct solver precision ν (and thus, only one gradient test curve is plotted on Fig. 2, for the case ν = 10 −8 , ν being the precision of the forward solution).
The "self-adjoint" approximation used within a parameter identification process is thus not able to provide an accurate gradient.However, as further discussed thereafter, numerical tests demonstrate a certain ability for this approximation to partially reconstruct the friction coefficient (for a computational cost well below the one of the exact adjoint).Nevertheless, significant weaknesses for the reconstruction of high frequencies as well as the reconstruction of the main frequency of the friction coefficient signal, specifically for extreme situations of sliding (very slow or very fast), are brought to the fore.Introduction

Conclusions References
Tables Figures

Back Close
Full

Truncation of the reverse accumulation
This section focuses on the effect of a truncation of the reverse accumulation process.Figure 3 plots gradient test results obtained for a truncated evaluation of the adjoint state.To do so, we run the gradient test on adjoint states successively cut of one additional iteration.We thus obtain all the levels of precision attainable by all the intermediary adjoint states between the full adjoint and the "self-adjoint" approximation.This test is carried out for various levels of precision ν of the direct solver.The number of iterations performed by the direct solver to reach the required accuracy ν depends on this precision.The previous results concerning the precision of the gradient presented previously are well recovered (see Fig. 2).The lowest precision, identical for every ν and equal to 0.6, is obtained from the "self-adjoint" approximation (corresponding to 1 reverse iteration) and the highest precision is reached by the full adjoint (corresponding to the last point of each curve).
A linear decreasing of the error (in logarithmic scale) presenting a slope of 3.7 is observed.This error behavior is coherent with the result of Christianson (1994) who states that the computation of the adjoint state by reverse accumulation amount to a fixed point computation.In the present case, we have a reverse accumulation algorithm presenting a rate of convergence of 3.7.Yet, the convergence speed of the forward fixed point (not plotted here) leads to a slope of 3. The convergence of the adjoint state computation is therefore higher than the one of the direct state computation.This result explains the plateau observed for the final iterations; indeed, a faster convergence of the reverse accumulation algorithm allows to reach the converged adjoint state with fewer iterations.Again, the accuracy of the "self-adjoint" approximation appears strongly limited and the possibility of an incomplete method, intermediary between the complete adjoint and the retention of only one iteration could bring an important gain of precision; taking into account the linearly decreasing error (in logarithmic scale) leads to significantly

Conclusions References
Tables Figures

Back Close
Full improved accuracy for each additional iteration retained during the computation of the adjoint state.Furthermore, the faster convergence of the reverse accumulation algorithm compared to the direct solver allows, in any case, to spare few iterations during the computation of the adjoint state without any loss of precision.The number of unnecessary iterations is a priori strongly dependent on the situation and must be studied in every case.
For the present test case, we observe that the 5 last iterations during the reverse accumulation are useless whatever the level of precision of the forward run (see the plateau on Fig. 3).These 5 last iterations correspond to the 5 first iterations carried out by the direct solver.Avoiding the accumulation of these iterations for the adjoint state evaluation amounts to starting the reverse accumulation from a residual on the forward run of 0.1 (i.e. a relative variation between two successive iterates of 0.1).This observation, although dependent on the considered case, can be seen as an empirical method to define a criteria on the number of direct iterations that should be accumulated to obtain the best accuracy on the adjoint state.In the present case, it amounts to initiate the memory storage of direct iterations once the direct solver residual is lower than 0.1.

Basal slipperiness identifiability
This section focuses on the practical limits of identifiability of the basal slipperiness by both the full adjoint and the "self-adjoint" method.
The main goal is to draw conclusions on the possibility of using the "self-adjoint" method (which brings an important time and memory saving) and then on the quality of the results it provides in the perspective of realistic identification of the basal slipperiness.The quality of the results is evaluated in terms of frequencies and amplitudes of the reconstructed friction coefficients compared to the target ones.

Conclusions References
Tables Figures

Back Close
Full As presented before, the precision of the "self-adjoint" gradient is bounded, whatever the level of precision of the direct solver ν.This level of precision can be seen as an a priori accuracy on the data considered in the cost function.Only synthetic data are used in the present work.A Gaussian noise of level δ is thus added a posteriori to emulate real data.The precision of the full adjoint gradient depending on ν (and equivalently on the level of noise δ on the data), we seek to observe which value of ν is required to observe the limit of precision of the self-adjoint method.
To this end, we consider three noise levels δ of 0.01 %, 0.1 % and 1 % representing respectively very low, low and realistic noise.In all cases, to just observe the final cost reached by both methods does not allow their precision, for a noise level greater or equal to 1 %, to be distinguished.On the contrary, the frequency analysis demonstrates that an identical final cost does not mean an identical reconstruction of the friction coefficient.It shows the equifinality issue on this type of ill-posed inverse problem.
We first consider the idealized case of a quasi-uniform flow in a sloppy channel with flat bottom with very low and low noise level in order to highlight the numerical limits of the "self-adjoint" method.
We then perform pseudo-realistic, spatially variable, flow experiments with a realistic noise for various density of the surface data.All the identifications presented hereafter use, as an initial guess for the friction coefficient, the average value a of the target coefficient.

Quasi-uniform flow
The following experiments are performed on the same flat bottom channel as in Sect.3.1.A non-linear friction law, defined by Eq. ( 4) is considered at the bottom with an exponent m = 3.The target friction coefficient, variable in x, is given by: TCD Introduction

Conclusions References
Tables Figures

Back Close
Full where a is the average value of the basal slipperiness in Pa • s • m −1 .By extension, we set: where dx = 0.2 m denotes the length of a basal edge or, in other words, the sharpness of the bedrock discretization.
We set β r = β 3 r , the friction coefficient resulting from the sum of 4 frequencies corresponding to wavelengths of 20, 10, 5 and 2 edge length dx.The low frequency f 0 represents a carrier wave for the 3 higher frequencies f i , 1 ≤ i ≤ 3.In terms of thickness of the domain h, frequencies f i , 0 ≤ i ≤ 3 correspond to wavelengths of 4h, 2h, 0.8h and 0.4h respectively.The coefficients β n r , 1 ≤ n ≤ 3 are plotted on Fig. 4 for the case a = 1.These properties are summarized in Table 1.
The flow is uniform when the basal slipperiness is constant along the domain and can be described as quasi-uniform when the basal slipperiness is given by Eq. ( 19).
We seek to determine the level of spatial variability of the basal slipperiness the full adjoint and the "self-adjoint" method can provide through the identification process, based on surface velocity observations, with respect to the degree of slip.The degree of slip depends on the value of parameter a and will be described thereafter in terms of slip ratio r.The slip ratio is a dimensionless quantity that quantifies how slippery the Introduction

Conclusions References
Tables Figures

Back Close
Full bedrock is.It is calculated as the ratio of the mean sliding velocity u b to the difference between mean surface velocity u s and mean basal velocity u b (cf.Hindmarsh, 2004).It leads to: A slip ratio r = 1 represents a situation where surface velocities result half from the sliding and half from the deformation.We consider 6 different slip ratios ranging from very high friction (close to adherence) to very rapid sliding.The slip ratios r = 0.005, r = 0.05 and r = 0.5 can be described as moderate sliding and the slip ratios r = 5, r = 50 and r = 500 as rapid sliding.
In order to bring to the front the limitations of the "self-adjoint" approximation, the following identifications of β are performed for noise levels δ = 0.1 % and δ = 0.01 % on the surface velocity data.Let us point out that the "self-adjoint"method provides very similar results to the full adjoint in terms of final cost when δ = 1 % (not plotted on Fig. 5) and the distinction between both methods clearly appears for lower noises.The cost function is defined by: The tuning of the regularization parameter γ is achieved according to the Morozov discrepancy principle (see Sect. 2.3).We plot on Fig. 5 the application of this method to the identifications performed with both methods (full adjoint and "self-adjoint") in the case of an intermediate friction (r = 0.5).The corresponding curves for other slip ratios are identical and consequently not plotted.Figure 5 clearly demonstrates the inability, for the "self-adjoint" method, to provide a gradient accurate enough for sufficiently low noise.As a matter of fact, the "selfadjoint" gradient does not allow the optimal misfit, corresponding to noise levels δ = 0.1 % and δ = 0.01 %, to be reached.In this situation, the "self-adjoint" approximation is a priori not valid.This observation is independent of the degree of slip.

Conclusions References
Tables Figures

Back Close
Full In order to study the effects of the approximation on the gradient computation, we compare, in the following, the friction coefficient inferred by both methods for δ = 0.01 % and δ = 0.1 %.
The best inferred friction coefficient (according to Morozov) are noted β f for the fulladjoint and β s for the self-adjoint.The quantities β f and β s thus denote their associated Discrete Fourier Transform (DFT).We denote by β r the DFT associated to the target friction coefficient (Eq.19).The DFT β r , β f and β s obtained for the three small slip ratios (moderate sliding) are plotted on Fig. 6 and those obtained for the three high slip ratios (rapid sliding) are plotted on Fig. 7.

Moderate sliding
One observes first that frequencies f 0 and f 1 (see Table 1) are globally well reproduced by both methods for δ = 0.01 % whatever the slip ratio.Namely the carrier frequency f 0 is very well reconstructed by both methods and this property seems desirable.The full adjoint method shows a greater robustness when identifying these two low frequencies with respect to the slip ratio whereas a noticeable deterioration for the identification of frequency f 1 occurs for the "self-adjoint" method when slip ratio decreases.
However, frequency f 2 appears correctly captured by the full adjoint method while it does not appear in the spectrum of the "self-adjoint" one.An increased difficulty in capturing this frequency occurs with slip ratio decreasing.
Finally, the highest frequency f 3 does not appear in any of the spectrums of both β f and β s whatever the degree of slip.For a noise level δ = 0.1 %, one loses the ability to retrieve frequency f 2 using the exact method.The identification of frequency f 1 is accurately obtained for the slip ratio r 1 = 0.5 but we observe a deterioration of the result when slip ratio decreases.The "self-adjoint" method captures almost none of frequency f 1 whatever the slip ratio.
Concerning the carrier frequency, one observes difficulties for the "self-adjoint" method to reconstruct it accurately even for the slip ratio r 1 = 0.5.The frequency dis-

Conclusions References
Tables Figures

Back Close
Full tinctly appears on the spectrum but only 80 % of the target amplitude is recovered.The decreasing of the slip ratio deteriorates, for both methods, the identification of f 0 .In the case r 3 = 0.005, the full adjoint method recovers 70 % of the target amplitude where the "self-adjoint" method recovers 50 %.

Rapid sliding
Again, low frequencies f 0 and f 1 are well retrieved with the full adjoint method for every noise level.The carrier wave reconstruction is nevertheless diminished (around 80 % of the target amplitude) compared to the moderate sliding situation r 1 = 0.5 but stable with the increasing of r.Similarly, frequency f 1 is rather well represented by the exact method for all the situations despite a certain degradation with increasing r.However, frequency f 2 does not appear in any spectrum irrespective of both slip ratio and method, contrarily to the moderate sliding situations.Again, frequency f 3 is never captured.
A small but noticeable noise appears for the case r = 500 for the full adjoint method, particularly when δ = 0.01 %.
The "self-adjoint" method shows a relatively good reconstruction of f 0 and f 1 for the case r = 5 but introduces a noises between frequencies f 1 and f 2 .A strong deterioration of the reconstruction occurs when r increases; for a noise level δ = 0.1 %, the "selfadjoint" identification is almost unable to recover the signal for r ≥ 50.

Assessments
From these observations, we draw the following conclusions.Firstly, the degree of slip of the target plays a strong role for the limit of identifiability of the basal slipperiness in terms of frequencies; a smaller slip ratio induces a lower sensitivity of the flow to the basal slipperiness and consequently a higher filtering on the transmission of information from the bedrock to the surface.

TCD Introduction Conclusions References
Tables Figures

Back Close
Full This result is in coherence with the result presented in Martin and Monnier (2013) where one observes a propagation of the sensitivity along the thickness when the vertical velocity profile approaches the one of a plug flow.
A strong friction induces a vertical velocity profile rather convex with velocity gradients (shearing) mostly concentrated close to the bottom and thus a lower transmission of the information from the bottom to the surface.
Similarly to strong frictions, low frictions also reduce the quality of the reconstruction.Again it comes from a reduced sensitivity of the flow to the basal slipperiness when rapid sliding occurs but this lower sensitivity appears for different reasons.Intuitively, the case of a very low friction leads to lower local topographical effects and the resistance to the ice flow acts through an equivalent global topography at larger scale.This characteristic appears in the explicit solution of the uniform flow (Eq.6): in order for the mathematical expression to makes sense when β tends to 0, it requires the slope parameter θ to tend to 0 as well.This phenomena is physically observed: in the presence of an extended sub-glacial lake, one observes a signature of this lake at the surface as a very flat surface topography over the lake.This interpretation is retrieved in the normalized sensitivities plotted on Fig. 10.
These two observations allow to state the existence of a numerical identifiability maximum for the friction coefficient using adjoint-based method; the best situation to carry out quality identifications corresponds to the intermediate friction range where sliding effects and deformation effects on the dynamics are balanced (typically 0.5 < r < 5).The low accuracy of the "self-adjoint" gradient appears to be a strong limitation in the case of rapid sliding (r > 5).
For the current quasi-uniform flow, for a noise level δ = 0.1 %, a limit on identifiable wavelength using full adjoint, for any degree of slip, is 2h, h the thickness of the domain.
More accurate data could allow to infer higher frequencies in the case of moderate sliding (r ≥ 0.5).Introduction

Conclusions References
Tables Figures

Back Close
Full For the "self-adjoint" method, for a slip ratio r ≤ 5, a wavelength of 4h is well inferred and a wavelength of 2h is captured for r = 0.5 and r = 5.For a slip ratio r > 5, the frequencies considered in the experiment are inappropriate.
In other respects, a tendency for the "self-adjoint" method to introduce non-physical interferences within the inferred coefficient for very low noise appears.This non desirable phenomena increases when the slip ratio takes extreme values.Beyond the approximation aspect, one can deduce a lack of robustness of the "self-adjoint" method for very low noises.It seems coherent regarding the low precision the "self-adjoint" gradient provides.On the contrary, the full adjoint method provides a less accurate identification when the slip ratio goes away from 1 without introducing non physical effects in the inferred parameter.
It is of interest to notice that the inability to recover frequency f 3 is not a numerical limitation.For sufficiently accurate data, it is also identifiable using the exact adjoint.
Similar experiments are performed in the next section for a pseudo-realistic flow ran on a radar vertical profile of the grounded part of the Mertz glacier in Antarctica for surface velocity data with different density and a 1 % noise.

Real topography flow: the Mertz glacier
The flow considered in this section is identical to the one presented in Eq. (2.1).The computational domain is built from real field data; topography of the bedrock and of the surface are bidimensional radar-sensed layer of the Mertz ice tongue in East Antarctica.These layers have been measured along a flowline of this outlet glacier (American program ICECAP 2010, see Greenbaum et al., 2010).Our study focuses on the grounded part of the glacier.
Synthetic data are obtained using the following friction coefficient: and with a the average basal slipperiness, dx = 100 m the bedrock edge length and one set: The context of a non-uniform flow on a complex topography allows to carry on the comparison between both methods in the case of a realistic flow simulation.We can then draw practical conclusions on the validity of using the "self-adjoint" approximation.Frequency f 0 is a carrier wave with 50dx wavelength corresponding to 5h, h ∼ 1km the average thickness of the domain.Frequencies f 1 , f 2 and f 3 correspond then to wavelengths of 2h, h and h/2 respectively, providing a situation similar to the channel test case (see Table 2).
In the present case of a non uniform flow with complex topography, it is not feasible to simulate an average slip ratio r = 500.Given the important spatial variability, we are able to achieve a maximum average slip ratio r = 50.In the following identification, we consider only 5 slip ratios ranging from r = 0.005 to r = 50.The Morozov's discrepancy principle applied to these 5 situations is plotted on Fig. 8.
The observed behavior is similar to the one noticed (but not plotted) for the previous academical situation.Both methods behave identically in terms of cost decreasing for a 1 % noise level on the data.They demonstrate a robust behavior providing in all cases an optimal discrepancy (according to Morozov).The trend of over-fitting of the cost (i.e.Figures

Back Close
Full to realize a discrepancy smaller than the reachable one when using the target coefficient with perturbed data) already observed (for the exact adjoint) demonstrates that the gradient accuracy provided by both methods is a priori adapted to the noise level considered on the data.The peculiar behavior for the case r = 50 where the discrepancy remains lower than the optimal one regardless of the regularization parameter γ is detailed thereafter.Figure 9 plots the DFT of the friction coefficients inferred by both methods and of the target coefficient (Eq.19) for a noise level of 1 % on the data and for the 5 slip ratios r.
While the wavelengths considered in the friction coefficient (Eq.19) are similar (in terms of thickness ratio) to those considered for the quasi-uniform test case, the use of a higher noise on a non-uniform flow deteriorated the reconstruction at all levels.The carrier frequency amplitude (of wavelength 5h) is never fully recovered by any method but clearly appears for r ≤ 5. Likewise, frequency f 1 (of wavelength 2h), well captured in previous simulations by the full adjoint method, is fairly well reconstructed only for 0.05 ≤ r ≤ 5. Again the "self-adjoint" method is able to recover it only very partially.
However, the interferences introduced by the "self-adjoint" method within the inferred friction coefficient do not appear anymore for this level of noise on the surface data.It therefore seems coherent with the limited accuracy of the gradient provided by this method.
As a consequence, the chosen frequencies for these simulations are too high to be recovered in this non-uniform flow with realistic data.Numerical experiments show that an accurate reconstruction can be obtained, for the exact method, for a carrier wave of wavelength 10h and a perturbation of wavelength 5h; shorter wavelengths are not accessible.
What is of further interest is that the exact method brings, in all cases, an enhanced and more faithful reconstruction of the friction coefficient for both the carrier wave and the first perturbation.
The pattern of behavior of the rapid sliding case (r = 50) is different compared to the other cases.The full adjoint method retrieves roughly the carrier frequency with very

Conclusions References
Tables Figures

Back Close
Full high interferences (including one low frequency high amplitude interference) and the "self-adjoint" method does not capture any information of the target signal in addition to the initial guess.
In order to understand this phenomena, we plot on Fig. 10 the gradients ∂j /∂β(β 0 ) with β defined by (Eq.26) for several average value a of the basal slipperiness, described in terms of slip ratio r.The computed gradients are evaluated around β 0 = a.
Increasing the slip ratio has a very clear effect on the sensitivities.For slip ratios r < 1, the sensitivities include the local effects of the high frequencies contained in β, thus providing a highly variable gradient around an average behavior.The fact that the sensitivity decreases with r, due to poorer information transmission between the bottom and the surface, is recovered.It follows that, in the cases r < 1, the limitations in the identification of all the frequencies of the basal slipperiness come from the precision on the data.
The situations r > 1 bring significantly smoother gradients.The cases r = 6 and r = 13, that still represent moderate slip ratios, contain a certain local variability but their rather smooth appearance shows a strong correlation with the global topography and the high frequencies of β seem already erased from the gradient.In these situations, the main component resisting the flow is more the large scale (or equivalent) topography than the friction itself.
For higher slip ratios, the topographical effects seem to vanish as well, and the gradient only grows from the inflow boundary to the outflow boundary to reach a maximum value close to the right border.In the present case, one can deduce that the only effect resisting the flow is the hydrostatic pressure considered on the right boundary.
A global decreasing of the sensitivity with increasing r is also observed, reinforcing the existence of a sensitivity peak for in-between r.For r > 1, it is not the quality of the data that prevents an accurate reconstruction of β but the non-local behavior of the flow.When basal slipperiness vanishes, it does not embody more than a small fraction of the global resistance to the flow.An extreme example is the progress of an ice-shelf on water where the friction resistance is close to zero.In the case of a tridi-Introduction

Conclusions References
Tables Figures

Back Close
Full mensional solution, stresses would be taken over by lateral shearing.In our case, these effects do not exist and it is the hydrostatic pressure boundary condition that resists the flow.These clearly non-local effects demonstrate the inability to control otherwise than globally the flow, thus limiting the range of identifiable frequencies regardless of data accuracy.
These phenomena imply a strong equifinality for basal slipperiness lower than a certain value.This observation appears in the Morozov's curves (see Fig. 8) for the case r = 50.Indeed, the discrepancies for both methods are smaller than the theoretical optimal one, even for very strong regularization (γ large) providing almost constant β around β 0 .The initial cost itself, evaluated for a constant β equal to the average value, is barely higher than the theoretical optimal cost.
The associated minimization problem is ill-posed and the Tykhonov regularization on the gradient of β does not allow to overcome this problem.
For the case r = 50 and a regularization small enough (considering that the Morozov's principle does not allow the optimal γ value to be selected) it is noticeable that the full adjoint method is able to retrieve a small quantity of information, along with a large noise (optimal control problem obviously ill-posed) whereas the "self-adjoint" method does not provide anything else than the initial guess, irrespective of the value of γ.

Density of the data
The previous simulations has been performed using surface velocity data quite dense (one measure every dx).This section deals with identical test cases to the previous section but using sparser (one measure point every 1 km) and thus more realistic data (corresponding to one ice thickness, see e.g.Gudmundsson and Raymond, 2008).This density corresponds to approximately 10 times less measure points than the previous Figures case.We consider thereafter the following friction coefficient for the synthetic data: with dx = 100 m the edge length at the bottom and with: and: The friction coefficient chosen for these simulation contains lower frequencies than the previous one, simulating a carrier wave of wavelength 20h perturbed by high frequencies of wavelengths 10h, 5h and 2h.These characteristics are summarized in Table 3. Results are plotted on Fig. 11 for a noise level of 1 %.As a consequence, the level of identifiability assessed for dense data in the previous section is no longer valid.However, considering that one out of ten points has been retained, results seem rather convincing.The exact method is able to accurately recover frequencies of wavelengths 20h and 10h (corresponding to f 0 and f 1 ) for all degrees of slip.The "self-adjoint" method recovers the carrier wave quite well although a stronger friction (r ≤ 0.05) significantly degrades the reconstruction of the amplitude.Frequency f 1 is well captured for propitious situations (0.5 ≤ r ≤ 5).Frequency f 2 (of wavelength Introduction

Conclusions References
Tables Figures

Back Close
Full 5h, the lowest frequency considered in the dense data situation) is partially reconstructed by the exact method for r ≤ 5 and never captured by the "self-adjoint" method.The case r = 50 is a lot less problematic than previously, due to lower frequencies and subsequently less local effects regarding the sharpness of the bed discretization.A pronounced difficulty appears for the identification of frequency f 1 (of wavelength 10h).The case r = 50 is the only one where frequency f 2 does not appear in the spectrum of β f (consistently to the previous simulations).

Conclusions
The significant CPU-time saving brought by the "self-adjoint" method represents an important asset in its favor.However, its reliability is questionable and it seems important to know its limitations in order to perform realistic experiments.
The realistic simulation (low density data, 1 % noise, real topography, non-linear friction) allows us to assess the full adjoint method's ability to identify accurately wavelength greater or equal to 10 ice thickness and to capture effects of wavelength up to 5 thickness for slip ratio lower than 5.
The "self-adjoint" method is able to reconstruct wavelengths greater than 20 thickness (with noticeable difficulties for strong friction).Wavelengths of 10 ice thickness can be captured in propitious situations of intermediate sliding (0.5 ≤ r ≤ 5).
The results provided by the exact method are significantly better than those given by Petra et al. (2012) (who assess a limit of 20 ice thickness for a non-linear rheology).It is difficult to compare considering that the authors provide neither their slip ratio nor the density of the data.In addition, the authors of Petra et al. (2012) use a linear friction law.
The use of a non-linear friction allows to simulate complex behaviors of the icebedrock interaction.This type of law can describe a non-linear deformation of the basal substrate or a non-linear response of the sliding velocity to the water pressure of subglacial cavities.The former reconstructions focus on the identification of a generic β.Introduction

Conclusions References
Tables Figures

Back Close
Full However one may confidently generalize these results to more complex sliding laws where β would be identified through its parameterization (by a water pressure, a contact surface with sub-glacial cavities, a sedimentary roughness, a geothermal flux,).It is important to recall that over-parameterization is hardly ever in favor of an accurate identification and that the identification of several parameters simultaneously would strongly reinforce the problem of equifinality (i.e. the ill-posedness of the optimization problem).This work focuses on the identification of the basal slipperiness that plays a major role to control the flow (i.e. the model shows a great sensitivity to the friction).The identification of a parameter such as the consistency η 0 , for which the model sensitivity is significantly lower, needs to be done with precaution for the exact adjoint method (see Martin and Monnier, 2013) and thus with increased precaution for the "self-adjoint" method.
Finally, the adjoint obtained from source-to-source algorithmic differentiation allows to simulate every level of needed precision between the best precision of the exact adjoint to the lowest one of the "self-adjoint" approximation.It leads to the consideration of an incomplete adjoint methodology where the approximation is completely adjustable, thus allowing the right compromise between CPU-time, memory burden and required accuracy to be achieved.Numerical experiments demonstrate that the retention of the last two states within the gradient computation significantly improves its precision while maintaining a quite small computational burden.Full one of the earlier study concerning the transmission at surface of basal slipperiness Figures Back Close Full Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | where σ ¯= η(u)D ¯− pId represents the Cauchy stress tensor, η(u) the viscosity, η 0 the consistency of the fluid, n the power-law exponent, D¯the strain rate tensor, u = (u x , u z ) the velocity field, p the pressure field, ρ the ice density, g the gravity and D type sliding law is prescribed at the bedrock boundary Γ fr : sensitivities and identifications carried out in this work use adjoint-based computation and thus require the solution of the adjoint problem associated with the full-Stokes modelDiscussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | It takes dY * ∈ Y as an input variable and provides the adjoint variable dk * ∈ K at outputDiscussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | ), are plotted on Fig. 2. The tests are performed for various levels of Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper |

Fig. 6 .
Fig. 6.Discrete Fourier Transform of inferred friction coefficient β f and β s and of the target friction coefficient β r .Moderate sliding.