Edinburgh Explorer Data assimilation using a hybrid ice flow model

. Hybrid models, or depth-integrated ﬂow models that include the effect of both longitudinal stresses and vertical shearing, are becoming more prevalent in dynamical ice modeling. Under a wide range of conditions they closely ap-proximate the well-known First Order stress balance, yet are of computationally lower dimension, and thus require less intensive resources. Concomitant with the development and use of these models is the need to perform inversions of observed data. Here, an inverse control method is extended to use a hybrid ﬂow model as a forward model. We derive an adjoint of a hybrid model and use it for inversion of ice-stream basal traction from observed surface velocities. A novel aspect of the adjoint derivation is a retention of nonlinearities in Glen’s ﬂow law. Experiments show that in some cases, including those nonlinearities is advantageous in minimization of the cost function, yielding a more efﬁcient inversion procedure.


Introduction
Direct observations of many parameters crucial to behavior of glaciers and ice sheets are practically impossible (e.g. history of ice-sheet-wide surface temperature and precipitation, ice fabric) and those that are feasible are logistically challenging and usually confined to specific locations (e.g. basal sediments, subglacial water pressure). Therefore, the application of inverse methods in glaciology continues to gain popularity. Although inversion for history of the atmospheric temperature (MacAyeal et al., 1991;Dahl-Jensen et al., 1998), precipitation (Waddington et al., 2007, and firn thermal properties (Sergienko et al., 2008b) have been performed, inversions for the ice-stiffness parameter of ice shelves and ice-stream basal parameters are most common Correspondence to: D. N. Goldberg (daniel.goldberg@noaa.gov) (e.g. MacAyeal, 1992;MacAyeal et al., 1995;Rommelaere, 1997;Vieli and Payne, 2003;Larour et al., 2005;Khazendar et al., 2007;Sergienko et al., 2008a;Joughin et al., 2009). Traditionally, these inversions are done in the integral leastsquare sense, i.e. a total misfit between observed and calculated quantities is minimized. This approach is known as an optimal control or control method (MacAyeal, 1992(MacAyeal, , 1993. Alternative methods include a probabilistic approach (Chandler et al., 2006;Gudmundsson and Raymond, 2008;Raymond and Gudmundsson, 2009), and various iterative approaches to solving the inverse problem using higher-order forward models (Maxwell et al., 2008;Arthern and Gudmundsson, 2010).
Any inverse method includes a forward model as a necessary component. In numerous studies inverting either for rheological properties of ice shelves or basal conditions under ice streams, the so-called Shallow Shelf Approximation (SSA) (Morland and Shoemaker, 1982;Muszynski and Birchfield, 1987;MacAyeal, 1989) is used as a forward model. However, a new trend of using higher-order or full stress-balance models as forward models in inversions has started to emerge (Maxwell et al., 2008;Morlighem et al., 2010a). The SSA balance is of lower computational dimension than the First Order or Full Stokes balances (Greve and Blatter, 2009), and therefore easier to solve. It does not account, though, for the effect of vertical shear, which has an effect on the nonlinear Glen's Law viscosity (Glen, 1955) and the basal velocity used in flow laws, and which can be nonnegligible where basal traction is high. This is true of inland areas of the Antarctic Ice Sheet and majority of the Greenland Ice Sheet. Also, in their study of Pine Island Glacier, Vieli and Payne (2003) speculate that ∼20% of the observed velocity in the region of high driving stress just upstream of the grounding line is due to vertical shear, and that this contributed to quantitative errors in their analysis using the SSA balance. 316 D. N. Goldberg and O. V. Sergienko: Data assimilation using a hybrid ice flow model A class of glaciological models involves a verticallyintegrated stress balance that includes the effect of vertical shearing stresses, but also includes horizontal stress terms (the terms present in the SSA). For the purpose of discussion these models are referred to here as "hybrid" models since they combine two low-order approximations: the SSA and the Shallow Ice Approximation (SIA, Hutter, 1983). For example, Bueler and Brown (2009) heuristically combine the results of an SSA solution with an SIA solution, while Pollard and DeConto (2009), Schoof and Hindmarsh (2010), and Goldberg (2011) use depth-integrated forms of the horizontal stress terms. While these hybrid models account for all of the stress terms in the First Order balance, they have a computational advantage in that the elliptic solve (the most expensive step) is not resolved in the vertical. Note that, unlike the First Order balance, these models do not allow depth variations of horizontal stresses. However, in the approximation to First Order is shown to be quite good under a wide range of conditions (Pattyn et al., 2008;Goldberg, 2011). Furthermore, two of these models (Bueler and Brown, 2009;Pollard and DeConto, 2009) have been used in time-dependent wholecontinent simulations of Antarctica and Greenland. Clearly it is of value to be able to use the hybrid models as forward models in inversion procedures in order to find an optimal set of unknown parameters for these models. However, such an inversion has not yet been performed. Additionally, as use of these models becomes more common it will be advantageous to perform comprehensive and efficient analysis of the model sensitivities to a wide range of input parameters.
The control method developed by MacAyeal (1992) involves a construction of a model adjoint to the SSA equations in order to find the gradients of the performance index (or cost function) with respect to inverted parameters (basal sliding parameters in studies by e.g., MacAyeal, 1992;Joughin et al., 2009, and the ice stiffness parameter in studies by e.g., MacAyeal et al., 1995;Larour et al., 2005). The adjoint model is a powerful tool that allows one, in a single step, to find derivatives with respect to a large number of parameters at a point in solution space. However, in deriving this adjoint model, nonlinearities, such as the dependence of the Glen's Law viscosity on strain rates, are ignored. It is not clear whether the inclusion of this dependence is advantageous to the performance of the method, since without it the adjoint equation is the same as the linear one solved iteratively in the forward model.
In this paper we invert surface velocities for basal traction fields using the hybrid model from Goldberg (2011) as a forward model. Both synthetic and observed surface velocities are used in the inversions. The paper is organized in the following manner: in Sect. 2 the forward model is briefly introduced. The inversion scheme, which includes the adjoint model as a central part, is presented and discussed in Sect. 3, with the derivation and some of the lengthier expressions relegated to the Appendix A. Sections 4 and 5 present inversions of synthetic velocities, and results of an inversion using satellite-inferred surface velocities on Pine Island Glacier are shown in Sect. 6. Special attention is paid to the effects of including the nonlinearities mentioned above in the adjoint model on the convergence of the inversion scheme.

Forward model
The forward model used in this study is the one described in Goldberg (2011). It can be derived from a variational formulation, using a modified form of the energy functional that leads to the First Order balance (Reist, 2005;Schoof, 2010). Using the First Order model results of the ISMIP-HOM experiments as a benchmark (Ice Sheet Model Intercomparison Project -Higher-Order Models, Pattyn et al., 2008), good agreement is shown for length scales larger than ∼20 km in basal topography and for all length scales in basal traction. The equations are given here: Here u and v are x-and y-velocities, respectively, s is surface elevation, H is thickness (s − b, where b(x,y) is basal elevation), and n represents the nonlinearity in Glen's Law and in this study is equal to 3. The overline operator indicates vertically averaging, i.e. u = 1 H s b udz, and u x indicates the x-derivative of this quantity (and not the vertical average of u x ). The surface is stress-free. When sliding is present, the sliding law is in terms of the shear stress and velocity at the base. In this study, the sliding law is linear: Note that due to the inclusion of vertical shear, basal, surface, and depth-averaged velocity can all differ, in contrast to the SSA. This does not prevent a significant problem in terms of solving Eqs. (1)-(4), however. An iterative scheme can be developed by depth-integration of Eqs. (1) and (2), and writing u z and u(z = b) in terms of the current iterates of depth-averaged velocity, basal stress and viscosity. (The surface velocity, while not needed in the iterative scheme, can be similarly diagnosed.) This yields a set of equations to be solved for the next iterate of depth-averaged velocity. The equations have the same structure as those solved in an iterative solution of the SSA balance, so an SSA code can be easily modified. For details, please see Goldberg (2011). view settings. The approach is essentially the same as in MacAyeal (1992), and similar to that of Arthern and Gudmundsson (2010) -a steepest-descent method. The differences are (a) the forward model and (b) the fact that nonlinearities are accounted for in constructing an adjoint model. But the same paradigm of finding the search direction as a functional derivative that is then discretized (rather than by differentiating the discretized forward equations) is still adhered to. The effect of accounting for nonlinearities is explored in the subsequent sections. The cost function

Adjoint model
where u s is horizontal velocity at the surface, the asterisk superscript denotes observed quantities, and is the model domain, is minimized over all choices of β. The set of possible β depends on the choice of basis for β: for example, in MacAyeal (1993) it is Fourier modes and in MacAyeal et al. (1995) it is the finite element basis for velocity. Beginning with an initial guess for β, the control method finds the gradient of J with respect to the degrees of freedom of β (subject to the constraint that Eqs. (1)-(4) are satisfied). A line search minimization then gives a new guess for β. A Fletcher-Reeves conjugate search algorithm is used (Press et al., 1992). Typically when this control method is applied to glaciological models, an adjoint model is solved, and the result is used to find the gradient of J with respect to basal traction parameters (or other field that is being inverted for). The advantage of using an adjoint model is that the derivative of a given observable value can be found with respect to a large array of input parameters for the computational cost of a single forward solve. This is in contrast to finding derivatives by direct finite differencing, which requires a separate forward solve for each input parameter. In this study the adjoint of the model described in the previous section is constructed directly from the differential equations, rather than discretizing and taking the adjoint of the discretized model. The result is a set of linear partial differential equations that are then solved in order to find the gradient of J . Note that this procedure does not assume any discretization details, and the discretization of the adjoint can be independent from that of the forward model.
We now present the adjoint model. Its derivation is lengthy and not entirely straightforward, and so it is left to the Appendix A. Furthermore, the expressions involved in the adjoint itself are lengthy, so details are only given for the adjoint of the flowline version of the model (i.e., flow in the x − z plane). The adjoint of the three-dimensional model, and its derivation, are very similar.
The flowline version of Eqs.
(1)- (3) is Here u b = u(z = b). In our flowline inversions, periodic boundary conditions are considered. As in MacAyeal (1993), the cost function J is modified: the flowline model appears as a constraint, with lagrange multiplier λ: where [0,L] is the domain. An adjoint model is then derived that must be solved for λ: (10) where F and G are linear operators on their first arguments (λ and u * s − u s , respectively) that also depend on u and β. γ and γ s are functions that depend on the gradients of u: and F and G are given in the Appendix A. (Note that if n = 1, then γ and γ s are equal to H (u − u b ) and H (u s − u), respectively.) Equation (10) is solved for λ with appropriate boundary conditions: if the boundary conditions on Eq. (6) are Dirichlet, then Eq. (10) has homogeneous Dirichlet boundaries. If Eq. (6) has periodic boundary conditions, then Eq. (10) does as well. Note that the form of the adjoint model Eq. (10) is dependent on the forward model and the form of the cost function J , but would be the same no matter which input parameter is being investigated. However, in this study the goal is to find the gradient of J with respect to β, which is done using the following: which gives the response of J to a perturbation in β. Again, K is a linear operator on λ, the specific form of which is given in the Appendix A. With a finite-dimensional representation of β, Eq. (12) can then be discretized to find ∂J ∂β i , where β i are the degrees of freedom in such a representation, i.e.
Equation (10) is written in such a way (i) because the expressions for F , G, and K are lengthy, and (ii) in order to emphasize the effect of ignoring the dependence of viscosity on strain rates, something quite often done in glaciological inversions using adjoints (MacAyeal, 1993;Vieli and Payne, 2003;Larour et al., 2005;Joughin et al., 2009). Doing so here is equivalent to neglecting F , G, and K, and additionally letting n = 1 in Eq. (11). The adjoint model with F and G left in and n = 1 will subsequently be referred to as the complete adjoint and, with F and G ignored and n = 1 as the incomplete adjoint.
It is interesting to note that in doing this, the operator given by the left hand side of Eq. (10) is the same as the forward model when the viscosity is "frozen", as in a Picard-iterative solution (MacAyeal and Thomas, 1986). This is not a coincidence; when the strain-rate dependence of viscosity is ignored, the equation system in Goldberg (2011) is linear and self-adjoint under the L 2 inner product. (This is also true of the three-dimensional model.) Thus using the incomplete adjoint saves on development time and also ensures that the adjoint model has the same desirable properties as the forward model (i.e., that the matrix that is solved is symmetric and positive definite). If the matrix is solved in parallel, the domain decomposition and parallel memory allocation need not change. As discussed in Goldberg (2011), the computationally expensive component of the hybrid model is the solution of a system of elliptic PDEs with the same structure as those solved for the SSA balance.
However, the adjoint model is only solved once per iteration of the inverse model, so it is possible that relaxing the property of self-adjointness will not carry too large a penalty. In the following sections, flowline and 2-D (plan view) inversions are carried out, and the effects of including such nonlinearities in the adjoint model are examined. An important point to remember is that, with the same forward model and cost function, the only way in which these effects can manifest is in the rate of convergence of the inversion. Whether nonlinearities are included or not, the solution (or set of solutions) of an inversion is the same.
In this study, the forward and adjoint models are solved using one-dimensional or bilinear finite elements as in Goldberg (2011) and Goldberg et al. (2009), and φ i is equal to 1 on grid cell i and zero elsewhere. Still, the discussion above does not depend on specific details of the discretization.

Flowline inversion
A flowline version of the hybrid model was used to invert synthetic surface velocities for basal traction, assuming a linear sliding law. The experiments are based on the flowline experiments with sliding from the ISMIP-HOM intercomparison. The domain is periodic, and ice thickness has a constant value of 1000 m and a surface at an angle of 0.1 • with the horizontal, and the Glen's Law constant is uniformly equal to 2.1544 × 10 5 Pa (m a −1 ) − 1 3 . The velocities inverted for β are the surface velocities from a First Order flowline solver (that used in Goldberg, 2011) with the same thickness, surface slope, and Glen's Law constant, and a β-profile given by where basal stress is given by and L x is the length of the domain. As shown in Goldberg (2011), the hybrid model surface velocities agree well with First Order surface velocities in this setting. And so while these inversions are known to be ill-posed, and thus many different β-profiles could produce surface velocity profiles close to the "target" one, we still know that Eq. (15) is a valid solution to the inverse problem, or at least leads to an cost function value as small as any achieved in this study (Goldberg, 2011). Maxwell et al. (2008) state that their method finds the solution to the inverse problem for which noisy oscillations are minimized. And so we can judge our inversion results not only by the agreement of calculated and prescribed surface velocities, but also by the agreement of the inverted β with Eq. (15). For an inversion scheme, both the incomplete adjoint mentioned in the previous section and the complete adjoint were used. In all inversions, the initial guess for β was set to a uniform value. 300 iterations of the inverse model were done, regardless of the final value of J . Figure 1 shows the results of such an inversion with a domain length of 40 km. Inversions using both incomplete (dashed line) and complete (solid line) adjoints are shown. Values of J versus iteration count are shown, as well as the final inverted β 2 (which is compared with β 2 given by Eq. 15). Comparison is made using β 2 rather than β since it is β 2 that appears in Eq. (16). In the left column, the initial guess for β is 20 Pa 1 2 (m a −1 ) − 1 2 (uniformly), and in the right the initial guess is 40 Pa 1 2 (m a −1 ) − 1 2 . With either initial guess, the complete adjoint reaches a much smaller value of J than the incomplete adjoint (by several orders of magnitude) and finds a solution very close to Eq. (15), while the incomplete adjoint finds a highly oscillatory solution. Using the complete adjoint, J decreases steadily, while with the incomplete adjoint most of the reduction of J is in the first few oscillations.
In fact, with the initial guess of 20 Pa found by the complete adjoint scheme is relatively insensitive to the initial guess for β. Figure 2 shows the results for a domain length of 20 km, with an initial guess for β of 20 Pa Here the contrast between the complete and incomplete adjoints are even more apparent. The complete adjoint steadily reduces J and finds a solution that is reasonably close to Eq. (15), while the incomplete adjoint barely adjusts β from its initial guess and does not reduce J after the very first iteration. No other initial guess was examined for this domain length.
The ability of the adjoint models to represent derivatives, particularly derivatives of surface velocities, can be examined. The calculation of such derivatives can be cast in terms of the adjoint method: using the fact that where δ x i (x) is the Dirac delta function shifted by x i , the same approach as described above is applied to and the adjoint model is again derived, but with a different right hand side than Eq. (10). The expressions are not given here, but again it is simple to separate out the terms corresponding to the strain-rate dependence of viscosity. Figure 3a shows the Jacobian of surface velocities with respect to basal traction values, calculated directly by finite differencing. (With no analytic expression for the Jacobian, this is taken as the "true" value.) Basal traction is given by Eq. (15) and there is no topography. The figure can be seen as a contour plot of ∂u(x i ) ∂β(x j ) , where x i is along the horizontal axis and x j the vertical. Figure 3b shows this Jacobian as calculated using the complete adjoint model, similarly to the way the gradient of J with respect to β is found. Figure 3c is the equivalent calculation using the incomplete adjoint model. Quite a difference between the two can be seen, especially for the derivatives of the velocities in the "slippery" region with respect to the traction values in the "sticky spot". It seems that using the incomplete adjoint model underestimates these sensitivities, and therefore overshoots in its guess for these traction values, leading to the oscillations seen in Figs. 1c and 2b. It was noted in Goldberg (2011) that in the ISMIP-HOM tests of nonsliding flow over wavy topography, agreement of the hybrid model with First Order surface velocities was not as good as in the tests of sliding flow over periodic traction. Inversion for basal topography was not done in this study; however, it is fair to ask whether the presence of basal topography would affect the results presented in this section. A series of tests was done where there is still sliding at the base, but the basal elevation varies from the mean slope sinusoidally with the same wavelength as the basal traction and an amplitude of 100 m (compare with 500 m in the ISMIP-HOM experiments). The results were very similar to those in Fig. 1, and not shown.

Plan view inversion -synthetic data
The ISMIP-HOM intercomparison also includes a set of three dimensional experiments, one of which involves sliding over varying basal traction in a doubly periodic domain. Surface velocities from this experiment were inverted for basal traction, and the results are shown here. Both the complete and incomplete adjoints were used. The forms and derivations of these models are very lengthy and not shown, but they are simple extensions of the flowline adjoint models discussed in the Appendix A. Since a three-dimensional First Order solver was not used in this study, a mean was taken over the publicly available results from the intercomparison that solved the First Order balance (http://homepages.ulb.ac. be/ ∼ fpattyn/ismip/tc-2-95-2008-supplement.zip).
In this experiment the Glen's Law constant and the (uniform) thickness H had the same values as in the flowline experiments. While not directly used, the basal traction specified for the ISMIP-HOM experiment is where L x and L y are the x-and y-dimensions of the domain. As discussed before, this may not necessarily be the only possible solution of the inversion, but it is useful for comparison.
In the plan view inversions, it was found that when the initial guess for β was constant, the residual J did not decrease by much even after a large number of iterations, using either the complete or incomplete adjoint models (see Discussion and conclusions). Instead, two different spatiallyvarying initial guesses for β were considered: Note that β 1 is a Gaussian "bump" in the middle of the domain, while β 2 is sinusoidal in x (but not in y), but its peaks and troughs do not correspond to those of Eq. (19).
The results of the inversion are shown in Fig. 4, for L x and L y equal to 40 km. The left column corresponds to using β 1 as an initial guess, and the right to β 2 . The top row shows residual (J ) as a function of iteration count. In the case of β 1 , the inversion scheme reaches the same value of J after 100 iterations with either the complete or incomplete adjoint. However, J converges more quickly using the complete adjoint. With β 2 as an initial guess, there is almost no decrease in J using the incomplete adjoint, while using the complete adjoint achieves a decrease in J comparable with the β 1 case. The bottom row of Fig. 4 shows β 2 − β 2 ih , where β here is found using the complete adjoint. In the case where β 1 is the initial guess, the final inverted β using the complete and incomplete adjoints are very similar, though this is not true for the β 2 case. In the β 1 case, traction in the "slippery regions" (the top left and bottom right) is slightly overestimated and is slightly underestimated at the centers of the "sticky spots" (bottom left and top right), but overall the agreement is good. There is no remnant of the initial guess seen in the misfit. On the other hand, in the β 2 case the misfit is overwhelmed by a transverse strip at x = L x 2 , where β 2 should be equal to ∼1000 Pa (m a −1 ) −1 but is instead close to zero. This is indeed a remnant of the initial guess, since β 2 is zero along this line. Since horizontal stresses tend to damp out small scales in basal traction this does not have a large effect on the cost function J , but it demonstrates some dependence on initial guess.

Plan view inversion -real data
In addition to synthetic surface velocities, inversions were done using InSAR-and speckle tracking derived surface velocities (Joughin, 2002) and 5 km gridded ice thickness and bed elevation data from the Airborne Geophysical Survey of the Amundsen Sea Embayment, Antarctica (AGASEA) conducted during the 2004-2005 austral summer (Vaughan et al., 2006) (these data are available from http://nsidc.org/ data/nsidc-0292.html). An 80×80 km region containing the grounded portion of Pine Island Glacier (PIG) was selected. (This region contains the areas referred to as the "ice plain", the "steepening", and the "trunk" by Payne et al., 2004.) The object of this exercise was not to ask specific glaciological questions; many studies have used established inversion methods to investigate basal properties of PIG (e.g. Payne et al., 2004;Joughin et al., 2009;Morlighem et al., 2010a). Rather, the purpose is to assess the convergence properties of the hybrid model inversion scheme with "real" data.
The boundary conditions of this inversion differ from the previous plan view inversions in that they are not periodic. The depth-averaged velocities at the boundary of the domain are constrained to be the interpolated InSAR surface velocities. (As discussed in Goldberg (2011), lateral boundary conditions can only influence the solution through their depth average.) Figure 5c shows the convergence behavior of the incomplete and complete adjoints, with an initial, uniform guess for β of 10 Pa 1 2 (m a −1 ) − 1 2 . Basal stress |τ | corresponding to the complete adjoint is shown in Fig. 5d. (We show basal stress rather than β 2 , since in this experiment velocities are not derived synthetically using an analytical expression for β.) Additionally, the relative importance of vertical shear in the corresponding forward model solution is shown by Fig. 5e, in which is plotted. For much of the region speed due to vertical shear is less than 20% of the the surface speed, but there are areas (ones that coincide with high basal traction) where vertical shear accounts for up to 80% of the surface speed. Such areas could not be resolved with the SSA model as the forward model, due to its assumption of no vertical shear. The corresponding fields from the inversion with the incomplete adjoint are very similar both in magnitude and spatial pattern, and are not shown. The convergence behavior shown in Fig. 5c differs from that seen in the experiments using synthetic observations. First, the cost function J is not lowered by as many orders of magnitude. This is expected, though, since the synthetic surface velocities were generated using a flow model to which the forward model is a very close approximant. Second, the difference in convergence rate between the complete and incomplete adjoints is not as dramatic. Inversion with both choices finds similar solutions, and at comparable iteration counts, the cost function in the incomplete adjoint inversion is at most twice that of the complete adjoint inversion. Still, the fact that this is achieved early in the inversion (between 10 and 20 iterations) shows that the complete adjoint could still have some utility.
In further contrast to the prior experiments, sensitivity to the initial guess of β was observed with the complete adjoint. When the initial guess for β was very high (30-40 Pa 1 2 (m a −1 ) − 1 2 ), no convergence was observed for either choice of adjoint. The observed behavior of J was similar to that seen for the incomplete adjoint in the synthetic data experiments: almost no decrease in J was observed. Despite the results of the synthetic-observation experiments, this demonstrates a strong sensitivity of the inversion process to the initial guess of the inverted parameters. The technical aspects of this issue are beyond the scope of the present study, and are the subject of further investigations.
An important consideration is whether the result of such an inversion is appropriate for the forward model. In Goldberg (2011) solutions of the hybrid model were compared with First Order solutions. It was found that the models were in good agreement when the basal slope was smaller than ∼0.07 for flow over a frozen bed, which is close to the maximum basal slope in the region of PIG considered. It was also observed that agreement was very good down to very small length scales in basal traction. And so the basal traction and slope shown in Fig. 5b and 5d  surface velocities of Pine Island and Thwaites Glaciers using an SSA forward model, note that the forward model balance is not strictly applicable in strong-bedded regions. It is possible that inversions with a hybrid model can give more complete results without using a three-dimensional forward model. However, it should be noted that the above statements may not apply for regions where First Order approximations are violated (i.e. near the grounding line). Morlighem et al. (2010a) compared inversions of PIG and its catchment and tributary region using SSA, First Order, and Full Stokes forward models. Their results showed that nonhydrostatic effects were of leading-order importance in the grounding zone (part of which protrudes into the bottom of our domain, between ∼40 and 60 km in the x-direction), where the sharply rising bed exerts a backpressure on the flow.

Discussion and conclusions
Including the nonlinear terms in Eqs. (10) and (12) (i.e., using the complete adjoint instead of the incomplete adjoint) does not change the solution of the inversion; it can only affect whether the inversion scheme finds a minimum of Eq. (5) and the speed of convergence. In this sense the flowline inversions demonstrated a clear advantage in including these terms. Use of the complete adjoint resulted in fast convergence toward a minimum with relative independence on initial guess, which was not the case for inversions using the incomplete adjoint.
In the plan view inversions of synthetic data, the rate of convergence improved with the inclusion of nonlinear terms. However, there was no convergence when the initial guess for β was spatially uniform, whether nonlinear terms were included or not. This is due to specifics of the model set up, namely the periodic boundary conditions. With such conditions and a uniform β, the forward model solution is a velocity field that does not vary in x or y, and so has a small effective strain rate (entirely due to vertical shear) and a high Glen's Law viscosity. The result is that the search direction found by the adjoint model is nearly uniform, even though the misfit (u * s − u s ) has relatively large variation. This effect was verified by decreasing the Glen's Law coefficient B for the first few iterations (not shown).
In the plan view versions of observational data, the improvement of convergence rate was not as dramatic as for of synthetic data, and also the relative insensitivity of the complete adjoint inversion to initial guess seemed to disappear. We point out that there are several reasons why performance of the complete adjoint model does not show a dramatic improvement over the incomplete adjoint. First, there are limitations of the forward model associated with small scales in surface velocity and bed heterogeneity. Second, and perhaps more important, the data sets used in the inversion -bed elevation, ice thickness, and surface velocities -were obtained by different techniques during different time periods, and are incompatible (Morlighem et al., 2010b). Such data incompatibility has a strong effect on the inversion process. By contrast, complete versus incomplete adjoint may give a finetuning effect which is overshadowed by stronger factors. The effects of data compatibility on the results of inversion with a hybrid model are a subject of ongoing investigation.
Still, the PIG inversion shows a small but noticeable difference was seen after a relatively small number of iterations. Since inversions might involve a cutoff after a target residual has been reached rather than a fixed number of iterations, this shows that the complete adjoint may still have some utility in such inversions, provided it is not too expensive to calculate relative to the incomplete adjoint.
Using a hybrid model that accounts for vertical shear within the ice as a forward model has several advantages. Among them are possibilities to invert (or optimize) for parameters over regions that cannot be described by a single zero-order approximation (SIA or SSA) but do not require treatments of Full Stokes models. Bost fast, streaming and slow, vertical shear-driven flow regimes can be considered in the same domain. The hybrid model is computationally more efficient than First Order models, and produces solutions of the same order of accuracy in a wide range of conditions appropriate to ice modeling. The derived adjoint model could be used for numerous applications: from inversion for other model parameters to model sensitivity studies. The adjoint is derived directly from the forward model equations rather than from their discretized equivalents, so the discretization of the adjont does not depend on that of the forward model. The use of a glacial flow model and its adjoint to invert for unknown flow parameters is not new; however, such approaches typically ignore the dependence on strain rates of the nonlinear viscosity. In this study it is seen that, for this particular forward model, including this dependence can have a measured effect on the convergence of the inversion scheme. The model from Goldberg (2011) was the only one considered; however, similar flow models are being developed or are already being used in large-scale ice models (e.g., Pollard and DeConto, 2009;Schoof and Hindmarsh, 2010), and the results of this study may indicate that inclusion of this dependence may be necessary for data assimilation using such models.
While flow in slow-moving regions can be represented more accurately with a hybrid model than an SSA model, Morlighem et al. (2010a) point out that the cost function (5) works better in fast-moving regions. Thus minimizing such a cost function may favor such areas at the cost of a relatively large misfit in slow-moving areas. They suggest using a different expression which measures the logarithm of the misfit (their equation 12). Additionally, they add a regularization term to their equivalent of Eq. (9) that penalizes oscillations in their inverted basal traction field. No such regularization was done in this study. It was seen in some of the results (e.g., Fig. 4d) that very high gradients in β can occur, depending on the initial guess. It is worth investigating whether the modified cost function or regularization term of Morlighem et al. (2010a) changes any of the results of our study.
Inversion of surface velocities for basal traction numbers was the only application of an adjoint model considered in this study, but there are others. Heimbach and Bugnion (2009) examined the sensitivity of the evolution of the Greenland Ice Sheet to initial conditions by deriving an adjoint model for the ice sheet model SICOPOLIS (Greve, 1997) using automatic differentiation tools. While that version of SICOPOLIS made use of the SIA balance to calculate velocities, the need for a similar study involving a model that uses a higher-order stress balance was underlined in their paper. The availability of continental-scale ice sheet models that do so, such as PISM (Bueler and Brown, 2009) or that of Pollard and DeConto (2009) present the possibility for such a study. In these models, the solution of the stress balance for velocities is but a single component of a timestep (the others being evolution of thickness, temperature, and in some cases basal water and isostasy); however, it is the only component that requires the iterative solution of a nonlinear elliptic equation. Solvers of such equations involve indirect matrix solvers, preconditioners, stopping conditions and indeterminate iteration counts. Applying automatic differentiation techniques to these solvers could result in lengthy computation in the derivation of an adjoint. Instead, it may be possible to analytically derive an adjoint for the elliptic solver and integrate it with the techniques used by Heimbach and Bugnion. With such a strategy it is worth considering both complete and incomplete adjoints. The structure of the incomplete adjoint would make it somewhat easier to develop a solver. On the other hand, it was shown in the flowline experiments that the complete adjoint can, in some cases, give a more faithful representation of derivatives. With a time-dependent model, there is potential for accumulation of errors over multiple timesteps, and using a better representation of model derivatives would help to control these errors.