We address the inverse problem of inferring the basal geothermal heat flux from surface velocity observations using a steady-state thermomechanically coupled nonlinear Stokes ice flow model. This is a challenging inverse problem since the map from basal heat flux to surface velocity observables is indirect: the heat flux is a boundary condition for the thermal advection–diffusion equation, which couples to the nonlinear Stokes ice flow equations; together they determine the surface ice flow velocity. This multiphysics inverse problem is formulated as a nonlinear least-squares optimization problem with a cost functional that includes the data misfit between surface velocity observations and model predictions. A Tikhonov regularization term is added to render the problem well posed. We derive adjoint-based gradient and Hessian expressions for the resulting partial differential equation (PDE)-constrained optimization problem and propose an inexact Newton method for its solution. As a consequence of the Petrov–Galerkin discretization of the energy equation, we show that discretization and differentiation do not commute; that is, the order in which we discretize the cost functional and differentiate it affects the correctness of the gradient. Using two- and three-dimensional model problems, we study the prospects for and limitations of the inference of the geothermal heat flux field from surface velocity observations. The results show that the reconstruction improves as the noise level in the observations decreases and that short-wavelength variations in the geothermal heat flux are difficult to recover. We analyze the ill-posedness of the inverse problem as a function of the number of observations by examining the spectrum of the Hessian of the cost functional. 
Motivated by the popularity of operator-split or staggered solvers for forward multiphysics problems – i.e., those that drop two-way coupling terms to yield a one-way coupled forward Jacobian – we study the effect on the inversion of a one-way coupling of the adjoint energy and Stokes equations. We show that taking such a one-way coupled approach for the adjoint equations can lead to an incorrect gradient and premature termination of optimization iterations. This is due to loss of a descent direction stemming from inconsistency of the gradient with the contours of the cost functional. Nevertheless, one may still obtain a reasonable approximate inverse solution particularly if important features of the reconstructed solution emerge early in optimization iterations, before the premature termination.

We consider the following inverse problem: to infer the unknown basal geothermal heat flux field given surface velocity observations and a non-Newtonian full Stokes ice sheet flow model governed by thermomechanically coupled mass, momentum, and energy equations. Grid-based discretization of the basal heat flux field leads to a high-dimensional inverse problem. The main aim of this paper is to present an efficient method for solving this large-scale coupled-physics inverse problem and to use model problems to study the prospects for, and limitations of, inferring the geothermal heat flux from surface ice velocities.

Ice sheet models are characterized by unknown or uncertain parameters
stemming from the lack of direct observations of the interior and the base of
the ice sheet. Unknown parameters include those that represent basal
friction, basal topography, rheology, geothermal heat flux, and ice
thickness. The geothermal heat flux parameter field, in particular, has a
strong influence on the thermal state of the ice and hence plays a critical
role in understanding the dynamics of the ice sheet through its effect on
basal and internal ice temperatures

When formulating the thermomechanically coupled inverse problem, we must
assume an appropriate thermal regime, which depends critically on the
geothermal heat flux. Ice sheets and glaciers can be in one of the following
four thermal states: (1) all of the ice is below the melting point; (2) the
melting point is reached only at the bed; (3) a basal layer of finite
thickness is at the melting point; or (4) all of the ice is at the melting point
except for a surface layer

The inverse problem is formulated as a regularized nonlinear least-squares
minimization problem governed by thermomechanically coupled nonlinear Stokes
and thermal advection–diffusion equations. The cost functional we minimize
represents the sum of the squared differences between observed and predicted
surface velocities and a regularization term that renders this ill-posed
inverse problem well posed. Discretizing the infinite-dimensional geothermal
heat flux field and the governing partial differential equations (PDEs) leads to a large-scale numerical
optimization problem; as such, derivative-based optimization methods offer
the best hope for its efficient solution
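The structure of such a cost functional can be illustrated on a small discrete toy problem. The sketch below is not the ice sheet model: the linear smoothing operator `F`, the first-difference regularization matrix `R`, and the weight `gamma` are hypothetical stand-ins, chosen only to show how a data misfit term and a Tikhonov term combine.

```python
import numpy as np

def cost(m, forward, obs, R, gamma):
    """Regularized least-squares cost: 0.5*||forward(m) - obs||^2 + 0.5*gamma*m^T R m."""
    residual = forward(m) - obs
    return 0.5 * residual @ residual + 0.5 * gamma * m @ (R @ m)

# Hypothetical toy forward model: a linear smoothing operator on n parameters.
n = 20
x = np.linspace(0.0, 1.0, n)
F = np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.02)

def forward(m):
    return F @ m

# Regularization matrix built from first differences (penalizes rough fields).
D = np.diff(np.eye(n), axis=0)
R = D.T @ D

m_smooth = np.sin(2.0 * np.pi * x)
m_rough = m_smooth + 0.1 * (-1.0) ** np.arange(n)   # add a grid-scale oscillation
obs = forward(m_smooth)                              # noise-free synthetic data

for name, m in [("smooth", m_smooth), ("rough", m_rough)]:
    misfit = cost(m, forward, obs, R, gamma=0.0)
    total = cost(m, forward, obs, R, gamma=1.0)
    print(f"{name}: misfit = {misfit:.3e}, regularization = {total - misfit:.3e}")
```

In this toy setting the smoothing forward operator damps the grid-scale oscillation, so the regularization term, rather than the data misfit, is what chiefly penalizes the rough field.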

We systematically study how well finite-amplitude variations of the geothermal heat flux can be recovered from noisy surface velocity observations. To be precise, we invert for geothermal heat flux fields that contain long- and short-wavelength variations using velocity observations with various degrees of error. Our results show that the quality of the reconstructed geothermal heat flux deteriorates with shorter-wavelength variations and with increasing noise level in the observations. In addition, we study the influence of the number of observations and find that the reconstruction improves as the number of observation points increases, provided the discretization of the model equations is sufficiently fine to capture the additional information from a larger number of observations. To analyze prospects and limitations of the inversion, we also investigate the spectrum of the Hessian of the data misfit part of the cost functional, which provides information about directions in parameter space that can be recovered from observations.

A common approach to the numerical solution of multiphysics problems uses
operator splitting; namely, motivated by the difficulty of either solving a
two-way coupled system or computing the Jacobian
of a coupling term, one discards certain coupling terms in the Jacobian of
the forward problem to reduce the two-way coupled problem to one that is
coupled in one direction. The coupled problem is then solved by iterating
back and forth between the solution of single physics components. This
approach, which we term “one-way coupled”, can often yield convergence to
the solution of the fully coupled multiphysics problem, depending on the
spectral radius of a certain iteration matrix. The one-way coupled approach
has been used successfully for the solution of thermomechanically coupled ice
sheet forward problems in

However, when solving the corresponding multiphysics inverse problem using gradient-based methods, the use of such a one-way coupled approach may be problematic. In particular, sacrificing coupling terms in the Jacobian (while often acceptable for the forward problem) will lead to an incorrect adjoint operator, since this operator is given by the transpose of the Jacobian. This approximate adjoint operator leads to an incorrect adjoint solution, which then leads to an incorrect gradient. Since the necessary optimality condition for the inverse problem states that the gradient must vanish, an incorrect gradient leads to the wrong solution of the inverse problem. Moreover, since line search methods require descent on the cost functional in a direction based on the gradient, the inconsistency between the cost functional and its gradient can lead to failure of the line search and thus lack of convergence. Thus, sacrificing coupling terms as is commonly done for the forward problem may not lead to convergent inverse iterations, and if the inverse iterations do converge, they will converge to the wrong inverse solution.
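The effect of dropping a coupling block from the adjoint operator can be reproduced on a small linear toy problem. The sketch below assumes a hypothetical two-block system `A u = B m` in which the parameter `m` forces the second ("energy") block and the observations act on the first ("velocity") block; the matrices are random and are not derived from the ice flow equations. The gradient computed with the full transpose `A.T` agrees with finite differences, whereas the gradient computed after zeroing one coupling block does not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                                          # unknowns per physics component
A = 4.0 * np.eye(2 * n) + 0.4 * rng.standard_normal((2 * n, 2 * n))
B = np.vstack([np.zeros((n, n)), np.eye(n)])   # parameter forces the "energy" block
C = np.hstack([np.eye(n), np.zeros((n, n))])   # observe the "velocity" block only
d = rng.standard_normal(n)                     # synthetic observations

def cost(m):
    u = np.linalg.solve(A, B @ m)              # fully coupled forward solve
    r = C @ u - d
    return 0.5 * r @ r

def gradient(m, adjoint_matrix):
    """Adjoint-based gradient: solve adjoint_matrix p = -C^T (C u - d), g = -B^T p."""
    u = np.linalg.solve(A, B @ m)
    p = np.linalg.solve(adjoint_matrix, -C.T @ (C @ u - d))
    return -B.T @ p

# One-way coupled Jacobian: drop the dependence of the "energy" block on "velocity".
A_one_way = A.copy()
A_one_way[n:, :n] = 0.0

m = rng.standard_normal(n)
g_exact = gradient(m, A.T)
g_one_way = gradient(m, A_one_way.T)

# Central finite differences of the cost as an independent reference.
step = 1e-4
g_fd = np.array([(cost(m + step * e) - cost(m - step * e)) / (2.0 * step)
                 for e in np.eye(n)])

cosine = g_exact @ g_one_way / (np.linalg.norm(g_exact) * np.linalg.norm(g_one_way))
angle_deg = np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0)))
print("max |exact - FD|   :", np.max(np.abs(g_exact - g_fd)))
print("max |one-way - FD| :", np.max(np.abs(g_one_way - g_fd)))
print("angle between gradients (degrees):", angle_deg)
```

How large the discrepancy is depends on the strength of the dropped block; the scaling factor `0.4` used here is an arbitrary choice.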

In general, how much of a difference this will make to the solution of the inverse problem will depend on the strength of the coupling terms that have been neglected in the adjoint problem. In particular, despite a gradient that has been computed from an incorrect adjoint equation, and early termination of optimization iterations, one might still obtain a reasonable approximation of the correct inverse solution. To illustrate these issues in the context of a thermomechanically coupled ice sheet inverse problem, we neglect certain coupling terms in the Jacobian (as might be done in a forward solver), leading to an incorrect adjoint operator. We then compare inversion results obtained using an approximate gradient based on a one-way coupled adjoint operator – which we refer to as a “one-way coupled gradient” – with inversions that use the correct gradient (i.e., based on the fully coupled adjoint). The results indicate that using this one-way coupled gradient instead of the correct gradient leads to a deterioration in the convergence rate of the inverse solver and eventual failure of the line search, but the resulting inverse solution for the geothermal flux does not differ substantially from the correct inverse solution.

The remaining sections of this paper are organized as follows. In
Sect.

We first state the forward problem and then formulate the inverse problem to infer the geothermal heat flux from surface velocity observations.

Ice can be modeled as a viscous, incompressible, non-Newtonian, heat-conducting
fluid. Assuming the mass of ice occupying a domain

The energy equation (Eq.

The domain

In summary, the

Primary variables in the forward and adjoint problems.

Next, we present a weak form of the forward
problem (Eqs.

The geothermal heat flux field

The first term in the cost functional

To compute the minimizer for the large-scale optimization
problem Eq. (

Starting with an initial guess for the parameter field

In this section, we provide expressions for the gradient

In what follows, we use the formal Lagrange approach, which computes the
gradient by taking variations of a Lagrangian functional. The Lagrangian
functional

The gradient of

Now that the computation of the gradient, which forms the right-hand side of
the Newton system Eq. (

The resulting incremental adjoint problem, to be solved for the
incremental adjoint velocity, pressure, and temperature (

In conclusion, to evaluate the expression for the
gradient Eq. (

It is well known that the Newton update direction computed by
solving Eq. (

It is critical that the total number of CG iterations be as small as
possible, since, as mentioned above, each iteration requires a pair of
incremental forward/adjoint problem solves. Despite the reduction in overall
number of CG iterations provided by inexact solution of the Newton step, the
number can still be large when a good preconditioner is not used. An effective
preconditioner is simply the inverse of the regularization operator, which
amounts to a Laplacian solve on the basal surface. This is because the
Hessian of the data misfit operator, like many ill-posed infinite-dimensional
inverse operators, has eigenvalues that decay to zero; preconditioning by an
inverse Laplacian simply increases the rate of decay. Thus the resulting
preconditioned Hessian behaves like a compact perturbation of the identity
with smooth dominant eigenfunctions, for which CG converges rapidly and in a
number of iterations that is independent of the mesh size; see, for example,
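The benefit of preconditioning by the inverse regularization operator can be illustrated with a small dense toy problem. In the sketch below, all ingredients are assumptions made for illustration: a one-dimensional Dirichlet Laplacian `R` stands in for the basal regularization operator, a low-rank matrix `F.T @ F` built from a Gaussian smoothing kernel stands in for the data misfit Hessian, and `gamma` is an arbitrary regularization weight. The hand-written `pcg` routine is a textbook preconditioned CG, not the solver used for the results reported here.

```python
import numpy as np

def pcg(apply_H, b, apply_Minv, tol=1e-8, maxiter=5000):
    """Preconditioned CG for H x = b; returns the solution and iteration count."""
    x = np.zeros_like(b)
    r = b.copy()
    z = apply_Minv(r)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(1, maxiter + 1):
        Hp = apply_H(p)
        alpha = rz / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        z = apply_Minv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxiter

def iteration_counts(n, gamma=1e-4, n_obs=10, seed=0):
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)
    # Regularization: 1-D Dirichlet Laplacian (stand-in for the basal Laplacian).
    R = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    # Toy misfit Hessian F^T F: a smoothing kernel observed at n_obs points.
    G = h * np.exp(-((x[:, None] - x[None, :]) ** 2) / (2.0 * 0.05**2))
    F = G[np.linspace(0, n - 1, n_obs).astype(int), :]
    H = F.T @ F + gamma * R
    Minv = np.linalg.inv(gamma * R)            # inverse regularization operator
    b = np.random.default_rng(seed).standard_normal(n)
    _, it_plain = pcg(lambda v: H @ v, b, lambda v: v)
    x_pre, it_pre = pcg(lambda v: H @ v, b, lambda v: Minv @ v)
    rel_res = np.linalg.norm(H @ x_pre - b) / np.linalg.norm(b)
    return it_plain, it_pre, rel_res

results = {n: iteration_counts(n) for n in (100, 400)}
for n, (it_plain, it_pre, _) in results.items():
    print(f"n = {n}: CG iterations {it_plain}, preconditioned CG iterations {it_pre}")
```

Because the toy misfit Hessian has rank at most `n_obs`, the preconditioned operator is the identity plus a low-rank term, which is the discrete analogue of the compact perturbation of the identity described above.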

Once a descent direction is computed by inexact solution of the Newton step
equation (Eq.

Initialize/define variables

converged

Perform preconditioned inexact CG iterations for solving

Solve the forward equation with

descent
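The overall structure of an inexact Newton-CG iteration with a backtracking line search can be sketched as follows. This is a generic textbook version under stated assumptions, not a transcription of the algorithm above: the forcing term `eta`, the Armijo constant `1e-4`, and the convex toy cost (a log-cosh misfit with quadratic regularization, chosen so that the exact Hessian is positive definite) are all illustrative choices, and the Hessian is applied matrix-free through `hess_vec`.

```python
import numpy as np

def cg_inexact(hess_vec, g, rel_tol, maxiter=200):
    """Approximately solve H p = -g by CG, stopping at relative residual rel_tol."""
    p = np.zeros_like(g)
    r = -g                      # residual of H p = -g at p = 0
    d = r.copy()
    rr = r @ r
    g_norm = np.sqrt(rr)
    for _ in range(maxiter):
        Hd = hess_vec(d)
        curv = d @ Hd
        if curv <= 0.0:         # negative curvature: stop with the current iterate
            return p if p.any() else -g
        alpha = rr / curv
        p = p + alpha * d
        r = r - alpha * Hd
        rr_new = r @ r
        if np.sqrt(rr_new) <= rel_tol * g_norm:
            break
        d = r + (rr_new / rr) * d
        rr = rr_new
    return p

def inexact_newton(cost, grad, hess_vec, m0, gtol=1e-6, maxiter=50):
    m = m0.copy()
    g0_norm = np.linalg.norm(grad(m))
    history = []
    for _ in range(maxiter):
        g = grad(m)
        g_norm = np.linalg.norm(g)
        history.append(g_norm)
        if g_norm <= gtol * g0_norm:
            break
        eta = min(0.5, np.sqrt(g_norm / g0_norm))      # forcing term (assumed form)
        p = cg_inexact(lambda v: hess_vec(m, v), g, eta)
        step, J, slope = 1.0, cost(m), g @ p
        while cost(m + step * p) > J + 1e-4 * step * slope:   # Armijo backtracking
            step *= 0.5
            if step < 1e-10:
                raise RuntimeError("line search failed")
        m = m + step * p
    return m, history

# Hypothetical convex toy cost: log-cosh misfit plus quadratic regularization.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
d_obs = rng.standard_normal(30)
gamma = 0.1

def cost(m):
    r = A @ m - d_obs
    return np.sum(np.logaddexp(r, -r) - np.log(2.0)) + 0.5 * gamma * m @ m

def grad(m):
    return A.T @ np.tanh(A @ m - d_obs) + gamma * m

def hess_vec(m, v):
    w = 1.0 - np.tanh(A @ m - d_obs) ** 2
    return A.T @ (w * (A @ v)) + gamma * v

m_opt, history = inexact_newton(cost, grad, hess_vec, np.zeros(10))
print("gradient norms per Newton iteration:", history)
```

Solving the Newton system loosely while the gradient is large and more tightly near convergence is what keeps the total number of CG iterations, and hence of incremental solves, small.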

In this section, we describe the discretization of the forward and the inverse problems and discuss a stabilization technique required to avoid oscillations for advection-dominated problems. We compare two approaches for computing the gradient of the cost functional, namely the optimize-then-discretize (OTD) and the discretize-then-optimize (DTO) approaches.
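The oscillations that motivate stabilization can be reproduced in a one-dimensional toy problem. The sketch below makes several simplifying assumptions: it uses finite differences rather than finite elements (on a uniform one-dimensional mesh, central differences give the same stencil as linear Galerkin elements for this constant-coefficient problem), and it uses first-order upwinding as a simple stand-in for a stabilized scheme. The values of `eps`, `b`, and the mesh size are arbitrary.

```python
import numpy as np

def solve_advection_diffusion(n_cells, eps, b, scheme):
    """Solve -eps u'' + b u' = 0 on (0, 1) with u(0) = 0, u(1) = 1."""
    h = 1.0 / n_cells
    n = n_cells - 1                           # number of interior nodes
    I = np.eye(n)
    lower = np.eye(n, k=-1)                   # selects u_{i-1}
    upper = np.eye(n, k=1)                    # selects u_{i+1}
    diffusion = eps * (2.0 * I - lower - upper) / h**2
    if scheme == "central":
        advection = b * (upper - lower) / (2.0 * h)
    elif scheme == "upwind":                  # backward difference, valid for b > 0
        advection = b * (I - lower) / h
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    A = diffusion + advection
    rhs = np.zeros(n)
    rhs[-1] = -A[0, 1]                        # move the boundary value u(1) = 1 to the right-hand side
    u = np.linalg.solve(A, rhs)
    return np.concatenate([[0.0], u, [1.0]])

n_cells, eps, b = 20, 0.005, 1.0
cell_peclet = b * (1.0 / n_cells) / (2.0 * eps)
u_central = solve_advection_diffusion(n_cells, eps, b, "central")
u_upwind = solve_advection_diffusion(n_cells, eps, b, "upwind")

print("cell Peclet number:", cell_peclet)
print("min of central solution:", u_central.min())
print("min of upwind solution :", u_upwind.min())
```

With a cell Péclet number above one, the central scheme produces values outside the range of the boundary data, while the upwinded scheme remains monotone at the cost of added numerical diffusion.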

For advection-dominated problems, the standard Galerkin finite element method
applied to the energy equation (Eq.

We use quadratic elements for temperature, and the Taylor–Hood element pair
for velocity and pressure (quadratic elements for velocity and linear
elements for pressure). We let

The SUPG-stabilized discretization of Eq. (

The numerical solution of the inverse problem requires the computation of
gradients of

For standard Galerkin discretizations, OTD and DTO usually coincide, i.e., they result
in exactly the same finite-dimensional gradient. However, the operations of
optimization and discretization do not commute when the forward problem is
discretized by SUPG. If, as in the OTD approach, SUPG is used to stabilize the
adjoint equation, the discrete gradient becomes inconsistent with the discrete
cost functional.
This is because the discrete adjoint of SUPG stabilization for the forward
equation is not equivalent to SUPG stabilization of the adjoint equation. The
implication of an inconsistent gradient is that the computed gradient may not
actually lead to a direction of descent with respect to the discretized cost
functional, which can result in a failure in the line search and lack of
convergence. In the DTO approach, the SUPG stabilization term
in Eq. (

Both DTO and OTD approaches have advantages and disadvantages, and the
preference for one over the other depends on the circumstances of the problem
at hand
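The difference between the two approaches can be checked numerically on a toy problem. The sketch below is a one-dimensional finite-difference analogue under explicit assumptions: first-order upwinding replaces SUPG, the source term `m` of a scalar advection-diffusion equation plays the role of the parameter, and the velocity `b(x) = 1 + x` is hypothetical. The DTO gradient uses the transpose of the discrete forward operator, whereas the OTD gradient discretizes the continuous adjoint equation with its own upwinding. Only the former matches a finite-difference derivative of the discrete cost.

```python
import numpy as np

n, eps = 50, 0.01
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)             # interior nodes
b = 1.0 + x                                # hypothetical velocity, b > 0
db = np.ones(n)                            # its derivative b'(x)

I = np.eye(n)
lower = np.eye(n, k=-1)                    # selects u_{i-1}
upper = np.eye(n, k=1)                     # selects u_{i+1}
K = (2.0 * I - lower - upper) / h**2       # -d^2/dx^2 with zero Dirichlet data

# Forward operator -eps u'' + b u', upwinded with a backward difference.
A = eps * K + np.diag(b) @ (I - lower) / h

# OTD adjoint operator: discretize -eps p'' - b p' - b' p directly, upwinding
# -b p' with a forward difference because the adjoint flow runs backward.
A_otd = eps * K + np.diag(b) @ (I - upper) / h - np.diag(db)

d = np.sin(np.pi * x)                      # synthetic target state
rng = np.random.default_rng(1)
m = rng.standard_normal(n)                 # current parameter (source term)
v = rng.standard_normal(n)                 # direction for the derivative check

def cost(m):
    u = np.linalg.solve(A, m)
    return 0.5 * h * np.sum((u - d) ** 2)

def gradient(m, adjoint_matrix):
    u = np.linalg.solve(A, m)
    return np.linalg.solve(adjoint_matrix, h * (u - d))

g_dto = gradient(m, A.T)                   # discrete adjoint of the forward scheme
g_otd = gradient(m, A_otd)                 # separately discretized adjoint equation

t = 1e-4
fd = (cost(m + t * v) - cost(m - t * v)) / (2.0 * t)
err_dto = abs(g_dto @ v - fd) / abs(fd)
err_otd = abs(g_otd @ v - fd) / abs(fd)
rel_diff = np.linalg.norm(g_otd - g_dto) / np.linalg.norm(g_dto)
print("relative FD error, DTO:", err_dto)
print("relative FD error, OTD:", err_otd)
print("relative difference between gradients:", rel_diff)
```

In this toy problem the OTD gradient is still a consistent approximation of the continuous gradient, but it is not the exact gradient of the discrete cost, which is the property a line search relies on.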

In this section, we study properties of the inverse problem to infer the unknown geothermal heat flux field from surface velocity observations. In particular, we study the limits of our ability to invert for the heat flux as a function of the length scales of the heat flux and of the noise level in the velocity observations.

We consider a two- and a three-dimensional ice slab,

Coordinate system and cross section through a three-dimensional slab of ice, as used in the computational experiments (exaggerated in height for visualization).

In all model problems, we assume that the surface temperature increases as
the elevation decreases as follows:

The boundary conditions are as follows.

On the top surface,

On the bottom surface

On the outflow boundary,

We assume that

In addition, for the three-dimensional problem, we impose
periodic boundary conditions on the fore and aft boundaries, i.e.,

Parameters and constants. Note that we choose

For all numerical experiments, we generate synthetic observations by
extracting surface velocities at points from forward solutions computed with
a specified “truth” geothermal heat flux field, and we add random Gaussian
noise to lessen the “inverse crime”, which occurs when the same numerical
method is used to both synthesize the observations and drive the inverse solution
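Generating noisy synthetic observations of this kind can be sketched as follows. The definition of the noise level used here, a standard deviation equal to the root-mean-square of the noise-free signal divided by the signal-to-noise ratio, is an assumption made for illustration and may differ from the definition used in the experiments. The "truth" surface velocity profile is likewise hypothetical.

```python
import numpy as np

def add_noise(u_true, snr, rng):
    """Add i.i.d. Gaussian noise with std = RMS(u_true) / snr (assumed definition)."""
    sigma = np.sqrt(np.mean(u_true ** 2)) / snr
    noise = sigma * rng.standard_normal(u_true.shape)
    return u_true + noise, sigma

rng = np.random.default_rng(42)
x_obs = np.linspace(0.0, 1.0, 100)                    # observation points
u_true = 1.0 + 0.2 * np.sin(2.0 * np.pi * x_obs)      # hypothetical "truth" velocities
u_obs, sigma = add_noise(u_true, snr=20.0, rng=rng)

print("noise standard deviation:", sigma)
print("empirical std of added noise:", np.std(u_obs - u_true))
```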

First, we consider inversion for a geothermal heat flux in a two-dimensional
problem. We discretize the domain,

Two-dimensional mesh (exaggerated in height for visualization).

We first study inversion with a “truth” geothermal heat flux defined by

In Fig.

Figure

Temperature and velocity found by solving the forward problem with
geothermal heat flux given in Eq. (

Reconstruction of geothermal heat flux

Reconstructions of geothermal heat flux with different noise levels
and different wavelength variations for the two-dimensional model problem. In

We continue with a systematic study of the consequence
of the wavelength variation of the geothermal heat flux and of the SNR on the
reconstruction. For this study, we consider different wavelengths of
variations in the “truth” geothermal heat flux

The relative error

In Fig.

For fixed wavelength, the reconstructed geothermal heat flux

For fixed noise level, shorter-wavelength variations of the geothermal heat flux are more difficult to reconstruct.

For short wavelength (e.g.,

We consider 10, 25, 50, and 100 uniformly distributed observation points and two
different meshes, namely a mesh consisting of 40

Shown in

To study the influence of the number of observations and of the mesh
resolution on the ill-posedness of the inversion, we study the properties of
the Hessian matrix of the data misfit component of

Since we are using a Newton method to solve the inverse problem, the Hessian
matrix is available (or more correctly, its action in a particular direction,
as presented in Sect.
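The link between the number of observations and the spectrum of the data misfit Hessian can be illustrated with a linear toy problem. In the sketch below, the parameter-to-observable map `F = B S` is hypothetical: `S` is a diffusion-type smoothing solve and `B` samples the result at `n_obs` points. Only the action of the Hessian on a vector is used, and for this small example the dense matrix is assembled column by column from that action. For such a linear map the misfit Hessian is `F.T @ F`, so its rank cannot exceed the number of observations.

```python
import numpy as np

n = 100
h = 1.0 / (n + 1)
K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
S = np.linalg.inv(np.eye(n) + 0.01 * K)      # hypothetical smoothing "forward solve"

def misfit_hessian_action(v, obs_idx):
    """Apply H = F^T F with F = B S, where B samples the state at obs_idx."""
    Sv = S @ v
    r = np.zeros(n)
    r[obs_idx] = Sv[obs_idx]                 # B^T B S v
    return S.T @ r

def spectrum(n_obs):
    obs_idx = np.linspace(0, n - 1, n_obs).astype(int)
    H = np.column_stack([misfit_hessian_action(e, obs_idx) for e in np.eye(n)])
    H = 0.5 * (H + H.T)                      # remove round-off asymmetry
    return np.linalg.eigvalsh(H)[::-1]       # eigenvalues in descending order

spectra = {n_obs: spectrum(n_obs) for n_obs in (10, 25, 50)}
for n_obs, eigs in spectra.items():
    significant = int(np.sum(eigs > 1e-10 * eigs[0]))
    print(f"{n_obs} observations: {significant} eigenvalues above 1e-10 * largest")
```

Eigenvectors associated with negligible eigenvalues correspond to parameter directions that the observations do not inform and that are therefore determined by the regularization.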

Next, we consider a three-dimensional model problem with domain

The top row in Fig.

Reconstruction of geothermal heat flux

The temperature field (in

Multiphysics forward problems are commonly solved using so-called
“one-way coupled” or “operator-split” approaches. For example, for a
coupled problem with two physics components, the first physics subproblem
would be solved assuming the state variables of the second physics subproblem
remain fixed, after which the second physics subproblem is solved using the
just-computed first physics state variables. One then iterates until
convergence, which is guaranteed only if the spectral radius of a certain
iteration matrix is less than unity. If the iteration converges, it converges
to the correct solution. Such one-way coupled solvers have been used
successfully for ice flow forward problems
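The role of the spectral radius can be made concrete with a linear two-physics toy system. The sketch below is an illustration under stated assumptions, not an ice flow solver: both physics blocks are the same small tridiagonal matrix `T`, the coupling blocks are `c * I` and `-c * I` with a hypothetical coupling strength `c`, and the block sizes are equal. The iteration alternates between the two subproblems, and its iteration matrix is formed explicitly to compute the spectral radius.

```python
import numpy as np

def block_gauss_seidel(A11, A12, A21, A22, f1, f2, iters=50):
    """Alternate single-physics solves, each using the latest state of the other."""
    u1, u2 = np.zeros_like(f1), np.zeros_like(f2)
    for _ in range(iters):
        u1 = np.linalg.solve(A11, f1 - A12 @ u2)   # physics 1 with physics 2 frozen
        u2 = np.linalg.solve(A22, f2 - A21 @ u1)   # physics 2 with updated physics 1
    return u1, u2

def spectral_radius(A11, A12, A21, A22):
    """Spectral radius of the iteration matrix G = -L^{-1} U (equal block sizes)."""
    Z = np.zeros_like(A11)
    L = np.block([[A11, Z], [A21, A22]])
    U = np.block([[Z, A12], [Z, Z]])
    G = -np.linalg.solve(L, U)
    return np.max(np.abs(np.linalg.eigvals(G)))

n = 5
T = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
f = np.ones(n)

def run(c):
    A12, A21 = c * np.eye(n), -c * np.eye(n)
    rho = spectral_radius(T, A12, A21, T)
    u1, u2 = block_gauss_seidel(T, A12, A21, T, f, f)
    exact = np.linalg.solve(np.block([[T, A12], [A21, T]]), np.concatenate([f, f]))
    err = np.linalg.norm(np.concatenate([u1, u2]) - exact) / np.linalg.norm(exact)
    return rho, err

rho_weak, err_weak = run(1.0)      # weak coupling
rho_strong, err_strong = run(3.0)  # strong coupling
print(f"weak coupling:   spectral radius {rho_weak:.3f}, relative error {err_weak:.2e}")
print(f"strong coupling: spectral radius {rho_strong:.3f}, relative error {err_strong:.2e}")
```

The fully coupled system is solvable in both cases; only the split iteration fails when the coupling is strong enough to push the spectral radius above one.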

In the following discussion, we express the forward
problem Eqs. (

Reconstructions of the geothermal heat flux based on the one-way
coupled gradient obtained when the coupling matrix

As an illustration of neglecting Jacobians of coupling terms in the adjoint
equation, we neglect

In Fig.

First, we plot the angle between the exact gradient

These results illustrate several important characteristics of approximations made in inverse problems governed by multiphysics forward models. First, discarding the Jacobians of coupling terms within the adjoint operator can result in substantially incorrect gradients. This could lead to an incorrect solution of the inverse problem, because the vanishing of the gradient constitutes the first-order necessary condition for a solution of the inverse problem. It could also lead to premature termination of the iterations due to the loss of a descent direction stemming from inconsistency of the gradient with the contours of the cost functional. Second, despite the incorrect gradient, it may still be possible to obtain a reasonable solution to the inverse problem, particularly when the discrepancy between exact and approximate gradients remains small for a sufficient number of iterations to provide a good approximate inverse solution.

We have formulated an inverse problem for estimating the uncertain geothermal
heat flux at the base of an ice sheet or glacier in a thermomechanically
coupled nonlinear Stokes model from surface velocity observations. Since the
forward problem involves an advection-dominated energy equation, a SUPG stabilization was used to suppress
non-physical oscillations in the temperature field. This required use of a
discretize-then-optimize approach to compute adjoint-based gradients and
Hessians. We advocated an inexact Newton method to solve the discretized
inverse problem. Using two- and three-dimensional model problems, we studied
the identifiability of the geothermal heat flux field on the basal boundary.
We found that the quality of the reconstruction deteriorates with
shorter-wavelength variations of this heat flux and with increasing noise in
the observations. In particular, a geothermal heat flux with a mean value of
0.06 W m

Moreover, we derived expressions for the gradient and the Hessian of the cost functional for a fully thermomechanically coupled Stokes forward model. We discussed problems that can occur when the gradient is approximated by a so-called one-way coupled approach, in which the two-way coupling of Stokes and the energy equations is replaced by one-way coupling, as is frequently done within forward solvers. The results show that the inversion based on a one-way coupled approach can fail to converge due to the inconsistency of the gradient and the cost functional, leading to the loss of a descent direction. Nevertheless, one might still obtain a reasonable approximate inverse solution, particularly if important features of the reconstructed solution emerge early in optimization iterations, before the iterations terminate prematurely.

We have used synthetic observations on idealized geometries to probe the limits of invertibility for the geothermal heat flux field. We have assumed that the ice is cold everywhere and thus enforced a no-slip boundary condition at the base. In reality, the ice may reach the pressure melting point at some basal locations. This requires a different set of boundary conditions, which account for ice either below or at the melting point. Solution of thermomechanically coupled ice flow models with such variational inequality boundary conditions is the subject of our current work.

We appreciate helpful comments from Ginny Catania. This work was partially supported by NSF's Cyber-Enabled Discovery and Innovation Program (OPP-0941678) and DOE Office of Science, Office of Advanced Scientific Computing Research (DE-SC0002710, DE-SC0009286). Hongyu Zhu also acknowledges funding through the ICES NIMS Fellowship.

Edited by: G. H. Gudmundsson