Arctic Mission Benefit Analysis: impact of sea ice thickness, freeboard, and snow depth products on sea ice forecast performance

. Assimilation of remote-sensing products of sea ice thickness (SIT) into sea ice–ocean models has been shown to improve the quality of sea ice forecasts. Key open questions are whether assimilation of lower-level data products such as radar freeboard (RFB) can further improve model performance and what performance gains can be achieved through joint assimilation of these data products in combination with a snow depth product. The Arctic Mission Beneﬁt Analysis system was developed to address this type of question. Using the quantitative network design (QND) approach, the system can evaluate, in a mathematically rigorous fashion, the observational constraints imposed by individual and groups of data products. We demonstrate the approach by presenting assessments of the observation impact (added value) of different Earth observation (EO) products in terms of the uncertainty

Abstract.Assimilation of remote-sensing products of sea ice thickness (SIT) into sea ice-ocean models has been shown to improve the quality of sea ice forecasts.Key open questions are whether assimilation of lower-level data products such as radar freeboard (RFB) can further improve model performance and what performance gains can be achieved through joint assimilation of these data products in combination with a snow depth product.The Arctic Mission Benefit Analysis system was developed to address this type of question.Using the quantitative network design (QND) approach, the system can evaluate, in a mathematically rigorous fashion, the observational constraints imposed by individual and groups of data products.We demonstrate the approach by presenting assessments of the observation impact (added value) of different Earth observation (EO) products in terms of the uncertainty reduction in a 4-week forecast of sea ice volume (SIV) and snow volume (SNV) for three regions along the Northern Sea Route in May 2015 using a coupled model of the sea ice-ocean system, specifically the Max Planck Institute Ocean Model.We assess seven satellite products: three real products and four hypothetical products.The real products are monthly SIT, sea ice freeboard (SIFB), and RFB, all derived from CryoSat-2 by the Alfred Wegener Institute.These are complemented by two hypothetical monthly laser free-board (LFB) products with low and high accuracy, as well as two hypothetical monthly snow depth products with low and high accuracy.
On the basis of the per-pixel uncertainty ranges provided with the CryoSat-2 SIT, SIFB, and RFB products, the SIT and RFB achieve a much better performance for SIV than the SIFB product.For SNV, the performance of SIT is only low, the performance of SIFB is higher and the performance of RFB is yet higher.A hypothetical LFB product with low accuracy (20 cm uncertainty) falls between SIFB and RFB in performance for both SIV and SNV.A reduction in the uncertainty of the LFB product to 2 cm yields a significant increase in performance.
Combining either of the SIT or freeboard products with a hypothetical snow depth product achieves a significant performance increase.The uncertainty in the snow product matters: a higher-accuracy product achieves an extra performance gain.Providing spatial and temporal uncertainty correlations with the EO products would be beneficial not only for QND assessments, but also for assimilation of the products.

Introduction
Over the last few decades the state of the Arctic climate system has undergone rapid change.Most pronounced are major decreases in summer sea ice extent and sea ice volume throughout the year.This transformation is affecting marine ecosystems and coastal communities in an unprecedented way.Economic activities such as resource extraction, maritime transportation, and tourism may benefit from these changes provided that risks, e.g. of sea ice hazards, can be managed.In this context, the performance of short-term to seasonal forecasts of sea ice conditions is of crucial importance (Eicken, 2013).
Forecasts of the sea ice and the ocean state are routinely produced by coupled sea ice-ocean models that are driven by prescribed atmospheric conditions.In order to derive reliable forecasts, uncertainties in the models' initial state, of the atmospheric boundary conditions, and in the parameterisations of physical processes need to be minimised.Observations can help to reduce these uncertainties and, thus, to improve the forecast quality.Recently Earth observation (EO) products of sea ice thickness (SIT) have been shown to provide particularly valuable constraints (Lisaeter et al., 2007;Yang et al., 2014;Day et al., 2014;Kauker et al., 2015;Xie et al., 2016).The constraints from lower-level EO products (i.e.rawer products that more directly relate to the actual measurement) that are used to derive SIT products may be even stronger, because rawer products are typically more accurate.For the example of the CryoSat-2 SIT product (Ricker et al., 2014) retrieved by the Alfred Wegener Institute (AWI), the uncertainty in the radar freeboard (RFB) product underlying their SIT retrieval is smaller by about 2 orders of magnitude compared to the derived ice thickness product (Fig. 13).This difference is a consequence of the uncertainty associated in particular with snow and ice density and snow depth, which are used to retrieve SIT from RFB.For direct assimilation of RFB these variables can be extracted from the model into which the data are assimilated, but even in this approach significant uncertainty remains.Hence, the trade-off between assimilation of SIT or RFB requires a rigorous quantitative assessment.This is even more important when the products are assimilated jointly with variables such as snow depth (SND) that introduce complementary information.
Such rigorous assessments can be performed in an efficient manner by the quantitative network design (QND) approach, allowing for an objective evaluation of the added value of observations for a given aspect of a model simulation or forecast.The technique originates from seismology (Hardt and Scherbaum, 1994) and was first applied to the climate system by Rayner et al. (1996), who optimised the spatial distribution of in situ observations of atmospheric carbon dioxide to achieve minimum uncertainty in inferred surface fluxes.After an initial QND study that demonstrated the feasibility of the approach for remote sensing of the column-integrated atmospheric carbon dioxide concentration (Rayner and O'Brien, 2001), QND is now routinely applied in the design of CO 2 space missions (e.g., Patra et al., 2003;Houweling et al., 2004;Crisp et al., 2004;Feng et al., 2009;Kadygrov et al., 2009;Kaminski et al., 2010;Hungershoefer et al., 2010;Rayner et al., 2014;Bovensmann et al., 2015).Kaminski et al. (2012a) presented an interactive Mission Benefit Analysis System that uses the QND approach to assess optical sensors of the land surface.For the western Arctic domain, the QND approach has been successfully applied to evaluate the impact of (hypothetical) airborne measurements of SIT and SND in improving sea ice predictions (Kaminski et al., 2015).The study evaluated two idealised flight transects derived from NASA's Operation IceBridge airborne altimeter ice surveys in terms of their potential to improve 10-day to 5-month forecasts of sea ice conditions, including for operational purposes.
The present study describes the implementation of the QND methodology into a system for Arctic mission benefit analysis (ArcMBA) and then uses the system to investigate the impact of a series of EO products on forecasts of snow and ice volume over three regions along the Northern Sea Route (NSR).It addresses products of SIT, sea ice freeboard (SIFB), RFB, laser freeboard (LFB), and SND.The layout of the remainder of this article is as follows: Section 2 will describe the methodological aspects, including the QND approach, the coupled sea ice-ocean model, and the EO products.Section 3 will present the simulated sensitivities of target quantities and observation equivalents to the model's control vector that is composed of process parameters and initial and boundary conditions.Section 4 will describe the experimental set-up.Section 5 will present the QND assessments, followed by a discussion of these findings in Sect.6.Finally, Sect.7 provides a summary and conclusions.

Quantitative network design
The QND methodology is presented by Kaminski and Rayner (2017) and is partly based on Tarantola (2005) and Rayner et al. (2016).For the sake of self-containedness we provide a shortened form of the presentation by Kaminski and Rayner (2017).QND is a generic concept that is applicable beyond the context of Arctic modelling (see examples in Sect.1).It is, hence, useful to provide a generic presentation in this subsection.The specific elements of our application will be described in the subsequent subsections.
As mentioned, the QND formalism performs a rigorous uncertainty propagation from the observations to a target quantity of interest relying on the indirect link from the observations to the target variables established by a numerical model.We distinguish between four sources of uncertainty in a model simulation: The Cryosphere, 12, 2569Cryosphere, 12, -2594Cryosphere, 12, , 2018 www.the-cryosphere.net/12/2569/2018/ 1. Uncertainty caused by the formulation of individual process representations and their numerical implementation (structural uncertainty).
2. Uncertainty in constants (process parameters) in the formulation of these processes (parametric uncertainty).
3. Uncertainty in external forcing (boundary values such as surface winds or precipitation) driving the relevant processes.
4. Uncertainty in the state of the system at the beginning of the simulation (initial state uncertainty).
The first category reflects the implementation of the relevant processes into the model (code), while the others can be represented by a set of input quantities controlling the behaviour of a simulation using the given model implementation.The QND procedure formalises the selection of these input quantities through the definition of a control vector, x.The choice of the control vector is a subjective element in the QND procedure.A good choice covers all input quantities with high uncertainty and high impact on simulated observations d mod or target quantities y (Kaminski et al., 2012b;Rayner et al., 2016).
The target quantity may be any quantity that can be extracted from a simulation with the underlying model (in the current study regional integrals of predicted sea ice and snow volumes, see Sect.2.2), but also any component of the control vector, for example a process parameter such as the snow albedo.In the general case, where the target quantity is not part of the control vector, the QND procedure operates in two steps (Fig. 1).The first step (inversion step) uses the observational information to reduce the uncertainty in the control vector, i.e. from a prior to a posterior state of information.The second step (prognostic step) propagates the posterior uncertainty forward to the simulated target quantity.
Within the QND formalism, we present all involved quantities as probability density functions (PDFs).We typically assume a Gaussian form for the prior control vector and the observations, if necessary after a suitable transformation.The Gaussian PDFs' covariance matrices express the uncertainty in the respective quantities, i.e.C(x 0 ) and C(d obs ) for the prior control vector and the observations.In the context of these PDFs we will use the term uncertainty to refer to its full covariance matrix in the case of a vector quantity, and in the case of a scalar quantity or a given vector component it refers to the square root of the entry on the diagonal of the full covariance matrix corresponding to that particular vector component.In the latter case the uncertainty refers to 1 standard deviation of the marginal PDF corresponding to that component, and we use the notation σ (d 2 ) to denote, for example, the standard deviation of the second component of d.
For the first QND step we use a mapping M from control variables onto equivalents of the observations.In our notation the observation operators that map the model state onto the individual data streams (see Kaminski and Mathieu, 2017 and Sect. 2.5) are incorporated in M. M is in practise computed by a specific numerical model with specific inputs and outputs.Let us first consider the case of a linear model, for which we denote by M the Jacobian matrix of M, i.e. the derivative of M with respect to x.In this case, the posterior control vector is described by a Gaussian PDF with uncertainty C(x), which is given by where the data uncertainty C(d) is the combination of two contributions: The term C(d obs ) expresses the uncertainty in the observations and C(d mod ) the uncertainty in the simulated equivalents of the observations M(x).The first term in Eq. ( 1) expresses the impact of the observations and the second term the impact of the prior information.In the non-linear case we use Eq. ( 1) as an approximation of C(x).
The mapping N involved in the second, the uncertainty propagation step, is the mapping from the control vector onto a target quantity, y.The Jacobian matrix N of the mapping N is employed to approximate the propagation of the posterior uncertainty in the control vector C(x) forward to the uncertainty in a target quantity, σ (y) via If the model were perfect, σ (y mod ) would be zero.In contrast, if the control variables were perfectly known, the first term on the right-hand side would be zero.The terms C(d mod ) in Eq. ( 2) and σ (y mod ) in Eq. ( 3) capture the structural uncertainty as well as the uncertainty in those process parameters, boundary values, and initial values that are not included in the control vector.These two terms typically rely on subjective estimates.When comparing the effects of different data sets in the same set-up, σ (y mod ) acts as an offset (for the respective variance) in Eq. ( 3).In our assessments (Sect.5) we provide both terms in Eq. ( 3) separately: we first report two plausible estimates of σ (y mod ), and then, for each product, evaluate Eq. ( 3) with σ (y mod ) set to zero, which sharpens the contrast between the products.
To conduct a valuable QND assessment, the requirement on the model is not that it simulates the target quantities and observations under investigation realistically, but rather that it provides a realistic sensitivity of the target quantities and observations under investigation with respect to a change in the control vector.(As a hypothetical example we can think of a perfect regional tracer model that is run with an offset in the initial or boundary conditions for a passive tracer.The simulated tracer concentration will carry this offset, but all sensitivities will be perfect.)If the sensitivities of the target quantities and observations (i.e. the Jacobians) are realistic, www.the-cryosphere.net/12/2569/2018/The Cryosphere, 12, 2569-2594, 2018   but the simulation of target quantities and observations is incorrect, we can always obtain a valuable QND assessment with appropriate model uncertainty.The result of the assessment may then be that a particular data stream is not useful in constraining a particular target quantity given current modelling capabilities.Under such circumstances, the QND system could be operated with reduced model uncertainty to explore the level of accuracy required of the model for a data stream to serve as a useful constraint on a given target quantity.In particular, when it comes to newly available, unvalidated data streams and target quantities, the accuracy of both the simulation and the sensitivities is hard to assess.In the case of a model that does not capture relevant processes we may expect errors in both the simulation and the sensitivities and consequently also in the QND assessment.Our performance metric is the (relative) reduction in posterior target uncertainty σ (y) 2 with respect to a reference.To compare it against the case without any observations we first compute, as the reference, the prior target uncertainty, σ (y 0 ), The uncertainty reduction with respect to the prior, quantifies the impact of the entire network.A schematic illustration of the approach with the prior and posterior uncertainty ranges is shown in Fig. 2. The observations d 1 and d 2 render a range of trajectories unlikely, which in the first step (Eq. 1) leads to a reduction in uncertainty in the control vector (from C(x 0 ) to C(x)) and in the second step (Eq.3) to Hence, the QND formalism can be employed to evaluate hypothetical candidate networks.Candidate networks are characterised by observational data type, location, sampling frequency and time, and data uncertainty but not the observational value.Here, we define a network as the complete set of the characterisation of observations, d, used to constrain the model.The term network is not meant to imply that the observations are of the same type or that their sampling is coordinated.For example, a network can combine different types of in situ and satellite observations.
In practice, for predefined target quantities and observations, model responses can be precomputed and stored.A network composed of these predefined observations can then be evaluated in terms of the predefined target quantities without any further model runs.Only matrix algebra is required to combine the precomputed sensitivities with the data uncertainty.This aspect is exploited in our ArcMBA system.

Target quantities
For this study we selected target quantities, y, that are particularly relevant for maritime transport, namely predicted sea ice volume (SIV) and snow volume (SNV) over three regions along the NSR.These three regions are displayed in Fig. 3 and named West Laptev Sea (WLS), outer New Siberian Islands (ONSI), and East Siberian Sea (ESS).We make these predictions for 28 May 2015, a point in time at which there is still sufficient snow cover for our prediction to be relevant.These predictions were started on 1 April and constrained by observational information until 30 April; i.e. the assimilation window in April was followed by a 4-week prediction period (Fig. 4).

Sea ice-ocean model
To simulate observation equivalents (M in Eq. 1) and target quantities (N in Eq. 3) we employ a coupled model of the sea ice-ocean system.The model is required to provide realistic simulations of the sensitivity of observation equivalents and target quantities to changes in the control variables.In the present study we use the Max-Planck-Institute Ocean Model (MPIOM, Jungclaus et al., 2012Jungclaus et al., , 2013;;Haak et al., 2003), i.e. the sea ice-ocean component of the Max-Planck-Institute Earth System Model (MPI-ESM, Giorgetta et al., 2013).MPI-ESM regularly provides climate projections for the Intergovernmental Panel on Climate Change (IPCC) in particular for the IPCC's 5th assessment report (Stocker et al., 2013) and the upcoming 6th assessment report (AR6) and within the seasonal-to-decadal prediction system (Müller et al., 2012).In the following we provide a brief description of the current model development status, largely following Jungclaus et al. (2006) and Niederdrenk (2013).
Recent development of the ocean part of the model includes the treatment of horizontal discretisation, which has undergone a transition from a staggered E-grid to an orthogonal curvilinear C-grid.The treatment of subgrid-scale mixing has been improved through the inclusion of a new formulation of bottom boundary layer slope convection, an isopycnal diffusion scheme, and a Gent and McWilliams style eddyinduced mixing parameterisation (Gent and McWilliams, 1990).Along-isopycnal diffusion is formulated following Redi (1982) and Griffies (1998).Isopycnal tracer mixing by unresolved eddies is parameterised following Gent et al. (1995).For the vertical eddy viscosity and diffusion the Richardson number-dependent scheme by Pacanowski and Philander (1981) is used.An additional wind mixing proportional to the cube of the 10 m wind speed (decaying exponentially with depth) compensates for too low turbulent mixing close to the surface.Static instabilities are removed through enhanced vertical diffusion.
A viscous-plastic rheology (Hibler, 1979) is used for the sea ice dynamics.Sea ice thermodynamics are formulated using a Semtner (1976) zero-layer model relating changes in sea ice thickness to a balance of radiant, turbulent, and oceanic heat fluxes.In the zero-layer model the conductive heat flux within the sea ice-snow layer is assumed to be directly proportional to the temperature gradient across the sea ice-snow layer and inversely proportional to the thickness of that layer; i.e. the sea ice does not store heat.The effect of snow accumulation on sea ice is included along with snow-ice transformation when the snow-ice interface sinks www.the-cryosphere.net/12/2569/2018/The Cryosphere, 12, 2569-2594, 2018 below the sea level because of snow loading (flooding).The effect of ice formation and melting is accounted for within the model, assuming a sea ice salinity of 5 g kg −1 .MPIOM allows for an arbitrary placement of the model's poles on an orthogonal curvilinear grid.In the set-up used here (taken from Niederdrenk, 2013;Mikolajewicz et al., 2015;Niederdrenk et al., 2016) the poles are located over Russia and North America (Fig. 5).Placement over land avoids numerical singularities that for poles over the ocean would be caused by the convergence of the meridians, and the non-diametric placement allows a high resolution (average of about 15 km) to be reached in the Arctic.This set-up achieves a spatial resolution as high as that of the EO products we analyse (in fact over the target regions the model resolution is higher) without major computational constraints, which allows an evaluation of the full spatial information content provided by the respective EO products.Here, we will refer to this particular model configuration as Arctic MPIOM.
As forcing data at the ocean's surface, the model needs heat, freshwater, and momentum fluxes.These data are taken from ECMWF's ERA-Interim reanalysis (Dee et al., 2011).ERA-Interim is a global atmospheric reanalysis (of the period from 1979 to present) that is produced by a 2006 release of the Integrated Forecasting System (IFS -version Cy31r2) and applies a 4-dimensional variational analysis with a 12 h analysis window.The spatial resolution of the data set is approximately 80 km (T255 spectral) on 60 vertical levels from the surface up to 0.1 hPa.ERA-Interim surface variables that force Arctic MPIOM are 2 m temperature, 2 m dew point temperature (surrogate of 2 m specific humidity -not provided by ECMWF), 10 m zonal and meridional wind velocity (to calculate the wind speed), total cloud cover and the following fluxes (provided in accumulated form over the 12-hourly forecast window): surface downward solar radiation, surface downward thermal radiation, total precipitation, zonal and meridional wind stress.Land run-off into the ocean is taken from the German Ocean Model Intercomparison Project (OMIP, Röske, 2001).
For the computation of the Jacobians M and N (introduced in Sect.2.1) that is described in Sect.3, we run Arctic MPIOM from a restart file for 1 April 2015.This restart file is in turn generated from a hindcast run of Arctic MPIOM that is initialised on 1 January 1979.This initialisation is based on a set of observations that consists of a topography data set (ETOPO5 5 min gridded elevation data, NOAA, 1988) and a hydrographic climatological data set (Polar science center Hydrographic Climatology, PHC3; Steele et al., 2001) containing potential temperature and salinity.The ocean is initially assumed to be at rest.Sea ice is assumed to be present if the sea surface temperature falls below the freezing tem- The Cryosphere, 12, 2569Cryosphere, 12, -2594Cryosphere, 12, , 2018 www.the-cryosphere.net/12/2569/2018/perature of seawater.A hundred percent ice cover and a sea ice thickness of 2 m is assumed where sea ice is present and sea ice is assumed to be at rest.From this initial state the model is integrated with the ERA-Interim surface forcing until 31 March 2015 (the beginning of our assimilation window).While a 36-year integration is certainly too short to spin up the deep ocean, it is sufficient for the purpose of this study, because the spin-up time of sea ice and the upper ocean (depth above about 500 m) is generally assumed to be only a few decades.For a successful QND assessment it is essential that MPIOM provides realistic sensitivities of the observation equivalent and the target quantities to the changes in the control vector (Eqs. 1 and 3).However, observations are not available to validate these sensitivities.The only validation of MPIOM possible is against observations of the state of the sea ice and ocean.In the following we present comparisons with selected observation-based products, first for the hind-casting period and then for the assimilation window and the forecasting period.
The hindcast with Arctic MPIOM has been validated against a remotely sensed ice concentration from the reprocessed Ocean and Sea Ice Satellite Application Facility (OSI SAF) sea ice concentration product (Eastwood et al., 2015) and against a combination of in situ and remotely sensed ice thickness observations.In situ observations of sea ice thickness still have a high uncertainty, and each data source has its own strengths and weaknesses.As of today the most reliable pan-Arctic sea ice thickness data set is derived from a combination of various sources of in situ observations and remotely sensed satellite sea ice thickness products (Lindsay and Schweiger, 2015).
The reprocessed OSI SAF sea ice concentration product is available daily on a 10 km spatial grid and includes spatially and temporally varying uncertainty estimates.For an assessment of the performance of the Arctic MPIOM, the www.the-cryosphere.net/12/2569/2018/The Cryosphere, 12, 2569-2594, 2018  et al., 2013), for which the MPIOM was forced with the same atmospheric forcing data set as used in our study (ERA-Interim) (see panel f of their Fig. 3).In September large misfits to the OSI SAF sea ice concentration are obtained (Fig. 6f).Especially over the Eurasian basin, the model's sea ice margin is located too far north, but also over the central Arctic the model underestimates the sea ice concentration.
In our target regions the misfit remains relatively small.The aforementioned analysis by Notz et al. (2013) shows similar misfits (see panel f of their Fig. 4) to a different sea ice concentration data set, namely NSIDC-CDR (National Snow and Ice Data Center Climate Data Record).
An evaluation of the hindcast simulation with Arctic MPIOM with respect to the modelled SIT is much more difficult, because the observation-based products exhibit large uncertainties reflecting the corrections imposed by the respective measurement principles.For example, electromagnetic Air-EM measurements detect the air-snow interface and not the interface between snow and sea ice, introducing significant errors in the SIT estimates that are corrected by assumptions or measurements of snow depth.Moored and submarine ULS measurements have to be corrected for the first return echo.Differences in the observed and measured spatial scales further complicate the comparison.The aforementioned study of Lindsay and Schweiger (2015) synthesises all available in situ and remotely sensed satellite SIT products in an ice thickness regression procedure (ITRP) for the time period 2000 to 2012.Low-order spatial and temporal polynomials are fitted to the available sea ice thickness measurements.The resulting sea ice thickness regression product describes the evolution in the central Arctic and is linear in time plus a quadratic time-dependent component; i.e. it does not contain year-to-year variability.Uncertainty ranges are deduced from the uncertainty of the individual regression coefficients.The year-to-year variability is reflected in this uncertainty.Lindsay and Schweiger (2015) show, for example, that the ICESat ice thickness product from the Jet Propulsion Laboratory (ICESat-JPL, Kwok and Cunningham, 2008), which is widely used for model validation, has a large positive bias of about 40 cm.Here we compare the modelled long-term mean (2000 to 2012) sea ice thickness of the Arctic MPIOM hindcast to the ITRP sea ice thickness for the 2-month periods, February-March and October-November.We selected these 2-month periods, because the availability of the ICESat satellite product ensures a high data coverage in the ITRP.The long-term mean sea ice thickness of the Arctic MPIOM hindcast simulation for February-March and October-November is depicted in Fig. 7 (panels a and b) together with the misfit to the ITRP ice thickness (panels c and d).A prominent feature is a strong underestimation of the Arctic MPIOM sea ice thickness north and west of Fram Strait and in the strait itself.In the regions of interest for our QND study (the areas around the NSR) the misfit is moderate in February-March (overestimation of about 25 %) with the exception around the New Siberian Islands, where the misfit can reach more than 1 m (overestimation of about 50 %).In October-November the misfit is very moderate in these areas, except for Bering Strait, where Arctic MPIOM underestimates the sea ice thickness by more than 50 cm.Next we address the Arctic MPIOM performance over our assimilation and forecasting period (see Fig. 4).We show the April mean and the 28 May mean of the modelled SIT and the misfit of the April mean thickness to that retrieved from CryoSat-2 (Fig. 8).For a comparison of CryoSat-2 thickness to in situ observations we refer to Haas et al. (2017).The misfit to the CryoSat-2 ice thickness in April 2015 is similar to the misfit to the ITRP shown in Fig. 7: a strong underestimation north of the Canadian Archipelago and north and west of Fram Strait and a moderate overestimation in the area of the target quantities of about or less than 50 cm (about 25 % relative error).Figure 9 depicts the April mean and the 28 May mean of the modelled snow depth and the misfit to the modified Warren climatology (Warren et al., 1999) that is used in the CryoSat-2 retrieval (see Sect. 2.5).The main challenge for sea ice thickness retrieval with satellite altimeters is the parameterisation of snow depth on sea ice, which is still not measured routinely.The current CryoSat-2 retrieval uses a modified snow climatology, which addresses shortcomings of the Warren et al. (1999) climatology that was based largely on data from drifting stations, mainly on multi-year sea ice collected over the past decades, and hence is not reflective of a much younger, more seasonal Arctic ice cover.Given the increased fraction of first-year ice in the Arctic Ocean, the approach proposed by Kurtz and Farrell (2011) is used and the climatological snow depth values used in the retrieval are multiplied over first-year ice by a factor of 0.5.Note that on 28 May over the target regions a large fraction of snow cover had already melted.The misfit to the modified Warren climatology in the target area East Siberian Sea is of the order of about 10 cm (50 % relative error) but much less for the other target areas.
Overall, the misfits of the Arctic MPIOM are acceptable in particular for our target regions along the NSR (Fig. 3) and are comparable to misfits found in sea ice-ocean model intercomparison projects (e.g.Chevallier et al., 2017).

Control vector
Criteria for the choice of the control vector are presented in Sect.2.1.The specification of prior, both mean (x 0 ) and uncertainty (C(x 0 )), follows Kaminski et al. (2015), and is listed in Table 1.The largest possible control vector in our modelling system is the superset of initial and surface boundary conditions as well as all parameters in the process formulations, including the observation operators.As described in Sect.3, the Jacobian computation requires an extra run for each additional component of the control vector.To keep our ArcMBA system numerically efficient, 2-and 3-dimensional fields are partitioned into regions.More precisely, we divide the Arctic domain into nine regions (shown in Fig. 10).In each of these regions we add a scalar perturbation to each of the forcing fields (indicated in Table 1 by f in the type column); the perturbation is applied for the entire simulation time.Likewise we add a scalar perturbation to six initial fields indicated in Table 1 by i in the type column.For the ocean temperature and salinity the size of the perturbation is reduced with increasing depth (and zero below 500 m).Finally we have selected 29 process parameters from the sea ice-ocean model plus two parameters from the observation operators for freeboard products (see Sect. 2.5 for details).This procedure results in a total of 157 control variables.We assume the prior uncertainty to have diagonal form; i.e. there are no correlations among the prior uncertainties relating to different components of the control vector.The diagonal entries are the square of the prior standard deviation.For process parameters this standard deviation is estimated from the range of values typically used within the modelling community.The standard deviation for the components of the initial state is based on a model simulation over the past 37 years and computed for the 37-member ensemble corresponding to all states on the same day of the year.Likewise the standard deviation of the surface boundary conditions is computed for the 37-member ensemble corresponding to the April-October means of the respective year.

Data sets and observation operators
The study evaluates three data sets retrieved by the AWI (Ricker et al., 2014)   thetical SND products.Below, we describe these data sets and the simulation of their model equivalents, i.e. the respective observation operators that provide the links from the model's state variables to the respective data sets (Kaminski and Mathieu, 2017).Recall that the (combination of) data set(s) enters the QND algorithm through its uncertainty C(d) and that the observation operator is incorporated in the model M (see Sect. 2.1).
The three products derived by AWI from CryoSat-2 are SIT (h i ), SIFB (f i ), and RFB (f r ).Their definition is illustrated in Fig. 11 together with that of LFB (f l ).
The retrieval chain is described in detail by Ricker et al. (2014) and Hendricks et al. (2016).Recall that for each product, in order to run an assessment, we need the spatiotemporal coverage as well as the uncertainty ranges.The lefthand side of Fig. 12 summarises the main steps in the retrieval chain, starting with the rawest (lowest-level) product (RFB) on top.When descending from RFB via SIFB to SIT each step adds further assumptions, which contribute to the product uncertainty.The other element required to evaluate a given product is the observational Jacobian (M ), i.e. the sensitivity of the model simulation to a change in the control vector.The right-hand side of the graph illustrates how the equivalents of the respective products are simulated from the relevant model variables, which are denoted in violet.On this side of the graph, the complexity increases from bottom to top, i.e. from SIT via SIFB to RFB.For example, in the assessment of the SIT product, the uncertainty in quantities needed to apply the Archimedes' principle (including that of snow depth derived from climatology) is contained in the retrieval product, whereas the observation operator that extracts the product equivalent from the model is relatively simple (Archimedes' principle is described, for example, by Guerrier and Horley, 1970).We note that, while retrieved SIT is the effective SIT (h i,eff ) -i.e. it refers to the average over the ice-covered area of a grid cell -simulated SIT refers to the grid-cell average; i.e. for the Jacobian calculation it has to be divided by the simulated sea ice concentration (SIC, denoted by c): The same was done for snow depth: At the level of RFB, by contrast, it is the observation operator that includes, inter alia, on the modelling branch, the application of Archimedes' principle for which it requires simulated snow depth and the densities of snow (ρ s ), sea ice (ρ i ), and water (ρ w ), while the retrieval product is relatively raw.In particular this retrieval product is not affected by uncertainties due to assumptions concerning the snow depth, ρ s , ρ i , and ρ w .
The observation operators for f i , for f r , and for f l are The term −0.22h s /c in Eq. ( 9) adds to the simulated f i the correction for the difference in propagation speed of the radar signal in snow compared to air, which affects f r (Hendricks et al., 2016).This is the reason why f r is located below f i in Fig. 11.We note that, in these three observation operators, f i , f r , and f l have the same sensitivity to h i , but sensitivities to h s and c differ.The sea ice component of the MPIOM uses constant densities of snow, sea ice, and water.As simulated freeboard is relatively sensitive to densities of snow and sea ice, we have, however, included these quantities as The Cryosphere, 12, 2569Cryosphere, 12, -2594Cryosphere, 12, , 2018 www.the-cryosphere.net/12/2569/2018/The CryoSat-2 product files used in this study directly contain monthly SIT and SIFB on the Equal-Area Scalable Earth Grid (EASE) 2.0 grid with random (based on standard uncertainty propagation) and total (random plus systematic) per-pixel uncertainty ranges (for details see Hendricks et al., 2016, and references therein).Fig. 13 shows product uncertainties for April 2015.In our assessments we use the total uncertainties for the SIT and SIFB products, and for the RFB product we use the random uncertainty component of the SIFB product.Recall that we assume uncertainties to be uncorrelated in space.
For our hypothetical monthly LFB products, we assume a coverage of the Northern Hemisphere with a retrieved value over each cell of the EASE 2.0 grid with SIC above 70 %, in analogy to the threshold used in the CryoSat-2 retrieval (Hendricks et al., 2016).We explore two assumptions with respect to the uncertainty of the products, a mission with a high accuracy (uniform uncertainty of 2 cm) and a mission with low accuracy (uniform uncertainty of 20 cm).In both cases uncertainties are uncorrelated in space.
For our hypothetical monthly mean SND products, we also assume a coverage of the Northern Hemisphere with a retrieved value over each cell of the EASE 2.0 grid with SIC above 70 %.As for LFB we explore two assumptions on the uncertainty of the products, a mission with a high accuracy (uniform uncertainty of 2 cm) and a mission with low accuracy (uniform uncertainty of 15 cm).In both cases uncertainties are uncorrelated in space.
Table 2 provides an overview of the products we assess.For later use, it also lists, for each product and the three control regions, the number of sampled EASE 2.0 grid cells and the corresponding regional average uncertainties.Finally, it also shows the uncertainties on the spatial average of the sampled variable over all sampled EASE 2.0 grid cells based on the assumption of uncorrelated observational uncertainty.

Target and observational Jacobians
The evaluation of Eqs. ( 1) and (3) requires a target Jacobian N for each target quantity and an observational Jacobian M for each of the observational products we assess (Table 2).This subsection first describes the computation of these Jacobians and then discusses them.For a given product, the observational Jacobian is computed in two steps.The first step performs the following actions: a reference run is performed using the prior control vector x, the input variables to the observation operator are stored over the observational period, aggregated to the model grid, and the observation operator is applied to derive the observation equivalent M(x) on the space-time grid of the observational product.In the second step, for each component of the control vector the following procedure is applied: a sensitivity run is performed with a control vector x + p i that is identical to the prior control vector but with the ith component changed by a perturbation i , and an observation equivalent M(x + p i ) is computed in the same way as for the reference run.The Jacobian column is then computed as σ i (M(x + p i ) − M(x))/ i , where σ i is the prior uncertainty of x i .As a consequence of the normalisation by the prior uncertainty, each row in the Jacobian has the same unit as the respective observation.For a given product, column i of the corresponding observational Jacobian quantifies the sensitivity of the model-simulated equivalent to that product with respect to a change in the i component of the control vector x i by 1 standard deviation (see Table 1 for the value).
For any given product the dimension of the observational Jacobian is the product of the dimension of the control space and the grid size of the observational product.As an example, Fig. 14 displays the row of the Jacobians for April means of SIT, SIFB, RFB, LFB, and SND over a single point in space indicated by the black dot (and by the black cross on Fig. 3).
The SIT sensitivity is dominated by the model's initial SIT in control region 6 (black bars in Fig. 14 and enlarged in Fig. 15) but shows also considerable sensitivities to the initial SIC, the initial SND, the initial ocean temperature (TEMP) and the zonal wind stress (WIX).The negative sensitivity to SIC in that region is caused by two mechanisms.The first mechanism is expressed by Eq. ( 6): the observation h i,eff is the effective SIT (thickness averaged over the ice-covered grid cell) and is reduced if the initial SIC is increased (and vice versa) because the model conserves the total sea ice volume.The second mechanism is related to sea ice growth, which depends on the open-water fraction, i.e. more (less) sea ice can grow if the SIC is reduced (increased).The small negative sensitivity of SIT to SND is caused by the strong insulation effect of snow, which hampers the growth of sea ice (or fosters the growth if SND is reduced).The physical process behind the small negative sensitivities on the initial ocean temperature needs no further explanation; we recall, however, that, in the presence of sea ice, the control variable relates to a temperature change below the second model layer (at 17 m depth).The negative sensitivity with respect to the zonal wind stress (WIX) mirrors less advection of thick sea ice originating from the Beaufort Gyre.WIX is positive for eastward wind stress.A positive sensitivity to WIX is most distinct in region 6 (but also evident in regions 7 and 8) and slows down the Beaufort Gyre, which advects less sea ice into the target region (sea ice behaves, at least in April and May, to a large extent like a rigid body, i.e. a change in regions 7 and 8 impacts almost instantaneously on the target regions), resulting in a negative sensitivity.The SIT sensitivities on model parameters (Fig. 14 and enlarged in Fig. 15) are very small compared to the sensitivities on the initial state The Cryosphere, 12, 2569Cryosphere, 12, -2594Cryosphere, 12, , 2018 www.the-cryosphere.net/12/2569/2018/Table 2. Overview on data sets, the # of sampled EASE 2.0 grids in control regions 5-7 (columns 2-4), the respective average uncertainties (columns 5-7), the uncertainty of the product aggregated over all sampled EASE 2.0 grid cells (column 8).or the atmospheric boundary conditions, as the short integration time (we sample the April mean of a model simulation starting on 1 April) restricts the impact of the parameters.
The various freeboard products exhibit high sensitivity to initial SIT and SND (orange, red, and green bars in Fig. 14 and enlarged in Fig. 15).As SIT enters all freeboard observation operators in the same way (Sect.2.5), the freeboard sensitivity to April mean SIT is equal for all products, which also renders their sensitivity to initial SIT almost equal.The LFB sensitivity on the initial SND is positive (LFB is the www.the-cryosphere.net/12/2569/2018/The Cryosphere, 12, 2569-2594, 2018 show the observational Jacobians with respect to the initial and forcing fields for each control region (see Table 1 for an explanation of the abbreviations).
freeboard at the top of the snow layer), while the sensitivity of the RFB and SIFB is negative because an increased SND will reduce the RFB and SIFB through the increased weight on the ice floe (see Fig. 11).Due to the definition of the observation operator for RFB (Eq.9) its sensitivity to initial SND is larger than that of the SIFB (Sect.2.5).The sensitivity of the freeboard products (yellow, red, and green bars) with respect to the parameters of the sea ice and ocean model is low.
The impact of the sea ice density on the respective observation operators (Eqs.8 to 10) is high, while sensitivity with respect to the snow density is much lower (because the sea ice thickness is much larger than the SND at the observational point).The SND shows only considerable sensitivity to the initial SND in control region 6 and some minor positive sensitivity with respect to the precipitation in the same region.
Likewise we computed target Jacobians N for each of the six target quantities (SIV and SNV each over three regions) described in Sect.2.2.Each target quantity is a scalar and thus the Jacobian has one entry for each component of the control vector.As an example Fig. 16 displays the Jacobians for SIV and SNV (on 28 May) over the outer New Siberian Islands (ONSI) region.The first point to note is that sensitivities of regional SIV and SNV to the control vector differ, so an observation must constrain different components of the control vector to perform well for one or the other target quantity.
SIV over the ONSI region is highly sensitive to initial SIT over control regions 5 and 6 (Fig. 17), which at least partly overlap with the target area.Similarly to the SIT observation and due to the same mechanisms discussed above, the SIV target quantity also exhibits negative sensitivity to the initial SIC, SND, and zonal wind stress.It is interesting to note that The Cryosphere, 12, 2569Cryosphere, 12, -2594Cryosphere, 12, , 2018 www.the-cryosphere.net/12/2569/2018/SIV is also sensitive to initial and boundary conditions over more remote control regions.For example, it exhibits a positive sensitivity to the initial SIT in the control regions 1 and 7 from which thick sea ice is advected into the target region during the period from 1 April to 28 May.This also explains the negative sensitivity to the zonal wind stress in region 7 and the meridional wind stress in region 1: for sufficiently high SIC the sea ice almost behaves as an incompressible fluid, allowing even for sensitivity to wind stress changes in very remote control regions, e.g. the negative sensitivity to the zonal wind stress in region 8.The positive sensitivity to the zonal wind stress in region 1 (with thick ice) may be less obvious, as it follows the deflection of ice drift by about 20 • to the right.The largest SIV sensitivity to model parameters (Fig. 17) is found for the snow albedo of freezing conditions (albsn), but still that sensitivity is low compared to the sensitivity with respect to the initial state and atmospheric boundary conditions.
SNV shows particularly high sensitivity to the initial SND but also considerable sensitivity with respect to the precipitation and air temperature in region 6.The largest model parameter sensitivity is found for the snow albedo for melting conditions: increasing the snow albedo will reduce the melting.4. hypothetical low-accuracy LFB, and 5. hypothetical high-accuracy LFB.
The following three assessment variants were used: 1. product was evaluated individually, 2. product was evaluated together with a hypothetical lowaccuracy SND product, and 3. product was evaluated together with a hypothetical high-accuracy SND product.
The reference for these assessments is a case without observations.Row three ("prior") shows the uncertainties in the target quantities that result from the prior uncertainty in the control vector.

Sea ice and snow volume uncertainty
The section presents the results of our assessments.As explained in Sect.2.1 the uncertainty component from the model error σ (y mod ) in Eq. ( 3) covers the residual uncertainty that remains with an optimal control vector; i.e. it reflects uncertainty from uncertain aspects not included in the model error and structural uncertainty reflecting wrong or missing process formulations.σ (y mod ) is model dependent and is probably the most subjective component in the prior and posterior uncertainties.σ (y mod ) acts as an offset (for the respective variance) for all cases and reduces the contrast between the cases.As our assessments focus on the differences between the cases, we exclude it from the target uncertainties in rows 3-18 of Table 3 and provide estimates in separate rows.To illustrate the subjective nature of this estimate and possible ranges, we derive two crude estimates (last two rows).The first estimate (denoted by σ mod,absolute ) assumes a model that perfectly simulates the same ice-covered area of all three regions as our model and that, over this area, achieves an uncertainty of 20 cm for SIT and of 10 cm for SND.The second estimate (denoted by σ mod,relative ) assumes a model that simulates the same SIV and SNV as our model with an uncertainty of 10 % for SIV and 30 % for SNV.We use a higher uncertainty for SNV because it has a stronger dependence on the surface forcing (mainly precipitation), for which the temporal and small-scale spatial structures are not resolved in the control vector.
Figure 18 shows the uncertainty reduction with respect to the prior case as defined in Eq. ( 5) for both SIV and SNV and all three target regions.A value of 100 % means that the product has resolved all uncertainty in the respective target quantity, while a value of 0 % means that the product was not useful for improving the forecast of the target quantity.We first discuss the single product assessments, i.e. without additional use of a hypothetical snow product.For all three regions, the SIT (yellow bar) has a considerably better performance for SIV than for SNV.Between SIV and SNV the only difference in our QND assessments consists of the target Jacobians, N .For example, for target region ONSI, Fig. 16 shows particularly high sensitivity of SIV to initial SIT and of SNV to initial SND in control regions 5 and 6.Hence, to constrain SIV (SNV) over that target region a product has to constrain primarily initial SIT (SND) over these two control regions.Figure 14 shows that, indeed, SIT provides a much stronger constraint on initial SIT than on initial SND.In contrast to SIT, SIFB has a similar performance for SIV and SNV over all target regions (Fig. 18).Compared to SIT, SIFB shows a much lower sensitivity to initial SIT but a higher sensitivity to initial SND (Fig. 14 -the sign of the sensitivity is irrelevant in this consideration), and thus it shows a more balanced performance for SIV and SNV than the SIT product.RFB and the two hypothetical LFB products achieve a better performance for SNV than for SIV.The only difference between the RFB and SIFB Jacobians is the larger impact of h s /c on RFB, as a consequence of the correction for the signal propagation through snow (see Sect. 2.5).This is why RFB shows a better performance for SNV than for SIV, while SIFB had about an equal level of performance for SIV and SNV.LFB has the same sensitivity to initial SIT as RFB but an even larger sensitivity to initial SND.Consequently, for the low-accuracy LFB product, the imbalance between the performance for SIV and SNV is even higher than for the RFB product.This imbalance is lower for the high-accuracy www.the-cryosphere.net/12/2569/2018/The Cryosphere, 12, 2569-2594, 2018 LFB product, because this product already performs excellently on SIV such that there is not much scope for further increases in performance on SNV.So far we have discussed differences in performance for SIV and SNV for a given product.Next we address performance differences between products.First, we note that switching from SIT to SIFB drastically reduces the performance for SIV.As explained in Sect.2.5, on the left-hand side of Fig. 12 (retrieval branch), switching from SIFB to SIT applies Archimedes' principle, with uncertain assumptions primarily on the input variables snow and ice density and snow depth, which yield an increase in product uncertainty by about an order of magnitude (Fig. 13 and Table 2).On the right-hand side of Fig. 12 (modelling branch) switching from SIT to SIFB means dealing with uncertainty on the same input variables (snow and ice densities and snow depth), which renders the simulation of SIFB more uncertain than that of SIT.In the model, the uncertainty in these variables is determined by the prior uncertainty of the control vector, either directly (snow and ice densities) or indirectly (snow depth) through their model-simulated dependency on the control vector.It appears that the increase in uncertainty, when going from SIT to SIFB on the modelling branch, overcompensates for the reduction in uncertainty on the retrieval side, when going back from SIT to SIFB.In other words, on the modelling branch, the assumptions on uncertain input appear more conservative than those on the retrieval branch.On the retrieval branch, going (backwards) from SIFB to RFB consists of a reduction in product uncertainty by about another order of magnitude, as the retrieval of RFB does not require information on snow depth.Even with this further reduction in product uncertainty, the performance of RFB is inferior to that of SIT for SIV over WLS and ONSI, and only just superior for SIV over ESS.
Differences between target regions in the performance of the same product are the result of a complex interplay of the Jacobians N for the target regions and the product's constraint on the control vector quantified by C(x) (see Eq. 3).For each of the target regions a different (combination of) control region(s) is most relevant: for WLS this is control region 5 (not shown), for ONSI it is control regions 5 and 6 (Fig. 16 and enlarged in Fig. 17) and for ESS it is control regions 6 and 7 (not shown).The ability of a product to constrain a particular control region is determined by the combination of the observational Jacobian of the product and the product uncertainty (see Eq. 1).
It is tempting to explain regional performance differences simplistically by linking them to differences in observational coverage and uncertainty.Technically, this explanation corresponds to replacing our observational Jacobian M (that is based on model dynamics) with a drastically simplified representation.Such a simplistic approach would imply that only observations over a given control region constrain that same region (and none other), and that the observational Ja- The Cryosphere, 12, 2569Cryosphere, 12, -2594Cryosphere, 12, , 2018 www.the-cryosphere.net/12/2569/2018/ Figure 18.Uncertainty reduction for sea ice (SIV) and snow (SNV) volume over target regions when using observational constraints.The colour of the bars represents the different observational constraints.Yellowish: SIT and SIT in combination with a hypothetical snow depth product with two different uncertainties (15 and 2 cm), reddish: as SIT but for SIFB, greenish: as SIT but for RFB, bluish: as SIT but for hypothetical LFB with 20 cm uncertainty, greyish: as SIT but for hypothetical LFB with 2 cm uncertainty.
cobian for each product and control variable is spatially uniform.The constraints of a product on a control region would then be proportional to the square root of the number of samples n of that region and to the reciprocal of the average observational uncertainty σ over the region.Table 2 shows both impact factors for the most relevant control regions, i.e. 5-7.For RFB and compared to region 6, the relevant quantity √ n/σ is about 41 % lower in region 5 and 12 % lower in region 7.This is at least quantitatively in line with the performance decrease for RFB and SIV from ONSI (most relevant in region 6 and to smaller extent in 5) to ESS (most relevant in region 6 and to smaller extent in 7) to WLS (most relevant in region 5).The performance ranking for RFB and SNV is different, however; i.e. the simplistic approach does not hold.Also for SIT, the differences in √ n/σ between the three control regions are smaller and fail to explain the performance decrease from WLS to ONSI to ESS.These calcu-lations demonstrate the limits of a performance assessment that is only based on observational coverage and uncertainty, while neglecting the model dynamics.
The two hypothetical LFB products have a slightly better spatial coverage of the most relevant control regions than the products derived from CryoSat-2 and use uniform data uncertainties that span the range from 2 cm (high-accuracy LFB) to 20 cm (low-accuracy LFB).Recall that the specified data uncertainty combines the observational uncertainty (i.e.product uncertainty) with the residual model uncertainty due to structural errors and uncertain contributions not accounted for in the control vector (Eq.2).Only the high-accuracy LFB can clearly outperform all CryoSat-2 products for both SIV and SNV and over all three regions, while the low-accuracy LFB falls in performance between that of SIFB and RFB.
Next we discuss the effect of combining either of these five products with the two hypothetical SND products.The difwww.the-cryosphere.net/12/2569/2018/The Cryosphere, 12, 2569Cryosphere, 12, -2594Cryosphere, 12, , 2018 ference in the respective product Jacobians shown in Fig. 14 suggests complementarity of SND to the SIT and freeboard products.Indeed, the combination with SND considerably increases the performance of all SIT/freeboard products for SIV and SNV and over all regions.Most striking is the improved SIT performance for SNV.The combination with SND results in similar performances for SIT and SIFB, a slightly better performance for low-accuracy LFB, yet a slightly better performance for RFB, and the best performance for the high-accuracy LFB.The assessment for SIV and in combination with low-accuracy SND yields the same performance ranking of products, with slightly larger differences between products.Being combined with the highaccuracy SND product instead of the low-accuracy SND product yields a performance gain for all products and for SIV and SNV over all regions.
Between the two LFB products, the increase in accuracy yields a considerable performance gain for SIV and SND over all regions, when assessed individually and in combination with SND.Over all regions the combination of the highaccuracy LFB with the low-accuracy SND performs better for SIV than the combination of the low-accuracy LFB with the high-accuracy SND.For SNV the two combinations are similar in performance.

Discussion
There are a number of factors in the set-up of our ArcMBA system that impact our assessments.One of them is the model, which is required to realistically compute the sensitivities (Jacobians) of the target quantities and of the observation equivalents to changes in the control vector.As detailed in Sect.2.3, the MPIOM has a state-of-the-art representation of processes, compares reasonably with a range of observations (Notz et al., 2013;Kaminski et al., 2017), and the setup of MPIOM we are using has a spatial resolution below the grid size of the observations and well below the size of the target regions.The model thus appears appropriate for our study and the ArcMBA system in general.Nevertheless, through the Jacobians the results depend on the model, and it would be useful to confirm the robustness of the assessments through the use of a second model, or even an ensemble of models.
The study investigated the performance of 4-week forecasts in May 2015.The impact of an observation is likely to depend on the state of the Arctic sea ice-ocean system.The robustness of the ArcMBA assessments can thus be increased through extension of the system for an ensemble of ice and ocean conditions representing different forecasting times (for example 2, 7-10, and 90 days and also 0 days, i.e. an analysis), different seasons, different typical years (potentially also including conditions of very low ice cover), and different target regions and variables, e.g.SIC.
In our set-up, the control vector has 157 components.In particular, within any of our nine control regions, we do not resolve changes in the spatial patterns of the initial conditions nor in the spatio-temporal patterns of the forcing data.This means that we are ignoring uncertain aspects in the inputs that determine our simulation, which results in aggregation errors (Trampert and Snieder, 1996;Kaminski et al., 2001) and renders the ArcMBA assessments of the product impacts too optimistic.As the target quantities are integrals over large regions, we expect, however, that our control regions can capture most of the uncertainty.Also, the set of reasonable surface forcings is in practise limited by physical relations between variables, in space and in time.Similar restrictions apply to the initial state.Further, we use the same control vector for all cases, so that the relative performance with respect to the prior (uncertainty reduction) and among products is more reliable.Nevertheless it appears useful to explore extended control vectors, for example with decreased sizes of the control regions, in particular in areas with high impact on observations or target quantities.
Another factor that impacts our assessments is correlations in the data uncertainty.These uncertainty correlations are difficult to estimate.We used zero uncertainty correlation for each of the products, which is certainly the most optimistic assumption and yields the best performance.As we made this assumption consistently for all products, the relative performance between the products is less affected.To illustrate the implications of uncorrelated uncertainty in the products, we computed the resulting uncertainty in the respective average of each observable over all sampled grid cells (last column of Table 2).For April 2015 this yields a mean of SIT about 2 cm, of SIFB about 2 mm, of RFB about 0.4 mm and (using the respective high-accuracy versions) of LFB and SND about 0.1 mm.
The effect of uncertainty correlation on the assessments can also be demonstrated by the following simplified calculation: if we partition our product grid into groups of n by n pixels and assume perfect uncertainty correlation and the same Jacobian for each observation within a given group, then we decrease the first term in Eq. ( 1) by a factor of n 2 .This case then yields the same results as a case with an uncertainty that is uncorrelated and increased by a factor of n.This means we can interpret the impact of the low-resolution LFB product (uncorrelated uncertainty of 20 cm) as the impact of a high-resolution LFB product (with 10 times lower uncertainty, i.e. 2 cm) for which the uncertainty within each 10 by 10 group of pixels is completely correlated.This is likewise found for the SND product and (roughly) 6 by 9 groups of pixels, because (15 cm/2 cm) 2 is about 6 × 9.One reason for spatial uncertainty correlation would be a sensor footprint that exceeds the size of a 25 km EASE grid cell.Likewise, for sensors with footprints considerably smaller than a 25 km EASE grid cell, the procedure for upscaling from the sampled fraction of a grid cell to a grid-cell average could suffer from systematic errors that affect large scales in The Cryosphere, 12, 2569Cryosphere, 12, -2594Cryosphere, 12, , 2018 www.the-cryosphere.net/12/2569/2018/ the same way, which would result in large-scale uncertainty correlations.
Our hypothetical products (LFB and SND) observe every pixel with SIC above 70 %.This is optimistic but, at least for snow, not totally unrealistic, depending on the mission concept.Recalling that the data uncertainty also has to include an uncertainty from the model error, the value of 2 cm for the high-accuracy products (without spatial correlation) is extreme and unrealistic (as it is already a challenging requirement on the observational uncertainty) but still useful as a sanity check for the methodology.We note that, even for the assessment of an individual product, the posterior uncertainty on the target quantities is not a simple linear function of the product uncertainty, because of the contribution from the prior term in Eq. ( 1).This means, for example, the posterior uncertainties achieved by a hypothetical LFB product with 0.11 cm uncertainty will not be the average of the posterior uncertainties achieved by our two hypothetical SND products with respective uncertainties of 2 and 20 cm.For combined assessment of multiple products the relation between the uncertainty of a single product and the posterior target uncertainty is yet more complex.
The uncertainty specified with the SIT product is higher than the uncertainty specified with the SIFB products derived from CryoSat-2.This increase reflects the inclusion of uncertainty in the input quantities for the application of Archimedes' principle, in particular of climatological snow depth.In the assessment of SIFB, Archimedes' principle is applied in the observation operator, where the input quantities including snow depth are taken from the model.The fact that the impact of SIFB on SIV is lower than that of SIT indicates that the assumptions on the uncertainty of input quantities for the application of Archimedes' principle are more conservative on the modelling branch than those that were made on the retrieval branch (yielding the respective product uncertainties).More conservative assumptions on the retrieval branch would yield higher uncertainty in the SIT product.We also note that biases may be reduced when using the RFB product instead, as it does not rely on an external snow depth climatology but uses a consistently simulated snow depth.

Summary and conclusions
The ArcMBA tool was used to assess the impact of a series of EO products on the quality of 4-week forecasts of SIV and SNV over three regions along the NSR in May 2015.The tool is built around the MPIOM, a coupled model of the sea iceocean system extended by observation operators that link the simulated variables to equivalents of SIT, SIFB, RFB, LFB, and SND products.
On the basis of the per-pixel uncertainty ranges that are provided with the CryoSat-2 SIT, SIFB, and RFB products, the SIT and RFB products achieve the best performance for the target SIV.For the SNV target, the performance of SIT is only low, the performance of SIFB is higher, and the performance of RFB is yet higher.A hypothetical LFB product with low accuracy (20 cm uncertainty) has a performance between SIFB and RFB for both SIV and SNV.A reduction in the uncertainty of the LFB product to 2 cm yields a significant increase in performance.
Combining either of the SIT or freeboard products with a hypothetical SND product achieves a significant performance increase.The uncertainty in the SND product matters: a higher-accuracy product achieves an extra performance gain.
The provision of spatial and temporal uncertainty correlations with the EO products would be beneficial, not only for assessments within systems like the ArcMBA but also for assimilation of the products.For example, complete uncertainty correlation within each group of 10 by 10 pixels (with uniform Jacobians within each group) is equivalent to an uncertainty increase by a factor of 10.
The ArcMBA can be extended to cover further EO products and further target variables.In the set-up used here the model can simulate a range of sea ice-ocean variables in addition to those considered in the present study (e.g.ice drift, mixed layer depth, freshwater content, sea surface salinity, sea surface temperature, or ocean circulation).Switching to a more comprehensive model configuration would enable the investigation of yet further variables.For example, the MPIOM can be operated with its biogeochemistry module HAMOCC (Ilyina et al., 2013) or in a mode coupled to an atmospheric general circulation model.
The study has investigated the performance of 4-week forecasts in May 2015.It would be interesting to analyse how the relative performance of the products varies from year to year, with the length of the forecasting period, for other target regions and with a different sea ice-ocean model.
As the QND approach can evaluate a (group of) hypothetical product(s) -characterised by their space-time coverage and uncertainty ranges -it is generally suited to assess the benefit of filling a given observational gap.It provides answers to hypothetical questions such as "Provided we could derive a product of a given variable with a given accuracy, at a given sampling frequency and spatial coverage.What is the added value of this additional observation for the quality of sea ice forecasts (as quantified by uncertainty reduction in a set of predicted target quantities)?"ArcMBA assessments of a set of such products (each filling an observational gap) can help to establish a priority list.
The ArcMBA system is an ideal framework which assists the formulation of mission requirements or the development of EO products.In an end-to-end simulation it can translate product specifications in terms of spatio-temporal resolution and coverage, accuracy, and precision into a range of performance metrics.Alternatively, it can translate requirements on forecast performance into requirements on the respective observables.As demonstrated in the present study, the joint aswww.the-cryosphere.net/12/2569/2018/The Cryosphere, 12, 2569Cryosphere, 12, -2594Cryosphere, 12, , 2018 sessment of products from (constellations of) multiple satellites is one of the particular strengths of the ArcMBA approach.This type of assessment can be performed for higherlevel products (e.g.SIT or SIC) but also for rawer products (e.g.freeboard or brightness temperature).
Figure 1.Data flow through a two-step procedure of QND formalism.Oval boxes denote data; rectangular boxes denote processing."Backward propagation with inverse model" implements Eq. (1), and "forward propagation with model" implements Eq. (3).Figure taken from Kaminski and Rayner (2017).

Figure 2 .
Figure 2. Schematic presentation of the QND procedure: each coloured line illustrates a model trajectory that simulates counterparts of the observations (d 1and d 2) and a target quantity (y) for a given value of the control vector (x).Through the model, the observations act as constraints on the control vector, which reduces its uncertainty from C(x 0 ) to C(x).This uncertainty reduction on the control vector translates into an uncertainty reduction in the target quantity from σ (y 0 ) to σ (y).

Figure 3 .
Figure 3. Target regions along the NSR.Black cross indicates a location for further use in Fig. 14.

Figure 4 .
Figure 4. Timeline of assimilation and forecast set-up.

Figure 5 .
Figure 5. Model grid, mesh indicates groups of 4 by 4 grid cells.

Figure 6 .
Figure 6.The long-term mean sea ice concentration (%) of the Arctic MPIOM for 1990 to 2008 for March, June and September (a to c) and the misfit to the OSI SAF sea ice concentration (panel d to f).In panels (d) to (f), shades of red indicate underestimation and shades of blue indicate overestimation of sea ice concentration in the model.

Figure 7 .
Figure 7.The long-term mean (2000 to 2012) of the simulated sea ice thickness (m) for the 2-month periods February-March and October-November (panel a and b) and the misfit (model -observations) to the ITRP (panel c and d).In panels (c) and (d), shades of red indicate underestimation and shades of blue indicate overestimation of sea ice thickness in the model.

Figure 8 .
Figure 8.(a) The modelled mean April 2015 sea ice thickness (m), (b) the modelled sea ice thickness on 28 May 2015, and (c) the mean April 2015 misfit of the modelled sea ice thickness relative to the CryoSat-2 sea ice thickness.In panel (c), shades of red indicate underestimation and shades of blue indicate overestimation of sea ice thickness in the model.

Figure 9 .
Figure 9. (a) The modelled mean April 2015 snow depth (m), (b) the modelled snow depth on 28 May 2015, and (c) the mean April 2015 misfit of the modelled snow depth relative to the modified Warren climatology used in the CryoSat-2 sea ice thickness retrieval.In panel (c), shades of red indicate underestimation and shades of blue indicate overestimation of snow depth in the model.

Figure 11 .
Figure 11.Schematic illustration of sea ice thickness and different freeboard variables.

Figure 12 .
Figure 12.Overview of the processing chain for CryoSat-2 (CS) product retrievals (a) and the chain for modelling product equivalents from the control vector (M(x); b).Oval boxes denote data and rectangular boxes processing steps.Green indicates remote-sensing products and violet indicates model variables.Yellow diamonds mark the assessment of the EO products with the QND algorithm.MSS is mean sea surface height.

Figure 14 .
Figure 14.The sensitivities of the respective EO product to the control vector (observational Jacobians) for April means of LFB (orange bars), RFB (red bars), SIFB (green bars), SIT (black bars), and SND (cyan bars) over a single point indicated by the black dot (and by black cross on Fig.3).The observational Jacobians with respect to the process parameters are shown in the left middle panel.The other panels show the observational Jacobians with respect to the initial and forcing fields for each control region (see Table1for an explanation of the abbreviations).

Figure 15 .
Figure 15.An excerpt of Fig. 14 of the observational Jacobian (top) in target region 6 (East Siberian Sea) and (bottom) for the model parameters.

Figure 16 .
Figure 16.As Fig. 14 but for the sensitivities of the sea ice (SIV) and snow (SNV) volume over the target region, outer New Siberian Islands (ONSI), on the control vector (target Jacobians).

Figure 17 .
Figure 17.An excerpt of Fig. 16 for the target Jacobian (top) in target region 6 (East Siberian Sea -a) and region 5 (Laptev Sea -b) and (c) for the model parameters.

Table 1 .
from observations provided by the CryoSat-2 mission, two data sets characterising hypothetical LFB products, and two data sets characterising hypo-Control variables.Column 1 lists the quantities in the control vector; column 2 gives the abbreviation for each quantity; column 3 indicates whether the quantity is a process parameter (p), an initial field (i), or an atmospheric boundary field (denoted by f for forcing); column 4 gives the name of each quantity; column 5 indicates the prior uncertainty (1 standard deviation) and the corresponding units (unless unitless) and provides the mean parameter value in parenthesis, where applicable; and column 6 identifies the position of the quantity in the control vector -for initial and boundary values (which are differentiated by region) this position refers to the first region, while the following components of the control vector then cover regions 2 to 9.

Table 3 .
Prior and posterior uncertainties of sea ice volume (SIV, columns 4-6) and snow volume (SNV, columns 7-9) respectively for three regions in km 3 .Column 1 indicates observation, column 2 indicates uncertainty range ("product" refers to uncertainty specification provided with product), column 3 indicates uncertainty range of additional hypothetical snow product (a dash means no snow product is used).In each of columns 4-9 the lowest uncertainty range is highlighted in bold.The two bottom rows give estimates for the uncertainty due to model error, i.e. the residual uncertainty with optimal control vector.