This work evaluates the statistical predictability of the Arctic sea ice volume (SIV) anomaly –
here defined as the detrended and deseasonalized SIV – on the interannual timescale.
To do so, we use six datasets from three different atmosphere–ocean general circulation models (AOGCMs),
each run at two horizontal grid resolutions. Based on these datasets, we developed a statistical
empirical model, which in turn was used to test the performance of different predictor variables and to
identify optimal locations from which the SIV anomaly could be better reconstructed and/or predicted. We tested
the hypothesis that an ideal sampling strategy characterized by only a few optimal sampling locations can provide
in situ data for statistically reproducing and/or predicting the SIV interannual variability. The results
showed that, apart from the SIV itself, the sea ice thickness is the best predictor variable, although total sea
ice area, sea ice concentration, sea surface temperature, and sea ice drift can also contribute to improving the
prediction skill. The prediction skill can be enhanced further by combining several predictors into the statistical
model. Applying the statistical model with predictor data from four well-placed locations is sufficient for
reconstructing about 70 % of the SIV anomaly variance.
As suggested by the results, the first four best locations are placed at the transition Chukchi Sea–central Arctic–Beaufort Sea (79.5

The ongoing melting of the Arctic sea ice observed in the last decades

Since this intense sea ice loss is projected to continue throughout the twenty-first century

To respond to the need for an improved observational system that allows a better understanding of the SIV variability while minimizing the associated costs, this work puts forward the hypothesis that an ideal sampling strategy characterized by only a few optimal sampling locations can provide in situ data for statistically reproducing and/or predicting the SIV interannual variability. To test this hypothesis, this study follows three main directions. First, we propose a statistical empirical model for predicting the SIV. Since we are mainly interested in predicting the interannual variability rather than the seasonal cycle and the long-term trend, we focus on the SIV without these two components – hereafter defined as the SIV anomaly. Second, we investigate the performance of a set of ocean- and ice-related predictor variables as input into the empirical model. Third, we aim to identify a reduced number of optimal sampling locations from which the predictor variables could be systematically sampled using oceanographic moorings and/or buoys. Collecting in situ data at optimal locations, that is, at locations where the predictor variables capture most of the pan-Arctic SIV anomaly variability, makes it much more feasible to sustain a long-term programme of operational oceanography from both logistical and financial points of view.
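As an illustration of the SIV anomaly defined above, the following minimal sketch removes a mean seasonal cycle and a long-term trend from a monthly time series. The joint least-squares fit, the polynomial form of the trend, and the names `siv_anomaly` and `trend_degree` are assumptions made for this example; the text does not specify how the two components are removed.

```python
import numpy as np


def siv_anomaly(siv, trend_degree=1):
    """Return a detrended and deseasonalized monthly series (illustrative only).

    Assumptions (not taken from the paper): the series has monthly resolution
    with no gaps, the seasonal cycle is a fixed 12-month climatology, and the
    long-term trend is a polynomial of degree `trend_degree`.
    """
    siv = np.asarray(siv, dtype=float)
    n = siv.size
    if n < 24:
        raise ValueError("at least two full years of monthly data are expected")

    # One indicator column per calendar month (these also absorb the mean).
    months = np.arange(n) % 12
    seasonal = (months[:, None] == np.arange(12)[None, :]).astype(float)

    # Centred and scaled time axis keeps the polynomial terms well conditioned.
    time = (np.arange(n) - (n - 1) / 2.0) / n
    trend = time[:, None] ** np.arange(1, trend_degree + 1)[None, :]

    # Fit seasonal cycle and trend jointly, then keep the residual.
    design = np.hstack([seasonal, trend])
    coeffs, *_ = np.linalg.lstsq(design, siv, rcond=None)
    return siv - design @ coeffs
```

Fitting both components in a single regression avoids the leakage that can occur when a climatology is subtracted from a series that still contains a trend; other choices (for example, detrending each calendar month separately) would be equally reasonable.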

To the best of the authors' knowledge, this study is the first to apply an empirical statistical model for supporting an optimal observing system of the
pan-Arctic SIV anomaly, although a similar study was conducted by

Thus, even though we claim that in situ observations are crucial for understanding the SIV variability, our study makes use of outputs
from three AOGCMs. This is the only way to have continuous and well-distributed data of the predictand and some predictor variables,
such as sea ice thickness. The AOGCMs used in this work are cutting edge in terms of model physics and resolution

To fully address the three overall directions and the hypothesis described above, this study is guided by the following open questions. (i) What is the performance of different pan-Arctic predictors for predicting pan-Arctic SIV anomalies? (ii) What are the best locations for in situ sampling of predictor variables to optimize the statistical predictability of SIV anomalies in terms of reproducibility and variability? (iii) How many optimal sites are needed for explaining a substantial amount (e.g. 70 % – an arbitrarily chosen threshold) of the original SIV anomaly variance? (iv) Are the results model dependent and/or sensitive to horizontal resolution?

This work follows a multi-model approach. It takes advantage of six coupled historical runs from three different AOGCMs
(each with two horizontal grid resolutions), all conducted within the context of the
High Resolution Model Intercomparison Project (HighResMIP;

The AOGCMs are version 1.1 of the Alfred Wegener Institute Climate Model (AWI-CM;

A comprehensive comparison including these three models and their respective specifications is presented by

The two configurations from the same model keep the parameters identical, except for the resolution-dependent parameterizations

For the three models, the SIV time series from the versions with a coarser horizontal grid present higher mean values compared to their
respective finer-resolution versions (Fig.

Sea ice volume time series from the six model configurations used in this work:

In this section, we identify potential predictor variables for use as input into the empirical statistical model that predicts SIV anomalies. Apart from the condition that all predictor variables could be regularly sampled from observational platforms in the real world, we only preselected variables which have the potential to impact the sea ice through dynamic and/or thermodynamic processes. Overall, two categories of predictors are tested: integrated variables, intrinsically represented by a single pan-Arctic time series, and predictors represented by several gridded time series of the same variable. Here, predictor variables are also considered in terms of their anomaly.

In total, a set of seven predictors is considered for this preliminary inspection. Three of them are integrated variables: pan-Arctic SIV itself, pan-Arctic sea ice area (SIA), and Atlantic basin ocean heat transport (OHT) estimated at 60.0

Lag-0 correlation coefficient estimated between the predictand (SIV anomaly) and a set of pan-Arctic potential predictors: SIA, OHT, SIT, SIC, SST, and Drift. The correlation coefficients between OHT and SIV anomaly for the high-resolution model versions are not shown since only statistically significant coefficients are displayed in the table. Regional predictors (SIT, SIC, SST, and Drift) are represented by pan-Arctic averages. As for the predictand, all predictors are used with monthly time resolution and in terms of their anomaly.
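A lag-0 correlation with a significance screening, as summarized in this caption, could be computed along the following lines. This is only a sketch: the significance test (a Fisher z approximation), the 5 % level, and the name `lag0_correlation` are assumptions, since the test actually used is not stated in the text.

```python
import math

import numpy as np


def lag0_correlation(predictand, predictor, alpha=0.05):
    """Pearson correlation at lag 0 plus an approximate two-sided p-value.

    Assumption: significance is assessed with the Fisher z transformation,
    treating the samples as independent. Monthly anomalies are usually
    autocorrelated, so a real analysis would reduce the effective sample size.
    """
    x = np.asarray(predictand, dtype=float)
    y = np.asarray(predictor, dtype=float)
    if x.shape != y.shape or x.ndim != 1:
        raise ValueError("both inputs must be 1-D series of equal length")
    n = x.size
    if n <= 3:
        raise ValueError("more than three samples are needed")

    r = float(np.corrcoef(x, y)[0, 1])

    # Clip to keep atanh finite when the series are perfectly correlated.
    r_safe = min(max(r, -0.999999), 0.999999)
    z = math.atanh(r_safe) * math.sqrt(n - 3)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return r, p_value, p_value < alpha
```

Only predictors for which the third returned value is true would then be retained, mirroring the rule that non-significant coefficients are not displayed.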

Lag-0 comparison between the time series from the predictand (SIV (10

To obtain the same first assessment for the other predictors, the gridded values are reduced to their pan-Arctic average. To do so, the time series are normalized twice: first, by the grid area of each grid cell and, second, by
the correlation maps with the predictand

The basis of our statistical empirical model (SEM) is a multiple linear regression model where the time series of the dependent variable (

In our case, the reconstructed time series of SIV anomaly (SIV

To ensure the robustness of the statistical reconstructions, the SEM is applied within a Monte Carlo loop with 500 repetitions.
In every repetition, 70 % of the data are randomly selected for training (

Two different approaches for applying the SEM are used in this work. First, in Sect.

We intend to identify a reduced number of sites from which predictor variables could offer an optimal representation of the pan-Arctic SIV anomaly.
To identify the first best location, a score map (Sc[

By following the approach above, the goal is to create a first score map (Sc[

Aiming at identifying a single first optimal location that best represents all datasets (ensemble first optimal location), we take the average of the six
score maps. To give the same weight to all datasets in the averaging, the individual score maps are scaled between 0 and 1
(ScNorm

After determining and fixing the first ideal location [

Region of influence for a station arbitrarily placed at the North Pole (black star) as defined by each model (coloured lines) and by the averaged region of influence from the different models (shades of green to yellow).

In this approach, the regression described in Eq. (

In this section, the statistical predictability of the SIV anomaly is quantitatively evaluated by considering leading periods of
1 to 12 months. Also, the predictive performance of seven pan-Arctic predictors is tested.
The predictors are SIV itself, SIA, OHT, SIT, SIC, SST, and Drift. Here, we focus on the months with relatively large
(March; Sect.


Statistical predictability of the March SIV anomalies, estimated from 12 leading months and quantified by the RMSE (10

Statistical predictability of the September SIV anomalies, estimated from 12 leading months and quantified by the RMSE (10

A way of further improving the statistical predictability is to use several predictors at once.
Figure

Score maps (Sc[

A similar scenario compared to March is found for the September SIV anomaly predictability (Fig.

In this section, the empirical statistical model is used for supporting an optimal sampling strategy by following the methodology
described in Sect.

Here we assume that numerical models are able to reproduce the main physical processes behind
the interactions among predictand and predictors. In practice, we take into account four gridded predictors (SIT, SIC, SST, and Drift) and one pan-Arctic predictor (SIA), although we recall that only predictors significantly correlated with the predictand are
incorporated into the statistical model. As per the results of Sect.

For each of the six model realizations, score maps (Sc[

The RMSEs (and associated SD from the Monte Carlo scheme) calculated between the original SIV anomalies and the SIV anomalies reconstructed by the
SEM, using predictor variables from the first optimal location (black stars in Fig.

Mean RMSEs (and associated SDs, error) from the 500 Monte Carlo realizations calculated between the original SIV anomalies and the SIV anomalies reconstructed by the SEM. We recall that in each Monte Carlo realization 70 % of the data are randomly used for training the SEM, while 30 % are used for calculating the error. The middle column shows the values for the case where the predictors are extracted from the individual optimal locations, while the right column shows the values found with predictors from the common optimal location.
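The Monte Carlo scheme recalled in this caption can be sketched as follows: in each realization, a multiple linear regression is fitted on a random 70 % of the samples and the RMSE is computed on the remaining 30 %. The function name `monte_carlo_rmse`, the inclusion of an intercept term, and the use of ordinary least squares are assumptions made for this illustration.

```python
import numpy as np


def monte_carlo_rmse(predictors, predictand, n_repeats=500,
                     train_fraction=0.7, seed=0):
    """Mean and SD of the out-of-sample RMSE over random train/test splits.

    Assumptions (illustrative): ordinary least squares with an intercept,
    and splits drawn without replacement in every repetition.
    """
    x = np.asarray(predictors, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # a single predictor becomes one column
    y = np.asarray(predictand, dtype=float)
    n = y.size
    if x.shape[0] != n:
        raise ValueError("predictors and predictand must have the same length")

    design = np.column_stack([np.ones(n), x])
    n_train = int(round(train_fraction * n))
    rng = np.random.default_rng(seed)

    rmses = np.empty(n_repeats)
    for i in range(n_repeats):
        order = rng.permutation(n)
        train, test = order[:n_train], order[n_train:]
        # Fit on the training subset only.
        coeffs, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
        # Evaluate on the samples that were held out.
        residual = y[test] - design[test] @ coeffs
        rmses[i] = np.sqrt(np.mean(residual ** 2))
    return rmses.mean(), rmses.std()
```

The two returned values correspond to the mean RMSE and its associated SD as reported in the table; fixing `seed` only serves to make the illustration reproducible.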

Once the primary common optimal site has been identified and accepted for all datasets, we search for the second best location.
For that, the neighbouring grid points that fall into the region of influence of the first best site are not considered as a second option.
Figure


Optimal observing framework, as suggested by the ensemble of model outputs, for sampling predictor variables in order to statistically reconstruct and/or predict the pan-Arctic SIV anomaly. The numbers indicate the first up to the 10th best observing locations in respective order. The hatched area around each location (same colour code) represents their respective region of influence. The selection of points respects the hierarchy of the regions of influence in a way that the second point cannot be placed within the region of influence no. 1 (shades of red), the third point cannot be placed within the regions of influence nos. 1 and 2 (shades of red and purple), and so on.
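The hierarchical selection described in this caption can be illustrated with a simple greedy loop: pick the best-scoring grid point, mask its region of influence, and repeat. The sketch below is hypothetical in several respects. The callables `score_map_fn` and `influence_fn`, the convention that a higher score is better, and the possibility of recomputing the score map after each selection are all assumptions, since the exact definitions of the score and of the region of influence are not reproduced here.

```python
import numpy as np


def select_sites(score_map_fn, influence_fn, n_sites):
    """Greedy site selection that respects regions of influence.

    score_map_fn(selected) -> 2-D array of scores given the sites already
        chosen (hypothetical interface; higher is assumed to be better).
    influence_fn(index) -> 2-D boolean mask of the region of influence of
        the grid point `index` (hypothetical interface).
    """
    selected = []
    excluded = None
    for _ in range(n_sites):
        scores = np.asarray(score_map_fn(list(selected)), dtype=float)
        if excluded is None:
            excluded = np.zeros(scores.shape, dtype=bool)

        # Points inside earlier regions of influence, or without a valid
        # score (e.g. land), are not eligible.
        candidates = np.where(excluded | ~np.isfinite(scores), -np.inf, scores)
        if not np.isfinite(candidates).any():
            break  # nothing left to choose from

        flat = int(np.argmax(candidates))
        index = tuple(int(i) for i in np.unravel_index(flat, candidates.shape))
        selected.append(index)
        excluded |= np.asarray(influence_fn(index), dtype=bool)
    return selected
```

A static score map can be used by passing a function that ignores its argument; the same loop also covers the case where a platform cannot be deployed at a recommended site, because the regions of influence still prevent two nearby, largely redundant locations from being chosen.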

Optimal observing framework for sampling predictor variables in order to statistically reconstruct and/or predict the pan-Arctic SIV anomaly.
The numbers indicate the first up to the 10th optimal sites. Each of the coloured areas represents an Arctic subregion according to the
Arctic subdivision suggested by the National Snow and Ice Data Center (NSIDC). The black line indicates the global region of influence
defined in Fig.

Geographical coordinates for the first 10 optimal sampling locations (second and third columns). The fourth column shows the subregion
in which each of the points is placed (see Fig.

Lag-0 comparison between the original (black) and statistically reconstructed SIV anomalies. The reconstruction takes into account the first (red),
the first three (first–third; green), and the first six (first–sixth; blue) optimal locations:

Once the ideal sampling locations are established, these sites are used to effectively reconstruct the entire time series of SIV anomalies from the
six model outputs, by taking into account only the valid predictors from each location.
We will make use of the RMSE to evaluate how good our statistical prediction is in terms of absolute values as in the previous sections, but
here we are also interested in inspecting the ability of the empirical model to
reproduce the full variability of the SIV anomalies. For that, apart from the RMSE, we also calculate the coefficient of determination (


In terms of used predictor variables, Fig.

To evaluate the performance and robustness of our SEM, the RMSE and

Root-mean-squared error (RMSE; left column) and coefficient of determination
(

In this work, we have introduced a statistical empirical model for predicting the Arctic SIV anomaly
on the interannual timescale. The model was built and tested
with data from three AOGCMs (AWI-CM, ECMWF-IFS, and HadGEM3-GC3.1), each of which provided two horizontal resolutions, for a
total of six datasets. We first inspected the predictive skill of seven different pan-Arctic predictors, namely SIV, SIA, OHT, SIT,
SIC, SST, and Drift. These predictors were tested since they have a dynamical and/or thermodynamical influence on the SIV.
The first three predictors are intrinsically represented by single time series, while the remaining predictors are gridded variables that
were reduced to mean pan-Arctic time series. From this first assessment, performed for the months of March and September, the results
(Sect.

In contrast, OHT provided very poor predictive skill.

We now recall and objectively answer the first open question posed in the introduction of this paper.

We believe that this work positively impacts three aspects of a real-world observing system. First, it provides recommendations for optimal sampling locations. We are confident that our multi-model approach provides a solid view of the sites that best represent the variability of the pan-Arctic SIV. Second, even if those regions are not taken into account for any reason (for instance logistics, environmental harshness, strategic sampling, etc.), observationalists could still take advantage of the “region of influence” concept. By doing so, they avoid deploying two or more observational platforms that would provide relatively similar information in terms of pan-Arctic SIV variability. Third, considering that observational platforms are already operational, our SEM could be trained with model outputs (from the same or other state-of-the-art AOGCMs) and then be fed with observational data to project future pan-Arctic SIV variability. Within this context, we expect that this paper will provide recommendations for the ongoing and upcoming initiatives towards an Arctic optimal observing design.

Despite these promising results, we recognize that it might be harder to achieve skilful predictions in the real world employing statistical tools because the actual SIV variability is likely noisier than the one described by AOGCM outputs. While model results provide an average representation of variables inside a grid cell, real-world observations would be much more heterogeneous. This issue is even more pronounced when looking at our main predictor (SIT) due to the inherent roughness and short-scale spatial heterogeneity of the real-world SIT. As a consequence, this heterogeneity may be a source of uncertainty in a real observing system, and more observations would be required for effectively predicting the SIV anomaly. Some caution should also be exercised since our findings might be slightly different for other AOGCMs. A promising way of addressing this issue is to reapply the methodology developed in this paper using all models that will be made available through CMIP6. Finally, with the ongoing sea ice depletion, some of the suggested optimal sampling locations might become ice free in the future.

Finally, it is worth mentioning the recent effort from the scientific community to enhance the Arctic observational system.
This effort takes place through recent observational programmes such as the Year of Polar Prediction (YOPP)

All codes for computing and plotting the results of this article are written in the Python programming language and are available upon request.

All model outputs used in this study were made available through the PRIMAVERA project (

LP, FM, and TF designed the science plan. DD computed the pan-Arctic sea ice area and volume. LP developed the statistical empirical model, conducted the data processing, produced the figures, analysed the results, and wrote the manuscript based on the insights from all co-authors.

The authors declare that they have no conflict of interest.

Leandro Ponsoni was funded by the APPLICATE project until September 2019 and is now a postdoctoral researcher of the Fonds de la Recherche Scientifique (FNRS). François Massonnet is an FNRS research associate. David Docquier was funded by the EU Horizon 2020 PRIMAVERA project until September 2019 and is currently funded by the EU Horizon 2020 OSeaIce project, under the Marie Skłodowska-Curie grant agreement no. 834493. Guillian Van Achter is funded by the PARAMOUR project, which is supported by the Excellence Of Science programme (EOS), also funded by FNRS. We thank the two anonymous reviewers and the editor, Petra Heil, for their constructive suggestions and criticism.

The work presented in this paper has received funding from the European Union's Horizon 2020 Research and Innovation programme under grant agreement no. 727862 (APPLICATE project – Advanced prediction in Polar regions and beyond) and no. 641727 (PRIMAVERA project – PRocess-based climate sIMulation: AdVances in high-resolution modelling and European climate Risk Assessment).

This paper was edited by Petra Heil and reviewed by two anonymous referees.