On the timescales and length scales of the Arctic sea ice thickness anomalies: a study based on 14 reanalyses

The ocean–sea ice reanalyses are one of the main sources of Arctic sea ice thickness data both in terms of spatial and temporal resolution, since observations are still sparse in time and space. In this work, we first aim at comparing how the sea ice thickness from an ensemble of 14 reanalyses compares with different sources of observations, such as moored upward-looking sonars, submarines, airbornes, satellites, and ice boreholes. Second, based on the same reanalyses, we intend to characterize the timescales (persistence) and length scales of sea ice thickness anomalies. We investigate whether data assimilation of sea ice concentration by the reanalyses impacts the realism of sea ice thickness as well as its respective timescales and length scales. The results suggest that reanalyses with sea ice data assimilation do not necessarily perform better in terms of sea ice thickness compared with the reanalyses which do not assimilate sea ice concentration. However, data assimilation has a clear impact on the timescales and length scales: reanalyses built with sea ice data assimilation present shorter timescales and length scales. The mean timescales and length scales for reanalyses with data assimilation vary from 2.5 to 5.0 months and 337.0 to 732.5 km, respectively, while reanalyses with no data assimilation are characterized by values from 4.9 to 7.8 months and 846.7 to 935.7 km, respectively.


Introduction
The variability of the Arctic sea ice has received increasing attention from the scientific community over the past years (e.g., Chevallier and Salas-Mélia, 2012;Stroeve et al., 2014;Blanchard-Wrigglesworth and Bitz, 2014;Guemas et al., 2016). The main reason lies in the fact that Arctic sea ice plays a key role in the Earth's climate system (Budyko, 1969;Manabe and Stouffer, 1980b;Maykut, 1982). Among other contributions, it has been suggested that a decline of the Arctic sea ice extent and volume leads to a weakening of the Atlantic Meridional Overturning Circulation (Sévellec et al., 2017) and, therefore, potentially impacts the global distribution of heat (Drijfhout, 2015;Hansen et al., 2016). At the same time, the Arctic is one of the most sensitive regions to climate changes due to a phenomenon known as Arctic amplification (Manabe and Stouffer, 1980a;Holland and Bitz, 2003;Serreze et al., 2009). For instance, the current observed warming in the Arctic is reported to be nearly twice as large as other regions of the globe (Anisimov et al., 2007).
Other multiple specific interests from different stakeholders have reinforced the importance of sea ice projections, both at regional and larger scales, which include shorter shipping lanes (Lindstad et al., 2016), travel and tourism industry (Handorf, 2011), hunting and fishing activities (Nuttall et al., 2005), mineral resource extraction (Gleick, 1989), potential impact on the weather at midlatitudes (Walsh, 2014), environmental hazards (Nelson et al., 2002), and loss of weather predictive power by indigenous communities (Krupnik and Jolly, 2002). In this context, the sea ice thickness (SIT) is likely the most relevant state variable for monitoring, forecasting, and understanding recent and future changes of Arctic sea ice, first, because this parameter provides predictive information for the sea ice extent anomalies (Lindsay et al., 2008;Holland et al., 2011) and, second, due to the fact that SIT anomalies persist longer than sea ice extent anomalies, the former being reported as a forcing of the latter (Blanchard-Wrigglesworth et al., 2011).
However, direct observations of SIT and/or related parameters, namely draft and freeboard, are still sparse in time and space, despite the continuous efforts for compiling former and recent datasets from a range of sources (Lindsay, 2010;Lindsay and Schweiger, 2015). Some recent observational programs, such as the Year Of Polar Prediction (YOPP) (Jung et al., 2016) and the MOSAiC International Arctic Drift Expedition (https://www.mosaicobservatory.org/; last access: 15 July 2018), aim to enhance the Arctic observational system, being especially useful for improving our future modeling and forecasting skills.
Due to this lack of direct measurements in the past and present day, the ocean-ice reanalyses deserve special attention. A reanalysis product consists of models' outputs, which are generated over a certain time span by the same model, configurations, and procedures, and so are distributed onto regular grids, evenly stepped in time. These products are often built with assimilation of observational dataset(s) in order to improve the estimate of a certain parameter. For instance, SIT is often estimated by assimilating atmospheric, oceanic and, eventually, sea ice concentration data. The ocean-ice reanalyses are likely the main and more robust source of SIT data in terms of spatiotemporal resolution, being also broadly used for initialization and assimilation in other climate models (e.g., Guemas et al., 2016). Additionally, long-term reanalyses are crucial for understanding the past Arctic sea ice characteristics, in a period when in situ observations of ice parameters were inexistent.
In this work we make use of 14 state-of-the-art reanalyses in order to study two important aspects of the SIT predictability: the timescale (or persistence) and the length scale of SIT anomalies (see Sect. 2.3). Their importance is reinforced by the fact that the predictability of the SIT field depends on how long the anomalies persist over time and on how these anomalies spread in space. Notice that, hereafter, besides the traditional definition of time predictability, we adopt this term also for the spatial scale. In addition, timescales and length scales may also be useful for designing an optimal observation system when selecting ideal locations for deploying instruments as well as for defining a frequency sampling strategy (Blanchard-Wrigglesworth and Bitz, 2014). Blanchard-Wrigglesworth and Bitz (2014) reported SIT anomalies with typical timescales and length scales of about 6-20 months and 500-1000 km, respectively. These results reinforce the fact that the SIT anomaly persists longer compared to the sea ice area anomaly, which is reported with a timescale of 2-5 months (Blanchard-Wrigglesworth et al., 2011;Day et al., 2014). Blanchard-Wrigglesworth and Bitz (2014) suggested that the SIT anomalies from models characterized by a thinner mean ice state tend to present shorter persistence but larger spatial scales. Blanchard-Wrigglesworth et al. (2011) reported a decline in the timescale of sea ice volume anomalies, as a result of the ice thinning induced by recent climate changes.
The first aim of this study is to evaluate the performance of different reanalysis products regarding their SIT realism by comparing these reanalyses against observational datasets. A point of main interest is to identify whether or not the assimilation of sea ice concentration by the reanalyses improves the representation of SIT. Second, we seek to characterize the timescales and length scales of the Arctic SIT anomalies. Again, we verify whether or not sea ice data assimilation plays a role in the temporal and spatial scales of SIT anomalies. Furthermore, we investigate the long-term evolution of timescales and length scales, as well as the relationship between these two parameters.
The paper is organized as follows: Sect. 2 introduces the reanalysis products, the observational datasets, and the respective methods applied in this research; Sect. 3 compiles all results, including the comparison between observations and reanalyses (Sect. 3.1), the comparison of reanalyses themselves (Sect. 3.2), and the patterns of timescales (Sect. 3.3) and length (Sect. 3.4) scales. Lastly, Sect. 4 draws discussion and conclusions on the findings reported in the previous sections.

Sea ice reanalyses
Monthly fields of SIT from 14 state-of-the-art oceanice reanalyses are used in this work. All but one were compiled in the context of the ORA-IP project Chevallier et al., 2017;Uotila et al., 2018). The ORA-IP reanalyses (and their respective author/provider institution) are C-GLORS05 (CMCC; Storto et al., 2014), ECCO-v4 (JPL/NASA, MIT, AER;, ECDA (GFDL/NOAA; Zhang et al., 2013;Chang et al., 2013) Zuo et al., 2015;Tietsche et al., 2017), TOPAZ4 (ARC MFC; Sakov et al., 2012;Xie et al., 2017) and UR025-4 (University of Reading; Valdivieso et al., 2014). The 14th reanalysis is the Pan-Arctic Ice-Ocean Modeling and Assimilation System, PIOMAS (Zhang and Rothrock, 2003). For the abbreviations, the reader is referred to Appendix A. The original horizontal grids present different resolutions (Table 1), but for comparison, all reanalyses are interpolated onto a common grid of 1 • × 1 • spatial resolution following Chevallier et al. (2017). Tick labels are placed at the first day of the respective year. Reanalyses labeled in blue and red highlight whether the datasets were built with or without sea ice data assimilation, respectively. Specific characteristics of each reanalysis regarding horizontal resolution, ocean-sea ice model, atmospheric forcing data, subgrid-scale ice thickness distribution, ice dynamics (VP, viscous-plastic, or EVP, elastic-viscous-plastic), parameters for the ice strength formulation, air-ice drag coefficient, ocean-ice drag coefficient, and the presence (and respective method) of ice data assimilation are summarized in Table 1. For additional information the reader is referred to Chevallier et al. (2017) and/or to the respective providers.

Observational references
We use a compilation of 16 observational datasets available in the Unified Sea Ice Thickness Climate Data Record (Sea Ice CDR; Lindsay, 2010; http://psc.apl.uw.edu/sea_ice_cdr; last access: 15 July 2018). The Sea Ice CDR is a concerted effort to bring together a range of datasets in a consistent format but is originally sampled by different methods and spatiotemporal scales as well as being stored in a variety of formats. We use the post-processed version of the Sea Ice CDR data, which is distributed by monthly mean for moored upward-looking sonar (ULS) point measurements or 50 km averages for submarine, airborne, and satellite observations. If applicable, the Sea Ice CDR already provides the files corrected for data biases (e.g., Rothrock and Wensnahan, 2007b).

Methods
Reanalyses are compared against observations by selecting SIT values from the nearest grid points to the respective observational sites, during the same respective months. Complementary metrics are employed to evaluate the relationship between observations and reanalyses. When directly comparing SIT from reanalyses and observations, we estimate the root mean square error (RMSE), the correlation coefficient (R), and the mean residual sum of squares (MRSS) from the linear fit between both datasets, by having the reanalysis values as predictors and the observational values as predicted variables. Since SIT and draft are different variables, here we evaluate the strength of the linear relationship between them. Thus, when comparing SIT from reanalyses against draft from observations, we only estimate R and MRSS. In this work we do not account for snow variation in order to avoid adding uncertainties to the SIT fields.
SIT anomalies are derived by eliminating the trend and the seasonal cycle present in the time series. To do so, the trend is estimated separately for every month by means of a second-order polynomial fit and subtracted from the respective month. A second-order fit seems to better reproduce the trends when compared to the linear fit, although the results are very similar (not shown). The same method is applied for the analyses conducted with the pan-Arctic sea ice volume anomaly, derived from the SIT data, as illustrated in Fig. 1.
Grid point comparisons of SIT anomalies among all reanalyses are performed by means of RMSE and R maps, calculated over an overlapping period of 15 years, from January 1993 to December 2007. This time span corresponds to the period during which data are available from all reanalyses. Furthermore, as adopted by Blanchard-Wrigglesworth and Bitz (2014), only grid points wherein the mean ice thickness at the time of summer minimum is greater than 0.1 m are taken into account. This condition is valid for all reanalysisbased results, unless otherwise stated. The timescale (or persistence) is derived from individual time series by calculating the lagged autocorrelation stepped forward by one measurement, equivalent to 1 month. The efolding reference is used so that the persistence is assumed to be the time when the lagged autocorrelation curve crosses the 1/e (∼ 0.3679) value, as proposed in previous works (e.g., Blanchard-Wrigglesworth and Bitz, 2014;Guemas et al., 2016). As an example, Fig. 2 displays the timescale derived from the mooring-based draft anomaly sampled in the framework of the BGEP at 150 • W, 75 • N, from August 2003 to August 2013 (Krishfield et al., 2013). Figure 2 also shows the timescales from the allocated reanalysis-based SIT anomalies. For this geographical location and time span, the draft anomaly from BGEP persists for about 3.7 months, while the SIT anomalies from the different reanalyses persist from 2.4 to 8 months. The persistence is estimated both from a regional and pan-Arctic perspective. First, it is calculated at each grid point, for all SIT anomaly time series. Second, it is estimated for the long-term (GECCO2 and MOVE-CORE) pan-Arctic ice volume anomalies. For the latter case, we evaluate how stable the e-folding timescale is over time by applying a moving (stepped by 1 month) and length-variable window (from 5 to 59 years). Here, we also investigate whether the moving timescale is marked by significant band(s) of variability. To do so, we applied wavelet analysis as proposed by Torrence and Compo (1998). The length scales of the SIT anomalies are estimated for the reanalysis datasets. The first step is to determine one-point correlation maps. In other words, we calculate the cross-correlation between the SIT anomaly from each grid point with the anomaly from all other points. Subsequently, we make use of the e-folding reference and, for every map, we select all grid cells with a correlation coefficient higher than 1/e. The radius of a circle that yields the area covered by these selected cells is defined as the length scale of the SIT anomaly. This methodology is detailed and graphically presented by Blanchard-Wrigglesworth and Bitz (2014). Figure 3a shows an example for which the length scale is calculated for the SIT anomalies from CryoSat seasonal data: spring (March-April) and autumn (October-November), from autumn 2010 to spring 2017. In turn, Fig. 3b-c reveal that a similar length scale pattern is also present in PIOMAS. It is worth mentioning that this illustrative example allows a first assessment of how length scales from observations and reanalyses compare to each other. However, it can not be compared to the spatial scales of monthly anomalies further studied in Sect. 3.4.

Comparison of reanalyses with observations
The scatter plots shown in Fig. 4 combine SIT from each reanalysis and the observational datasets from all sources. The latter are separated into two parameters: draft (black dots) and SIT (green dots). The comparisons indicate that all reanalyses are significantly correlated to the observations, whether these are draft or SIT. By comparing SIT and draft, four reanalyses have correlation coefficients larger than 0. For a detailed overview on how each reanalysis is linked to each observational dataset, in terms of RMSE, MRSS, and R, the reader is referred to the tables presented in Appendix B.

Comparison of reanalyses to each other
As a first assessment of how well the reanalyses compare to each other, we estimate the RMSE and R between time series of the SIT anomaly, at every grid point, and between all pairs of products. The results are organized as a square matrix in Fig. 5, in which the number at the top of each panel represents the respective global value estimated by considering the data from all grid points. The lower triangular part of the matrix reveals that the smallest RMSE is found for Figure 4. Comparison between sea ice thickness from reanalyses and sea ice thickness (green points) or draft (black points) from observational datasets. The lines represent the linear fits having the reanalysis as the predictor and the observations as predicted variables. The mean residual sum of squares (MRSS) from the fit, the correlation coefficient (R), and the root mean square error (RMSE) are also displayed for each comparison. RMSE is calculated only when comparing SIT from both sources (green) but not when comparing SIT and draft (black). Reanalyses labeled in blue and red highlight whether the datasets were built with or without sea ice data assimilation, respectively. the pair ECDA-UR025-4 (RMSE = 0.21 m). Only four other pairs present RMSE ≤ 0.25; they are the match between the two GloSea5 products (0.23 m) and the combination of UR025-4 with C-GLORS05 and ECCO-v4 (0.25 m). The largest RMSE is found when comparing GECCO2-MOVE-G2 (0.61 m).

Timescales
An important property inherent to time series in general concerns their timescale and/or persistence, as defined in Sect. 2.3. In other words, we aim to infer how long the SIT anomaly maintains a good correlation with future measurements at the same grid cell. Persistence can also be perceived as the skill of a self-prediction scheme, for which past data are used to predict future values. In addition, it is a relevant variable to be taken into account when designing the sampling frequency of observational programs, especially if these programs target the understanding of the SIT time variability. Figure 6 displays the e-folding timescales for the SIT anomaly at every grid point, and for all reanalyses. The area weighted mean (AWM) timescales (in months) sorted in ascending order are 2.5 (GloSea5), 2.6 (GloSea5-GO5), 3.6 (PIOMAS) 3.7 (ECCO-v4), 3.8 (MERRA-Ocean), 4.0 (UR025-4), 4.3 (TOPAZ4), 4.4 (C-GLORS05), 4.7 (ORAP5), 4.9 (MOVE-CORE), 5.0 (G2V3), 6.0 (ECDA), 7.2 (MOVE-G2), and 7.8 months (GECCO2). These values were Figure 6. The e-folding timescales (or persistence) estimated for the SIT time series. Only grid cells in which the time mean (for the January 1993-December 2007 period) SIT at the time of summer minimum is greater than 0.1 m are taken into account for the computations. Averages for the systems with ice data assimilation and no data assimilation are represented by the DA and NA panels, respectively. All maps have the 0 • longitude placed at 06:00, while the bounding latitude is 67 • N. Reanalyses labeled in blue and red highlight whether the datasets were built with or without ice data assimilation, respectively. calculated only taking into account grid points with a valid SIT value from all reanalyses.
The results reveal that the thickness anomalies from reanalyses with no ice data assimilation (NA; Fig. 6, red labels) present a longer persistence, mainly distinguished in MOVE-G2 and GECCO2. Potential reasons to explain why the thickness anomalies persist longer in NA systems are suggested and discussed in Sect. 4. In contrast, the thickness anomalies from the GloSea5 systems (GloSea5 and GloSea5-GO5) have a much shorter persistence. From a regional point of view, Fig. 6 shows that GloSea5 and GloSea5-G05 are the only reanalyses in which the SIT anomaly persistence is remarkably short all over the Arctic, presenting e-folding timescales higher than 4 months only in a few, not evenly distributed, grid points. By contrast, the SIT from GECCO2 has a marked longer persistence (> 15 months) extending from the region off the northern coast of Greenland to the north of the Canadian Archipelago and mid-Arctic Ocean. The ECDA product presents a relatively similar pattern of the timescale over the region mentioned above but persisting for a shorter period (∼ 8 months). SIT anomalies from MOVE-G2 also indicate long persistence off the coast of northern Greenland, extending to the central Arctic and East Siberian Sea. For the remaining reanalyses, there is no common regional pattern of persistence outstanding from their respective timescale maps.
Nevertheless, the results above should be interpreted with caution. The e-folding timescale is a metric that depends on the shape of the lagged autocorrelation curve, which in turn may differ according to the period and time span of the original time series being analyzed. In order to evaluate how stable the timescale is by varying the time span and also by allowing it to evolve over time, we applied a time-moving and length-variable window to calculate the e-folding timescale of the ice volume anomaly (detrended in the same way as the SIT time series) from the two longest reanalyses (GECCO2 and MOVE-CORE), as shown in Fig. 7a and e. The window length varies from 5 to 59 years (stepped by 1 year) and it moves in time, stepped forward by 1 month. Here, we use the ice volume anomaly, rather than the SIT anomaly, for two reasons, first, because it is computationally more efficient than calculating the timescale for the SIT anomalies at every grid cell, considering the large number of interactions for a time-moving and length-variable window, and second, because the volume provides a pan-Arctic perspective of the SIT persistence.
Notice in Fig. 7a and e that the persistence overall grows to longer than ∼ 20 months when taking into account long time spans, remarkably for GECCO2 in which the ice volume anomaly persists for longer than 25 months at several center times. As for the thickness anomalies, MOVE-CORE presents a shorter persistence compared to GECCO2.
As a measure of stability, we estimate the standard deviations for all computations displayed in Fig. 7a and e. Results show that MOVE-CORE has a more stable timescale, with standard deviation of 3.0 months from its mean (9.7 months), while GECCO2 presents the average and standard deviation of 15.0 ± 6.5 months. Figure 7b and f show the case where the window length is 15 years, as it is for the overlapping period January 1993-December 2007. For this case, the average (standard deviation) timescales for GECCO2 and MOVE-CORE are 11.4 ± 2.6 and 9.1 ± 2.5 months, respectively. Minimum to maximum ranges are 6.2-16.5 months for GECCO2 and 4.9-13.5 months for MOVE-CORE. If we take into account the same center time of the time span January 1993-December 2007, that is mid-June 2000, the ice volume anomaly persistence is 13.6 and 9.2 months (red stars in Fig. 7b and f, respectively). Note that the timescales of the ice volume anomalies are a few months longer compared to the persistence of the AWM thickness anomalies (9.2 months, GECCO2; 5.4 months, MOVE-CORE). Figure 8. The e-folding length scales estimated for the SIT time series. Only grid cells in which the time-mean (for the January 1993-December 2007 period) SIT at the time of summer minimum is greater than 0.1 m are taken into account for the computations. Averages for the systems with ice data assimilation and no data assimilation are represented by the DA and NA panels, respectively. All maps have the 0 • longitude placed at 06:00, while the bounding latitude is 67 • N. Reanalyses labeled in blue and red highlight whether the datasets were built with or without ice data assimilation, respectively.
We make use of wavelet analysis (Torrence and Compo, 1998) to evaluate whether the time series displayed in Fig. 7b and f exhibit a significant band(s) of variability. Figure 7c and d reveal that the ice volume anomaly from GECCO2 presents two bands of significant variability, as highlighted by the horizontal gray bars in Fig. 7d. The first spans from 4.4 to 6.1 years, and it is present in the first half of the time series but does not persist over time (black contours in Fig. 7c). The second is marked by periods longer than 10.7 years, which seems to be recurrent over time but should be interpreted with caution since it is placed near the "cone of influence", in which edge effects become important, as indicated by crosshatched areas overlapping the black contours in Fig. 7c. The ice volume anomaly from MOVE-CORE, in turn, is marked by a single band of significant variability, with periods longer than 12.7 years (Fig. 7g, h). Again, this band should be interpreted with caution since it is also placed near the cone of influence.

Length scales
The e-folding length scale is a metric used for indicating how well a variable from a certain grid cell compares to the neighboring cells. In other words, it shows how the anomalies spread in space. As for the timescale, the length scale is a promising parameter to be explored when designing observational systems, but in terms of spatial coverage of instru- ments. Simplistically, regions marked with high length scales would require fewer instruments to be monitored better. Figure 8 shows the length scales for the SIT anomaly at every grid point. The AWM length scales, in kilometers and ascending order, for each system are 337.0 (GloSea5), 420.7 A similar pattern to the timescale is observed here, with GloSea5 and GloSea5-GO5 presenting the minimum length scales, rarely higher than 500 km, while the reanalyses without sea ice data assimilation are characterized by higher length scales, sometimes higher than 1200 km. In all systems the length scales are relatively longer near the central Arctic. This suggests that higher length scales could be somehow associated with thicker ice. The relationships between mean ice thickness, timescale, and length scale will be explored in detail in Sect. 4.
The stability of the length scale over time ( Fig. 9) was tested by means of a moving window with 15 years length, as follows: first, we calculate the one-point correlation maps for every grid point; second, we estimate the length scale for each one-point correlation map; third, the AWM length scale was calculated taking into account only grid points with a valid SIT value from all reanalyses; fourth, the process was repeated by stepping forward the 15-year window by 12 months. It is worthwhile mentioning that, computationally, it is much more expensive to calculate the length scale than the timescale. This is the reason why, here, we just use a time-moving but not length-variable window. The results suggest that the length scale is relatively more stable than the timescale (Fig. 7b and f), as further discussed in Sect. 4.

Discussion and conclusions
The first aim of this study was to evaluate how the SIT from the reanalyses compares against observational datasets, either draft or SIT. We have used three different metrics to perform this comparison: the correlation coefficient (R), as a measure of the linear correlation between datasets; the mean residual sum of squares (MRSS), as an indicator of whether reanalysis values are good predictors for the observations; and the root mean square error (RMSE), which directly compares how the SIT from the reanalyses approaches the SIT from observations. The results show that some of the reanalyses have a relatively good correspondence either comparing SIT and draft or SIT from both sources of data. This is the case, for instance, for the TOPAZ4 product. A direct comparison between SIT from all reanalyses and observations indicates RMSEs ranging from 0.7 to 1.1 m. PIOMAS has the best agreement with the observational datasets. A particular case is GECCO2, which presents a relatively small RMSE and a good correlation with the SIT observational datasets, as well as a linear relationship. However, this same product is weakly correlated with the draft observational datasets as well as having poor predictive skill.
One of our main goals in performing such a comparison was to identify whether or not systems built with assimilation of sea ice concentration data are closer to observations, compared to the products built with no sea ice data assimilation. The results suggest that reanalyses with sea ice data assimilation do not necessarily perform better. One could speculate that some reanalyses do not reflect the covariances between sea ice concentration and SIT well.
We have compared the mean state (mean SIV) and respective variability (SD SIV) of all reanalyses against the specifications and parameters displayed in Table 1. Nevertheless, for such a comparison, where each system has its own configuration with several varying parameters, we were not able to distinguish the effect that the selected parameters may have on the mean state and variability. The comparison among SIT from the different reanalyses (Sect. 3.2) is not straightforward and does not necessarily improve due to common specifications and key parameters from the two systems being compared. For instance, the pair C-GLORS05-G2V3 shares a set of common assumptions (ocean-sea ice model, atmospheric forcing, vertical discretization, number of ice thickness categories, EVP dynamics, ocean-ice drag coefficient, analysis window, and both assimilate sea ice data), but this pair still presents a relatively high RMSE (0.39 m) and not such a strong correlation (R = 0.4), as shown in Fig. 5. Only a few different assumptions and parameters, as well as their nonlinear interactions, may result in systems with considerably distinct mean state and variability. Although the pair C-GLORS05-G2V3 shares several common aspects, these two systems assume different air-ice drag coefficient and also assimilate the sea ice data in a different way, for instance. The same statement could be applied to other pairs of systems, Figure 10. Time-mean sea ice volume vs. the mean RMSE. This last parameter is an average of the RMSEs that each reanalysis has when its SIT field is compared individually to the other 13 reanalyses, as shown in Fig. 5. Shades of blue and purple indicate the reanalyses which do assimilate sea ice data, while shades of red indicate the reanalyses without sea ice data assimilation. e.g., G2V3-ORAP5, which also share some similarities but are still distinct in terms of mean state and variability.
The pair with the smallest RMSE (Fig. 5), ECDA-UR025-4, at the same time has a weak linear relationship (R = 0.21). This reinforces the importance of looking at different metrics when comparing different products. If we average the RMSEs that one specific reanalysis presents against all the others (Fig. 5), thus comparing with the pan-Arctic mean ice volume of this same reanalysis, it becomes clear that products with relatively low sea ice volume (i.e., thin ice) present small RMSEs when compared with their counterparts (Fig. 10), although MERRA-Ocean is an outlier in this pattern by presenting thin sea ice but a large RMSE compared to the other reanalyses (Fig. 10, left upper corner). Figure 10 helps to explain why ECDA and UR025-4 have a small RMSE, though their anomalies are marked by a relatively weak correlation, as suggested in Fig. 5. This is also evident in the respective sea ice volume anomalies from these two reanalyses shown in Fig. 1. In the same way, Fig. 5 also indicates that the large differences in the SIT field take place near the coast of northern Greenland and the Canadian Archipelago, which are regions marked by the thickest sea ice over the studied domain.
Another main goal of this work was to characterize the timescales and length scales of the sea ice thickness anomaly as well as to report whether these parameters are influenced by the fact that a respective reanalysis assimilates sea ice data or not. In this case, sea ice data assimilation plays a clear role in the scales referred to: systems with sea ice data assimila- Figure 11. Grid-point differences (G2V3-G2V1) of timescale (a) and length scale (b) between two versions of the GLORYS system: G2V3 which assimilates sea ice data and G2V1 which does not assimilate sea ice data. tion are characterized by shorter timescales and length scales compared to the systems which do not assimilate sea ice data. Nevertheless, a comparison between the same system but built with (G2V3) and without (G2V1; not included in the 14 reanalyses of the present study) assimilation of sea ice concentration data (Fig. 11) suggests that this finding is valid in terms of pan-Arctic averages but not necessarily at every grid cell. This may explain why in the specific location addressed in Fig. 2 the reanalyses with data assimilation showed relatively longer timescales compared to the reanalyses without data assimilation. The pan-Arctic AWM timescale and AWM length scale from G2V3 are 5 months and 728.2 km, respectively. Without sea ice data assimilation (G2V1), the AWM timescale and AWM length scale increase to 5.5 months and 745.3 km, respectively.
Likely, the main reason why the assimilation of sea ice concentration data impacts the timescales and length scales of the SIT field is linked to the fact that when a reanalysis assimilates sea ice information, the system is forced towards the assimilated conditions, different from what occurs with free-running models. Eventually, data assimilation introduces SIT increments that are not necessarily physical and so contributes to an attenuation in the correlation of this variable at a certain grid cell both in time, with their future estimations, and in space, with the neighboring grid points.
We have shown that timescales and length scales are clearly influenced by whether or not the reanalyses assimilate sea ice data, as represented graphically in Fig. 12a-b. However, are these two properties also clearly influenced by other specifications and parameters? Figure 12c-h show how timescales and length scales are linked to the choices of atmospheric forcing, sea ice model, and dynamics for ice-ice interactions that control ice deformation (VP or EVP). Even though the atmospheric forcing fields are reported to play a major role in the sea ice simulations (Gerdes and Köberle, 2007;Rothrock and Wensnahan, 2007a), we could not idenwww.the-cryosphere.net/13/521/2019/ The Cryosphere, 13, 521-543, 2019 Figure 12. Histograms showing how the AWM timescale (left-hand panels) and AWM length scale (right-hand panels) are related to different reanalysis specifications: (a-b) whether or not the system assimilates sea ice data, (c-d) the source of atmospheric forcing data, (e-f) the sea ice model used, and (g-h) the dynamics used (viscous-plastic or elastic-viscous-plastic) for ice-ice interactions that control ice deformation. Shades of blue and purple indicate the reanalyses which do assimilate sea ice data, while shades of red indicate the reanalyses without sea ice data assimilation.
tify distinguished patterns between the two main sources of atmospheric forcing used by the ensemble of reanalyses: ERA-Interim and NCEP/NCAR (Fig. 12c-d). Likewise, timescales and length scales are not clearly linked to the choices of sea ice model (Fig. 12e-f) and ice deformation dynamics (Fig. 12g-h), although a certain coherence in the timescales and length scales is observed for the systems that use the Louvain-la-Neuve sea ice model (LIM; Fig. 12e-f).
Besides the spread among the points, the scatter plots displayed in Fig. 13a-b indicate a certain correlation between the time-mean SIV (mean state) and the studied scales, where relatively thin ice leads to shorter scales, in agreement with Massonnet et al. (2018). In contrast, the timescales have a marked anti-correlation with the sea ice drift as shown in Fig. 13c: reanalyses with faster sea ice present a short timescale. Such correlation is less pronounced for the length scale (Fig. 13d). Different parameters from the reanalyses could potentially influence the sea ice velocity. For instance, high air-ice and low ocean-ice drag coefficients each contribute to faster ice velocities (Tandon et al., 2018). As an example, ECCO-v4 has the second highest air-ice (smaller only compared to MOVE-CORE) and the smallest ocean-ice (g) Relation between drag air-ice coefficient and mean state. (f) Relation between drag air-ice coefficient and mean sea ice drift. Black dashed lines indicate the linear fit, while the coefficient of correlation R (and its respective p value) is also displayed in each panel. The black cross in panels (e)-(h) indicate that MOVE-CORE was not used in the respective regressions (black dashed lines), since this reanalysis adopts a much higher drag air-ice coefficient compared to the other 13 reanalyses. Shades of blue and purple indicate the reanalyses which do assimilate sea ice data, while shades of red indicate the reanalyses without sea ice data assimilation. drag coefficients (see Table 1). This may explain why ECCO-v4 has relatively high ice velocities (Fig. 13c) and, therefore, low timescales and length scales (Figs. 6 and 8). For our ensemble of reanalyses, Fig. 13e shows a close correlation between sea ice velocity and the drag air-sea coefficients. Again, this correlation is less pronounced for the length scale (Fig. 13f).
The ice strength formulation is a major player in the sea ice velocity (Ungermann et al., 2017). All reanalyses follow the linear parameterization proposed by Hibler (1979), except the GloSea5 products and PIOMAS, which employ the ice strength formulation following Rothrock (1975). A higher ice strength parameter P * in Hibler's formulation leads to thicker and slower-moving ice, which would potentially lead to larger scales. Nevertheless, a relation between P * and the scales is not so clear for the ensemble of reanalyses (not shown). In addition, Ungermann et al. (2017) presented a detailed study comparing Hibler's and Rothrock's methods and showed that, for systems characterized by relatively thinner ice, model simulations with Rothrock's formulation result in Figure 14. Scatter plot of the average weighted mean (AWM) timescale vs. the AWM length scale. The black dashed line indicates the linear regression between both parameters, while the coefficient of correlation (R) is also displayed in the plot. The gray rectangle compares the two GLORYS systems: G2V1, built without sea ice data assimilation, and G2V3, built with sea ice data assimilation lower ice strength, and therefore faster ice velocities, compared to Hibler's formulation. We do not have a clear understanding of why the GloSea5 products present such short timescales and length scales; however, we could speculate that the combination between relatively thin ice (Fig. 13a) and the use of Rothrock's ice strength formulation could support this result.
The discussion above indicates that timescales and length scales are mainly driven by the fact of whether or not the reanalyses assimilate sea ice data, but that they are also influenced by the air-ice drag coefficient and sea ice drift. Figure 14 shows a strong correlation between both timescales and length scales, whereby long timescales are associated with large length scales. Notice in this plot the difference in the timescales and length scales from G2V1 and G2V3 systems, highlighted by the gray rectangle placed near the center of the figure.
As mentioned before, the ice thickness timescales and length scales are interesting properties to be explored when designing and planning an optimal observation system both in terms of temporal sampling and spatial placement of instruments. Nevertheless, these properties also vary over time (Figs. 7 and 9). The stability over time of timescales and length scales can be estimated by the coefficient of variation (C v = SD/Mean). The C v is a non-dimensional metric used to evaluate the extent of a certain variability in relation to its mean, allowing different properties to be compared. The C v for the GECCO2 and MOVE-CORE moving timescales, estimated from the time series shown in Fig. 7b and f, is 0.23 and 0.27, respectively, while the C v for moving length scales ( Fig. 9) is 0.13 and 0.07.
Lastly, it is worthwhile mentioning that both timescales and length scales are promising properties to support the design of an optimal observing system. As suggested by the C v presented above, and by the fact that the timescale is more sensitive to the reanalysis specifications and parameters (see Fig. 13c-f), the length scale is considerably more stable than the timescale; therefore it is a more reliable variable to be taken into account for deploying observing systems. For instance, the multiple linear regression model used by Lindsay and Zhang (2006), for determining optimal locations to predict sea ice extent from SIT, could be combined with the length scale information, thereby avoiding two or more stations placed into the same radius of correlation (length scale) being selected. The timescale would be more useful if used in combination with the knowledge of its variability. Further studies are required to evaluate the performance of timescales and length scales in providing support for the optimal design of observational programs, though this work already shows some promising results in that direction.

Appendix A: List of abbreviations
This appendix displays all abbreviations and their respective written-out names referred to in the text and used in the figures. The long names of some abbreviations were previously omitted in order to preserve the readability of text, while others were already defined. All of them will be mentioned below so that the reader can easily consult their meaning at any time. This appendix presents the comparison of all reanalysis products with all different observational datasets. This comparison is based on three different metrics: the root mean square error (RMSE), the mean residual sum of squares (MRSS), and the correlation coefficient (R).  www.the-cryosphere.net/13/521/2019/ The Cryosphere, 13, 521-543, 2019