Prediction of monthly Arctic sea ice concentration using satellite and reanalysis data based on convolutional neural networks

Changes in Arctic sea ice affect atmospheric circulation, ocean current, and polar ecosystems. There have been unprecedented decreases in the amount of Arctic sea ice, due to the global warming and its various adjoint cases. In this study, a novel one-month sea ice concentration (SIC) prediction model is proposed, with eight predictors using a deep learning approach, Convolutional Neural Networks (CNN). This monthly SIC prediction model based CNN is shown to perform better predictions (mean absolute error (MAE) of 2.28%, root mean square error (RMSE) of 5.76%, normalized RMSE (nRMSE) of 15 16.15%, and NSE of 0.97) than a random forest (RF)-based model (MAE of 2.45%, RMSE of 6.61%, nRMSE of 18.64%, and NSE of 0.96) and a simple prediction model based on the yearly trend (MAE of 9.36%, RMSE of 21.93%, nRMSE of 61.94%, and NSE of 0.83) through hindcast validations. Spatiotemporal analysis also confirmed the superiority of the CNN model. The CNN model showed good SIC prediction results in extreme cases that recorded unforeseen sea ice plummets in 2007 and 2012 with less than 5.0% RMSEs. This study also examined the importance of the input variables through a sensitivity analysis. In 20 both the CNN and RF models, the variables of past SICs were identified as the most sensitive factor in predicting SIC. For both models, the SIC-related variables generally contributed more to predict SICs over ice-covered areas, while other meteorological and oceanographic variables were more sensitive to the prediction of SICs in marginal ice zones. The proposed one-month SIC prediction model provides valuable information which can be used in various applications, such as Arctic shipping route planning, management of fishery industry, and long-term sea ice forecasting and dynamics. 25


Introduction
Sea ice refers to the frozen seawater that covers approximately 15 % of the oceans in the world (National Snow and Ice Data Center, 2018). Sea ice reflects more solar radiation than the water's surface, which makes the polar regions relatively cool. Sea ice shrinks in summer due to the warmer climate and expands in the winter season. Many studies on Arctic sea ice monitoring and dynamics have been conducted because it plays a significant role in the energy and water balance of global climate systems (Ledley, 1988;Guemas et al., 2016). In particular, the change in sea ice is an important indicator that shows the degree of ongoing climate change (Johannessen et al., 2004). Global warming causes a decrease in sea ice that worsens the arctic amplification, which in turn accelerates global warming itself (Cohen et al., 2014;Francis and Vavrus, 2015). In addition, sea ice affects various oceanic characteristics and societal issues, such as ocean current circulation, by changing salinity and temperature gradation (Timmermann et al., 2009); polar ecosystems, by affecting key parts of the Arctic food web like sea ice algae (Doney et al., 2012); and economic industries, e.g., Arctic shipping routes .
Arctic sea ice has been rapidly declining, which impacts not only the Arctic climate but also possibly the mid-latitudes (L. . Numerous studies have shown significant interactions between the ocean and climate character-istics, such as sea surface temperature, solar radiation, surface temperature, and the changes in sea ice (Guemas et al., 2016). Therefore, the prediction of long-and short-term sea ice change is an important issue in projecting climate change (Yuan et al., 2016). Various approaches, including numerical modeling and statistical analysis, have been proposed to develop models for predicting sea ice characteristics (Guemas et al., 2016;Chi and Kim, 2017). Many of the studies have adopted statistical models using in situ observations or reanalysis data based on the relationship between sea ice and ocean or climate parameters (Comeau et al., 2019). The long-range forecasting models of the sea ice severity index and concentration (monthly to seasonal) using multiple linear regression were developed by Drobot (2003) and Drobot et al. (2006), respectively. Lindsay et al. (2008) examined the short-and long-term sea ice extent (SIE) prediction using a multiple linear regression model with historical information regarding the ocean and ice data. Wang et al. (2016b) developed a vector autoregressive (VAR) model to predict the intraseasonal variability in sea ice concentration (SIC) in the summer season (May-September). The suggested VAR model considering only the historical sea ice data without any atmospheric and oceanic information showed a rootmean-square error (RMSE) of ∼ 17 % for a 30 d prediction. However, the literature has reported that sea ice prediction is a very challenging task under the changing Arctic climate system (Holland and Stroeve, 2011;Stroeve et al., 2014). A short-term forecast of SIC has been also examined using statistical approaches. Wang et al. (2019) evaluated the sub-seasonal predictability of Arctic SIC using multiple variables of sea ice, the atmosphere, and the ocean based on statistical approaches -the VAR and vector Markov models. The VAR model showed quite good predictability in the short term, with an RMSE of 10 %, but still resulted in high RMSEs (∼ 20 %) for longer than 4 weeks over the pan-Arctic region during the summer season (from June to August). Meanwhile, the data-adaptive harmonic (DAH) technique, which examines a data-driven feature using cross correlations, was demonstrated to predict the Arctic SIE (Kondrashov et al., 2018). The DAH model showed a promising predictability of the SIE in September, resulting in the absolute error of about 0.3×10 6 km 2 in 2014-2016. Chi and Kim (2017) suggested a deep-learning-based model using long-and short-term memory (LSTM) in comparison with a traditional statistical model. Their model showed good performance in the 1-month prediction of sea ice concentration (SIC), with less than 9 % average monthly prediction errors. However, it had low predictability during the melting season (RMSE of 11.09 % from July to September). Kim et al. (2019) proposed a near-future SIC prediction model (10-20 years) using deep neural networks together with the Bayesian model averaging ensemble, resulting in an RMSE of 19.4 % in the annual average. This study suggests that deep-learning techniques are good for connecting variables under non-linear relationships, such as SIC and climate vari-ables. However, this study also showed low prediction accuracy during the melting season (normalized RMSE -nRMSE -of 102.25 % from June to September).  used convolutional neural networks (CNNs) to estimate SIC in the Gulf of Saint Lawrence from synthetic-aperture-radar (SAR) imagery. Their study compared their CNN model to a multilayer perceptron (MLP) model, showing the superiority of the CNN model in SIC estimation with an RMSE of about 22 %.
However, different from the classic statistical models, the previous studies using deep-learning techniques have focused on the long-term prediction of SIC (i.e., more than 1 year of prediction). The short-term forecasting of sea ice conditions is also important for maritime industries and decision-making on field logistics (Schweiger and Zhang, 2015). In addition, there is room to further improve the accuracy of short-term SIC prediction models with more advanced techniques and data. SIC describes the fraction of a specified area (typically a grid cell) covered by sea ice, and it has been widely used as a simple and intuitive proxy to identify the characteristics of sea ice. Thus, this study aimed to predict the changes in Arctic sea ice characteristics using SIC.
This study proposes a novel deep-learning-based method to predict SIC based on the predictors of spatial patterns, considering the operational forecast of sea ice characteristics. The objectives of this study were to (1) develop a novel monthly SIC prediction model using a deep-learning approach (CNN), (2) examine the prediction performance of the proposed model through comparison with a randomforest-based SIC prediction model, and (3) conduct a sensitivity analysis of predictors that affect SIC predictions.

Data
Three types of datasets were used in this study, which represent sea ice concentrations, oceanographic, and meteorological characteristics in the Arctic. This study focuses on the prediction accuracy of the proposed models as well as the sensitivity of each predictor in monthly SIC prediction. The spatial domain of this study is a region of the Arctic Ocean (40-90 • N, 180 • W-180 • E), and the temporal coverage is the 30 years between 1988 and 2017.
The first dataset is the daily sea ice concentration observation dataset, obtained from the National Snow and Ice Data Center (NSIDC), which is derived from the Nimbus-7 Scanning Multichannel Microwave Radiometer (SMMR) and the Defense Meteorological Satellite Program (DMSP) Special Sensor Microwave Imager (SSM/I and SS-MIS). The second dataset is the daily sea surface temperature dataset, obtained from National Oceanic and Atmospheric Administration (NOAA) Optimal Interpolation Sea Surface Temperature (OISST) version 2, which is constructed from Advanced Very High Resolution Radiometer (AVHRR) observation data with 0.25 • resolution from 1988 to 2017. The third dataset is the monthly European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis (ERA-Interim) dataset, which is used in order to construct predictors for 1-month SIC prediction, including the surface air temperature, albedo, and v-wind vector with 0.125 • resolution.
In this study, a total of eight predictors were selected and used to predict SIC for the next month (Table 1) based on the literature and a preliminary statistical analysis of potential predictors through a feature selection process using random forest (Strobl et al., 2007). We selected the eight predictors by comparing the mean decrease accuracy (MDA) changes based on 12 monthly prediction random-forest (RF) models from 1988 to 2017. The MDA has been widely used as feature selection criteria by measuring the accuracy changes by randomly permuting input variables (Archer and Kimes, 2008). It should be noted that fewer predictors than the selected eight ones did not produce better results. The predictors are as follows: SIC 1 year before (sic_1y), SIC 1 month before (sic_1m), SIC anomaly 1 year before (ano_1y), SIC anomaly 1 month before (ano_1m), sea surface temperature (SST), 2 m air temperature (T2m), forecast albedo (FAL), and the amount of v wind (v wind).
In order to have the same spatial and temporal scales, the daily data, including SIC and SST, were transformed into monthly means and onto a polar stereographic projection with 25 km grids. The predictors were normalized to 0 to 1 or −1 to 1 (for ano_1y and ano_1m). Since sea ice decline has accelerated in recent years, especially in the summer season (Stroeve et al., 2008;Schweiger et al., 2008;Chi and Kim, 2017), we computed the SIC anomaly variables only for a more recent time period (2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017) rather than for the entire study period . This was done in order to focus on the trends in recent sea ice changes. Since the anomalies were calculated from the recent years (2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017), there is no significant multicollinearity issue that could cause overfitting (Pearson's correlation coefficient between mean SICs and anomalies (ρ) = −0.39, p < 0.01). The v wind indicates the relative amount of wind towards the North Pole: the larger the v wind, the more it blows from south to north. The v-wind data were derived using an 11×11 moving window based on a mean function from the raw 10 m height v-wind vector data. Regarding the moving window, this study set the analysis unit as an 11×11 window (neighboring five pixels; about 125 km) in order to consider the synoptic-scale climate and ocean circulation in the polar region (Crane, 1978;Emery et al., 1997).
The eight predictors selected in this study through randomforest-based feature selection have theoretical backgrounds that are related to the characteristics of SIC. First, SIC itself can affect the SIC in the future because it has a clear interannual trend through the melting and freezing seasons (Deser and Teng, 2008;Chi and Kim, 2017). This is a useful characteristic when conducting a time-series analysis, and, thus, two SIC time-series climatology predictors (SIC 1 year before and SIC 1 month before) were used in this study. Although there is no clear physical explanation of why the interannual variations would contribute to the forecasting skill, it clearly worked well in long-term SIC forecasting in previous studies (Wang et al., 2016a;Chi and Kim, 2017). Furthermore, we used two supplementary predictors that indicate the anomalies of SIC 1 year before and SIC 1 month before in order to consider anomalous sea ice conditions in the models. The anomaly data could give information about SST anomaly along the sea ice edge in terms of the re-emergence mechanism from the melting to the freezing seasons (Guemas et al., 2016). Second, changes in SST and SIC have a significant relationship to each other with regards to the heat budget (Rayner et al., 2003;Screen et al., 2013;Prasad et al., 2018). The re-emergence of sea ice anomalies is also partially explained by the persistence of SST anomalies (Guemas et al., 2016). Air temperature and albedo are related to the amount of solar radiation enabling the prediction of SIC changes. The solar radiation heats the surface of the ocean as well as the sea ice. This causes a rise in the SST while also reducing albedo on the sea ice by melting the surface snow or thinning the sea ice (Screen and Simmonds, 2010;Mahajan et al., 2011). Moreover, the surface snowmelt produces melt ponds, wet sea ice surfaces, and wet snow cover, which accelerate sea ice melting (Kern et al., 2016). Warm winds from lower latitudes toward the Arctic can also reduce sea ice (Kang et al., 2014), and local wind forces affect sea ice motion and formation (Shimada et al., 2006). The wind vector also can cause short-or long-range sea ice drifts (Guemas et al., 2016), which may influence SIC variation.

Prediction models: convolutional neural networks
(CNNs), random forest (RF), and anomaly persistence model This study proposes a SIC prediction model using a CNN deep-learning approach. A CNN is a kind of artificial neural network (ANN) model first suggested by LeCun et al. (1998) and has since been further developed with various structures and algorithms. Many studies have adopted CNN approaches to complete image recognition or classification tasks (Kim et al., 2018a;Ren et al., 2015;Yoo et al., 2019;E. Zhang et al., 2019). CNN learns the features of images and takes them into account as key information in order to extract outputs (Kim et al., 2018b;Wylie et al., 2019). Convolutional networks share their weights and connect neighboring layers using convolution layers like neurons (X. . The convolutional structure is a unique feature of CNN models that often shows higher performance than other types of ANN in image recognition studies (Krizhevsky et al., 2012;Lee et al., 2009;Zhao et al., 2020). The basic CNN struc- ture consists of a bundle of convolutional layers, a number of pooling layers, and a fully connected layer. The convolutional process is to generate feature maps from gridded input data with kernel and activation functions. A CNN model extracts the best feature map from an input image through an iterative training process including backpropagation learning and optimization algorithm.
In CNN approaches, when 3-D data (i.e., width, height, and depth -or channel) are entered, several moving kernels pass through the data for each channel and transform them into feature maps using dot-product calculation. Through a number of convolutional processes, the model uses the fully connected layer to generate the final answer. The series of convolutional processes involved in this process requires significant computation loads. To prevent heavy computation, both the stride (i.e., how to shift a moving kernel) and the pooling (i.e., how to conduct downsampling) techniques are widely used, which make the size of the input data in the following convolutional process reduced. To avoid too much data reduction, many studies have adopted a padding technique, which covers input data with extra dummy values (Wang et al., 2016a). The feature map achieved through the convolutional process is a convolved map that contains a higher level of features of an image (Chen et al., 2015). In general, a CNN model contains a larger learning capacity and provides more robustness against noise than normal MLP models because of the more trainable parameters as well as the structure of deeper networks .
In order to conduct a quantitative comparison of the prediction performance of the proposed CNN model, this study used RF, which is an ensemble-based machine-learning technique Latifi et al., 2018;Lee et al., 2018;Yoo et al., 2018). The RF model was used to solve imagebased classification problems such as building extraction, land-cover classification, freeboard detection, and crop classification (Liu et al., 2018;Guo and Du, 2017;Forkuor et al., 2018;Lee et al., 2016;Park et al., 2018;Sonobe et al., 2017). RF extracts features using classifiers of each variable (D. Zhang et al., 2019). The user can deal with two main parameters: the number of decision trees and the number of split variables at the nodes (Fagua and Ramsey, 2019). In this study, we used 50 trees and 11 random variables to be used in the decision split because random selection using one-third of variables in each split has been used widely in solving regression problems (Lee et al., 2017;Liu et al., 2015;Mutowo et al., 2019). Compared to the CNN approach, RF has a relatively low learning capacity from the perspective of the parametric size.
Finally, an anomaly persistence forecast model was also examined for predicting the monthly Arctic SIC. The anomaly persistence model is a useful reference for forecast skill for time-series data (Wang et al., 2016). Since sea ice shows a clear climatological pattern (Parkinson and Cavalieri, 2002;Deser and Teng, 2008;Chi and Kim, 2017), this study used the persistence forecast model along with the RF regression model as baseline models to figure out the performance of the CNN model for SIC prediction.

Research flow
This study examined three models in order to predict SIC using the persistence and RF-based (baselines) and CNN-based approaches ( Fig. 1). We designed 12 individual models (i.e., monthly models) to predict SIC for each month. A hindcast validation approach was used to evaluate each model's performance. Each monthly model was trained using the past data starting from 1988. For instance, 12 years of data (1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999) and 29 years of data  were trained to predict SICs in 2000 and 2017, and 2000 and 2017 SIC data were used as validation data. Eight input data during the past 30 years that consist of 304×448 sized grids were used as training data in the RF and CNN models. In the case of the RF model, an additional 24 input parameters, along with the eight predictors, were considered. They are the mean, minimum, and maximum values of each predictor calculated using the 11×11 window. These additional variables for RF are to fill the conceptual gaps between the two approaches by considering the spatial patterns of predictors such as features in the CNN model. Since most SIC samples were biased to zero values because of the numerous pixels in the open sea, the training samples were balanced out considering the SIC values (0 %-100 %) using a monthly maximum SIE mask, which shows the widest SIE during the entire study period (1988-2017) for each month. As a result, in the case of 2017, about 600 000 samples on average (i.e., from about 400 000 samples in September to about 850 000 samples in March) were trained for both monthly models (i.e., RF and CNN). However, the unbalanced sampling problem still remained because the lower SIC (less than 40 %) samples were relatively small (about 20 % of the entire training samples). In the case of the anomaly persistence forecast model, the monthly SIC anomaly of each pixel persisted and the observed trend was calculated for the month ahead. For example, SICs in January 2000 were predicted by summing the 1-month persistence anomaly and 1-month-ahead SIC from a linear trend of SICs from January 1988 to December 1999 by each grid.
As described in Fig. 1, the CNN model consists of three convolutional layers and one fully connected layer.  used CNNs to estimate SIC from SAR data and showed that the use of three convolutional layers performed better than one or two layers. In this study, the rootmean-square propagation (RMSProp) optimizer with a learning rate of 0.001 and the ReLU activation function were used in the model. The RMSProp optimizer has a similar process to a gradient descent algorithm, which divides the gradients by a learning rate (Tieleman and Hinton, 2012). Fifty (50) epochs with a batch size of 1024 were used in the proposed CNN model. The best model showing the highest validation accuracy during the training process was selected and used for further analysis. The CNN model was implemented using the TensorFlow Keras open-source library, while the persistence and RF models were implemented using the in-terp1 and TreeBagger functions in MATLAB R2018a, respectively.
This study firstly evaluated the model performance by quantitatively comparing the prediction results of the three models based on five accuracy metrics: the mean absolute error (MAE; Eq. 1), anomaly correlation coefficient (ACC; Eq. 2), root-mean-square error (RMSE; Eq. 3), nRMSE (Eq. 4), and Nash-Sutcliffe efficiency (NSE; Eq. 5). In the melting season, many pixels contain relatively low SIC values compared to the freezing season. By dividing the RMSE by the standard deviation of actual SICs, the nRMSE can represent the prediction accuracy considering the range of SIC values . The ACC is a measure of skill score to evaluate the quality of the forecast model (Wang et al., 2016) and has a value between −1 (inversely correlated) and 1 (positively correlated). The NSE is a widely used measure of prediction accuracy (Moriasi et al., 2007). This can provide comprehensive information regarding data by comparing the relative variance of prediction errors with the variance of the observation data (Nash and Sutcliffe, 1970;Moriasi et al., 2007). The NSE has a range from −∞ to 1.0. A model is more accurate when the NSE value closer is to 1 but unacceptable when the value is negative (Moriasi et al., 2007). Every error matrix was computed with respect to space and time. The errors were spatially averaged after masking and then temporally averaged: With respect to prediction accuracy analysis, a specific mask that covers only pixels that have shown sea ice more than once in the past 10 years was used to prevent an inflation of overall accuracy that may have happened due to the effect of pixels on open seas in the melting season (Chi and Kim, 2017;Kim et al., 2019). For example, to calculate the prediction accuracy of predicted SIC in January 2017, the mask covered only pixels that have shown sea ice in the month of January from 2007 to 2016. To examine prediction performance in the marginal sea ice zone, the models were compared in two cases: the whole range of SICs (0 %-100 %) and low SICs (0 %-40 %). In addition, the study examined the spatial distribution maps showing the annual MAE and ACC of three models from 2000 to 2017. The spatial relationship between SIC anomalies and prediction errors was also explored. Since the actual anomalies, as well as actual prediction errors (predicted SICs-actual SICs), tended to cancel each other out by averaging negative and positive values, we used absolute anomaly and error values. Since the actual anomalies, as well as actual prediction errors (predicted SICs-actual SICs), tended to cancel each other out by averaging negative and positive values, we used absolute anomaly and error values. In order to examine temporal forecast skill, this study compared the ACC between the monthly time series of reference and predicted SICs at each grid (Wang et al., 2016). The distribution of predicted SICs by both models was also compared for the melting season (June-September). The Sea Ice Outlook (SIO) open community has investigated the pan-Arctic sea ice, especially in the September SIE, since 2008 (Stroeve et al., 2014;Chi and Kim, 2017). They have shared the predicted September SIE from June, July, and August based on heuristic, statistical, dynamical, and mixed approaches. Chi and Kim (2017) have pointed out the difficulties of sea ice prediction because the prediction errors have increased since 2012. To figure out September minimum SIE, which is the main focus of the SIO community (Stroeve et al., 2014), we compared the predicted SIEs based on the three models evaluated in this study with the other 37 SIO contributions for the September SIE predictions reported in Au-gust 2017. In the present study, the SIE was identified as an area of SIC > 15 % (Chi and Kim, 2017). Furthermore, the averaged monthly trends of prediction accuracy using RMSE and nRMSE together were examined with the trends of annual mean nRMSE by dividing the data into melting (June-September) and freezing (December-March) seasons.
In this research, we compared and examined prediction results focusing on two extreme cases of SIC: September 2007 and 2012. There was unexpectedly large Arctic sea ice shrinkage in the summer 2007 and 2012 because of the large-scale changes in climate conditions and August cy-clones, respectively (Devasthale et al., 2013). Therefore, for detailed analysis, visual interpretation comparing the spatial patterns of prediction errors and input variables was conducted by focusing on the regions showing high prediction errors in September 2007 and September 2012.
Finally, we examined the variable sensitivity for each model. Rodner et al. (2016) evaluated the variable sensitivity of built-in CNN architectures in three ways: adding random Gaussian noises, taking geometric perturbations, and setting random impulse noises (i.e., set the pixel values to zero) to input images. In this research, the analysis of variable sensitivity was conducted using their first and third methods. To examine the influence of variables on prediction accuracy, we added random Gaussian noises with the zero mean and 0.1 standard deviations and then compared any changes of RMSE for each variable (Eq. 6). In addition, to examine the spatial effects on the predictions, the prediction results were compared by setting zero values for two groups of variables, namely variables related to sea ice (sic_1y, ano_1y, sic_1m, and ano_1m) and other environmental variables (SST, T2m, FAL, and v wind): 4 Results and discussion In particular, the persistence model shows a larger decrease than the other models. Nonetheless, the CNN model produced consistently higher performance than the other models for both cases. The spatial distribution of the annual MAE of three models from 2000 to 2017 is shown in Fig. 2. From visual inspection, it appeared that the prediction errors were dominant in the marginal areas (i.e., the boundaries between the sea ice and open seas). Since the marginal sea ice, particularly thin ice, is susceptible to change (Stroeve et al., 2008;Chevallier et al., 2013;Zhang et al., 2013), the prediction accuracy may have decreased. Weak predictability in the marginal sea ice zone might be due to a relatively small training sample size over the area. In the melting season, relatively higher prediction errors appeared not only in the marginal area but also in ice-covered areas near the Arctic center ( Fig. 2f-h). On the other hand, in the freezing season, the prediction errors were shown mainly in the marginal area ( Fig. 2j-l). Furthermore, relatively higher prediction errors appeared around the Kara Sea and the Barents Sea (Fig. 2a, e, and i). The region from the Kara Sea to the Barents Sea shows consistent sea ice retreats because of inflows of warm and salty ocean water from the Atlantic Ocean into the Barents-Kara Sea (Schauer et al., 2002;Årthun et al., 2012;Kim et al., 2019) and cumulative positive solar radiation in the summer season (Stroeve et al., 2012). Using a visual comparison, it can be seen that the degree of error is higher in the RF model than in the CNN model (Fig. 2).

Monthly prediction of SIC
The spatial distribution of the temporal ACCs of three models from 2000 to 2017 is shown in Fig. 3. First of all, every prediction model showed quite good skill scores with high positive correlation (near 1.0; Fig. 3a-c). Interestingly, the ACCs were higher in the marginal area, which showed relatively high prediction errors. Even though the models were weak in predicting SIC changes in the marginal sea ice zone, they caught decreasing trends of SICs relatively well. On the other hand, the region near the Arctic center showed relatively low ACCs. In contrast to the marginal sea ice zone, the Arctic center region is relatively stable to the changes (Stroeve et al., 2008;Chevallier et al., 2013). Since SICs in the center are almost saturated (100 % of SIC) and very stable, it might cause lower ACC values even there were relatively small prediction errors. In case of the melting season (June-September; Fig. 3d-f), the degree of ACCs decreased when compared to the annual mean (Fig. 3a-c), but they also showed the decreasing trends well in accordance with global warming. Unlike the melting season, the freezing season (December-March) showed relatively lower ACCs in the marginal and Arctic center regions (Fig. 3g-i). The persistence model did not catch the decreasing trend and showed a negative correlation in the Laptev Sea (Fig. 3g). Furthermore, the ACCs were quite low in the Arctic center region.
www.the-cryosphere.net/14/1083/2020/ The Cryosphere, 14, 1083-1104, 2020 As mentioned above, the stable and saturated sea ice resulted in lower skill scores in terms of ACC. From visual inspection, the CNN model showed better prediction with a stable skill score than the other models. Figure 4 shows the histograms of NSIDC SICs and the predicted SICs by three models in the melting season (June-September) during 2000-2017. The persistence forecasting model shows poor predictability for all ranges of SICs (Fig. 4a). In addition, the model tended to overestimate for higher SICs in the melting season. The model did not catch the decreasing trends of sea ice well due to global warming. On the other hand, the RF and CNN models showed relatively weak predictability for boundary SIC values (i.e., less than 10 % and over 90 % SICs). In particular, the RF model showed weakness in predicting SICs near zero (0 %) and 100 %. By focusing on the RF and CNN models, the mean and standard deviation values of prediction errors (predicted SIC-NSIDC) were examined for lower as well as higher SICs. In the case of lower SICs (less than 5 %), both models showed overestimation. In detail, the CNN model showed a better prediction result than RF (CNN: mean error of 4.84 % and SD of 7.65 %; RF: mean error of 5.92 % and SD of 9.77 %). On the other hand, in the case of higher SICs (over 95 %), both models showed underestimation. The RF model shows a −4.62 % error and 4.57 % standard deviation, but the CNN model shows −4.17 % and 4.14 %, respectively. With the same training samples, the CNN model resulted in higher prediction accuracy in both lower and higher SICs. This might be because of the larger learning capacity of the CNN model than the RF model .
The spatial comparison of the predicted September SIEs in 2017 between the reference (NSIDC) and three approaches used in this study is shown in Fig. 5. The observed SIE in September 2017 was 4.80×10 6 km 2 , which was reported by the Sea Ice Prediction Network (http://www.arcus.org/sipn, last access: 22 March 2020). The SIE in 13 September 2017 was the eighth lowest in the satellite record since 1981 (NSIDC, 2017). The SIEs predicted by the anomaly persistence, RF, and CNN models were 4.37×10 6 , 4.95×10 6 , and 4.88×10 6 km 2 , respectively. While the anomaly persistence model underestimated the SIE, the other two models slightly overestimated it. The anomaly persistence model considered the decreasing trends of sea ice somewhat excessively. The CNN-based model showed the lowest prediction error when compared to the Sea Ice Prediction Network reference data (9×10 4 km 2 ). In terms of spatial distributions, the anomaly persistence model showed the excessive retreat of sea ice in the Beaufort and Laptev Sea (Fig. 5a). However, the RF and CNN models showed a slightly wide SIE in the Chukchi and Barents Sea (Fig. 5b and c). The overestimated SIE might be because of the July storm across the central Arctic Ocean through the Barents Sea (West and Blockley, 2017). The accuracy of 1-month SIE prediction based on three approaches was compared to the other 37 SIO contributions for September 2017 (Fig. 5d). Since the SIO reports contain only quantitative SIE values, it was not possible to compare their spatial distributions. With regard to the SIE values, the statistical approaches showed quite accurate prediction results based on Arctic sea ice thickness distributions and ice velocity data (UTokyo) and the non-parametric statistical model (Slater-Barrett NSIDC). The CNN prediction result showed relatively accurate prediction accuracy.
Since the persistence model did not work well when compared to the RF and CNN models, the subsequent analyses are focused on the RF and CNN models. Figure 6 shows monthly prediction accuracies (i.e., RMSE and nRMSE) for the RF and the CNN models. The RF model showed lower prediction accuracy than the CNN model for all months. With regards to the RMSE of the CNN model, the prediction accuracy was higher in the melting season (June-September; 5.41 %) than in the freezing season (December-March; 6.13 %). However, as mentioned, the RMSE considers the range of sample values; for instance, more zero or low SIC values were found in the melting season (Chi and Kim, 2017). Thus, the nRMSE showed the opposite pattern to the RMSE. The nRMSE using the standard deviation can show the prediction accuracy considering the different ranges of SIC by month. In the nRMSE of the CNN model, there is a different pattern between the melting season (June-September; 19.09 %) and freezing season (December-March; 14.08 %). According to the two-sample t test, the nRMSE in the melting season is higher than in the freezing season (p < 0.01; n = 18) throughout the entire period (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017). The difficulty of SIC prediction in the melting season is a well-known problem because of the unexpected decline of Arctic sea ice in recent years Chi and Kim, 2017).

Prediction results in extreme cases: September 2007 and 2012
SIC prediction results of the actual SIC and the SICs predicted by the RF and CNN models were found using two extreme cases: September 2007 and 2012 (Figs. 8 and 9). Even though there were unpredicted plummets in the extent of the sea ice, the CNN model showed relatively good prediction results in September 2007 and 2012 (RMSE of 5.00 % and 4.71 % and nRMSE of 21.93 % and 23.95 %, respectively).
In the case of September 2007, there were large sea ice losses through the Beaufort Sea-Chukchi Sea-Laptev Sea during summer (Fig. 8d). Both the RF and CNN models showed an overestimation of SIC over the Chukchi Sea and Laptev Sea. This implies that both models were not able to effectively learn the speed of the drastic retreat of sea ice in that region through training (Fig. 8e-f). Similarly, Fig. 9 shows the prediction results and errors based on the RF   and the CNN models in September 2012. In summer 2012, there was also a large loss of sea ice over the Beaufort Sea-Laptev Sea-Kara Sea (Fig. 9d). Both the RF and CNN models yielded overestimations of SIC in the region between the Barents Sea and the Kara Sea. This might have been caused by the fast decline of sea ice in that region because of warm seawater inflows from the Atlantic Ocean in the summer season (Schauer et al., 2002;Årthun et al., 2012;Kim et al., 2019. The results of two extreme cases showed that the prediction errors were mainly found in the regions that show high SIC anomalies (i.e., marginal ice zone with small training sample size; Figs. 8d-f and 9d-f). Together, Figs. 10 and 11 show a detailed analysis focusing on the regions containing high numbers of prediction errors in September 2007 and 2012. Interestingly in both cases, overestimation was found in no ice zones directly neighboring the marginal sea ice zone (dotted black circle area, Figs. 10 and 11c-d). Both cases show high SST and T2m anomalies together with a low FAL anomaly, caused by a melted snow layer . Those anomalous patterns of SST, T2m, and FAL were caused by anomalous strong solar radiation for both cases (Kauker et al., 2009;Kay et al., 2008;Parkinson and Comiso, 2013;Zhang et al., 2013). In regards to v wind, the anomalous warm wind toward the Arctic center, flowing in by strong southerly winds driven from the Pacific water, resulted in melting in the Beaufort Sea in 2007 Fig. 10l). However, the CNN model did not catch the past negative SIC anomalies effectively. For instance, Fig. 10d and h depict overestimation errors in the northern part of the region by showing negative SIC anomalies. Similarly, Fig. 11d, g, and h document overestimations in the northern part of the region that shows negative SIC anomalies near the Barents Sea and the Kara Sea. Such overestimation might be caused by the use of a small moving window (i.e., 11×11). Since the anomalies were found quite far from the marginal sea ice zone, the models were not able to predict changes in sea ice well. However, a larger window size might impede the overall performance of the model by forcing it to deal with too much learnable information in the CNN approach (Lai et al., 2015). A detailed exploration of the optimum window size is needed in future research. Table 3 shows the variable sensitivity results of both models from 2000 to 2017. The two models show SIC-related variables as the most sensitive factor, i.e. sic_1m and sic_1y, rather than other oceanic or climate variables. These results are consistent for each model in the annual mean, freezing season (December-March), and melting season (June-September). As the SIC-related variables play a role in the time-series climatology information of sea ice, SICs themselves can affect SIC prediction in the future (Deser and Teng, 2008;Chi and Kim, 2017). Between long-term climatologies (sic_1y and ano_1y) and short-term climatologies (sic_1m and ano_1m), the former showed higher sensitivity in both models (except sic_1y and sic_1m in the RF). The previous studies have revealed the clear yearly sea ice trends of each month by investigating monthly averaged sea ice extents of the nine Arctic regions and the total from 1979 (Cavalieri and Parkinson, 2012;Parkinson and Cavalieri, 2002). Thus, the monthly models showed long-term climatologies as factors that contribute more than the other variables (i.e., SICs from the previous January are important in the January prediction model). Although long-term climatologies were important in the monthly models, the RF model identified sic_1m as the factor that contributes more than sic_1y. This might be due to the limitation of the input variables of the RF model used in this study, resulting in a lack of detailed spatial information. The RF model considered spatial information based on 24 additional proxies using an 11×11 window (i.e., mean, minimum, and maximum). However, it may not be sufficient to examine the various spatial distributions of input variables. As a result, the RF model might be highly influenced by short-term information rather than long-term variables.  Table 4 shows the variable sensitivity, focusing on every September in 2000-2017, 2007, and 2012. Unlike the results in Table 3, T2m and FAL were identified as the most influential factors in the RF model. As reported in many studies, solar radiation has a large effect on the changes in sea ice (Kang et al., 2014;Guemas et al., 2016). In addition, the icealbedo feedback contributes to the recovery of sea ice from the losses in summer (Comiso, 2006;Tietsche et al., 2011).

Variable sensitivity
In the case of September 2007, the warm surface air temperature was the main cause of the drastic decrease in sea ice (Kauker et al., 2009). However, in the case of v wind, a Gaussian noise made an improvement to the prediction accuracy in two extreme cases for the RF model. While there are no studies revealing the effects of v wind in September 2012, there is an indirect effect from the southerly warm wind toward the Arctic center in September 2007 . Moreover, in the RF model, the degree of sensitivity of FAL is bigger in the two extreme cases than for the entire period. These pieces of evidence may point out that the RF model is less robust than the CNN model to highly anomalous SIC cases. In contrast to the RF model, the CNN model consistently identified the sic_1y as the variable that contributes the most. Although there is no clear causality between the SICs 1 year before and the anomalous decline of sea ice in September 2007 and 2012, past SICs provide information on SICs in the future as time-series data (Chi and Kim, 2017). Figure 12 shows the spatial influence of two sets of variables with impulse noise (zero values). As shown in Fig. 12b and e, the CNN model was not able to predict SICs in the existing sea ice area when using zero values for the SIC-related variables (sic_1y, sic_1m, ano_1y, and ano_1m). When the CNN model set zero values for the other environmental variables (SST, T2m, FAL, and v wind), the model was not able to predict a decrease in SICs around the marginal areas between the sea ice and open sea ( Fig. 12c and f). This is possibly due to decays in the marginal ice zone by anomalous SST, T2m, and FAL in both cases. Consistent with the rewww.the-cryosphere.net/14/1083/2020/ The Cryosphere, 14, 1083-1104, 2020  sults of the sensitivity analysis (Table 4), SIC-related variables were identified as important indicators in predicting SICs (Deser and Teng, 2008). The other meteorological and oceanographic variables tended to affect the SIC changes of the marginal-zone ice, particularly the neighboring thin-ice and no-ice zone (Stroeve et al., 2008;Chevallier et al., 2013;Zhang et al., 2013).

Novelty and limitations
Our study developed a novel 1-month SIC prediction model using the CNN deep-learning approach. The research findings from this study can make a contribution towards filling the gaps in the research on short-term sea ice change and prediction using a deep-learning approach (Grumbine,  diction model can provide valuable information, which can be used in various decision-making processes in the maritime industry and in research regarding sea ice forecasting (Schweiger and Zhang, 2015). Notably, the non-linear learning architectures of the CNN model showed good prediction accuracy based on the larger learning capacity and more con-sistent temporal SIC prediction than the traditional machinelearning approach (Wang et al., 2016;Liu et al., 2018). However, there are some challenging limitations to the proposed CNN model, particularly regarding the prediction variables. First, this study did not consider the effects of a longer timescale, or persistent effects, on sea ice changes (Guemas et al., 2016). For example, the 2007 and 2012 sea ice minimums were caused by not only the anomalous warm atmospheric conditions of the summer season but also by persistently warm winter and spring seasons, which especially affected the melting in the marginal ice zone (Devasthale et al., 2013). The proposed CNN model could be used for the longer prediction (i.e., 2-or 3-month prediction) in consideration of the persistent effects of input variables such as SST and T2m. Moreover, additional input variables that represent seasonal or longer-term variabilities in the Arctic environment should be considered in the proposed models. The persistence of sea ice volume and atmospheric-circulationrelated variables would be suitable for the long-term sea ice forecast (Guemas et al., 2016). Second, the sea ice thickness is an important factor when predicting sea ice changes because the thinner sea ice is relatively vulnerable to melt (Stroeve et al., 2008;Chevallier et al., 2013;Zhang et al., 2013). However, we did not consider sea ice thickness data because of the limited availability of reliable sea ice thickness products. Third, there is a well-known problem with deep-learning models -interpretability. Because of complicated and non-linear connections between hidden layers, the deep-learning models are hard to interpret (Koh and Liang, 2017;Guidotti et al., 2018). Recent deep-learning studies have attempted to report explainable results using various visualization approaches such as heat maps and occlusion maps (Brahimi et al., 2017;Trigueros et al., 2018). The present study explained the model using variable sensitivity analysis as well as the inspection of the spatial distribution. However, the model still has problems providing clear interpretations of the non-linear relationships among variables.

Conclusions
The main purpose of this study was to develop a novel 1month SIC prediction model using the CNN approach. The CNN model showed better prediction performance (MAE of 2.28 %, ACC of 0.98, RMSE of 5.76 %, nRMSE of 16.15 %, and NSE of 0.97) than the persistence forecast (MAE of 4.31 %, ACC of 0.95, RMSE of 10.54 %, nRMSE of 29.17 %, and NSE of 0.89) and RF models (MAE of 2.45 %, ACC of 0.98, RMSE of 6.61 %, nRMSE of 18.64 %, and NSE of 0.96). The prediction accuracy in the melting season (June-September; nRMSE of 19.09 %) was lower than the freezing season (December-March; nRMSE of 14.08 %). The overall prediction accuracy decreased in more recent years because of the accelerated sea ice melting caused by global warming. In two extreme cases, the CNN model yielded promising prediction results with respect to RMSE as well as the spatial distribution of SICs (less than 5 % RMSE). The prediction errors normally occurred in the marginal ice zone, which has higher sea ice anomalies. From the variable sensitivity analysis using CNN, the SICs 1 year before were identified as the most important factor in predicting sea ice changes. While the SIC-related variables had large effects on SIC prediction over ice-covered areas, the other meteorological and oceanographic variables were more sensitive in predicting the SICs in marginal ice zones.
Data availability. The research data can be obtained by request to the corresponding author (ersgis@unist.ac.kr).
Author contributions. YJK led the paper writing and contributed to data analysis and research design. HK and SL contributed to the research design and discussion of the results. DH contributed to data processing and analysis. JI supervised this study; contributed to the research design, paper writing, and discussion of the results; and served as the corresponding author.
Competing interests. The authors declare that they have no conflict of interest. Review statement. This paper was edited by David Schroeder and reviewed by three anonymous referees.