Prediction of monthly Arctic sea ice concentration using satellite and reanalysis data based on convolutional neural networks

Changes in Arctic sea ice affect atmospheric circulation, ocean current, and polar ecosystems. There have been unprecedented decreases in the amount of Arctic sea ice, due to global warming and its various adjoint cases. In this study, a 10 novel one-month sea ice concentration (SIC) prediction model is proposed, with eight predictors using a deep learning approach, Convolutional Neural Networks (CNN). This monthly SIC prediction model based on CNN is shown to perform better predictions (mean absolute error (MAE) of 2.28%, anomaly correlation coefficient (ACC) of 0.98, root mean square error (RMSE) of 5.76%, normalized RMSE (nRMSE) of 16.15%, and NSE of 0.97) than a random forest (RF)-based model (MAE of 2.45%, ACC of 0.98, RMSE of 6.61%, nRMSE of 18.64%, and NSE of 0.96) and the persistence model based on 15 the monthly trend (MAE of 4.31%, ACC of 0.95, RMSE of 10.54%, nRMSE of 29.17%, and NSE of 0.89) through hindcast validations. The spatiotemporal analysis also confirmed the superiority of the CNN model. The CNN model showed good SIC prediction results in extreme cases that recorded unforeseen sea ice plummets in 2007 and 2012 with less than 5.0% RMSEs. This study also examined the importance of the input variables through a sensitivity analysis. In both the CNN and RF models, the variables of past SIC were identified as the most sensitive factor in predicting SIC. For both models, the SIC-related 20 variables generally contributed more to predict SICs over ice-covered areas, while other meteorological and oceanographic variables were more sensitive to the prediction of SICs in marginal ice zones. The proposed one-month SIC prediction model provides valuable information which can be used in various applications, such as Arctic shipping route planning, management of fishery industry, and long-term sea ice forecasting and dynamics.

for maritime industries and decision making on-field logistics (Schweiger and Zhang, 2015). In addition, there is room to further improve the accuracy of short-term SIC prediction models with more advanced techniques and data. SIC describes the fraction of a specified area (typically a grid cell) covered by sea ice and it has been widely used as a simple and intuitive proxy 65 to identify the characteristics of sea ice. Thus, this study aimed to predict the changes in Arctic sea ice characteristics using SIC.
This study proposes a novel deep learning-based method to predict SIC based on the predictors of spatial patterns, considering the operational forecast of sea ice characteristics. The objectives of this study were to (1) develop a novel monthly SIC prediction model using a deep learning approach, CNN; (2) examine the prediction performance of the proposed model through 70 comparison with a random forest-based SIC prediction model; and (3) conduct a sensitivity analysis of predictors that affect SIC predictions.

Data
Three types of datasets were used in this study, which represent sea ice concentrations, oceanographic, and meteorological characteristics in the Arctic. This study focuses on the prediction accuracy of the proposed models as well as the sensitivity of 75 each predictor on monthly SIC prediction. The spatial domain of this study is a region of the Arctic Ocean (180°W -180°E / 40°N -90°N), and the temporal coverage is the 30 years between 1988 and 2017.
The first dataset is the daily sea ice concentration observation data, obtained from the National Snow and Ice Data Center (NSIDC), which is derived from the Nimbus 7 Scanning Multichannel Microwave Radiometer (SMMR) and the Defense Meteorological Satellite Program (DMSP) Special Sensor Microwave Imager (SSM/I and SSMIS). The second dataset is the 80 daily sea surface temperature data, obtained from National Oceanic and Atmospheric Administration (NOAA) Optimal Interpolation Sea Surface Temperature (OISST) version 2, which is constructed from Advanced Very High-Resolution Radiometer (AVHRR) observation data with 0.25° resolution from 1988 to 2017. The third dataset is the monthly European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis Interim (ERA-Interim) data, which is used in order to construct predictors for one-month SIC prediction, including the surface air temperature, albedo, and v-wind vector in 0. 125°. 85 In this study, a total of eight predictors were selected and used to predict SIC next month (Table 1) based on the literature and a preliminary statistical analysis of potential predictors through a feature selection process using random forest (Strobl et al., 2007). We selected the eight predictors by comparing the mean decrease accuracy (MDA) changes based on twelve monthly prediction RF models from 1988 to 2017. The MDA has been widely used as feature selection criteria by measuring the accuracy changes by randomly permuting input variables (Archer and Kimes, 2008). It should be noted that fewer predictors 90 than the selected eight ones did not produce better results. The predictors are: SIC one-year before (sic_1y), SIC one-month before (sic_1m), SIC anomaly one-year before (ano_1y), SIC anomaly one-month before (ano_1m), sea surface temperature (SST), 2-meter air temperature (T2m), forecast albedo (FAL), and the amount of v-wind (v-wind). In order to have the same spatial and temporal scales, the daily data, including SIC and SST, were transformed into monthlymeans and onto a polar stereographic projection with 25km grids. The predictors were normalized into 0 to 1 or -1 to 1 (for ano_1y and ano_1m). Since sea ice decline has accelerated in recent years, especially in the summer season (Stroeve et al, 2008;Schweiger et al., 2008;Chi and Kim, 2017), we computed the SIC anomaly variables only for a more recent time period (2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017) rather than the entire study period . This was done in order to focus on the trends in recent sea ice 100 changes. Since the anomalies were calculated from the recent years (2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017), there is no significant multicollinearity issue that could cause overfitting (Pearson's correlation coefficient between mean SICs and anomalies (ρ) = -0.39, p<0.01). The vwind indicates the relative amount of wind towards the North Pole: the larger the v-wind, the more it blows from South to North. The v-wind data were derived using an 11-by-11 moving window based on a mean function from the raw 10-meterheight v-wind vector data. Regarding the moving window, this study set the analysis unit as an 11-by-11 window (neighboring 105 5 pixels; about 125 km) in order to consider the synoptic-scaled climate and ocean circulation in the polar region (Crane, 1978;Emery et al., 1997).
The eight predictors selected in this study through random forest-based feature selection have theoretical backgrounds that are related to the characteristics of SIC. First, SIC itself can affect the SIC in the future because it has a clear inter-annual trend through the melting and freezing seasons (Deser and Teng, 2008;Chi and Kim, 2017). It is a useful characteristic when 110 conducting a time-series analysis, and thus, two SIC time-series climatology predictors (SIC one-year before and SIC onemonth before) were used in this study. Although there is no clear physical explanation of why the interannual variations would contribute to the forecasting skill, it clearly worked well in long-term SIC forecasting in previous studies Chi and Kim, 2017). Further, we used two supplementary predictors that indicate the anomalies of SIC one-year before and SIC one-month before, in order to consider anomalous sea ice conditions in the models. The anomaly data could give 115 information about SST anomaly along the sea ice edge in terms of the re-emergence mechanism from the melting to the freezing seasons (Guemas et al., 2014). Second, changes in SST and SIC have a significant relationship with each other, with regards to the heat budget (Rayner et al., 2003;Screen and et al., 2013;Prasad et al., 2018). The re-emergence of sea ice anomalies is also partially explained by the persistence of SST anomalies (Guemas et al., 2014). Air temperature and albedo are related to the amount of solar radiation enabling the prediction of SIC changes. The solar radiation heats the surface of the 120 ocean as well as the sea ice. This causes a rise in the SST while also reducing albedo on the sea ice by melting the surface snow or thinning the sea ice (Screen and Simmonds, 2010;Mahajan et al., 2011). Moreover, the surface snow melting produces melt ponds, wet sea-ice surfaces, and wet snow cover which accelerate sea ice melting (Kern et al., 2016). Warm winds from lower latitudes toward the Arctic can also reduce sea ice (Kang et al., 2014) and local wind forces affect sea ice motion and formation (Shimada et al., 2006). The wind vector also can cause short or long-range sea ice drifts (Guemas et al., 2014), 125 which may influence SIC variation.

Prediction models: Convolutional Neural Networks (CNN), Random Forest (RF), and anomaly persistence model
This study proposes a SIC prediction model using a Convolutional Neural Network (CNN) deep learning approach. CNN is a kind of artificial neural network (ANN) model first suggested by LeCun et al. (1998) and has since been further developed 130 with various structures and algorithms (Deng et al., 2013). Many studies have adopted CNN approaches to complete image recognition or classification tasks (Ren et al., 2015;Kim et al., 2018;Yoo et al., 2019;Zhang et al., 2019). CNN learns the features of images and takes them into account as key information, in order to extract outputs .
Convolutional networks share their weights and connect neighboring layers using convolution layers like neurons . The convolutional structure is a unique feature of CNN models that often shows higher performance than other 135 types of ANN in image recognition studies (Krizhevsky et al., 2012;Lee et al., 2009). The basic CNN structure consists of a bundle of convolutional layers, a number of pooling layers, and a fully connected layer. The convolutional process is to generate feature maps from gridded input data with kernel and activation functions. A CNN model extracts the best feature map from an input image through an iterative training process including backpropagation learning and optimization algorithm.
In CNN approaches, when 3-dimensional data (i.e., width, height, and depth (or channel)) are entered, several moving kernels 140 pass through the data for each channel and transform them into feature maps using dot-product calculation. Through a number of convolutional processes, the model uses the fully connected layer to generate the final answer. The series of convolutional processes involved in this process requires significant computation loads. To prevent heavy computation, both the stride (i.e., how to shift a moving kernel) and the pooling (i.e., how to conduct downsampling) techniques are widely used, which make the size of the input data in the following convolutional process reduced. To avoid too much data reduction, many studies have 145 adopted a padding technique, which covers input data with extra dummy values . The feature map achieved through the convolutional process is a convolved map that contains a higher level of features of an image (Chen et al., 2015).
In general, a CNN model contains larger learning capacity and provides more robustness against noise than normal MLP models because of the more trainable parameters as well as the structure of deeper networks .
In order to conduct a quantitative comparison of the prediction performance of the proposed CNN model, this study used 150 random forest (RF), which is an ensemble-based machine learning technique (Yoo et al., 2018). The RF model was used to solve image-based classification problems such as building extraction, land-cover classification, and crop classification (Liu et al., 2018;Guo and Du, 2017;Forkuor et al., 2018;Sonobe et al., 2017). RF extracts features using classifiers of each variable (Liu et al., 2018). The user can deal with two main parameters: the number of decision trees and the number of split variables at the nodes (Ghimire et al., 2010). In this study, we used 50 trees and 11 random variables to be used in the decision split 155 because random selection using one-third of variables in each split has been used widely in solving regression problems (Mutanga et al., 2012;Chu et al., 2014). Compared to the CNN approach, RF has a relatively low learning capacity from the perspective of the parametric size.
Finally, an anomaly persistence forecast model was also examined for predicting the monthly Arctic SIC. The anomaly persistence model is a useful reference for forecast skill for time-series data ). Since sea ice shows a clear 160 climatological pattern (Parkinson and Cavalieri, 2002;Deser and Teng, 2008;Chi and Kim, 2017), this study used the persistence forecast model along with the RF regression model as baseline models to figure out the performance of the CNN model for SIC prediction.

Research Flow
This study examined three models in order to predict SIC using the persistence and RF-based (baselines), and CNN-based 165 approaches ( Fig. 1). We designed twelve individual models (i.e., monthly models) to predict SIC for each month. A hindcast validation approach was used to evaluate each model's performance. Each monthly model was trained using the past data staring from 1988. For instance, 12-years' data (1988-1999) and 29-years' data (1988-2016) were trained to predict SICs in 2000 and 2017, and 2000 and 2017 SIC data were used as validation data, respectively. Eight input data during the past 30 years that consist of 304 * 448 sized grids were used as training data in the RF and CNN models. In the case of the RF model, 170 an additional 24 input parameters, along with the eight predictors, were considered. They are the mean, minimum, and maximum values of each predictor calculated using the 11-by-11 window. These additional variables for RF are to fill the conceptual gaps between the two approaches by considering the spatial patterns of predictors such as features in the CNN model. Since most SIC samples were biased to zero values because of the numerous pixels in the open sea, the training samples were balanced out considering the SIC values (0 -100%) using a monthly maximum sea ice extent mask, which shows the 175 widest sea ice extent during the entire study period (1988-2017) for each month. As a result, in the case of 2017, about 600,000 samples on average (i.e., from about 400,000 samples in Sep. to about 850,000 samples in Mar.) were trained for both monthly models (i.e., RF and CNN). However, the unbalance sampling problem still remained because the lower SIC (less than 40%) samples were relatively small (about 20% of the entire training samples). In the case of the anomaly persistence forecast model, the monthly SIC anomaly of each pixel persisted and the observed trend was calculated for that month ahead. For example, 180 SICs in Jan. 2000 were predicted by summing one-month persisted anomaly and one-month ahead SIC from a linear trend of SICs from Jan. 1988 to Dec. 1999 by each grid.
As described in Fig. 1, the CNN model consists of three convolutional layers and one fully connected layer. Wang et al. (2017) used CNNs to estimate SIC from SAR data and showed that the use of three convolutional layers performed better than one or two layers. In this study, the root mean square propagation (RMSProp) optimizer with a learning rate of 0.001 and the relu 185 activation function were used in the model. The RMSProp optimizer has a similar process to a gradient descent algorithm which divides the gradients by a learning rate (Tieleman and Hinton, 2012). Fifty (50)   This study firstly evaluated the model performance by quantitatively comparing the prediction results of the three models based on five accuracy metrics: mean absolute error (MAE, Eq. (1)), anomaly correlation coefficient (ACC, Eq. (2)), root 195 mean square error (RMSE, Eq. (3)), normalized root mean square error (nRMSE, Eq. (4)), and Nash-Sutcliffe efficiency (NSE, Eq. (5)). In the melting season, many pixels contain relatively low SIC values compared to the freezing season. By dividing the RMSE by the standard deviation of actual SICs, the nRMSE can represent the prediction accuracy considering the range of SIC values (Kim et al., 2018). The ACC is a measure of skill score to evaluate the quality of the forecast model  and has a value between -1 (inversely correlated) and 1 (positively correlated). The NSE is a widely-used measure 200 of prediction accuracy (Moriasi et al., 2007). It can provide comprehensive information regarding data by comparing the relative variance of prediction errors and the variance of the observation data (Nash and Sutcliffe, 1970;Moriasi et al., 2007).
The NSE has a range from −∞ to 1.0. A model is more accurate when the NSE value closer to 1, but unacceptable when the value is negative (Moriasi et al., 2007). Every error matrix was computed with respect to space and time. The errors were spatially averaged after masking, and then temporally averaged. 205 ( 3 ) With respect to prediction accuracy analysis, a specific mask that covers only pixels that have shown sea ice more than once in the past 10 years was used to prevent an inflation of overall accuracy that may have happened due to the effect of pixels on open seas in the melting season (Chi and Kim, 2017;Kim et al. 2018). For example, to calculate the prediction accuracy of predicted SIC in January 2017, the mask covered only pixels that have shown sea ice in Januarys from 2007 to 2016. To 215 examine prediction performance in the marginal sea ice zone, the models were compared in two cases: all range of SICs (0-100%) and low SICs (0-40%).
In addition, the study examined the spatial distribution maps showing the annual MAE and ACC of three models from 2000 to 2017. The spatial relationship between SIC anomalies and prediction errors was also explored. Since the actual anomalies, as well as actual prediction errors (predicted SICs -actual SICs), tended to cancel each other out by averaging negative and 220 positive values, we used absolute anomaly and error values. Since the actual anomalies, as well as actual prediction errors (predicted SICs -actual SICs), tended to cancel each other out by averaging negative and positive values, we used absolute anomaly and error values. In order to examine temporal forecast skill, this study compared the ACC between the monthly time series of reference and predicted SICs at each grid . The distribution of predicted SICs by both models was also compared for the melting season (Jun. -Sep.). Further, the averaged monthly trends of prediction accuracy using RMSE 225 and nRMSE together were examined with the trends of annual mean nRMSE by dividing the data into melting (Jun. Finally, we examined the variable sensitivity for each model. Rodner et al. (2016) evaluated the variable sensitivity of built-in CNN architectures in three ways: adding random Gaussian noises, taking geometric perturbations, and setting random impulse noises (i.e., set the pixel values to zero) to input images. In this research, the analysis of variable sensitivity was conducted 235 using their first and third methods. To examine the influence of variables on prediction accuracy, we added random Gaussian noises with zero-mean and 0.1 standard deviations, then compared any changes of RMSE for each variable (Eq. (6)). In addition, to examine the spatial effects on the predictions, the prediction results were compared by setting zero values for two groups of variables: sea ice related variables (sic_1y, ano_1y, sic_1m, and ano_1m) and other environmental variables (SST, T2m, FAL, and v-wind). 240 ( 6 ) Especially, the persistence model shows a larger decrease than the other models. Nonetheless, the CNN model produced consistently higher performance than the other models for both cases. The spatial distribution of the annual MAE of three models from 2000 to 2017 is shown in Fig. 2. From visual inspection, it appeared that the prediction errors were dominant in the marginal areas (i.e., the boundaries between the sea ice and open seas).

Monthly prediction of SIC
Since the marginal sea ice, particularly thin ice, is susceptible to change (Stroeve et al., 2008;Chevallier et al., 2013;Zhang et al., 2013), the prediction accuracy may have decreased. Weak predictability on the marginal sea ice zone might be due to a relatively small training sample size over the area. In the melting season, relatively higher prediction errors appeared not only 265 in the marginal area, but also even ice-covered areas near the Arctic center ( Fig. 2f-h). On the other hand, in the freezing season, the prediction errors were shown mainly in the marginal area ( Fig., 2j-l). Further, relatively higher prediction errors appeared around the Kara Sea and the Barents Sea (Fig. 2a, e, and i). The region from the Kara Sea to the Barents Sea shows consistent sea ice retreats because of inflows of warm and salty ocean water from the Atlantic Ocean into the Barents-Kara Sea (Schauer et al., 2002;Årthun et al., 2012;Kim et al., 2018) and cumulative positive solar radiation in the summer season 270 (Stroeve et al., 2012). Using a visual comparison, it can be seen that the degree of errors is higher in RF than CNN (Fig. 2). model showed quite good skill scores with high positive correlation (near 1.0, Fig. 3a-c). Interestingly, the ACCs were higher in the marginal area where showed relatively high prediction errors. Even though the models were weak to predict SIC changes in the marginal sea ice zone, but they caught decreasing trends of SICs relatively well. On the other hand, the region near the 280 Arctic center showed relatively low ACCs. In contrast to the marginal sea ice zone, the Arctic center region is relatively stable to the changes (Stroeve et al., 2008;Chevallier et al., 2013). Since SICs in the center is almost saturated (100% of SIC) and very stable, it might cause lower ACC values even there were relatively small prediction errors. In case of the melting season (Jun. -Sep., Fig. 3d-f), the degree of ACCs decreased when compared to the annual-mean (Fig. 3a-c), but they also showed the decreasing trends well in accordance with global warming. Unlike the melting season, the freezing season (Dec. -Mar.) 285 showed relatively lower ACCs in the marginal and Arctic center regions (Fig. 3g-i). The persistence model did not catch the decreasing trend and showed a negative correlation in the Laptev Sea (Fig. 3g). Further, the ACCs were quite low in the Arctic center region. As mentioned above, the stable and saturated sea ice resulted in lower skill scores in terms of ACC. From visual inspection, the CNN model showed better prediction with a stable skill score than the other models.  during [2000][2001][2002][2003][2004][2005][2006][2007][2008][2009][2010][2011][2012][2013][2014][2015][2016][2017]. The persistence forecasting model shows poor predictability for all ranges of SICs (Fig. 4a). In addition, the model tended to over-estimate for higher SICs in the melting season. The model did not catch well the decreasing trends of sea ice due to global warming. On the other hand, the RF and CNN models showed relatively weak predictability for boundary SIC values (i.e., less than 10% and over 90% SICs). In particular, the RF model showed a weakness to predict SICs near zero (0%) and 100%. By focusing on the RF and CNN models, the mean and standard deviation values of prediction 300 errors (predicted SIC -NSIDC) were examined for lower as well as higher SICs. In the case of lower SICs (less than 5%), both models showed over-estimation. In detail, the CNN model showed a better prediction result than RF (CNN: mean error of 4.84% and std. of 7.65%; RF: mean error of 5.92% and std. of 9.77%). On the other hand, in the case of higher SICs (over 95%), both models showed under-estimation. The RF model shows -4.62% of error and 4.57% of standard deviation, but the CNN shows -4.17% and 4.14%, respectively. With the same training samples, the CNN resulted in higher prediction accuracy 305 on both lower and higher SICs. It might be because of the larger learning capacity of CNN than RF . Since the persistence model did not work well when compared to the RF and CNN models, the subsequent analyses are focused on the RF and CNN models. Figure 5 shows monthly prediction accuracies (i.e., RMSE and nRMSE) for the RF and the CNN 310 models. The RF model showed lower prediction accuracy than the CNN model for all months. With regards to the RMSE of the CNN model, the prediction accuracy was higher in the melting season (Jun. -Sep.; 5.41%) than in the freezing season (Dec. -Mar.; 6.13%). However, as mentioned, the RMSE considers the range of sample values; for instance, more zero or low SIC values were found in the melting season (Chi and Kim, 2017). Thus, the nRMSE showed the opposite pattern to the RMSE.

Prediction results in extreme cases: September 2007 and 2012 335
SIC prediction results of the actual SIC and the SICs predicted by the RF and CNN models were conducted using two extreme cases: September 2007 and 2012 ( Fig. 7 and 8). Even though there were unpredicted plummets in the extent of the sea ice, the CNN model showed relatively good prediction results in Sep. 2007 and 2012 (RMSE of 5.00 % and 4.71%, nRMSE of 21.93% and 23.95%, respectively).
In the case of Sep. 2007, there were large sea ice losses through the Beaufort Sea -Chukchi Sea -Laptev Sea during summer 340 (Fig. 7d). Both the RF and CNN models showed an over-estimation of SIC over the Chukchi Sea and Laptev Sea. This implies that both models were not able to effectively learn the speed of the drastic retreat of sea ice in that region through training ( Fig.   7e-f). Similarly, Fig. 8 shows the prediction results and errors based on the RF and the CNN models in Sep. 2012. In summer 2012, there was also a large loss of sea ice over the Beaufort Sea -Laptev Sea -Kara Sea (Fig. 8d). Both the RF and CNN models yielded over-estimations of SIC in the region between the Barents Sea and the Kara Sea. This might have been caused 345 by the fast decline of sea ice in that region because of warm seawater inflows from the Atlantic Ocean in the summer season (Schauer et al., 2002;Årthun et al., 2012;Kim et al., 2018). The results of two extreme cases showed that the prediction errors were mainly found in the regions that show high SIC anomalies (i.e., marginal ice zone with small training sample size; Fig. 7d-f and 8d-f). Together, Figs. 9 and 10 show a detailed analysis focusing on the regions containing high numbers of prediction errors in September 2007 and 2012. Interestingly in both cases, over-estimation was found in no ice zones directly neighboring the marginal sea ice zone (dotted black circle area, Figs. 9 and 10c-d). Both cases show high SST and T2m anomalies together with a low FAL anomaly, caused by a melted snow layer (Figs. 9 and 10i-k). Those anomalous patterns of SST, T2m, and 360 FAL were caused by anomalous strong solar radiation for both cases (Kauker et al., 2009;Kay et al., 2008;Parkinson and Comiso, 2012;Zhang et al., 2013). In regards to v-wind, the anomalous warm wind toward the Arctic center, inflowed by strong southerly winds driven from the Pacific water, resulted in melting in the Beaufort Sea in 2007 (Zhang et al., 2008, Fig. 8l). However, the CNN model did not catch the past negative SIC anomalies effectively. For instance, Figs. 9d and h depict overestimation errors in the northern part of the region by showing negative SIC anomalies. Similarly, Figs. 10d, g, and h 365 document over-estimations in the northern part of the region that shows negative SIC anomalies near the Barents Sea and the Kara Sea. Such over-estimation might be caused by the use of a small moving window (i.e., 11-by-11). Since the anomalies were found quite far from the marginal sea ice zone, the models were not able to predict changes in sea ice well. However, a larger window size might impede the overall performance of the model by forcing it to deal with too much learnable information in the CNN approach (Lai et al., 2015). A detailed exploration of the optimum window size is needed in future 370 research.  Table 3 shows the variable sensitivity results of both models from 2000 to 2017. The two models show SIC-related variables as the most sensitive factor, i.e. sic_1m and sic_1y, rather than other oceanic or climate variables. These results are consistent 380 for each model in the annual mean, freezing season (Dec.-Mar.), and melting season (Jun.-Sep.). As the SIC-related variables have a role regarding the time-series climatology information of sea ice, SICs themselves can affect SIC prediction in the future (Deser and Teng, 2008;Chi and Kim, 2017). Between long-term climatologies (sic_1y and ano_1y) and short-term climatologies (sic_1m and ano_1m), the former showed higher sensitivity in both models (except sic_1y and sic_1m in the RF). The previous studies have revealed the clear yearly sea ice trends of each month by investigating monthly averaged sea 385 ice extents of the nine Arctic regions and the total from 1979 (Cavalieri and Parkinson, 2012;Parkinson and Cavalieri, 2002).

Variable sensitivity
Thus, the monthly models showed long-term climatologies as more contributing factors than the other variables (i.e., SICs in past Jan. is important in the Jan. prediction model). Although long-term climatologies were important in the monthly models, the RF model identified sic_1m as the most contributing factor than sic_1y. It might be due to the limitation of the input variables of the RF model used in this study, resulting in a lack of detailed spatial information. The RF model considered 390 spatial information based on 24 additional proxies using an 11-by-11 window (i.e., mean, minimum, and maximum). However, it may not be sufficient to examine the various spatial distributions of input variables. As a result, the RF model might be highly influenced by short-term information rather than long-term variables.  Table 4 shows the variable sensitivity focusing on every September in 2000-2017, 2007, and 2012. Unlike the results in Table   3, T2m and FAL were identified as the most influencing factors in the RF model. As reported in many studies, solar radiation has a large effect on the changes in sea ice (Kang et al., 2014;Guemas et al., 2016). In addition, the ice-albedo feedback contributes to the recovery of sea ice from the losses in summer (Comiso, 2006;Tietsche et al., 2011). In the case of September 400 2007, the warm surface air temperature was the main cause of the drastic decrease of sea ice (Kauker et al., 2009). However, in the case of v-wind, a Gaussian noise made an improvement to the prediction accuracy in two extreme cases for the RF model. While there are no studies revealing the effects of v-wind in Sep. 2012, there is an indirect effect from the southerly warm wind toward the Arctic center in Sep. 2007 . Moreover, in the RF model, the degree of sensitivity of FAL is bigger in the two extreme cases than for the entire period. These pieces of evidence may point out that the RF model 405 is less robust than the CNN model to highly anomalous SIC cases. In contrast to the RF model, the CNN model consistently identified the sic_1y as the most contributing variable. Although there is no clear causality between the SICs one-year before and the anomalous decline of sea ice in Sep. 2007 and 2012, past SICs provide information on SICs in the future as time-series data (Chi and Kim, 2017).  Figure 11 shows the spatial influence of two sets of variables with impulse noise (zero values). As shown in Figs. 11b and e, the CNN model was not able to predict SICs in the existing sea ice area when using zero values for the SIC-related variables (sic_1y, sic_1m, ano_1y, and ano_1m). When the CNN model set zero values for the other environmental variables (SST, T2m, FAL, and v-wind), the model was not able to predict a decrease of SICs around the marginal areas between the sea ice and open sea ( Fig. 11c and f). It is possibly due to decays on the marginal ice zone by anomalous SST, T2m, and FAL in both 415 cases. Consistent with the results of the sensitivity analysis (Table 4), SIC-related variables were identified as important indicators to predict SICs (Deser and Teng, 2008). The other meteorological and oceanographic variables tended to affect the SIC changes of the marginal zone ice, particularly, the neighboring thin ice and no ice zone (Stroeve et al., 2008;Chevallier et al., 2013;Zhang et al., 2013).

Novelty and limitations
Our study developed a novel one-month SIC prediction model using the CNN deep learning approach. The research findings 425 from this study can make a contribution towards filling the gaps in the research on short-term sea ice change and prediction using a deep-learning approach (Grumbine, 1998;Preller and Posey, 1989). Our short-term SIC prediction model can provide valuable information, which can be used in various decision-making processes in the maritime industry and in research regarding sea ice forecasting (Schweiger and Zhang, 2015). Notably, the non-linear learning architectures of the CNN model showed good prediction accuracy based on the larger learning capacity and more consistent temporal SIC prediction than the 430 traditional machine learning approach Liu et al., 2018).
However, there are some challenging limitations to the proposed CNN model, particularly regarding the prediction variables.
First, this study did not consider the effects of a longer time scale, or persistent effects, on sea ice changes (Guemas et al., 2014). For example, the 2007 and 2012 sea ice minimums were caused by not only the anomalous warm atmospheric conditions of the summer season but also by persistently warm winter and spring seasons, which especially affected the melting in the 435 marginal ice zone (Devasthale et al., 2013). Second, the sea ice thickness is an important factor when predicting sea ice changes because the thinner sea ice is relatively vulnerable to melt (Stroeve et al., 2008;Chevallier et al., 2013;Zhang et al., 2013).
However, we did not consider sea ice thickness data because of the limited availability of reliable sea ice thickness products.
Third, there is a well-known problem with deep-learning models -interpretability. Because of complicated and non-linear connections between hidden layers, the deep learning models are hard to interpret (Koh et al., 2017;Guidotti et al., 2018). 440 Recent deep learning studies have attempted to report explainable results using various visualization approaches such as heat maps and occlusion maps (Brahimi et al., 2017;Trigueros et al., 2018). The present study explained the model using a variable sensitivity analysis, as well as the inspection of the spatial distribution. However, the model still has problems providing clear interpretations of the non-linear relationships among variables.

Conclusion 445
The main purpose of this study was to develop a novel one-month SIC prediction model using the CNN approach. The CNN model showed better prediction performance (MAE of 2.28%, ACC of 0.98, RMSE of 5.76%, nRMSE of 16.15%, and NSE of 0.97) than the persistence forecast (MAE of 4.31%, ACC of 0.95, RMSE of 10.54%, nRMSE of 29.17%, and NSE of 0.89) and RF models (MAE of 2.45%, ACC of 0.98, RMSE of 6.61%, nRMSE of 18.64%, and NSE of 0.96). The prediction accuracy in the melting season (Jun. -Sep., nRMSE of 19.09%) was lower than the freezing season (Dec. -Mar., nRMSE of 14.08%). 450 The overall prediction accuracy decreased in the more recent years because of the accelerated sea ice melting caused by global warming. In two extreme cases, the CNN model yielded promising prediction results with respect to RMSE, as well as the spatial distribution of SICs (less than 5% RMSE). The prediction errors normally occurred in the marginal ice zone, which has higher sea ice anomalies. From the variable sensitivity analysis using CNN, the SICs one-year before was identified as the most important factor in predicting sea ice changes. While the SIC-related variables had high effects on SIC prediction over 455 ice-covered areas, the other meteorological and oceanographic variables were more sensitive in predicting the SICs in marginal ice zones.