Comment on tc-2021-194

The paper deals with using deep learning models (e.g., LSTM, CNN, and CNN-LSTM) to predict ice jams (jam or no jam) for all the rivers in Quebec. Several hydro-meteorological variables, including liquid precipitation (mm), min and max temperature (°C), AFDD (from August 1st; °C), ATDD (from January 1st; °C), snow depth (cm), and net radiation (W m-2), were used as inputs to classify jam or no-jam occurrence in the rivers. The dataset was divided into training and validation sets, and after the models were constructed and tested, statistical metrics were used to assess their performance. The hybrid CNN-LSTM model was shown to outperform the other developed models.

Despite the logical and valuable results, the paper needs to be organized better. It is advisable to move some of the results and discussion to the materials and methods section. I suggest that the authors broaden their discussion rather than only reporting model performance data. The conclusion needs a significant revision. Overall, in my view, the paper needs a major revision.
Comments:
1) The abstract should be more informative. Its first half discusses ice jams and the need for prediction; the reader should also be able to learn about the data, the modeling process, and the validation metrics from it.
3) Provide a reference for the empirical and statistical prediction methods (threshold methods, multi-regression models, logistic regression models, and discriminant function analysis).
4) Introduce the "involved hydro-meteorological variables" in line 24.
5) Why did you choose deep learning over other machine learning models for ice jam prediction? Is it solely because of automatic feature selection? Clarification is needed.
6) The literature review focuses heavily on time series prediction, whereas it should be more specific to ice jam prediction. It should at least cover data-driven models for predicting ice jams.
7) In lines 97-98, the authors state: "Deep learning methods are promising to address the requirements of ice jam predictions." Has any previous research used deep learning for ice jam prediction? If so, what is the contribution of the current research?
8) Several deep learning methods exist; why did you select CNN, LSTM, and a combined CNN-LSTM?
9) The authors replace "NaN" precipitation values with 0. A value of 0 means there was no precipitation, whereas a missing value may correspond to precipitation that occurred but was not measured. Since the modeling is based on time series, it is better to impute missing values than to set them to 0.
10) The text contains several typos.
11) Provide a reference for the statement "The most popular deep neural networks for TSC are MLP, CNNs, and LSTM."
12) The sub-heading "Input data and study area" is not appropriate; the reader may think it refers to the input variables of the model. "Data and study area" would be more suitable.
13) Detailed information on the data used in the study, at least a statistical description, should be provided.
14) Provide a reference for the statement "As a benchmark, a CNN model with the parameters and layers similar to previous studies is developed."
15) Section 2.5.1, "Overcome overfitting", is too long and needs to be shortened.
16) It is not clear how the structure of the model was optimized. Did you use a hyperparameter tuning method (e.g., grid search, random search, Bayesian optimization), or only a trial-and-error approach?
17) Section "2.5 Training" contains only general statements about the modeling. The authors should give more information about the modeling process, the optimum values of the parameters, the loss function, etc.
24) The mean training error of the LSTM is much higher than the validation error (middle plot in Figure 16), which might be a sign of overfitting.
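
To illustrate comment 9, the following is a minimal sketch of one possible alternative to zero-filling. It is not the authors' procedure. It linearly interpolates short interior gaps in a daily series, leaves longer gaps missing so they can be handled separately, and returns a mask of imputed entries that could be passed to the model as an extra input. The function name impute_short_gaps, the max_gap parameter, and the sample values are all hypothetical, and linear interpolation is only one choice; for precipitation, a nearby-station or climatology-based estimate may be more defensible.

```python
from typing import List, Optional, Tuple


def impute_short_gaps(
    values: List[Optional[float]], max_gap: int = 3
) -> Tuple[List[Optional[float]], List[bool]]:
    """Linearly interpolate interior runs of missing values (None).

    Runs longer than max_gap, and runs touching either end of the series,
    are left as None so they can be handled separately (e.g. dropped).
    Returns the filled series and a mask marking which entries were imputed.
    """
    filled = list(values)
    imputed = [False] * len(values)
    i = 0
    while i < len(values):
        if values[i] is not None:
            i += 1
            continue
        start = i
        while i < len(values) and values[i] is None:
            i += 1
        end = i  # first index after the gap
        gap = end - start
        # Skip gaps without a neighbour on both sides, or gaps that are too long.
        if start == 0 or end == len(values) or gap > max_gap:
            continue
        left, right = values[start - 1], values[end]
        for k in range(start, end):
            frac = (k - start + 1) / (gap + 1)
            filled[k] = left + frac * (right - left)
            imputed[k] = True
    return filled, imputed


# Hypothetical daily precipitation (mm); None marks a missing measurement.
precip = [0.0, 2.0, None, None, 8.0, None, 1.0]
filled, imputed = impute_short_gaps(precip, max_gap=2)
```

Keeping the mask alongside the filled series lets the model distinguish a measured zero from an estimated value, which is the distinction lost when missing values are set to 0.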