Articles | Volume 19, issue 10
https://doi.org/10.5194/tc-19-4759-2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.

Learning to filter: snow data assimilation using a Long Short-Term Memory network
Download
- Final revised paper (published on 21 Oct 2025)
- Preprint (discussion started on 12 Feb 2025)
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2025-423', Anonymous Referee #1, 13 Mar 2025
- AC1: 'Reply on RC1', Giulia Blandini, 09 May 2025
- RC2: 'Comment on egusphere-2025-423', Anonymous Referee #2, 31 Mar 2025
- AC2: 'Reply on RC2', Giulia Blandini, 09 May 2025
Peer review completion
AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload
ED: Publish subject to revisions (further review by editor and referees) (11 May 2025) by Nora Helbig

AR by Giulia Blandini on behalf of the Authors (29 May 2025)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (02 Jun 2025) by Nora Helbig
RR by Anonymous Referee #2 (11 Jun 2025)

RR by Anonymous Referee #1 (12 Jun 2025)

RR by Anonymous Referee #3 (02 Jul 2025)

ED: Publish subject to revisions (further review by editor and referees) (03 Jul 2025) by Nora Helbig

AR by Giulia Blandini on behalf of the Authors (20 Aug 2025)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (21 Aug 2025) by Nora Helbig
RR by Anonymous Referee #3 (04 Sep 2025)

ED: Publish as is (04 Sep 2025) by Nora Helbig

AR by Giulia Blandini on behalf of the Authors (11 Sep 2025)
This work developed a surrogate for EnKF-based data assimilation (EnKF-DA) using an LSTM network. The introduction and methods sections are well written and structured. However, several statements in the results are inconsistent with the plots. More importantly, the results lack sufficient explanation and analysis of why the LSTM performs differently from the EnKF at different sites and in different scenarios. The discussion would benefit from additional comparisons with previous studies and a deeper analysis of the results; currently, it leans toward reinforcing the need for LSTM-based data assimilation, which largely repeats points already made in the introduction. I therefore recommend a major revision before publication.
Line 103-105: What is the source of the meteorological forcing data? Are they derived from gridded datasets?
Table 2: The data time span for each site needs to be mentioned.
Line 171: Forecasted model state is x_k^f
Line 254: Double “the”
Line 271: “predictions”
Line 277-278: Please clarify how the data were split: by individual data points or by continuous time spans?
Line 276: Please clarify what the site-specific limits are here.
Line 280: The inline formula here should not include the 'star', as 'star' was previously used to represent the LSTM output, not the input from S3M. Please keep the notation consistent.
Line 288-290: Please use a formula to clarify this configuration. Do you mean that x^f and forcing at both time steps k and k-1 are used as LSTM inputs in the second test? Please refer to Figure 2 for clarity.
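To make the requested clarification concrete, here is one possible reading of the second test configuration (forecast state and forcing at both time steps k and k-1 concatenated into the LSTM input). The array names, shapes, and the inclusion of x^f at both steps are illustrative assumptions on my part, not the authors' code:

```python
import numpy as np

# Illustrative shapes: T time steps, F forcing variables.
T, F = 100, 4
x_f = np.random.rand(T, 1)      # forecast state x^f_k (e.g. SWE)
forcing = np.random.rand(T, F)  # meteorological forcing u_k

# One reading of the configuration: the input at step k concatenates
# x^f and forcing at k with x^f and forcing at k-1.
inputs = np.concatenate(
    [x_f[1:], forcing[1:], x_f[:-1], forcing[:-1]], axis=1
)
print(inputs.shape)  # → (99, 10)
```

A formula of this kind (or a corrected Figure 2) would remove the ambiguity.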
Line 292-294: This part is confusing. What is the difference between Configuration 1 and Configuration 3? Was a single LSTM selected from Configuration 1 and then applied to other sites? Please clarify.
Line 299-300: Is there a specific reason to randomly sample water years for data splitting rather than using a continuous historical time span to train the model and a continuous future time span to test it? Random sampling can create artificially easier test conditions by allowing test data (time period) to fall between training water years, which may provide the model with indirect information about future conditions.
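To illustrate the leakage concern above, the following sketch (with a hypothetical 20-year record and an illustrative draw of test years) counts how many test water years are "bracketed" by training years, i.e. have training data on both sides:

```python
import numpy as np

# Hypothetical setup: 20 water years, indexed 0..19.
years = np.arange(20)

# Random water-year sampling (illustrative draw): most test years
# fall *between* training years.
random_test = np.array([3, 8, 14, 19])
random_train = np.setdiff1d(years, random_test)

# Continuous split: train on the historical span, test on the most
# recent span; no test year is bracketed by training years.
cont_train, cont_test = years[:16], years[16:]

def n_bracketed(train, test):
    """Count test years with at least one training year on each side."""
    return int(sum(train.min() < y < train.max() for y in test))

print(n_bracketed(random_train, random_test))  # → 3
print(n_bracketed(cont_train, cont_test))      # → 0
```

Bracketed test years give the model indirect information about the conditions surrounding the test period, which a continuous historical/future split avoids.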
The LSTM architecture and hyperparameters are not reported in this work.
Line 309-311 (Figure 3): Is this result from testing or operational testing? Please clarify.
Line 313-314: It is somewhat difficult to distinguish the EnKF-DA and LSTM boxes in the plots. If the last box in each panel represents LSTM-DA, it suggests that the RMSE values of LSTM-DA for KHT, RME, and FMI-ARC increased compared to EnKF-DA, with KHT showing the largest increase. This appears inconsistent with the narrative presented here. Please check.
Figures 3 & 4: The Nash–Sutcliffe efficiency could be used as a score to evaluate the accuracy of the time series in panels (a)–(d).
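For reference, the Nash–Sutcliffe efficiency is straightforward to compute; a minimal sketch (the example values are made up for illustration):

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 = perfect fit, 0 = no better than
    the mean of the observations, < 0 = worse than the mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Example: a simulation close to the observations scores near 1.
obs = np.array([0.10, 0.25, 0.40, 0.30, 0.15])
sim = np.array([0.12, 0.24, 0.38, 0.31, 0.16])
print(round(nse(obs, sim), 3))  # → 0.981
```

Reporting this score per panel would make the time-series comparisons quantitative rather than visual.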
Line 321-324: Why is the LSTM trained with outputs (states) from EnKF-DA more sensitive to the sparsity of observation data? Could you explain this here? Including observation data as an input may introduce artificial errors when filling in missing data in the input.
Line 336-337: Only Figure 5b shows improvement with the memory component; panels 5c and 5d do not.
Line 342-348: Cite Figure 6 here.
Line 344: 0.5 m? The reduction shown in Figure 6f is not that large.
Line 346: These strategies were neither mentioned nor explained in the methods section.
Section 3.3: This result does not seem meaningful, as the spatial transferability of all models appears to be poor. Please consider removing it.
Line 370-371: Any explanation for this result?
Section 3.4: Instead of presenting the spatial transferability of a single model, it might be more meaningful to compare and discuss the site-specific LSTM and the multi-site LSTM.
Please refer to the following (this is not my work, and there is no need to cite it): Kratzert, Frederik, Martin Gauch, Daniel Klotz, and Grey Nearing. "HESS Opinions: Never train a Long Short-Term Memory (LSTM) network on a single basin." Hydrology and Earth System Sciences 28, no. 17 (2024): 4187–4201.
Line 410: No results were shown to support this.
Line 415: 7 sites?