Comments on revised version of manuscript tc-2019-144
I thank the authors for making several substantial changes that have improved the manuscript. In particular, the change from 10- to 30-year periods and the addition of statistical tests for the ALT vs soil moisture change have strengthened confidence in the findings the authors present. I still have a minor comment on the statistics, though (see below).
Furthermore, I still haven’t been able to find enough detail explaining how climate model projections were calculated. References to earlier publications are not enough to show this. See below for that issue. Line numbers below refer to the version without track changes.
I asked in the previous review for clarification regarding the periods for which the historical forcing was repeated, and the authors referred in their response to the McGuire 2018 paper. I am well aware of that paper, as it was cited in the original manuscript, but I still think this information should go into this manuscript, as I asked in my original question. It is important to understand how the future projections were constructed, something that is presently not clear.
The reference to McGuire 2018 is not very helpful, as that paper’s methods section also does not include any detail on the repeating periods of the early 20th century forcing. There is a single sentence stating in principle the same thing as in the present manuscript, but without any further detail: “All models were driven with a common projection period forcing by applying monthly climate anomalies/scale factors from a CCSM4 simulation that included the RCP4.5 and RCP8.5 (2006–2100) and the extended concentration pathways (ECP4.5 and ECP8.5, 2101–2299) on top of repeating early 20th century reanalysis forcing”. Also, this sentence is confusing, as it talks specifically about “reanalysis” datasets, unlike the present manuscript, which talks about “forcing” and “driving” datasets, the latter of which are not all reanalysis datasets. My interpretation is that the sentence in McGuire 2018 refers to the historical forcing datasets that are mentioned later in that paper’s methods section, where Table 3 in McGuire 2016 is indicated for details.
Unfortunately, Table 3 in McGuire 2016 is also not helpful for understanding the repeating periods. It does state the historical forcing dataset names, but the time periods are given only for some of the models. In any case, since the 2016 paper does not involve future projections at all, it doesn’t involve any repeating of historical forcing. So, unless I have missed something, it is not possible from either McGuire 2016 or 2018 to know the periods of the different repeating historic forcing atmospheric datasets with which the common CCSM4 future projections were compared.
I find this problematic, but it should now be possible either to add this information, for example by joining it to the listing of included historical forcing datasets in Table 1 or in a supplement, or at least to explain more clearly what was done. This would substantially help in interpreting how the CCSM4 future projections were calculated, both in this paper and in McGuire 2018.
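To illustrate why the repeating periods matter, here is a minimal sketch of one way such a forcing could be assembled. It assumes that anomalies are simply added to baseline years drawn by cycling through the historical period in order. The function name, the synthetic temperatures and the chosen sub-periods are all hypothetical, and this is not a description of what the authors or McGuire 2018 actually did.

```python
from itertools import cycle

def build_future_forcing(baseline_by_year, anomaly_by_year):
    """Add each future year's anomaly to a baseline year drawn by cycling
    through the historical period in order (hypothetical procedure)."""
    baseline_years = cycle(sorted(baseline_by_year))
    return {
        year: baseline_by_year[next(baseline_years)] + anomaly_by_year[year]
        for year in sorted(anomaly_by_year)
    }

# Synthetic annual-mean air temperature (deg C) for 1901-1930; not real data.
historical = {1901 + i: -10.0 + 0.02 * i for i in range(30)}

# Two hypothetical modeling groups repeat different sub-periods.
baseline_a = {y: t for y, t in historical.items() if y <= 1910}
baseline_b = {y: t for y, t in historical.items() if y >= 1921}

# Common synthetic anomalies applied to both, standing in for CCSM4 output.
anomalies = {2006 + i: 0.05 * i for i in range(20)}

future_a = build_future_forcing(baseline_a, anomalies)
future_b = build_future_forcing(baseline_b, anomalies)

# Difference between the two future forcings in the first projected year.
offset = future_b[2006] - future_a[2006]
print(round(offset, 2))
```

Under these assumptions the two future series differ in every year although the applied anomalies are identical, so the absolute future forcing cannot be reconstructed without knowing each repeating period.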
On a general note, the approach used here introduces a risk of bias both due to the use of a single projection model, which could be skewed towards the high or low end of the range of climate model responses to RCP forcing, and also due to projections being compared to different historical baselines. As for the model choice, I understand this is based on the model being fit-for-purpose in terms of the high-latitude water cycle, as explained and shown in McGuire 2018, so I don’t have an issue with the specific choice of model. I just think that this choice, and the choice to compare future changes that are measured against different baselines, should be motivated. Both in the paper and in their responses, the authors refer to previous publications describing this, but I think they could afford to spend a few lines motivating these key choices and discussing their possible influence on their results also in the present paper. The methods section as it stands is very brief (about 550 words).
137-143 The description of model-observation comparison for runoff is incomplete and ambiguous. From the results, it’s clear the authors did three things: 1) compared the pattern of modeled and observed annual discharge values for 1970-1999 (visually, Fig 6), 2) determined the correlation between modeled and observed annual discharge values (Table 3), and 3) compared the distributions of modeled and observed annual discharge values for 1970-1999 (Figure 7). The methods should describe the things the authors actually did – the present two sentences “We compared model simulations with long-term (1970-1999) mean monthly discharge data from Dai et al 2009. We computed model mean annual discharge including surface and subsurface runoff for the main river basins...” do not do this. The first sentence could just as well mean that the authors did a model-observation comparison on the long-term monthly climatology.
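The ambiguity can be made concrete with a small sketch that uses synthetic data only. The function names and numbers are hypothetical and are not taken from the manuscript or from Dai et al. 2009. Starting from the same monthly series, a comparison of annual values uses 30 numbers per basin, whereas a comparison of the long-term monthly climatology uses 12.

```python
import math
import random

def annual_means(monthly_by_year):
    """One mean value per year, as used in an annual time-series comparison."""
    return [sum(months) / 12 for _, months in sorted(monthly_by_year.items())]

def monthly_climatology(monthly_by_year):
    """One long-term mean per calendar month (12 values in total)."""
    n_years = len(monthly_by_year)
    return [sum(months[m] for months in monthly_by_year.values()) / n_years
            for m in range(12)]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)
years = range(1970, 2000)
# Synthetic monthly discharge with a seasonal cycle; not real observations.
observed = {y: [100 + 80 * math.sin(math.pi * m / 11) + random.gauss(0, 10)
                for m in range(12)] for y in years}
# Synthetic "model" output: scaled observations plus noise.
modeled = {y: [v * 0.9 + random.gauss(0, 10) for v in observed[y]] for y in years}

# Comparison of annual values (30 pairs) vs. of the climatology (12 pairs).
r_annual = pearson_r(annual_means(modeled), annual_means(observed))
r_climatology = pearson_r(monthly_climatology(modeled), monthly_climatology(observed))
print(len(annual_means(observed)), len(monthly_climatology(observed)))
```

The two correlations answer different questions (interannual variability versus seasonal cycle), which is why the methods should state which quantity was compared.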
The figure is now clipped for some of the models so that the box plots are only partly visible; please correct.
Statistics – the Pearson r is presented. Typically Pearson r denotes a sample correlation coefficient, while Pearson rho denotes a population correlation coefficient. To me it makes sense to use the population version here, as we are not looking to estimate a population correlation from a sample, but rather trying to understand the correlation in this particular set of paired points, which should be thought of as the entire population. Of course, with 10,000 data pairs it will not make a difference, but I think the notation should be correct and correspond to the formula actually used to calculate the coefficient (whether it is the sample or population correlation should be stated in the Figure caption).
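As a minimal sketch of the point about the numerical value, the following uses synthetic data only. The variable names and the assumed relation between the two quantities are hypothetical. It computes the coefficient with population (n) and sample (n - 1) normalisation. When the same normalisation is used for the covariance and for both standard deviations, the factor cancels, so the remaining issue is one of notation and of stating which formula was used.

```python
import math
import random

def pearson(x, y, ddof):
    """Pearson correlation from covariance and standard deviations, all
    normalised by (n - ddof): ddof=0 is the population form, ddof=1 the
    sample form."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - ddof)
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / (n - ddof))
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / (n - ddof))
    return cov / (sx * sy)

random.seed(0)
# Synthetic paired values standing in for ALT change and soil moisture change.
alt_change = [random.gauss(0.0, 1.0) for _ in range(10_000)]
soil_moisture_change = [-0.5 * a + random.gauss(0.0, 1.0) for a in alt_change]

r_population = pearson(alt_change, soil_moisture_change, ddof=0)
r_sample = pearson(alt_change, soil_moisture_change, ddof=1)
print(r_population, r_sample)
```

The two values agree to within floating-point rounding; a difference would only appear if the covariance and the standard deviations were normalised inconsistently.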
214 Section 3.3 numbers – in this section, the authors present mean numbers and a range for several different quantities. Please clarify in the text what the mean and range refer to – for example, if it is standard deviation, and between what values in that case.
229-230 The statement about JULES runoff is still problematic for the same reason I originally pointed out. In their response, the authors mention precipitation and that they made no changes, but the statement I am talking about refers to runoff. I ask the authors to change this statement for clarity. Lines 229-230 read as though JULES has the highest runoff values of all models, which is not correct. If the authors want to convey that the high runoff and precipitation changes in JULES are consistent with each other, they should say so clearly and not focus on the JULES runoff value as being the highest. If they want to mention the models at the high end of the runoff projection range (which seems more likely, given the context of the preceding sentences), they should mention that both JULES and ORCHIDEE are far above the 0.2 to 0.3 mm/day range that they mentioned in the preceding sentence, and not state that JULES has the highest value, as the one for ORCHIDEE is higher.
76-78 This sentence talks about examples of model upgrades to “soil thermal dynamics and active layer hydrology”, but the last example is termed simply “cold region hydrology”. This seems a bit backwards to me – soil thermal dynamics and active layer hydrology are subsets of cold region hydrology, not the other way around. I suggest this should be rephrased to be accurate.
80-81 “models simulations”, correct plural forms.
108 Replace “forced with a common projected climate” with “the latter forced with a common projected climate”, to clarify that it is only the future period that has a common forcing.
107-108 Similarly, I think “historic (1960-2009) and future simulations (2010-2299)” works better than the presently written “an historic (1960-2009) and future simulation (2010-2299)”, as the simulations are not strictly the same but differ between models.
117 Related to the major point about the repeating historical forcing, the phrase “overlaid by repeating historic forcing atmospheric datasets from CCSM4” sounds odd. The historic forcing atmospheric datasets were not from CCSM4, only the future climate. I suggest rephrasing this sentence to something like “Future simulations were calculated from monthly CCSM4 (Gent et al., 2011) climate anomalies for the Representative Concentration Pathway (RCP 8.5, 2006-2100) and the Extended Concentration Pathway (ECP 8.5, 2101-2299) scenarios, relative to repeating historic forcing atmospheric datasets from the different modeling groups (Table 1).”
197 I would call this a correlation or association, not a trend. As terms go, “trend” works better to denote a rate of change over time, but this sentence refers to a spatial association that is not analyzed with respect to time.
197 I suggest putting “except SIBCASA, LPJGUESS and UWVIC” in parentheses rather than between commas, to make it unambiguous that these models are the exceptions to the ALT increase-soil moisture decrease relation.
258 Remove “mean” from the end of the figure caption for Figure 6.