Improved modelling of Siberian river flow through the use of an alternative frozen soil hydrology scheme in a land surface model

A parameterisation to incorporate the effects of frozen soil on modelled hydrology is described and implemented within a land surface model, the Joint UK Land Surface Environment Simulator. It is shown to generally improve the modelled flow of Siberian rivers compared to observations, specifically in seasons of freezing or thawing soil. Most noticeably, the revised model increases the snowmelt flow peak by 26–100 % compared to the control model, thereby better matching observed flows. The model physics resulting in the changes to river flow are discussed and attention is given to the effect of inaccuracies in snowfall driving data which can hinder the comparison of new model processes.


Introduction
The presence of frozen soils at northern latitudes introduces a new dimension to the modelling of soil hydrology.Its effect on moisture fluxes into and within the soil drastically complicates the system when compared to the simpler unfrozen case seen at lower latitudes.By altering absorption at the soil surface and hydraulic conductivity, permeability is reduced.The presence of frozen soil then results in distinctly different runoff which becomes apparent when considering river flow (Dunne and Black, 1971).
Siberia offers three of the world's largest rivers: the Yenisei, the Ob and the Lena.Each river channels runoff from a vast basin of two to three millions square kilometres in size, and between them this accounts for approximately 50 % of freshwater going into the Arctic Ocean (Ye et al., 2003;Yang et al., 2004).High proportions of frozen soil are found in these basins for much of the year, thereby setting them as ideal rivers to model and observe the hydrological effects of frozen soil.Improvements made to the model for this region should also make for better predictions in any region with seasonally varying frozen soil.
Much research was done during the 1990s into the effects of frozen soil, with several field projects and smallscale models developed (Stadler et al., 1997;Zhao and Gray, 1997).Gradually, the inclusion of frozen soils on the larger scale was explored and established (Cherkauer and Lettenmaier, 1999;Poutou et al., 2004).From then to current day, land-surface modellers have become aware that a range of subgrid processes have been lost in the up-scaling of earlier research; including moisture storage in surface depressions and lateral flow within grid cells (Cherkauer and Lettenmaier, 2003;Hayashia et al., 2003;Niu and Yang, 2006).In addition, the runoff characteristics associated with snowmelt are unlike those at lower latitudes, so further developments are now being made to frozen soil schemes with the view to achieve better representation of river flows in northern latitudes (Gouttevin et al., 2012).These developments are also essential to the successful modelling of the current research interest, permafrost.Niu and Yang (2006) proposed the inclusion of a parameterisation for a subgrid process that affects continental-scale soil water storage and runoff.When soils are freezing or thawing over a large area, they do so in a spatially heterogeneous manner.This may be attributed to small-scale variations in soil properties, surface topography or climatological factors.A higher proportion of the incoming moisture, either from snowmelt or rainfall, will runoff in regions with high frozen soil fractions.Niu  of the frozen soil content over the scales used in global models ignores the fact that within the grid cell "...water in impermeable areas may flow laterally to permeable areas" (Niu and Yang, 2006).The averaging and heterogeneous cases could potentially be modelled by assuming the hydraulic conductivity varied linearly with the frozen water fraction.Since the true function at a single point (Clapp and Hornberger, 1978) is non-linear and this function is commonly used in GCMs, the linear approximation of the spatially averaged behaviour needs to be parameterised.
Using the National Centre for Atmospheric Research Community Land Model version 2.0 (CLM2.0)Niu and Yang applied their own TOPMODEL-based runoff scheme, SIMTOP (Niu, et al. 2005), and investigated the incorporation of supercooled soil water and a fractional permeable area (FPA) (Niu and Yang, 2006).Further details will be given where needed, but to provide a brief overview of the three aspects: 1. Schemes based on TOPMODEL look to incorporate surface topography into the soil hydrology and in doing so use a dynamic water table (Sivapalan al., 1987;Beven 1997).
2. The principle of supercooled soil water is that, within a certain temperature range, frozen and unfrozen soils can coexist due to variations in microphysical structure.
3. FPA is calculated as an exponential function of the ice content, which acts to increase infiltration rate when compared to a linear parameterisation (Niu and Yang, 2006).
Of these aspects, supercooled soil water and a TOPMODELbased scheme have thus far been included in the Joint UK Land Surface Environment Simulator (JULES; Best et al., 2011;Clark et al., 2011).This paper will implement and test the performance of FPA in JULES.Clark and Gedney (2008) studied the behaviour of various TOPMODEL-based parameterisations, including SIMTOP, using a similar model to JULES over small basins in France.This paper will further the understanding of the two TOPMODEL-based schemes by studying higher latitude basins.Through these two objectives a sense of the versatility of SIMTOP and FPA within land-surface models will be provided.
As well as extending the work of Niu and Yang into an alternative model setting, we will verify the results in a more rigorous manner, using river gauge data and river-routing diagnostics.Additionally, we will delve with greater depth into the consequences of the soil physics formed by the newly developed model.The sensitivity of each component of the new model and the combined impact will be investigated.Finally, through the modelling of regions undergoing snowmelt events we find ourselves questioning the reliability of snowfall driving data, and this aspect is explored near the end of the paper.

The JULES model
The JULES model is described in papers by Best et al. (2011) and Clark et al. (2011).The standard configuration uses four soil layers with a total depth of 3 m.The layers increase in thickness from 10 cm at the surface to 2 m at the bottom.The way that the model deals with frozen soils is described below.Experimental changes to this carried out for this research are described in Section 2.2.Several parameterisations exist as options within the model that are relevant to this study.Two relevant choices made here are the use of the thermal conductivity scheme of Johansen (1975), and a snow scheme that included a radiative canopy with thermal capacity and representation of snow below the canopy (Essery et al., 2003).
As already mentioned, JULES contains a version of TOP-MODEL which will be investigated as part of the paper.TOPMODEL was originally introduced into JULES by Gedney and Cox (2003) to account for subgrid heterogeneity of soil moisture.It does this by using surface topography within the calculation of surface and subsurface runoff.The surface runoff of TOPMODEL allows for Dunne runoff when parts of the surface are thought to be saturated, whereas the standard model requires complete saturation of the grid cell for this type of runoff to occur.Subsurface runoff is modified using an additional deep layer beneath the standard soil column, and calculation of subsurface advection according to water table prognosis and topography.In the standard model subsurface runoff is parameterised as free drainage out of the bottom soil layer.
A summary of the way that the frozen soils affect the hydraulic conductivity is outlined below.
The diffusive vertical water flux for each layer, W (kg m −2 s −1 ), is given by the Richards equation: where K (kg m −2 s −1 ) is the hydraulic conductivity and ψ (m) is the soil water suction.
The relationship between unfrozen soil water concentration and temperature when ice is present is described in terms of the soil water suction (Miller, 1965;Black and Tice, 1989): where T m (K) is the freezing point of pure water at atmospheric pressure, T (K) is the temperature of the soil, ρ i (kg m −3 ) is the density of ice, ρ w (kg m −3 ) is the density of water, L f (s −2 ) is the latent heat of fusion, g (m s −2 ) is the acceleration due to gravity, and k is a dimensionless constant that depends on the soil.The k constant is a measure of the degree to which the adsorption of the soil dominates over the capillarity and is used as a correction factor.This typically equals 1 for clay rich soils and 2.2 for granular soils.This partial freezing of water is parameterised in the model as follows.The maximum volumetric fraction of unfrozen soil water that can exist at temperature T , where T <T m is calculated by: The actual value of θ u , the unfrozen volumetric fraction, is limited by the total water content of the soil: where θ is the "liquid" total volumetric concentration.This is the volumetric concentration that would exist if all of the soil water was in liquid form.The value for the ice fraction is therefore given as the remainder as follows: (5)

Description of changes
To test FPA and SIMTOP as suggested by Niu and Yang, there have been five distinct changes to the JULES model.These have been developed with reference to Niu and Yang (2006), Niu et al. (2005) and Lawrence et al. (2010).The principle equation to be introduced is the fractional impermeable area, where F frz is the fractional impermeable area, α is a scaledependent parameter, θ ice is the volumetric frozen soil water content, and θ sat is the saturated volumetric soil water content.The fractional impermeable area is used throughout all model modifications to account for the FPA.
There are three general areas that are considered to be dependent on the FPA: the initial absorption of moisture at the surface, vertical transfer of moisture within the soil and horizontal subsurface runoff -all of which have an impact on the subsequent modelled river flow.Though alternatives do exist in JULES's own TOPMODEL scheme, the equations representing surface absorption and subsurface runoff are taken from Niu and Yang's SIMTOP model.Firstly, the surface saturated fraction, F sat , is given by: where F frz,top is the fractional impermeable area of the model's top soil layer, F max is the maximum saturated fraction for a grid cell and is dependent on surface topography, f surf is a decay factor, and z ∇ is the grid cell mean water table depth.Secondly, the subsurface runoff which is calculated for each model soil layer below the mean water table depth is given by: where i represents the soil layer, q max is the maximum subsurface runoff, and f sub is a decay factor.The final change is to include FPA into the inter-layer moisture fluxes.There is no absolute correct way to obtain the conductivity of the soil in a finite difference scheme.Experience has shown that the modelled soil moistures adjust, leaving the final solutions insensitive to the choice.In this case, the calculation is done by calculating the flux between the two layers as though all soil water is unfrozen and then multiplying by the average FPA of the two layers.A method for doing this is to adjust the hydraulic conductivity and soil water matric potential calculated in the model to the following: and where k is the hydraulic conductivity, F frz is the average fractional impermeable area of the two layers between which flux is being calculated, k sat is the hydraulic conductivity at saturation, θ = θ ice + θ water and is the total volumetric soil water content, b is the Clapp-Hornberger exponent, ψ is the matric potential, and ψ sat is the matric potential at saturation.To implement these equations into JULES, several choices regarding parameters must be made.In order to reduce the degrees of freedom in the parameter space, F max and q max use values that are already used in the JULES TOPMODEL scheme, and it is taken that f = f surf = f sub .It was not clear from the literature whether the choice to take f surf and f sub as equivalent was used; in Niu and Yang (2005) the decay factors appear to be equal, whereas Lawrence et al. (2010) suggest they may be different when incorporated into CLM4.0.The two remaining parameters are α which is related to the FPA, and f which is related to SIMTOP.A literature review revealed that 3.0 was the suggested value for α, but values for f ranged from 0.5 to 4.0.These have been given little physical consideration within the literature, so this paper will look to initiate this approach.
Figure 1 presents the relationship between F frz , α and θ ice .Clearly, the α-dependency is quite complex, with small values of α resulting in low fractional impermeable areas when the surface layer is fully frozen, and large values resulting in no fractional impermeable area when a tenth of the surface is frozen.As a consequence the following range has been suggested: 3 ≤ α ≤ 7. The lower bound condition was that F frz must be within 5 % of 1.0 when θ ice θ sat = 1.0.The upper bound condition was that F frz must be at least 1 % of θ ice θ sat for θ ice θ sat ≥ 0.1.
For f , a similar plot (Fig. 2) has been made which illustrates the f -dependency of F sat when the mean water table www.the-cryosphere.net/6/859/2012/The Cryosphere, 6, 859-870, 2012  depth and α are constant.With all f values there is an increase in F sat to 1.0 as θ ice θ sat approaches 1.0; the main difference is their values when there is no frozen soil, and it is this point we considered when setting restrictions.Although there are many neglected topographic issues, in order to get a feel for the dependency the mean water table depth has been set to 1.0 m for the plot.It would seem reasonable that with this mean depth one would expect some surface saturation to be occurring within the grid cell but not large amounts.The lower bound condition was that there must be at least 1 % saturation and the upper bound that there can be no more that 20 %.The resulting suggested range is 1 ≤ f ≤ 6. Subsurface runoff also has a dependency on f .The maximum subsurface runoff, q max , is dependent on the water content, so it can vary greatly in value during a model run.It is therefore not possible to consider the f -dependency of subsurface runoff analytically.A series of model runs were used to verify the acceptable range of f .It was found that f ≥ 5 resulted in an unrealistic curtailment of subsurface runoff.With this in mind the range was set at 1≤f ≤4.
All integer combinations of the two parameter ranges were run over the Yenisei river basin, which is discussed in more detail later in the paper.The root mean square error (RMSE) of the modelled river flow to observed flow was calculated for each combination as well as for a control run (standard JULES run) and a TOPMODEL control run.All combinations were found to have a lower RMSE than the control run, and all but one were lower than the TOPMODEL control run.The combination with lowest error was found to be α = 3, set to 1m and α set to its chosen model value of 3.0.The blue line uses the 5 parameter while the orange and green lines use parameters that lie just outside t 6 the selected parameter space.7 8 Fig. 2. Demonstrating the effect of f on the F sat function.The maximum surface saturation value has been set to an approximate value of 0.3, mean water table depth has been set to 1 m, and α set to its chosen model value of 3.0.The blue line uses the chosen model parameter while the orange and green lines use parameters that lie just outside the extremes of the selected parameter space.f = 4.These values are within the ranges suggested in the literature and they produce reasonable results.They are used for all model runs discussed in the results section.One caveat to these assumptions is that the sensitivity of α to spatial scale has not been assessed, and as such these values are only valid for a 1°squared (∼100 km 2 ) grid.
A schematic diagram is given in Fig. 3 of the differences in soil hydrology scheme between the three models used in the project: the standard JULES model (CTRL), JULES using a TOPMODEL approach (TOP), and the modification to this second model as suggested by Niu and Yang (NEW).Some attempt to separate the effects of the two aspects of the NEW version of the model (i.e. using FPA and SIMTOP) is made in the analysis.

Application to Siberian rivers
Modelling of low and mid-latitude rivers often provides a more accurate representation of river flow than that at high latitudes.In a benchmarking of JULES river flow performance, two high latitude river basins (>50°N) and six at lower latitudes were compared (Blyth et al., 2011).When compared to observations, the relative RMSE values showed quite clearly that lower latitude flows are better represented.The high latitude basin observations of the Lena and the Mackenzie demonstrate the typical flow for their latitude: low autumn and winter rates followed by a very large summer peak.The lower latitudes show a much smoother and smaller transition between minimum and maximum flows as   Arendal, 2002).In red is the Ob, in yellow is the Yenisei, and in green is the Lena.This author has edited the colours and removed other basins that were in the image.they react to the seasonal variation in precipitation.In the higher latitudes snowmelt and frozen ground are the predominant factor defining the seasonality of river flow, and therefore the modelling of these aspects is coming under greater scrutiny as there is increasing scientific interest in northern latitude rivers.
Three Siberian rivers were chosen in order to compare modelled and observed river flow: the Ob, the Yenisei and the Lena (see Fig. 4 for the domains).Climatological monthly observations were obtained from the Global Runoff Data Centre (GRDC).The observations are means taken from at least 60 consecutive years in the range 1930-1999.Locations of the observation stations for the Yenisei, Ob and Lena are 67.48°N,86.50°E; 66.57°N, 66.53°E; and 70.70°N, 127.65°E, respectively.Human management is known to have an effect on observed river flow.Since JULES does not account for management, it is important to be aware to what extent the observational data may be affected by this.The three rivers considered here are rated by Nilsson et al. (2005) in terms of their human impact.The Yenisei is strongly influenced, since it has many large hydroelectric dams (Stueffer et al., 2011), whereas the other two are only moderately impacted.The Lena is the least affected of the three (Ye et al., 2011).
Any improvements seen over the Siberian rivers investigated here should be applicable to any regions containing frozen soil, including the rest of northern Russia, Scandinavia and Canada.

Results
The land-surface model was run offline and forced with the meteorological driving data set of Sheffield (2006).These are 3-hourly data for 1982-1999 over a 1°×1°grid with the variables of precipitation, air temperature, air humidity, air pressure, downward shortwave and longwave radiation, and wind speed.
Performance comparisons are made between the standard JULES control run (CTRL), the TOPMODEL control run (TOP) and the enhanced model run (NEW).Parameters and relevant schemes were chosen to achieve the best possible results with CTRL.Soil hydrology was parameterised within CTRL and TOP using the van Genuchten numerical scheme (van Genuchten, 1980).The NEW model used the Brooks and Corey (1964) soil water numerical scheme instead of the van Genuchten scheme.This choice was to allow direct use of the Niu and Yang developments which are based on the scheme of Brooks and Corey.An adapted version of the Niu and Yang development has been tested for use with van Genuchten; however, it does not appear to create greatly differing results.Each plot in this section provides the modelled climatological mean for the years 1989-1999, with the first seven years of driving data used as spin-up.
To analyse the model, Niu and Yang attempted to make a direct comparison of modelled runoff to observations.The approach provided useful results, but they mention that it is not ideal because consideration is not taken of the timedelay of grid cell runoff.One would expect this to affect the modelled river discharge.To overcome this problem we have included the TRIP river routing model to provide modelled www.the-cryosphere.net/6/859/2012/The Cryosphere, 6, 859-870, 2012  river flow using the standard water flow velocity of 0.5 m s −1 (Oki et al., 1999).This is a simple linear model which does not take into account all processes that may be considered important when modelling river flow.For instance, it does not include overbank flow.However, even this simple representation of the lag introduced by routing through a river network should provide a more accurate representation of the basin river flow than comparison of total grid cell runoff only.
Figure 5 presents the modelled and observed river flow data.The first three month period, January-March, shows a low flow in the observations, which is not apparent in any model runs, since there is little influx of moisture during these months.The feature may arise from human management of the rivers, which, as previously stated, is not included in the model.This is supported by the smallest observational winter flow occurring in the Lena and largest in the Yenisei, which are the least and most managed of the rivers, respectively.However, the discrepancy could be occurring because of difficulties involved in observing flow in partially frozen rivers.Another possible explanation is that the rivers experience different levels of winter freezing (e.g.Sergutin and Turutin, 1983).Overall differences are very small, so this quarter is not considered important in analysing the performance of the models.
The second period, April-June, encompasses the prevalent increase in flow due to snowmelt.Contrary to expectations, in all of the basins the NEW model displays the greatest response to snowmelt (26-100 % larger than CTRL), possibly because of higher soil saturation in parts of the basins -an examination of this has not been made.In both the Yenisei and Ob plots, TOP and NEW model runs represent the seasonality better than the CTRL run which peaks a month late.Over the Lena, all three models lack the peak arising from snowmelt; instead, their flow appears to closely follow the direct precipitation.Potential inadequacies in snowfall driving data are discussed in Sect.6.
In the final half of the year, snowmelt no longer plays a part and only direct precipitation is an influence.During period 3, July-September, the precipitation lands on largely unfrozen ground, and therefore there will be few surface effects from the NEW model on flow.However, even in this period the model performs just about as well as either of the other two models, with the exception, once again, of the Lena, for which the NEW model performs particularly poorly for the whole latter half of the year.Without correctly achieving a snowmelt peak, it is not possible to tell whether the errors are due to deficiencies in the modelling of this period or if they are an artefact from incorrect absorption of water during period 2.
The final period, October-December, still receives a small amount of direct precipitation but this time onto largely frozen ground.The NEW model is equal to or better than the other two models for this period for the Yenisei and Ob.As already mentioned, results for the Lena are difficult to interpret, but for this period CTRL best represents the observations.
To summarise, the NEW model achieves the lowest yearlyaverage RMSE on the observations over the Yenisei and Ob.It displays the best representation of the observations during periods of moisture influx onto frozen ground.All models fail to capture the expected snowmelt peak in the Lena, which could be affecting their behaviour in the latter half of the year.The possible reasons are that the main physical process for this basin is still missing from the model or driving data inaccuracies are dominating the model output.
As well as the river flow generated by the models, it is useful to compare other outputs, for instance the average total soil water content over the basin.The water storage of TOP is higher than CTRL in all three basins by 0.7-1.7 %.Water storage by the NEW model is more varied however, with what seems to be a tendency to increase in basins with lower storage such as the Ob and decrease in basins with higher storage such as the Lena.Approximately 7 % more  water is stored in the Ob with the NEW model compared to CTRL, whereas approximately 2 % less water is stored by the NEW model in the Lena.These are modest changes but they do not tell the whole story; for example, in the Lena there are local reductions by the NEW model of over 17 %.Figure 6 shows that the changes are very spatially dependent.Potential reasons for the variability could be soil type, topography, vegetation or snow cover.There are also seasonally dependent changes in total soil water with increases normally occurring more prominently during the Autumn and Winter within the NEW model.It is beyond the scope of this paper to study these variations in detail, but it suggests there is an opportunity for further study into the effect of soil physics on the storage of soil water.Muskett and Romanovsky (2009), among others, have shown the potential of a number of space-based observation platforms to help further the understanding of the water balance in this area, and this is an obvious pathway to expand on this work.

Discussion: understanding the modelled soil physics
The results section has briefly discussed some of the physics of modelling Siberian river basins.However, there is an opportunity to explore at depth the behaviour of the models during all stages of the moisture transport, i.e. at the surface, inter-layer fluxes and subsurface runoff.The Ob showed simple improvements with the NEW model, and thereby acts as a suitable location to associate changes made by the new model to their origins.We will consider the effect of the equations implemented by the new model and why they might better represent the system.
Runoff is responsible for the river flows produced by JULES.The three models vary in their method of calculation.For CTRL, a simple surface runoff method is used whereby runoff depends on the direct precipitation and the surface infiltration rate, i.e.Hortonian runoff.The remaining two models also explicitly calculate the excess runoff arising from local surface saturation depending on the grid cell topography, i.e.Dunne runoff (Dunne and Black, 1970).The NEW model uses an alternative equation to describe the Dunne runoff, which has been explained in Sect. 2. Figure 7 highlights that CTRL does not recognise that much of the snowmelt will produce Dunne runoff and therefore only captures snowmelt runoff once it has been absorbed into the soil.using a free drainage method and the other two models using a zero flux bottom boundary condition with horizontal runoff from all layers below the water table.At this point CTRL captures the peak from snowmelt for the first time, explaining why there is a delay seen in river flow output.TOP shows no peak from the snowmelt; it would be expected in June/July, but this is also when the bottom layer has its highest frozen fraction, which may hinder conductivity.The NEW model shows the least subsurface runoff of the three models, which is likely to be the reason for its extra water storage.
Moisture fluxes within the soil highlight effects of changes to the soil numerical scheme.Figure 8 shows that once the incoming water has been calculated, TOP and CTRL transport water through the soil in much the same way.This is expected since both are using the van Genuchten numerical scheme.The NEW model demonstrates a faster transport of water through the soil to the bottom layer; the snowmelt becomes evident in the flux to the bottom layer as the ground at this depth starts thawing.This flux occurs about three quarters of a month earlier than in the other two models which only begin to show a flux in the first week of April.This is despite all models diagnosing the ground to be thawing at approximately the same time.Therefore, we attribute the faster flow of the FPA adjustment to hydraulic conductivity.Greater flux within the NEW model's deeper layers during November and December as the soil begins to freeze further supports this conclusion.
To summarise, in this section it has been shown, using the example of the Ob basin, that the variations in methods used to model soil water physics have important effects on runoff and water storage.We have learnt that CTRL does not capture much of the snowmelt runoff at the surface, only at the Table 1.Comparison of GSWP2 driving snowfall data to calculated snowfall data derived from Princeton precipitation data.The comparison is only made between 1 st December and 31 st March since outside of this period there is much rainfall.Therefore, the averages are not representative of the whole year since seasonality of snowfall varies between basins.

Basin
Average % difference in Princeton average calculated snowfall Number of years snowfall of GSWP2 (±0.5×10 −6 ) (×10 −6 kg m −2 s −1 ) (±0.5×10 subsurface due to increased absorption.This results in an unrealistic delay in summer river flow.Below the surface the NEW model transports moisture downwards at a much faster rate due to the FPA adjustment to hydraulic conductivity, and it exhibits much lower subsurface runoff which is most likely due to the SIMTOP subsurface runoff equation.
6 Discussion: sensitivity to snowfall driving data River flow comparisons in section 4 display systemic underestimation of the snowmelt peak.The fact that this underestimation occurs across all the models suggests it is either an issue arising from parts of the model physics not under investigation in this paper or it is a driving data issue.
Snowfall measurements are notoriously difficult to obtain (Larson and Peck, 1974;Zaitchik and Rodell, 2008;Clifford, 2010).For manual ground measurements, access is often required to remote, inhospitable locations.For automatic ground measurements, problems such as wind and varying snow densities create complex uncertainty in the data.For satellite measurements, cloud cover and unknown mixtures of ice, snow, water and vegetation make accurate and continuous estimation complicated.Snowfall data assimilation algorithms attempt to account for these inaccuracies but do not completely remove uncertainty.
To get a sense of this uncertainty, a comparison has been made between the driving data used for this paper, Princeton (Sheffield et al., 2006), and another data set from the Global Soil Wetness Project 2 (GSWP2) (Zhao and Dirmeyer, 2003).Precipitation data used to form GSPW2 form a hybrid product using GPCC and GPCP data sets, whereas the Princeton uses GPCP and the TRMM 3-hourly data sets.The overlapping years where a comparison could be made were 1982-95.Princeton data provide total precipitation, so a mechanism was needed to only consider months for which all precipitation was snow.This was done by checking if GSWP2 rainfall was negligible for each given winter (only December to March was looked at since outside these points there was a lot of rain/snow mix).If rainfall was seen as significant, then the year was not used in the average.The difference in snowfall between each data set was totalled up for all reliable years and calculated as a percentage of the Princeton snowfall.The results are summarised in Table 1.
The results for the Yenisei and Ob show a sizeable difference in estimation, but when all factors are considered, such as the incorporation of additional sources into GSWP2, then this difference would be expected.However, a very large difference over the Lena of almost 20 % signifies a clear mis-match of calculation of the driving data sets in this basin.Furthermore, this difference is between two similarly sourced sets; between two completely independent sets, one would predict the potential for an even greater difference.It is telling that the largest difference in these driving data sets corresponds to the most poorly modelled of the three basins in this paper.
Having recognized that the size of uncertainties in snowfall data links to how well the models have represented river flow in each location, we take a more direct approach to test the sensitivity.A simple scaling factor was applied to the model snowfall driving data at each time step.The scaling factor was constant spatially and temporally, thereby only providing a general feel for the effect of potential snowfall inaccuracies.With comparison to the European Space Agency (ESA) snow water equivalent (SWE) product, GlobSnow (Takala et al., 2011), we were able to establish an average scaling factor for the Ob and the Lena basins.GlobSnow is produced using a combination of satellite-based microwave radiometer and ground-based weather station data.It was used in the calculation of an adjustment factor by taking the regression line gradient of a set of Globsnow/Princeton SWE points intercepting at the origin, where the SWE associated with Princeton was the outputted SWE from JULES driven with Princeton.The points were taken as the January SWE values from all points in the basin for all years from 1983-2008, giving over 9000 points for each basin.SWE does not directly map to snowfall since it undergoes melting and sublimation once on the ground, but it is assumed here that snowfall is the dominant factor in SWE errors and therefore the same factor is applied.This assumption is supported by the analysis of Roesch (2006) who found that in climate models there was a strong positive trend between snowfall and SWE biases whereas there was no clear trend between surface temperature and SWE biases.
The factors calculated were 1.36 for the Ob and 1.71 for the Lena, each with a standard error of approximately ±0.01 (S. Hancock, personal communication, 2011).These have  been rounded to 1.35 and 1.70 for use.It is promising that the factor is greatest for the Lena, as suggested by the comparison made with the GSWP2 data.Figure 9 displays the river flow results using the Globsnow-factored snowfall in each location.
As expected, the increased snowfall has produced a much larger peak runoff in the summer for all models.This has led to reduced RMSE values for all models over both basins.Timing of CTRL has improved slightly over the Lena, but this is still the worst performing model.Overall the TOP and NEW models perform similarly to one another over the Lena, but the NEW model is still slightly better in each case due to a better representation of the winter flow.
The NEW model is correctly estimating the peak at the Ob but shows inaccuracy in the general shape, thereby drawing attention to the assumptions made in the scaling factors.It will be necessary in further studies to consider at least the spatial variability of snowfall driving data inaccuracies and the different timing of melt across the catchments, since these will be likely to affect river-flow if routing diagnostics are being used.For instance, if the majority of the increase occurs near the estuary then one would expect an earlier and steeper peak which would possibly improve the modelled Lena flow.This is just a hypothesis and needs to be tested; it is likely that other factors are involved in the inaccuracies as well as that of snowfall driving data.
The improvement of the FPA on modelling of the Ob may partly explain the features noted to be lacking from the new frozen soil scheme implemented by Gouttevin et al. (2012), but which was not found to be important over the Lena.The Fig. 9 run over the Lena clearly shows that use of a FPA is not an important consideration in this region since NEW performs much the same as TOP.
For now we conclude that it is highly likely that there are significant inaccuracies in snowfall driving data at high latitudes.This was established through comparison of two data sets.The greatest of these inaccuracies occurs within the Lena basin and the smallest within the Ob.We also conclude that modelled Siberian river flow is highly sensitive to the snowfall driving data, and in some cases such as the Lena, driving data inaccuracies make it difficult to compare alternative frozen soil schemes.Gouttevin et al. (2012) have also shown soil temperature to be sensitive to snowfall inaccuracies.Once a snowfall correction had been made, it seemed that the FPA was the dominant process in the Ob but not in the Lena.There may be a need for better understanding of absorption in permafrost regions and effects of the frozen soil active layer to further improve modelling of these regions.

Conclusion
The frozen soil parameterisation and TOPMODEL-based scheme developed by Niu and Yang (2006) has been shown, using a river-routing model, to improve the modelled river flow of two out of three Siberian basins.It has been demonstrated that there is a high sensitivity among the higher latitude basins to snowfall driving data, and an example comparison between two driving data sets has highlighted that the error is greatest over the basin that shows worsened results under the development.An attempt was made to correct for the potential inaccuracies using an observational product.The results then suggested that the Niu and Yang development provides the best representation.Therefore, we conclude that generally the developed model improves Siberian river flow, though in some cases other factors reduce the clarity of this conclusion.
Three aspects have been studied during this investigation: the standard JULES model (CTRL), TOPMODEL (TOP), and the fractional permeable area (NEW).The standard runs (CTRL) generally performed the worst, suggesting that the additional features being explored are important to the system.Introducing a TOPMODEL (TOP) approach adjusted for the lack of surface runoff arising from snowmelt in the standard model.Within the NEW model runs, the use of SIMTOP appeared to manifest mainly as changes in subsurface runoff and thereby affected the water storage of the soil column.The fractional permeable area (FPA) appeared to dominate the moisture fluxes within the soil, increasing downward flow during freeze/thaw periods.
This paper has provided evidence for the improvement to Siberian river flow through use of the discussed frozen soil scheme.The evidence supports the work of Niu and Yang (2006) by successfully applying the concept in an alternative setting.It furthers their work through a more in-depth discussion of the physical nature of parameters introduced and the consequences to modelled soil physics.
Having demonstrated that the parameterisation is appropriate for Siberian basins, the method can be used to study regions such as Scandinavia or Canada.Through obtaining useful results using the ESA snow product, a broader use for the product can now be envisaged by which a study into spatial or temporal effects of snowfall driving data biases may be undertaken.Finally, the paper has highlighted that the Ob shows a greater dependency on the fractional permeable area parameterisation than the Lena.The comparison offers an opportunity to establish the subtleties of river flows in higher latitude basins.
Edited by: T. Zhang and Yang suggest that the averaging Published by Copernicus Publications on behalf of the European Geosciences Union.D. L. Finney et al.: Improved modelling of Siberian river flow

Figure 1 .
Figure 1.Demonstrating the effect of α on the frz F function.The dotted line shows the actual frozen fraction of the soil, the coloured lines show the effective frozen fraction of Niu and Yang.The blue line uses the chosen model parameter while the orange and green lines use parameters that lie just outside the extremes of the selected parameter space.

Fig. 1 .
Fig. 1.Demonstrating the effect of α on the F frz function.The dotted line shows the actual frozen fraction of the soil, the coloured lines show the effective frozen fraction of Niu and Yang.The blue line uses the chosen model parameter while the orange and green lines use parameters that lie just outside the extremes of the selected parameter space.
Figure 2. Demonstrating the effect of f on the sat F function.The max 3 saturation value has been set to an approximate value of 0.3, mean water table 4 Figure 3.A schematic diagram outlining the key differences between the models used in the study.'Model A' is the standard JULES structure, 'Model B' uses an option within JULES to include a water table and topographic effects on hydrology (TOPMODEL), and 'Model C' further modifies TOPMODEL to include FPA and SIMTOP.SIMTOP is not described in the diagram because it has the same structure to TOPMODEL but uses different equations as described in section 2.

Fig. 3 . 2 Figure 4 . 6 Fig. 4 .
Fig. 3.A schematic diagram outlining the key differences between the models used in the study."Model A" is the standard JULES structure, "Model B" uses an option within JULES to include a water table and topographic effects on hydrology (TOPMODEL), and "Model C" further modifies TOPMODEL to include FPA and SIM-TOP.SIMTOP is not described in the diagram because it has the same structure to TOPMODEL but uses different equations as described in Sect. 2.

Figure 5 .
Figure 5. Observed river flow of the Yenisei, Ob and Lena compared to model runs for each using CTRL, TOP and NEW.The points represent the average flow rate for a given month.Each observational point is a climatological average value of 60 consecutive years in the range 1930-99, depending on the basin.The model points are a climatological average value for the decade 1989-99.The RMSE values are the yearly-average values.

Fig. 5 .
Fig. 5. Observed river flow of the Yenisei, Ob and Lena compared to model runs for each using CTRL, TOP and NEW.The points represent the average flow rate for a given month.Each observational point is a climatological average value of 60 consecutive years in the range 1930-1999, depending on the basin.The model points are a climatological average value for the decade 1989-1999.The RMSE values are the yearly-average values.

28Figure 6 .
Figure 6.Total soil moisture comparison over the Lena basin between CTRL and NEW.The average difference in total soil moisture over the decade 1989-1999 between the NEW and CTRL models.

Fig. 6 .
Fig. 6.Total soil moisture comparison over the Lena basin between CTRL and NEW.The average difference in total soil moisture over the decade 1989-1999 between the NEW and CTRL models.

2Fig. 8 .
Fig. 8. Daily average downward moisture fluxes within the soil column over the Ob.A plot of climatological area-average values over model runs for 1989-1999.

Figure 9 .
Figure 9. Modelled river flow using Globsnow-factored driving snowfall data compared to 3 observations.Scaling factors used were: 1.35 for the Ob and 1.7 for the Lena.The points 4 represent the average flow rate for a given month.Each observational point is a climatological 5 average value of 60 consecutive years in the range 1930-99, depending on the basin.The 6 model points are a climatological average value for the decade 1989-99.The RMSE values 7 are the yearly-average values to 3 significant figures.8

Fig. 9 .
Fig. 9. Modelled river flow using Globsnow-factored driving snowfall data compared to observations.Scaling factors used were: 1.35 for the Ob and 1.7 for the Lena.The points represent the average flow rate for a given month.Each observational point is a climatological average value of 60 consecutive years in the range 1930-1999, depending on the basin.The model points are a climatological average value for the decade 1989-1999.The RMSE values are the yearly-average values for 3 significant figures.