Stochastic modelling of thermokarst lakes: size distributions and dynamic regimes

Reinken, Constanze; Brovkin, Victor; de Vrese, Philipp; Nitze, Ingmar; Bergstedt, Helena; Grosse, Guido

doi:10.5194/tc-20-1967-2026

Articles | Volume 20, issue 4

https://doi.org/10.5194/tc-20-1967-2026

Articles | Volume 20, issue 4

Research article

10 Apr 2026

Research article |

| 10 Apr 2026

Stochastic modelling of thermokarst lakes: size distributions and dynamic regimes

Constanze Reinken, Victor Brovkin, Philipp de Vrese, Ingmar Nitze, Helena Bergstedt, and Guido Grosse

Abstract

Thermokarst lakes are among the most common and dynamic landscape features in ice-rich lowland permafrost regions. They influence carbon, water and energy fluxes between atmosphere and land surface, and are an important component of Arctic lowland hydrology. Despite their significant role in the climate system, thermokarst lakes are only rudimentarily or not at all represented in Earth system models (ESMs). Attempts at stand-alone modelling of their dynamics have mostly been limited to the scale of individual lakes. Because lake formation, expansion, and drainage depend on small-scale surface and sub-surface heterogeneities that are difficult to measure, a deterministic modelling-approach would be a challenge at the regional or pan-Arctic scale. We therefore treat these processes as probabilistic across a landscape and create a model of thermokarst lake dynamics using stochastic approaches. With the inclusion of stochasticity and volatility, our method allows us to account for the diversity of individual lake behaviour that results from the small-scale differences in environmental conditions. We present idealized simulations and, additionally, test novel high-resolution remote sensing data products that capture annual lake areas for model initialization and the calibration of inherent or climate-induced lake dynamics. Our model is able to capture three plausible regimes by incorporating the main processes behind thermokarst lake dynamics and represents a new step towards stochastic representation of permafrost landscapes in ESMs. Furthermore, our findings emphasize the importance of continuing the retrieval of remote sensing data for model parameterization and acquiring additional data products containing information on past thermokarst lake behaviour.

Download & links

Article (PDF, 6705 KB)

Download & links

How to cite.

Received: 16 Apr 2025 – Discussion started: 19 Jun 2025 – Revised: 25 Feb 2026 – Accepted: 16 Mar 2026 – Published: 10 Apr 2026

1 Introduction

It is estimated that permafrost underlies about 13 %–18 % of the northern hemisphere’s unglaciated surface (Zhang et al., 2000), and that roughly 1500±100 gigatons of carbon are stored across all permafrost regions (Meredith et al., 2019). This carbon can become subject to increased biodegradation upon warming and subsequent permafrost thaw. Through this process, it is partly released into the atmosphere, thereby increasing the greenhouse effect. This so-called permafrost carbon feedback is especially relevant considering that the Arctic is warming up to four times faster than the rest of the globe. However, it is either inadequately or not at all represented in current Earth system models (ESMs), as they miss essential processes and feedbacks of permafrost landscapes such as abrupt thaw (Schädel et al., 2024; Matthes et al., 2025; Brovkin et al., 2025). Using inventory models, Turetsky et al. (2020) found that carbon emissions through abrupt thaw could increase permafrost carbon emissions by 40 %, which means that current ESMs are likely underestimating the climate effects of degrading permafrost.

A major abrupt thaw process is so-called thermokarst, which is the process of landscape changes due to melting of ground ice and resulting ground subsidence. An estimated 20 %–40 % of the northern permafrost domain is covered by thermokarst-affected landscapes (Olefeldt et al., 2016; in 't Zandt et al., 2020). One prominent product of these landscape changes are thermokarst lakes, that form when water pools in thermokarst depressions. The earliest of these lakes can be traced back to the Late Glacial transition time at the end of the Late Pleistocene and the early Holocene between circa 14–10 ka BP, during which there was a strong warming and wetting trend in the Arctic (Mann et al., 2002; Velichko et al., 2002; Kaufman, 2004; Grosse et al., 2013; Brosius et al., 2021). A synthesis of lake initiation records found two peaks in lake formation rates at 13.2 and 10.4 ka BP (Brosius et al., 2021). The process of depression and lake formation has continued to occur since then and current global warming trends could lead to its acceleration.

Due to thermal erosion along their shorelines, thermokarst lakes expand radially over time. Accordingly, their shapes are often highly circular. However, due to interactions with ground ice distribution and preferential wind patterns as a main driver for waves and currents (Côté and Burn, 2002), there are also regions with oriented thermokarst lakes that have been described as elliptical, triangular, rectangular and egg-, clam- or D-shaped (Grosse et al., 2013; Hinkel et al., 2005; Morgenstern et al., 2011).

Thermokarst lakes can also grow vertically (Fedorov et al., 2014; in 't Zandt et al., 2020). Because of the lower albedo, higher absorption of long-wave radiation, and higher heat storage capacity of water in comparison to ice or dry ground, they often accelerate the thawing of surrounding permafrost, including potential melting of additional ground ice (Hopkins, 1949; Mackay, 1992; Plug and West, 2009; Jorgenson et al., 2010; Grosse et al., 2013; in 't Zandt et al., 2020). If they are deep enough not to freeze to the bottom in winter, they are typically underlain by a so-called talik, which is a layer of year-round unfrozen ground (Lachenbruch et al., 1962; Mackay, 1992; West and Plug, 2008; in 't Zandt et al., 2020). The number of these floating-ice lakes have increased with climate change, which means that the effect on permafrost thaw will likely become stronger in the future (Arp et al., 2018, 2015; Surdu et al., 2014).

Thermokarst lakes can drain gradually or abruptly in a matter of only a few days due to a variety of triggers and mechanisms that depend on climatic and environmental conditions. Drainage can generally be categorized into lateral and internal drainage, with internal drainage happening through sub-surface drainage networks once the lake talik penetrates the permafrost layer (Hopkins, 1949; Mackay, 1988). This type is therefore more common in discontinuous permafrost (Chen et al., 2023). A growing lake can increase its likelihood of internal drainage by creating sub-surface drainage channels through facilitating permafrost degradation (Yoshikawa and Hinzman, 2003; Rowland et al., 2011). Lateral drainage can occur when a lake expands and either a topographic drainage gradient is reached, a newly formed channel network from degrading polygonal ice wedges or thermo-erosion allows seepage, or a positive water balance leads to bank overflow (Mackay, 1988; Hinkel et al., 2007). New surface drainage channels can be created by external factors, such as coastal erosion and tapping by a river, stream or other lake (Mackay, 1988; Hinkel et al., 2007; Arp et al., 2010; Jones et al., 2020). High lake drainage rates have also been linked to a combination of abundant rain- and snowfall and extremely warm mean annual air temperatures (Nitze et al., 2020). Lake areas can also display seasonal changes due to substantial snow-meltwater influx, full or episodic connectivity to fluvial systems, or seasonal changes in the evaporation-precipitation ratio (Cooley et al., 2019; Mullen et al., 2023).

Despite these numerous drainage mechanisms, some lakes have persisted for several thousands of years (Edwards et al., 2016; Bouchard et al., 2017; Anderson et al., 2019) and in some cases, have reached surface areas of more than a hundred square kilometres through lake expansion and merging with other lakes (Weller and Derksen, 1979; Grosse et al., 2013).

Many of the regions with a high number of lakes are lowland regions, typically with a ground ice content of 30 % or more (Jones et al., 2022). Among others, they include the large Arctic river deltas, the Alaskan North Slope as well as the coastal lowlands of the Kara, Laptev, and East Siberian seas (in 't Zandt et al., 2020). Some of these areas are underlain by Yedoma deposits that originated in the late Pleistocene (Strauss et al., 2017). Despite being especially ice-rich, Yedoma deposits also have an especially high organic carbon content ranging from 2 % to 5 %, which adds to the relevance of thermokarst lakes for global carbon emissions (Walter Anthony et al., 2018; Freitas et al., 2025). The accelerated permafrost degradation underneath them can make them significant sources of carbon. Largely anaerobic sediment conditions facilitate biodegradation of organic carbon that produces methane, an especially potent greenhouse gas (e.g. Heslop et al., 2020). Consequently, thermokarst lakes often display larger methane emissions in comparison to other types of Arctic lakes or the surrounding land surface (Wik et al., 2016; Walter Anthony et al., 2018).

Thermokarst lakes can alter local and global climate not only through their impact on the carbon cycle. They also influence energy and water fluxes between land surface and atmosphere as well as surface and subsurface hydrology. Waterbodies typically lead to higher evaporation rates, decreased surface roughness and enhanced diurnal and seasonal cycles of heat storage and release by the land. By altering water storage and drainage networks, they can impact the timing of surface runoff into rivers, streams and the ocean. Using the MPI Earth System Model (MPI-ESM), de Vrese et al. (2023) showed that such changes to energy and water cycling could have effects beyond permafrost regions through biogeophyscial teleconnections. The main biogeophysical effect behind a drying Arctic is likely a decrease in summertime cloud cover. This so-called permafrost cloud feedback could amplify global warming (de Vrese et al., 2024).

In turn, local climate influences the behaviour of thermokarst lakes. However, it is not entirely clear how their distributions will develop under global warming. Several studies have tried to obtain past trends of lake areas in regions across the Arctic using satellite images, with studies observing both negative (Jones et al., 2011; Nitze et al., 2017; Lara and Chipman, 2021) and positive (Nitze et al., 2017; Lara and Chipman, 2021; Veremeeva et al., 2021; Luo et al., 2022; Zhou et al., 2024) trends. Recent satellite data analysis by Webb et al. (2022) and Chen et al. (2023) have found a large-scale drying trend since the year 2000 across lake-rich regions in the northern permafrost domain. According to results from Webb and Liljedahl (2023), decreasing lake area is especially dominant in discontinuous permafrost zones, while continuous permafrost zones show similar numbers of sites with increasing and sites with decreasing lake area. The study also found no proof of a relationship between precipitation-evaporation balance and lake area trends, thereby supporting the hypothesis that the primary driver of Arctic lake area change is permafrost thaw. However, it has been questioned that coarse, kilometre-scale remote sensing data, as used by Webb and Liljedahl (2023), is capable of correctly representing regional scale landscape developments (Olthof et al., 2023; Webb et al., 2023). Nitze et al. (2020) detected an increase in lake drainage events linked to abundant rain- and snowfall and extremely warm mean annual air temperatures in Northwestern Alaska, and suggest that this connection could also be explained by permafrost thaw around the lake margins as a result of these weather phenomena.

Despite the abundance of remote sensing analysis, stand-alone models have been limited to the scale of individual or only a few lakes, even though the region-wide development is of relevance for regional and global climate feedbacks. Ling and Zhan developed a two-dimensional finite element heat transfer model to simulate talik development under thermokarst lakes (Ling and Zhang, 2003) as well as refreezing of taliks in drained lake basins (Ling and Zhang, 2004). This model, however, treats the lake area as constant. Plug and West's numerical model (Plug and West, 2009) simulates the expansion of a lake in cross-section through thermal processes and mass wasting. It focuses on only one single lake and does not include any site specific processes. Building on this, Kessler et al. (2012) developed a three-dimensional thermokarst lake model also integrating evolution of methane fluxes with growing lake dimensions. Using a three-dimensional Stefan equation, Ohara et al. (2022) model the thermokarst lake dynamics and talik structure of an individual lake in northern Alaska. Langer et al. (2016) coupled the bulk model FLake (Mironov et al., 2003; Kirillin et al., 2011; Boike et al., 2015), which simulates thermal dynamics in waterbodies, with the permafrost land surface model CryoGrid3 (Westermann et al., 2016), resulting in a one-dimensional land surface model including a simplified representation of lateral waterbody growth. This representation depends on site specific parameterizations and assumes constant lateral growth. While being useful for individual lake settings, these mentioned models are not designed to capture the diversity of individual lake behaviour and interactions across a landscape.

The lack of pan-Arctic models can be ascribed to the fact that thermokarst lakes are results of small-scale processes that depend on meter-scale surface and sub-surface heterogeneities, which cannot be resolved in larger scale models. This scaling gap leads to a poor representation of permafrost-related processes and possible biases in climate models and ESMs. Even in light of current efforts to increase resolutions of ESMs, a deterministic and physics-based representation of thermokarst lake dynamics on the landscape or pan-Arctic scale remains challenging due to the lack of area-wide high-resolution data on the relevant soil conditions, particularly ground ice distribution. While significant advances in high-resolution remote sensing of the land surface have been made, such as the mapping of ice wedge polygons (Liljedahl et al., 2024), the detection of sub-surface heterogeneities and properties is still difficult.

In Nitzbon et al. (2020) CryoGrid3 was extended beyond the one-dimensional site level by using laterally coupled tiles. While this tiling approach allows for a representation of spatial heterogeneity in surface and subsurface conditions, pan-Arctic or landscape scale simulations would still require initialization data for these heterogeneities including the spatial distribution of ground ice, since each tile contains a one-dimensional vertical representation of the subsurface that needs to be parameterized. The one-dimensional, deterministic, and physics-based nature of CryoGrid3 also suggests that a high resolution would be necessary for reliable results with such large scale simulations, making it computationally expensive.

To simulate thermokarst lake changes over areas of hundreds of kilometres, van Huissteden et al. (2011) tried to bridge the scaling gap between thermokarst processes and available sub-surface data with a stochastic modelling approach. For this approach, lake change processes are assumed to relate linearly to the deviation from a reference climate, specifically annual precipitation, July temperature and mean annual air temperature. The model is grid-based and initialized with random fractions of ice content for each grid cell following a normal distribution. It therefore requires some prior information on maximum ice content and ground ice distribution in the simulated region in order to accurately represent these conditions. The cells in which thaw is happening are selected randomly, however cells with a large ice content are preferred. Drainage is only included through lakes expanding onto a prescribed drainage network consisting of rivers, making the simulation outcome highly depended on these network representations. Other various forms of drainage, such as internal drainage through sub-surface channels, are not included.

In a study of landscape-scale thermokarst lake evolution by Victorov et al. (2019 b), lake formation and expansion are also considered to be stochastic. Based on four cases with different assumptions on these processes, theoretical statistical distributions for lake number and lake areas were derived analytically and empirically verified at 16 sites. For all cases, no drainage is assumed to happen. Under this assumption, the study finds that the number of lakes follows the Poisson distribution on an area that is relatively homogeneous in terms of geomorphological and geocryological landscape conditions. The distribution of lake sizes differs depending on the assumptions. However, the empirical data for the sites shows a lognormal distribution in the majority of the cases, suggesting that “a model based on the proportionality of the growth rate of thermokarst lakes to the average heat loss density through the side surface” might be most suitable to simulate changes in lake areas (Victorov et al., 2019b).

In another study, Victorov et al. (2019 a) considered the development of drained lake basins, often called “khasyreis” in Russian literature, and derived a theoretical statistical distribution of their sizes with a similar analytical approach. Assuming that the distribution of fluvial drainage sources is assumed to correspond to a Poisson distribution and following the same assumptions about lake formation and expansion as in Victorov et al. (2019 b), they find that the average radii and diameters of drained lake basins should follow the Rayleigh distribution, while their number across an area corresponds to a Poisson distribution. In Victorov et al. (2023), they calibrated their model and further modified it using meter-scale satellite imagery for different thermokarst sites in Siberia. Here, they also differentiate between thermokarst lakes that have formed on undisturbed area and secondary lakes that have formed in drained lake basins. They found that the observed size distributions for both mostly fit with theoretical integral-exponential distributions, but have different distribution parameter, and conclude that formation, expansion and drainage of thermokarst lakes is in dynamic equilibrium. Their approach does not directly consider influences of changing climate variables, but rather focuses on inherent lake dynamics under current conditions and drainage due to pre-existing fluvial erosion networks. Therefore it is not entirely suitable for projections of thermokarst landscape developments into the future.

In light of the limitations of previous models for ESM scales and in part based on the assumptions by Victorov et al. (2019 b), we propose a stochastic modelling approach, where formation, expansion and drainage are considered to be probabilistic across a region and simulated using common stochastic processes. We represent formation and abrupt drainage with two independent Poisson processes that calculate the number of new and disappearing lakes per time step. For the simulation of variations in individual lake areas, we use Geometric Brownian Motion, which thereby represents lake expansion and gradual drainage. With this method, it is possible to parameterize the stochastic components of lake dynamics and not only capture possible trends in lake areas, but also their volatility. We conduct three idealized simulations to explore the dynamic regimes that our model is able to capture, and test our approach by parameterizing with remote sensing data products containing a 20 year timeseries of individual annual lake areas.

2 Method

2.1 Model Framework

Our model is written in python and has an object-based approach. A diagram of the modelling scheme can be seen in Fig. B1. The model represents lakes as circular objects in a 2D plane, thereby only considering a lake's surface area. While thermokarst lakes can have different shapes depending on the region and geomorphology (e.g. Grosse et al., 2013), we regard a circular representation to be an appropriate simplification for a first approach, since they can be expected to expand radially under the assumption of flat terrain with broadly uniform ice content. The 2D approach is based on the simplified assumption that thermokarst lakes have similar shapes that can be deduced from their surface area, and puts the focus on land surface and atmosphere exchanges while keeping the model simple enough for computational limitations of a potential implementation into ESMs.

Each lake is defined by the coordinates of its centre and its surface area. The coordinates are randomly chosen from a uniform distribution, meaning that each lake initiation location is randomly distributed in space. The lake's location is relevant for the model's merging algorithm. At each time step, the model will check for overlapping lakes and merge them by adding the area of the smaller lake to the larger lake and calculating a new centre of mass. The smaller lake is then deleted from the lake inventory. We use this simplistic merging algorithm to keep the model from creating overlapping lake areas and as a first step to account for lake merging without resolving spatial surface and sub-surface heterogeneities.

We use a time step of one year, thereby disregarding seasonal changes in lake area. The lakes' number n and annual surface areas $a_{i = 1, \dots, n}$ change over time based on the following approaches. A theoretical derivation of a partial differential equation following our assumptions can be found in Appendix C.

Victorov et al. (2019 b) assume lake formation to be probabilistic and to occur independently on disjoint sites. This assumption of independence is in line with the fact that thermokarst happens due to a localized thawing process that depends on soil conditions at that particular point that are not directly influenced by thermokarst at a different point. When additionally assuming that the formation of the initial thermokarst depressions took place during a relatively short time or approximately at the same time, it can be shown that the number of depressions in an area obeys the Poisson law, meaning that the probability of k thermokarst depressions appearing during one year is

\begin{matrix} (1) & P_{f} (k, t) = \frac{Λ_{f} (t)^{k}}{k!} e^{- Λ_{f} (t)}, \end{matrix}

where Λ_f(t) is the effective formation rate for the whole study area A. It is calculated as Λ_f(t)=λ_fA_f(t), where A_f(t) is the water area in the system at time t. The rate λ_f therefore corresponds to the average number of depressions per unit A_f(t) and unit t. An analysis of observational data from 16 sites showed strong agreement between the distribution of lake numbers and the Poisson distribution (Victorov et al., 2019 b).

When considering a continuous possibility of depression formation over time, instead of only synchronous formation, Victorov et al. (2019 b) assume that the formation occurs independently in different time intervals. An argument for the temporal independence of thermokarst events is the random nature of triggers, such as extreme weather or disturbances through wildfire. Following these assumptions, lake formation can be simulated using a Poisson process. Following Pinsky and Karlin (2011), a Poisson process fulfils the following conditions.

For any time point, the process increments are independent random variables.
For s≥0 and time t>0, the random variable $X (s + t) - X (s)$ has the Poisson distribution (Eq. 1).
The random variable X(0) has the value 0.

We translate this idea into our computational model by implementing an operation that picks the number of new lakes at each time step from a Poisson distribution using the function numpy.random.poisson. The model is initialized with an array containing a maximum number of possible lakes with an initial area of zero. At each time step, the picked number of new lakes is activated from the initialization array and each of them get an initial area of a₀=1 ha. From then on, their indices will be tracked in an array for active lakes and they will be subject to annual lake area changes.

Victorov et al.'s (2019b) analytical derivation of lake size distribution is based on the assumption that lake growth happens independently for individual lakes and that it is proportional to the heat reserves in the lake and inversely proportional to the lateral area of the lake basin. This relationship follows when assuming that the growth rate is proportional to the heat loss through the side surface of the lakes, which is the process that leads to further permafrost degradation in the surrounding area and thereby to lake expansion. From this, Victorov et al. (2019 b) derive the following equation for the increment of the lake radius Δr_j for the jth year, assuming annual dynamics:

\begin{matrix} (2) & Δ r_{j} = ϵ_{j} r_{j} . \end{matrix}

Here, r_j is the lake radius at year j and ϵ_j can be written as

\begin{matrix} (3) & ϵ_{j} = \frac{α c T °}{2} ϵ_{j}^{0}, \end{matrix}

with c denoting specific heat, T° the average temperature of water, α the share of the heat amount in the water mass that leaves through the lateral area of the lake basin, and $ϵ_{j}^{0}$ a random variable that takes the impact of episodic factors into account. These factors can include the thickness of the snow cover, the volume of storm runoff, soil temperatures, precipitation and other variables. Since the values ϵ_j are independent, the central theorem can be applied, which says that the sum of a large number of independent items is normally distributed. It follows that the growth process of thermokarst radii can be considered as a Markov random process with the following transition function (Victorov et al., 2019 b), assuming that the future behaviour only depends on the current state and not on the past (Pinsky and Karlin, 2011; Ibe, 2013):

\begin{matrix} (4) & f (v, r, t) = \frac{1}{\sqrt{2 π} σ r \sqrt{t}} e^{- \frac{(\ln \frac{r}{v} - μ t)^{2}}{2 σ^{2} t}} . \end{matrix}

The initial radius of the thermokarst depression is denoted with v, while μ,σ are the distribution parameter and r is the depression radius at time t. Assuming that the initial size v is one unit radius, it can be derived that the radii have a lognormal distribution at any time t (Victorov et al., 2019 b):

\begin{matrix} (5) & f_{r}^{0} (r, t) = \frac{1}{\sqrt{2 π} σ r \sqrt{t}} e^{- \frac{(\ln r - μ t)^{2}}{2 σ^{2} t}} . \end{matrix}

This equation still holds true when exchanging the lake radius r with the lake's surface area a, because these two variables have a quadratic relationship. Agreement of observed lake sizes with this distribution was found in 14 out of the 16 sites that were used for empirical verification (Victorov et al., 2019 b).

To model a lognormally distributed variable such as the lake area, the stochastic process of Geometric Brownian Motion (GBM) can be used, which fulfils the condition of a Markov process and is a continuous-time stochastic process, in which the logarithm of the random variable follows Brownian Motion (BM) (Ibe, 2013). BM is a “random walk” process that models random continuous motion (Ibe, 2013). GBM is defined by the stochastic differential equation (SDE)

\begin{matrix} (6) & d Y (t) = μ Y (t) Δ t + σ Y (t) d B (t) \end{matrix}

with Y(t)=e^X(t), where X(t) is a BM with drift μ and can be written as $X (t) = μ t + σ B (t)$ (Ibe, 2013). The parameter σ is the volatility and B(t) is a standard BM, which has a mean of zero. The drift parameter μ alters the process so that there is an overall trend in the behaviour of the random variable. It can also be called the “deterministic component” of the process, whereas the volatility parameter is the “stochastic component” (Ibe, 2013), as it is a measure of the process' randomness. It can therefore be viewed as a representation of the random variable $ϵ_{j}^{0}$ from Eq. (2).

To compute the individual lake area a_i(t) at time t, we use the SDE's solution:

\begin{matrix} (7) & a_{i} (t) = a_{i} (t - 1) e^{(μ - \frac{1}{2} σ^{2}) t + σ B (t)}, \end{matrix}

where $i = 1, \dots, n$ .

We include the BM term B(t) in our model using the numpy function random.normal, which randomly chooses a value from a normal distribution. In our case, we use time step √Δt of our simulation as the SD for this distribution.

Since GBM can be described as a form of exponential growth with volatility, it is also in line with the mentioned assumptions on lake expansion from Victorov et al. (2019 b). For lakes that are assumed to be cylindrical objects with the same shape, these assumptions can be simplified to a rate of change of lake surface area and a coefficient of proportionality that includes a random variable taking into account the impact of episodic factors as in Eq. (2).

We treat gradual and abrupt drainage as two separate processes, since we assume them to have different triggers. We include gradual drainage in the GBM by allowing the drift parameter μ to be negative (Eq. 7). This way, lake areas in the model can decrease from one year to another not only due to volatility but also due to changes in the sign of the drift. This type of drainage is often driven by slower processes such as evaporation, slow creation of sub-surface drainage channels as a result of seepage through the active layer, sedimentation, or vegetation growth within the lake. Abrupt drainage is triggered by more sudden events such as bank erosion or connection with an existing topographic drainage gradient like a river or a neighbouring drained lake basin. In our case, we define abrupt drainage as a drainage event during which a lake drains completely in less than a year. Lakes that are subject to such an abrupt drainage event are removed from the pool of active lakes and will therefore stay drained throughout the rest of the simulation. Gradually drained lakes, on the other hand, are not removed and can still become subject to expansion or further gradual drainage. However, if an area has reached a value of zero due to gradual drainage, it will effectively stay zero for the rest of the simulation due to the nature of GBM (Eq. 7).

We model the number of abrupt drainage events per time step t with another Poisson process with probability for k events

\begin{matrix} (8) & P_{d} (k, t) = \frac{Λ_{d} (t)^{k}}{k!} e^{- Λ_{d} (t)}, \end{matrix}

where Λ_d(t) is the effective abrupt drainage rate for the whole study area A in year t, which is calculated as Λ(t)=λ_dA_d(t). A_d(t) can either consist of only surface water across the region or the sum of water and drained area. With this scaling, we make sure that the drainage rate is only above zero when lakes are present in the landscape, and that the drainage rate increases the more populated with lakes or depressions the area becomes. Since it is not clear, which of the two options for A_d(t) is more realistic, we implement two different variants that also concern the scaling of formation rate λ_f. When a lake is completely drained, the model saves its index in a respective array for drained lakes and deletes it from the array for active lakes, to which GBM is applied. It also tracks any area that used to be filled with water and classifies it as “drained area”. While the majority of this area is likely not susceptible for new thermokarst as long as no new ground ice has accumulated underneath, a new lake could still form on it due to water pooling in the existing depression and the basin margins might still contain some ground ice that has not been fully depleted, leading to a particularly high chance of renewed lake formation there. Also, an only partially drained lake could still have an increase in surface water due to high inflow. This makes it difficult to decide how to handle the influence of drained area on lake dynamics in the model. While the model always makes it possible for partially drained lakes to grow again, the possibility of new lakes forming on drained area depends on which model variant is used. In Variant 1, both the sum of water area $\overline{A_{water}}$ and the sum of drained area $\overline{A_{drained}}$ across the region are subtracted from the overall area of the simulated region A to obtain A_f(t), and both are added together to obtain A_d(t):

\begin{matrix} (9) & \begin{aligned} A_{f, v1} (t) & = A - (\overline{A_{water} (t)} + \overline{A_{drained} (t)}) \\ A_{d, v1} (t) & = (\overline{A_{water} (t)} + \overline{A_{drained} (t)}) . \end{aligned} \end{matrix}

In Variant 2, we only consider $\overline{A_{water}}$ , essentially counting $\overline{A_{drained} (t)}$ towards A_f(t) and neglecting it for A_d(t):

\begin{matrix} (10) & \begin{aligned} A_{f, v2} (t) & = A - \overline{A_{water} (t)} \\ A_{d, v2} (t) & = \overline{A_{water} (t)} . \end{aligned} \end{matrix}

The influence of water and drained area is only incorporated implicitly. The areas A_f(t) and A_d(t) influence the number of formation or drainage events across a region, but not the spatial distribution of new lakes. It is therefore possible to see new lakes forming on drained area in both variants. In Variant 1, however, more drained area will lead to a smaller likelihood of lake formation and a higher likelihood of lake drainage. This effect does not exist in Variant 2, because in this model version these likelihoods are only dependent on the water area. For the calculation of drained area, Variant 2 subtracts new water area from the existing drained area in proportion to the current drained area fraction, and does not consider the location of the new water area explicitly. For instance, in a cell with a drained area fraction of 0.3, 30 % of new water is assumed to appear on this drained area. In Variant 1, none of this water is assumed to appear on drained area.

If the area of a merged lake drains gradually, we include the possibility for post-merge fission, which allows merged lakes to split again. The model memorizes the areas of lakes at the time they merged. After drainage, it checks if any of the areas of merged lakes have reached a value below the sum of the areas at time of merging. If that is the case, the lakes split and reassume their initial locations. The area is distributed among them according to their initial ratios. The lake that was incorporated into the bigger lake during the merging process and was removed from the pool of active lakes, gets reintroduced into that pool.

We introduce an area fraction limit A_lim, which can stop lakes from growing any further. This limit represents the fraction of the study area that can theoretically be populated by lakes. When it is reached, no expansion of existing surface areas is allowed, but lakes can still shrink at that point. Expansion is prohibited until the area fraction is below the limit again. Formation, however, is not explicitly inhibited. Instead, the formation probability will naturally be zero at A_lim due to the implementation of the area-fraction scaling. In Variant 1 this includes drained lakes, meaning that the ratio of A_d,v1(t) (Eq. 9) and the study area needs to reach the limit for lakes to stop growing. In Variant 2, only the water area fraction influences lake dynamics. It is therefore the ratio of A_d,v2(t) (Eq. 10) to the study area, that needs to reach the limit. Table 1 shows an overview of the main differences between the two model variants. Values for the limit should be determined separately for different landscapes in the Arctic, considering the average ground ice content, topography and other landscape conditions. This task, however, is not trivial. For the purposes of this work, we mainly use a fraction limit of A_lim=1 as default, essentially assuming that 100 % of any simulated area can become covered with surface water.

Table 1The two model variants and their respective definitions for the areas used for scaling of formation and abrupt drainage rates λ_f and λ_d according to Eqs. (1) and (8), and the underlying assumptions, or rather implicit effects, on the formation and abrupt drainage possibilities.

Download Print Version | Download XLSX

Since the model does not contain explicit seasonality, it is not suitable to investigate dynamics with temporal scales of less than a year. For this reason, the used time step should be one year or longer, even though the calculation of water and drained area fractions is largely time step independent, except for a small effect due to the merging algorithm (Appendix D).

Lake dynamics are influenced by environmental and climatic conditions. We therefore assume that our parameters (formation rate λ_f, abrupt drainage rate λ_d, drift μ and volatility σ) have different values for different geocryological regions/landscapes and that they can change with a climate forcing. The latter dependence can be accounted for by implementing them as a function of a set of climate variables.

2.2 Parameterization Approach

The values for drift μ and volatility σ as well as formation λ_f and drainage rate λ_d can theoretically be estimated using time series of observed lake areas. For the estimation of μ and σ, it is necessary to calculate the logarithmic returns R_log for each lake, i.e.

\begin{matrix} (11) & R_{log, i} (t) = \ln (\frac{a_{i} (t)}{a_{i} (t - 1)}) . \end{matrix}

The volatility σ can then be estimated by taking the standard deviation (SD) across all lakes for each time step in order to obtain a time series of σ. For this, we use the function np.nanstd, which calculates σ(t) with equation

\begin{matrix} (12) & σ (t) = \sqrt{\frac{\sum_{n = 1}^{N} | R_{log, n} (t) - \overline{R_{log, n} (t)} |}{N}}, \end{matrix}

where N denotes the number of values and $\overline{R_{log, n} (t)}$ the mean of R_log,n(t). That mean can be used to get an estimate for μ(t), after correcting it for the negative effect on the returns that is being exerted by σ:

\begin{matrix} (13) & μ (t) = \overline{R_{log, n} (t)} + \frac{σ (t)^{2}}{2} . \end{matrix}

A timeseries of estimates for formation rate λ_f can be obtained by simply identifying and counting lakes that had a water area of zero until a specific year. For the drainage rate λ_d all lake objects can be counted that have a water area of zero from one specific year until the end of the recorded timespan. Because λ_d represents the abrupt drainage rate specifically and therefore should not contain lakes that have drained gradually, we only consider lakes that have lost 50 % of their maximum water area in one time step. Depending on which simulation variant we choose, we need to scale formation and drainage rate using either A_f,v1(t) and A_d,v1(t) or A_f,v2(t) and A_d,v2(t) according to Eqs. (9) and (10), respectively.

For the implementation of a climate dependence, the correlation between these parameters and climate variables should be investigated and quantified. We assume thawing degree days (TDD) and total annual precipitation P to be the most relevant climate variables. TDD are the sum of the temperatures for every day in a year, where the average daily temperature is above freezing. On these days, it is possible that some of the ground ice starts to melt, thereby initiating ground subsidence and possible lake formation, drainage and lake expansion. High precipitation can also lead to lake formation, lake expansion, or drainage by facilitating permafrost degradation and water pooling. Furthermore it leads to an increased water input into both existing lakes as well as drainage channels. In order to reduce the stochastic effects, we smooth the timeseries of formation and drainage rate by taking the 3 year running mean of these rates and comparing them to the 3 year running mean of the climate variables.

2.3 Simulations

2.3.1 Idealized

Our model is able to capture three general dynamic regimes: complete drainage of the landscape (a), oscillating behaviour of water and drained area (b), and stabilization of area fractions (c). To demonstrate these regimes, we perform three exemplary 1000 year simulations with idealized parameterizations, which can be found in Table 2. These parameter values are hypothetical and remain constant, meaning that they are independent of a climate-forcing. They are chosen to create qualitative and demonstrative examples of the regimes within the simulation period of 1000 years with a time step of one year, and should therefore not be interpreted as realistic representations of observed lake dynamic speed. Lake expansion, for example, can generally be assumed to be slower by about an order of magnitude. Most observed mean expansion rates lie between 0.1 and 2 m yr⁻¹ with some single lakes showing higher rates of up to 6.01 m yr⁻¹ (Jones et al., 2011). Since the parameters are hypothetical, the simulations could also be interpreted as simulations over a 10 ka year time span with a time step of 10 years. In that case, the parameter values in Table 2 would have to be considered to be per decade and not per year. All three of the simulations start with a landscape without lakes or freshly drained lake basins. The simulation area is 40 km×40 km. This size is representative of a grid cell size for the Arctic in the Max Planck Institute's land surface model JSBACH/ICON-Land (Jungclaus et al., 2022) with R2B6 resolution.

Table 2Hypothetical parameter values, fraction limit and model variant used for the three different idealized simulations. Parameters are per year; for a 10 year time step interpretation, values would be per decade.

Download Print Version | Download XLSX

2.3.2 Observation-based

We also test our approach by applying the parameterization method to a dataset of lake areas (Nitze and Nicholson, 2025) derived from time-series of the JRC Global Surface Water Dataset (Pekel et al., 2016), which was itself derived from the multi-temporal orthorectified Landsat 5, 7, and 8 archive data. Pekel et al. (2016) classified the pixels in these satellite images as open water, land, and non-valid using expert systems, visual analytics, and evidential reasoning. As part of a larger effort to create a pan-Arctic lake change dataset, we calculated lake footprints based on the methodology by Nitze et al. (2017, 2018) for the Arctic part of UTM Zone 54 N and an extended time series from 2000–2021 (Nitze and Nicholson, 2025). That is, for each lake larger than 1 ha, lake polygon footprints were obtained as shown in Fig. 1. Then, the annual surface water areas within each lake polygon were extracted from the JRC Global Surface Water Dataset for the period 1984–2021 using Google Earth Engine, employing the geemap (Wu, 2020) and eemont (Montero, 2021) Python packages. Within each polygon, the fraction of permanent water area, land area, no-data area, and seasonal water area for each year were extracted. However, lake area values before 2000 were not reliably available due to missing or sparse Landsat observations.

https://tc.copernicus.org/articles/20/1967/2026/tc-20-1967-2026-f01

Figure 1Study region with location of the lake change dataset. An Arctic stereographic projection map shows the location of the study region in Eastern Siberia at the coast between Laptev and East Siberian Sea (a). The study region and calculated lake footprint dataset encompass the northern part of UTM Zone 54. The lakes from the dataset are shown as blue polygons (b). We focus on a random 40 km×40 km cell within that region (c).

As an additional aid, we used a dataset by Chen et al. (2023), who identified lake drainage events across the Arctic using an object-based analysis assisted by the JRC surface water data product as well as another surface water product published by the Global Land Analysis and Discovery (GLAD) team (Pickens et al., 2020). Both products are derived from the archive of 30 m-resolution Landsat imagery. For their analysis, Pickens et al. (2020) defined drainage events as any time a lake lost more than 50 % of its area and was initially larger than 1 ha. They identified the main year of drainage for each event. Due to the discontinuous nature of the Landsat data, clear identification of lake drainage could only be done for the years 2000–2020, which is the period that we focus on.

Our study area (Fig. 1) encompasses about 48 623 km² and lies in the coastal region of the Yana-Indigirka Lowland in Siberia. We chose this test region for its high density of thermokarst lakes and basins. It spans the northern part of UTM Zone 54 N between 138 and 144 ° E, and 70 and 73 ° N. According to the ecoregion classification by Olson et al. (2001), the study area can be classified as Northeast Siberian coastal tundra. In the landform categorization of the circum-Arctic Permafrost Region Pond and Lake database (PeRL) (Muster et al., 2017), the area is denoted as alluvial-limnetic with continuous permafrost and about 50 % ground ice content. It is underlain by Yedoma deposits and contains 22 939 lakes that were identified in the lake area dataset. We also tested our approach for a random 40 km×40 km cell within the region (Fig. 1), representing a grid cell size for the Arctic in JSBACH/ICON-Land with R2B6 resolution in line with the idealized simulations in Sects. 2.3.1 and 3.1. This cell contains 522 identified waterbodies. Before applying the parameterization method to the lake area data, we exclude all values for polygons and years where any fraction of the area was classified as no-data.

After obtaining parameter estimates with the method described in Sect. 2.2, we investigated whether there was a quantifiable relationship between the stochastic parameters and climate variables TDD and total annual P (Fig. 5a) between 2000 and 2019. We obtained the climate data from reanalysis data from the Global Soil Wetness Project Phase 3 (GSWP3) (Lange et al., 2021), and calculated TDD from the daily temperatures. We then tried to find correlations between the parameters and the climate variables by calculating the Pearson and Spearman correlation coefficients between the two timeseries. For these calculations, we used the corr method from the python package pandas. As it is not clear how much the system lags in response to the forcing, we checked the correlation between parameter estimates and climate variables in the same year as well as between parameter estimates and climate variables in the previous year.

Instead of deriving a climate-dependent function for each parameter, it is also possible to calculate the parameter as constants by taking the mean of the calculated timeseries, thereby assuming them to be stable over that time period. For the calculation of μ and σ, we first created the rolling means for 3-year windows to get more robust results. The constant rates λ_f and λ_d are calculated using the counting method described in Sect. 2.2 and dividing the sum of counted lake objects by the years in the timeseries. As an example, we tested this approach in the 40 km×40 km cell, chosen for its representativeness of a typical land surface model resolution and its relative computational inexpensiveness.

3 Results

3.1 Idealized Simulations

In the following, we discuss the idealized simulations as described in Sect. 2.3.1 and show timeseries of simulated water and drained area as well as lake number. For each regime, we also include three examples of spatial plots that show the active and drained lakes in the system from a bird's eye view at different times of the simulation. These spatial plots should be seen as visual aids to understand the lake distributions, rather than realistic representations of the simulated landscapes, since lake surfaces are simplified as circles in our model and the boundaries of the available landscape only affect area fraction implicitly. This is why lakes can be seen growing beyond the area's boundaries, even though overall water or disturbed area fraction is limited to a value of 1 or lower in all three simulations. A comparison of the Gini coefficients (e.g. Cowell and Flachaire, 2015) for the three simulations can be seen in Appendix E.

Regime A: Complete Drainage

For a complete drainage regime, we use Variant 1. As described in Sect. 2.1, this variant's lake formation will become less likely the more the landscape is filled with water and drained area. At the same time drainage probability will increase. As a consequence, the system will likely reach a point where every lake has drained and no new lakes can form. In Fig. 2a an initial increase in water area fraction an initial increase in water area fraction and number of lakes in the study region can be seen, as first lakes form in the previously lake-free system. In year 70 of the simulation, a lake-filled landscape has established and first lakes have started to drain. A spatial representation of this can be seen in Fig. 2b. Newly formed lakes have been randomly distributed in the simulation area of 40 km×40 km. Since μ is positive, lakes have also expanded, causing them to reach different sizes and occasionally merge. With more larger lakes, abrupt drainage is more likely to result in stepwise decreases of water area fraction, as can be seen in Fig. 2a between year 122 and 123. In year 122 the system is mainly dominated by one large lake (Fig. 2c), which completely drains in year 123. Lakes continue to drain until the water area fraction has reached zero and no active lakes remain (Fig. 2d). Since lake formation probability is scaled by A_f,v1 according to Eqs. (8) and (9), no new lakes can form after the drained area has reached a certain threshold. This leads to a stable system, in which almost all area is covered by drained basins (Fig. 2a). While there are no known regions that are completely covered by drained lake basins, but do not have any extant lakes, there are areas with an especially high fraction of drained area, namely the Yamal Peninsula in Siberia (von Baeckmann et al., 2024) with 78 % and the Yukon–Kuskokwim Delta (Jones et al., 2022). It can not be ruled out that the remaining lakes in these regions will also drain in the future without new lakes forming.

https://tc.copernicus.org/articles/20/1967/2026/tc-20-1967-2026-f02

Figure 2Model output for an idealized simulation for a complete drainage regime (“Regime A”) using the parameter values from Table 2: Timeseries of water and drained area fraction as well as lake number (a) and spatial representation of lake distribution and drained area extent for simulation year 70 (b), year 122 (c) and year 150 (d). Water area is displayed in blue and drained area in light brown.

Download

Regime B: Oscillation

In order to achieve a regime, in which water and drained area oscillate with an inverse correlation, it is necessary for new lakes to be able to form on already drained area. We therefore use Variant 2 for this regime. Within the first ca. 80 years/iterations of the simulation, the system behaves similarly to regime A, while first lakes form and expand. Eventually, some become affected by abrupt drainage (Fig. 3a). The resulting landscape in year 70 (Fig. 3b) resembles the landscape at the same time of regime A in Fig. 2b. A similar complete drainage regime as described for regime A happens between year 109 and 110. In this simulation, however, new lakes eventually appear, since Variant 2 scales the formation probability with A_f,v2 (Eqs. 8 and 9), which only contains the water area fraction. This means, that lake formation probability will rise when lakes drain and water area fraction decreases, regardless of large parts of the area being covered by drained basins. This behaviour repeats itself six times within the simulation period, as can be seen in Fig. 3a. Figure 3c represents a snapshot of the system at year 315, where lake expansion and merging have lead to one big lake, which is close to draining, leading to the second of the abrupt system changes due to a stepwise decrease of water area fraction to almost zero. In year 880 (Fig. 3d), a landscape with several newer differently sized lakes that resemble the lake distribution from year 70 (Fig. 3b) is visible. However, the lakes in year 880 are already the sixth generation of lakes. This type of landscape, where active lakes have appeared on drained lake basins, has been found in several regions across the Arctic (Bockheim et al., 2004; Jones et al., 2012; Roy-Léveillée and Burn, 2017; Fuchs et al., 2019; Bergstedt et al., 2021). Interpreting the simulations as spanning 10 ka years and having a time step of 10 years, would mean that there are six simulated lake generations over the course of 10 ka years, which is the same order of magnitude that has been observed, for example on the Seward Peninsula in Alaska, where Jones et al. (2012) found six generations of overlapping lakes that have formed over the course of the Holocene with the oldest being roughly 9 ka years old.

https://tc.copernicus.org/articles/20/1967/2026/tc-20-1967-2026-f03

Figure 3Model output for an idealized simulation for an oscillating regime (“Regime B”) using the parameter values from Table 2: Timeseries of water and drained area fraction as well as lake number (a) and spatial representation of lake distribution and drained area extent for simulation year 70 (b), year 315 (c) and year 880 (d). Water area is displayed in blue and drained area in light brown.

Download

Regime C: Stabilization

Quasi-stabilization of the area fractions can occur when they have reached the pre-determined fraction limit, which keeps lakes from expanding further. For stabilization, drainage and formation rate also need to be chosen in a way that both processes are in balance at the fraction limit or their probabilities are close to zero. Additionally, lakes should not become too big, as abrupt drainage of a large lake would have a higher impact on the water area fraction than drainage of a smaller lake and would lead to more pronounced dips in water area fraction. To achieve a quasi-stabilization, we therefore also decrease μ and σ compared to the other regimes. When the sum of water area fraction reaches the limit 0.5 in Variant 2, lakes stop expanding. While lakes continue to form, they are also subject to abrupt drainage. This process happens often enough to counterbalance the water area increase through formation, so that the water area fraction oscillates at the fraction limit for the rest of the simulation period (Fig. 4a). Figure 4b shows the landscape at year 150, with many relatively small lakes. By year 500 these have expanded and merged into bigger lakes. Additionally, some smaller lakes have formed and few lakes have drained (Fig. 4c). The larger size of the older lakes in year 900 (Fig. 4d) can mostly be explained by the merging with newly formed lakes. In this regime, some individual lakes survive the simulation period while only slightly changing size. Thermokarst-affected regions with relatively stable lake area fractions and no large-scale trends over several decades have been identified using remote sensing data (e.g. Jones et al., 2009).

https://tc.copernicus.org/articles/20/1967/2026/tc-20-1967-2026-f04

Figure 4Model output for an idealized simulation for a stabilizing regime (“Regime C”) using the parameter values from Table 2: Timeseries of water and drained area fraction as well as lake number (a) and spatial representation of lake distribution and drained area extent for simulation year 150 (b), year 500 (c) and year 900 (d). Water area is displayed in blue and drained area in light brown.

Download

3.2 Observation-based Simulation

For the whole study region, we found that 30 % of the data points were not useable due to missing pixels in the satellite data, with each year being differently affected and some years having as little as 14 % of useable data. Still, we obtained a timeseries of μ and σ with the remaining data points via the method described in Sect. 2.2. We could not confidently identify a lake that has formed within the time period and therefore assume formation rates of zero for each year. Chen et al. (2023) found 36 drainage events within the cell between 2000 and 2020. However, some of these events did not lead to a complete drainage of the lake, thereby not completely fitting into our definition of “abrupt drainage”. We therefore only considered those drainage events, for which the authors detected a drainage percentage of 90 %, which left 10 events. The area had an average TDD increase of 7.26 ° D over the 20 year period and an increase in P of 0.448 mm (Fig. 5a).

https://tc.copernicus.org/articles/20/1967/2026/tc-20-1967-2026-f05

Figure 5Thawing degree days TDD and annual total precipitation P between 2000 and 2020 in the study region (a). The data was derived from daily GSWP3 data (Lange et al., 2021). Exponential fit between calculated volatility σ and TDD of the previous year (b). Spearman (c) and Pearson (d) correlation coefficient between parameters and the previous year's TDD and annual precipitation in the study region between 2000 and 2020.

Download

Based on our analysis of the available data, we cannot confidently determine a climate influence on our lake parameters. Therefore, we do not have a robust basis for a climate dependent model parameterization. The resulting correlation coefficients for the comparison with the corresponding year's climate are shown in Fig. F1a and b, and for the previous year's climate in Fig. 5d and c. The strongest correlation was found between volatility σ and previous year's TDD, with a Spearman correlation coefficient of −0.67 (Fig. 5c) and Pearson correlation coefficient of −0.56 (Fig. 5d). This translates into an exponential regression with a determination coefficient R² of 0.45 (Fig. 5b), which we considered to be too weak in order to confidently use this regression function for the volatility calculation in the model. The other correlation coefficients for this analysis were estimated to be between 0.12 and −0.19 (Fig. 5d and c), indicating weak or no correlation. This suggests that either the system has no immediate response to changes in thawing degree days or total annual precipitation during the time from 2000 and 2020, or that the stochastic component is too high to detect it. It is also important to note, that the large data gaps likely influence the robustness of derived parameter estimates and that the drainage rate was calculated based on relatively few drainage events within a 20 year period.

In the 40 km×40 km cell, about 33 % of data points contain an area fraction that is classified as no-data. Chen et al. (2023) found one drainage event in year 2013 that fits into our definition of “abrupt drainage”. Similarly to the whole study area, a significant correlation for this subset was found neither for a comparison with the previous year's climate variables (Fig. 6a and b) nor the corresponding year's climate variables (Fig. F1c and d).

https://tc.copernicus.org/articles/20/1967/2026/tc-20-1967-2026-f06

Figure 6Spearman (a) and Pearson (b) correlation coefficient between parameters and previous year's thawing degree days TDD and annual precipitation P in the random 40 km×40 km cell within the study region between 2000 and 2020.

Download

The calculated constant parameter values for the 40 km×40 km cell are shown in Table 3. It cannot be confidently determined, however, whether these values describe inherent and climate-independent lake dynamics or whether they are specific to the particular climate change scenario. The volatility estimate σ was about 5.65 times higher than the drift μ. This suggest that environmental factors that influence individual lake behaviour were the dominant component that drove regional-scale lake dynamics in this area between 2000 and 2020. This is also in line with the low signal-to-noise ratio of 0.37, which we calculated for the mean logarithmic returns (Eq. 11) across all lakes. When the estimates for μ and σ were put into Eq. (7), the term $(μ - \frac{1}{2} σ^{2}$ ) took on the value 0.01, which was smaller than the average value of the term B(t). This means that the stochastic term σB(t) would dominate simulated variations in lake area.

Table 3Parameter values obtained from remote sensing data products with the parameterization method described in Sect. 2.2.

Download Print Version | Download XLSX

We tested the constant parameterization with ten 100 year long ensemble simulations for Variant 1 and Variant 2, respectively. In these simulations, we essentially assumed that the parameter values do not change during that time period, because either the landscape characteristics and inherent dynamics will continue to dominate dynamics or the trends in climate variables will stay the same as during the parameterization period from 2000–2020. We used the lake size distribution from the year 2000 for initialization and distributed the differently sized lakes randomly in the spatial domain.

Figures 7a and 8a show the simulated water and drained area fraction for each member in the ensemble for the two variants, as well as the ensemble mean and the SD. On this timescale, distinct differences between the two variants can not be determined. The ensemble means of both variants showed an increase of water area fraction. In Variant 1, the fraction of water area in the study region increased by 231 % from 0.16–0.53 during the 100 years of simulation. In Variant 2, the water area fraction increased similarly by 241 %, resulting in a final water area fraction of 0.55 by year 2099. The yearly growth rate of lake radii averaged 0.53 m a⁻¹ across all ensemble members for Variant 1 and only 0.33 m a⁻¹ for Variant 2. The SD of water and drained area fraction in Variant 1 temporarily reached values of 0.16 and 0.07, respectively. In Variant 2, they reached similar values of up to 0.14 and 0.07. Ensemble runs with Variant 1 exhibited between 5 and 17 abrupt drainage events over the simulation period, while Variant 2 experienced between 4 and 15. In addition, Variant 1 simulated between 5 and 18 lakes that lost more than 25 % of their area until the end of the simulation through gradual drainage, whereas Variant 2 simulated between 4 and 20. With Variant 1, lake numbers decreased from the initial 512 lakes to between 161 and 266 lakes by the end of the simulation, while Variant 2 led to 94–254 remaining lakes. Figures 7b and 8b show the lake size distribution across all ensemble members with mean M and SD at the start year 2000, year 2020 and year 2099. With both variants, the distribution flattened slightly and remained skewed to the right, while mean and SD increased. A spatial representation from an example ensemble member for these three time points can be seen in Figs. 7c–e and 8c–e. As explained in Sect. 3.1 such representations should be viewed as visual aids only. By year 2020 (Figs. 7d and 8d) some lakes had slightly decreased in size as a consequence of the relatively high volatility parameter σ. Most, however, had grown or merged. By year 2099 (Figs. 7e and 8e) in the two example simulations one or more especially large lakes had formed, dominating the landscape.

https://tc.copernicus.org/articles/20/1967/2026/tc-20-1967-2026-f07

Figure 7Model output for an ensemble simulation with Variant 1 and using the parameter values from Table 3: Time series of water (top) and drained area fraction (bottom) of all ensemble members as well as ensemble means and SD (a). Histograms for size distribution across all ensemble members (b) at start of simulation (left), year 20 (middle) and year 2099 (right) with mean M and SD. Spatial representation of lake distribution and drained area extent of an example ensemble simulation for start of simulation (c), year 20 (d) and year 2099 (e) with water area displayed in blue and drained area in light brown.

Download

https://tc.copernicus.org/articles/20/1967/2026/tc-20-1967-2026-f08

Figure 8Model output for an ensemble simulation with Variant 2 and using the parameter values from Table 3: Time series of water (top) and drained area fraction (bottom) of all ensemble members as well as ensemble means and SD (a). Histograms for size distribution across all ensemble members (b) at start of simulation (left), year 20 (middle) and year 2099 (right) with mean M and SD. Spatial representation of lake distribution and drained area extent of an example ensemble simulation for start of simulation (c), year 20 (d) and year 2099 (e) with water area displayed in blue and drained area in light brown.

Download

4 Discussion

While the works by Victorov et al. (Victorov et al., 2019 b, a, 2023) use analytical approaches to derive equilibrium distributions of lake and drained lake basin areas and numbers, our model relies on numerical methods and can represent three different dynamic regimes, depending on parameterization and model variant. Furthermore, it allows for the inclusion of a climate forcing by implementing the model parameters as functions of climate variables besides also capturing internal lake dynamics that result from accelerated permafrost degradation due to the accumulation of water, rather than climate changes.

In principle, the model can provide information on water and drained area fractions in thermokarst-affected areas over time and in response to a climate forcing. With this information, modelling of land–atmosphere fluxes of carbon, energy and water from permafrost-shaped landscapes could potentially become more accurate. Especially, estimates of net carbon emissions and the ratio between emitted methane and carbon dioxide could be improved, since new thermokarst lakes can be a significant regional methane source (e.g. Walter et al., 2007; Turetsky et al., 2020). Besides water and drained area fractions, our model can also provide the size distributions of lakes, which gives information on lake shoreline lengths and presumable lake depths. Furthermore, our model can keep track of the age of single lakes since the start of a simulation, which gives an indication of potential vegetation cover along the shoreline. These attributes have been shown to influence microbial decomposition processes and carbon emission pathways in waterbodies (Kutzbach et al., 2004; Juutinen et al., 2009; Knoblauch et al., 2015; Polishchuk et al., 2018; Rehder et al., 2021, 2023).

Some ESMs could use our model output to set the area of surface water in grid cells that have been previously identified to have thermokarst potential. For instance, JSBACH/ICON-Land (Reick et al., 2021), the land component of the Max Planck Institute's Earth System Model ICON, or JULES (Smith et al., 2022), the land component of the UK Earth System Model, divide their grid cells into subgrid-scale tiles that are only defined by their fraction within the grid box and include a representation of surface water. Theoretically, our model could be coupled asynchronously to such ESMs, calculating changes in water area fraction based on the ESM's climate variable output and providing it with the updated fraction as a boundary condition for the land–atmosphere interactions in the next time step. This could not only improve the simulated carbon exchange between land surface and atmosphere, but also the surface roughness as well as moisture and energy transfer. Furthermore, our implementation of abrupt drainage could improve projections of freshwater discharge into the ocean. Alternatively to an asynchronous coupling, a simple function of water area fraction in relation to changing climate variables that emulates our model responses, could provide a way to incorporate thermokarst lake dynamics directly into ESMs without significantly increasing computational costs. It is important to note, however, that prescribing water area fractions with an external model or through incorporation of one single emulator function could violate the ESMs water and energy balance. Calculations of energy and water cycle as well as coupling schemes would need to be adjusted within the ESM. Whether this is possible depends on the technical details of the ESM and its infrastructure. A successful coupling also requires an accurate parameterization of our model, which remains a challenge. Without it, possible model applications remain mostly conceptual and could, for instance, involve sensitivity studies investigating the relative importance of the resolved lake dynamic processes.

The three regimes that our model can capture can be expected in thermokarst lake landscapes according to observational studies (Bockheim et al., 2004; Jones et al., 2011, 2009; Roy-Léveillée and Burn, 2017; Bouchard et al., 2017; Fuchs et al., 2019; Bergstedt et al., 2021; Anderson et al., 2019; Jones et al., 2022; von Baeckmann et al., 2024). They result from interactions and feedbacks between lake formation, expansion and drainage and the landscape changes induced by these processes. However, both developed model variants constitute simplified and incomplete representations of these feedbacks. The development of a more varied topography as lakes form, expand and drain, and its effect on formation and drainage potential is only represented implicitly through the scaling of abrupt drainage and formation probability using water and drained area fractions in the system. It is important to note that this scaling is incorporated linearly, meaning that abrupt drainage probability will increase linearly with more disturbed area in the system as steeper topographic gradients develop, while formation probability will decrease linearly as less area becomes available for formation. These feedbacks could be more complex in reality, however. Furthermore, lake expansion and gradual drainage as well as the fraction limit for disturbed area remain unchanged with landscape changes in the current model version, but these aspects should be taken into account for a more comprehensive feedback representation.

Model Variant 1 represents the fact that there will be no ground ice under recently drained lakes and therefore no new thermokarst, but it does not capture the possible and often observed refilling of drained lake basins with rain or meltwater (e.g. Hinkel et al., 2005), or the presence of remnant ground ice not fully depleted by an earlier lake generation. New thermokarst lakes are implicitly not allowed to form on drained areas, and once the area of a lake has become zero, the model does not allow it to increase again. This variant therefore does not account for lakes that form due to other processes besides thermokarst. In Variant 2, lake expansion and formation are only limited by the water area in the system. This is equivalent to the assumption that lakes can expand onto drained area and also form on it. This variant thereby could represent lake formation due to re-wetting and new ponding in drained lake basins, which could be caused by water pooling in existing depressions. However, for these type of lakes, the process for lake expansion would not be the accelerated permafrost degradation that is the mechanism behind the expansion of most thermokarst lakes. The implementation of lake expansion in the model might therefore not be suitable in these cases. A third option to treat the influence of drained area would be to implement the refreezing of drained lake basins and renewed ground ice formation, which has been observed in many Arctic regions (Jorgenson and Shur, 2007; Lindgren et al., 2012). In this implementation, the drained area would first be considered to be unavailable for lake formation, but become available again after some time. This would essentially represent the processes of refreezing of the taliks in the drained lake basins, the accumulation of new ground ice, and the associated upheaval of the land surface. These processes become especially important when simulating longer timeseries of several hundred years. Early-stage ice wedge polygons have been observed in basins that drained a few 100 years ago (Lindgren et al., 2012). In satellite images, remnant lakes can be found at different locations within a drained lake basin, including the margins, as parts have been lifted up by ground ice accumulation (Jorgenson and Shur, 2007; Jones et al., 2012; Lindgren et al., 2012; Jones et al., 2022). The inclusion of a refreezing algorithm and its parameterization would be challenging, though, since the involved processes are complex and depend on small-scale subsurface conditions. Including the refreezing stochastically could be a way forward.

Our merging algorithm leads to strong decreases in lake numbers, which are likely not realistic. This suggests that the merging implementation is not an accurate representation of real merging. The way that lakes merge in reality is highly case-dependent and influenced by the shape of the lakes as well as the surrounding landscape and subsurface (Jones et al., 2011). Actual merging often results in one lake draining into the other without increasing its surface area proportionally. The implemented algorithm is not able to capture this variety of merging dynamics and likely leads to overestimation of merging events. It can only represent simplified merging, after which the resulting lake has a circular shape and a surface area equal to the sum of the surface areas of the two previously separate lakes. As a result, some of our simulations yield landscapes with single lakes of up to 1588 km², which is unrealistic. Furthermore, the merging algorithm is computationally expensive and requires lakes to be distributed in space. Altering the model to include a stochastic implementation of lake merging should be a first step when creating a new model version. Continuum percolation theory includes some approaches, such as the Boolean model (e.g. Chiu et al., 2013), that could be a basis for such an implementation, but are not directly suitable for our case of lakes with a varying size distribution. The lake area dataset described in Sect. 3.2 could be an aid in parameterizing a merging algorithm. Currently this data does not capture lake merging, because lake objects are identified using the largest extent of connected pixels within a timeseries of raster data. It would be possible to get an estimate for the frequency of merging events with further processing, however.

The representations of lake expansion and gradual drainage in the model is based on idealized assumptions on thermokarst feedbacks, which limits the applicability to different kinds of lakes. It might therefore not be suitable for more complex lake behaviour and lake area changes that are driven by non-thermokarst processes. With an accurate calibration, the stochastic nature of the model and the inclusion of the volatility term could be able to capture some of this variety in lake dynamics, however. This is also why the model is not limited by the lack of small-scale subsurface data. The somewhat simple structure of the model makes it fairly flexible and allows for relatively easy adjustment of the lake process equations as well as expansion and inclusion of previously neglected aspects, such as ground ice accumulation in drained lake basins or different merging dynamics. With efforts to increase data availability in permafrost regions, this could allow the model to become more comprehensive in the future.

The numbers of abrupt drainage events that we simulated using our model with observation-based parameterization are roughly in line with observed drainage events in Jones et al. (2020), where 98 drained lakes within a 30 000 km² area between 1955 and 2017 were found, which corresponds to an average drainage rate of 1.58 lakes per year. In this case, the definition of drainage events included all lakes that lost more than 25 % of their surface area. In our model ensemble simulations we found between 4 and 17 abrupt drainage events for a 40 km×40 km cell over a period of 100 years and an additional 4–20 lakes having lost more than 25 % of their area through gradual drainage. Extrapolating these numbers to a 30 000 km² this would be a drainage rate between 1.5 and 6.98 lakes per year. While the data from Jones et al. (2020) and our parameterization data (Nitze and Nicholson, 2025) both cover a coastal plain, the drainage rates could still be specific to the different respective study regions in Alaska near Teshekpuk Lake and the Yana-Indigirka Lowlands. In Jones et al. (2011), the same definition of a more than 25 % reduction in surface area was used to detect drainage events from high-resolution remotely sensed imagery from 1950/51, 1978 and 2006/07 for a 700 km² region on the Seward Peninsula, Alaska. An average drainage rate of 2.3 lakes per year was found, which is between 14 and 66 times as high as the simulated drainage rates for our study region. In the same study, a mean lake expansion rate between 0.34 and 0.39 m a⁻¹ was observed, which is in the same range as our simulated mean rates of 0.53 and 0.33 m a⁻¹. Observed rates for different regions in the Tibetan Plateau between 1969 and 2010 were even smaller with an average of 0.13 m a⁻¹ (Luo et al., 2022). Due to the high volatility in our model simulations, the exponential nature of the implemented lake expansion and the nature of our merging algorithm, the range of individual growth rates is about four orders of magnitude higher than the observed ranges of 0.02–6.01 m a⁻¹ (Jones et al., 2020) and 0.02–1.86 m a⁻¹ (Luo et al., 2022), and should not be compared.

The simulated water areas in our observation-based experiments increase over the simulation period of 100 years to just over 50 %, which is a relatively high value that is very rare in current Arctic landscapes. This further suggests that the model parameters derived from the 20 year observational dataset are likely not suitable for simulations on centennial scales, as they seem to overestimate expansion. While experiments for roughly the same region in van Huissteden et al. (2011) also showed an initial increase of thawed area from 8 % to about 25 % in the first 70 years of their simulations, this initial increase is followed by a decrease in thawed area with some ensemble members reaching thawed areas as low as 2.2 % and 2.7 % after 100 years. The different behaviour of the two models can likely be explained with the different approaches regarding lake drainage. Our model takes all main parameter including drainage rates from remote sensing data directly, whereas van Huissteden et al. (2011) rely on an initialization with a river network. Even though our approach is theoretically able to capture different drainage processes besides drainage through rivers, it still seems to underestimate lake drainage when the model is parameterized using current 20 year long remote sensing datasets.

While our model does not require a-priori knowledge of sub-surface conditions, such as the model by van Huissteden et al. (2011), it is still very dependent on parameterization data that gives information on long-term developments of lake numbers and surface water areas within a landscape. Even though high-resolution remote sensing data with resolutions of less than 30 m has been collected globally for more than two decades, the currently available data products (Schlaffer et al., 2012; Pekel et al., 2016; Nitze et al., 2017; Muster et al., 2017; Chen et al., 2023; Nitze and Nicholson, 2025) still do not provide the necessary consistency and coverage in both space and time to detect a robust trend in lake dynamics and to parameterize an influence of a climate forcing. The volatility of lake dynamics seems to dominate a climate change signal in the 20 year dataset that we used. Additionally, issues with the remote sensing apparatus or the transmission can cause artefacts and data loss in the satellite imagery, and lead to large parts of the data not being useable for our parameterization method, thereby making our parameter estimates less reliable. Timeseries of relative lake area changes that were derived from lake area dataset for our study region show a low signal-to-noise ratio of 0.37 and a volatility estimate that is 5.65 times higher than the drift estimate. Furthermore, gaps and artefacts in the satellite images that were used to derive the dataset, lead to 33 % of data points that cannot be used for model parameterization. Because the model parameters are assumed to be region-specific, the parameterization should only be done for domains that are relatively homogeneous in terms of geocryological and landscape conditions. This is why it is not always advisable to increase the size of parameterization datasets by including a larger area. Using a space-for-time approach for parameterization should also be handled with care for that reason. Still, such an approach could provide first estimates on how different climate states effect thermokarst lake dynamics. It could also help in identifying thresholds of climate variables that lead to a different process dominating the overall lake area trend in the system, and provide additional information for model calibration.

An increase of the length of the dataset would likely bring the most robust parameterizations, however. A power analysis, with power of 0.8, significance level of 0.05, and the calculated signal-to-noise ratio as the effect size, suggests that a time series length of at least 60 years would be necessary to discern a signal in the investigated 40 km×40 km cell. For the complete study region a signal-to-noise ratio of 0.40 and a minimum required timeseries length of 51 years was found. For this analysis, we used the python package statsmodel to solve the function for the power of a t-test for the minimum sample size. Whether a climate influence could be quantified with such time series is still unclear. Even though high-resolution historical aerial imagery exists from the 1940s onwards for some regions across the Arctic (e.g. Muster et al., 2017), the gaps in the timeseries would likely still cause problems for a robust signal detection. A more extensive statistical analysis using different synthetic datasets created by the model and our parameterization approach to reconstruct the input parameters, could help in formulating and quantifying requirements for parameterization datasets.

While the dependence on climate variables might become quantifiable for parameter μ and σ in a few decades with continued retrieval of remote sensing data, the same is likely not true for λ_f and λ_d. A robust estimation of these two parameters needs a number of observed formation and abrupt drainage events that satellite data will not be able to provide in the near future. In addition to this issue, our definitions of abrupt and gradual drainage make it difficult to differentiate between the two mechanisms in remotely sensed observational data. Large partial losses of lake areas can be interpreted as either process without additional analysis. Deriving a size-dependent hazard for abrupt drainage could be helpful, but also requires a large amount of data. It might therefore be necessary to think about calibration approaches beyond remotely sensed lake area timeseries. Lake age databases could be used to create a paleo-record of lake formation events. Data on lake ages are sporadic, however, since determining lake ages usually involves the dating of sediment cores (e.g. Lenz et al., 2016), which are costly to obtain. Still, a synthesis study yielded a dataset of basal ages for 1207 lakes across the high northern latitudes (Brosius et al., 2018, 2021, 2023). Anthony et al. (2014) also conducted radiocarbon dating methods on samples from 49 drained lake basins in the North Siberian yedoma region. Recent advancement in detecting drained lake basins from satellite data (Wang et al., 2012; Bergstedt et al., 2021) could provide information on present-day drained area fractions in different regions and potentially be used to calibrate the model, especially in comparison with estimates of the timing of the drainage events. Research and data synthesis on the sub-surface capillary network of the Arctic tundra could further increase our understanding of drainage probability (Liljedahl et al., 2024).

5 Conclusions

This study provides a stochastic modelling attempt of regional changes in lake distributions and water area fractions in thermokarst-affected permafrost landscapes. The conceptual model accounts for processes of formation, expansion, merging, and drainage of lakes. Since thermokarst lake dynamics are highly dependent on local environmental conditions, their behaviour can be volatile on the landscape scale. With our suggested method, it is possible to parameterize this volatility and thereby consider the differences in individual lake dynamics, without having to resolve small-scale landscape heterogeneities, which would not only require high-resolution modelling, but also extensive surface and sub-surface data, that is not available at the necessary scale.

We developed two variants of the conceptual model. In a first variant, new lakes are implicitly not allowed to form on previously drained area, while in the second variant lakes can expand onto previously drained area and also form on it. Both variants deliver similar results for simulations for our coastal study site in the Yana-Indigirka Lowland and over a timescale of 100 years. To the best of our knowledge, our model is the first to consider the interactions between active and drained lakes as represented in these variants. While both variants are highly simplified representations, they constitute an important first step towards a more complex and realistic model, and give an opportunity to investigate the impacts of the two underlying assumptions on lake dynamics as well as the relative importance of the involved processes.

We tested three different regimes of the model dynamics: complete drainage, oscillation, and stabilization of the lake area. All these regimes are possible and could be relevant for particular geographical regions. For example, the oscillation regime reflects the long-term development of a lake-rich area with periodic establishment and draining.

We attempted to quantify model parameters using existing remote sensing data. However, it is difficult to validate simulation results with the current state of observational data. Even though the model is conceptual, it is still very dependent on the parameterization. Currently available remote sensing based time series of surface water are at maximum 40 years long, but mostly shorter and have many data gaps, particularly in the high latitudes. Longer term data (centuries to millennia) are very scarce and less precise as they are typically based on soil samples and geochronological dating. Furthermore, thermokarst lake dynamics are volatile, making it hard to detect a climate influence in the available datasets using our method. It is therefore not yet possible to accurately parameterize or calibrate the lake dynamics response to climate forcing, and use the model for projections under different climate change scenarios. This currently limits the potential of application of the model for a large-scale ESM.

The parameterization of lake expansion and gradual drainage might become possible with slightly longer time series. A first power analysis suggests that at least 50 years of observational data would be necessary to detect a signal of changes in lake area dynamics within the study region. However, the lack of data on longer time scales will still prevail and formation and abrupt drainage rates likely can’t be captured with a few more decades of satellite data retrieval. It is therefore necessary to think about different calibration approaches for these two parameters.

Appendix A: List of Symbols

Appendix B: Modelling Scheme

https://tc.copernicus.org/articles/20/1967/2026/tc-20-1967-2026-f09

Figure B1Modelling scheme. Green indicates input into the model, i.e. initialization data and a climate forcing. The climate forcing can, for instance, consist of an annual time series for precipitation P and thaw degree days TDD. The grey box contains all operations that happen at each time step. Blue indicates the model objects, i.e. the active and inactive lake pool. Red indicates the main computations and processes, which are: merging of overlapping lakes, Geometric Brownian Motion (GBM) simulating lake expansion and gradual drainage, a Poisson process that calculates the number of drainage events, and a Poisson process that calculates the number of formation events, i.e. new lakes that are added to the lake pool. N is the initial number of active lakes at each time step, M is the initial number of inactive lakes. a_n is the water surface area of the nth lake, whereas d_n is its drained area. a_n,new and d_n,new are both of these areas after applying expansion or gradual drainage via GBM. K is the number of overlaps and therefore merging events. D is the number of drainage events, and F the number of formation events. C(t) contains the forcing in form of a selected climate variable at time t.

Download

Appendix C: Partial Differential Equation for Lake Size Distribution

Lake expansion is modelled with GBM (Eq. 7), which has the following corresponding Fokker–Planck equation (e.g. Stojkoski et al., 2020):

\begin{matrix} (C1) & \frac{\partial}{\partial t} p (a, t) = - μ \frac{\partial}{\partial a} a p (a, t) + \frac{σ^{2}}{2} \frac{\partial^{2}}{\partial a^{2}} a^{2} p (a, t), \end{matrix}

with p(a,t) being the probability density for a lake having area a at time t, μ being the drift of the GBM and σ being the volatility. The formation and abrupt drainage processes in our model can be understood as birth and death terms and incorporated into the equation. The equation then gives the change of the number density of lakes n(a,t) per unit area in a size bin around area a at time t rather than the probability density. Furthermore, the model includes a merging algorithm that influences n(a,t). We expand Eq. (C1) to

\begin{matrix} (C2) & \begin{aligned} \frac{\partial}{\partial t} n (a, t) = & - μ \frac{\partial}{\partial a} a n (a, t) + \frac{σ^{2}}{2} \frac{\partial^{2}}{\partial a^{2}} a^{2} n (a, t) \\ - D (a, t) + F (a, t) + M (a, t), \end{aligned} \end{matrix}

with D(a,t) being the drainage term that gives the expected number of lakes of size bin around a removed due to abrupt drainage at time t, F(a,t) being the formation term giving the number of lakes added through formation and M(a,t) being the merging term giving the number of lakes removed or added through merging. Both abrupt drainage and formation are modelled with independent Poisson processes with effective rates Λ_d(t) and Λ_f(t). The rates can change over time due to climate dependence and because they are scaled by the water and/or drained area within the system at each time step (see Eqs. 1 and 8). Abruptly drained lakes are chosen randomly according to the number of lakes obtained from the abrupt drainage Poisson process. Every size bin should therefore lose a number of lakes based on the fraction of removed lakes of the total expected number of lakes N(t). The drainage term is therefore

\begin{matrix} (C3) & D (a, t) = \frac{Λ_{d}}{N (t)} n (a, t) . \end{matrix}

Newly formed lakes all have the same initial area a₀. The formation term can therefore be written as

\begin{matrix} (C4) & F (a, t) = Λ_{f} δ (a - a_{0}), \end{matrix}

with δ(a−a₀) being the Dirac delta function (Dirac, 1927), which is 0 when a≠a₀, making sure that newly formed lakes are only counted for the size bin around a₀. When two lakes of areas a_x and a_y merge, the resulting area of the merged lake is $a_{z} = a_{x} + a_{y}$ . If R(a_x,a_y) is the rate at which two lakes of area a_x and a_y merge per time step, then the expected number of lakes in the size bin around area a_x that is removed or added due to the merging is

\begin{matrix} (C5) & \begin{aligned} M (a_{z}, t) = & \frac{1}{2} \int_{0}^{a_{z}} R (a_{y}, a_{z} - a_{y}) n (a_{y}, t) n (a_{z} - a_{y}, t) d a_{y} \\ - n (a_{z}, t) \int_{0}^{\infty} R (a_{z}, a_{y}) n (a_{y}, t) d a_{y} . \end{aligned} \end{matrix}

The first integral term $\int_{0}^{a_{z}} R (a_{y}, a_{z} - a_{y}) n (a_{y}, t) n (a_{z} - a_{y}, t) d a_{y}$ describes the expected number of lakes that can merge and lead to a new summed area of a_z. The factor $\frac{1}{2}$ needs to be added to avoid double counting. The second term $n (a_{z}, t) \int_{0}^{\infty} R (a_{z}, a_{y}) n (a_{y}, t) d a_{y}$ describes the expected number of lakes that had an area of a_z and were removed because they merged with a bigger lake. The probability or rate R of two lakes merging remains difficult to determine analytically, as it depends on the size distribution at that time step as well as how many lakes have merged in the previous time step. The inclusion of post-merge fission, which is driven by area increases through the GBM, further complicates the changes of lake numbers.

Appendix D: Time Step Sensitivity

We conducted a time step sensitivity study to investigate the robustness of the default annual stepping that we used in our simulations. To do this, we simulated ensembles of 10 members with variant 1 of the model and time steps of $Δ t_{1} = 0.5, Δ t_{2} = 1$ and Δt₃=2. We used the parameterization seen in Table 3. No significant changes in model behaviour could be found. Figure D1a shows the ensemble means of water and drained area fraction for the three different experiments, which all follow a somewhat similar trajectory. Any small-scale differences can be attributed to the stochasticity of the processes. Slight differences can be seen in the rate of lake number losses in Fig. D1b, with the simulations using a smaller time step losing lakes more quickly. This can be explained by the implementation of the merging algorithm, which checks for overlapping lakes at every time step and therefore finds these overlaps earlier when the time step is smaller.

https://tc.copernicus.org/articles/20/1967/2026/tc-20-1967-2026-f10

Figure D1Model output for an ensemble simulation with Variant 1 and using the parameter values from Table 3 and time steps $Δ t = 0.5, 1$ and 2 years: Time series of the ensemble means and SD for water and drained area fraction (a) and the ensemble mean for lake number (b).

Download

While the implementation of the merging algorithm has a slight time-step dependence when lake density is high, our model is time-step independent in its calculations of water and drained area fractions. It delivers similar results even with a semi-annual time step.

Appendix E: Gini Coefficients for Idealized Simulations

The Gini coefficient (e.g. Cowell and Flachaire, 2015) is a measure of inequality in a statistical distribution. A value of one indicates large inequality. For our model simulations, this would mean that there is only one active lake that contains all of the current water area of the system. A value of zero indicates perfect equality, where the water area is evenly distributed over several lakes. Figure E1 shows the Gini coefficients for the idealized simulations and regimes. Since regime A and B were done with the same parameterization, they initially show a similar increase of the Gini coefficient while first lakes form and stochastic expansion and merging leads to increasing variety of lake sizes. Regime A quickly runs into a complete drainage of the landscape with no remaining active lakes. In regime B, however, new lakes keep forming, leading to a changing Gini coefficient. Sudden drops of the Gini coefficient can be seen, where large lakes that have emerged through expansion and merging, drain and leave smaller lakes with a more equal size distribution. Regime C has slower expansion of lakes, which is why the Gini coefficient does not increase as steeply. In this regime, the model runs into a quasi-stabilization of lake areas, which is represented here by the Gini coefficient that oscillates around a value of 0.62 for roughly the last 670 years of the simulation.

https://tc.copernicus.org/articles/20/1967/2026/tc-20-1967-2026-f11

Figure E1Gini coefficients of the three idealized simulations (Regime A: Drainage, Regime B: Oscillation, Regime C: Stabilization) as described in Sect. 3.1.

Download

Appendix F: Climate Correlation Analysis

https://tc.copernicus.org/articles/20/1967/2026/tc-20-1967-2026-f12

Figure F1Spearman (a) and Pearson (b) correlation coefficient between parameters and that year's climate variables thawing degree days TDD and annual precipitation P in whole the study area region between 2000 and 2020, as well as Spearman (c) and Pearson (d) correlation coefficient between parameters and that year's climate variables TDD and P in the 40 km×40 km cell.

Download

Code and data availability

The model code is publicly available on Zenodo under https://doi.org/10.5281/zenodo.19053374 (Reinken et al. , 2026). The dataset of lake areas used as parameterization data is available on Zenodo under https://doi.org/10.5281/zenodo.15011122 (Nitze and Nicholson, 2025).

Author contributions

Constanze Reinken developed the model and parameterization method, wrote the model code, executed the simulations and analysis, and drafted the manuscript with input and corrections from all other authors. Victor Brovkin, Philipp de Vrese, and Constanze Reinken collaborated on the research question, project approach, and overall storyline of the manuscript. Ingmar Nitze provided the remote sensing data product used in the study. Helena Bergstedt, Guido Grosse and Ingmar Nitze offered insight into the state of remote sensing data and data products in the Arctic as well as dynamics of thermokarst lakes and lake drainage.

Competing interests

At least one of the (co-)authors is a member of the editorial board of The Cryosphere. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Acknowledgements

This work used resources of the Deutsches Klimarechenzentrum (DKRZ) granted by its Scientific Steering Committee (WLA) under project ID bm1236. Further, datasets provided by Tobias Stacke (Max-Planck Institute for Meteorology) via the DKRZ data pool were used. We want to thank Annett Bartsch and Clemens von Baeckmann (b.geos) for their support in navigating remote sensing data publications, as well as Céline Gieße (University of Hamburg) for her thorough internal review of an early draft.

Financial support

This work was supported by the European Research Council project Q-Arctic (grant no. 951288) and by the German Research Foundation as part of the CLICCS Clusters of Excellence (DFG EXC 2037). Constanze Reinken gratefully acknowledges funding from the International Max Planck Research School on Earth System Modelling (IMPRS-ESM).

The article processing charges for this open-access publication were covered by the Max Planck Society.

Review statement

This paper was edited by Hanna Lee and reviewed by two anonymous referees.

References

Anderson, L., Edwards, M., Shapley, M. D., Finney, B. P., and Langdon, C.: Holocene Thermokarst Lake Dynamics in Northern Interior Alaska: The Interplay of Climate, Fire, and Subsurface Hydrology, Front. Earth Sci., 7, https://doi.org/10.3389/feart.2019.00053, 2019. a, b

Anthony, K. M. W., Zimov, S. A., Grosse, G., Jones, M. C., Anthony, P. M., Chapin III, F. S., Finlay, J. C., Mack, M. C., Davydov, S., Frenzel, P., and Frolking, S.: A shift of thermokarst lakes from carbon sources to sinks during the Holocene epoch, Nature, 511, 452–456, https://doi.org/10.1038/nature13560, 2014. a

Arp, C. D., Jones, B. M., Schmutz, J. A., Urban, F. E., and Jorgenson, M. T.: Two mechanisms of aquatic and terrestrial habitat change along an Alaskan Arctic coastline, Polar Biol., 33, 1629–1640, https://doi.org/10.1007/s00300-010-0800-5, 2010. a

Arp, C. D., Jones, B. M., Liljedahl, A. K., Hinkel, K. M., and Welker, J. A.: Depth, ice thickness, and ice-out timing cause divergent hydrologic responses among Arctic lakes, Water Resour. Res., 51, 9379–9401, https://doi.org/10.1002/2015WR017362, 2015. a

Arp, C. D., Jones, B. M., Melanie Engram, V. A. A., Lei Cai, A. P., Hinkel, K., Bondurant, A. C., and Creighton, A.: Contrasting lake ice responses to winter climate indicate future variability and trends on the Alaskan Arctic Coastal Plain, Environ. Res. Lett., 13, https://doi.org/10.1088/1748-9326/aae994, 2018. a

Bergstedt, H., Jones, B. M., Hinkel, K., Farquharson, L., Gaglioti, B. V., Parsekian, A. D., Kanevskiy, M., Ohara, N., Breen, A. L., Rangel, R. C., Grosse, G., and Nitze, I.: Remote Sensing-Based Statistical Approach for Defining Drained Lake Basins in a Continuous Permafrost Region, North Slope of Alaska, Remote Sens.-Basel, 13, 2539, https://doi.org/10.3390/rs13132539, 2021. a, b, c

Bockheim, J. G., Hinkel, K. M., Eisner, W. R., and Dai, X. Y.: Carbon Pools and Accumulation Rates in an Age-Series of Soils in Drained Thaw-Lake Basins, Arctic Alaska, Soil Sci. Soc. Am. J., 68, 697–704, https://doi.org/10.2136/sssaj2004.6970, 2004. a, b

Boike, J., Georgi, C., Kirilin, G., Muster, S., Abramova, K., Fedorova, I., Chetverova, A., Grigoriev, M., Bornemann, N., and Langer, M.: Thermal processes of thermokarst lakes in the continuous permafrost zone of northern Siberia – observations and modeling (Lena River Delta, Siberia), Biogeosciences, 12, 5941–5965, https://doi.org/10.5194/bg-12-5941-2015, 2015. a

Bouchard, F., MacDonald, L. A., Turner, K. W., Thienpont, J. R., Medeiros, A. S., Biskaborn, B. K., Korosi, J., Hall, R. I., Pienitz, R., and Wolfe, B. B.: Paleolimnology of thermokarst lakes: a window into permafrost landscape evolution, Arctic Science, 3, 91–117, https://doi.org/10.1139/as-2016-0022, 2017. a, b

Brosius, L. S., Treat, C. C., Walter Anthony, K. M., Lenz, J., Jones, M. C., and Grosse, G.: High-latitude Lake Basal Ages and Origins – link to datafile, PANGAEA [data set], https://doi.org/10.1594/PANGAEA.894737, 2018. a

Brosius, L. S., Anthony, K. M. W., Treat, C. C., Lenz, J., Jones, M. C., Bret-Harte, M. S., and Grosse, G.: Spatiotemporal patterns of northern lake formation since the Last Glacial Maximum, Quaternary Sci. Rev., 253, 106773, https://doi.org/10.1016/j.quascirev.2020.106773, 2021. a, b, c

Brosius, L. S., Walter Anthony, K. M., Treat, C. C., Jones, M. C., Dyonisius, M., and Grosse, G.: Panarctic lakes exerted a small positive feedback on early Holocene warming due to deglacial release of methane, Communications Earth & Environment, 4, 1–11, https://doi.org/10.1038/s43247-023-00930-2, 2023. a

Brovkin, V., Bartsch, A., Hugelius, G., Calamita, E., Lever, J. J., Goo, E., Kim, H., Stacke, T., and de Vrese, P.: Permafrost and Freshwater Systems in the Arctic as Tipping Elements of the Climate System, Surv. Geophys., 46, 303–326, https://doi.org/10.1007/s10712-025-09885-9, 2025. a

Chen, Y., Cheng, X., Liu, A., Chen, Q., and Wang, C.: Tracking lake drainage events and drained lake basin vegetation dynamics across the Arctic, Nat. Commun., 14, https://doi.org/10.1038/s41467-023-43207-0, 2023. a, b, c, d, e, f

Chiu, S. N., Stoyan, D., Kendall, W. S., and Mecke, J.: Random closed sets I – The Boolean model, in: Stochastic Geometry and its Applications, chap. 3, John Wiley & Sons, Ltd, 64–107, https://doi.org/10.1002/9781118658222.ch03, 2013. a

Cooley, S. W., Smith, L. C., Ryan, J. C., Pitcher, L. H., and Pavelsky, T. M.: Arctic-Boreal Lake Dynamics Revealed Using CubeSat Imagery, Geophys. Res. Lett., 46, 2111–2120, https://doi.org/10.1029/2018GL081584, 2019. a

Côté, M. M. and Burn, C. R.: The oriented lakes of Tuktoyaktuk Peninsula, Western Arctic Coast, Canada: a GIS-based analysis, Permafrost. Periglac., 13, 61–70, https://doi.org/10.1002/ppp.407, 2002. a

Cowell, F. A. and Flachaire, E.: Statistical Methods for Distributional Analysis, in: Handbook of Income Distribution, vol. 2, edited by: Atkinson, A. B. and Bourguignon, F., Elsevier, 359–465, https://doi.org/10.1016/B978-0-444-59428-0.00007-2, 2015. a, b

de Vrese, P., Georgievski, G., Gonzalez Rouco, J. F., Notz, D., Stacke, T., Steinert, N. J., Wilkenskjeld, S., and Brovkin, V.: Representation of soil hydrology in permafrost regions may explain large part of inter-model spread in simulated Arctic and subarctic climate, The Cryosphere, 17, 2095–2118, https://doi.org/10.5194/tc-17-2095-2023, 2023. a

de Vrese, P., Stacke, T., Gayler, V., and Brovkin, V.: Permafrost Cloud Feedback May Amplify Climate Change, Geophys. Res. Lett., 51, e2024GL109034, https://doi.org/10.1029/2024GL109034, 2024. a

Dirac, P. A. M.: The physical interpretation of the quantum dynamics Free, Proc. R. Soc. Lond. Ser.-A, 113, https://doi.org/10.1098/rspa.1927.0012, 1927. a

Edwards, M., Grosse, G., Jones, B. M., and McDowell, P.: The evolution of a thermokarst-lake landscape: Late Quaternary permafrost degradation and stabilization in interior Alaska, Sediment. Geol., 340, 3–14, https://doi.org/10.1016/j.sedgeo.2016.01.018, 2016. a

Fedorov, A. N., Gavriliev, P. P., Konstantinov, P. Y., Hiyama, T., Iijima, Y., and Iwahana, G.: Estimating the water balance of a thermokarst lake in the middle of the Lena River basin, eastern Siberia, Ecohydrology, 7, 188–196, https://doi.org/10.1002/eco.1378, 2014. a

Freitas, N. L., Walter Anthony, K., Lenz, J., Porras, R. C., and Torn, M. S.: Substantial and overlooked greenhouse gas emissions from deep Arctic lake sediment, Nat. Geosci., 18, 65–71, https://doi.org/10.1038/s41561-024-01614-y, 2025. a

Fuchs, M., Lenz, J., Jock, S., Nitze, I., Jones, B. M., Strauss, J., Günther, F., and Grosse, G.: Organic Carbon and Nitrogen Stocks Along a Thermokarst Lake Sequence in Arctic Alaska, J. Geophys. Res.-Biogeo., 124, 1230–1247, https://doi.org/10.1029/2018JG004591, 2019. a, b

Grosse, G., Jones, B., and Arp, C.: Thermokarst Lakes, Drainage, and Drained Basins, in: Treatise on Geomorphology, chap. 8.21, edited by: Shroder, J. F., Academic Press, 325–353, https://doi.org/10.1016/B978-0-12-374739-6.00216-5, 2013. a, b, c, d, e

Heslop, J. K., Walter Anthony, K. M., Winkel, M., Sepulveda-Jauregui, A., Martinez-Cruz, K., Bondurant, A., Grosse, G., and Liebner, S.: A synthesis of methane dynamics in thermokarst lake environments, Earth-Sci. Rev., 210, 103365, https://doi.org/10.1016/j.earscirev.2020.103365, 2020. a

Hinkel, K. M., Frohn, R. C., Nelson, F. E., Eisner, W. R., and Beck, R. A.: Morphometric and spatial analysis of thaw lakes and drained thaw lake basins in the western Arctic Coastal Plain, Alaska, Permafrost. Periglac., 16, 327–341, https://doi.org/10.1002/ppp.532, 2005. a, b

Hinkel, K. M., Jones, B. M., Eisner, W. R., Cuomo, C. J., Beck, R. A., and Frohn, R.: Methods to assess natural and anthropogenic thaw lake drainage on the western Arctic coastal plain of northern Alaska, J. Geophys. Res.-Earth, 112, https://doi.org/10.1029/2006JF000584, 2007. a, b

Hopkins, D. M.: Thaw Lakes and Thaw Sinks in the Imuruk Lake Area, Seward Peninsula, Alaska, J. Geol., 57, 119–131, https://doi.org/10.1086/625591, 1949. a, b

Ibe, O. C.: Markov Processes for Stochastic Modeling, 2nd edn., Elsevier, https://doi.org/10.1016/C2012-0-06106-6, 2013. a, b, c, d, e

in 't Zandt, M. H., Liebner, S., and Welte, C. U.: Roles of Thermokarst Lakes in a Warming World, Trends Microbiol., 28, 769–779, https://doi.org/10.1016/j.tim.2020.04.002, 2020. a, b, c, d, e

Jones, B., Grosse, G., Arp, C., Jones, M., Walter Anthony, K., and Romanovsky, V.: Modern Thermokarst Lake Dynamics in the Continuous Permafrost Zone, Northern Seward Peninsula, Alaska, J. Geophys. Res.-Biogeo., 116, G00M03, https://doi.org/10.1029/2011jg001666, 2011. a, b, c, d, e

Jones, B. M., Arp, C. D., Hinkel, K. M., Beck, R. A., Schmutz, J. A., and Winston, B.: Arctic Lake Physical Processes and Regimes with Implications for Winter Water Availability and Management in the National Petroleum Reserve Alaska, Environ. Manage., 43, 1071–1084, https://doi.org/10.1007/s00267-008-9241-0, 2009. a, b

Jones, B. M., Arp, C. D., Grosse, G., Nitze, I., Lara, M. J., Whitman, M. S., Farquharson, L. M., Kanevskiy, M., Parsekian, A. D., Breen, A. L., Ohara, N., Rangel, R. C., and Hinkel, K. M.: Identifying historical and future potential lake drainage events on the western Arctic coastal plain of Alaska, Permafrost. Periglac., 31, 110–127, https://doi.org/10.1002/ppp.2038, 2020. a, b, c, d

Jones, B. M., Grosse, G., Farquharson, L. M., Roy-Léveillée, P., Veremeeva, A., Kanevskiy, M. Z., Gaglioti, B. V., Breen, A. L., Parsekian, A. D., Ulrich, M., and Hinkel, K. M.: Lake and drained lake basin systems in lowland permafrost regions, Nature Reviews Earth & Environment, 3, 85–98, https://doi.org/10.1038/s43017-021-00238-9, 2022. a, b, c, d

Jones, M. C., Grosse, G., Jones, B. M., and Walter Anthony, K.: Peat accumulation in drained thermokarst lake basins in continuous, ice-rich permafrost, northern Seward Peninsula, Alaska, J. Geophys. Res.-Biogeo., 117, https://doi.org/10.1029/2011JG001766, 2012. a, b, c

Jorgenson, M., Romanovsky, V., Harden, J., Shur, Y., O'Donnell, J., Schuur, E., Kanevskiy, M., and Marchenko, S.: Resilience and vulnerability of permafrost to climate change, Can. J. Forest Res., 40, 1219–1236, https://doi.org/10.1139/X10-060, 2010. a

Jorgenson, M. T. and Shur, Y.: Evolution of lakes and basins in northern Alaska and discussion of the thaw lake cycle, J. Geophys. Res.-Earth, 112, https://doi.org/10.1029/2006JF000531, 2007. a, b

Jungclaus, J. H., Lorenz, S. J., Schmidt, H., Brovkin, V., Brüggemann, N., Chegini, F., Crüger, T., De-Vrese, P., Gayler, V., Giorgetta, M. A., Gutjahr, O., Haak, H., Hagemann, S., Hanke, M., Ilyina, T., Korn, P., Kröger, J., Linardakis, L., Mehlmann, C., Mikolajewicz, U., Müller, W. A., Nabel, J. E. M. S., Notz, D., Pohlmann, H., Putrasahan, D. A., Raddatz, T., Ramme, L., Redler, R., Reick, C. H., Riddick, T., Sam, T., Schneck, R., Schnur, R., Schupfner, M., Storch, J.-S., Wachsmann, F., Wieners, K.-H., Ziemen, F., Stevens, B., Marotzke, J., and Claussen, M.: The ICON Earth System Model Version 1.0, J. Adv. Model. Earth Sy., 14, https://doi.org/10.1029/2021ms002813, 2022. a

Juutinen, S., Rantakari, M., Kortelainen, P., Huttunen, J. T., Larmola, T., Alm, J., Silvola, J., and Martikainen, P. J.: Methane dynamics in different boreal lake types, Biogeosciences, 6, 209–223, https://doi.org/10.5194/bg-6-209-2009, 2009. a

Kaufman, D.: Holocene thermal maximum in the western Arctic (0–180°W), Quaternary Sci. Rev., 23, 529–560, https://doi.org/10.1016/j.quascirev.2003.09.007, 2004. a

Kessler, M. A., Plug, L. J., and Walter Anthony, K. M.: Simulating the decadal- to millennial-scale dynamics of morphology and sequestered carbon mobilization of two thermokarst lakes in NW Alaska, J. Geophys. Res.-Biogeo., 117, https://doi.org/10.1029/2011JG001796, 2012. a

Kirillin, G., Hochschild, J., Mironov, D., Terzhevik, A., Golosov, S., and Nützmann, G.: FLake-Global: Online lake model with worldwide coverage, Environ. Modell. Softw., 26, 683–684, https://doi.org/10.1016/j.envsoft.2010.12.004, 2011. a

Knoblauch, C., Spott, O., Evgrafova, S., Kutzbach, L., and Pfeiffer, E.-M.: Regulation of methane production, oxidation, and emission by vascular plants and bryophytes in ponds of the northeast Siberian polygonal tundra, J. Geophys. Res.-Biogeo., 120, 2525–2541, https://doi.org/10.1002/2015JG003053, 2015. a

Kutzbach, L., Wagner, D., and Pfeiffer, E.-M.: Effect of microrelief and vegetation on methane emission from wet polygonal tundra, Lena Delta, Northern Siberia, Biogeochemistry, 69, 341–362, https://doi.org/10.1023/B:BIOG.0000031053.81520.db, 2004. a

Lachenbruch, A. H., Brewer, M. C., Greene, G. W., and Marshall, B. V.: Temperatures in Permafrost, in: Temperature, Its Measurement and Control in Science and Industry, chap. 80, edited by: Herzfeld, C., Reinhold Publishing Co, New York, 791–803, 1962. a

Lange, S., Menz, C., Gleixner, S., Cucchi, M., Weedon, G. P., Amici, A., Bellouin, N., Schmied, H. M., Hersbach, H., Buontempo, C., and Cagnazzo, C.: WFDE5 over land merged with ERA5 over the ocean (W5E5 v2.0), ISIMIP Repository, https://doi.org/10.48364/ISIMIP.342217, 2021. a, b

Langer, M., Westermann, S., Boike, J., Kirillin, G., Grosse, G., Peng, S., and Krinner, G.: Rapid degradation of permafrost underneath waterbodies in tundra landscapes–Toward a representation of thermokarst in land surface models, J. Geophys. Res.-Earth, 121, 2446–2470, https://doi.org/10.1002/2016JF003956, 2016. a

Lara, M. and Chipman, M.: Periglacial Lake Origin Influences the Likelihood of Lake Drainage in Northern Alaska, Remote Sens.-Basel, 13, https://doi.org/10.3390/rs13050852, 2021. a, b

Lenz, J., Wetterich, S., Jones, B. M., Meyer, H., Bobrov, A., and Grosse, G.: Evidence of multiple thermokarst lake generations from an 11 800 year-old permafrost core on the northern Seward Peninsula, Alaska, Boreas, 45, 584–603, https://doi.org/10.1111/bor.12186, 2016. a

Liljedahl, A., Witharana, C., and Manos, E.: The capillaries of the Arctic tundra, Nat. Water, 2, 611–614, https://doi.org/10.1038/s44221-024-00276-9, 2024. a, b

Lindgren, P., Grosse, G., Jones, M., Jones, B., and Walter Anthony, K.: Characterizing Post-Drainage Succession in Thermokarst Lake Basins on the Seward Peninsula, Alaska with TerraSAR-X Backscatter and Landsat-based NDVI Data, Remote Sens.-Basel, 4, 3741–3765, https://doi.org/10.3390/rs4123741, 2012. a, b, c

Ling, F. and Zhang, T.: Numerical simulation of permafrost thermal regime and talik development under shallow thaw lakes on the Alaskan Arctic Coastal Plain, J. Geophys. Res.-Atmos., 108, https://doi.org/10.1029/2002JD003014, 2003. a

Ling, F. and Zhang, T.: Modeling study of talik freeze-up and permafrost response under drained thaw lakes on the Alaskan Arctic Coastal Plain, J. Geophys. Res., 109, https://doi.org/10.1029/2003JD003886, 2004. a

Luo, J., Niu, F., Lin, Z., Liu, M., Yin, G., and Gao, Z.: Abrupt increase in thermokarst lakes on the central Tibetan Plateau over the last 50 years, Catena, 217, 106497, https://doi.org/10.1016/j.catena.2022.106497, 2022. a, b, c

Mackay, J. R.: Catastrophic lake drainage, Tuktoyaktuk Peninsula area, District of Mackenzie, Current Research, Part D, Geological Survey of Canada, 83–90, 1988. a, b, c

Mackay, J. R.: Lake Stability in an Ice-Rich Permafrost Environment: Examples From The Western Arctic Coast, in: Aquatic Ecosystems in Semi-arid Regions: Implications for Resource Management, edited by: Robarts, R. D. and Bothwell, M. L., Environment Canada, Saskatoon, 1–25, 1992. a, b

Mann, D. H., Peteet, D. M., Reanier, R. E., and Kunz, M. L.: Responses of an arctic landscape to Lateglacial and early Holocene climatic changes: the importance of moisture, Quaternary Sci. Rev., 21, 997–1021, https://doi.org/10.1016/S0277-3791(01)00116-0, 2002. a

Matthes, H., Damseaux, A., Westermann, S., Beer, C., Boone, A., Burke, E., Decharme, B., Genet, H., Jafarov, E., Langer, M., Parmentier, F.-J., Porada, P., Gagne-Landmann, A., Huntzinger, D., Rogers, B. M., Schädel, C., Stacke, T., Wells, J., and Wieder, W. R.: Advances in Permafrost Representation: Biophysical Processes in Earth System Models and the Role of Offline Models, Permafrost. Periglac., 36, 302–318, https://doi.org/10.1002/ppp.2269, 2025. a

Meredith, M., Sommerkorn, M., Cassotta, S., Derksen, C., Ekaykin, A., Hollowed, A., Kofinas, G., Mackintosh, A., Melbourne-Thomas, J., Muelbert, M. M. C., Ottersen, G., Pritchard, H., and Schuur, E. A. G.: Polar Regions, in: IPCC Special Report on the Ocean and Cryosphere in a Changing Climate, edited by: Pörtner, H.-O., Roberts, D., Masson-Delmotte, V., Zhai, P., Tignor, M., Poloczanska, E., Mintenbeck, K., Alegría, A., Nicolai, M., Okem, A., Petzold, J., Rama, B., and Weyer, N., Cambridge Univ. Press, https://doi.org/10.1017/9781009157964.005, 2019. a

Mironov, D., Kirillin, G., Heise, E., Golosov, S., Terzhevik, A., and Zverev, I.: Parameterization of Lakes in Numerical Models for Environmental Applications, paper presented at 7th Workshop on Physical Processes in Natural Water, 135–143, 2003. a

Montero, D.: eemont: A Python package that extends Google Earth Engine, Journal of Open Source Software, 6, 3168, https://doi.org/10.21105/joss.03168, 2021. a

Morgenstern, A., Grosse, G., Günther, F., Fedorova, I., and Schirrmeister, L.: Spatial analyses of thermokarst lakes and basins in Yedoma landscapes of the Lena Delta, The Cryosphere, 5, 849–867, https://doi.org/10.5194/tc-5-849-2011, 2011. a

Mullen, A. L., Watts, J. D., Rogers, B. M., Carroll, M. L., Elder, C. D., Noomah, J., Williams, Z., Caraballo-Vega, J. A., Bredder, A., Rickenbaugh, E., Levenson, E., Cooley, S. W., Hung, J. K. Y., Fiske, G., Potter, S., Yang, Y., Miller, C. E., Natali, S. M., Douglas, T. A., and Kyzivat, E. D.: Using High-Resolution Satellite Imagery and Deep Learning to Track Dynamic Seasonality in Small Water Bodies, Geophys. Res. Lett., 50, e2022GL102327, https://doi.org/10.1029/2022GL102327, 2023. a

Muster, S., Roth, K., Langer, M., Lange, S., Cresto Aleina, F., Bartsch, A., Morgenstern, A., Grosse, G., Jones, B., Sannel, A. B. K., Sjöberg, Y., Günther, F., Andresen, C., Veremeeva, A., Lindgren, P. R., Bouchard, F., Lara, M. J., Fortier, D., Charbonneau, S., Virtanen, T. A., Hugelius, G., Palmtag, J., Siewert, M. B., Riley, W. J., Koven, C. D., and Boike, J.: PeRL: a circum-Arctic Permafrost Region Pond and Lake database, Earth Syst. Sci. Data, 9, 317–348, https://doi.org/10.5194/essd-9-317-2017, 2017. a, b, c

Nitzbon, J., Westermann, S., Langer, M., Martin, L. C. P., Strauss, J., Laboor, S., and Boike, J.: Fast response of cold ice-rich permafrost in northeast Siberia to a warming climate, Nat. Commun., 11, 2201, https://doi.org/10.1038/s41467-020-15725-8, 2020. a

Nitze, I. and Nicholson, T.: Surface water data: Supplementary Dataset used in: Reinken et al.: Stochastic Modelling of Thermokarst Lakes: Size Distributions and Dynamic Regimes, Zenodo [data set], https://doi.org/10.5281/zenodo.15011122, 2025. a, b, c, d, e

Nitze, I., Grosse, G., Jones, B. M., Arp, C. D., Ulrich, M., Fedorov, A., and Veremeeva, A.: Landsat-Based Trend Analysis of Lake Dynamics across Northern Permafrost Regions, Remote Sens.-Basel, 9, 640, https://doi.org/10.3390/rs9070640, 2017. a, b, c, d

Nitze, I., Grosse, G., Jones, B. M., Romanovsky, V. E., and Boike, J.: Remote sensing quantifies widespread abundance of permafrost region disturbances across the Arctic and Subarctic, Nat. Commun., 9, 5423, https://doi.org/10.1038/s41467-018-07663-3, 2018. a

Nitze, I., Cooley, S. W., Duguay, C. R., Jones, B. M., and Grosse, G.: The catastrophic thermokarst lake drainage events of 2018 in northwestern Alaska: fast-forward into the future, The Cryosphere, 14, 4279–4297, https://doi.org/10.5194/tc-14-4279-2020, 2020. a, b

Ohara, N., Jones, B. M., Parsekian, A. D., Hinkel, K. M., Yamatani, K., Kanevskiy, M., Rangel, R. C., Breen, A. L., and Bergstedt, H.: A new Stefan equation to characterize the evolution of thermokarst lake and talik geometry, The Cryosphere, 16, 1247–1264, https://doi.org/10.5194/tc-16-1247-2022, 2022. a

Olefeldt, D., Goswami, S., Grosse, G., Hayes, D., Hugelius, G., Kuhry, P., McGuire, A. D., Romanovsky, V. E., Sannel, A., Schuur, E., and Turetsky, M. R.: Circumpolar distribution and carbon storage of thermokarst landscapes, Nat. Commun., 7, https://doi.org/10.1038/ncomms13043, 2016. a

Olson, D. M., Dinerstein, E., Wikramanayake, E. D., Burgess, N. D., Powell, G. V. N., Underwood, E. C., D'Amico, J. A., Itoua, I., Strand, H. E., Morrison, J. C., Loucks, C. J., Allnutt, T. F., Ricketts, T. H., Kura, Y., Lamoreux, J. F., Wettengel, W. W., Hedao, P., and Kassem, K. R.: Terrestrial Ecoregions of the World: A New Map of Life on Earth, Bioscience, 51, 933–938, https://doi.org/10.1641/0006-3568(2001)051[0933:TEOTWA]2.0.CO;2, 2001. a

Olthof, I., Fraser, R. H., van der Sluijs, J., and Travers-Smith, H.: Detecting long-term Arctic surface water changes, Nat. Clim. Change, 13, 1191–1193, https://doi.org/10.1038/s41558-023-01836-9, 2023. a

Pekel, J.-F., Cottam, A., Gorelick, N., and Belward, A. S.: High-resolution mapping of global surface water and its long-term changes, Nature, 540, 418–422, https://doi.org/10.1038/nature20584, 2016. a, b, c

Pickens, A. H., Hansen, M. C., Hancher, M., Stehman, S. V., Tyukavina, A., Potapov, P., Marroquin, B., and Sherani, Z.: Mapping and sampling to characterize global inwland water dynamics from 1999 to 2018 with full Landsat time-series, Remote Sens. Environ., 243, https://doi.org/10.1016/j.rse.2020.111792, 2020. a, b

Pinsky, M. A. and Karlin, S.: An Introduction to Stochastic Modeling, 4th edn., Academic Press, https://doi.org/10.1016/C2009-1-61171-0, 2011. a, b

Plug, L. J. and West, J. J.: Thaw lake expansion in a two-dimensional coupled model of heat transfer, thaw subsidence, and mass movement, J. Geophys. Res.-Earth, 114, https://doi.org/10.1029/2006JF000740, 2009. a, b

Polishchuk, Y. M., Bogdanov, A. N., Muratov, I. N., Polishchuk, V. Y., Lim, A., Manasypov, R. M., Shirokova, L. S., and Pokrovsky, O. S.: Minor contribution of small thaw ponds to the pools of carbon and methane in the inland waters of the permafrost-affected part of the Western Siberian Lowland, Environ. Res. Lett., 13, 045002, https://doi.org/10.1088/1748-9326/aab046, 2018. a

Rehder, Z., Zaplavnova, A., and Kutzbach, L.: Identifying Drivers Behind Spatial Variability of Methane Concentrations in East Siberian Ponds, Front. Earth Sci., 9, https://doi.org/10.3389/feart.2021.617662, 2021. a

Rehder, Z., Kleinen, T., Kutzbach, L., Stepanenko, V., Langer, M., and Brovkin, V.: Simulated methane emissions from Arctic ponds are highly sensitive to warming, Biogeosciences, 20, 2837–2855, https://doi.org/10.5194/bg-20-2837-2023, 2023. a

Reick, C. H., Gayler, V., Goll, D., Hagemann, S., Heidkamp, M., Nabel, J. E. M. S., Raddatz, T., Roeckner, E., Schnur, R., and Wilkenskjeld, S.: JSBACH 3 – The land component of the MPI Earth System Model: documentation of version 3.2, Berichte zur Erdsystemforschung, https://doi.org/10.17617/2.3279802, 2021. a

Reinken, C., Brovkin, V., and de Vrese, P.: Stochastic Thermokarst Lake Model (v1.0.0), Zenodo [code], https://doi.org/10.5281/zenodo.19053374, 2026. a

Rowland, J. C., Travis, B. J., and Wilson, C. J.: The role of advective heat transport in talik development beneath lakes and ponds in discontinuous permafrost, Geophys. Res. Lett., 38, https://doi.org/10.1029/2011GL048497, 2011. a

Roy-Léveillée, P. and Burn, C.: Old Crow Flats: Thermokarst Lakes in the Forest–Tundra Transition, in: Landscapes and Landforms of Western Canada, edited by: Slaymaker, O., Springer, Cham, 267–276, https://doi.org/10.1007/978-3-319-44595-3_19, 2017. a, b

Schädel, C., Rogers, B. M., Lawrence, D. M., Koven, C. D., Brovkin, V., Burke, E. J., Genet, H., Huntzinger, D. N., Jafarov, E., McGuire, A. D., Riley, W. J., and Natali, S. M.: Earth system models must include permafrost carbon processes, Nat. Clim. Change, 1–3, https://doi.org/10.1038/s41558-023-01909-9, 2024. a

Schlaffer, S., Sabel, D., Bartsch, A., and Wagner, W.: Regional water bodies remote sensing products with links to geotiff images, PANGAEA [data set], https://doi.org/10.1594/PANGAEA.779754, 2012. a

Smith, N. D., Burke, E. J., Aas, K. S., Althuizen, I. H. J., Boike, J., Christiansen, C. T., Etzelmüller, B., Friborg, T., Lee, H., Rumbold, H., Turton, R. H., Westermann, S., and Chadburn, S. E.: Explicitly modelling microtopography in permafrost landscapes in a land surface model (JULES vn5.4_microtopography), Geosci. Model Dev., 15, 3603–3639, https://doi.org/10.5194/gmd-15-3603-2022, 2022. a

Stojkoski, V., Sandev, T., Basnarkov, L., Kocarev, L., and Metzler, R.: Generalised Geometric Brownian Motion: Theory and Applications to Option Pricing, Entropy, 22, 1432, https://doi.org/10.3390/e22121432, 2020. a

Strauss, J., Schirrmeister, L., Grosse, G., Fortier, D., Hugelius, G., Knoblauch, C., Romanovsky, V., Schädel, C., Schneider von Deimling, T., Schuur, E. A. G., Shmelev, D., Ulrich, M., and Veremeeva, A.: Deep Yedoma permafrost: A synthesis of depositional characteristics and carbon vulnerability, Earth-Sci. Rev., 172, 75–86, https://doi.org/10.1016/j.earscirev.2017.07.007, 2017. a

Surdu, C. M., Duguay, C. R., Brown, L. C., and Prieto, D. F.: Response of ice cover on shallow lakes of the North Slope of Alaska to contemporary climate conditions (1950–2011): radar remote-sensing and numerical modeling data analysis, The Cryosphere, 8, 167–180, https://doi.org/10.5194/tc-8-167-2014, 2014. a

Turetsky, M. R., Abbott, B. W., Jones, M. C., Anthony, K. W., Olefeldt, D., Schnuur, E. A. G., Grosse, G., Kuhry, P., Hugelius, G., Koven, C., Lawrence, D. M., Gibson, C., Sannel, A. B. K., and McGuire, A. D.: Carbon release through abrupt permafrost thaw, Nat. Geosci., 13, 138–143, https://doi.org/10.1038/s41561-019-0526-0, 2020. a, b

van Huissteden, J., Berrittella, C., Parmentier, F. J. W., Mi, Y., Maximov, T. C., and Dolman, A. J.: Methane emissions from permafrost thaw lakes limited by lake drainage, Nat. Clim. Change, 1, 119–123, https://doi.org/10.1038/nclimate1101, 2011. a, b, c, d

Velichko, A., Catto, N., Drenova, A., Klimanov, V., Kremenetski, K., and Nechaev, V.: Climate changes in East Europe and Siberia at the Late glacial–holocene transition, Quatern. Int., 91, 75–99, https://doi.org/10.1016/S1040-6182(01)00104-5, 2002. a

Veremeeva, A., Nitze, I., Günther, F., Grosse, G., and Rivkina, E.: Geomorphological and Climatic Drivers of Thermokarst Lake Area Increase Trend (1999–2018) in the Kolyma Lowland Yedoma Region, North-Eastern Siberia, Remote Sens.-Basel, 13, 178, https://doi.org/10.3390/rs13020178, 2021. a

Victorov, A. S., Kapralova, V. N., and Arkhipova, M. V.: Modeling of the Morphological Pattern Development for Thermokarst Plains with Fluvial Erosion Based on Remote Sensing Data, Izv. Atmos. Ocean. Phy.+, 55, 1338–1345, https://doi.org/10.1134/S000143381909055X, 2019a. a, b

Victorov, A. S., Orlov, T. V., Kapralova, V. N., Trapeznikova, O. N., Sadkov, S. A., and Zverev, A. V.: Stochastic Modeling of Natural Lacustrine Thermokarst Under Stable and Unstable Climate, in: Natural Hazards and Risk Research in Russia, edited by: Svalova, V., Innovation and Discovery in Russian Science and Engineering, Springer International Publishing, Cham, 241–267, https://doi.org/10.1007/978-3-319-91833-4_18, 2019b. a, b, c, d, e, f, g, h, i, j, k, l

Victorov, A. S., Kapralova, V. N., and Orlov, T. V.: Developing a Model of the Morphological Pattern for Thermokarst Plains with Fluvial Erosion Involving Remote Sensing Data, Izv. Atmos. Ocean. Phy.+, 59, 1239–1248, https://doi.org/10.1134/S0001433823090220, 2023. a, b

von Baeckmann, C., Bartsch, A., Bergstedt, H., Efimova, A., Widhalm, B., Ehrich, D., Kumpula, T., Sokolov, A., and Abdulmanova, S.: Land cover succession for recently drained lakes in permafrost on the Yamal Peninsula, Western Siberia, The Cryosphere, 18, 4703–4722, https://doi.org/10.5194/tc-18-4703-2024, 2024. a, b

Walter, K. M., Edwards, M. E., Grosse, G., Zimov, S. A., and Chapin, F. S.: Thermokarst lakes as a source of atmospheric CH4 during the last deglaciation, Science, 318, 633–636, https://doi.org/10.1126/science.1142924, 2007. a

Walter Anthony, K., Schneider von Deimling, T., Nitze, I., Frolking, S., Emond, A., Daanen, R., Anthony, P., Lindgren, P., Jones, B., and Grosse, G.: 21st-century modeled permafrost carbon emissions accelerated by abrupt thaw beneath lakes, Nat. Commun., 9, 3262, https://doi.org/10.1038/s41467-018-05738-9, 2018. a, b

Wang, J., Sheng, Y., Hinkel, K. M., and Lyons, E. A.: Drained thaw lake basin recovery on the western Arctic Coastal Plain of Alaska using high-resolution digital elevation models and remote sensing imagery, Remote Sens. Environ., 119, 325–336, https://doi.org/10.1016/j.rse.2011.10.027, 2012. a

Webb, E. and Liljedahl, A.: Diminishing lake area across the northern permafrost zone, Nat. Geosci., 16, 202–209, https://doi.org/10.1038/s41561-023-01128-z, 2023. a, b

Webb, E. E., Liljedahl, A. K., Cordeiro, J. A., Loranty, M. M., Witharana, C., and Lichstein, J. W.: Permafrost thaw drives surface water decline across lake-rich regions of the Arctic, Nat. Clim. Change, 12, 841–846, https://doi.org/10.1038/s41558-022-01455-w, 2022. a

Webb, E. E., Liljedahl, A. K., Loranty, M. M., Witharana, C., and Lichstein, J. W.: Reply to: Detecting long-term Arctic surface water changes, Nat. Clim. Change, 13, 1194–1196, https://doi.org/10.1038/s41558-023-01837-8, 2023. a

Weller, M. W. and Derksen, D. V.: The Geomorphology of Teshekpuk Lake in Relation to Coastline Configuration of Alaska's Coastal Plain, Arctic, 32, 152–160, https://www.jstor.org/stable/40508958 (last access: 27 January 2025), 1979. a

West, J. and Plug, L.: Time-dependent morphology of thaw lakes and taliks in deep and shallow ground ice, J. Geophys. Res., 113, https://doi.org/10.1029/2006JF000696, 2008. a

Westermann, S., Langer, M., Boike, J., Heikenfeld, M., Peter, M., Etzelmüller, B., and Krinner, G.: Simulating the thermal regime and thaw processes of ice-rich permafrost ground with the land-surface model CryoGrid 3, Geosci. Model Dev., 9, 523–546, https://doi.org/10.5194/gmd-9-523-2016, 2016. a

Wik, M., Varner, R. K., Anthony, K. W., MacIntyre, S., and Bastviken, D.: Climate-sensitive northern lakes and ponds are critical components of methane release, Nat. Geosci., 9, 99–105, https://doi.org/10.1038/ngeo2578, 2016. a

Wu, Q.: geemap: A Python package for interactive mapping with Google Earth Engine, Journal of Open Source Software, 5, 2305, https://doi.org/10.21105/joss.02305, 2020. a

Yoshikawa, K. and Hinzman, L. D.: Shrinking thermokarst ponds and groundwater dynamics in discontinuous permafrost near council, Alaska, Permafrost. Periglac., 14, 151–160, https://doi.org/10.1002/ppp.451, 2003. a

Zhang, T., Heginbottom, J., Barry, R., and Brown, J.: Further statistics on the distribution of permafrost and ground ice in the Northern Hemisphere, Polar Geography, 24, 126–131, https://doi.org/10.1080/10889370009377692, 2000. a

Zhou, G., Liu, W., Xie, C., Song, X., Zhang, Q., Li, Q., Liu, G., Li, Q., and Luo, B.: Accelerating thermokarst lake changes on the Qinghai–Tibetan Plateau, Sci. Rep.-UK, 14, 2985, https://doi.org/10.1038/s41598-024-52558-7, 2024. a