Articles | Volume 15, issue 3
Research article
01 Apr 2021

Calving Front Machine (CALFIN): glacial termini dataset and automated deep learning extraction method for Greenland, 1972–2019

Daniel Cheng, Wayne Hayes, Eric Larour, Yara Mohajerani, Michael Wood, Isabella Velicogna, and Eric Rignot

Sea level contributions from the Greenland Ice Sheet are influenced by the rapid changes in glacial terminus positions. The documentation of these evolving calving front positions, for which satellite imagery forms the basis, is therefore important. However, the manual delineation of these calving fronts is time consuming, which limits the availability of these data across a wide spatial and temporal range. Automated methods face challenges that include the handling of clouds, illumination differences, sea ice mélange, and Landsat 7 scan line corrector errors. To address these needs, we develop the Calving Front Machine (CALFIN), an automated method for extracting calving fronts from satellite images of marine-terminating glaciers, using neural networks. The results are often indistinguishable from manually curated fronts, deviating by on average 86.76 ± 1.43 m from the measured front. Landsat imagery from 1972 to 2019 is used to generate 22 678 calving front lines across 66 Greenlandic glaciers. This improves on the state of the art in terms of the spatiotemporal coverage and accuracy of its outputs and is validated through a comprehensive intercomparison with existing studies. The current implementation offers a new opportunity to explore subseasonal and regional trends on the extent of Greenland's margins and supplies new constraints for simulations of the evolution of the mass balance of the Greenland Ice Sheet and its contributions to future sea level rise.

1 Introduction

The evolution of Greenland's tidewater glaciers is an important constraint on the evolution of the Greenland Ice Sheet (Nick et al., 2013). Likewise, changes in Greenland are important in tracking and predicting future sea level rise over the next century (Andersen et al., 2015; Fürst et al., 2015; van den Broeke et al., 2016). Constraining Greenland's glacial evolution is thus an important part of improving the understanding of the Earth system as a whole. One constraint on glacial evolution is the position of glacial calving fronts and ice margins over time (King et al., 2018). While satellite imagery allows for the extensive documentation of this evolving constraint, most calving front delineation is still done with time-consuming manual labor (Carr et al., 2017; Bunce et al., 2018; Catania et al., 2018). This results in the under-utilization of available satellite imagery and causes gaps in seasonal records that introduce uncertainty when modeling past and projected climate change (Catania et al., 2020). Significant efforts have been made to improve this situation, which include the ESA-CCI dataset of 26 Greenlandic glaciers from 1990–2016, the PROMICE dataset of 47 glaciers from 1990–2018, and the MEaSUREs dataset of 200+ glaciers from 2000–2017 (ENVEO, 2017; Andersen et al., 2019; Joughin et al., 2015). Yet the increasing availability of new datasets through missions like Landsat 8 and the release of old datasets through improved reprocessing call for new automated ways of detecting the calving front. In particular, there is a strong need for these automated ways to be robust, specifically against cloud cover, ice mélange, shadows, and Landsat 7 scan line corrector errors. Traditional automated techniques such as the edge detection utilized by Seale et al. (2011) and Paravolidakis et al. (2016) have significant challenges with respect to these issues.
Modern machine-learning techniques and deep neural networks provide a robust, scalable, and accurate solution to these processing challenges. Existing work by Mohajerani et al. (2019) pioneers the usage of these techniques by applying the Ronneberger et al. (2015) UNet deep neural network for Jakobshavn, Helheim, Sverdrup, and Kangerlussuaq glaciers. It achieves a mean distance error of 96.3 m, but is restricted by the preprocessing requirement of aligning the flow direction to be vertical and inability to handle branching/nonlinear calving fronts. Zhang et al. (2019) evaluates a modified UNet applied to TerraSAR-X data over Jakobshavn Glacier and achieves a mean distance error of 104 m but is limited in scope. Baumhoer et al. (2019) expands the application of the UNet to Sentinel-1 imagery of Antarctica, extracting full coastline delineations and achieving a mean distance error of 108 m. Ultimately, these case studies provide the groundwork for the automatic, accurate, large-scale, long-time-series, high-temporal-resolution, and potentially multi-sensor extraction of glacial terminus positions. This study seeks to assess the feasibility of achieving robust automatic extraction for a selection of Greenland's glaciers and to provide the resulting dataset for use by the wider community. Additionally, this study seeks to assess improvements to the neural network design and post-processing methods.

In this study, Sect. 2 covers the data source along with the spatial and temporal coverage. Section 3 examines the CALFIN algorithm and method for processing the data. Section 4 validates the algorithm through error analysis. Sections 5 and 6 show and discuss the results – the calving front dataset and algorithm.

2 Data source and scope

For the production of the CALFIN dataset, Landsat optical images are used for their long time series availability and reasonable spatial distribution/resolution. The area of interest for the dataset production is restricted to Greenland, in particular the calving fronts for 66 Greenlandic basins shown in Fig. 1, spanning the 1972 to 2019 time period shown in Fig. 2. The basins are selected for their high discharge volumes, wide spatial distribution, and diverse morphological features. The product used is the 60/30 m resolution near-infrared band. The 15 m resolution panchromatic band was not used, due to computational and logistical limitations. A unique characteristic of this data source is the presence of Landsat 7 scan line corrector errors from 2003–2013, which manifest as black stripes that interfere with automated calving front extraction methods.

For the training and validation of the CALFIN methodology, TerraSAR-X and Sentinel-1A/B synthetic-aperture radar (SAR) images are added to enforce the applicability of the method across different sensors and domains. The area of interest for the training and validation of the methodology thus includes Antarctic SAR data in addition to the Greenlandic Landsat optical data (see Sect. 3.2 and Fig. S4 in the Supplement). The TerraSAR-X product used is the stripmap 3 m resolution HH polarization band. The Sentinel-1A/B product used is the Extra Wide Swath, Ground Range Multi-Look Detected, 40 m resolution HH polarization band. The other data products and polarization bands are not used since the backscatter intensity provides sufficient information for the data processing methodology to succeed. A characteristic of SAR data is the presence of speckle noise, which is addressed by the methodology described in the following section.

Figure 1. Spatial coverage map. Spatial distribution of 66 selected Greenlandic glaciers. The velocity map is taken from Nagler et al. (2015).

Figure 2. Temporal coverage map. Number of fronts per year from 1972–2019 for 10 high discharge volume basins. For the full temporal coverage map, see Supplement Fig. S1.


3 Methods

The automated data processing methodology uses innovative techniques and state-of-the-art neural networks to process raw Landsat and Sentinel-1A/B data into useful calving front shapefiles. The following section explores this methodology, as outlined by the flowchart below (Fig. 3).

Figure 3. Methodology flowchart. The CALFIN workflow, which processes single band raster imagery into calving front and ocean mask shapefiles. Note that Sentinel-1A/B imagery is only used for validation, as it is not corrected and thus not qualified for geolocation/extraction.


3.1 Preprocessing

The first stage involves preprocessing the input data for use with the neural network, as illustrated in Fig. 4. The following steps cover the details of handling Landsat data but can be applied to Sentinel-1 data for validation purposes.

Figure 4. Preprocessing pipeline. (a) First, input the raw Landsat GeoTIFF rasters with < 20 % clouds. (b) Next, subset using QGIS/GDAL and the domain shapefile to clip each raster. (c) Then, filter the clouded/NODATA subsets. (d) Now, resize the subsets to 256×256 px. (e) Finally, enhance contrast and stack with the raw subset.


To begin, raster images are selected from areas centered around one of nine primary glacial basins. These basins include Kong Oscar, Hayes, Rink Isbrae, Upernavik, Jakobshavn, Kangiata Nunaata Sermia, Helheim, Kangerlussuaq, and Petermann. Next, all L1TP (precision and terrain corrected) rasters from Landsat 1–8 with low cloud coverage (< 20 %) are collected. A few L1GS/L1GT (non-corrected) products are also selected, which are manually georeferenced, and used to fill in Landsat 1–2 time series gaps (1972–1985). This results in a total of 4956 Landsat rasters. Next, predefined basin domain shapefiles that enclose the terminus are used to clip the Landsat raster subsets. Additional filtering removes subsets that still contain > 30 % NODATA pixels or > 20 % cloud pixels detected in the Landsat QA band, as subsets that exceed these thresholds are not likely to contain detectable fronts. At this stage, 20 188 GeoTIFF subsets are accumulated. Each subset is then resized to 256×256 px and lastly enhanced using pseudo-HDR toning (HDR) and shadows/highlights (S/H) through Adobe Photoshop. The raw, HDR, and S/H enhanced subsets are then stacked into a single RGB image. At this point, the images are ready for processing into calving front masks.
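The subset filtering step can be sketched as follows. This is a minimal illustration rather than the CALFIN implementation: the function names, the fill value `NODATA_VALUE = 0`, and the boolean cloud mask (standing in for flags decoded from the Landsat QA band) are assumptions made here. Only the 30 % and 20 % thresholds come from the text.

```python
NODATA_VALUE = 0  # assumed fill value for pixels outside the scene footprint


def flagged_fraction(flags):
    """Fraction of True entries in a 2-D list of booleans."""
    total = sum(len(row) for row in flags)
    hits = sum(1 for row in flags for flag in row if flag)
    return hits / total if total else 0.0


def keep_subset(pixels, cloud_mask, max_nodata=0.30, max_cloud=0.20):
    """Return True if a clipped subset passes both filtering thresholds.

    pixels     -- 2-D list of band values for the clipped subset
    cloud_mask -- 2-D list of booleans (hypothetical decoding of the QA band)
    """
    nodata_flags = [[value == NODATA_VALUE for value in row] for row in pixels]
    return (flagged_fraction(nodata_flags) <= max_nodata
            and flagged_fraction(cloud_mask) <= max_cloud)
```

A subset with 25 % NODATA pixels and about 6 % cloud pixels would be kept under these thresholds, while one with 31 % NODATA pixels would be dropped.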

3.2 Neural network processing

Images are processed using the Calving Front Machine neural network (CALFIN-NN), as illustrated in Fig. 5. Neural networks like CALFIN-NN work by learning patterns in training data and finding them in new data. CALFIN-NN is trained using manually delineated calving front masks. Once trained, CALFIN-NN outputs a probability mask that shows each pixel's likelihood of lying on the coastline/calving front. CALFIN-NN also generates an ice/ocean probability mask as a secondary output. Following this, the calving front is extracted during post-processing, discussed in Sect. 3.3.

Figure 5. The CALFIN-NN processing architecture. Each orange “Xception” block consists of convolution kernels that detect features in the previous block. Blocks are reduced in size periodically to pool increasingly complex and numerous feature maps. U-shaped connections help refine the probability masks during up-sampling. Note that the seven repeated Xception blocks in the middle section are omitted for brevity.


Neural networks are the foundation of several automated delineation methods, including Mohajerani et al. (2019), Zhang et al. (2019), and Baumhoer et al. (2019). This method builds upon this work and uses a modification of the DeepLabV3+ Xception neural network from Chen et al. (2018), as shown in Fig. 5. The first half, the encoder, uses the Xception-65 network to extract image features (Chollet, 2017). It does this by assembling basic features, like edges and corners, into more abstract features, such as glacier/land textures. The second half of the network, the decoder, takes the output of the encoder and up-samples the features to predict the final probability mask outputs.

Several architectural modifications are made to the original DeepLabV3+ Xception model to enhance its performance. To accurately recognize line-like features such as calving fronts, additional Atrous Spatial Pyramidal Pooling (ASPP) blocks are added in between the encoder and decoder, with the dilation scales 0, 1, 2, 3, 4, and 5. The number of middle blocks (MB in Fig. 5) is reduced from 16 to 8, as the extra discriminative power from those blocks is not needed. The input size is reduced from 512 to 224 px to facilitate better computational performance, allowing for additional training and thus higher accuracy. Since the input resolution is reduced, the encoder is also modified to remove several down-sampling “max-pool” layers. The last contribution adds a two-channel output to the decoder, allowing for both calving front masking and ice/ocean masking. Together, these changes reduce the number of model parameters from 40 to 29 M while also increasing the overall accuracy.

Several techniques are used during the training of CALFIN-NN to improve its performance. First, a large set of training data is manually delineated (see Fig. S4), totalling 1541 Landsat and 232 Antarctic Sentinel-1A/B image–mask pairs, with the Antarctic data taken from the same training scenes used by Baumhoer et al. (2019). Data augmentation is used to increase the accuracy of the network by expanding the training set, which entails adding random amounts of flips, Gaussian noise, sharpening filters, rotations of up to 12°, crops, and scaling to the pre-processed training images. Through empirical testing, it is determined that excessive image padding, rotation, warping, and cropping of calving fronts too close to the image bounds result in suboptimal performance. Another helpful technique is the use of test-time augmentations, wherein each image subset is cut into nine overlapping 224×224 image windows and processed individually, before being reassembled into the final 256×256 output mask. This allows for multiple independent classifications of the central pixels, ensuring agreement and confidence in detected calving fronts. To increase accuracy, a custom loss function optimizes the binary cross entropy and intersection over union (see Eq. 1, Sect. 4.1) (Mannor et al., 2005). This penalizes mismatches between calving front pixels in the predicted (I_cf) and measured (Î_cf) image masks. Mismatched ice/ocean pixels in the predicted (I_io) and measured (Î_io) image masks are less heavily weighted by an empirically chosen factor of α = 1/25, as seen in the final loss function in Eq. (2).
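The windowing used for test-time augmentation can be illustrated with a short sketch. The text specifies nine overlapping 224×224 windows and a 256×256 output; the 3×3 grid of evenly spaced offsets and the averaging of overlapping predictions during reassembly are assumptions made here, as are all function names.

```python
def window_offsets(full=256, window=224, per_axis=3):
    """Evenly spaced window origins along one axis, e.g. [0, 16, 32]."""
    step = (full - window) // (per_axis - 1)
    return [i * step for i in range(per_axis)]


def split_windows(image, window=224, per_axis=3):
    """Cut a square 2-D list into per_axis**2 overlapping windows.

    Returns a list of ((row, col), window_pixels) pairs.
    """
    offsets = window_offsets(len(image), window, per_axis)
    return [((r, c), [row[c:c + window] for row in image[r:r + window]])
            for r in offsets for c in offsets]


def reassemble(windows, full=256):
    """Average overlapping window predictions back into one full-size mask."""
    total = [[0.0] * full for _ in range(full)]
    count = [[0] * full for _ in range(full)]
    for (r, c), win in windows:
        for i, row in enumerate(win):
            for j, value in enumerate(row):
                total[r + i][c + j] += value
                count[r + i][c + j] += 1
    return [[total[i][j] / count[i][j] for j in range(full)]
            for i in range(full)]
```

With these offsets, the central pixels are covered by all nine windows while the corner pixels are covered by only one. In practice each window would be passed through the network between `split_windows` and `reassemble`.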


After integrating these improvements, CALFIN-NN is trained for a total of 80 epochs, with 4000 batches per epoch and eight images per batch. Training is carried out on a K40 NVIDIA Tesla GPU with 12 GB of VRAM, with each epoch taking about 126 min to complete and almost 1 week in total to obtain the optimal weights at epoch 65. Once trained, an NVIDIA GTX1060 with 6 GB VRAM is used for the offline data processing of the 20 188 GeoTIFF subsets. The CALFIN algorithm takes about 3.5 d to process all of the subsets into calving fronts, excluding preprocessing but including post-processing, as discussed in the following section.
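As a quick check on the training budget, the figures quoted above can be combined directly. The constant and function names below are illustrative only; the numbers are taken from the text.

```python
MINUTES_PER_EPOCH = 126
BATCHES_PER_EPOCH = 4000
IMAGES_PER_BATCH = 8


def training_days(epochs, minutes_per_epoch=MINUTES_PER_EPOCH):
    """Wall-clock training time in days for a given number of epochs."""
    return epochs * minutes_per_epoch / (60 * 24)


images_per_epoch = BATCHES_PER_EPOCH * IMAGES_PER_BATCH  # 32000 augmented images
days_to_best_weights = training_days(65)                 # about 5.7 days
days_for_full_run = training_days(80)                    # 7.0 days
```

Reaching the optimal weights at epoch 65 therefore takes a little under 6 days, and the full 80 epochs take 7 days.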

3.3 Post-processing

At this stage, the two-channel pixel mask output of CALFIN-NN is post-processed to extract the shapefile data products (Fig. 6).

Figure 6. Postprocessing pipeline. (a) First, get the processed image from CALFIN-NN. (b) Then, isolate and re-process each front. (c) Next, filter unconfident predictions. (d) Now, fit line and mask static coastline (see also Fig. 7). (e) Lastly, export and validate the shapefile.

First, a polyline is fit to the pixel mask to retrieve the correct coastline boundary. This is performed by converting each pixel in the mask to nodes in a graph, connecting the nearest neighboring nodes, and then finding the single longest path in the graph's minimum spanning tree (MST) (Kruskal, 1956). This path not only corresponds with the coastline edge, but also outperforms outputs from other contour-finding algorithms by eliminating noise, errors, and gaps inherited from previous steps. Such gaps are given weights based on the negative exponential distances between nodes, which allows for connections if the joined paths are significantly longer than the gap itself. A visual example is given in Fig. 7a–d.
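The mask-to-polyline idea can be sketched in plain Python. This is a simplified illustration, not the CALFIN code: it connects each pixel to a fixed number `k` of nearest neighbors and uses plain Euclidean edge weights, whereas the text describes negative-exponential gap weights. The MST is built with Kruskal's algorithm and the longest path is found with the standard two-pass tree traversal. All function names are hypothetical.

```python
import math
from collections import defaultdict


def knn_edges(points, k=4):
    """Undirected edges (distance, i, j) from each point to its k nearest neighbors."""
    edges = set()
    for i, p in enumerate(points):
        nearest = sorted((math.dist(p, q), j)
                         for j, q in enumerate(points) if j != i)[:k]
        for d, j in nearest:
            edges.add((d, min(i, j), max(i, j)))
    return sorted(edges)


def minimum_spanning_tree(n, edges):
    """Kruskal's algorithm with union-find; returns an adjacency list."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = defaultdict(list)
    for d, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            tree[i].append((j, d))
            tree[j].append((i, d))
    return tree


def farthest_node(tree, start):
    """Traverse the tree from start; return the farthest node and the parent map."""
    dist, prev, stack = {start: 0.0}, {start: None}, [start]
    while stack:
        u = stack.pop()
        for v, d in tree[u]:
            if v not in dist:
                dist[v], prev[v] = dist[u] + d, u
                stack.append(v)
    return max(dist, key=dist.get), prev


def longest_path(points, k=4):
    """Longest path in the MST of the k-nearest-neighbor graph of mask pixels."""
    if not points:
        return []
    tree = minimum_spanning_tree(len(points), knn_edges(points, k))
    end_a, _ = farthest_node(tree, 0)
    end_b, prev = farthest_node(tree, end_a)
    path, node = [], end_b
    while node is not None:
        path.append(points[node])
        node = prev[node]
    return path[::-1]
```

For a straight run of mask pixels with one stray pixel attached to its side, the returned path follows the straight run and drops the stray pixel, which is the noise-rejecting behavior described above. If the neighbor graph is disconnected, this sketch only follows the component containing the first pixel.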

Figure 7. Mask to polyline algorithm. (a) First, extract the coastline mask (red/yellow) from the CALFIN-NN output. (b) Then create a graph, connecting each pixel (red) to 15 % of its nearest neighbors with an edge (black). (c) Next, create an MST from the graph. (d) Now, extract the longest path from the MST. (e) Finally, mask the static coastline using the fjord boundaries (cyan) to extract the calving front.

Next, the calving front is isolated from the coastline polyline. Static masks of the average fjord boundaries are manually created for each basin using the image subsets and BedMachine v3 for reference (Morlighem et al., 2017). By calculating the distance from each point in the coastline to the nearest fjord boundary pixel, and then selecting the contiguous pixels which are the farthest from the fjord boundaries, the calving front can be isolated. The result of this is shown in Fig. 7e.
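A minimal sketch of this isolation step is given below. It assumes the coastline is an ordered list of pixel coordinates and that "farthest from the fjord boundaries" is approximated by a fixed distance threshold (`min_distance`, a hypothetical parameter not given in the text); the longest contiguous run of points beyond that threshold is taken as the calving front.

```python
import math


def distance_to_boundary(point, boundary):
    """Distance from one coastline point to the nearest fjord boundary pixel."""
    return min(math.dist(point, b) for b in boundary)


def isolate_front(coastline, fjord_boundary, min_distance=2.0):
    """Longest contiguous run of coastline points away from the fjord walls.

    coastline      -- ordered list of (x, y) polyline points
    fjord_boundary -- list of (x, y) static boundary pixels
    min_distance   -- assumed threshold in pixels
    """
    best, current = [], []
    for point in coastline:
        if distance_to_boundary(point, fjord_boundary) >= min_distance:
            current.append(point)
            if len(current) > len(best):
                best = list(current)
        else:
            current = []
    return best
```

For a coastline that runs down one fjord wall, crosses the fjord, and runs back up the other wall, only the crossing segment is returned.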

Once each front is located, its bounding box is used to extract a higher-resolution subset from the original image, which is then reprocessed. This innovation allows for increased spatial accuracy when processing multiple fronts in large basins. After reprocessing, the nature of CALFIN-NN's two-channel output as a confidence measure is exploited to filter out uncertain detections. Since the neural network assigns each pixel a value between 0 and 1 based on its perceived class, any deviation from these two values can be used as a measure of uncertainty. The filtering method averages the deviation of the ice/ocean classification mask in a 5-pixel-wide buffer around the calving front and discards any fronts whose mean deviation exceeds an empirically chosen threshold of 0.125.
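The confidence filter can be sketched as follows. The deviation of a pixel is taken here as its distance from the nearer of 0 and 1, which is one natural reading of the description. The function names are hypothetical, and the buffer pixels are assumed to have been collected already.

```python
def mean_deviation(probabilities):
    """Mean distance of each probability from the nearer of 0 and 1."""
    return sum(min(p, 1.0 - p) for p in probabilities) / len(probabilities)


def is_confident(buffer_probabilities, threshold=0.125):
    """Keep a front only if the ice/ocean mask near it is decisive.

    buffer_probabilities -- ice/ocean probabilities sampled in the buffer
                            around the front (assumed precomputed)
    threshold            -- 0.125, the empirically chosen value from the text
    """
    if not buffer_probabilities:
        return False  # nothing to judge; treated as unconfident in this sketch
    return mean_deviation(buffer_probabilities) <= threshold
```

A buffer of values such as 0.02, 0.97, 0.05, and 0.99 has a mean deviation of about 0.03 and passes, whereas values clustered near 0.5 fail.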

The last step is to export the polylines and the corresponding polygon as georeferenced shapefiles. First, the polylines are smoothed to eliminate noise artifacts inherited from previous steps, deviating no more than 1 pixel from the raw extracted coastline (see Supplement Fig. S2). Next, the smoothed polylines, fjord boundary mask, and land-ice/ocean masks are combined to create a polygonal ocean mask. Optionally, manual verification of each output with the original GeoTIFF subset can be performed. This was done for all cases in this study to ensure the validity of the automated pipeline. This constrains the mean distance error to be < 100 m, as covered in the following section.

4 Validation

Two methods are used to evaluate CALFIN. For the primary method, the error is estimated by calculating the mean–median distance between predicted and manually delineated fronts (see Fig. 8a and Sect. 4.1). For the secondary method, the classification accuracy is calculated with the intersection-over-union metric (see Fig. 8b and Sect. 4.2). Additionally, the detection accuracy is evaluated, and the associated confusion matrix is provided (see Table 1 and Sect. 4.4). These metrics are evaluated on several validation sets, taken from existing studies as discussed in Sect. 1. These validation sets contain data that are excluded during model training. This prevents the models from memorizing data and skewing the accuracy assessment.

Figure 8. Error measures. (a) A visual outline of mean–median distance error estimation and (b) classification accuracy using intersection over union (IoU) for (i) the primary calving front and (ii) the secondary ice/ocean mask, respectively.


4.1 Error estimation

The primary quality assessment method is the mean distance error (Mohajerani et al., 2019; Zhang et al., 2019; Baumhoer et al., 2019). Conceptually, this method resembles the numerical integration of the area between two curves, normalized by the average length of the curves (see Fig. 8a). Also referred to as the area over front (A/F) in literature, this method can also be seen as a generalization of the method of transects along arbitrarily oriented fronts (Mohajerani et al., 2019; Baumhoer et al., 2019). This metric is implemented by taking the mean–median of the distances between closest pixels in the predicted and manually delineated fronts. Note that pixel distance is biased to be inversely proportional to a network's input size, so the error in meters is also provided in the following analysis.
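One plausible reading of this metric is sketched below: nearest-pixel distances are collected in both directions between the two fronts, and their mean and median are reported in pixels or, given a pixel size, in meters. The symmetric pooling, the function names, and the `meters_per_pixel` parameter are assumptions and may differ from the CALFIN implementation.

```python
import math
from statistics import mean, median


def nearest_distances(front_a, front_b):
    """Distance from each pixel in front_a to its closest pixel in front_b."""
    return [min(math.dist(p, q) for q in front_b) for p in front_a]


def distance_errors(predicted, manual, meters_per_pixel=1.0):
    """Return (mean, median) of closest-pixel distances, pooled in both directions."""
    distances = (nearest_distances(predicted, manual)
                 + nearest_distances(manual, predicted))
    return (mean(distances) * meters_per_pixel,
            median(distances) * meters_per_pixel)
```

When one end of a predicted front strays far from the manual front, the mean rises while the median stays put, which is why both statistics are useful.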

4.2 Classification accuracy

The secondary quality assessment method calculates the intersection over union (IoU) (Baumhoer et al., 2019). This metric evaluates the degree of overlap between the predicted and manually delineated masks of the calving front. It is calculated by dividing the number of pixels in the intersection of two masks over the number of pixels in the union of the two masks (see Fig. 8b). When calculating the IoU of 3-pixel-wide edges, this measure is very strict: 1 pixel of difference results in a score of 0.5, and scores at or above that range are indicative of human levels of accuracy. When calculating the IoU of land-ice/ocean masks, this measure is less strict, and scores at or above 0.9 indicate human levels of accuracy.
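The strictness of IoU on thin edges can be verified with a small worked example. Masks are represented here as sets of pixel coordinates, a choice made for illustration only. Treating two empty masks as a perfect match is likewise a convention chosen for this sketch.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two masks given as iterables of (row, col)."""
    a, b = set(mask_a), set(mask_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# A 3-pixel-wide horizontal edge, and the same edge shifted down by 1 pixel.
edge = {(row, col) for row in range(0, 3) for col in range(10)}
shifted = {(row, col) for row in range(1, 4) for col in range(10)}
score = iou(edge, shifted)  # 20 shared pixels / 40 pixels in the union
```

The two edges share 2 of 4 occupied rows, so a 1-pixel offset yields an IoU of 0.5, matching the figure quoted above.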

4.3 Validation results

The following subsections show tables with the above metrics for the associated validation sets, the values from the original studies, and a subset of the outputs of CALFIN-NN on each. The primary validation set, the CALFIN validation set (CALFIN-VS), consists of 162 images with clouds, illumination differences, ice mélange, and Landsat 7 scan line corrector errors (L7SCEs). The CALFIN-VS contains data from 62 Greenlandic basins, including Helheim, which was specifically excluded from CALFIN's training set for validation purposes – as done by Mohajerani et al. (2019). The CALFIN-VS ensures CALFIN-NN produces consistent results on new data, addressing concerns raised by Zhang et al. (2019), Sect. 7.3. To evaluate performance on Landsat 7 scan line corrector errors, the validation subset CALFIN-VS-L7-only isolates images with L7SCEs, and the CALFIN-VS-L7-none excludes images with L7SCEs. To allow for comparisons between studies, CALFIN-NN's performance metrics on previous studies' validation sets are also shown, where appropriate. The sets include the 10 Landsat Helheim subsets used in Mohajerani et al. (2019) (M-VS), the six TerraSAR-X Jakobshavn subsets used in Zhang et al. (2019) (Z-VS), and 62 Sentinel-1 Antarctic basins taken from the 11 validation scenes used in Baumhoer et al. (2019) (B-VS). Note that the error metrics are still sensitive to how each study implements them, which are nevertheless reproduced and documented for comparison's sake. These concerns are also addressed in the comprehensive inter-model comparison, discussed in Sect. 6.

CALFIN-NN performs well on the CALFIN-VS (Fig. 9). The true mean distance error of the CALFIN dataset is calculated to be 86.76 ± 1.43 m with 95 % confidence. When including only images with L7SCEs (CALFIN-VS-L7-only), the error is 91.93 m, showcasing CALFIN-NN's unique robustness to L7SCEs. Intuitively, excluding “difficult” images with L7SCEs in the validation set (CALFIN-VS-L7-none) decreases the error to 81.65 m. The median distance error is only 44.59 m, showing that only a few outliers contribute considerably to the mean. For full outputs, see Supplement Figs. S5–S8.

Figure 9. CALFIN-VS validation output results. Yellow represents human (green) and machine (red) agreement on the front location. Note that the drop in mean pixel distance despite the increase in mean meter distance (and vice versa) comes from L7SCE images being reprocessed at lower sizes due to detection failures (see Fig. 6c) and pixel error bias being inversely related to input size (see Sect. 4.1).

Figure 10. M-VS validation output results. Note that CALFIN-NN has never trained on Helheim but can still predict the front under different conditions and preprocessing methods. See Fig. S9 for full outputs.

Figure 11. Z-VS validation output results. CALFIN-NN works well on SAR data in addition to optical data. See Fig. S10 for full outputs.

Figure 12. B-VS validation output results. Similar to Z-NN, B-NN uses a high-resolution input (768×768) relative to CALFIN-NN (224×224), which skews the mean pixel distance comparison in CALFIN-NN's favor. See Figs. S11–S12 for full outputs.

CALFIN-NN performs well on the M-VS (Fig. 10). This demonstrates CALFIN-NN's ability to accurately process new data, which builds upon the Mohajerani et al. (2019) neural network (M-NN). Note that M-NN implements distance errors differently and omits ice/ocean masks from the evaluation. These differences are further explored in the Sect. 6 model intercomparison.

CALFIN-NN performs competitively on the Z-VS (Fig. 11). It achieves a similar mean meter distance (115.24 m vs. 104 m) despite being constrained to using lower-resolution TerraSAR-X data. Note though that the Zhang et al. (2019) neural network (Z-NN) uses higher-resolution input data (960×720) compared to CALFIN-NN (224×224), which skews the mean pixel distance comparison, where CALFIN-NN performs better (2.11 px vs. 17.3 px). Another source of skew comes from CALFIN-NN confidence filtering, as only 8 of 12 fronts in the set are confidently detected (see Sect. 4.4). Increasing CALFIN-NN’s input resolution and training on higher-resolution SAR data may enable CALFIN-NN to detect more fronts with greater accuracy.

CALFIN-NN underperforms on the B-VS (Fig. 12). When comparing the mean distance error with the Baumhoer et al. (2019) equivalent area-over-front (A/F) error, the Baumhoer et al. (2019) neural network (B-NN) outperforms CALFIN-NN (108 m for B-NN vs. 330.63 m for CALFIN-NN). Note that the easily detected static coastlines are masked out, raising the relative error and negatively impacting CALFIN-NN’s performance on this metric. When comparing metrics that isolate the calving front, the absolute median distance error is calculated (achieving 112.75 m), whereas Baumhoer et al. (2019) uses signed median distance error (achieving 0 m), which is not directly comparable in this context and thus omitted. Currently, the error is affected by kilometer-range deviations in very large domains like Voyeykov Ice Shelf and differences in sea ice mélange as seen along the Gillet and Wordie ice shelves, which is consistent with findings in Baumhoer et al. (2019), Sect. 5.2. After excluding such outliers, fronts are detected in 55 out of 62 domains (88.71 %), achieving median distance errors of 0.95 px (127.87 m). Intensive retraining on ice shelves may be required for CALFIN-NN to improve.

4.4 Detection accuracy

Lastly, CALFIN-NN is shown to automatically filter images that do not have detectable calving fronts. To verify this, 13 images are included in the CALFIN-VS which do not contain calving fronts discernible to the human eye. The true positive (TP), true negative (TN), false positive (FP), and false negative (FN) rates are computed for the entire 162 image CALFIN-VS, and the associated confusion matrix is shown in Table 1. Note that CALFIN-NN does not output any false positives on the CALFIN-VS. While this ensures accurate fronts are output rather than incorrect fronts, this filtering behavior removes potentially large errors and must be accounted for when comparing errors across other sets.

Table 1. Confusion matrix: CALFIN-NN misses fronts in 8 of 149 valid CALFIN-VS images, but this trade-off is acceptable.
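The confusion matrix counts can be derived from the figures stated in the text and caption (162 images, 13 without a discernible front, 8 missed fronts, no false positives). The sketch below performs that derivation and computes a few standard summary rates. The function name is hypothetical, and the summary rates are standard definitions rather than values quoted from the study.

```python
def confusion_summary(total=162, no_front=13, missed=8, false_positives=0):
    """Derive confusion matrix counts and standard rates from the stated figures."""
    valid = total - no_front          # images that do contain a front
    tp = valid - missed               # fronts correctly detected
    fn = missed                       # fronts present but filtered out
    fp = false_positives              # fronts reported where none exist
    tn = no_front - false_positives   # frontless images correctly rejected
    return {
        "TP": tp, "FN": fn, "FP": fp, "TN": tn,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "accuracy": (tp + tn) / total,
    }


summary = confusion_summary()
```

Under these figures, 141 of 149 valid fronts are detected (recall of roughly 0.95) with a precision of 1.0, which reflects the filter's bias toward discarding uncertain fronts rather than emitting wrong ones.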


5 Results and discussion

The code implementation of the CALFIN method is released, along with its associated calving front data products as described in the following section, for use within the scientific community. The CALFIN dataset spans 66 Greenlandic basins, over the period September 1972–June 2019. It consists of over 1500 manual delineations and 22 678 total calving fronts. Two levels of CALFIN data products are provided. The Level 0 products include the shapefile domains used for subsetting, the neural network training image–mask pairs, the fjord boundary masks, the full Landsat scene ID list, and the quality assurance images for validation purposes. The use cases of Level 0 products may include studies of reproducibility, validation, or training new neural networks. The Level 1 products include the calving front polyline and polygon shapefiles. The polyline product consists of the isolated, refined, georeferenced, and verified calving fronts for each domain. The polygon product consists of an ocean mask bounded by the domain subset, the fjord boundaries, and the calving front(s), for each domain. Both of the shapefiles share a common metadata feature schema (see Table S2) derived from the MEaSUREs glacial termini dataset (Moon and Joughin, 2008; Joughin et al., 2015), and names are derived from Bjørk et al. (2015). These products can be found via these links to (last access: 29 March 2021) and (Cheng et al., 2020).

Figure 13. Terminus advance and retreat over time. (a–j) Basin setup (left) and graph (right) for 10 high-discharge basins. Positive length change represents retreat relative to the earliest position along the centerlines in red. Note the seasonal variations captured by CALFIN, in blue. Time series for other studies span 1990–2016 (ESA-CCI), 2000–2017 (MEaSUREs), and 1999–2019 (PROMICE). Note the seasonal variations shown by the solid lines and the dotted lines from 1972–1985 that indicate a lack of such seasonal observations. Also note that the vertical axis scaling is applied differently for each graph to highlight seasonal trends.

With the new data available to use in the CALFIN dataset, it is possible to explore seasonal trends across the Greenland Ice Sheet and validate a subset of 10 high-discharge basins of interest against existing ESA-CCI, MEaSUREs, and PROMICE data products (ENVEO, 2017; Joughin et al., 2015; Andersen et al., 2019). Figure 13 shows the high temporal resolution and spatial accuracy of the CALFIN data product alongside corresponding available data products from 1972–2019. For Joughin et al. (2015), if a date range is given, the same relative change at both start and end dates (Moon and Joughin, 2008) is plotted. For Andersen et al. (2019), 15 August is used as the end-of-melt-season date of delineation, as the date is otherwise not specified in the provided data. The advance and retreat of the calving front along the basin centerlines is relative to their earliest positions. Note the large improvement in temporal/seasonal coverage and the general agreement of CALFIN with existing data products. Note also that the discrepancies such as that during 2005–2009 in Jakobshavn (Fig. 13e) mostly stem from a lack of winter coverage during Landsat's optical blackout period. Additional outliers in Kong Oscar (Fig. 13g) stem from the somewhat arbitrary delineation of the ice tongue front position. Kangiata Nunaata Sermia (Fig. 13j) suffers from both of the aforementioned effects but otherwise shows the same general agreement with existing datasets from 2000 onwards.
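The relative length change plotted for each basin can be illustrated with a short sketch. It assumes that each observation gives the terminus position as a distance in meters along the centerline, measured from a fixed upstream origin, so that a smaller value means a more retreated front. This coordinate convention and the function name are assumptions; the sign convention (positive means retreat relative to the earliest position) follows the Fig. 13 caption.

```python
from datetime import date


def length_change(observations):
    """Retreat (positive) or advance (negative) relative to the earliest front.

    observations -- list of (date, position_m) pairs, where position_m is the
                    terminus distance along the centerline from an upstream
                    origin (assumed convention)
    """
    ordered = sorted(observations)
    reference = ordered[0][1]
    return [(when, reference - position) for when, position in ordered]


series = length_change([
    (date(1985, 7, 1), 5000.0),
    (date(1972, 9, 1), 5200.0),
    (date(2019, 6, 1), 3000.0),
])
```

In this made-up example the front has retreated 200 m by 1985 and 2200 m by 2019 relative to its 1972 position.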

Figure 14. Regional terminus advance and retreat over time. (a) Regional delineations (left) and terminus position graphs (right) for Greenland (b), as well as the northwestern (c), central western (d), central eastern (e), southeastern (f), and southwestern (g) regions. Note that the total Greenland mean advance and retreat is unadjusted and dominated by the trend lines of numerous smaller glaciers in CW and NW Greenland. Note that branches in the 66 studied basins are independently counted, for a total of 87 glaciers.

Additionally, Fig. 14 shows the regional mean advance and retreat change, alongside the mean for the entirety of Greenland covered by the CALFIN dataset. Contributions from NW Greenland influence the overall trend the most, due to the presence of many small glaciers/branches in the region. Note that the mean for Greenland also includes contributions from Petermann, which is visible in the summers of 2010 and 2012. Shared regional trends are visible across NW and CW Greenland, which both show relative stability before 2000, followed by steady retreat up until 2017–2018. CE and SE Greenland also share a similar but less pronounced retreat, showing an accelerating retreat beginning around 1995. These regional trends are less visible in SW Greenland, which is dominated by Narsap Sermia's retreat from 2010–2013. Overall, these regional trends generally agree with studies such as Wood et al. (2021) and King et al. (2020), helping further validate the CALFIN method and data.

6 Inter-model comparison

To further reinforce the validity of the study, and address the shortcomings of different error metric comparisons (as discussed in Sect. 4.3), a comprehensive inter-model comparison is conducted between CALFIN-NN and the model developed by Mohajerani et al. (2019) (M-NN). This experiment seeks to understand how both models perform, holding all other variables constant. In particular, this experiment seeks to determine if the M-NN model, and by extension other UNet models, perform on par with the CALFIN-NN model, given the same training data. This task involves retraining the M-NN on CALFIN training data and comparing its performance against CALFIN-NN using a shared validation set. For the fairest results, only images without L7SCEs are evaluated in this validation set – CALFIN-VS-L7-none – which is within the known capabilities of the M-NN. Furthermore, the same pre- and post-processing is applied to both models.

Table 2. Model intercomparison error table. Metrics for the CALFIN-NN and M-NN models on all non-Landsat 7 test images in the CALFIN validation set.

Across all non-Landsat 7 test images in the CALFIN validation set, CALFIN-NN attains a 2.27 px (81.65 m) mean distance between the predicted and the manually delineated fronts. This is more accurate than the model from Mohajerani et al. (2019), which attains a mean distance of 4.45 px (201.35 m) after retraining on CALFIN training data. Note again that Landsat 7 images were excluded during reevaluation for the M-NN. This supports the finding that the CALFIN-NN architecture is an improvement over existing UNet models.

With this added context, the validation table is reproduced from Sect. 4.3, Fig. 10, and the error analysis is continued below. To reemphasize the differences in mean distance error calculation between the studies, Mohajerani et al. (2019) begin by breaking each predicted front into 1000 smaller segments within a small buffer from the fjord walls and then calculate the mean deviation between the segments of the predicted and manually delineated fronts. The CALFIN method instead averages the distance between each pixel of the predicted front and the closest pixel of the manually delineated front, as detailed in Sect. 4.1. While the line-segment methodology of Mohajerani et al. (2019) provides a stricter estimate by enforcing close agreement between corresponding front segments, the CALFIN method allows for non-aligned evaluation of the mean distance error. Although both implementations quantify the differences between the lines, the differences in implementation should still be considered when evaluating the comparison below.
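The closest-pixel formulation described above can be sketched as follows. This is an illustrative reading of the text, not the CALFIN implementation: the function names are hypothetical, the symmetric averaging of both directions is an assumption, and the 30 m pixel size is only a placeholder for the per-image scale.

```python
import math


def mean_closest_distance(source, target):
    """Mean distance from each pixel in `source` to its nearest pixel in `target`."""
    total = 0.0
    for sx, sy in source:
        total += min(math.hypot(sx - tx, sy - ty) for tx, ty in target)
    return total / len(source)


def symmetric_mean_distance(predicted, manual):
    """Average of the two directed mean distances (assumed symmetric variant)."""
    forward = mean_closest_distance(predicted, manual)
    backward = mean_closest_distance(manual, predicted)
    return 0.5 * (forward + backward)


# Two parallel toy fronts, 2 px apart, given as (column, row) pixel coordinates.
predicted = [(0, 0), (1, 0), (2, 0), (3, 0)]
manual = [(0, 2), (1, 2), (2, 2), (3, 2)]

error_px = symmetric_mean_distance(predicted, manual)
error_m = error_px * 30.0  # assumed pixel size in metres, for illustration only
```

Because each pixel is matched to its nearest neighbour on the other front rather than to a corresponding segment, a prediction that is shifted along the front can still score well, which is the non-aligned behaviour contrasted with the segment-based metric.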

Table 3. M-VS validation output results. Accuracy and error metrics for the CALFIN-NN and the M-NN models on the M-VS. Again, some metrics are not provided by Mohajerani et al. (2019), so they are omitted from this table. NA: not available.

Across all 10 test images in the M-VS, CALFIN-NN attains a 2.56 px (97.72 m) mean distance between the predicted and the manually delineated fronts. This approaches the level of accuracy achieved in the original study, which is 1.97 px (96.31 m). This supports the finding that the CALFIN-NN architecture generalizes well to new data. Note that CALFIN-NN's larger network size requires additional training data to avoid over-fitting (memorizing the training data), which could explain the slightly lower accuracy when compared to the M-NN. In summary, this comprehensive model intercomparison supports the hypothesis that the CALFIN-NN model improves on existing studies and generalizes well.

7 Conclusion

Overall, the goal of automatically delineating calving fronts from satellite imagery is accomplished. The CALFIN method uses cutting-edge deep learning architectures, allowing for robustness to minor cloud cover, Landsat 7 scan line corrector errors, and illumination changes. The method is validated through a comprehensive data intercomparison with existing studies, and the results deviate by on average 86.76 ± 1.43 m from the measured fronts. Regional trends show larger-than-average absolute retreat in SE Greenland, and new subseasonal trends are available for further investigation with the release of the 22 678 calving front lines generated across 66 Greenlandic glaciers. Future work may entail accuracy improvements, expansion of included domains, usage of SAR data sources, and near-real-time data products. Within the community, the benefits of standardized training, validation sets, and outputs/metadata are anticipated. The community's development of new automated extraction studies, such as grounding line delineation, iceberg tracking, and sea ice mélange measurements, is also anticipated. A key takeaway is the maturation of neural networks for automated calving front detection. Specifically, a well-trained network now approaches human levels of accuracy in picking arbitrary glacial calving fronts. This reinforces existing studies on the viability of the methodology and paves the way for applications to other data processing tasks. Ultimately, this work showcases the state of the art in automated calving front detection and provides a new database of glacial termini positions for the cryosphere community.

Code and data availability

The code used to automate the implementation of the CALFIN pipeline is freely available at (last access: 29 March 2021) and (Cheng, 2020). It is written in Python 3, using the Keras and TensorFlow libraries. The data generated by CALFIN are currently available at (Cheng et al., 2020).


The supplement related to this article is available online at:

Author contributions

DC developed the code/model, created the training data, carried out the data processing/error analysis, and wrote the majority of the manuscript. WH provided input on the processing methodology, post-processing algorithms, error analysis, discussion topics, and writing the manuscript. EL provided key direction for the overall study, error analysis, outputs, and writing the manuscript. YM performed the model intercomparison and assisted with the writing of the manuscript. MW performed the data preprocessing for the model intercomparison. IV assisted in organizing collaborators and the model intercomparison. ER contributed suggestions regarding the error analysis and intercomparison. WH, EL, MW, and YM revised the manuscript and results.

Competing interests

The authors declare that they have no conflict of interest.


Acknowledgements

This work was conducted as a collaboration between NASA's Jet Propulsion Laboratory and the University of California, Irvine. The CALFIN neural network architecture implementation is derived from Emil Zakirov's DeepLabV3+ Xception code base at (last access: 13 August 2020). We acknowledge the USGS for providing Landsat 1–8 images; the ESA for their Sentinel-1 images; and the ESA-CCI, PROMICE, and MEaSUREs programs for providing calving front data used in this study. Additionally, we thank the editors and reviewers for their contributions to the improvement of this paper.

Review statement

This paper was edited by Stef Lhermitte and reviewed by Celia A. Baumhoer and one anonymous referee.


References

Andersen, J. K., Fausto, R. S., Hansen, K., Box, J. E., Andersen, S. B., Ahlstrøm, A. P., Dirk, Citterio, M., Colgan, W., Karlsson, N. B., Kjeldsen, K. K., Korsgaard, N. J., Larsen, S. H., Mankoff, K. D., Pedersen, A. Ø., Shields, C. L., Solgaard, A. and Vandecrux, B.: Update of annual calving front lines for 47 marine terminating outlet glaciers in Greenland (1999–2018), GEUS Bulletin, 43, 2019.

Andersen, M., Stenseng, L., Skourup, H., Colgan, W., Khan, S., Kristensen, S., Andersen, S., Box, J., Ahlstrøm, A., Fettweis, X., and Forsberg, R.: Basin-scale partitioning of Greenland ice sheet mass balance components (2007–2011), Earth Planet. Sc. Lett., 409, 89–95, 2015.

Baumhoer, C. A., Dietz, A. J., Kneisel, C., and Kuenzer, C.: Automated Extraction of Antarctic Glacier and Ice Shelf Fronts from Sentinel-1 Imagery Using Deep Learning, Remote Sensing, 11, 2529, 2019.

Bjørk, A. A., Kruse, L. M., and Michaelsen, P. B.: Brief communication: Getting Greenland's glaciers right – a new data set of all official Greenlandic glacier names, The Cryosphere, 9, 2215–2218, 2015.

Bunce, C., Carr, J. R., Nienow, P. W., Ross, N., and Killick, R.: Ice front change of marine-terminating outlet glaciers in northwest and southeast Greenland during the 21st century, J. Glaciol., 64, 523–535, 2018.

Carr, J. R., Stokes, C. R., and Vieli, A.: Threefold increase in marine-terminating outlet glacier retreat rates across the Atlantic Arctic: 1992–2010, Ann. Glaciol., 58, 72–91, 2017.

Catania, G. A., Stearns, L. A., Sutherland, D. A., Fried, M. J., Bartholomaus, T. C., Morlighem, M., Shroyer, E., and Nash, J.: Geometric Controls on Tidewater Glacier Retreat in Central Western Greenland, J. Geophys. Res.-Earth, 123, 2024–2038, 2018.

Catania, G. A., Stearns, L. A., Moon, T. A., Enderlin, E. M., and Jackson, R. H.: Future Evolution of Greenland's Marine-Terminating Outlet Glaciers, J. Geophys. Res.-Earth, 125, e2018JF004873, 2020.

Chen, L., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H.: Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, European Conference on Computer Vision, arXiv [preprint], 801–818, arXiv:1802.02611, 2018.

Cheng, D.: daniel-cheng/CALFIN: CALFIN v1.0.0, Zenodo, 2020.

Cheng, D., Hayes, W., and Larour, E.: CALFIN: Calving Front Dataset for East/West Greenland, 1972–2019, Dryad, 2020.

Chollet, F.: Xception: Deep Learning with Depthwise Separable Convolutions, Computer Vision and Pattern Recognition, 1800–1807, 2017.

ENVEO: Greenland Calving Front Dataset, 1990–2016, v3.0, available at: (last access: 14 August 2020), 2017.

Fürst, J. J., Goelzer, H., and Huybrechts, P.: Ice-dynamic projections of the Greenland ice sheet in response to atmospheric and oceanic warming, The Cryosphere, 9, 1039–1062, 2015.

Joughin, I., Moon, T., Joughin, J., and Black, T.: MEaSUREs Annual Greenland Outlet Glacier Terminus Positions from SAR Mosaics, Version 1, NSIDC, 2015.

King, M. D., Howat, I. M., Jeong, S., Noh, M. J., Wouters, B., Noël, B., and van den Broeke, M. R.: Seasonal to decadal variability in ice discharge from the Greenland Ice Sheet, The Cryosphere, 12, 3813–3825, 2018.

King, M. D., Howat, I. M., Candela, S. G., Noh, M. J., Jeong, S., Noël, B. P. Y., van den Broeke, M. R., Wouters, B., and Negrete, A.: Dynamic ice loss from the Greenland Ice Sheet driven by sustained glacier retreat, Nature News, 1, 2020.

Kruskal, J. B.: On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem, P. Am. Math. Soc., 7, 48–50, 1956.

Mannor, S., Peleg, D., and Rubinstein, R.: The Cross Entropy Method for Classification, in: Proceedings of the 22nd International Conference on Machine Learning, ICML '05, Association for Computing Machinery, New York, NY, USA, 561–568, 2005.

Mohajerani, Y., Wood, M., Velicogna, I., and Rignot, E.: Detection of Glacier Calving Margins with Convolutional Neural Networks: A Case Study, Remote Sensing, 11, 74, 2019.

Moon, T. and Joughin, I.: Changes in ice front position on Greenland's outlet glaciers from 1992 to 2007, J. Geophys. Res.-Earth, 113, F02022, 2008.

Morlighem, M., Williams, C. N., Rignot, E., An, L., Arndt, J. E., Bamber, J. L., Catania, G., Chauché, N., Dowdeswell, J. A., Dorschel, B., Fenty, I., Hogan, K., Howat, I., Hubbard, A., Jakobsson, M., Jordan, T. M., Kjeldsen, K. K., Millan, R., Mayer, L., Mouginot, J., Noël, B. P. Y., O'Cofaigh, C., Palmer, S., Rysgaard, S., Seroussi, H., Siegert, M. J., Slabon, P., Straneo, F., van den Broeke, M. R., Weinrebe, W., Wood, M., and Zinglersen, K. B.: BedMachine v3: Complete Bed Topography and Ocean Bathymetry Mapping of Greenland From Multibeam Echo Sounding Combined With Mass Conservation, Geophys. Res. Lett., 44, 11051–11061, 2017.

Nagler, T., Rott, H., Hetzenecker, M., Wuite, J., and Potin, P.: The Sentinel-1 Mission: New Opportunities for Ice Sheet Observations, Remote Sensing, 7, 9371–9389, 2015.

Nick, F. M., Vieli, A., Andersen, M. L., Joughin, I., Payne, A., Edwards, T. L., Pattyn, F., and van de Wal, R. S. W.: Future sea-level rise from Greenland's main outlet glaciers in a warming climate, Nature, 497, 235–238, 2013.

Paravolidakis, V., Moirogiorgou, K., Ragia, L., Zervakis, M., and Synolakis, C.: COASTLINE EXTRACTION FROM AERIAL IMAGES BASED ON EDGE DETECTION, ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., III-8, 153–158, 2016.

Ronneberger, O., Fischer, P., and Brox, T.: U-Net: Convolutional Networks for Biomedical Image Segmentation, CoRR, arXiv [preprint], arXiv:1505.04597, 2015.

Seale, A., Christoffersen, P., Mugford, R. I., and O'Leary, M.: Ocean forcing of the Greenland Ice Sheet: Calving fronts and patterns of retreat identified by automatic satellite monitoring of eastern outlet glaciers, J. Geophys. Res.-Earth, 116, F03013, 2011.

van den Broeke, M. R., Enderlin, E. M., Howat, I. M., Kuipers Munneke, P., Noël, B. P. Y., van de Berg, W. J., van Meijgaard, E., and Wouters, B.: On the recent contribution of the Greenland ice sheet to sea level change, The Cryosphere, 10, 1933–1946, 2016.

Wood, M., Rignot, E., Fenty, I., An, L., Bjørk, A., van den Broeke, M., Cai, C., Kane, E., Menemenlis, D., Millan, R., Morlighem, M., Mouginot, J., Noël, B., Scheuchl, B., Velicogna, I., Willis, J. K., and Zhang, H.: Ocean forcing drives glacier retreat in Greenland, Sci. Adv., 7, eaba7282, 2021.

Zhang, E., Liu, L., and Huang, L.: Automatically delineating the calving front of Jakobshavn Isbræ from multitemporal TerraSAR-X images: a deep learning approach, The Cryosphere, 13, 1729–1741, 2019.

Short summary
Tracking changes in Greenland's glaciers is important for understanding Earth's climate, but it is time consuming to do so by hand. We train a program, called CALFIN, to automatically track these changes with human levels of accuracy. CALFIN is a special type of program called a neural network. This method can be applied to other glaciers and eventually other tracking tasks. This will enhance our understanding of the Greenland Ice Sheet and permit better models of Earth's climate.