Detection of ice core particles via deep neural networks

Maffezzoli, Niccolò; Cook, Eliza; van der Bilt, Willem G. M.; Støren, Eivind N.; Festi, Daniela; Muthreich, Florian; Seddon, Alistair W. R.; Burgay, François; Baccolo, Giovanni; Mygind, Amalie R. F.; Petersen, Troels; Spolaor, Andrea; Vascon, Sebastiano; Pelillo, Marcello; Ferretti, Patrizia; dos Reis, Rafael S.; Simões, Jefferson C.; Ronen, Yuval; Delmonte, Barbara; Viccaro, Marco; Steffensen, Jørgen Peder; Dahl-Jensen, Dorthe; Nisancioglu, Kerim H.; Barbante, Carlo

doi:https://doi.org/10.5194/tc-17-539-2023

Articles | Volume 17, issue 2

Research article

07 Feb 2023

Abstract

Insoluble particles in ice cores record signatures of past climate parameters like vegetation dynamics, volcanic activity, and aridity. For some of them, the analytical detection relies on intensive bench microscopy investigation and requires dedicated sample preparation steps. Both are laborious, require in-depth knowledge, and often restrict sampling strategies. To help overcome these limitations, we present a framework based on flow imaging microscopy coupled to a deep neural network for autonomous image classification of ice core particles. We train the network to classify seven commonly found classes, namely mineral dust, felsic and mafic (basaltic) volcanic ash grains (tephra), three species of pollen (Corylus avellana, Quercus robur, Quercus suber), and contamination particles that may be introduced onto the ice core surface during core handling operations. The trained network achieves 96.8 % classification accuracy at test time. We present the system's potential and its limitations with respect to the detection of mineral dust, pollen grains, and tephra shards, using both controlled materials and real ice core samples. The methodology requires little sample material, is non-destructive, fully reproducible, and does not require any sample preparation procedures. The presented framework can bolster research in the field by cutting down processing time, supporting human-operated microscopy, and further unlocking the paleoclimate potential of ice core records by providing the opportunity to identify an array of ice core particles. Suggestions for an improved system to be deployed within a continuous flow analysis workflow are also presented.

How to cite.

Maffezzoli, N., Cook, E., van der Bilt, W. G. M., Støren, E. N., Festi, D., Muthreich, F., Seddon, A. W. R., Burgay, F., Baccolo, G., Mygind, A. R. F., Petersen, T., Spolaor, A., Vascon, S., Pelillo, M., Ferretti, P., dos Reis, R. S., Simões, J. C., Ronen, Y., Delmonte, B., Viccaro, M., Steffensen, J. P., Dahl-Jensen, D., Nisancioglu, K. H., and Barbante, C.: Detection of ice core particles via deep neural networks, The Cryosphere, 17, 539–565, https://doi.org/10.5194/tc-17-539-2023, 2023.

Received: 15 Jul 2022 – Discussion started: 26 Aug 2022 – Revised: 02 Jan 2023 – Accepted: 18 Jan 2023 – Published: 07 Feb 2023

1 Introduction

Ice cores provide some of the most valuable continuous records of the Earth's past climate. While the oldest Antarctic and Greenland cores date back, respectively, to 800 000 and 125 000 years ago and register variability in climate parameters at hemispheric scales (North Greenland Ice Core Project members, 2004; EPICA community members, 2004), ice stored in glaciers and small ice caps located at lower latitudes typically contains fingerprints of local to regional climate changes on centennial to millennial timescales (Schwikowski, 2004). The analytical detection of impurities contained in the ice matrix allows the production of past climate records at various spatial and temporal scales. Alongside gas bubbles and soluble chemical compounds, the ice matrix stores insoluble particulate matter, hereafter referred to as “particles”. Particle types include mineral dust, the glass component of volcanic ash, pollen grains and other biological matter, and microfossils sourced from oceans or lakes, such as diatoms and foraminifera. Each particle type carries its own climate significance, and its concentration depends on factors such as the source strength and emission mechanisms, the relative distance between core site and source region, and parameters controlling atmospheric transport and deposition.

By far the most abundant particle type in ice cores is mineral dust, which is sourced from continental surfaces, transported through the atmosphere, and dry- or wet-deposited onto ice sheets and glaciers (Legrand and Mayewski, 1997). The detection of dust is fundamental for investigating the past extent of arid areas and the paleo-atmospheric circulation and for assessing the role of mineral dust aerosol in Quaternary climate changes (Petit et al., 1999; Lambert et al., 2008). Thanks to its preservation, dust records can be used to synchronize deep ice cores in the absence of other proxies, thus supporting ice core dating (e.g., Bohleber et al., 2018; Eichler et al., 2000; Dome Fuji Ice Core Project Members, 2017). Dust measurements are routinely carried out while melting ice cores in continuous flow analysis setups (CFA; Bigler et al., 2011) using optical systems such as the laser-based Klotz Abakus sensor. As dust particles are orders of magnitude more abundant than other insoluble particles, Abakus measurements are commonly associated with dust, despite the instrument actually being insensitive to the type of particle entering the detector. Additionally, Abakus values require an accurate calibration with an independent technique, typically the Coulter counter (CC), which is an electrical-based analyzer that operates in discrete mode and cannot be run on CFA setups (Petit et al., 1981). The mismatch and calibration between the Abakus and the CC impurity detection is an active research topic within the ice core community (Simonsen et al., 2018). The higher accuracy of the CC comes at the expense of its discrete mode of use; moreover, like the Abakus, it is insensitive to particle type.

Volcanic ash deposits in ice cores can contain volcanic minerals, rock fragments, and volcanic glass shards. Across the spectrum of volcanic material, in this work we target cryptotephras and glass shards from individual eruptions that can form stratigraphically distinct deposits in ice cores and in marine and terrestrial sediments that are invisible to the naked eye (e.g., Lowe and Hunt, 2001; Turney et al., 1997). The identification of volcanic glass (hereafter referred to as “tephra”) provides direct evidence of past volcanic activity (Abbott and Davies, 2012; Sigl et al., 2015) and is a crucial tool to date and synchronize paleorecords (ice, marine, and lake) and therefore to establish absolute and synchronized chronologies (Lowe, 2011). The analytical detection of tephra layers in ice cores is typically carried out using several methods, either individually or in combination. Potential volcanic layers can be identified at high resolution by electrical conductivity or sulfate concentration measurements during CFA analyses (e.g., Wolff et al., 1995). However, not all tephra layers, particularly cryptotephras, correspond to acidity or sulfate peaks, and vice versa, given the different emission, transport, and deposition of gaseous species and particulate volcanic material (Legrand and Mayewski, 1997; Davies et al., 2010). For example, in the glacial period, tephra-rich deposits consistently lack coeval chemostratigraphic peaks, partially due to signal neutralization by dust (Bourne et al., 2015). Manual discrete subsampling of selected intervals of interest is also carried out to maximize tephra layer identification. With this method, ice samples are individually processed and manually inspected using optical bench microscopy (e.g., Bourne et al., 2015; Cook et al., 2018). If the presence of cryptotephra is confirmed, then the glass shards are individually counted. This makes the identification of tephra extremely time-consuming and in some cases serendipitous. While attempts have been made to automate particle detection (e.g., Van der Bilt et al., 2021, in sediment records), the methodology for investigating tephra in ice cores typically requires a huge time commitment by tephra experts.

Pollen analyses from snow and ice records provide information on past vegetation and atmospheric circulation changes (Bourgeois, 2000). Over relatively short timescales, pollen records with springtime maxima associated with vegetation blooms can also be used as a dating method (Nakazawa et al., 2004; Festi et al., 2021). As with tephra, pollen analyses require several laborious preprocessing steps in which discrete ice samples are cut, melted, and pre-concentrated (e.g., Festi et al., 2015, and references therein). Finally, the presence and the number of pollen grains are manually evaluated by palynologists via optical bench microscopy. In summary, the extraction and detection of climate-relevant ice core particles is extremely laborious.

Over the last 10 years, neural networks, and in particular convolutional neural networks (CNNs), have become the state-of-the-art methods in digital image classification tasks. Since the architecture proposed by Krizhevsky et al. (2012), the field has grown rapidly, with major breakthroughs and optimizations in a number of aspects, including increasing model depth (Simonyan and Zisserman, 2014), understanding the dynamics of internal layers (Zeiler and Fergus, 2014), and facilitating the gradient flow (He et al., 2016). In the ImageNet classification challenge (Deng et al., 2009), CNN-based architectures have surpassed human accuracy (He et al., 2015). Despite the advances of such techniques, their application to environmental studies has lagged behind and is limited to a few recent examples (Kerr et al., 2020; Viertel and König, 2022).

In this work we investigate the extent to which autonomous and simultaneous detection and classification of ice core particles can be achieved with deep neural networks. In our setup, to generate the ice particle imagery, we rely on a flow imaging microscopy instrument (the FlowCam; Fig. A1) able to produce images of particles captured within a liquid stream continuously pumped through the instrument. We develop a mixed convolutional and fully connected neural network to classify the imagery into six classes of particles, namely mineral dust, tephra (basaltic and felsic), and three pollen grains potentially present in alpine ice records, i.e., Corylus avellana, Quercus robur, and Quercus suber. An additional seventh class of contamination/blurry particles is included as a control channel for the model to be able to identify those particles that do not provide climate information.

2 Methods

2.1 The FlowCam: settings and image and feature extraction

The FlowCam instrument (Yokogawa/Fluid Imaging Technologies; VS-IV model) located at the Earth Surface Sediment Laboratory (EARTHLAB; University of Bergen, Norway) is used to capture images of particles in ultrapure water or ice meltwater samples. The FlowCam is a benchtop flow imaging cytometer equipped with a visible range optical camera. The liquid sample is injected into the system by manual pipetting, and it is drawn by a syringe pump to a quartz flow cell. Alternatively, connection tubing can allow sampling from discrete sample vials or from a continuous flow system (Appendix A). The flow cell used in our setup (depth = 80 µm; width = 570 µm) allows the flow of particles up to 80 µm in diameter in the maximum dimension. A 1.0 mL volume syringe pump is set to operate at a flow rate of 0.02 mL min⁻¹. While passing through the flow cell, the sample is imaged by a camera equipped with a 20× magnification objective. The camera flash duration is set to 65 µs and is operated at the maximum 22 frames per second. With the aforementioned settings, the imaged sample volume, i.e., the percentage of volume imaged by the camera, is 41.8 %. This parameter is determined by the combination of camera frame rate, pump speed, and flow cell geometry. The system optics determine a calibration factor of 0.2752 µm per pixel in the resulting monochrome 1280×960 pixels 8 bit TIFF images.
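The unit conversions implied by these settings can be sketched as follows. This is a minimal illustration, not part of the FlowCam software: all names are hypothetical, the numerical values are the ones quoted above, and the run time assumes that a full 1.0 mL syringe is dispensed at the constant flow rate.

```python
# Values quoted in the text above.
UM_PER_PIXEL = 0.2752            # calibration factor (µm per pixel)
FRAME_W_PX, FRAME_H_PX = 1280, 960
FLOW_RATE_ML_MIN = 0.02          # syringe pump flow rate (mL per minute)
SYRINGE_VOLUME_ML = 1.0          # assumption: the whole syringe is dispensed
IMAGED_FRACTION = 0.418          # reported imaged sample volume (41.8 %)


def px_to_um(pixels: float) -> float:
    """Convert a length in image pixels to micrometres."""
    return pixels * UM_PER_PIXEL


def um_to_px(microns: float) -> float:
    """Convert a length in micrometres to image pixels."""
    return microns / UM_PER_PIXEL


frame_w_um = px_to_um(FRAME_W_PX)                      # about 352 µm
frame_h_um = px_to_um(FRAME_H_PX)                      # about 264 µm
run_time_min = SYRINGE_VOLUME_ML / FLOW_RATE_ML_MIN    # minutes per syringe
imaged_volume_ml = SYRINGE_VOLUME_ML * IMAGED_FRACTION # volume actually imaged
bead_px = um_to_px(25.0)                               # a 25 µm bead, in pixels

print(f"frame: {frame_w_um:.1f} x {frame_h_um:.1f} µm")
print(f"run time: {run_time_min:.0f} min, imaged: {imaged_volume_ml:.3f} mL")
print(f"25 µm bead spans about {bead_px:.0f} px")
```

The sketch only converts between units; it does not attempt to derive the 41.8 % imaged fraction, which depends on instrument details not given here.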

Particle images are created by the native FlowCam software (VisualSpreadsheet v3.4). All camera image frames captured during analyses are compared to a calibration image acquired prior to the analysis (Fig. A1). In every frame, a pixel is considered to be signal (i.e., set to 1) if its intensity is higher or lower than its intensity in the calibration image by more than a threshold value. If the intensity difference does not exceed the threshold, then the pixel is considered to be background and set to 0. Once the signal–background binary image is created, the single particle images are extracted by segmenting out the pixels flagged as signal (Fig. A1). Each created image thus represents one particle. The threshold value, set to 18, and the camera focus are optimized by acquiring images of spherical polystyrene 25 µm beads and by minimizing the standard deviation of the resulting size distribution (Fig. S1 in the Supplement). For each acquired particle image, the FlowCam software calculates a number of numerical features, hereafter also referred to as “metadata”, mostly reflecting the geometrical properties of the particles. These numerical features are calculated by classic computer vision algorithms. In this work we use n = 34 metadata features (Appendix B).
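The thresholding and segmentation steps described above can be illustrated with a small dependency-free sketch. It is not the VisualSpreadsheet implementation, which is not documented here: the function names are hypothetical, the strict inequality against the threshold and the use of 8-connectivity to group signal pixels are assumptions, and particles are returned as bounding boxes only.

```python
from collections import deque

THRESHOLD = 18  # intensity difference quoted in the text


def binarize(frame, calibration, threshold=THRESHOLD):
    """Return a 0/1 mask: 1 where |frame - calibration| exceeds the threshold."""
    return [
        [1 if abs(f - c) > threshold else 0 for f, c in zip(frow, crow)]
        for frow, crow in zip(frame, calibration)
    ]


def extract_particles(mask):
    """Group signal pixels into connected components (8-connectivity assumed).

    Returns one bounding box (row0, col0, row1, col1) per particle.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited signal pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((r0, c0, r1, c1))
    return boxes


# Toy 5x6 frame on a uniform background of 200 grey levels.
calibration = [[200] * 6 for _ in range(5)]
frame = [row[:] for row in calibration]
frame[1][1] = frame[1][2] = frame[2][1] = 120  # particle darker than background
frame[3][4] = 240                               # particle brighter than background
frame[0][5] = 210                               # difference below threshold

mask = binarize(frame, calibration)
particles = extract_particles(mask)
print(particles)
```

In this toy frame, the three dark pixels form one particle and the single bright pixel forms another, while the pixel differing by only 10 grey levels stays in the background.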

2.2 Training dataset

The classification model is based on a supervised learning approach. The training dataset consists of images and related metadata for seven classes of particles, i.e., mineral dust, tephra (basaltic and felsic), three pollen species of Corylus avellana, Quercus robur, and Quercus suber, and an additional class that consists of contamination particles that are found on the external surface of ice cores (Table 1; Figs. C1–C7). Each item in the training dataset consists of a particle image and the corresponding array of 34 numerical metadata. The training dataset of each class (except for the contamination class) is created by preparing and measuring samples that contain only one type of particle, so that each acquisition yields a purely one-class batch. The samples are created by preparing solutions in ultrapure water, and multiple acquisitions are repeated until several thousand images are collected. Every image in the training dataset is visually inspected and validated by the human eye.
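One way to picture a single training item is as a small record holding the image, the 34 metadata values, and the class label. The sketch below is purely illustrative: the class name, field names, and label strings are hypothetical and are not taken from the authors' code.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical label strings for the seven classes described in the text.
CLASSES = (
    "dust",
    "felsic_tephra",
    "basaltic_tephra",
    "corylus_avellana",
    "quercus_robur",
    "quercus_suber",
    "contamination_blurry",
)
N_METADATA = 34  # numerical features per particle


@dataclass
class ParticleSample:
    image: List[List[int]]   # 8 bit grey levels, one inner list per pixel row
    metadata: List[float]    # the 34 numerical features for this particle
    label: str               # one of CLASSES

    def __post_init__(self):
        if len(self.metadata) != N_METADATA:
            raise ValueError(f"expected {N_METADATA} metadata values")
        if self.label not in CLASSES:
            raise ValueError(f"unknown class: {self.label}")

    @property
    def class_index(self) -> int:
        """Integer target corresponding to the label."""
        return CLASSES.index(self.label)


sample = ParticleSample(image=[[0, 255], [128, 64]],
                        metadata=[0.0] * N_METADATA,
                        label="quercus_robur")
print(sample.class_index)
```

Validating the metadata length and the label at construction time catches mislabeled or truncated records before they reach training.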

Table 1. Training dataset.

The training dataset of dust particles is created by measuring water solutions of FD066 (Linsinger et al., 2019; Table 1; Fig. C1), an aluminum oxide powder containing particles with a mean size of 2.5 µm and rarely exceeding 6 µm (Table 2). Such a dust training set is therefore suited to mimic dust found in inland Antarctic and Greenland ice cores, typically below 4 µm (e.g., Delmonte et al., 2002; Ruth et al., 2003).
Two tephra classes, felsic and basaltic, are included in the training dataset, primarily because of their detectable color differences that result from a different geochemistry. Felsic (silica-rich) tephras are typically lighter in color, while basaltic ash is darker. The felsic tephra training dataset consists of Campanian Ignimbrite volcanic ash from the 39.3 ± 0.1 ka Phlegraean Fields eruption (Fedele et al., 2003; Table 1; Fig. C2). The phonolitic–trachytic (∼ 60 wt % SiO₂) ash was sampled ∼ 1000 km from its source (Veres et al., 2013). Our basaltic tephra consists of volcanic ash from the Icelandic Grímsvötn 2011 eruption (Table 1; Fig. C3). Ash samples were collected on 22 May 2011 in the town of Kirkjubæjarklaustur, about 70 km southwest of the Grímsvötn caldera. After collection, samples were dried and stored in plastic beakers. Ashes of both types were dry-sieved at 63 µm to limit the maximum dimension and fit the max. 80 µm size constraint (min. ∼ 8 µm) of the flow cell. This range (8–80 µm) is consistent with the size that is typically considered during cryptotephra manual counting by bench microscopy (Gow and Meese, 2007; Narcisi et al., 2012; Abbott and Davies, 2012; Plunkett et al., 2020). It is important to note that, for both tephra classes, only those tephra images that could be clearly validated by an experienced tephra analyst were included in the training dataset. This resulted in discarding a very large fraction of blurry imagery. This decision was adopted to drive the model to yield clearer tephra predictions and reduce ambiguous predictions (i.e., for tephra, purity is prioritized over efficiency).
Three pollen species are included in the training dataset, i.e., C. avellana, Q. robur, and Q. suber (Table 1; Figs. C4, C5, and C6). C. avellana branches were collected near Innsbruck (Austria) in February 2019 from multiple trees within a radius of 500 m. The inflorescence was matured in the lab, and the samples were prepared by mixing together pollen from different trees. Both Quercus species were collected in Portugal and treated similarly. Occasionally, pollen grains that flow at the boundary of the camera field of view are only partially captured. We decided to keep these partial pollen images to increase the sensitivity of the model to correctly classify pollen, even when grains are only partially visible.
The seventh class (contamination/blurry) consists of two types of particles. The first includes contamination particles from the GRIP ice core external surface (Table 1; Fig. C7). The ice core surface typically contains particles from the core drilling, cutting, and handling operations such as paper wrap, glove clothing fibers, and graphite from the pencil used to mark the core sections. The second type of particle added to this class includes relatively large and poor-quality images, i.e., out of focus. The particles collected for this class are obtained from GRIP sample measurements followed by offline manual validation and labeling. While blurry images are an intrinsic limitation of this methodology, the contamination/blurry class serves the purpose of an important controlled channel for the model to be able to identify particles that do not carry climate significance.

2.3 Model

2.3.1 Hybrid deep neural network

The developed model is a hybrid network that supports mixed data inputs (Fig. 1). It is composed of two branches, a convolutional neural network (CNN) and a multilayer perceptron (MLP), fed, respectively, by particle images and the corresponding 34-dimensional (34-d) numerical feature vectors (metadata). The CNN is a ResNet-18 architecture (He et al., 2016). This network is composed of multiple convolution layers that progressively increase the number of filters while decreasing the feature map size. Batch normalization (BN) layers are placed right after each convolution layer and before ReLU (rectified linear unit) activations. The network ends with an average pooling layer and a final FC (fully connected) layer that compresses the image into a 64-d embedding. This vector is concatenated to the output of the MLP, which is formed by two FC-BN-ReLU-dropout stacks followed by a final FC layer that produces a 32-d representation. Following the concatenation of the two network branches, an FC-BN-ReLU stack is placed before the final FC layer, which precedes a sigmoid activation.

Figure 1. Model architecture. The top branch of the network is a ResNet-18 CNN (He et al., 2016). BN, ReLU layers, and skip connections are omitted for clarity. The bottom branch operates on the numerical features and consists of a three-layer multilayer perceptron. The separate outputs of the two branches are concatenated into a final classification branch. Indicated in parentheses are the input and output shapes of some layers along the network.
