Accuracy and inter-analyst agreement of visually estimated sea ice concentrations in Canadian Ice Service ice charts using single-polarization RADARSAT-2

Cheng, Angela; Casati, Barbara; Tivy, Adrienne; Zagon, Tom; Lemieux, Jean-François; Tremblay, L. Bruno

doi:https://doi.org/10.5194/tc-14-1289-2020

Articles | Volume 14, issue 4

Research article

21 Apr 2020

Abstract

This study compares the accuracy of visually estimated ice concentrations by eight analysts at the Canadian Ice Service with three standards: (i) ice concentrations calculated from automated image segmentation, (ii) ice concentrations calculated from automated image segmentation that were validated by the analysts, and (iii) the modal ice concentration estimate by the group. A total of 76 predefined areas in 67 RADARSAT-2 images are used in this study. Analysts overestimate ice concentrations when compared to all three standards, most notably for low ice concentrations (1/10–3/10). The spread of ice concentration estimates is highest for middle concentrations (5/10, 6/10) and smallest for the 9/10 ice concentration. The overestimation in low concentrations and high variability in middle concentrations introduce uncertainty into the ice concentration distribution in ice charts. The uncertainty may have downstream implications for numerical modelling and sea ice climatology. Inter-analyst agreement is also measured to determine which classifier's ice concentration estimates (analyst or automated image segmentation) disagree the most with the others. One of the eight analysts disagreed the most, followed by the automated segmentation algorithm. This suggests high agreement in ice concentration estimates between analysts at the Canadian Ice Service. The high agreement, combined with consistent overestimation, results in an overall accuracy of ice concentration estimates in polygons of 39 %, 95 % CI [34 %, 43 %], for an exact match between the estimated ice concentration and the ice concentration calculated from segmentation, and 84 %, 95 % CI [80 %, 87 %], for a match within ±1 ice concentration category. Only images with high contrast between ice and open water and well-defined floes are used; true accuracy is therefore expected to be lower than what is found in this study.

Received: 12 Aug 2019 – Discussion started: 30 Aug 2019 – Revised: 16 Jan 2020 – Accepted: 21 Jan 2020 – Published: 21 Apr 2020

1 Introduction

Sea ice charts are routinely made by national ice services to provide accurate and timely information about sea ice conditions. These charts are produced to support navigation in polar regions, to provide information to local communities and to monitor the long-term evolution of sea ice conditions (i.e., climatology). Ice analysts and forecasters generate these charts by using various data sources, including remotely sensed imagery, to quantify various sea ice characteristics, including sea ice concentration. Analysts and forecasters at the Canadian Ice Service (CIS) predominantly rely on RADARSAT-2 imagery for monitoring sea ice conditions. Analysts at the CIS identify areas with similar ice conditions and open water for navigational purposes and then manually delineate them with polygons. The analyst then assigns an estimated concentration value to the polygon based on this visual segmentation. The ice concentration is expressed in categories, with the concentration rounded to the nearest tenth (e.g. 1/10, 2/10).

A number of studies have been done to develop, assess, or improve upon algorithms for automated calculation of ice concentration from remotely sensed images. Algorithms have been built for automated ice concentration retrieval using different sensors. The NASA team and bootstrap algorithms use the Special Sensor Microwave/Imager (commonly referred to as SSM/I), and other algorithms exist for Advanced Very High Resolution Radiometer (AVHRR) and RADARSAT-2 SAR images (e.g. Belchansky and Douglas, 2002; Williams et al., 2002; Meier, 2005; Hebert et al., 2015; Scott et al., 2012).

The process of automatically calculating ice concentration from remotely sensed images requires classifying each pixel of the image into a category. In the simplest case, the categories are ice and open water. In more complex cases, the ice is categorized by type, such as multi-year ice or landfast ice, or by thickness (e.g. thick vs. thin). Classified pixels are then grouped by category to produce segmentation results. The sea ice concentration for a given area can then be derived by dividing the number of sea ice pixels by the total number of pixels in that area.
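The pixel-counting step described above can be sketched as follows. This is a minimal illustration rather than the segmentation software used in the study: the label values `ICE` and `WATER`, the function name `ice_concentration`, and the toy grid are assumptions made for the example, and only the two-class (ice vs. open water) case is covered.

```python
from typing import Sequence

ICE = 1    # hypothetical label for a pixel classified as ice
WATER = 0  # hypothetical label for a pixel classified as open water


def ice_concentration(labels: Sequence[Sequence[int]]) -> float:
    """Return the fraction of pixels labelled as ice in a classified area."""
    total = sum(len(row) for row in labels)
    if total == 0:
        raise ValueError("the area contains no pixels")
    ice = sum(1 for row in labels for value in row if value == ICE)
    return ice / total


# A toy 4 x 5 classified area: 8 ice pixels out of 20.
area = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
]
print(f"{ice_concentration(area):.2f}")  # 0.40
```

With a multi-class segmentation (for example, several ice types), the same ratio would apply once all ice classes are counted together in the numerator.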

Manually derived products from SAR are assumed to be the most accurate source of information on ice concentration by many users (Karvonen et al., 2005). Therefore, many studies rely on the manually derived ice products as a ground truth when developing automated ice retrieval techniques (Komarov and Buehner, 2017; Karvonen, 2017). However, due to a limited number of studies, the accuracy of manually derived ice concentrations is not well understood. Karvonen et al. (2015) conducted a study in an attempt to quantify this; they compared the sea ice concentrations assigned by five separate working groups (containing up to five ice analysts each) for 48 polygons predefined by an ice analyst using two high-resolution ScanSAR images in the Baltic Sea. They found deviation in ice concentration estimates between ice analysts, some of which were significant, especially for polygons in mid-range ice concentrations. However, their study was geographically and temporally limited to only two SAR images in the Baltic Sea. Other than this study, there has been little measurement of the spread and variability in sea ice concentration estimates by human analysts or the subsequent uncertainty of ice charts produced by operational ice analysts.

The uncertainty of sea ice concentration estimates can result in downstream uncertainties for applications that rely on sea ice charts. For example, sea ice concentration estimates from Canadian Ice Service charts are used to initialize sea ice models (Smith et al., 2016; Lemieux et al., 2016). Errors in the initial sea ice concentration can propagate and grow with time and reduce the accuracy of predictions from numerical models (Parkinson et al., 2001). Uncertainty of ice concentration estimates could also impact the accuracy of climatology studies of ice concentration derived from operational ice charts, although that has not been investigated.

The main objective of this study was to determine the probability that a given ice concentration in a Canadian Ice Service ice chart polygon reflects the ice concentration found in the corresponding SAR image used by analysts to create the ice chart. To achieve this, we assessed the following:

i. the accuracy of analysts and forecasters in visually estimating ice concentration when compared to calculated ice concentration from image segmentation under best-case scenarios (i.e. image segmentation adequately resembles the visual segmentation done by an analyst or forecaster);
ii. the consistency of analysts and forecasters with one another in visually estimating ice concentration from SAR imagery.

The paper is structured as follows. Section 2 describes the data and standards for ice chart creation. Section 3 describes the methodology for generating the sample polygons used in this study, calculating total ice concentration from image segmentation and capturing analyst estimates of ice concentration. Section 4 describes the producer's and user's accuracy as well as the two skill scores used in this study. Section 5 provides the results of the comparison between visually estimated ice concentrations and calculated ice concentrations using the skill scores. Section 6 compares visually estimated ice concentrations with the modal ice concentration value. Section 7 describes the accuracy of visually estimated ice concentrations in polygons. The paper concludes with a discussion in the final section.

2 Ice charting

Section 2 describes elements of ice charting. Section 2.1 describes the remote-sensing data that are the primary input data source for generating ice charts. Section 2.2 briefly describes the type of ice charts created at the Canadian Ice Service. Section 2.3 gives an overview of the egg code, which is the international standard used for ice charting and is the method for reporting ice concentrations in an ice chart.

2.1 Remote sensing for monitoring ice concentration

Sea ice is routinely monitored by ice services using satellites due to their ability to acquire images covering large spatial areas. Passive microwave and synthetic-aperture radar (SAR) sensors are preferred over optical imagery because of their ability to see through clouds. Optical satellites also rely on solar illumination, which is absent in Arctic regions during polar night. Passive microwave observations often have coarse resolution (e.g. 50 km), whereas SAR data can be acquired at high resolution (e.g. 50 m). Coarse spatial resolution makes it difficult to resolve sea ice under certain conditions or in certain geographic areas. SAR provides consistent, high-resolution coverage of the Arctic without cloud interference or limitations due to a lack of solar illumination.

The CIS relied on RADARSAT-1, a SAR sensor, for ice charting beginning in 1996 until its decommissioning in 2013. The Canadian Ice Service currently relies predominantly on RADARSAT-2 but will start to use the RADARSAT Constellation Mission (RCM) operationally, following its recent launch in 2019. In the 2017 calendar year, the Canadian Ice Service received approximately 45 000 SAR scenes between Sentinel-1 and RADARSAT-2 and another 85 000 scenes from various satellites, including GOES, MODIS, AMSR, and VIIRS. The lower number of SAR scenes reflects the fact that RADARSAT scenes are geographically targeted acquisitions ordered by the CIS, while GOES, MODIS, AMSR, and VIIRS are publicly available swaths acquired for general use. The latter are less targeted for CIS Operations but useful as a secondary, supplemental data source.

Ice services have had difficulty implementing systems to automate sea ice interpretation from satellite imagery. Automated calculation of sea ice concentration requires first classifying the pixels in an image into categories of ice (e.g. first-year ice, multi-year ice) or open water and then calculating the proportion of ice within a given area. Automated sea ice algorithms in SAR rely on interpretation of sea ice backscatter, which can be ambiguous (Zakhvatkina et al., 2019). For example, open water under low wind conditions yields similar backscatter to first-year ice, making the two difficult to distinguish automatically. During the melt season, a layer of meltwater forms on top of the sea ice and can yield backscatter values similar to those of open water, so algorithms may classify the ice as open water. Many attempts have been made to automatically classify sea ice in SAR scenes using a variety of methods, but these methods have had difficulty in conditions such as those described above (Zakhvatkina et al., 2019). In many of these cases, expert analysts are able to detect sea ice where the algorithms cannot.

Polarizations have provided additional data that have been useful for implementing automated sea ice classification. Polarization refers to the orientation of the electromagnetic waves sent and received by the sensor. The main polarizations used for sea ice monitoring with RADARSAT-2 are (1) horizontal transmit and horizontal receive (HH) and (2) horizontal transmit and vertical receive (HV). The HV band has been shown to be less sensitive to the incidence angle of the satellite. The combination of both HH and HV channels has been shown to better distinguish between sea ice concentrations than either channel alone (Karvonen, 2014). On the other hand, only the HH polarization was available for sea ice monitoring with RADARSAT-1. RCM will provide compact polarimetry modes, which will provide additional information. Automation of sea ice classification algorithms currently uses dual-polarization imagery but will use compact polarimetry as it becomes available.

Despite technological advances in SAR satellites and a long history of development of automated techniques, ice services continue to rely on manually drawn ice charts to identify sea ice conditions because automation of sea ice classification has significant limitations (e.g. difficulty in separating multi-year ice from first-year ice during summer melt). Furthermore, analysts are able to provide additional information that automated classification cannot (e.g. analysts can provide total concentration, partial concentration, and stage of development in a single chart). However, ice charting is time-consuming, and the number of images acquired by satellites is expected to increase. Therefore, development of automated sea ice classification algorithms for ice charting continues to be a pressing need for operational ice services.

2.2 Chart types

A number of different types of charts are generated at the Canadian Ice Service (e.g. regional, daily, image analyses, concentration, stage of development, etc.), which vary due to the chart's purpose, relevant time, or underlying data sources (Canadian Ice Service, 2019) (refer to Dedrick et al., 2001, on National Ice Center charts for the US). Image analysis charts are created by visually interpreting specific satellite images. These charts are constrained to the geographic extent and resolution of the corresponding satellite images. Daily ice charts combine different sources of information, introducing variability between the ice chart and the satellite image.

2.3 Egg code

The egg code is a World Meteorological Organization international standard for coding ice information (WMO, 2014; see Fig. 1). Each polygon in an ice chart is assigned an egg code with corresponding values. The egg code contains information on the ice concentration (C), stage of development (S), and the predominant form (F) of ice (floe size), within an oval shape. The top value in the egg code is the total ice concentration, which includes all stages of development of ice. Total ice concentration is expressed in categories, with the ice concentration rounded to the nearest tenth. A concentration of less than 1/10 denotes open water; open water is therefore not the absolute absence of ice but an ice concentration below 1/10. Partial concentration is used when more than one ice type is present within the delineated polygon. No partial concentration is reported when only one ice type is found. In our study, we only considered total concentration rather than partial concentrations.

https://www.the-cryosphere.net/14/1289/2020/tc-14-1289-2020-f01

Figure 1. The World Meteorological Organization standard egg code used for ice charting at the Canadian Ice Service (WMO, 2014). The total concentration value, $C_{t}$, is the code found on the first line of the egg code. Secondary concentration values ($C_{a}$, $C_{b}$, $C_{c}$) can be found on the second line when more than one ice type is present.
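As an illustration of how a continuous ice fraction maps onto the total concentration categories, the following sketch applies the two rules stated in this section. The function name `total_concentration_tenths` is hypothetical, and the handling of edge cases (treating every fraction below 1/10 as open water, and rounding halves up) is an assumption made for the example, not a rule taken from the WMO standard.

```python
import math


def total_concentration_tenths(fraction: float) -> int:
    """Map an ice fraction in [0, 1] to a total concentration category in tenths.

    Assumptions for illustration: fractions below 1/10 are reported as open
    water (category 0); all other fractions are rounded half up to the
    nearest tenth.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if fraction < 0.1:
        return 0
    return min(10, math.floor(fraction * 10 + 0.5))


for value in (0.07, 0.42, 0.88):
    print(value, "->", f"{total_concentration_tenths(value)}/10")
```

The explicit `math.floor(x + 0.5)` is used instead of the built-in `round` because Python's `round` rounds halves to the nearest even integer, which may not be the intended behaviour for category boundaries.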

3 Ice concentration estimates

In this paper we consider three standards with which analysts' visually estimated ice concentration is compared due to the absence of absolute ground truth. The standards used are (i) ice concentrations derived from automated segmentation, (ii) ice concentrations derived from automated segmentation that have been validated by analysts, and (iii) the mode of ice concentration estimates given by analysts. This section describes the methodology for creating sample polygons used in this study, calculating ice concentrations from automated segmentation and capturing visual estimation of ice concentration from participating ice analysts.
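The third standard, the mode of the analysts' estimates, can be sketched as below, together with a simple measure of how many estimates match a reference exactly or within one category. The function names, the example estimates, and the handling of ties (returning every tied category) are assumptions made for illustration; the text does not state how ties in the mode were resolved.

```python
from collections import Counter
from typing import List, Sequence


def modal_concentration(estimates: Sequence[int]) -> List[int]:
    """Return the most frequent category (all tied categories) in tenths."""
    if not estimates:
        raise ValueError("at least one estimate is required")
    counts = Counter(estimates)
    highest = max(counts.values())
    return sorted(cat for cat, n in counts.items() if n == highest)


def agreement(estimates: Sequence[int], reference: int, tolerance: int = 0) -> float:
    """Fraction of estimates within `tolerance` categories of the reference."""
    hits = sum(1 for e in estimates if abs(e - reference) <= tolerance)
    return hits / len(estimates)


# Hypothetical estimates (in tenths) from eight analysts for one polygon.
estimates = [4, 5, 5, 5, 6, 5, 4, 7]
mode = modal_concentration(estimates)[0]
print(mode)                                     # 5
print(agreement(estimates, mode))               # 0.5
print(agreement(estimates, mode, tolerance=1))  # 0.875
```

The same `agreement` function could be applied with a segmentation-derived concentration as the reference instead of the mode.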

3.1 Case-study selection and polygon definition

RADARSAT-2 images were randomly selected from the Canadian Ice Service image archive. Each image was manually reviewed to find areas of clear contrast between water and ice to optimize segmentation capability and reduce potential ambiguity in visual analysis.

A former operational analyst delineated potential polygons for the sample of selected cases used in this study. Polygons were drawn in areas with high contrast between ice and open water (to optimize the algorithm's ability to differentiate between ice and water) and areas of fractional ice cover (since there is little value in evaluating analysts' ability to estimate 0/10 or 10/10 ice concentration). Unlike a traditional ice chart where analysts segment entire images into polygons, we only drew polygons in areas of interest.

The images used for this study were selected to contain areas with well-defined floes with high contrast against the black (water) background. SAR image quality varies from image to image and even within the image. Likewise, the structure of sea ice in Canadian waters can vary greatly, with brash and rubble ice along the eastern coast and well-defined floes in the Beaufort Sea. Ice without well-defined spatial structure may not be captured due to the resolution of the sensor. For example, first-year ice can appear similar to open water, making it difficult to determine its edges. Brash ice is composed of small pieces of ice (less than 2 m in diameter) that cannot be resolved at the resolution of the SAR sensor. Furthermore, segmentation of sea ice in visually ambiguous conditions (e.g. first-year ice during the melt season, brash ice) by automated algorithms is still sub-optimal. As a result, we did not present analysts with ice conditions that would have been difficult to automatically segment. The sea ice types used in the samples of this study are not representative of all sea ice conditions typically found in Canadian Ice Service ice charts. This study quantifies the accuracy of sea ice concentration estimates under the best-case scenario of well-defined floes in very clear SAR images. It is expected that accuracy would decrease under brash ice conditions and/or poor image quality.

We assessed if there were differences in the size of polygons drawn for this study and the sizes of polygons in published charts, since polygon sizes could impact analyst ability to estimate ice concentration. The polygon sizes were compared to polygon sizes from two types of published operational charts: daily charts and image analyses. The image analyses and daily charts used the same RADARSAT images that were used to delineate polygons used in this study. Since the polygons were delineated differently, sometimes the sample polygon would spatially intersect with two or more polygons, making it difficult to directly compare the sizes of polygons. We addressed this by identifying the polygon with the greatest spatial intersection with the sample polygon and comparing the two areas. Figure 2 shows the difference between polygon sizes. Polygon sizes were not normally distributed. Under a Wilcoxon–Mann–Whitney rank test, polygons from image analyses and daily charts are not significantly different in size (p=0.226). On the other hand, polygons generated for this study are significantly smaller than polygons from image analysis charts (p=0.002), although there is overlap in the size range. Polygons generated for this study are also significantly smaller than polygons from daily charts (p=0.071), with less overlap in the size range than for the image analyses.

https://www.the-cryosphere.net/14/1289/2020/tc-14-1289-2020-f02

Figure 2. A comparison of the polygon sizes for the sample of polygons generated for this study, corresponding polygons in daily charts, and corresponding polygons in image analyses. Areas are given (in m²) on a log scale. The orange horizontal line indicates the median area. The upper and lower whiskers show the limits of 1.5 times the interquartile range. The circles indicate outlier polygon areas outside of the interquartile range.
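Because the polygon sizes are not normally distributed, a rank-based test is used for the size comparison. The sketch below shows one common form of the Wilcoxon–Mann–Whitney test: a two-sided test with average ranks for ties and a normal approximation for the p value. Which variant was used for the reported p values is not stated, so the normal approximation, the function name `mann_whitney_u`, and the example areas are assumptions for illustration only.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided Wilcoxon-Mann-Whitney test using the normal approximation.

    Returns (U statistic for x, approximate two-sided p value). Tied values
    receive average ranks and the variance is tie-corrected. No continuity
    correction is applied.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Sorted positions i..j-1 hold equal values with ranks i+1..j.
        avg_rank = (i + 1 + j) / 2.0
        size = j - i
        tie_term += size ** 3 - size
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0.0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return u, p


# Hypothetical polygon areas in square metres (not data from the study).
study_areas = [2.1e8, 3.5e8, 1.2e8, 4.0e8, 2.8e8]
chart_areas = [9.0e8, 1.5e9, 6.2e8, 3.9e8, 2.2e9]
u_stat, p_value = mann_whitney_u(study_areas, chart_areas)
print(f"U = {u_stat:.1f}, p = {p_value:.3f}")
```

For small samples an exact p value is generally preferable to the normal approximation; `scipy.stats.mannwhitneyu` offers both and would be the usual choice in practice.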
