Lead detection in Arctic sea ice from CryoSat-2: quality assessment, lead area fraction and width distribution

Leads cover only a small fraction of the Arctic sea ice but they have a dominant effect on the turbulent exchange between the ocean and the atmosphere. A supervised classification of CryoSat-2 measurements is performed by a comparison with visual MODIS scenes. For several parameters thresholds are optimized and tested in order to reproduce this prior classification. The maximum power of the waveform shows the best classification properties amongst them, including the Pulse Peakiness. The sea surface height is derived and its spread is clearly reduced for a classifier based on the maximum power compared to published ones. Lead area fraction estimates based on CryoSat-2 show a major fracturing event in the Beaufort Sea in 2013. The resulting Arctic-wide lead width distribution follows a power law with an exponent of 2.47 ± 0.04 for the winter seasons from 2011 to 2014, confirming and complementing a regional study based on a high-resolution SPOT image.


Introduction
Sea ice affects all interactions between the ocean and the atmosphere, namely heat, mass and momentum transports in ice-covered regions. It strongly reduces most of these types of transport, thereby basically leaving these processes to openings in the ice. These openings, called leads, appear even in regions which are typically covered by thick ice, like the central Arctic. Shear and divergence in the ice cover create new leads (Miles and Barry, 1998). Those areas can exhibit huge temperature differences between cold air and relatively warm water. The resulting heat loss causes fast formation of new ice. Even leads covered by thin ice show much higher heat fluxes than the surrounding thick ice (Maykut, 1978). The low albedo of leads promotes an energy flow in the opposite direction which increases the amount of absorbed insolation, resulting in a warming of the underlying water. Leads reduce the internal strength of the sea ice, enabling higher drifting velocities (Rampal et al., 2009) and are expected to influence the atmospheric boundary layer chemistry (e.g., Moore et al., 2014).
Large-scale satellite remote sensing studies of lead occurrences have been done based on visual and thermal imagers (e.g., Lindsay and Rothrock, 1995; Willmes and Heinemann, 2015). They are generally limited by the resolution of thermal infrared measurements of about 1 km and by the influence of clouds. By using passive microwave data, Röhrs et al. (2012) avoided the requirement of clear-sky conditions but reduced the resolution even further to 6.25 km. Despite this resolution a good agreement with CryoSat-2 (CS-2) and the Advanced Synthetic Aperture Radar (ASAR)-based estimates of the lead occurrence for leads wider than 3 km has been reported in Röhrs et al. (2012). CryoSat-2-based lead detection is expected to be a good complement to previous estimates as it combines an increased resolution of some hundred meters with a strong atmospheric independence. The quality of this approach has been assessed by Zygmuntowska et al. (2013) for airborne surveys and is the topic of this study for CS-2 measurements.
Apart from the lead area, the width distribution is also important for the turbulent heat transport in ice-covered regions. A convective boundary layer evolves over leads which increases in thickness towards the downwind side of the lead (Andreas et al., 1979). This boundary layer dampens the heat flux per lead area which is therefore higher for narrow leads than for wider ones. This has led to different lead-width-dependent heat transfer formulations (e.g., Andreas and Murphy, 1986). Marcq and Weiss (2012) show that the turbulent heat flux over leads is up to 55 % higher if using a power-law distribution down to a lead width of 10 m instead of considering all leads as one large area of open water.
The extent of Arctic sea ice has declined substantially over the last decades (Serreze et al., 2007), while comparable studies for the ice thickness are rare and struggle with uncertainties (Lindsay and Schweiger, 2015). Ice thickness estimates based on upward-looking sonars on submarines (e.g., Rothrock et al., 2008) or moorings (Proshutinsky et al., 2009) have a relatively sparse temporal and spatial coverage. Airborne and helicopter-based thickness measurements utilize the strong difference between the electromagnetic inductances of seawater and ice. They are of great value for regional studies and validation, but are restricted by the limited number of conducted surveys (Haas et al., 2010; Renner et al., 2013, 2014; Maaß et al., 2015).
Sea ice thickness is retrieved from satellites by radiometry, i.e., the influence of the ice thickness, salinity and temperature on the emissivity and transmittance. Various passive thermal to microwave sensors have been used (AVHRR, MODIS, SSM/I, AMSR-E, MIRAS) (Yu and Rothrock, 1996; Singh et al., 2011; Martin et al., 2005; Kaleschke et al., 2012; Tian-Kunze et al., 2014). As the ice thickness information saturates for all these sensors at a certain level, this approach is only capable of measuring relatively thin ice, typically well below 1 m (e.g., Kaleschke et al., 2010).
Another approach utilizes altimetry in order to derive the snow or ice freeboard, i.e., the elevation difference between the sea surface height (SSH) and snow or ice surface, respectively. Laser signals only reach the snow surface, while radar altimeters basically show the snow-ice interface elevation. By considering the relevant densities and the snow thickness those freeboards can be converted into ice thickness by assuming hydrostatic equilibrium. Sea ice thickness has been derived from Ku band radar altimetry from the European Remote Sensing satellites ERS-1 and ERS-2 as well as Envisat and CS-2 (Laxon et al., 2003; Giles et al., 2008; Laxon et al., 2013; Ricker et al., 2014). These radars are not restricted to clear sky conditions, but limited knowledge of the snow loading and the radar interaction with the snow layer currently limits the accuracy of altimeter-derived sea ice thickness estimates (Willatt et al., 2011; Kwok, 2014). Advantages of the radar on CS-2 over earlier Ku band altimeters are the reduced footprint size and noise due to the synthesis of overlapping measurements, its orbit which allows a coverage up to 88° N and S and the potential of interferometric measurements (Wingham et al., 2006). In most parts of the Arctic Ocean, the Synthetic Aperture Radar (SAR) mode is used except for many coastal areas where the SAR Interferometric (SARIn) mode is applied. Until July 2014 the so-called "Wingham Box" (80-85° N and 100-140° W) was another area of SARIn-mode measurements.
The SSH is crucial for altimeter-based ice thickness retrievals. For this reason the altimeter measurements are separated into those from ice and those from leads (see Fig. 1 for examples from CS-2). The lead measurements are used to derive the SSH, which acts as reference for the freeboard. Leads covered by thin ice and falsely detected leads (i.e., thick ice) result in an overestimation of the SSH and therefore in a negative bias in the derived freeboard and thickness. If only very few, assured lead measurements are considered, the statistical error increases (Armitage and Davidson, 2014). It is therefore of high interest to find a lead detection method which is very trustworthy and detects as many leads as possible.
In this study the quality of CS-2-based lead detection procedures is assessed by a comparison with MODIS measurements. Previously published classifiers are implemented and compared with newly derived ones in a receiver operating characteristics (ROC) graph. The most promising one is subsequently used to derive the lead area fraction and the lead width distribution. Thereby this study attempts to close a gap of knowledge about the differences of lead detection procedures from CS-2 and makes suggestions for improvements, which has direct implications for sea ice thickness estimates.

The ground truth
In order to optimize and compare the performance of different classification routines, we choose a supervised classification approach. Visual Moderate Resolution Imaging Spectroradiometer (MODIS) measurements can be used to distinguish between sea ice and water (Su et al., 2012). Two MODIS instruments are in operation on the NASA satellites Terra and Aqua. They cover the earth surface every 1 to 2 days and measure in 36 spectral bands from visual (used here) to infrared (Barnes et al., 1998). We identify land and cloud influences manually and are therefore able to rely only on the MODIS band 2 (around 857 nm wavelength) level 1B reflectance as reference data. It has a resolution of 250 m and seems to be even more suited to identify leads than band 1 (not shown). Dark areas with sharp edges and linear shapes in the MODIS images are interpreted as leads. CS-2 measurements from these areas, recorded less than 1 h before or after the MODIS acquisition, are manually labeled as lead. In the same way we identify CS-2 measurements of ice, while all measurements with a mixture of both classes within the footprint are excluded from this study (see also Fig. 2a). The CS-2 footprint is assumed to be 300 m in and 1500 m across flight direction. In the following, this ice/lead information is considered as ground truth, regardless of possible mislabeling, for example, caused by unexpectedly high ice velocities.
The ground truth consists of 722 lead and 5768 ice measurements. Note that this method is limited by the resolution of MODIS. CryoSat-2 measurements which look like they originate from ice in MODIS scenes can actually contain small amounts of leads. See Sect. 4.2 for a discussion on this circumstance. The ground truth is acquired from February to the beginning of May in 2012 and 2013 from seven MODIS granules in the eastern Beaufort Sea and north of the Canadian Arctic Archipelago. For this time of the year optical MODIS scenes are available and surface melting can be ruled out. Within this study we use CryoSat-2 Level 1b data with processor versions "SIR1SAR/4.0" and "SIR1SAR/4.1" (Baseline B). These two SAR mode versions are equivalent.

Relation to physical properties
Large-scale roughness results in a spread in time of the received CS-2 signal as exposed parts of the surface are reached earlier than low-lying parts. Roughness with a scale smaller than the wavelength (∼ 2.2 cm for Ku band) reduces the specularity of the surface. Therefore measurements of the same position from varying incidence angles are more similar for rough surfaces (Wingham et al., 2006). In addition areas further away from the nadir point have a stronger contribution, leading to an emphasized signal following the first (nadir) peak (Laxon, 1994a). Energy conservation implies a reduced maximum receivable signal if the emitted power is scattered in all directions by a rough surface.
The characteristic impedance of the surface layer might also influence the signal amplitude (Laxon, 1994a). If the difference in impedance at 13.5 GHz of the uppermost layer and the air is small, there is less reflection and more transmission into the ice/snow. Within the medium it is partly absorbed and scattered by inhomogeneities, again leading to a spread of the signal with lower maximum values and a more homogeneous angular distribution. This process could for example be favored by a layer of snow with moderate temperature.
As leads are locally bound, the fetch is too small for bigger waves to evolve in the water. The thin ice cover, if present, is yet neither physically deformed nor covered with snow. Furthermore the microstructure of young ice is more compact than of older ice as most brine pockets are filled and fewer connections have evolved. Therefore leads can be characterized by their commonly flat surface with relatively high impedance difference to the air. The returns originating from leads are expected to be compressed in time with higher maximum values and stronger incidence angle dependency (specular returns).
The Doppler shift is used in the CS-2 SAR mode to split each returning echo into 64 beams with different along-track incidence angles. For each processed point on the ground, all beams targeting this point from varying satellite positions are combined to one waveform, i.e., the returned power as function of time (Wingham et al., 2006; see Fig. 1 for typical ice and lead waveforms). The following waveform-based parameters are used: maximum power, pulse peakiness, leading edge width and trailing edge width. While in the process of waveform formation the information of the angular dependency is disregarded, the beams are additionally integrated over time (summed) individually. Thereby the incidence angle information is maintained in exchange for the temporal development. The returning energy as function of beam number (i.e., incidence angle) is approximated by a fitted Gaussian distribution curve. We use the stack standard deviation and the stack excess kurtosis parameters which are based on this curve.

Parameter definition
-The maximum power (MAX) is the highest recorded power of the calibrated waveform in Watts.
-The pulse peakiness (PP) has been established by Laxon (1994b) and is defined as the MAX divided by the accumulated power ($P_{\mathrm{WF}}$) of all bins constituting the waveform:
$$\mathrm{PP} = \frac{\mathrm{MAX}}{\sum_i P_{\mathrm{WF},i}} \tag{1}$$
which is the same definition as used by Armitage and Davidson (2014), while the values of Laxon et al. (2013) are divided by 100 and those of Ricker et al. (2014) by 128 for consistency.
-The left and right pulse peakiness (PPL and PPR) from Ricker et al. (2014) for Baseline B data are defined as (R. Ricker, personal communication, January 2015): where imax is the index of the maximal value of the waveform. The PPL and PPR have been proposed in order to reject off-nadir leads, the influence of which cannot be quantified based on our methodology (see Sect. 2.1). Therefore the PPL and PPR are not fully included in this study. However, they are defined as we use the classifier of Ricker et al. (2014) for comparisons.
-The leading edge width (LEW) is defined as the width between 1 and 99 % of the amplitude of a Gaussian fit to the leading edge of the waveform. The fitted area starts at the first bin reaching 1 % of the maximum power and ends at the second bin following the first peak. The first peak is the first local maximum reaching at least 50 % of the maximum power. To avoid bimodal waveforms, we exclude measurements with a first peak smaller than 80 % of the maximum power from the ground truth. About 7.6 % of the waveforms are discarded in this way. Similar fits and constraints are used by Kurtz et al. (2014).
-The trailing edge width (TEW) is defined as the width between 99 and 1 % of the amplitude of an exponentially decaying fit to the trailing edge of the waveform. The fitted area starts at the position of the maximum power and ends at the last bin (e.g., Legresy et al., 2005).
-The stack standard deviation (SSD) is the standard deviation (SD) of the mentioned Gaussian distribution of the energy as function of beam number (i.e., incidence angle). The SSD describes the width of the Gaussian; it is not the SD of the energy values themselves. We use the SSD in units of "beams" but it can also be expressed in degrees. Due to the more specular characteristics of leads, the spread of power with incidence angle is expected to be smaller and so is the SSD for leads (Wingham et al., 2006).
-The stack excess kurtosis (SK) is also obtained from the Gaussian approximation of the energy as function of beam number. Continuous Gaussian functions have in general an excess kurtosis of zero, so how can the SK reach other values? This is attained by evaluating the Gaussian at the beam numbers. The excess kurtosis of these discrete values is the SK (Veit Helm, personal communication, June 2014; Wingham et al., 2006). The fitting of the Gaussian to the measured beam energies and subsequent evaluation of it at the very same positions can be understood as a smoothing procedure. It is worth mentioning that this procedure might also limit the information the SK provides. The kurtosis is a measure of the peakedness which is expected to be higher for leads.
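The two simplest waveform parameters from the list above can be illustrated with a minimal sketch. It assumes the waveform is available as a plain sequence of calibrated power values; the function names and the toy waveforms are hypothetical and are not taken from the CS-2 processing chain.

```python
def max_power(waveform):
    """MAX: highest power value of the waveform."""
    return max(waveform)


def pulse_peakiness(waveform):
    """PP: MAX divided by the accumulated power of all waveform bins."""
    total = sum(waveform)
    if total <= 0:
        raise ValueError("waveform has no positive accumulated power")
    return max(waveform) / total


# Toy waveforms in arbitrary units (illustrative values, not CS-2 data):
# a specular, lead-like return and a more diffuse, ice-like return.
lead_like = [0.0, 0.1, 8.0, 0.5, 0.1, 0.0, 0.0, 0.0]
ice_like = [0.0, 0.2, 1.0, 0.9, 0.7, 0.5, 0.3, 0.2]

pp_lead = pulse_peakiness(lead_like)
pp_ice = pulse_peakiness(ice_like)
```

For these toy inputs the lead-like waveform yields the larger MAX and the larger PP, in line with the expectation that specular returns are compressed in time.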

Threshold optimization
Threshold-based classifications are widely used to identify leads from Ku band altimeters. We use a repeated random cross-validation technique to derive and test a set of thresholds (interested readers are referred to chapter 9 in Duda et al., 2001). This threshold set consists of one threshold for each parameter used for the respective classifier. The cross-validation involves a random separation of the ground truth samples into a training and a testing subset, each of which consists of 50 % of all samples. From the training subset we derive the threshold set by using Eq. (4) and apply it to the testing set to investigate its performance. The random assignment into subsets and the testing of the newly derived threshold set is repeated 200 times for each classifier to get an overall performance and an estimation of its spread. These steps are illustrated in Fig. 3. As mentioned in Sect. 1 there are different applications for lead detection algorithms, also resulting in different demands on their characteristics. One plausible aim is to reduce the total number of false detections to a minimum. But one might also be interested in a more conservative lead detection by reducing the amount of ice being detected as lead (false leads) at the cost of fewer correctly detected leads (true leads). A more conservative detection might be used for a freeboard retrieval as false leads might result in a bias while high true lead rates are not always of high importance.
To take these different demands into account we include a weighting factor w in the cost function:
$$\mathrm{cost} = \mathrm{False\_Lead} + w \cdot \mathrm{False\_Ice} \tag{4}$$
where False_Ice represents the number of lead samples classified as ice and False_Lead the number of ice samples classified as lead. The threshold set is derived by minimizing the cost function on the training subset using the Nelder-Mead simplex algorithm (Nelder and Mead, 1965) with up to 400 initial guesses to find the global minimum. The Nelder-Mead method is an unconstrained direct search algorithm for multidimensional minimization. This optimization reduces primarily false leads for 0 < w < 1, while for w = 1 the total number of false classifications (false ice + false leads) is minimized. We use the parameter acronym with the weight as index to point at the corresponding one-dimensional classifier.
This methodology is applied to all single parameters and all possible pairs of them. In the latter case, the threshold set is derived as the combination of both thresholds with the smallest value of the cost function.
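The procedure for a single parameter can be sketched as follows. This is only an illustration under stated assumptions: the cost is taken as the number of false leads plus w times the number of false ice detections, which is consistent with the described behaviour for w = 1 and 0 < w < 1; a measurement is detected as lead if its value exceeds the threshold; and an exhaustive search over the observed values replaces the Nelder-Mead optimization used in the study. All function names are hypothetical.

```python
import random


def cost(threshold, values, is_lead, w):
    """Assumed cost: false leads + w * false ice (lead if value > threshold)."""
    false_lead = sum(1 for v, lead in zip(values, is_lead) if v > threshold and not lead)
    false_ice = sum(1 for v, lead in zip(values, is_lead) if v <= threshold and lead)
    return false_lead + w * false_ice


def fit_threshold(values, is_lead, w):
    """Exhaustive search over observed values (stand-in for Nelder-Mead)."""
    candidates = sorted(set(values))
    return min(candidates, key=lambda t: cost(t, values, is_lead, w))


def rates(threshold, values, is_lead):
    """Return (true lead rate, false lead rate) for a given threshold."""
    n_lead = sum(is_lead)
    n_ice = len(is_lead) - n_lead
    true_lead = sum(1 for v, lead in zip(values, is_lead) if v > threshold and lead)
    false_lead = sum(1 for v, lead in zip(values, is_lead) if v > threshold and not lead)
    return true_lead / n_lead, false_lead / n_ice


def cross_validate(values, is_lead, w, n_runs=200, seed=0):
    """Repeated random 50/50 split: train the threshold, test its rates."""
    rng = random.Random(seed)
    idx = list(range(len(values)))
    results = []
    for _ in range(n_runs):
        rng.shuffle(idx)
        half = len(idx) // 2
        train, test = idx[:half], idx[half:]
        t = fit_threshold([values[i] for i in train], [is_lead[i] for i in train], w)
        results.append(rates(t, [values[i] for i in test], [is_lead[i] for i in test]))
    return results
```

Calling `cross_validate` with the labeled parameter values returns one (TLR, FLR) pair per run; their mean and standard deviation correspond to the overall performance and its spread.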

Classification performance
In Fig. 2 the CS-2 track essentially crosses three wider leads, two of which are brighter at the northern side. This indicates that they are covered by ice on this side, while the southern side might exhibit open water. The third wider lead around 71.2° N and a thinner one at 71.75° N seem both to be completely covered by thin ice. The manual classification in Fig. 2a only visualizes the methodology as the time difference is larger than 1 h and this scene is therefore not part of the ground truth. Gaps in the track occur when the MODIS information of CS-2 footprints cannot be assigned unambiguously to leads or ice. The PP_1, MAX_1 and the classifier developed by Ricker et al. (2014) (hereinafter called RI14) show strong similarities as they detect all relevant leads while lead detections are very rare where the MODIS scene shows ice. However, all of them show in some cases a mixture of ice and lead detections within wide leads (not shown). The classifier used by Laxon et al. (2013) (hereinafter called LX13) detects all visible leads without a significant number of missing lead detections, but it also detects leads where no or only weak indications for them can be found in the MODIS scene.
Figure 4 shows a receiver operating characteristics (ROC) graph of all tested classifiers. Each classifier is represented by one point in the graph, the position of which is defined by its true lead rate (TLR; the amount of correctly detected leads divided by the number of tested lead samples) and false lead rate (FLR; the number of ice measurements in the ground truth detected as lead divided by the number of tested ice samples). The upper left corner corresponds to ideal classifiers and the principal diagonal represents random assignments. For each parameter and pairs of them, we use different weights, resulting in different thresholds and corresponding performances. For single parameter classifiers, 15 different weights (0.001, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 1, 2, 5, 10, 30, 100) are applied to capture the development of the performance (from the lower left corner to the upper right in Fig. 4) while w = 0.001, w = 0.5 and w = 1 are used in the two-dimensional case. To follow the performance of e.g., MAX-based classifiers one can start with small w, implying high threshold values which only detect a few leads (lower left corner in Fig. 4). With increasing w and decreasing thresholds, the TLR increases in the beginning much faster than the FLR. At some point the number of correct lead detections is mostly constant, while a further lowering of the thresholds mainly increases the number of ice measurements which are detected as lead. As relative performances are shown, the classifier closest to the upper left corner is not necessarily the "best" one but if one classifier is on the upper left side of another it can be considered as superior. Further remarks on ROC graphs are given by Fawcett (2006).
It is, at this point, not important how the thresholds are derived but only the combination of their values and performance.
Figure 5a illustrates the spread within the runs in terms of the SD of the true and false lead rates. The differences between all shown one-dimensional classifiers and the corresponding two-dimensional ones are clearly smaller than the inherent fluctuations and are therefore considered as not significant. The classifiers based on the MAX are separated from the others by more than their SDs for small weights, while they are not for higher weights. However, the fluctuations in classifier performance of individual runs with the same weight occur mostly in the direction of the mean performances of neighboring weights on the same parameter (i.e., along the lines) as shown in Fig. 5b.

Sea surface height
The SSH is calculated as a second stage of assessing the quality of classifiers. To derive the SSH from leads is a popular application; to test the classifier behavior in this context is therefore a very practical approach. This is done statistically by investigating the stability of SSH estimates from different classifiers.
The function $A\,\mathrm{sinc}^2(\pi B_w (\tau - \tau_0))$ is fitted to the waveform from $P_{\mathrm{WF},\,imax-2}$ to $P_{\mathrm{WF},\,imax+2}$, where $A$ is the amplitude, $B_w$ = 320 MHz is the received bandwidth and $\tau$ the delay time. $\tau_0$ is the center of the fit and is used as the tracking point, i.e., the delay time which is assumed to correspond to the return from the main scattering surface. Kurtz et al. (2014) have shown that specular returns are well approximated by a $\mathrm{sinc}^2$ function and that the tracking point should be defined close to the maximum of the waveform. The range is corrected for atmospheric influences (ionosphere, wet and dry troposphere, dynamic atmosphere and the inverse barometric effect) and tides (namely: ocean, long period, solid earth, polar and ocean loading tides) as provided in the CS-2 L1B data. The surface elevation of lead measurements is considered as SSH. All SAR mode measurements from January to March of the years 2011 to 2014 are brought to a 10 km × 10 km grid.
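The five-bin fit described above can be sketched with a simple grid search over the tracking point. This is an illustrative stand-in, not the fitting routine used in the study: for each candidate center the amplitude is obtained by linear least squares, and the candidate with the smallest residual is kept. The bin spacing is passed in as a hypothetical parameter `bin_spacing_s`; the usage below assumes a spacing of 1/B_w, which should be checked against the product documentation.

```python
import math

BANDWIDTH_HZ = 320e6  # received bandwidth B_w as given in the text


def sinc2(x):
    """Unnormalized (sin x / x)^2 with the limit 1 at x = 0."""
    if x == 0.0:
        return 1.0
    return (math.sin(x) / x) ** 2


def fit_tracking_point(waveform, bin_spacing_s, steps=2001):
    """Fit A * sinc^2(pi * B_w * (tau - tau0)) to the bins imax-2 .. imax+2.

    Returns (tau0, A). tau0 is searched on a grid within +/- 1 bin of imax.
    """
    imax = max(range(len(waveform)), key=waveform.__getitem__)
    if imax < 2 or imax + 2 >= len(waveform):
        raise ValueError("maximum too close to the waveform edge")
    bins = range(imax - 2, imax + 3)
    best = None
    for k in range(steps):
        tau0 = (imax - 1 + 2 * k / (steps - 1)) * bin_spacing_s
        model = [sinc2(math.pi * BANDWIDTH_HZ * (i * bin_spacing_s - tau0)) for i in bins]
        # Linear least-squares amplitude for this candidate center
        amp = sum(m * waveform[i] for m, i in zip(model, bins)) / sum(m * m for m in model)
        resid = sum((waveform[i] - amp * m) ** 2 for m, i in zip(model, bins))
        if best is None or resid < best[0]:
            best = (resid, tau0, amp)
    return best[1], best[2]


# Synthetic, noise-free specular return (assumed spacing of 1 / B_w per bin)
dt = 1.0 / BANDWIDTH_HZ
true_tau0, true_amp = 60.3 * dt, 5e-11
synthetic = [true_amp * sinc2(math.pi * BANDWIDTH_HZ * (i * dt - true_tau0)) for i in range(128)]
est_tau0, est_amp = fit_tracking_point(synthetic, dt)
```

A gradient-based or simplex fit would serve equally well; the grid search is chosen here only because it has no external dependencies and no convergence issues on five samples.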
Figure 6 shows the SSH anomaly, i.e., the difference of individual measurements along a CS-2 track from the multiyear mean SSH field. The LX13 shows the largest number of lead detections and the strongest SSH anomalies. The other three classifiers show a more similar behavior but with the MAX_1 having a notably reduced number of large (outside of ±0.2 m) SSH anomalies.
The mean SSH field could be used as reference for ice thickness estimates. The variance around it acts as an indicator for its reliability and is caused by SSH variability, noise and the lead detection behavior. We expect differences of the variance between the classifiers to be caused only by the detection behavior, namely the inclusion/exclusion of ice and/or off-nadir lead measurements. Figure 7 shows the variance distribution of selected classifiers based on the gridded SSH estimates. The MAX_1 shows in general the smallest variances, while the histograms converge to zero with increasing variances for all classifiers in a similar way.

Spatial distribution
In the following sections we use the MAX_1, which has been derived by minimizing the total number of false classifications; its results are therefore taken as the best representation of the overall lead occurrence.
Figure 8 shows the lead fraction in the Arctic region as derived from CS-2 by dividing the number of detected lead measurements by the total number of measurements from January to March 2011. The AMSR-E Arctic lead area fraction (Röhrs and Kaleschke, 2012; Röhrs et al., 2012) (downloaded in September 2014) is also shown, combined over the same period and brought to the same spatial resolution.
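The lead fraction described here is a per-cell ratio of counts, which can be sketched as below. The sketch assumes that each measurement has already been projected to planar coordinates in kilometers and classified; the cell size is a free parameter (the 10 km default is only an assumption for illustration), and the function name is hypothetical.

```python
import math
from collections import defaultdict


def lead_fraction(measurements, cell_size_km=10.0):
    """Per-cell ratio of lead detections to all measurements.

    measurements: iterable of (x_km, y_km, is_lead) tuples in a projected
    coordinate system. Returns {(ix, iy): fraction}.
    """
    totals = defaultdict(int)
    leads = defaultdict(int)
    for x, y, is_lead in measurements:
        cell = (math.floor(x / cell_size_km), math.floor(y / cell_size_km))
        totals[cell] += 1
        if is_lead:
            leads[cell] += 1
    return {cell: leads[cell] / totals[cell] for cell in totals}


# Illustrative input (not real data)
sample = [(1, 1, True), (2, 3, False), (12, 1, False), (15, 5, False), (-3, 4, True)]
fractions = lead_fraction(sample)
```

Cells without any measurement are absent from the result rather than reported as zero, which keeps data gaps distinguishable from lead-free ice.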
Lead detections from CS-2 are most common in Baffin Bay, the Fram Strait region, the northern Barents Sea and the Kara Sea, as well as in the western Laptev Sea and the Chukchi Sea, all with lead fractions up to around 15 % (Fig. 8a). The central Arctic, including the area north of the Canadian Arctic Archipelago and the northern Canada Basin, shows low lead fractions of around 0-1.5 %. In the southern Beaufort Sea and especially its shear zone next to the coastline, lead fraction values of up to 6 % occur.
A somewhat different picture of the lead fraction pattern emerges by using the AMSR-E Arctic lead area fraction from Röhrs et al. (2012) (Fig. 8b). In areas covered by both estimates, the CS-2-based one mostly appears to be higher than the AMSR-E-based estimate. This is not the case in the southeastern Beaufort Sea, where the AMSR-E product shows values of 15 % and more, while they reach from 1.5 to 5 % for the CS-2-based estimate. We observe reasonable agreements in the Fram Strait region, the East Siberian Sea and the Chukchi Sea. Increased values occur for both estimates near islands like Svalbard, Franz Josef Land, Severnaya Zemlya and Wrangel Island. However, there are big differences between the data sets in the Baffin Bay, the Fram Strait regions close to the ice edge, the northern Barents Sea and the Kara Sea, where CS-2 consistently detects more leads than the AMSR-E lead area fraction indicates.
While a daily open ocean mask is provided for the AMSR-E product, we consider all areas north of 65° N for the CS-2-based estimates. The ice edge on the Atlantic side, as indicated by the AMSR-E mask, agrees well with the transition of CS-2 lead fractions from zero to higher values.
By the end of February 2013 the whole Beaufort Sea was pervaded by leads. Favored by storms, in mid-February, the ice started to move into the direction of the Bering Strait, causing a divergence in the pack ice. This is the reason for the opening of leads, beginning in the western part and propagating to the east. This process accelerated around 27 February after which all but the fast ice at the Canadian coast and the sea ice at the Canadian Arctic Archipelago was fractured. See also Beitsch et al. (2014) for further descriptions.
By comparing the CS-2 lead fractions from February and March 2013 (Fig. 9), the pattern of this fracture event is reproduced with a proper shape and amplitude. Most lead patterns can be observed in both months, in many cases slightly decreasing in amplitude towards March. However, while in February noticeable amounts of leads are only detected in the western part of the Beaufort Sea, the complete region shows 8 to 15 % lead coverage in March.

Apparent lead width
To investigate the lead width distribution we use a proxy which we call apparent lead width. The apparent lead width is the number of consecutive MAX_1 lead detections multiplied by the approximate distance between two positions of 300 m. It can be seen as a measure of the CS-2 track interval over a crossed lead or as the width of a lead as it appears in the one-dimensional domain of the CS-2 track. If the lead orientation is orthogonal to the CS-2 track, the apparent lead width is our best estimate of the actual lead width. We do not allow any ice detection within a lead, so false ice detections split a lead into smaller ones.
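The apparent lead width amounts to a run-length computation along the track, sketched here under the assumption that the classification is available as an ordered sequence of booleans (True for a lead detection) with a constant spacing of 300 m. Gaps in the track are not handled in this sketch, and the function name is hypothetical.

```python
from itertools import groupby


def apparent_lead_widths(is_lead_along_track, spacing_m=300):
    """Widths (in m) of all runs of consecutive lead detections."""
    return [
        spacing_m * sum(1 for _ in run)
        for flag, run in groupby(is_lead_along_track)
        if flag
    ]


# Illustrative track: a single ice detection splits leads into separate runs
track = [False, True, True, False, True, False, False, True, True, True]
widths = apparent_lead_widths(track)
```

Because any ice detection ends a run, a single false ice detection inside a wide lead yields two narrower apparent leads, as noted in the text.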
The apparent lead width distribution follows a power law in winter months with an exponent $a$ of 2.47 for values of 600 m and more (Fig. 10). A quantity $z$ is classified as being power law-distributed if its probability density function $p(z)$ satisfies:
$$p(z) \propto z^{-a} \tag{5}$$
The exponent is derived following the approximation of Clauset et al. (2009) for discrete distributions with a simple adjustment for a step size of 300 m as shown in Eq. (6).
For the calculation of the power-law exponent, only apparent lead widths $z_i$ with a width equal to or larger than $z_{\min}$ = 900 m are considered, with $N_Z$ being their number. A line representing a power law with the calculated exponent is displayed in Fig. 10. It shows the validity of this approximation down to 600 m as the slopes of both lines agree very well. The interannual variability is small, with exponents between 2.42 in 2013 and 2.52 in 2011 with a SD of 0.04 amongst all 4 years. Differences between January, February and March of the same year are even smaller while the exponent decreases towards spring and autumn. All calculated distributions follow a power law for apparent lead widths of 600 m and more.
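The exponent estimate can be sketched as follows. The discrete approximation of Clauset et al. (2009) for integer data is a ≈ 1 + n [Σ ln(x_i / (x_min − 1/2))]^(−1). The sketch assumes that the adjustment for the 300 m step size consists of expressing the widths in units of the step, which replaces the 1/2 by half a step (150 m); this form is an assumption and not necessarily the exact expression used in the study. The function name is hypothetical.

```python
import math


def power_law_exponent(widths_m, z_min_m=900.0, step_m=300.0):
    """Approximate discrete power-law exponent for widths >= z_min_m.

    Assumed form: a = 1 + N_Z / sum(ln(z_i / (z_min - step / 2))).
    """
    tail = [z for z in widths_m if z >= z_min_m]
    if not tail:
        raise ValueError("no widths at or above z_min")
    log_sum = sum(math.log(z / (z_min_m - 0.5 * step_m)) for z in tail)
    return 1.0 + len(tail) / log_sum


# Illustrative input: widths below z_min are ignored
example_widths = [300, 600, 900, 1800]
a_est = power_law_exponent(example_widths)
```

With real data the list of widths would come from all apparent lead widths of a winter season, and z_min controls where the power-law tail is assumed to start.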

Classification performances
Classifiers based on the MAX parameter generally show the best ratio between true and false lead rate. A classifier using MAX > $2.58 \times 10^{-11}$ W as threshold (MAX_1) detects 68.18 % of all leads correctly, while only 3.41 % of the tested ice measurements in the ground truth are detected as leads. The PP_1 using 0.35 as threshold has a TLR of 64.66 % (instead of 68.18 %) and a FLR of 4.09 % (instead of 3.41 %). The differences are even stronger for higher thresholds of $1.22 \times 10^{-10}$ W and 0.425, respectively (MAX_0.5 and PP_0.5, Table 1).
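The notion that a classifier lying to the upper left of another in the ROC graph is superior can be written as a small dominance check. The sketch uses the TLR and FLR values quoted above; the function name is hypothetical.

```python
def is_superior(a, b):
    """True if classifier a lies to the upper left of b in the ROC graph.

    a, b: (true_lead_rate, false_lead_rate). a must be at least as good in
    both rates and strictly better in at least one.
    """
    tlr_a, flr_a = a
    tlr_b, flr_b = b
    no_worse = tlr_a >= tlr_b and flr_a <= flr_b
    strictly_better = tlr_a > tlr_b or flr_a < flr_b
    return no_worse and strictly_better


MAX_1 = (0.6818, 0.0341)  # TLR, FLR quoted in the text
PP_1 = (0.6466, 0.0409)

max_beats_pp = is_superior(MAX_1, PP_1)
```

Two classifiers for which neither dominates the other are not ordered by this check; their ranking then depends on the relative cost of false leads and false ice.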
The performances of individual runs overlap only slightly for w = 1 and are well separated for w = 0.5. This shows that the performance improvement is significant. The increased fluctuation in the direction of neighboring weights in Fig. 5b is likely due to a variability of the thresholds introduced by the repeated optimization.
For airborne surveys with a device very similar to SIRAL on CS-2, Zygmuntowska et al. (2013) also found the MAX parameter to have fewer false lead classifications than all other parameters. The best combination of parameters (MAX & TEW) with a Bayesian classifier improves its rate only slightly from 6.5 to 6.2 %. Zygmuntowska et al. (2013) define the false lead classification rate (FLCR) as the percentage of all lead detections originating from sea ice. This is different to our false lead rate as we use the number of true ice measurements as a base. The FLCR values calculated from the absolute values in Table 1 are 28.6 and 12.5 % for MAX_1 and MAX_0.5, respectively. One reason for higher error rates of CS-2 is the reduced resolution of 300 m × 1500 m in contrast to around 10 m × 50 m for the airborne device; thereby it becomes much more likely that different surface types occur within one footprint. Further we have to allow for some temporal differences in the data acquisition and have to collocate the data sets, while for the airborne surveys optical images are taken simultaneously. Deficiencies of the ground truth which might be caused by ice drift and opening/closing of leads between the data acquisition, collocation and unnoticed narrow leads increase the error rates which might therefore be overestimated.
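The difference between the two error measures can be made explicit with a short calculation. The sketch below approximates the FLCR from the quoted TLR and FLR of the MAX-based classifier and the ground truth sample sizes (722 lead and 5768 ice measurements), since the absolute counts of Table 1 are not reproduced here; small rounding differences to the tabulated value are therefore expected. The function name is hypothetical.

```python
def false_lead_classification_rate(tlr, flr, n_lead, n_ice):
    """Share of all lead detections that originate from ice.

    true leads  = tlr * n_lead
    false leads = flr * n_ice
    FLCR        = false leads / (true leads + false leads)
    """
    true_leads = tlr * n_lead
    false_leads = flr * n_ice
    return false_leads / (true_leads + false_leads)


# TLR = 68.18 %, FLR = 3.41 % with the ground truth class sizes
flcr_max1 = false_lead_classification_rate(0.6818, 0.0341, 722, 5768)
```

The result is close to the 28.6 % stated above and illustrates how a false lead rate of a few percent translates into a much larger FLCR when ice samples outnumber lead samples by about eight to one.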
Compared to their MAX classifier, the PP classifier of Zygmuntowska et al. (2013) detects more leads from both ice and lead measurements. This is directly connected to the applied thresholds and is not a parameter property. For a solid decision as to which parameter is suited best for lead detection, it is necessary to vary the thresholds. Three classifiers developed by other authors are included in this study. With the same number of false leads, the true lead rates can be increased for our data set from 9 to ∼ 13 % (Röhrs et al., 2012), from 83 to ∼ 89 % (LX13) or from 61 to ∼ 79 % (RI14) if a MAX-based classifier is used instead (Fig. 4).
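Varying the threshold of a single parameter and recording the resulting pair of rates yields the points of a ROC graph. The following is a minimal sketch of such a sweep, assuming a "greater than threshold means lead" rule; the function name, parameter values and labels are hypothetical and do not reproduce the cross-validation scheme of this study.

```python
def roc_points(values, is_lead, thresholds):
    """Return a list of (threshold, TLR, FLR), with rates in percent.

    values:     one waveform parameter per measurement (e.g., MAX or PP)
    is_lead:    ground-truth labels, True for lead and False for ice
    thresholds: candidate thresholds; a value above the threshold counts as lead
    """
    n_lead = sum(is_lead)
    n_ice = len(is_lead) - n_lead
    points = []
    for t in thresholds:
        detected = [v > t for v in values]
        tl = sum(d and l for d, l in zip(detected, is_lead))      # true leads
        fl = sum(d and not l for d, l in zip(detected, is_lead))  # false leads
        points.append((t, 100.0 * tl / n_lead, 100.0 * fl / n_ice))
    return points


# Hypothetical parameter values (arbitrary units) and ground-truth labels.
values = [0.1, 0.2, 0.4, 0.5, 0.7, 0.9]
is_lead = [False, False, True, False, True, True]
points = roc_points(values, is_lead, thresholds=[0.3, 0.6])
for t, tlr, flr in points:
    print(f"threshold {t}: TLR = {tlr:.1f} %, FLR = {flr:.1f} %")
```

Comparing two parameters at equal FLR, rather than at one fixed threshold each, removes the dependence on the particular thresholds chosen.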
The shown classifiers using two parameters detect a lead if both thresholds are reached. This logical "and" criterion is now replaced by an "or". A classifier based on the MAX and the PP could for example define a measurement as a lead if its MAX value is above 10⁻¹¹ W or if its PP value is above 0.3 (one of those is now sufficient). This influences the number of false ice and false lead detections (i.e., the cost function). As a result, our example has higher thresholds than it would have for the same weight and parameters using the "and" criterion. Performing the same test as before but now using the "or" criterion for all pairs of parameters achieved no improvement of the classification (not shown).
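The two criteria can be sketched as follows, using the example thresholds from the text (MAX above 10⁻¹¹ W, PP above 0.3). The function name, its signature and the sample measurement are hypothetical.

```python
MAX_THR_W = 1e-11  # example MAX threshold from the text, in W
PP_THR = 0.3       # example PP threshold from the text


def is_lead(max_power, pp, max_thr=MAX_THR_W, pp_thr=PP_THR, mode="and"):
    """Classify one measurement with a two-parameter threshold classifier.

    mode="and": both thresholds must be exceeded.
    mode="or":  exceeding one threshold is sufficient.
    """
    above_max = max_power > max_thr
    above_pp = pp > pp_thr
    if mode == "and":
        return above_max and above_pp
    if mode == "or":
        return above_max or above_pp
    raise ValueError("mode must be 'and' or 'or'")


# Hypothetical measurement: peaky waveform with a low maximum power.
lead_or = is_lead(5e-12, 0.35, mode="or")    # PP alone is sufficient
lead_and = is_lead(5e-12, 0.35, mode="and")  # MAX threshold is not reached
print(lead_or, lead_and)
```

For identical thresholds the "or" rule can only add lead detections, so an optimization at the same weight compensates by raising the thresholds.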
The fact that combining two parameters seems to have no benefit at all indicates that the parameters are basically all utilizing the same physical information and that the instrument and fading noises have either a correlated influence on all parameters or the influence is not significant at all. As some of the parameters are derived in a very different way (e.g., waveform- and stack-based ones) we do not expect the noise to affect them equally. We conclude that noise probably plays only a minor role in the classification errors.

Narrow leads and sea surface height
It has been shown that leads which only cover a small fraction of a radar altimeter footprint can dominate the signal due to the high amplitude of specular returns (Drinkwater, 1991). Therefore CS-2 detects leads which are simply not visible for MODIS despite its higher resolution. The fraction of these leads in the ice class of the ground truth cannot be quantified by our approach. These narrow leads may or may not cover the nadir point, while leads covering the whole footprint ("true leads") certainly do. Therefore one could expect true lead measurements to ensure a higher quality (see Sect. 4.3) for the derivation of the SSH.
This expectation is supported by the smaller spread of the SSH estimate based on the MAX 1 compared to the PP 1, with nearly the same number of lead detections (true + false leads; Table 1). This advantage, in turn, confirms that narrow, unnoticed leads in the ice class do not reverse the result of the ROC analysis.

Off-nadir leads
As mentioned before, leads which are not directly in the nadir direction can dominate the signal. As this can cause a bias in elevation estimates, Ricker et al. (2014) introduced the left and right pulse peakiness to avoid off-nadir leads. It has further been shown that it is, to some extent, possible to reduce the influence of off-nadir leads by increasing the pulse peakiness threshold of a single-parameter classifier (Armitage and Davidson, 2014). This is done at the cost of discarding up to 60 % of the lead detections and thereby increasing the statistical error. The underlying process allowing for this reduction is the influence of the surface orientation towards the sensor on the maximum return. The relative orientation that favors high maximum values the most is expected to be found close to the nadir point. The further away from this point the main scattering surface (i.e., the lead) is, the more power is reflected away from the sensor instead of back towards it. This process primarily influences the MAX value, which then has implications for the PP (Armitage and Davidson, 2014). Therefore it is reasonable to assume that the influence of off-nadir leads is also reduced for high MAX thresholds, potentially even more strongly than for the PP, as the process causing this reduction has a more direct impact on the MAX. This assumption is supported by the reduced SSH variance of the MAX 1, even though we cannot say whether this reduction is caused by the elimination of off-nadir leads, of incorrectly classified ice measurements, or of both.

Spatial distribution
The CS-2 lead fraction shows a reasonable spatial distribution. It is small in the central Arctic and north of the Canadian Arctic Archipelago, which are typical regions of thick multi-year ice. It shows high values in regions of high drifting velocities or those known to favor the development of polynyas, like the Fram Strait, the western Laptev Sea and the Chukchi Sea. The lead fractions also increase around most islands and coasts, which introduce shear between the land fast ice and the drifting pack ice. Small lead fractions in the eastern Laptev Sea and the western parts of the East Siberian Sea could indicate the presence of large amounts of land fast ice. The absolute lead fraction values tend to be higher but are mostly in agreement with those of Lindsay and Rothrock (1995). They found lead fractions of 2 to 3 % for the central Arctic and 6 to 9 % in the peripheral seas in winter using the Advanced Very High Resolution Radiometer (AVHRR).
In nearly all regions, the CS-2 lead fraction exceeds the AMSR-E Arctic lead area fraction from Röhrs et al. (2012) (Fig. 8). While the AMSR-E product only detects most leads with a width of 3 km and more, a width of at least some hundred meters is sufficient for detection by CS-2. As shown in Sect. 3.4, the apparent lead width follows a power law on the scale of kilometers, implying that measurements from narrow leads largely outnumber those from wider leads. In contrast to the CS-2 lead fraction, the AMSR-E product additionally does not include very large regions of thin ice like huge polynyas, as a spatial high-pass filter is used.
The ice edge towards the North Atlantic is captured by both approaches quite similarly. In Fig. 8a we expect the ice edge to be at the interface between areas of no lead detections around the Norwegian and central Barents Sea and neighboring areas of higher lead fractions. This allows the inference that the MAX 1 detects no leads over the open ocean. For this reason the lead fraction of grid cells at the very ice edge is likely to be underestimated, relative to the ice-covered part of the cell.
While the AMSR-E lead fraction drops relatively consistently down to values around 2-3 % within a belt of around 200 km from the ice edge, CS-2-based estimates show much higher values of around 12 % in these areas. The high values in the marginal ice zone are reasonable, as this area is likely to be fractured due to the influence of ocean waves. Especially in the Baffin Bay, the northern Barents Sea and the Kara Sea, high rates of new ice formation can occur in winter, which is in good agreement with the high CS-2 lead fractions of these regions. The generally reasonable distribution and its alternation enhance our confidence in the CS-2 lead detection algorithm.

Apparent lead width
Compared to the power law, the number of apparent lead widths of 300 m found is smaller than expected. This is a typical feature of the lower bound of the resolution, as leads of this size are not always covered by a single measurement but partially by more, not necessarily leading to a detection. This is intensified by the elongated footprint of CS-2, as small leads may only be detected if they cover most of the width of the footprint. The MAX 1 is optimized mainly on leads wider than a single measurement, which could also cause the relatively small number of apparent lead widths of 300 m. Therefore it is likely that the bend on the lower bound of the distribution in Fig. 10 is an artifact and not a valid part of the lead distribution.
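Since the apparent lead width is derived from the number of consecutive lead detections, its computation can be sketched as a run-length count along the track. The sketch below assumes an along-track spacing of 300 m per measurement and an uninterrupted track (data gaps would have to be handled separately); the function name and the example track are hypothetical.

```python
from itertools import groupby


def apparent_lead_widths(is_lead, spacing_m=300.0):
    """Return the apparent width (m) of each run of consecutive lead detections.

    is_lead:   along-track sequence of booleans, True for a lead detection
    spacing_m: assumed along-track distance covered by one measurement
    """
    return [
        spacing_m * sum(1 for _ in run)   # run length times spacing
        for flag, run in groupby(is_lead)
        if flag                           # keep lead runs, skip ice runs
    ]


# Hypothetical track segment: ice, 2 leads, ice, 1 lead, ice, 3 leads.
track = [False, True, True, False, True, False, True, True, True]
widths = apparent_lead_widths(track)
print(widths)
```

With this definition the smallest possible apparent lead width equals one measurement spacing, so the lowest bin of the distribution is the one most affected by partially covered leads.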
Marcq and Weiss (2012) found a power-law exponent similar to ours, between 2.1 and 2.6 for scales from 20 m to 2 km, by analyzing a single SPOT image with a resolution of 10 m. In two submarine-based surveys, power laws with exponents of 2 and 2.29 were found for the regions from the Fram Strait to the North Pole and the Davis Strait, respectively (Wadhams, 1981; Wadhams et al., 1985). In both cases, the resolution is about 5 m and the power law holds for the range from 50 to 1000 m. The examination of submarine and mooring data by Kwok et al. (2009) also indicates a strong accumulation of lead widths down to 5 m, but the distribution has not been analyzed. For the central Arctic, a study by Lindsay and Rothrock (1995) also states a power-law distribution, but with a mean exponent of 1.6 for scales from 1 to around 50 km. It is based on thermal to near-visible infrared measurements from the AVHRR, which is, despite its resolution of 1 km, expected to detect leads with a minimum width below this size. It has been discussed whether the lead width distribution might be scale-dependent (Lindsay and Rothrock, 1995; Marcq and Weiss, 2012), which seems not to be the case, as we found a stable power-law behavior on scales partly covering those of all other studies. The results of Lindsay and Rothrock (1995) contradict ours, as we found a higher power-law exponent, implying a higher fraction of narrow leads. One explanation would be the relatively coarse resolution of the AVHRR in combination with its high sensitivity to leads. This could cause leads to appear wider than they are, as well as several narrow ones to appear as one wide lead, resulting in a less steep apparent lead width distribution. Comparisons with MODIS images indicate that the classifier used in this study switches in some cases between lead and ice detections over refrozen leads. This could result in an overestimation of the power-law exponent. The estimates might also have a different tolerance of refrozen leads, while both include at least the early stages of freezing. The size of leads often grows with time as the surrounding ice floes keep drifting apart, meaning that estimates which include older leads are also likely to show less steep apparent lead width distributions.
Another reason could be an actual shift in the distribution between the periods from 1989 to 1995 and 2011 to 2014. This would be consistent with the results of Marcq and Weiss (2012) but would not explain the differences from the studies by Wadhams (1981) and Wadhams et al. (1985). However, this shift could be driven by observed changes in the amount of perennial ice, the ice thickness and drifting velocities (Nghiem et al., 2007; Haas et al., 2008; Rampal et al., 2009). Rampal et al. (2009) further link an increase found in winter strain rates between 1978 and 2007 to a weakening in mechanical strength of the ice and increased fracturing. We found no sign of a trend in the power-law exponent within the 4 years of CS-2 data.

Implications of apparent lead width distribution
As most leads are not crossed orthogonally, the apparent lead width is typically larger than the actual width of the lead. A transformation to the latter is not possible without profound knowledge of the sensitivity of lead detections and it requires assumptions about the shape and orientation of leads. This is impeded by a nonuniform distribution of lead orientation (Bröhan and Kaleschke, 2014). For most applications it is not necessary to perform this transformation, as this is the way leads appear to anything moving along sea ice, including the wind acting on the ocean surface.
The apparent lead width distribution shows a strong intensification towards smaller lead widths. The area contribution of leads with the width z is z · p(z) ∝ z^(−2.47+1) = z^(−1.47), which still decreases relatively fast with increasing width. This indicates that every lead area estimate which is not capable of detecting narrow leads is very likely to underestimate the total lead area. For a parametrization of lead area estimates it is of high interest to know down to which bound the power-law behavior holds. This defines not only the mean lead width but also the fraction of lead area which is not captured by the estimate.
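How strongly the uncaptured lead area depends on the lower bound can be illustrated by integrating the area contribution z · p(z) between two bounds. The sketch below assumes that a single power law with exponent 2.47 holds over the whole range; the bounds of 20 m and 50 km and the detection limit of 3 km are illustrative choices, and the function name is hypothetical.

```python
def area_fraction_below(w, z_min, z_max, exponent=2.47):
    """Fraction of total lead area contributed by leads narrower than w.

    Assumes p(z) proportional to z**(-exponent) for z_min <= z <= z_max, so the
    area contribution is proportional to z**(1 - exponent), whose integral is
    proportional to z**(2 - exponent) (valid for exponent != 2).
    """
    if exponent == 2.0:
        raise ValueError("exponent 2 needs the logarithmic form of the integral")
    if not z_min <= w <= z_max:
        raise ValueError("w must lie between z_min and z_max")
    k = 2.0 - exponent
    return (w**k - z_min**k) / (z_max**k - z_min**k)


# Illustrative bounds: power law assumed valid from 20 m to 50 km,
# detection limit of a coarse product assumed at 3 km.
missed = area_fraction_below(3000.0, z_min=20.0, z_max=50000.0)
print(f"area fraction from leads narrower than 3 km: {missed:.2f}")

# Raising the assumed lower bound to 300 m reduces the missed fraction.
missed_300 = area_fraction_below(3000.0, z_min=300.0, z_max=50000.0)
print(f"with a lower bound of 300 m: {missed_300:.2f}")
```

The result changes markedly with the assumed lower bound, which is why knowing where the power-law behavior ends matters for any parametrization of low-resolution lead area estimates.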

Conclusions
This study presented the potentials of several parameters and combinations of them to distinguish CryoSat-2 measurements from leads and those from ice. They have been tested by deriving thresholds and analyzing their capabilities of reproducing a prior classification. The combination of parameters, even though common practice, has not shown any advantage for threshold-based classifications. Using the maximum value of the waveform has in all cases shown better results than any other tested parameter, including the pulse peakiness. Compared to the classifier used by Laxon et al. (2013), a threshold of 2.58 × 10⁻¹¹ W on the MAX detected only 68 instead of 83 % of ensured lead measurements but showed a much more stable SSH estimate by reducing the amount of ice being detected as lead and/or off-nadir leads. A solid lead detection, which ensures that nearly all lead classifications actually originate from leads, facilitates a precise, unbiased freeboard retrieval. It thereby helps to improve ice thickness estimates, which is one of the major aims of the CryoSat-2 mission.
The threshold of 2.58 × 10⁻¹¹ W was further used as the best representation of the overall lead occurrence. It showed reasonable spatial distributions with relatively high lead fractions of around 12 % in the marginal ice zone. This data set has been made available at http://icdc.zmaw.de/. The apparent lead width was derived from the number of consecutive lead detections. Its distribution follows a power law with an exponent of 2.47 ± 0.04, which implies a concentration of both amount and area contribution at small lead widths. Embedding this work into those of others, a scale-independent lead width distribution from 20 m to 50 km is likely. The implications for a parametrization of low-resolution lead area estimates were addressed and its dependency on the lower bound of the distribution found was emphasized. The turbulent heat transport over ice-covered regions is known to be strongly lead width-dependent on small scales. The distribution found suggests that the work of Marcq and Weiss (2012), based on a single SPOT scene, can be generalized. This implies a much higher heat transport per lead area than that which would be obtained by wide leads. In this manner the presented findings can help to improve the parametrization of this fundamental process in coupled ocean-ice-atmosphere models.
The Supplement related to this article is available online at doi:10.5194/tc-9-1955-2015-supplement.

Figure 1. Typical CryoSat-2 waveforms from ice (left panel) and a lead (right panel). The definitions of the leading edge width (LEW), trailing edge width (TEW) and maximum power (MAX) are illustrated, while the pulse peakiness (PP) is inversely proportional to the gray areas that normalized waveforms would have. The bin number can be converted into delay time. Note the different scaling factors of the y axis (×10⁻¹³ and ×10⁻¹⁰ for the ice and lead waveform, respectively).

Figure 2. MODIS band 2 scene from 6 March 2013 in the southern Beaufort Sea combined with a CS-2 track taken 83 min later on.The CS-2 samples have been classified as lead (red) and ice (blue) manually (a) or by PP 1 (b), MAX 1 (c), RI14 (d) and LX13 (e).The classifier from Röhrs et al. (2012) detects no leads within this section.

Figure 3. Flow chart of the cross-validation scheme used.

Figure 4. ROC graph of tested classifiers with altering thresholds ( ) on one (connected by lines) and two (marker) parameters as well as predefined classifiers (magenta markers).RO12 corresponds to the classifier used in Röhrs et al. (2012).In the two-dimensional case, the color indicates one of the parameters, and the shape the other one.The inset is a zoom of small false lead rates.

Figure 5. (a) ROC graph including error estimates in terms of SD of the 200 runs for each weight of the single-feature classifiers using the MAX, PP and LEW as well as the performances and SDs of the predefined classifiers. For comparison, the performances of selected two-dimensional classifiers are included. (b) Performances of each individual run being part of the single-feature classifiers using the MAX, PP and LEW with weights of 0.5 and 1 (dots) in combination with mean values for all weights (lines).

Figure 6. Sea surface height anomaly from different classifiers along a typical descending CS-2 track from 6 March 2013. The shaded segment corresponds to the section shown in Fig. 2.

Figure 7. Histograms of grid cell SSH variance from different classifiers. Only values based on at least three lead detections are considered.

Figure 8. Lead fraction derived from CS-2 SAR mode (a) and from Röhrs et al. (2012) (b) on a north polar stereographic grid with a resolution of 99.5 km × 99.5 km, merged from January to March 2011. Only values based on at least 2000 CS-2 measurements north of 65° N (a) or with a grid cell data coverage of 10 % or more (b) are shown. Missing CS-2 estimates north of Canada are caused by the use of the SARIn mode in the Wingham Box.

Figure 9. Lead fraction derived from CS-2 SAR mode on a north polar stereographic grid with a resolution of 99.5 km × 99.5 km from February (a) and March 2013 (b). Only lead fraction values north of 65° N based on at least 1000 measurements are shown. Missing estimates north of Canada are caused by the use of the SARIn mode in the Wingham Box.

Figure 10. Apparent lead width distribution from all CS-2 SAR mode ocean measurements north of 65° N in the winter season (JFM) from 2011 to 2014. The distribution of a power law with an exponent of 2.47 is included for comparison, forming a straight line in a double logarithmic presentation. See text for the definition of the apparent lead width.

Table 1. Selected classifier performance. The last three classifiers are (from bottom to top): RI14, the classifier from Röhrs et al. (2012) and LX13. TL: true leads; FL: false leads; TI: true ice; FI: false ice; TLR and FLR: true and false lead rates (%); eTLR and eFLR: SDs of TLR and FLR within runs (%). A list of all tested classifier performances is provided in the Supplement.