Mapping avalanches with satellites – evaluation of performance and completeness


Abstract. The spatial distribution and size of avalanches are essential parameters for avalanche warning, avalanche documentation, mitigation measure design and hazard zonation. Despite its importance, this information is incomplete today and only available for limited areas and limited time periods. Manual avalanche mapping from satellite imagery has recently been applied to reduce this gap, achieving promising results. However, the reliability and completeness of these mappings have not yet been verified satisfactorily.
In our study we attempt a full validation of the completeness of visually detected and mapped avalanches from optical SPOT 6, Sentinel-2 and radar Sentinel-1 imagery. We examine manually mapped avalanches from two avalanche periods in 2018 and 2019 for an area of approximately 180 km² around Davos, Switzerland, relying on ground- and helicopter-based photographs as ground truth. For the quality assessment, we investigate the probability of detection (POD) and the positive predictive value (PPV). Additionally, we relate our results to conditions which potentially influence avalanche detection in the satellite imagery.
We statistically confirm the high potential of SPOT for comprehensive avalanche mapping for selected periods (POD = 0.74, PPV = 0.88) as well as the reliability of Sentinel-1 (POD = 0.27, PPV = 0.87) for which the POD is reduced because mainly larger avalanches are mapped. Furthermore, we found that Sentinel-2 is unsuitable for the mapping of most avalanches due to its spatial resolution (POD = 0.06, PPV = 0.81). Because we could apply the same reference avalanche events for all three satellite mappings, our validation results are robust and comparable. We demonstrate that satellite-based avalanche mapping has the potential to fill the existing avalanche documentation gap over large areas, making alpine regions safer.
Nevertheless, information on avalanche occurrence is only available for limited areas and time spans. This means that most avalanche events are not reported and therefore not captured in any database or cadaster, particularly not within poorly accessible regions.
Remote sensing technology is increasingly used to record and map avalanche occurrences with a consistent methodology and continuous spatial coverage over large regions. Optical data from airplanes and satellites with a high to very high spatial resolution (0.1-1.5 m) have been successfully used in the past to manually or semi-automatically map avalanches (Bühler et al., 2009; Lato et al., 2012; Eckerstorfer et al., 2016; Korzeniowska et al., 2017; Bühler et al., 2019). Optical data with a high to very high spatial resolution mostly have limited coverage and a low temporal resolution as they are usually available upon request only. Furthermore, they are often costly and depend on cloud-free conditions. Optical satellites under free and open data policies with a high temporal resolution but lower spatial resolution, like Sentinel-2, have only been tested briefly for snow avalanche detection or have been used to complement Sentinel-1 investigations (Nolting et al., 2018; Abermann et al., 2019). For the documentation of individual avalanche events, unmanned aerial systems (UASs) equipped with optical cameras can flexibly provide detailed information, but they are not able to cover larger regions (Eckerstorfer et al., 2016).
In the microwave spectrum, radar sensors operate independently of light and weather conditions. Radar sensors can detect the increased roughness (Oh et al., 1992) of the snow surface caused by avalanches (Eckerstorfer and Malnes, 2015; Leinss et al., 2020). Radar satellites, like RADARSAT, TerraSAR-X and Sentinel-1, have been successfully applied for avalanche mapping in various regions (Eckerstorfer and Malnes, 2015; Vickers et al., 2016; Eckerstorfer et al., 2017; Wesselink et al., 2017; Abermann et al., 2019; Leinss et al., 2020). Selective verification has shown that radar underestimates the avalanche activity to an unknown extent. Often only parts of the avalanches are mapped, and Sentinel-1 misses most small avalanches due to the limited spatial resolution (Leinss et al., 2020).
As consistent avalanche detection using satellite data is becoming increasingly important, the identification of its performance and reliability is essential. To do so, we assess the completeness of visually detected and manually mapped avalanches, using three different satellite sensors, which have recently been used to detect avalanches (e.g., Eckerstorfer and Malnes, 2015; Leinss et al., 2020; Bühler et al., 2019; Nolting et al., 2018; Abermann et al., 2019):
- optical SPOT 6, commercial, 1.5 m spatial resolution
- radar Sentinel-1, open access, 10 m spatial resolution
- optical Sentinel-2, open access, 10 m spatial resolution.
As validation data, we rely on photographs taken from the ground and from helicopters to document two extreme avalanche situations in 2018 and 2019, in Davos in eastern Switzerland. We compare the completeness of avalanches detected with the three sensors by answering the following two research questions:
1. Of the avalanches identified in the ground truth, how many were correctly detected in the satellite data by a human?
2. If a human visually detected an avalanche in the satellite data, how often was there an avalanche?
Furthermore, we investigate these findings in relation to conditions which potentially influence avalanche detection in satellite imagery. To do so, we consider the size of the mapped avalanches from all approaches, the illumination conditions in optical SPOT 6 (SPOT hereafter) data and the predominantly detected parts of the avalanches in radar data. Finally, we highlight the potential and the limitations of a well-established, multi-year data set of mapped avalanches as an existing data source for validation.
2 Area and data sets

Study area and validation period
Our study area of approximately 180 km² is located around Davos, Switzerland (Fig. 1). Of the total area, 25 % is considered avalanche release area according to the release area definitions introduced by Bühler et al. (2018). The study area comprises the main valley and parts of three inhabited side valleys (Flüela, Dischma, Sertig), as well as the surrounding mountains, and covers an elevation range from 1450 to 2981 m a.s.l. In January 2018, 93 % of the study area was covered by SPOT satellite imagery ordered for rapid mapping. The 7 % which was missed was excluded from our study and is shaded in red in Fig. 1. In January 2019 the entire area was covered by SPOT imagery. For the validation, we considered two periods with high avalanche activity (Fig. 2):
- From 13 to 16 January 2019, referred to as 2019. Following several snowfalls in the 2 weeks before the period, heavy snowfall brought about 100 cm of new snow in 60 h. This resulted in mostly dry-snow avalanches, some with a destructive powder blast.
In both situations, danger levels 4 (high) and 5 (very high) on the five-level ordinal European Avalanche Danger Scale (WSL, 2019) were forecast in the study area and avalanches of all sizes released.

Satellite data
High-spatial-resolution (1.5 m) SPOT satellite imagery was acquired on request after the two validation periods (Fig. 2, Table 1). From operationally acquired medium-spatial-resolution (10 m) Sentinel-1 (Table 2) and Sentinel-2 (Table 1) acquisitions we selected images from before and after the validation period.

Data preprocessing - optical data
We refrained from atmospheric corrections because they are not necessary for avalanche detection: atmospheric effects are relatively minor for most regions in winter, since the water content of the atmosphere is typically low (Nolin, 2010). SPOT data were cloud-free for both years. For Sentinel-2 in 2019, about 7 % of the validation area on the post-event image was hidden by clouds. Because we relied on manual mapping, we refrained from cloud preprocessing.

SPOT
SPOT imagery was delivered as type "primary" and pansharpened in full radiometric resolution (12 bit). The data were oriented using bundle block adjustment and orthorectified by swisstopo based on the high-quality terrain model swissALTI3D resampled to 5 m (swisstopo, 2018). In addition to automated tie-point generation, ground control points (GCPs) were digitized manually. The RMSE of the GCPs indicated a localization accuracy of better than 2 m in X and Y.

2.4 Data preprocessing - radar data

2.4.1 Sentinel-1
For processing, we followed the steps described in Leinss et al. (2020) but added local resolution weighting (LRW; Small, 2012) to optimize the spatial resolution and to minimize terrain shadow and layover effects. For LRW, two acquisitions from orbits with opposite view directions (ascending, looking east, and descending, looking west) were combined using a weighted average based on the local, terrain-dependent resolution of every pixel. Table 2 lists the set of pre- and post-event images used for the two avalanche periods in 2018 and 2019; a processing flowchart is shown in Appendix A. The coherent imaging method of the synthetic aperture radar (SAR) system requires some spatial averaging to reduce radar speckle and to improve the radiometric accuracy of the backscatter intensity. The native resolution of the single-look-complex (SLC) interferometric wide swath mode (IW) images of Sentinel-1 is about 3 × 23 m (slant range × azimuth), provided at a slant-range pixel spacing of 2.3 × 14.1 m (Bourbigot et al., 2016). To avoid loss of resolution we averaged (multi-looked) the images with a relatively small window of 2 × 1 pixels (range × azimuth). Then we averaged the backscatter intensity ($\beta_0$) of both polarizations, VV and VH, scaled in decibels to reduce the multiplicative speckle noise.
As LRW requires extremely precise geocoding on the subpixel level, we co-registered the measured backscatter intensity with the backscatter intensity $\beta_{0,\mathrm{sim}}$ simulated using the swissALTI3D elevation model (swisstopo, 2018) downsampled to a 30 m resolution. We then orthorectified (geometric terrain correction) the measured and simulated backscatter images, sampled at a slant range resolution of 4.6 × 14.1 m (corresponds to a resolution of 6.9 × 14.1 m when projected on horizontal terrain), to a 10 × 10 m pixel spacing on the ground. Bilinear interpolation steps during co-registration, orthorectification and collocation of orthorectified images slightly reduced the spatial resolution. The orthorectified radar images were then radiometrically terrain corrected (Small, 2011) with the simulated intensity ($\beta^{\mathrm{TC}}_0 = \beta_0 / \beta_{0,\mathrm{sim}}$) to remove the terrain-dependent illumination bias. LRW was applied to the backscatter signal of ascending (asc) and descending (des) acquisitions (in dB) using the simulated intensity as weight ($w_{\mathrm{asc}} = \beta_{0,\mathrm{sim,asc}}$, $w_{\mathrm{des}} = \beta_{0,\mathrm{sim,des}}$):

$$\beta^{\mathrm{TC,LRW}}_0 = \left( \beta^{\mathrm{TC}}_{0,\mathrm{asc}} / w_{\mathrm{asc}} + \beta^{\mathrm{TC}}_{0,\mathrm{des}} / w_{\mathrm{des}} \right) / \left( w_{\mathrm{asc}} + w_{\mathrm{des}} \right). \tag{1}$$

LRW optimizes the spatial resolution, which depends strongly on the local incidence angle (given by the local slope angle η) and the topography because of the slant imaging geometry of SAR sensors (see Appendix B). From the final LRW images, we estimated an effective resolution of about 15 × 25 m. Leinss et al. (2020) list reasons, in accordance with the work of Oh et al. (1992), why the relative brightness of avalanches is stronger for slopes facing away from the radar. As these slopes are weighted more strongly by LRW, LRW also enhances the visibility of avalanches. For avalanche segmentation we mapped areas which showed an increased radar backscatter signal in the difference between the pre- and post-event LRW images. To remove bias by changing snow properties (snow wetness), a 1 km high-pass filter was applied to the single-orbit and LRW difference images. Additionally, to suppress noise but to preserve spatially structured details, a nonlocal means filter (Buades et al., 2005; Condat, 2010) was applied to the LRW difference image.
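The change-detection step described above (difference of post- and pre-event backscatter in dB, followed by a high-pass filter that removes large-scale changes such as snow wetness) can be illustrated with a minimal sketch. It is not the processing chain used in the study: the function names are hypothetical, the images are plain nested lists, and a moving box mean is assumed as the low-pass component because the filter kernel is not specified in the text. With 10 m pixels, a 1 km window would correspond to a half-width of roughly 50 pixels.

```python
def box_mean(img, half):
    """Moving average over a (2*half+1)^2 window, truncated at the image edges."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            r0, r1 = max(0, r - half), min(rows, r + half + 1)
            c0, c1 = max(0, c - half), min(cols, c + half + 1)
            window = [img[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            out[r][c] = sum(window) / len(window)
    return out


def highpass_difference(pre_db, post_db, half):
    """Post-minus-pre backscatter difference (dB) with the local mean removed.

    Assumption: the high-pass filter is approximated as
    'difference minus box mean of the difference'.
    """
    diff = [[b - a for a, b in zip(row_pre, row_post)]
            for row_pre, row_post in zip(pre_db, post_db)]
    low = box_mean(diff, half)
    return [[d - l for d, l in zip(row_d, row_l)]
            for row_d, row_l in zip(diff, low)]


# Toy example: a uniform +2 dB change (e.g. wetter snow) plus one bright pixel.
pre = [[0.0] * 5 for _ in range(5)]
post = [[2.0] * 5 for _ in range(5)]
post[2][2] = 7.0
filtered = highpass_difference(pre, post, half=2)
```

In this toy case the uniform 2 dB offset is largely suppressed, while the bright pixel in the center remains clearly positive, which is the behavior the high-pass step is meant to provide.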

Methods
To compare the different mapping methods, we proceeded in five steps (Fig. 3) which are detailed in the sections below:
1. Avalanches were visually detected or mapped based on the satellite data (Sect. 3.1); furthermore, we extracted mapped avalanches from an existing database, the Davos avalanche mapping project (DAvalMap; Sect. 3.3).
2. The ground-truth data set was compiled from ground and helicopter photographs (Sect. 3.2).
3. Validation points were defined to mark locations where the existence or non-existence of avalanches was examined. Consequently, validation points were created for all avalanches visible on ground-truth photographs and, in addition, for all locations where at least one of the visual mapping methods indicated an avalanche. Through bidirectional comparison of ground truth and mapped avalanches (Sect. 3.4), properties like true or false positives were assigned to the validation points (a full list of assigned attributes is given in Appendix C).
4. Validation points located in areas not covered by the ground truth or before or after the validation period were removed (Sect. 3.5).

Visual detection of avalanches based on satellite data
For each of the three satellite image sources (Sect. 2.2), a different avalanche expert visually inspected the satellite images to detect and map features representing avalanches. We ensured that the person mapping avalanches was familiar with the respective data source, as in our experience a trained person achieves better results than someone without the specific training. Furthermore, with a different person mapping the avalanches for each data source, we prevented information about the presence of avalanches from leaking from one mapping method to another.

SPOT (SPOT mapping)
We took advantage of the false-color band combination (green, red and near-infrared (NIR) bands), which includes the NIR band, where the reflectance of snow is lower (Warren, 1982). The mapping followed a previously published methodology: avalanches were identified and digitized as polygons from optical images (Fig. 4a). To improve visibility, image stretching and gamma optimization, as well as modifications of contrast and brightness for separate outline digitization in sunlit and shaded areas, were applied. Additional data like the Swiss Map Raster 25 (swisstopo, 2020b), the summer orthophoto mosaic SWISSIMAGE 25 cm (swisstopo, 2020a) and the layer "Slope angle over 30°" calculated from the swissALTI3D (swisstopo, 2018) were used for interpretation. The mapping was performed as part of two verification campaigns following the avalanche-active periods in 2018 and 2019 (Bründl et al., 2019; Zweifel et al., 2019), conducted for a much larger area than our study area. Of all mapped avalanches, 486 are located in our study area (2018: 368, 2019: 118).

Sentinel-2 (S2 mapping)
S2 mapping relied on false-color composite (green, red and NIR) images (Fig. 4b). For identification of avalanches, the post-event image was searched for identifiable avalanche features. Additionally, the pre-event image was consulted to identify changes (e.g., in forest) that might be connected to avalanches. As supplementary information, the SWISSIMAGE 25 cm (swisstopo, 2020a) was used. Avalanches were marked as points because the outline could not be meaningfully identified at the spatial resolution of S2. In total 44 points identifying avalanches were created (2018: 34, 2019: 10). In 2019 about 7 % of the validation area was hidden by clouds in the Sentinel-2 image. In total 15 avalanches were hidden in those spots with regard to ground truth (size 1: 4, size 2: 9, size 3 and size 5: 1 each). Statistics for Sentinel-1 in Sect. 4 were calculated accordingly.

Figure 3. (1) Avalanches were mapped from satellite imagery and extracted from the DAvalMap database. (2) Ground-truth data were compiled. (3) In the validation process (Sect. 3.4), symbolized by the orange arrows, validation points were created and assigned attributes (see also Appendix C). (4) Points representing avalanches from before or after the validation period or outside ground truth were removed. The remaining validation points were used for analysis.

Sentinel-1 (S1 mapping)
For S1, avalanche polygons were mapped using the backscatter difference images (Sect. 2.2, Fig. 4c). In uncertain cases (e.g., to remove bright pixels caused by changes in man-made objects), the radiometrically terrain-corrected RGB backscatter composites (Fig. 4c) were considered for reference. As shown in the processing graph (Appendix A), avalanches were manually detected based on the apparent visual brightness and the shape and size of bright pixels. No pre-defined threshold was used as the mapping was performed manually. In total 125 avalanche polygons were created (2018: 46, 2019: 79).

Ground truth
As ground truth, we relied on over 900 photographs taken before and after the two avalanche periods (Fig. 2). Photographs were taken from the valley floor or from locations within the three ski areas by the interns of the avalanche warning service. Additionally, helicopters were used to document the exceptional avalanche activity. To avoid a bias from ground truth, we did not analyze the ground truth before finalizing the satellite mappings and the Davos avalanche mapping (Sect. 3.3). With plain photographs as ground truth, we could validate the existence of avalanches, albeit not the accuracy of outlines as the photographs were not orthorectified. Due to limited terrain visibility, our ground truth showed gaps in both validation periods. In these gaps no validation was possible (Fig. 5). Still, the available data allowed for validation of the majority of avalanches for each period as ground truth was available for 84 % of the perimeter in 2018 and for 74 % in 2019.

Avalanche size
To relate the mapping results to avalanche size, we classified the avalanches at the validation points. Two raters assigned avalanche size independently of each other using the ground-truth photographs. Avalanches were given one of five ordinal size classes (size 1: small, size 2: medium, size 3: large, size 4: very large, size 5: extremely large) according to the classification defined by the European Avalanche Warning Services (EAWS, 2020) or were assigned as "unknown" if the size could not be determined. The sizes assigned by the two raters corresponded well (Cohen's κ = 0.84, considered an "almost perfect agreement"; Cohen, 1968; Landis and Koch, 1977). For the 56 cases in which the assigned avalanche size differed (2018: 37, 2019: 19), the two raters discussed the size classification to assign a unique size. For 79 % of avalanches one of the size classes could be assigned; the remaining 21 % of the avalanches were classified as size unknown.

Figure 4 (caption, partial): ... and 4 avalanches were confirmed by ground truth, whereas at validation point 3 no avalanche exists in the ground truth (S1 false positive). For validation points 1-3, each mapped avalanche corresponds to a single validation point (one-to-one join), whereas for validation point 4, multiple S1 polygons correspond to a single validation point (one-to-many join) generated from ground truth (map source: Federal Office of Topography).
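To make the inter-rater agreement between the two size raters more concrete, the following minimal sketch computes Cohen's κ from two lists of size labels. It shows the plain (unweighted) form; Cohen (1968), cited above, introduced a weighted variant, and which variant was applied in the study is not stated here. The function name and the example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long sequences of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(rater_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical size labels (1-5) from two raters for eight avalanches.
sizes_a = [1, 2, 2, 3, 3, 3, 4, 5]
sizes_b = [1, 2, 3, 3, 3, 3, 4, 5]
kappa = cohens_kappa(sizes_a, sizes_b)
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance; the unweighted form treats a disagreement of one size class the same as a disagreement of four classes, which is why weighted variants are often preferred for ordinal scales.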

Davos avalanche mapping project (DAvalMap) -a ground-truth alternative
Since the winter of 1949/50, avalanches occurring in the region of Davos have been mapped. To obtain a high-quality avalanche inventory, the national avalanche warning service, located at the Institute for Snow and Avalanche Research (SLF) in Davos, cooperates with the rescue services of the ski areas and the council's avalanche warning service to document avalanches. The area of the DAvalMap covers about 180 km² and corresponds to the study area described in Sect. 2.1. Great efforts are made to obtain an avalanche inventory that is as complete as possible. However, missed avalanches and uncertain release dates may occur, particularly during prolonged storms with limited visibility or due to a limited view of the more remote parts of the region. Avalanches are recorded in the DAvalMap if they extend at least 50 m in one direction (width or length) for slab or glide-snow avalanches, or if they are at least 100 m long for loose-snow avalanches. Generally, avalanches are documented by photographs taken in the field, and, at a later stage, their approximate outlines are mapped manually by the avalanche warning intern.
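The recording criterion described above can be written as a small rule, which may help when comparing DAvalMap entries with other inventories. This is only a sketch of the stated thresholds; the function name, the type labels and the argument names are hypothetical and do not reflect the actual DAvalMap database schema.

```python
def meets_davalmap_minimum(avalanche_type, width_m, length_m):
    """Return True if an avalanche reaches the minimum size for recording.

    Thresholds as stated in the text: 50 m in at least one direction for
    slab or glide-snow avalanches, 100 m length for loose-snow avalanches.
    """
    if avalanche_type in ("slab", "glide-snow"):
        return max(width_m, length_m) >= 50.0
    if avalanche_type == "loose-snow":
        return length_m >= 100.0
    raise ValueError(f"unknown avalanche type: {avalanche_type!r}")
```

For example, a slab avalanche that is 60 m wide and 30 m long would be recorded under this rule, whereas a loose-snow avalanche that is 60 m wide and 80 m long would not.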
The Davos avalanche mapping (DAvalMap) data set is especially meaningful as it provides one of the rare data sets where avalanches have been mapped as comprehensively as possible for decades. The DAvalMap data set has been used in several studies, e.g., for validation of the avalanche forecast, as input to model wet-snow avalanche occurrence and runout distance, or to derive terrain characteristics describing potential release areas (e.g., Schweizer et al., 2003; Wever et al., 2018; Bühler et al., 2018; Harvey et al., 2018).
The properties of this data set make it a potential candidate to validate avalanches detected in, e.g., remote sensing time series. However, currently information about the quality and particularly the completeness of this data set is missing; therefore we include it in the analysis and compare the DAvalMap with the ground-truth data set.

Validation points
As our ground truth does not cover the validation area completely (Fig. 5), we had to examine it twice. First, we identified avalanches in the ground truth (positives) and created validation points, which we then matched with the mapped avalanches to classify them as true positives or false negatives. Second, we checked whether the remaining unmatched avalanches (from the examined methods) were covered by ground-truth photographs that proved them to be false detections (false positives). If no ground-truth photograph was available, or where a human interpreter could not identify an avalanche on the ground truth with sufficient certainty, the mapping was classified as unknown. Properties were assigned to each validation point describing which method detected an avalanche at the specific location (see also Appendix C).
We placed no validation points in locations where no avalanche was detected, even though the detection of nonevents (true negatives) would have been correct. Validation points were placed either inside the area of the avalanche visible on the ground truth or, in the case the ground truth showed no avalanche or no ground truth was available, somewhere within the avalanche polygon of the corresponding mapping method. For matching locations, avalanches detected in the mapping methods had to be assigned to ground-truth validation points. In most cases, a single avalanche, i.e., an outline (SPOT, S1) or a point (S2), was assigned to one validation point (Fig. 6a and validation points 1-3 in Fig. 4d). However, as sometimes one avalanche was mapped with a single polygon by one method but split up into several polygons (or points) by another method, we allowed for one-to-many and many-to-one joins (Fig. 6b and c and validation point 4 in Fig. 4d). A one-to-many join means one validation point is linked to multiple avalanche polygons, whereas a many-to-one join links one avalanche polygon to several validation points, both with respect to ground truth.

Figure 6. Illustration of a one-to-one join (a), a one-to-many join (b) and a many-to-one join (c) used to assign the avalanches mapped by the different methods to the validation points.

In total, we created 733 validation points (2018: 536, 2019: 197). Of these, the 181 points classified as unknown were omitted from further analysis (2018: 131, 2019: 50). Orbit revisit times restricted the image acquisition times, which differed by a few days as shown in Fig. 2. Therefore, it is possible that avalanches were mapped which had occurred before or after the validation period given by field photographs taken before and after the event. To remove them, 50 additional points were excluded (2018: 48, 2019: 2). This allowed us to use in total 502 (68.5 %) of all validation points (2018: 66.6 %, 2019: 73.6 %) for performance evaluation.
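The join types defined above can be derived mechanically once every mapped polygon has been linked to its validation point. The following sketch assumes a simple list of (validation point, polygon) pairs for one mapping method; the function name and the identifiers are hypothetical, and the actual assignment in the study was done manually against the ground truth.

```python
from collections import defaultdict


def classify_joins(links):
    """Label each validation point by join type.

    links: iterable of (validation_point_id, polygon_id) pairs.
    one-to-many: one validation point linked to several polygons.
    many-to-one: one polygon linked to several validation points.
    """
    polygons_per_point = defaultdict(set)
    points_per_polygon = defaultdict(set)
    for point, polygon in links:
        polygons_per_point[point].add(polygon)
        points_per_polygon[polygon].add(point)

    join_type = {}
    for point, polygons in polygons_per_point.items():
        if len(polygons) > 1:
            join_type[point] = "one-to-many"
        elif len(points_per_polygon[next(iter(polygons))]) > 1:
            join_type[point] = "many-to-one"
        else:
            join_type[point] = "one-to-one"
    return join_type


# Hypothetical links: point 4 is covered by two polygons,
# polygon "e" spans the two ground-truth avalanches 5 and 6.
links = [(1, "a"), (2, "b"), (4, "c"), (4, "d"), (5, "e"), (6, "e")]
joins = classify_joins(links)
```

A point that is linked to several polygons, one of which also covers another point, is labeled one-to-many in this sketch; such mixed cases would need a separate rule if they occur.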

Statistical measures
To assess the detection performance of each mapping method, we calculated two statistical measures, which are based on standard 2 × 2 contingency tables (Table 3).
To determine how many of the avalanches identified in the ground truth were correctly detected in the satellite data by a human (research question 1), we calculated the probability of detection (POD), also called detection rate (adapted from Trevethan, 2017).
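As an illustration of the two measures, the sketch below computes POD and PPV from the cells of a 2 × 2 contingency table using their standard definitions, POD = TP / (TP + FN) and PPV = TP / (TP + FP). The class and function names are hypothetical. The POD example uses the SPOT counts reported in Sect. 4 (221 of 298 avalanches detected); the counts in the PPV example are made up for illustration and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContingencyCounts:
    tp: int  # mapped and confirmed by ground truth
    fn: int  # present in ground truth but not mapped
    fp: int  # mapped but not present in ground truth


def probability_of_detection(c):
    """Share of ground-truth avalanches that were mapped (research question 1)."""
    if c.tp + c.fn == 0:
        raise ValueError("no ground-truth avalanches")
    return c.tp / (c.tp + c.fn)


def positive_predictive_value(c):
    """Share of mapped avalanches confirmed by ground truth (research question 2)."""
    if c.tp + c.fp == 0:
        raise ValueError("no mapped avalanches")
    return c.tp / (c.tp + c.fp)


# POD with the SPOT counts from the text; fp is irrelevant for POD and set to 0.
spot = ContingencyCounts(tp=221, fn=298 - 221, fp=0)
spot_pod = round(probability_of_detection(spot), 2)

# PPV with purely hypothetical counts.
toy = ContingencyCounts(tp=8, fn=2, fp=2)
toy_ppv = positive_predictive_value(toy)
```

True negatives do not enter either measure, which fits the validation design: no validation points were placed where neither the ground truth nor any mapping method showed an avalanche.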

Location-specific detection
Avalanche illumination conditions in optical imagery. Cast shadow on slopes has been observed to make avalanches difficult to detect in optical satellite imagery (Leinss et al., 2020). Calculations using a digital elevation model (DEM) and the specific azimuth and altitude at image acquisition have shown that 65 % (61 %) of the investigated perimeter was illuminated at the time of SPOT image acquisition in 2018 (2019). To show the effect of this, the SPOT avalanches were visually checked to see if the avalanche visible on ground truth was in fully illuminated, partly illuminated (at least one-fifth of the area shaded or illuminated) or fully shaded parts of the SPOT imagery.
Partial detection of avalanches by radar. Among others, Leinss et al. (2020) and Abermann et al. (2019) pointed out that radar is more likely to detect the deposit area of avalanches, whereas the release area and the avalanche track could often be missed. To quantify this characteristic of avalanche detection by radar, we used the large number of avalanche polygons derived from Sentinel-1 in combination with the ground-truth photographs to estimate which part of an avalanche is covered by the S1 avalanche polygon. For that we considered the upper third of the ground-truth avalanche as release area, the middle part as avalanche track and the lower third as the deposit area. Each part mapped by S1 was added to the properties of the corresponding validation point. Then we calculated the POD for the subset of S1 avalanches which contained only one of the three properties of deposit, track and release area.
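The division into release area, track and deposit can be sketched as follows. The example assumes that the extent of an S1 polygon is expressed as a normalized interval along the ground-truth avalanche path (0 = top of the avalanche, 1 = lower end); this parameterization and the function name are assumptions made for illustration, whereas in the study the covered parts were estimated visually from the photographs.

```python
def avalanche_parts_covered(start, end):
    """Return which thirds of an avalanche a mapped polygon overlaps.

    start, end: normalized positions along the avalanche path,
    with 0 at the top (release) and 1 at the lower end (deposit).
    """
    if not 0.0 <= start < end <= 1.0:
        raise ValueError("expected 0 <= start < end <= 1")
    thirds = {
        "release": (0.0, 1 / 3),  # upper third
        "track": (1 / 3, 2 / 3),  # middle third
        "deposit": (2 / 3, 1.0),  # lower third
    }
    # A part counts as covered if the polygon interval overlaps it.
    return {name for name, (lo, hi) in thirds.items() if start < hi and end > lo}


only_deposit = avalanche_parts_covered(0.7, 1.0)
track_and_deposit = avalanche_parts_covered(0.5, 1.0)
```

Validation points whose returned set contains exactly one element would correspond to the subset for which the POD per avalanche part is calculated.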

Results
We performed the following analyses:
1. POD per avalanche size for each mapping method
2. POD and PPV of avalanches ≥ size 2 for each mapping method
3. POD dependence on illumination for the SPOT mapping
4. effects of partial avalanche detection in S1 mapping on the POD and PPV
5. implications of validation with other data as ground truth.
According to the ground truth, 445 avalanches occurred in the two validation periods (2018: 318, 2019: 127). The resulting size distributions are shown in Fig. 7. Except for size-1 avalanches, which we believe are underrepresented in the ground truth, the observed size distributions agree with magnitude-frequency distributions observed in other avalanche data sets (e.g., Faillettaz et al., 2004; Schweizer et al., 2020).

Avalanche detection rate per avalanche size (satellite methods)
Only the SPOT mapping approach detected all size-4 and size-5 avalanches (Fig. 8a). The capabilities of the S1 mapping to detect the largest avalanches followed closely with 90 % of size-4 avalanches and all size-5 avalanches detected in 2019 (Fig. 8b). By contrast, the S2 mapping only identified 29 % of size-4 avalanches and none of the size-5 avalanches in 2019 (Fig. 8c). As Fig. 8a-d illustrate, all satellite methods show a declining ability to map avalanches with decreasing size. This decline is more pronounced for the S1 than for the SPOT mapping. The S2 mapping identified very few avalanches altogether, especially for 2019.

Detection statistics of the satellite mapping methods (POD and PPV, size ≥ 2)
Size-1 (small) avalanches are unlikely to cause damage or bury a person. Furthermore, they were probably also missed more frequently in the ground-truth data. Therefore, in the following, we exclude size-1 avalanches and avalanches of unknown size and limit the analysis to the 298 avalanches confirmed by ground truth and classified as size 2 to size 5. Avalanches of size 2 to 5 were confirmed at 298 of the remaining 355 validation points (84 %), indicating that in 57 locations at least one of the methods falsely detected an avalanche. Considerable variations in the performance of the three satellite mapping approaches are noted:
- The probability of detecting an avalanche (POD), given its presence in the ground truth, varied greatly between methods (Table 4). Avalanches were most reliably detected by the SPOT mapping approach with 221 out of 298 detected avalanches (POD = 0.74), while the S1 mapping missed almost three-quarters of the size-2 to size-5 avalanches (POD = 0.27). Performance was extremely poor for S2 (POD = 0.06), highlighting that visual avalanche detection is nearly impossible in S2 data.
- The positive predictive value (PPV), the proportion of true positive avalanches to all avalanches mapped by a specific method, was greater than 0.8 for all methods (Table 4). Again, performance was best for SPOT (PPV = 0.88) and lowest for S2 with a PPV of 0.81, indicating that between one in five (S2) and one in nine (SPOT) mapped avalanches were false alarms.
- Comparing the performance between the two validation periods showed that the SPOT mapping is the most reliable one of the satellite-based methods with both performance metrics being similar in 2018 and 2019. The S1 mapping, in contrast, shows bigger differences between the two validation periods, with the POD being clearly lower in 2018 (POD = 0.17, mixed-snow conditions) compared to 2019 (POD = 0.52, dry-snow conditions), at least partly due to the larger occurrence of size-2 and size-3 avalanches in 2018 (Figs. 7 and 8).

Table 5. Percentage of validation points where the specified joins were applied for the SPOT and S1 mapping. For each method only the avalanches mapped and validated (true positives) were considered.

        One-to-one   One-to-many   Many-to-one
SPOT    88 %         3 %           9 %
S1      76 %         14 %          10 %
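The two metrics used above follow directly from counts of true positives, false negatives and false positives. The sketch below reproduces the SPOT POD from the counts stated in the text (221 of 298); the false-positive count used for the PPV is a hypothetical value chosen for illustration only, since the underlying counts are not given in this section.

```python
def pod(true_positives, false_negatives):
    """Probability of detection: detected share of ground-truth avalanches."""
    return true_positives / (true_positives + false_negatives)


def ppv(true_positives, false_positives):
    """Positive predictive value: confirmed share of mapped avalanches."""
    return true_positives / (true_positives + false_positives)


# SPOT counts stated in the text: 221 of 298 avalanches detected
spot_pod = pod(221, 298 - 221)

# Hypothetical false-positive count, for illustration only
example_ppv = ppv(221, 30)

# Share of mapped avalanches that are false alarms
false_alarm_share = 1 - example_ppv
```

The false-alarm share is simply the complement of the PPV, which is how a statement such as "one in N mapped avalanches was a false alarm" can be derived.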
As illustrated in Fig. 6 (Sect. 3.4), mapped avalanche outlines did not always correspond to one validation point from ground truth; hence one-to-many and many-to-one joins were allowed (Fig. 6). Considering the SPOT and S1 methods only, the proportion of one-to-one joins was lower for S1 (76 %) compared to SPOT (88 %, Table 5). One-to-many joins, i.e., multiple detected avalanche patches corresponding to one avalanche in the ground truth, were comparably frequent for S1 (14 %) and rare for SPOT (3 %).
However, allowing one-to-many and many-to-one joins impacts results in two ways: firstly, in terms of the correspondence between the number of features detected and the number of avalanches they represent, and secondly it influences the calculated performance metrics (POD and PPV). For instance, a method for which a high number of one-to-many joins were made (here S1) overestimates the total number of avalanches while it increases both the POD and the PPV (Appendix E). In contrast, a method characterized by a high number of many-to-one joins tends to underestimate the number of avalanches assuming a one-to-one translation between detected features and avalanches. Furthermore, many-to-one joins will decrease both the POD and the PPV. As overall the effects of one-to-many joins are more relevant for S1 and the effects of many-to-one joins are more relevant for SPOT (see Appendix E), the results will diverge if performance is evaluated neglecting joins based on ground truth. This is caused by an artificially increased POD and PPV for S1 and a decreased POD and PPV for SPOT (see also Appendix E).
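The direction of these effects can be illustrated with a small sketch. It assumes one particular bookkeeping, which may differ from the computation in Appendix E: when one-to-many joins are neglected, every additional patch counts as an additional detected avalanche, and when many-to-one joins are neglected, every additional ground-truth avalanche sharing a polygon is no longer counted as detected. All names and counts are hypothetical.

```python
def pod_ppv(tp, fn, fp):
    """Return (POD, PPV) from true positives, false negatives, false positives."""
    return tp / (tp + fn), tp / (tp + fp)


def neglect_joins(tp, fn, fp, extra_patches=0, merged_avalanches=0):
    """Recount true positives as if joins had not been resolved.

    extra_patches: patches beyond the first in one-to-many joins; each is
        counted as an additional detected avalanche (assumption).
    merged_avalanches: ground-truth avalanches beyond the first in
        many-to-one joins; each is no longer counted as detected (assumption).
    """
    tp_new = tp + extra_patches - merged_avalanches
    return pod_ppv(tp_new, fn, fp)


# Hypothetical counts, not taken from the study
base = pod_ppv(80, 220, 12)
no_one_to_many = neglect_joins(80, 220, 12, extra_patches=15)
no_many_to_one = neglect_joins(80, 220, 12, merged_avalanches=8)
```

Under these assumptions, neglecting one-to-many joins raises both metrics relative to `base`, and neglecting many-to-one joins lowers both, which matches the direction described in the text.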

Effect of cast shadow on mapping from optical SPOT data
The detection rate using SPOT images depends strongly on whether the avalanche is located on a well-illuminated slope or in the cast shadow of surrounding mountains. The 221 avalanches correctly detected with SPOT mapping can be split into the following three categories: in fully illuminated slopes, 127 of 147 avalanches were detected (POD = 0.86).
This indicates a low detection rate for avalanches located fully in the cast shadow.
Calculations relying on a DEM, sun azimuth and sun altitude have shown that 35 % (39 %) of the investigated perimeter was shaded at the time of SPOT image acquisition on 24 January 2018 (16 January 2019). Examining the evolution of illuminated and shaded areas from 21 October to 21 April (Fig. 9), the shaded areas peak with 43 % of the perimeter on 21 December. Examining the results in Table 4, mapping results for 2019 were slightly better than for 2018 even though 4 % more of the validation area was shaded. In the light of these insights, the expected performance values for SPOT might be slightly worse than presented in Table 4 from the middle of December until the middle of January but are significantly better before mid-December and after mid-January. The given evolution of shaded and illuminated areas depends on sun azimuth and sun elevation, for which our results are comparable to other parts of the Alps, although locally modified by terrain. At higher latitudes the amount of shaded terrain will be considerably larger than in our case, and we therefore expect the POD to decrease, possibly to values as low as on our shaded slopes.
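The principle behind such a shade calculation can be sketched as a simple ray march over a gridded DEM: a cell is shaded if, looking towards the sun, some terrain cell rises above the sun ray. This is a minimal illustration under assumptions and not the ArcGIS "hillshade" function used for Fig. 9; the function name, the nearest-cell sampling and the grid orientation (row 0 is the northernmost row, columns increase eastwards) are choices made here.

```python
import math


def shaded_fraction(dem, cell_size, sun_azimuth_deg, sun_altitude_deg):
    """Share of DEM cells in cast shadow for a given sun position.

    dem: list of rows of elevations (m); row 0 is assumed northernmost.
    sun_azimuth_deg: clockwise from north; sun_altitude_deg: above horizon.
    """
    rows, cols = len(dem), len(dem[0])
    az = math.radians(sun_azimuth_deg)
    d_col = math.sin(az)    # eastward component of the step towards the sun
    d_row = -math.cos(az)   # northward means decreasing row index
    tan_alt = math.tan(math.radians(sun_altitude_deg))
    shaded = 0
    for r in range(rows):
        for c in range(cols):
            step = 1
            while True:
                ri = round(r + d_row * step)
                ci = round(c + d_col * step)
                if not (0 <= ri < rows and 0 <= ci < cols):
                    break  # left the grid without hitting terrain: illuminated
                ray_height = dem[r][c] + step * cell_size * tan_alt
                if dem[ri][ci] > ray_height:
                    shaded += 1  # terrain blocks the sun ray
                    break
                step += 1
    return shaded / (rows * cols)


# Hypothetical 1 x 5 profile with a 100 m wall at the eastern end, 10 m cells
profile = [[0, 0, 0, 0, 100]]
morning = shaded_fraction(profile, 10, 90, 45)    # sun in the east
evening = shaded_fraction(profile, 10, 270, 45)   # sun in the west
```

With the sun in the east, the four cells west of the wall lie in its shadow; with the sun in the west, nothing blocks the rays. A production calculation would additionally need to handle map projection, grid edges and sub-cell sampling.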

Partial avalanche detection in the S1 mapping
Avalanche polygons mapped in S1 data often show, in comparison to SPOT data (Fig. 4c vs. 4a), multiple patches. These patches correspond to a single avalanche because the joining parts between the visible patches show too little contrast in S1 imagery. The existence of multiple patches causes a discrepancy between the number of S1 avalanche polygons and the number of avalanches from ground truth and leads to the considerable share of 14 % of one-to-many joins (Table 5), whereas this share is relatively low (3 %) for SPOT. However, in Fig. 10 we observed that the detectability of different avalanche patches depends on their relative location, i.e., on which part of the avalanche the patches belong to. According to the analysis described in Sect. 3.6 we found that the total POD of 0.27 (Table 4) is reduced to a POD of 0.22 when only the deposit area is considered, as done before by several authors (Abermann et al., 2019; Eckerstorfer et al., 2017; Lato et al., 2012). On the one hand, this corresponds to the major part (75 %) of avalanches detected by radar (Fig. 10). On the other hand, however, 52 % of the radar-mapped avalanches also included parts of the avalanche track and 35 % even included the release area. This, in turn, confirms the supposed ability to detect primarily the deposit area but highlights the importance of also mapping the release area and the avalanche path to obtain a better POD.

Figure 9. Evolution of the share of illuminated and shaded areas from 21 October to 21 April (exemplarily shown for the winter 2018/19 for the perimeter 2019). The share was calculated using the function "hillshade" (with cast shadows) in ArcGIS for every 6th and 21st day of the months relying on a DEM with a 2 m resolution, sun azimuth and sun altitude (for SPOT acquisition time at 10:00 UTC for Davos from http://sonnenverlauf.de, last access: 18 February 2021). The values between the 6th and 21st of each month were interpolated.

Figure 10. Percentage of radar-detected avalanches where a feature was detected in the deposit, track or release area of avalanches confirmed by ground-truth photographs. The error bars indicate the respective proportions for the two validation periods.

Validation with less complete avalanche data sets (i.e., DAvalMap)
In the following, we analyze the influence of relying on less complete data sets, like the DAvalMap, SPOT or S1, for validation on performance metrics of other satellite mapping methods. In particular the DAvalMap seems to be a promising candidate that could be used as a validation data set in the future, considering the high PPV of 0.93, indicating a high reliability that mapped features in fact correspond to avalanches (Table 4). However, the POD was considerably lower with only about half of the size-2 to size-5 avalanches being detected (POD = 0.56). Similarly to the satellite mapping methods, the detection rate decreased strongly with decreasing avalanche size (Fig. 8d). Furthermore, considerable variation in the POD between the 2 years was noted (2018: 0.46; 2019: 0.82). Performance metrics are generally more satisfactory for 2019, indicating a dependence on the person mapping.
Recalculating the POD and PPV relying on the DAvalMap as ground truth for SPOT inevitably affected the PPV strongly; the PPV decreased from 0.88 to 0.59. The comparably large number of SPOT true-positive avalanches, considered false alarms according to DAvalMap, explains this. In contrast, the influence on the POD is comparably small (0.74 to 0.78), as SPOT also detected many of the avalanches detected in the DAvalMap.
If we use the SPOT mapping as ground truth for the S1 mapping, the POD decreases slightly from 0.27 to 0.24 with the PPV dropping from 0.87 to 0.73. Doing it the other way around, using the S1 mapping as ground truth for the SPOT mapping, the POD remains almost the same (0.74 to 0.73). In contrast, the PPV decreases from 0.88 to 0.24, caused by the large number of apparent false-positive avalanches found in the SPOT but missed by the S1 mapping.
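Why an incomplete reference mainly hurts the PPV can be seen by treating each mapping as a set of validation-point IDs and swapping the reference set. The sketch below is illustrative only; the function name and all IDs are hypothetical and do not reproduce the study data.

```python
def pod_ppv_against(mapped, reference):
    """POD and PPV of a mapping relative to a chosen reference set of IDs."""
    hits = len(mapped & reference)
    return hits / len(reference), hits / len(mapped)


# Hypothetical validation-point IDs, not the study data
ground_truth = set(range(1, 21))                 # 20 confirmed avalanches
dense_mapping = set(range(1, 16)) | {101, 102}   # finds 15, plus 2 false alarms
sparse_mapping = {1, 2, 3, 4, 5, 103}            # finds 5, plus 1 false alarm

# Dense mapping evaluated against the independent ground truth
dense_vs_truth = pod_ppv_against(dense_mapping, ground_truth)

# Same mapping evaluated against the sparse mapping used as reference
dense_vs_sparse = pod_ppv_against(dense_mapping, sparse_mapping)
```

In this made-up case the POD of the dense mapping changes little when the sparse mapping serves as reference, whereas its PPV collapses because correct detections that the sparse mapping missed are counted as false alarms.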

Comparison of mapping approaches
We explored three mapping methods: optical high-resolution SPOT, optical lower-resolution Sentinel-2 (S2) and radar-based Sentinel-1 (S1). Of these methods, the 1.5 m resolution optical SPOT mapping achieved the best results for the POD (0.74) and PPV (0.88; Table 4). It can detect avalanches of all sizes (Fig. 8a). The ∼ 10 m resolution S1 mapping, in contrast, performs well for the identification of larger avalanches (size 4 or 5), but the overall POD is significantly lower (0.27) than for the SPOT mapping mainly because the majority of size-2 and size-3 avalanches, which represent the largest number of all avalanches, were missed. The PPV of S1 (0.87) is in a similar range to that of SPOT.
Another quality aspect, which highlights SPOT's mapping potential, is the high percentage of one-to-one joins (88 %), indicating that the number of features detected by SPOT shows a closer correspondence with the actual number of avalanches compared to S1 (one-to-one joins: 76 %). Compared to a joining of avalanches and validation points based on ground truth, neglecting one-to-many and many-to-one joins affects performance values leading to an over- and underestimation of the POD and PPV, respectively. This effect has to be considered if different mapping methods are compared in the future without using ground truth. The results of the S2 mapping are poor with only 1 in 17 avalanches detected (POD of 0.06). We can therefore not recommend S2 for avalanche detection. Summarizing, high values of the POD and PPV, in combination with a high proportion of one-to-one joins, make a mapping with SPOT recommendable. However, in the two situations explored, conditions were optimal for SPOT: the day immediately after the period of interest was cloud-free and satellite images could be obtained. This dependence of optical sensors on cloud-free conditions is the biggest disadvantage of the SPOT method. Additionally, SPOT data are costly and only available upon request. Our investigations (Sect. 4.2.1) have shown that the POD is significantly lower in shaded areas for avalanche mapping in SPOT imagery. As it is more probable for larger avalanches that part of the avalanche will be illuminated due to their longer runout distance, the probability of detection for smaller avalanches will be more affected by this. The proportion of terrain shaded depends on the time of the year, i.e., sun azimuth and sun altitude at acquisition time and the terrain investigated. In our study area the fraction of shade varies between 1 % and 43 % during the winter season.
However, we could not find any significant detection performance differences in the SPOT imagery from 24 January 2018 and 16 January 2019 where the fraction of shade differed by only 4 %.
Although the Sentinel-1 mapping achieved a considerably lower POD than SPOT, S1 permits observations independent of weather and light conditions. Furthermore, S1 data are free of charge and are operationally available (Table 6). Among others, Eckerstorfer et al. (2017) have focused on the mapping of avalanche deposit areas from Sentinel-1 imagery. As we have shown in Sect. 4.2, the deposit area could be identified for about 75 % of all avalanches by the S1 mapping. The remaining 25 % of S1 polygons captured release area and/or track only. Our investigation indicates that, even though deposits are more likely to be detected, the S1 mapping in many cases identifies other avalanche parts as well. Unfortunately, mapping results from S1 showed multiple patches corresponding to a single avalanche which need to be joined to avoid an overestimation of the avalanche number and an underestimation of the avalanche size. In order to solve this problem, an algorithm joining S1 polygons belonging to the same avalanche path would be desirable. We believe the automated snow avalanche release area delineation from Bühler et al. (2018) may be adapted for such a purpose.
With a POD of 0.06, Sentinel-2 imagery seems unsuitable for the mapping of avalanches. Abermann et al. (2019) found 23 % of avalanches on both Sentinel-1 and Sentinel-2 images, whereas we, in contrast, found only 9 % of avalanches from the S1 mapping overlapping with the S2 mapping. This might be due to better visibility of wet-snow avalanches, especially the slush flows, in Abermann et al. (2019). An overview of the strengths and weaknesses of all investigated satellite mapping methods is given in Table 6.
Snow conditions differed between the 2 years: in 2018, both dry-snow and wet-snow avalanches released, while in 2019 avalanches were dry (Fig. 11). For the SPOT mapping we found no difference in the POD between dry and wet snow. In contrast, for radar-based mapping, it is commonly reasoned that wet-snow avalanches are easier to detect (e.g., Leinss et al., 2020), which was confirmed in Eckerstorfer et al. (2019) using ground-truth data. Nevertheless, we obtained apparently the opposite result in the S1 mapping (POD and PPV better for 2019 with dry-snow conditions; Table 4). This can be partially explained by the relatively large number of size-2 to size-3 avalanches in 2018, which are more likely to be missed. Nevertheless, Fig. 8b shows that during dry-snow conditions in 2019, a larger fraction of size-2 and size-3 avalanches could be detected, compared to the mixed-snow conditions (dry-wet) in 2018. Pre- and post-event radar backscatter images show much stronger overall changes in the snow conditions from mixed-snow (pre-event) to wet-snow (post-event) conditions in 2018, whereas in 2019 with stable dry-snow conditions, avalanches were the most prominent changes in the backscatter signal. This, in turn, agrees with Eckerstorfer et al. (2019), who also observed a high POD for dry-snow conditions in both (pre- and post-event) images.

On the influence of the quality and definition of ground truth on validation results
We showed the influence of using less complete avalanche observations as a ground-truth alternative on the performance metrics (Sect. 4.3). Our findings are in line not only with theoretical investigations regarding the influence of errors in the reference class on the POD and PPV (Brenner and Gefeller, 1997) but also with other studies outlining the importance of the definition of the ground truth for performance metrics (e.g., Techel et al., 2020, for snow instability tests). As a specific example of a ground-truth alternative, we relied on the DAvalMap data. However, the detection rate (POD = 0.56) clearly showed that this data set provides far from a complete mapping. In fact, the POD was lower for the DAvalMap compared to SPOT (POD = 0.74). In addition, differences in the quality of the mappings between the 2 years were large for the DAvalMap. The PPV has the highest value for the DAvalMap; in 2019 with 0.99 almost all avalanches mapped could be confirmed. These findings indicate that avalanches, which are stored in the DAvalMap database, may be used for validation even though the mapping is partially inconsistent as already suspected by Schweizer et al. (2003). However, to answer research questions for which comparably complete avalanche recordings are required, findings must be interpreted considering the uncertainty related to incomplete recordings.

Table 6. Summary of the strengths and weaknesses of the methods examined.

SPOT mapping
Strengths:
- daily revisit capability due to constellation of SPOT 6 and SPOT 7
- may cover a very large area upon request (i.e., the whole of the Swiss Alps in 1 d)
- spatial resolution of 1.5 m well suited for avalanche detection
- visual avalanche identification is like what the eye is used to
- NIR band makes especially wet-snow avalanches well visible; no radiometric saturation on snow (Fig. 11)
Weaknesses:
- strongly dependent on cloud-free conditions
- data only available if ordered and rather expensive (∼ USD 100 000 for an area of 12 500 km²)
- if satellite is passing far from nadir, high acquisition angles cause distortions in steep terrain (Fig. 11)
- resolution of 1.5 m restricts the detection of size-1 and size-2 avalanches

S2 mapping
Strengths:
- orbit-revisiting time with the same acquisition angle every 5 d so covers large regions in several overpasses but regularly captures the same area
- image acquisition with relatively small incidence angles of < 10°; data are free of charge
- visual avalanche identification is generally like what the eye is used to, but the spatial resolution is mostly insufficient
Weaknesses:
- strongly dependent on cloud-free conditions
- resolution of 10 m very much restricts the visibility; even the detection of size-4 and size-5 avalanches is improbable

S1 mapping
Strengths:
- orbit-revisiting time 6 d (more often when combining data from different orbits)
- acquisition in all weather and light conditions
- data are free of charge
- if ascending and descending images are combined, the "blind spots" in layover and radar shade are negligible
- sensitivity to surface roughness changes makes avalanche debris appear very bright
- spatial resolution well suited for detection of larger avalanches (≥ size 4)
Weaknesses:
- preprocessing computationally more expensive
- no mapping of avalanches in radar shadow and layover
- detection of avalanches is limited by resolution; size-2 avalanches (50-200 m long) have an extension of just 2-10 pixels
- avalanches are often only partially visible due to smooth surfaces in the release or track area leading to overestimation of avalanche number and underestimation of the avalanche size
- strongly variable and changing snow conditions from pre- to post-image can complicate avalanche mapping

Both SPOT and Sentinel-1 data have been used previously to detect avalanches (e.g., Bühler et al., 2019; Leinss et al., 2020). Each of these studies relied on a different ground truth. Eckerstorfer et al. (2019) conducted a selective verification of 243 manually detected avalanches from Sentinel-1 imagery achieving a POD of 0.77. This is decisively better than the POD of 0.27 found for avalanches of size 2 and larger in this study (Table 4). If we only consider avalanches of size 3 and larger, the POD increases to 0.56 (while the PPV drops to 0.79), which is still considerably lower than the results presented by Eckerstorfer et al. (2019). We suspect that selective verification tends to overestimate the POD, as in these cases, verification data are usually available for well-visible prominent avalanches. Selective verification is also the reason for the higher POD achieved for the SPOT mapping (Sect. 4.3), when relying on DAvalMap as ground truth, a ground truth which had a preference towards the detection of larger avalanches (Fig. 8d). Leinss et al. (2020) compared radar-detected avalanches (Sentinel-1) with optically detected avalanches from SPOT. Of the SPOT avalanches, 68 % were detected by radar in their investigation. Inversely, 44 % of the radar-detected avalanches were detected by SPOT. In our study we linked mapped avalanches to validation points from ground truth. We found 89 % of the validation points representing avalanches of size 2 and larger, detected by S1 mapping, were also found in the SPOT mapping. In contrast, S1 detected only 55 % of the SPOT avalanches.
Given the validation with independent ground truth in this study, we believe that our results provide a more objective comparison of the two mapping approaches.
Applying these findings to our study, we would argue that avalanches detected using SPOT images are a rather reliable ground-truth data source for slopes which are illuminated (or partly illuminated) and when sky conditions are clear. In contrast, if slopes are shaded or sky conditions do not permit good visibility, SPOT images will be of little use for validation.

Figure 11. In 2018, the temperatures and the snowfall line were high resulting in more wet-snow avalanches and deposits. Those are identifiable by the green shimmer in the NIR and more contrast to the surrounding snow in general (left side), which is not the case for the mostly dry-snow avalanches in 2019 (right side). Additionally, disparity in visibility in steep terrain due to different inclination angles is shown. The inclination angle for the tiles shown here lies at 11.3° in 2018 and at 27.6° in 2019. These distortions are among the worst we encountered comparing 2018 and 2019 SPOT data (SPOT 6 data © Airbus DS 2018/2019).

Strengths and limitations of our study
The study is limited to two avalanche periods. As shown in Fig. 2, the satellite images used were not all taken on the same day and were also not the same with respect to the defined validation periods. For Sentinel-1, four images (ascending and descending, pre- and post-event) were combined to map avalanches from one period. The difference between the ascending and descending image is always 1.5 d, except post-event 2018 (difference: 3.5 d). Because the main avalanche activity (with level 4 and 5; see Fig. 2) did not happen within these days in between, it is unlikely that avalanches occurred within these days. Instead, most of the detected avalanches must have occurred between the pre- and post-acquisitions. Furthermore, as the weight used for LRW is linear to the illuminated area (which is proportional to the (linear) backscatter intensity) the image composition follows an almost binary weighting, especially for non-horizontal terrain, rather than an equally weighted average (which only happens for nearly horizontal terrain). This makes LRW a good method for image composition, and the chance to miss avalanches by averaging is reduced, especially when specific events of high avalanche activity (as in our case) are enclosed by specifically selected acquisition dates.
SPOT images were acquired very close to the period of interest, and therefore the effect of this time gap is negligible. In 2019 Sentinel-2 imagery was acquired very close to the period of interest as well; in contrast the acquisition in 2018 happened 3 d after the investigated period. As the weather conditions were favorable without precipitation and as ground-truth photographs were available, we believe this does not distort the results.
For the two avalanche periods, we compiled a ground-truth data set. However, despite our efforts to collect a spatially complete data set, we could not validate 48 detected features (2018: 38; 2019: 10) because of gaps in the ground truth and 133 features (2018: 93; 2019: 40) because of low-quality ground-truth images. Furthermore, we expect that we missed some avalanches in the ground-truth images, particularly if these were of smaller size (Fig. 7). Despite these limitations, we consider the ground-truth data to be complete enough to allow for a sound validation of detected avalanche features. Furthermore, the independently compiled ground-truth data allowed for an objective comparison of the three satellite-based avalanche detection methods.
We explored just a small selection of the large number of potential satellite data sources, focusing on sensors and satellites previously used to detect and map avalanches (i.e., Eckerstorfer et al., 2019; Bühler et al., 2019; Leinss et al., 2020). Still, we consider the analyzed sensors and resolutions a representative selection of currently available satellite data sources. We relied on a human assessor to detect features representing avalanches visually. This approach depends heavily on the experience and skills of the human performing the task (as has been shown for landslide mapping, e.g., Hölbling et al., 2015; Galli et al., 2008) and adds a certain degree of subjectivity to the analysis. Furthermore, manual detection of features is resource- and time-consuming.
To reduce the impact of limited visibility due to adverse weather and due to variations in operator performance, we suggest that future ground-truth data sets should be complemented with avalanche occurrence data relying on automatic avalanche detection approaches, such as seismic or ground-based radar detection of avalanches (e.g., van Herwijnen and Schweizer, 2011; Mayer et al., 2020). Furthermore, recent advances in (semi-)automatically detecting avalanches are promising alternatives for complementing avalanche occurrence data (Leinss et al., 2020; Korzeniowska et al., 2017).

Conclusions and outlook
For the first time, we presented a spatially continuous, extensive validation of methods detecting avalanches from selected satellite imagery. We analyzed two avalanche periods for an area covering approximately 180 km² around Davos, Switzerland. We examined the potential, the advantages and the disadvantages of the evaluated methods to provide decision guidance for those wanting to comprehensively map avalanches in the future. We statistically confirmed several observations from Bühler et al. (2019) and Leinss et al. (2020): the SPOT mapping misses size-1 (small) and size-2 (medium) avalanches in several cases. S1 mapping misses most size-1 and size-2 avalanches and over half of size-3 avalanches. We also confirmed that avalanches located completely in the cast shadow are much more likely to be missed, even on high-resolution optical imagery (SPOT). For S1 we showed that avalanche deposits are the avalanche part most likely detected, but the starting zone and the avalanche track are mappable in more cases than previously suspected.
The SPOT mapping holds great potential for comprehensive mapping of avalanches, at least for selected events for which costly and analysis-intensive SPOT data provide very valuable mapping results. The S1 mapping is quite reliable for larger avalanches (size 3 to 5) and allows for frequent and even operational mapping for which automatic methods are currently being developed (e.g., Eckerstorfer et al., 2019). Still, it must be kept in mind that the true size is often underestimated by SAR sensors and that avalanches can appear partitioned into small patches which need to be joined by an advanced detection algorithm to estimate the true size. We found that the resolution of Sentinel-2 data is too low to reliably map avalanches. Additionally, we explored the influence of ground truth on the validation results and ascertained that incomplete, but otherwise reliable, ground-truth data sets tend to overestimate the POD and underestimate the PPV.
We found that already-existing satellite data provide great potential to approximate the avalanche activity and to obtain an overview of the spatial distribution of avalanches. However, for studies which require a precise and complete mapping of avalanche outlines, further investigations are necessary. As ground truth for such an examination, unmanned aerial systems (UASs) have been found to be a promising solution (Bühler et al., 2017). To bypass time-consuming manual mapping, automation should be aimed at by developing reliable automated mapping algorithms or refining those that have already been created (Bühler et al., 2009; Lato et al., 2012; Korzeniowska et al., 2017). Prior to operational use of any approach, a comprehensive, and not only a selective, validation should be strived for. For methods that have been comprehensively validated, the DAvalMap database or a SPOT mapping might be used for selective follow-up validations.

Partly shaded: The avalanche is located in partly illuminated and partly shaded terrain
S1 avalanche part: Parts of the avalanche (release, track and/or deposit) that were captured by the S1 mapping
Comment: Supplementary information is put here:
- the ID of the other avalanches if several mapped avalanches were joined to one validation point ("one-to-many join")
- information if the avalanches were snowed upon and hard to see in the photographs
- information if there was not an avalanche mapped with any of the methods but ground truth indicated the existence of one

The Cryosphere, 15, 983-1004, 2021 https://doi.org/10.5194/tc-15-983-2021

In Table 5 we showed the share of validation points for SPOT and S1 which were joined as illustrated in Fig. 6. In order to show the effects described in Sect. 4.2, the POD and PPV were calculated neglecting joins.
For the computation, multiple mapped avalanche patches which were originally joined to one validation point were treated as separate avalanches (one-to-many) and one avalanche patch was treated as just one avalanche even though it was joined to two validation points because of ground truth (many-to-one). In order to make the effects of either join more visible, they were calculated both separately and together. The results are shown in Table E1. It can be seen that treating several avalanche patches as several avalanches (using no one-to-many joins) overestimates the number of avalanches, leading to a higher POD and PPV. Compared to the numbers in Table 4, the increase in the POD for S1 is more pronounced as the percentage of one-to-many joins is higher (Table 5). If we neglect many-to-one joins and treat one avalanche polygon as one avalanche (even though ground truth showed two or more corresponding avalanches), the POD decreases as well as the PPV. If both one-to-many and many-to-one joins are neglected, for SPOT the POD and PPV are slightly lower than the results in Table 4, whereas the opposite is true for S1. This is due to one-to-many joins being more relevant for S1 and many-to-one joins being more relevant for SPOT.

Table E1. POD and PPV for the SPOT and S1 mapping neglecting one-to-many, many-to-one or all joins (for the definition of those joins refer to Fig. 6).

No one-to-many joins (SPOT, S1)   No many-to-one joins (SPOT, S1)   No joins at all (SPOT, S1)

Author contributions. EDH performed the SPOT mappings, collected the ground truth, analyzed the data sets, assigned ground-truth size, coordinated the study and wrote the paper draft. FT mapped from Sentinel-2 for 2018, assigned ground-truth size, delivered the necessary input from the SLF avalanche warning team, critically reviewed the results and heavily contributed to the paper draft. SL performed the Sentinel-1 data processing and mappings, wrote the description thereof, and reviewed and complemented the manuscript with YB and FT. YB, FT and EDH initiated the study together.