Snow depth mapping from stereo satellite imagery in mountainous terrain : evaluation using airborne lidar data

An accurate knowledge of snow depth distribution in mountain catchments is critical for applications in hydrology and ecology. A recent new method was proposed to map the snow depth at meterscale resolution from very-high resolution stereo satellite imagery (e.g., Pléiades) with an accuracy close to 0.50 m. However, the validation was mainly done using probe measurements which sampled a limited fraction of the topographic and snow depth variability. We deepen this evaluation using accurate maps of the snow depth derived from ASO airborne lidar measurements in the Tuolumne river basin, USA. We find a good agreement between both datasets over a snow-covered area of 137 km2 on a 3 m grid with a positive bias for Pléiades snow depth of 0.08 m, a root-mean-square error of 0.80 m and a normalized median absolute deviation of 0.69 m. Satellite data capture the relationship between snow depth and elevation at the


Introduction
The snow depth or height of the snowpack (symbol: HS, Fierz et al. 2009) is a key variable for both water resource management and avalanche forecasting in mountain regions. However, the determination of HS in complex terrain remains challenging due to its high spatial variability at decametric scale. Current operational approaches to estimate HS are either based on sparse in situ measurements, area limited unmanned aircraft vehicle (UAV) campaigns (Nolan et al., 2015, Redpath et al., 2018 or costly airborne campaigns (Bühler et al., 2015, Dozier et al., 2016, Painter et al., 2016. Recently a new method was introduced to retrieve HS maps from satellite data at metric resolution, typically 1 to 4 m (Marti et al., 2016, McGrath et al., 2019, Shaw et al., 2019. The method is based on the differencing of snow-on (winter) and snow-off (in general end-of-summer) digital elevation models (DEM) that are generated from very highresolution satellite stereo imagery (e.g. Pléiades, DigitalGlobe/Maxar WorldView-1/2/3 and GeoEye-1). The method was tested using two Pléiades stereo triplets over the Bassiès catchment in the Pyrenees (14.5 km²).
The snow-on and snow-off DEMs were generated using the Ames Stereo Pipeline (ASP, Shean et al., 2016) and co-registered before differencing them (Berthier et al. 2007). The accuracy of the method was evaluated using 451 probe measurements of the snow depth. The HS satellite-derived map was also compared to the one obtained from a UAV photogrammetric survey over a small portion of the catchment (3.1 km²). The results showed that snow depth could be retrieved with decimetric accuracy from Pléiades images (standard deviation of residuals 0.58 m for a pixel size of 2 m), suggesting that the method had the potential to become a viable alternative to airborne campaigns with the benefits of the satellite: access to any point on the globe and lower cost for the end-user.
important limitation of this study since DEMs from stereoscopic images are known to be less accurate in high slope areas (Lacroix, 2016;Shean et al., 2016). In addition, snow probe measurements may fail to represent the mean HS at the scale of a 4 m pixel especially in mountain terrain (Fassnacht et al., 2018).
Furthermore, Marti et al. (2016) did not evaluate the impact of the photogrammetric software configuration on the accuracy of the HS map, and several upgrades were implemented in ASP since their study. In particular, the semi-global matching algorithm (Hirschmüller, 2005) was added to the catalogue of algorithms that can be used to derive the disparity map from stereo images. This algorithm is expected to perform better in low texture terrain (Bühler et al., 2015;Shean et al., 2016) and therefore has the potential to reduce the number of missing values in the snow depth map.
Given the aforementioned limitations, we planned a second more comprehensive validation study by taking advantage of the NASA Airborne Snow Observatory campaigns (ASO) in the Sierra Nevada, USA. In this area ASO routinely acquires accurate HS measurements by airborne lidar altimetry. We tasked the Pléiades system to acquire two stereo triplets over the Tuolumne river basin. The snow-on Pléiades triplet was acquired 1 st May 2017, the day before the ASO flight and close to the accumulation peak. The ASO product is used as a reference as it should exhibit no bias and was found to have an accuracy of 0.08 m (Painter et al., 2016) while Pléiades HS maps have an accuracy ranging between 0.50 m and 1 m (Marti et al., 2016).
The ASO data give us the opportunity to make a more advanced evaluation of the method in mountainous terrain.
In other studies using DEM differences to characterize land surface elevation changes, error statistics are often estimated from stable terrain areas, where no elevation change should have occurred (Deschamps-Berger et al., 2019). However, stable terrain might not be representative of the snow-covered terrain owing to differences in topography and in reflectivity. A homogeneous snow-covered area can represent a challenge for image correlation compared to typical snow-free terrain found in alpine mountains. When integrating elevation change over areas (glacier volume change, lava flow volume, landslide terrain deformation), previous studies used different functions to estimate the error by including, or not, a systematic error, a random error and a spatial correlation metric. Here, the ASO dataset enables analysis of the error model and the error sources.
In addition, we take advantage of this dataset to evaluate the effect of the configuration of the ASP software on the resulting HS map. We produced the HS maps with DEMs calculated with three different sets of options. We also evaluate the effect of using different combinations of pairs of stereo images (front-back, nadir-back, front-back) instead of a triplet (front-nadir-back) to generate the snow-on DEM at a lower cost for future applications. The study site is located in the Tuolumne river basin in the Sierra Nevada mountain range, California, USA ( Fig. 1). The Tuolumne river supplies water to the agricultural plain of the Great Valley and the densely populated area of San Francisco. The region recently experienced a five-year drought from 2012 to 2016 (Roche et al., 2018), increasing the interest for water resources monitoring. The ASO flights cover 1100 km² in the basin while this study focuses on a 280 km² subzone. The elevation within this subzone ranges from 1800 m a.s.l. to 3500 m a.s.l.. Typical winter accumulation can reach several meters at high elevations (Painter et al., 2016). The 2016-2017 winter was characterized by near record snow accumulation that has been referred to as the snowpocalypse (Painter et al., 2017). Tuolumne basin (blue line) (b). The terrain elevation in the background is the snow-off digital elevation terrain from ASO used in the co-registration step.

Pléiades images
The study area is too large to be imaged by Pléiades in tri-stereo mode with a single scene, hence the area was imaged in two strips which overlapped by 3 km in winter and 1.5 km in summer in the along-track direction. The snow-on triplets were both acquired on 01 May 2017, while the snow-off triplets were acquired on 8 August 2017 and 13 August 2017 (Fig. 1, Table 1). The imaged area covers in total 280 km². Images were acquired in panchromatic and multispectral mode with incidence angles along track between -7° and 9°. The base to height (B/H) ratio of successive pairs is around 0.1 (Table 1). The panchromatic images have a resolution of 0.5 m at nadir and are used to calculate the DEMs. For the snow-on acquisition, we requested to reduce the number of time domain integration (TDI) lines used to image the scene. This is recommended to curb image saturation over sun-exposed snow surfaces (Berthier et al., 2014). As a result, there are no saturated pixels in the images of this study. Pléiades multispectral images have a resolution of 2 m. We only use the multispectral image that was acquired the closest to the nadir to compute the multispectral orthoimage.

Lidar data from the Airborne Snow Observatory (ASO)
A snow-off DEM on 13 October 2015 and a snow depth map on 2 May 2017 from the ASO are used for comparison with the Pléiades products ( Fig. 1, Table 1). The ASO program, operating since 2012, provides snow depth, Snow Water Equivalent (SWE), and snow albedo maps over full mountain watersheds to support scientific campaigns and operational water management (Painter et al., 2016). The ASO lidar system measures the distance between the target and aircraft, and is combined with aircraft position and orientation measurements to generate a collection of elevation points -a "point cloud". Ground points are aggregated to a 3 m grid to derive a gridded DEM.. Snow depth maps are obtained from the difference of a snow-on and snow-off DEM in unforested areas, and from a point-cloud differencing algorithm in areas with forest canopy. Snow depth maps are combined with density from a model and in-situ observations to obtain the SWE. The values on the snow-free areas are used to bias-correct the snow-on elevations and are set to zero. From comparison with 80 in-situ manual measurement, no bias is observed on the HS maps and the root mean square error (RMSE) per pixel at a 3 m resolution is 0.08 m (Painter et al., 2016). For the evaluation of Pléiades HS maps, we excluded 25 km² near the catchment divide in the north-east part of the study area because we observed artifacts in the lidar HS map probably due to issues with the aircraft position and orientation data.   Figure 2 presents the workflow that we developed to produce HS maps from Pléiades images using ASP version 2.6.2 (Shean et al. 2016) and the Orfeo Toolbox (Grizonnet et al., 2017). We detail below the calculation of the DEMs, the HS maps and the land cover classifications.

DEM calculation
We use an iterative approach to obtain a refined point cloud and a DEM from each triplet of stereo images.
The first iteration uses default L1B input images to produce a coarse DEM at 50 m resolution. The input images are orthorectified using this coarse DEM with the ASP utility mapproject. The orthorectified images are then used as inputs for the second iteration to obtain a fine DEM at 3 m resolution. This iterative processing was shown to improve computation time and reduce artifacts in the final DEM (Shean et al., 2016). The resolution and coordinate system of the DEMs were defined to match those of the ASO product

Snow depth (HS) maps
We co-registered the Pléiades DEMs to the ASO snow-off DEM to enable a pixel-wise comparison between both datasets and align the raster grids. We first co-registered the Pléiades snow-off DEM to the ASO snowoff DEM. We then separately co-registered the Pléiades snow-on DEM to the Pléiades-registered snow-off DEM before computing the difference between the Pléiades snow-on and Pléiades snow-off DEMs (hereafter referred to as dDEM). The north and south Pléiades dDEMs were mosaiced and the north dDEM value was preserved in the overlapping area. The co-registration vectors were calculated using the algorithm by Nuth and Kääb (2011) on areas where no elevation change is expected (i.e. stable terrain). The stable terrain areas were determined by supervised classification of the Pléiades multi-spectral images into a land cover map (see 4.1.3). From the same land cover map, the Pléiades dDEM values were set to zero in snowfree areas to obtain the HS map. Pléiades HS values below -1 m and above 30 m were set to no data.

Land cover classification
Snow covered areas and stable terrain were analysed separately, and their location determined with a land cover supervised classification calculated from the multi-spectral images. The winter and summer scenes were classified into four categories: snow, forest, open water and stable terrain, the latter corresponding to snow-free areas with low vegetation or bare rock. First, we orthorectified the nadir multi-spectral images using mapproject on their corresponding DEM. For each image, we manually extracted training data covering 0.1-1.0 km 2 from a composite image of red, green, blue, near-infrared bands and the derived NDVI.
A maximum of 33 polygons were manually drawn for the snow class on the winter north image. These samples were used to train a random forest classifier with otbcli_TrainVectorClassifier from the Orfeo Toolbox.
The stable terrain and snow masks were eroded with a radius of two pixels (4 m) and patches smaller than 30 pixels (270 m²) were removed. The masks were shifted according to the DEMs co-registration vector and then interpolated with the nearest neighbour method onto the ASO grid. Lakes and snow patches remaining in the summer land cover map were removed from the winter snow mask. Lakes were manually delineated on snow-off images. This workflow was automated except for the training dataset which was generated by human interpretation of the images.

Photogrammetric processing of the images
A DEM is computed with the Ames Stereo Pipeline (ASP) using two utilities: stereo and point2dem. First, stereo generates a dense disparity map (e.g. the pixel displacement between the two images of a stereo pair) using image correlation. The disparity map is used to calculate a point-cloud with a triangulation algorithm.
Then, point2dem interpolates the point cloud on a regular grid (Shean et al., 2016). We compared three sets of options in stereo. The first set of options is the one used by Marti et al. (2016). This set uses the local-

Comparison of bi-and tri-stereo images for DEM calculation
We calculated five DEMs from each stereo triplet by selecting a pair of images (front-nadir, nadir-back, front-back) or the complete triplet (front-nadir-back, nadir-front-back). This provided combinations of different B/H (called image geometry further in the article), ranging between 0.08 and 0.23 (Table 1). The three sets of options of stereo were tested on these different geometries. In the tri-stereo case, ASP calculates two disparity maps and performs a joint triangulation when calculating the point-cloud. In the first tri-stereo case (front-nadir-back), ASP calculates a disparity map between the front and the nadir image and between the front and the back image. In the second case (nadir-front-back), ASP calculates a disparity map between the nadir and the front image and between the nadir and the back image. We did not evaluate the third possible tri-stereo combination (back-nadir-front) as we expect results to be similar to the front-nadirback case.

Evaluation of the snow depth maps
We evaluated the quality of the Pléiades HS maps over the area defined as the intersection of snow-covered terrain in Pléiades HS maps (snow mask) and ASO HS maps (HS greater than zero). We also evaluated the Pléiades dDEM over the stable terrain where we expect elevation difference to be zero. The HS residuals are the difference between the Pléiades and the ASO HS. The stable terrain residuals are the Pléiades dDEMs as ASO products are set to zero over snow-off terrain. The distribution of the residuals was characterized with the mean, the median (i.e. the bias), the root-mean square error (RMSE) and the normalized median absolute deviation (NMAD) of the residuals. The NMAD is a measure of the dispersion suited for populations with outliers (Höhle and Höhle, 2009).
For hydrological applications, HS maps are often spatially aggregated, for example to calculate the amount of snow in a catchment or an elevation band. The expected random error of the average of a dDEM over N pixels was defined as (Nuth and Kääb, 2011): Where e rand ❑ , is the random error per pixel and N i is the number of independent pixels in the integration area.
The number of independent pixels depends on A, the area of averaging and l cor ,the auto-correlation length scale of the random noise. Many studies extract an auto-correlation length from a semi-variogram (Rolstad et al., 2009;Trüssel et al., 2013;Willis et al., 2015;Melkonian et al., 2016, Anderson, 2019. Assuming complete correlation for all the pixels within the autocorrelation area, one obtains (Nuth and Kääb, 2011): If one uses square area of averaging of length, l ❑ . The expected random error is: In most cases e rand ❑ and l cor ❑ are measured over the stable terrain and then used to estimate the error over snow or glacier terrain. We use Eq. effective (or measured) random error. The latter is calculated as the standard deviation of the HS residual map resampled at resolution l. An average resampling scheme is used, which calculates the average value if all pixels are valid within a block Brun et al., 2017). By comparing the expected and the measured error, we aim at verifying i) if the elevation difference statistics calculated over stable terrain compare well to the statistics over snow-covered areas and ii) if the error model from Eq. (1) is valid.

Results
We first present the results for the HS maps calculated with the SGM-binary option and different image geometries. Then, we focus on the impact of the configuration of ASP. The best set of options and geometry is then used to analyze the spatial distribution of the residuals and to evaluate a model of the HS error.

Evaluating the impact of bi or tri-stereo images as input
The NMAD of the snow depth residuals with respect to ASO data is larger for maps from pairs of images with B-H around 0.12 (1.13 m for front-nadir,1.07 m for nadir-back) than from pair of images with B/H around 0.20 (0.68 m for front-back) or triplets of images ( Table 2). The NMAD of the snow depth residuals from the front-nadir-back triplets (0.69 m) is slightly better than from the nadir-front-back triplets (0.78 m) and very similar to the NMAD from the front-back pair. The NMAD over stable terrain is lower but relative values between two geometries are similar (Table 2). For the different image geometries, the RMSE evolves similarly to the NMAD over snow-covered areas but very differently over stable terrain. The largest RMSE over stable terrain is 1.35 m for front-back and the smallest is 1.06 m for nadir-front-back. The mean difference over snow-covered areas ranges from +0.01 m (front-nadir) to +0.16 m (front-back). The absolute mean and median over stable terrain are all less than 0.06 m. The relative results for the different geometries are similar with the SGM-ternary and Local-Search options except for the mean error. In the following sections, the HS map from the front-nadir-back geometry is used as it yielded the lowest bias, RMSE and NMAD.

Sensitivity to the photogrammetric processing
We compare the stereo options on the HS maps from the front-nadir-back geometry (Table 3 and Fig. 3).
The SGM sets of options provide DEMs without data gaps. The Local-Search option produces snow-on DEMs with gaps which results in ~2 km² missing in the HS maps compared to the SGM options (Table 3).
Visual examination of the winter DEMs shows large differences in snow fields and forest. Linear artifacts are observed over snow in the DEM produced with the SGM-ternary option. The same regions are noisy in three times the NMAD value. This is expected as the same filtering is used during the co-registration process to remove outliers. In the following, the SGM-binary was selected since it gives the lowest bias and NMAD with respect to ASO data and the lowest NMAD over stable areas (Table 3).  Figure 6 shows that ASO and Pléiades HS exhibit a similar distribution of HS against elevation except between 1900 m -2100 m and 3500 m -3700 m where the mean residual over snow-covered areas is greater than 25 cm (Fig. 7). This corresponds however to small areas which cover less than 0.05 km² each.

Spatial distribution of the residuals
The NMAD of the Pléiades dDEMs over the 4.07 km² of stable terrain is 0.40 m against 0.69 m for the HS residual. The distribution of residuals on stable terrain is similar for most aspect classes with the exception of the north facing slopes (0.26 km², aspect classes 315°-360° and 0°-45°, Fig. 7). Based on the visual analysis of the residuals map, we attribute these errors to shaded slopes of steep summits. The distribution of HS shows a similar spread for all aspects but a larger positive bias (~0.20 m) for south facing slopes (90°-270°, Fig. 7). The distribution of HS residuals against the terrain slope is similar between 0° and 50° but becomes more spread in steeper terrains which cover 2.13 km². The same trend is observed over stable terrain but only above 70°. The map of HS residuals shows a low frequency undulation with an amplitude of approximately 30 cm and a wavelength of approximately 4 km (Fig. 8). The crests of the undulation are oriented in the east-west direction. The semi-variance of the residual increases linearly between 3 and 20 m and is stable for longer correlation distances in the considered range (Fig. 9). A similar semi-variance evolution is obtained over stable terrain. From this semi-variogram analysis we estimate that the correlation length of the random error (see 4.4) is about 20 m.

Evaluation of a simple error model
We compare the measured and the expected error of the HS maps by comparing Eq. (3) with the HS residuals. We find that the HS residuals do not match with the expected error (Fig. 10). The expected error is calculated with random error, e rand ,set to 0.50 m and 1 m to represent a realistic range of error, and the correlation length,l cor , set to 20 m in agreement with the statistics over stable terrain. The measured error of the residual is smaller than the error that would be estimated from this error if the HS map has a resolution between 20 m and 50 m and greater than expected above 50 m. This mismatch is probably partly related to the undulation observed in the HS residual (see Discussion).

Comparison to existing studies
By comparing the Pléiades HS with the ASO data, we find a NMAD of 0.69 m in the best case (i.e. best acquisition geometry and ASP options), which is higher than the NMAD of 0.45 m reported by Marti et al. (2016) based on 451 snow probes measurement. This discrepancy could be due to differences in (i) the Pléiades data (i.e. acquisition geometry), (ii) the characteristics of the study site and (iii) the representativeness of the validation data. The B/H for the images of Marti et al. (2016) study ranges between 0.21 and 0.25 for all consecutive stereo pairs while our B/H range between 0.08 and 0.12. This is consistent with the theory of the photogrammetry, which states that the accuracy of the DEM increases with the B/H up to a certain limit (Delvit and Michel, 2016). The larger NMAD compared to Marti et al. (2016) is also partly due to fact that the present study covers a much larger range of slope angles and aspect. Therefore, we argue that this study provides a better evaluation of the HS accuracy that can be expected from Pléiades in high mountain regions.

Sensitivity to image geometry and photogrammetric processing
We find that the HS maps accuracy are sensitive to the B/H ratio of the input images, and to the configuration details of the photogrammetric processing. We do not find a large added-value of the tri-stereo images on the map accuracy compared to an optimal bi-stereo configuration.
The NMAD of the Pléiades HS is improved by 36 % when using images with a B/H of 0.22 instead of 0.11 (Table 3). Marti et al. (2016) used pairs of front-nadir and nadir-back images (B/H=0.2) as they observed that the front-back pair (B/H=0.4) led to too many no-data pixels. From these two studies and for similar terrain, a B/H around 0.2 seems beneficial.
Using tri-stereo instead of bi-stereo images did not improve significantly the Pléiades HS map accuracy. It seems like the processing of a triplet of stereo images (front, nadir, back) with ASP stereo function is equivalent to the processing of the best stereo pair of the triplet, the front-back pair in our case. There were no data gaps due to view obstruction by steep relief in this study area. Should it be the case, the tri-stereo may offer a better coverage. Several studies have evaluated the benefits of tri-stereo imagery against bistereo (Berthier et al., 2014;Zhou et al., 2015;Bagnardi et al., 2016;Marti et al., 2016). However, these studies used different photogrammetric software which do not handle the combination of three images in the same way. For example, either multiple disparity maps, or points clouds or DEMs can be calculated and merged to produce a final single DEM. The use of tri-stereo results in increasing the density of the point cloud (Zhou et al., 2015;Bagnardi et al., 2016) and decreasing the no-data area in the final DEM (Berthier et al., 2014;Zhou et al., 2015). The accuracy of elevation products from tri-stereo compared to bi-stereo was slightly improved in Berthier et al. (2014) but not significantly in Marti et al. (2016). To our best knowledge, volume change measurements were never computed from a large number of VHR satellite stereo-images (>10), but studies suggest that the combination of multi-view images can improve the DEMs quality. The fusion of 16 Worldview-3 images improved the NMAD of the residual by 20% compared to a set of 6 images over an industrial zone (Rupnik et al., 2018). Therefore, the most important use of tri-stereo may not be to improve the accuracy of HS maps, but rather to obtain complete coverage of complex terrains and have a less distorted nadir ortho-image for the land surface classification. We did not evaluate the extent to which the front and back images would provide a different land surface classification from the one obtained with the nadir image.
The choice of the photogrammetric options has an impact on the elevation difference accuracy over stable terrain and snow-covered areas. The NMAD over snow-covered areas is improved by 0.16 m by only modifying the cost function (binary census-transform instead of ternary census-transform). However, such a gain on the dispersion will hardly impact the HS averaged over a region of interest since the random error decrease rapidly with the region area (see 6.3.). More important is the larger bias over snow-covered areas introduced with the SGM-ternary option (0.24 m) and Local-Search options (0.49 m). This bias is particularly important for south facing slopes. It seems to result from difficulties in image matching in bright areas for the three compared options and from the impact of isolated trees for Local-Search. The impact of the tree is likely due to the larger kernel size (25 pixels) used in the Local-Search option and advised when using the local-search stereo algorithm. The exact origin of the bias on south facing slopes remains unknown.

Distribution of HS errors
We found a mean difference of +0.08 m between Pléiades (SGM-binary, front-nadir-back) and ASO HS despite the correction of the vertical offset between the snow-on and snow-off DEM on the stable terrain after co-registration. This bias is low given the difference of the characteristics of the ASO and the Pléiades elevation differences between the snow-off DEMs. Finally, it is expected that the co-registration step has a decimetric accuracy for DEMs at a metric resolution (Nuth and Kääb, 2011).
We found that the random error is larger on snow-covered terrain (NMAD=0.69 m) than on stable terrain (NMAD=0.40 m). This is true for all slopes and most aspects classes (Fig. 7). Although mountainous snow surface tends to have smoother topography, thereby increasing the accuracy of the photogrammetric processing, bright snow surfaces also tend to have less texture than snow-free surfaces, which decreases the accuracy of the photogrammetric proccessing. The lower accuracy of snow areas is not due to saturation since no pixels were saturated in the panchromatic images. In addition, it should be noted that the residuals over stable terrain are computed from Pléiades data only, while residuals over snow-covered areas are computed from Pléiades and ASO data. We cannot conclude if the larger dispersion over snow-covered areas results from the properties of the surface or from the combination of errors in Pléiades and ASO data, or both.
We found that the decorrelation distance of the residuals between Pléiades and ASO dDEM was around 20 m and identical over stable terrain and snow-covered terrain. This suggests that the correlation length on stable terrain can be used to estimate the error on a spatial integration of HS, although this remains to be confirmed in other sites.
Previous studies suggest that the dispersion of the residuals on stable terrain can be used to estimate the error on the HS map. However, Fig 10 suggests that an error model based on a single random error with a given correlation length can be misleading. Indeed, we find that the measured error does not match the evolution of such an error model over larger averaging areas. This is due to the contribution of other error sources at lower spatial frequencies like the 4 km wavelength undulation observed in the residual.
The undulation pattern in Fig. 8 was observed in other Pléiades products, ASTER images (Girod et al., 2017) and World-View DEMs (Fig. 10 in Shean et al., 2016, Fig. 6 in Bessette-Kirton et al., 2018. It is likely a result of unmodeled attitude errors along-track (jitter). To gain insights into the contribution of this error to the total HS error, we averaged the HS residual in the across-track direction and used a Fourier transform to identify and correct this low frequency undulation. Then, we removed this error from the HS map. As a result, there is a better agreement between the measured and expected error (Fig. 10) (Rolstad et al., 2009).
We find that the selection of the image configuration and the processing options can lead to changes in the NMAD up to ~0.3 m. Fig. 10 suggests that this variation is likely to become insignificant if the HS map is aggregated at larger spatial scale (square typically larger than 100 m x 100 m). Such optimisation is therefore more important for the study of small-scale features (wind drift, avalanches, typically at decametric scale) or to decrease bias on specific terrain (south slopes, fields with isolated trees). The optimization of the photogrammetric processing can also be important when little stable terrain is available for the co-registration.

Comparison of satellite photogrammetry and airborne lidar
Airborne lidar provides HS maps with a better accuracy than Pléiades and potentially a finer horizontal resolution too (Painters et al., 2016). One strong advantage of airborne lidar is that it can measure HS under the tree canopy and in shaded areas. It is also able to acquire data in overcast conditions provided that the clouds are above the aircraft. However, from this study and Marti et al. (2016), it appears that the accuracy shortwave infrared band which enables a robust and unsupervised detection of the snow cover . Differencing terrain covered with vegetation from stable terrain would remain challenging.

Conclusion
We found a good agreement between snow depth (HS) maps from satellite very high resolution stereo images with lidar HS maps over 137 km² of mountainous terrain in California. The mean residual is +0.08 m and the NMAD 0.69 m. Satellite photogrammetry and airborne lidar methods agree at all aspects and over large range of slopes up to 60°. South facing slopes seem prone to a positive bias in the Pléiades HS (~0.2 m). These areas were found to have less texture in the panchromatic images. The main drawbacks of the satellite stereo HS method are the lack of data under dense tree cover, the reduced accuracy in shaded areas, and the current challenge to image large regions in a short time. We found that the accuracy of the maps was sensitive to the B/H and the photogrammetric processing options. Using the current ASP multi-view triangulation routines, we could not find a clear benefit from the use of a triplet of images compared to a pair with optimal B/H (about 0.2). For the error calculation, we suggest focusing on the spatial correlation of the error before averaging HS. We conclude that satellite photogrammetric measurements of HS are relevant for snow studies as they offer a good accuracy, a high level of automation and the potential to cover remote regions around the world.