Theoretical analysis of errors when estimating snow distribution through point measurements

In recent years, marked improvements in our knowledge of the statistical properties of the spatial distribution of snow properties have been achieved thanks to improvements in measuring technologies (e.g., LIDAR, terrestrial laser scanning (TLS), and ground-penetrating radar (GPR)). Despite this, objective and quantitative frameworks for the evaluation of errors in snow measurements have been lacking. Here, we present a theoretical framework for quantitative evaluations of the uncertainty in average snow depth derived from point measurements over a profile section or an area. The error is defined as the expected value of the squared difference between the real mean of the profile/field and the sample mean from a limited number of measurements. The model is tested for oneand two-dimensional survey designs that range from a single measurement to an increasing number of regularly spaced measurements. Using high-resolution ( ∼ 1 m) LIDAR snow depths at two locations in Colorado, we show that the sample errors follow the theoretical behavior. Furthermore, we show how the determination of the spatial location of the measurements can be reduced to an optimization problem for the case of the predefined number of measurements, or to the designation of an acceptable uncertainty level to determine the total number of regularly spaced measurements required to achieve such an error. On this basis, a series of figures are presented as an aid for snow survey design under the conditions described, and under the assumption of prior knowledge of the spatial covariance/correlation properties. With this methodology, better objective survey designs can be accomplished that are tailored to the specific applications for which the measurements are going to be used. The theoretical framework can be extended to other spatially distributed snow variables (e.g., SWE – snow water equivalent) whose statistical properties are comparable to those of snow depth.


Introduction
The assessment of uncertainties of snow measurements remains a challenging problem in snow science.Snow cover properties are highly heterogeneous over space and time and the representativeness of measurements of snow stage variables (e.g., snow depth, snow density, and snow water equivalent (SWE)) is often overlooked due to difficulties associated with the assessment of such uncertainties.This has been, at least in part, due to the limited knowledge of the characteristics of the spatial statistical properties of variables such as snow depth and SWE, particularly at the small scale (submeter to tens of meters).However, recent improvements in remote sensing of snow (e.g., light detection and ranging (LIDAR) and radar technologies) have allowed significant progress in the quantitative understanding of the small-scale heterogeneity of snow covers in different environments (e.g., Trujillo et al., 2007Trujillo et al., , 2009;;Mott et al., 2011).
Point or local measurements of snow properties will continue to be necessary for purposes ranging from inexpensive evaluation of the amount of snow over a particular area, to validation of models and remote sensing measurements.Such measurements have a footprint representative of a very small area surrounding the measurement location (i.e., support, following the nomenclature proposed by Blöschl, 1999), and the integration of several measurements is neces-Published by Copernicus Publications on behalf of the European Geosciences Union.E. Trujillo and M. Lehning: Theoretical analysis of errors when estimating snow distribution sary for a better representation of the snow variable in question over a given area.Because of this, tools for quantitative evaluations of the representativeness and uncertainty of measurements need to be introduced, and the uncertainty of such measurements should be more widely discussed in the field of snow sciences.
Currently, efforts to assess the reliability and uncertainty of snow measurements have focused on statistical analyses using point measurements (e.g., Pomeroy and Gray, 1995;Yang and Woo, 1999;Watson et al., 2006;Rice and Bales, 2010;Lopez-Moreno et al., 2011;Meromy et al., 2013) or synthetically generated fields in a Monte Carlo framework (e.g., Kronholm and Birkeland, 2007;Shea and Jamieson, 2010), comparisons between remotely sensed and ground data (e.g., Chang et al., 2005;Grünewald and Lehning, 2014), and analyses of subsets drawn from spatially distributed remotely sensed data (e.g., McCreight et al., 2014).These studies have been useful to empirically quantify uncertainties associated with point measurements.For example, Pomeroy and Gray (1995) present an equation for determining the minimum number of surveys points required to be confident that the mean falls within a certain envelop around the sample mean based on the CV of SWE or snow depth.McCreight et al. (2014) use the NASA's Cold Land Processes Experiment (CLPX) LIDAR snow depth data set (also used in this study) to empirically address questions regarding the inference of larger-scale snow depths from sparse observations.They evaluate estimation uncertainty from random sampling for varying sample size.Their conclusions indicate that adding observations to a randomly distributed survey pattern leads to a reduction in both percent-error in snow volume over the study areas, as well as its uncertainty.They also add that with a few hundred observations, one can expect to infer the true mean snow depth over the 1 km 2 domains to within 2 % error.Despite of these insights, these types of empirical approaches can be site-dependent, they do not provide a theoretical quantitative framework for the assessment of uncertainties associated with a particular sampling design, they do not allow for an optimal sampling strategy (e.g., selecting the number of points and locations for a desired accuracy level), and they do not take advantage of the increased knowledge of the characteristics of the heterogeneity of snow cover properties.
Another possible approach is one in which the expected error in the estimation of a particular statistical moment of a field over a defined domain (e.g., areal mean or standard deviation from a finite number of measurements) is determined on the basis of known statistical properties of the field in question.Such an approach uses geostatistical principles that have been proposed by Matheron (1955Matheron ( , 1970) ) and others, and that have been applied in mining geostatistics (Journel and Huijbregts, 1978), the analysis of uncertainties in measuring precipitation (Rodríguez-Iturbe and Mejía, 1974), and for a more general analysis of the effects of sampling of random fields as examples of environmental variables (e.g., Skøien and Blöschl, 2006).Implementation of these types of approaches appear to be lacking in the numerous studies using point measurements to represent snow distribution.Often in these studies, the spatial snow distribution derived from point measurements is addressed as the "true" distribution, which is then used for evaluating the performance of interpolation methodologies, regressions trees, and hydrological models.These comparisons ignore the intrinsic error incurred when extrapolating the original point measurements, leaving a proportion of uncertainty unaccounted for that can be significant.The principal motivation of the present study is to encourage the use of more objective and quantitative methodologies for error evaluation in snow sciences.The approach presented below can be used for objective survey design to estimate snow distribution from point measurements.We do not intend to present our approach as novel in the general geostatistical sense; instead, we present the derivation with the specific application for snow sciences in mind.However, because of the general nature of the random fields' theory the development is based on, similar developments can indeed be applied to other environmental variables that can be described as a random field.
On this basis, the error in the estimation of spatial means from point measurements over a particular domain (e.g., a profile, or an area) can be quantified as the expected value of the squared difference between the real mean and the sample mean obtained from a limited number of point measurements.Such an approach, as it will be shown here, uses spatial statistical properties of snow depth fields in a way that allows for an objective evaluation of the estimation error for snow depth measurements.The sections below illustrate the use of such methodology for optimal design of sample strategies in the specific context of snow depth.However, the methodology can also be implemented for other snow variables such as snow water equivalent.

Background
Let Z(x) denote a random field function of the coordinates x in the n-dimensional space R. Bold-italic letters represent a location vector from hereon.In our case, the field can represent, e.g., snow depth or snow water equivalent (SWE) at a given time of the year.The mean of the process over a domain A (e.g., a profile section or an area) is defined as follows: In practice, the mean is often obtained from the arithmetic average of measurements at a finite number of locations, N , within the domain as follows: The performance of the estimator Z can be evaluated by calculating the expected value of the square difference between the estimator Z and the true mean µ z (A) For a 1st order stationary process (i.e., the mean independent of location; e.g., Cressie (1993), Sect. 2;and Journel and Huijbregts (1978), Sect.2), (Eq. 3) can be expressed as where VAR[ ] and COV[ ] are the variance and the covariance, respectively.If we further assume that the process is second order stationary (e.g., Cressie (1993), Sect. 2;and Journel and Huijbregts (1978), Sect.2), that is, if the mean and the variance are independent of the location, and the covariance function depends only on the vector difference x i − x j .Equation (3) can be expressed as where CORR[ ] is the correlation function, and σ 2 p is the variance of the point process.
The first two terms in Eq. ( 5) are the total sum of the covariances (or correlation as σ 2 p has been factored out) between all point locations i = 1, . .., N (e.g., measurement locations).The first of the two terms is only a function of the number of points, while the second is a function of the number of points, N, and the correlations between the locations.Such correlations are themselves a function of the separation vectors (both in magnitude and direction), and the parameters of the correlation function.These two terms are independent of the size of the area A, and can be thought of as the portion of the error caused by the correlation between the point processes at the locations i = 1, . .., N (e.g., measurement locations).Term 3 accounts for the correlation between the measurement locations and the continuous process over the domain A. This term can be seen as a negative contribution to the total error assuming that the sum of the integrals is positive.The term is a function of the number of points, N , the domain area, A, the location of the points and the correlation structure, characterized using the parameters of the correlation function.Lastly, term 4 is the contribution to the error caused by the intrinsic correlation structure of the continuous process over the domain.This term is a function of the domain (e.g., size and shape of A) and the correlation structure (e.g., parameters of the correlation function).

Data
For the analyses and tests of the methodology presented here, light detection and ranging (LIDAR) snow depths obtained as part of the NASA's Cold Land Processes Experiment (CLPX) will be used (Cline et al., 2009).The data set consists of spatially distributed snow depths for 1 km × 1 km areas (intensive study areas -ISAs) in the Colorado Rocky Mountains close to maximum snow accumulation in April 2003.The data were processed from snow-on (8-9 April 2013) and snow-off (18-19 September 2013) LIDAR elevation returns with an average horizontal spacing of 1.5 m and vertical tolerance of 0.05 m.The final CLPX snow depth contour product (0.10 m vertical spacing) was generated from these returns.This product was used to generate gridded snow depth surfaces with 1024 × 1024 elements over the ISAs, for a grid resolution of 0.977 m.For this study two areas will be used: the Fraser-St Louis Creek ISA (FS) and the Rabbit Ears-Walton Creek ISA (RW) (Fig. 1).The FS ISA is covered by a moderate density coniferous (lodgepole pine) forest on a flat aspect with low relief.The RW ISA is characterized by a broad meadow interspersed with small, dense stands of coniferous forest and with low rolling topography.The snow depth distributions in these ISAs show differences that are relevant for the analysis of the methodology introduced here.At the FS ISA, the snow depth distribution is relatively isotropic (Fig. 1b), with short spatial correlation memory and little variation in the spatial scaling properties (i.e., power-spectral exponents and scaling breaks) with direction (Trujillo et al., 2007).On the other hand, the spatial distribution of snow depth in the RW ISA is more anisotropic (Fig. 1c), with longer spatial correlation memory along a principal direction aligned with the predominant wind direction vs. shorter memory along the perpendicular direction, and with variations in the power-spectral exponents and scaling breaks according to the predominant wind directions (Trujillo et al., 2007).

One-dimensional process
The spatial representation of the snow cover requires a basic assumption on the scale or resolution at which a field or profile is going to be represented.This relies on the spatial support of the measurements.For the case of snow depths, point measurements from local surveys using a snow depth probe are frequently used for this representation.Generally, there are additional sources of uncertainty associated with these types of measurements, such as the accuracy of the position of the measurement in space or deviations in the vertical angle of penetration of the probe through the snow pack.These uncertainties are additional to any of the uncertainties estimated using the methodology discussed here.The one-dimensional case provides a good opportunity to illustrate the limitations of point measurements.Consider the case of a snow depth profile that is measured using a snow depth probe at a regular spacing "d".Each of these point measurements is meant to represent the mean snow depth over a particular distance surrounding the measurement.The question is, over what distance is this assumption valid?In this case, the intrinsic assumption is that the measurement is representative over the distance "d", but at this point the validity of such an assumption is not proven.
The answer to this question is conditioned to how variable the profile is and over what distances.To address this, let us look at two snow depth profiles, one in a forested environment (FS) and another in an open environment (RW) in the Colorado Rocky Mountains (Figs. 2a and 3a, respectively).The variability in the profiles is markedly different, with variations over shorter distances in the forested area, and a smoother profile in the open and wind influenced environment.This is reflected in the spatial correlation structure of these snow depth profiles, with stronger correlations over longer distances in open and wind-influenced environments with respect to that in forested environments (Trujillo et al., 2007(Trujillo et al., , 2009)).These differences should be considered when selecting the sampling frequency required to capture the variability and accurately represent the mean conditions within a particular sampling spacing.This is illustrated by comparing the mean snow depth for a particular resolution to the point value at the center of the interval (Fig. 2b in a forested environment and Fig. 3b in an open and wind-influenced environment).In the figures, average vs. point values at several sampling intervals are compared for normalized profiles (µ = 0, σ = 1) separated every 30 m in both the x (east) and y (north) directions and for an area of 500 m × 500 m.The 30 m separation between profiles is chosen to reduce the spatial correlation between them.
Firstly, the resulting comparison shows that the point values generally overestimate the variability in mean snow depths if we replace the mean snow depth distribution by its point sample.To clarify this, let us consider here two snow depth profiles, one with the snow depths at the nominal scale (∼ 1 m), and a second one with a moving average (MA) of the first one with an averaging window equal to the sampling spacing.Ultimately, the variance/standard deviation of the first profile (∼ 1 m) is larger than that of the MA, with a distribution that reflects these differences.The samples drawn from the first profile will reflect a larger variance than that of the samples from the MA profile as they are drawn from these distributions, and this is what is reflected in Figs. 2 and  3.The degree of overestimation can be quantified through the slope of the regression line (in red in Figs.2b and 3b).In the forested environment (Fig. 2b), the slopes range between 0.8 and 0.13, with decreasing slopes with increasing spacing.These slopes indicate that, on average, the magnitudes of the mean values are 0.8 times the magnitudes of the point values for the 5 m spacing and 0.1 times for the 100 m spacing.In the open and wind-dominated environment, the slopes are higher and range between 0.97 and 0.23 from 5 m spacing and 100 m spacing, respectively.A clear difference emerges: forested environments require shorter separation between single measurements if the snow depth profile is to be accurately captured by the measurements.The variability within the size of the interval determines the degree of uncertainty associated with the point measurements, as the subinterval variability is related to the degree of overestimation of the mean value within the interval.
Secondly, the differences between average and point values for each spacing distance are generally more scattered in the forested environment than in the open environment, and in both environments the degree of scattering increases with spacing (Figs.2c and 3c).However, it is important to note here that we are comparing normalized profiles (µ = 0, σ = 1), allowing us to focus on the rescaled spatial variations.What is highlighted is the relevance of the spatial structure of the profile rather than the absolute variance.This spa- tial structure can be quantified by, for example, the spatial covariance/correlation function.
In addition to differences in correlation structure, there are also differences in the absolute variability in snow depth in these environments (Fig. 4).Contrary to the normalized snow depth discussed above, the subinterval standard deviation as a function of interval size along the profiles is higher in the open and wind-influenced environment at RW vs. the forested environment at FS (Fig. 4a).Mean standard deviation values in the open environment are twice as large as those at the forested environment towards the larger interval sizes (∼ 100 m).The standard deviation increases with interval size in both environments, with the steepest increase at the lower interval sizes.Furthermore, the standard deviation tends to stabilize more rapidly in the forested environments, with an increase of only 1.8 cm between 30 and 100 m.On the other hand, the standard deviation continues to increase in the open environment at RW, with less of an asymptotical behavior for the scales analyzed.Complementary, the shaded areas (25 to 75 % quantiles) give an idea of the variability of standard deviation values, with a much wider range in RW vs. FS, and an increase in the range between quantiles with interval size in RW.
Consistent with the standard deviation, the sub-interval mean range (range defined as the difference between the maximum and minimum snow depths within an interval) increases with interval size in both FS and RW (Fig. 4b).However, the mean range is larger in the open environment at RW and the rate of increase with interval size is also steeper.Similarly, the shaded areas indicate wider distribution of range values in the open environment at RW, while they are relatively uniformly distributed around the mean across interval sizes in the forested environment at FS.The results in Figs.2-4 illustrate this contrasting behavior between the snow covers in these environments and their influence on measurement strategies: that is, the forested environments requires shorter separation between measurements for accurate representation of the snow cover; however, in the windinfluence and open environment, the subinterval variability is higher indicating wider variations around any sampled measurement within the interval.Ultimately, the number and distance between measurements and the specific arrangement of the measurements are all conditioned to what the measurements are needed for.Hydrologic applications may not require a highly detail representation of a snow depth profile (or a field), and representing the average conditions over a given distance (or area) is sufficient, but small-scale process-based studies may require a more detailed characterization over shorter distances (or smaller areas).This implies that the decision depends on the particular usage that the measurements will support.In the following sections, the equations presented in the background (Sect.2) will be applied to evaluate the uncertainty associated with multiple measurement designs for profiles and fields of snow depth.

Case 1: single measurement along a profile section
Equation ( 2) can be used to evaluate the uncertainty of a single measurement along a profile section of length L. For this case, as well as for the following cases in this article, an exponential covariance with a decay exponent ν (ν > 0) will be assumed: where σ 2 is the variance, and h is the length of the vector h.For this one-dimensional case and combining Eqs. ( 6) and (5), the following expression is obtained: where x is the distance from one extreme of the section to the location of the measurement (Fig. 5a).The normalized squared error σ 2 Z (x, L, v) σ 2 p is minimized at x equal to half of the section length, L / 2, regardless of ν.The existence of a correlation in the profile leads to this solution, as the middle location contains more information about its surroundings.Also, this solution is different from the solution for an uncorrelated profile (e.g., white noise), for which the squared error would be equal to the variance, independent of the location of the measurement.
The results here are confirmed with an analysis of LIDAR snow depths profiles in FS and RW (Fig. 6).The analysis consists of calculating the difference between the mean and the point value for sections of a given length (varied between 10-50 m) and for x (Fig. 5a) between 0 and L along the profile sections.Each sample section of length L will provide a single difference for each of the x values.These sample differences are then used to calculate the mean normalized squared error for each x, and the same is repeated for each section length L. The results indicate that the real snow depth profiles behave as predicted by the model of the error, with a minimum error at x equal to half of the section length.Another difference highlighted by these results is the difference between the sample errors in the forested environment (FS) vs. the open environment (RW) for the larger interval sizes (e.g., 50 m).The sampled normalized squared error in the forested environment shows only a mild decrease in the square error to around 0.7-0.8towards the inside of the section length.However, this decrease is achieved for the measurement along most of the interval length with the exception of the extremes.This can be explained by the relationship between the spatial memory of snow depth (e.g., the correlation function) and the section length.Densely forested environments exhibit correlation lengths that are shorter than those in open and wind influenced environments (e.g., Trujillo et al., 2007Trujillo et al., , 2009)).As the section length increases beyond such correlation lengths, a measurement location towards the middle of the interval contains less information of the surrounding snow depths in a forested environment (e.g., FS) vs. an open and wind influenced environment (e.g., RW).This is observed in Fig. 6c vs. Fig.6f, with the results in RW showing a more clear minimum towards the center of the profile section.The results also show a poorer performance of the model in RW vs. FS, as the exponential correlation model has a poorer fit in RW at the shorter-lag range; However, model performance is improved for longer section lengths (e.g., Fig. 6c and f) Model and sampled results thus support that the measurement location can be fixed in the middle of the interval, and the normalized squared error can then be described as a function of both the exponential decay exponent, ν, and the length of the section, L (Fig. 7a).The normalized squared error increases with interval length, with a steeper increase for larger exponential decay exponents, for which the squared error approaches that of an uncorrelated field more rapidly.The theoretical model is tested on the snow depth fields at FS and RW.The test consists of calculating the sampled normalized squared error as the average of all squared differences between the mid-section snow depth and the mean from all LIDAR grid points within each interval of length L. This is done for profiles separated every 30 m, similar to the analysis above, and for profiles along the x and y directions.The theoretical normalized squared error is estimated from Eq. ( 7) using the exponential decay exponent from the model fitted to the sampled correlation function.The results show that the theoretical model reproduces the sampled squared error remarkably well, even reproducing the anisotropic properties of the correlograms, represented by the different exponents of the exponential model along x and y directions (Fig. 7b www.the-cryosphere.net/9/1249/2015/The Cryosphere, 9, 1249-1264, 2015 p ) for the case of a single measurement along a profile section of length L, as in Fig. 5a.The survey case applied to profiles in FS and RW along the x and y directions.Solid lines are the theoretical error using exponential decay exponents derived from the functions fitted to the sampled correlation functions of the two surfaces in the x and y directions.and c).The model also reproduces the different behavior of the squared error between both fields (i.e., FS and RW), showing that the normalized squared error increases more rapidly and is larger in the forested environment (Fig. 7b) vs. the open environment (Fig. 7c).However, it should be noted here that as the error is normalized and as the variance of the field in the open environment is larger (Fig. 4a), the absolute squared error could reach higher values in the open environment (RW).In this regard, one feature to discuss here is the assumption that the point variance of snow depth in these environments has been estimated as the spatial variance over the entire study area, as it is generally practiced in time series analysis and geostatistics.In practice, this is the only possible approach because there is limited information to estimate the point variance from multiple realizations of the process at each spatial location, as inter-annual and intra-annual snow depth fields are not available, not only for these areas, but for almost any area where this methodology may be applied.

Case 2: three measurements along a profile section
From Eq. ( 5) it is also evident that increasing the number of measurements will reduce the squared error.In the case of three measurements separated by a distance "a", with the middle measurement centered in the section of length L (Fig. 5b), and for an exponential covariance function with parameter ν, Eq. ( 5) leads to the following expression: Equation ( 8) can be minimized to determine the optimal separation distance between points, a, as a function of L and ν: where The combination of Eqs. ( 8) and ( 9) can be used to determine the normalized squared error, σ 2 Z σ 2 p , and the optimal distance, a optimal , for the measurement pattern in Fig. 5b.The model predicts that the normalized squared error is minimized at an intermediate location between 0 and L / 2 (black lines in Fig. 8a and b).The results show an increase in the error with interval size, L, as well as little sensitivity of a optimal to ν.This latter feature can be seen as an advantage since small biases in the estimation of ν will not result in significant biases in the estimation of a optimal .One could almost assume a value of a optimal without prior knowledge of the exponential decay exponent, selecting a optimal within the range of values indicated by the model for a rage of possible exponential decay exponents.Note that a optimal is located close to the 60 % distance from the center towards the outer boundary of the profile section for all section lengths (Fig. 8a and b).On the other hand, the measurement error displays a higher sensitivity to ν around a optimal , indicating that biases in the estimation of ν would have a more noticeable effect on the estimation of the measurement error.This is further clarified in Fig. 8c, in which the normalized error (not squared) and a optimal can be obtained for corresponding profile section lengths (L) and exponential decay exponents (ν) based on the isolines shown.For example, for a profile section of 30 m, and an exponential decay exponent of 0.2 m −1 , the normalized error is 0.32 and a optimal is 9.63 m (see intersect of the two isolines in Fig. 8c).The normalized error in Fig. 8c is not squared, highlighting the sensitivity of the measurement error to ν, which represents the degree of spatial correlation of the profile in this case (e.g., lower values indicate stronger spatial memory/correlation, hence lower measurement errors).
The performance of the model is tested against the normalized squared error obtained from the same snow depth profiles in FS and RW.The test consists of estimating the normalized squared error for profiles sections of length between 10 and 80 m, with a being varied between 0 and L / 2 (Fig. 9).For each value of a, the normalized squared error is estimated based on the means obtained using the three snow depth samples for each section.All squared differences are then averaged to obtain the values presented in the figure.mum error is also reproduced by the model proving the applicability of the model for estimating the optimal separation between measurements.The model does perform better in the forested environment of FS vs. RW, particularly for lower a values.This can be justified as the exponential covariance model displays a better fit in FS over RW, particularly over the lower range of lag values.Also, note that both the modeled and sampled normalized squared errors are lower for the snow depth profiles at RW because of the longer spatial memory of the snow depth distribution in this environment (higher spatial correlations) when compared to that in FS.

Case 3: N measurements along a profile section
As stated above, the measurement error can be reduced by increasing the number of measurements taken over a given section of length L. Let us focus on the case of stratified sampling where N regularly spaced measurements are taken over the interval (Fig. 5c), and to quantify this reduction we can use Eq. ( 5) and the exponential covariance model.Equation (5) can then be reduced to the following: The normalized squared error (σ 2 Z σ 2 p ) obtained with Eq. ( 10) for profiles sections of lengths between 10 and 80 shows a steep decrease with N (Fig. 10), with a steeper decrease for higher exponential decay exponents.For the longer profile sections (e.g., 80, Fig. 10d), small reductions in the squared error are achieved beyond only a few measurements (e.g., N = 16).Equation ( 10) and the results in Fig. 10 can be used to determine the number of measurements necessary to achieve a desired accuracy level.One could, for example, design a survey to sample a snow depth profile with a mean value every 10 m.The number of measurements required to achieve a desired level of accuracy can be obtained from Fig. 10a, based on previous knowledge of the sam-   ple estimate of the exponential decay exponent.This can be achieved thanks to the intra-annual and inter-annual persistence of the spatial patterns, and hence, the spatial statistical properties of snow depth fields in mountain environments, as shown in previous studies using both manual surveys and LIDAR measurements (e.g., Deems et al., 2008;Sturm and Wagner, 2010;Schirmer et al., 2011;Melvold and Skaugen, 2013;Helfrich et al., 2014).A detailed spatial survey (e.g., dense manual measurements or TLS), sampling different portions of an area can be used to determine the covariance/correlation characteristics of the snow depth distribution, with which the model for the error can be applied.An a priori estimate of the exponential decay exponent may also be possible and will be tested in future applications of the framework, given the relative insensitivity of the error with respect to ν.
Following the method described in the previous section, we test the performance of the model against the normalized www.the-cryosphere.net/9/1249/2015/The Cryosphere, 9, 1249-1264, 2015 squared error obtained from the same snow depth profiles in FS and RW.In this case, the sampled squared error is estimated for N measurements distributed along profile sections of length L. As the snow depth fields are gridded at ∼ 1 m resolution, the location of the measurements is approximated to the closest coordinate in the profile section following the pattern in Fig. 5c.Once again, sampled and modeled errors follow closely the same trend for the different L values in both FS and RW (Fig. 11).The error decreases with N, with a rapid decay at the lower N values, illustrating that the error can be drastically reduced by simply increasing the number of measurements by a small amount.The normalized squared error across all N values is lower for RW than for FS, consistent with the higher spatial correlations observed in the snow depth fields of RW vs. FS.Once again, there are some differences between the sampled and modeled normalized squared error in RW for the shorter profile lengths and for small N values: a consequence of the poorer fit of the exponential model for the shorter lag range in RW.However, the model is still able to reproduce the error in both fields, and the applicability of the model is illustrated even when the fit of the correlation model can be improved.

Two-dimensional process
Similar to the one-dimensional process, Eq. ( 5) can be formulated to calculate the squared error in the two-dimensional space.To exemplify this, we apply the methodology to an isotropic process over the x − y plane for three cases in a square area: (a) one single measurement in the center of the area, (b) five measurements radiating out from the center (Fig. 12a), and (c) N by N measurements regularly spaced in the x and y directions (Fig. 12b).
For the isotropic case, the covariance/correlation function is only dependent on the magnitude of the lag vector, and, consequently, the error is represented by, The exponential correlation function for the isotropic case takes the following form: where h is the magnitude of the lag vector.Replacing the correlation function in the expression for σ 2 Z , we obtain, For the case of a rectangular area of side dimension L x and L y in the corresponding x and y directions, the equation becomes, The limits of the integrals can be changed depending on the desired location of the origin.In this case, the origin is located at the lower-left corner.
As discussed earlier, the first term is only a function of N , such that the base error is the variance of the point process divided by the number of points.The second term is a function of N , the location of the points, and the decay rate ν.The third term is a function of N , A, the location of the points, and the decay rate ν.The fourth term is a function of A and ν, but is independent of the location of the points and N (i.e., independent of the survey design, and only a function of the correlation structure of the continuous process).p ) as a function of the distance a from the center of the area for square areas of side dimensions (L) between 10 and 80.Each curve corresponds to an exponential decay (ν) between 0.1 and 5.

Case 1: single measurement in the center of the area
In this case, we focus on a single measurement in the middle of a square area of side dimension L. Numerical solution of Eq. ( 15) shows that the normalized squared error increases rapidly with L, with a steeper increase for higher exponential decay exponents (Fig. 13a), which approach a normalized squared error of 1 for L values less than 10 (e.g., 1 ≤ ν ≤ 5).The theoretical results in Fig. 13a can be used to determine the discrepancy between a single measurement in the middle of an area and the areal mean for a second order stationary and anisotropic process with an exponential covariance/correlation function.Comparison of the modeled and sampled normalized square errors for the FS snow depth field indicate very good agreement between modeled and sample errors (Fig. 13b).The sample error is estimated following the same procedure explained for the one-dimensional cases, although in the two-dimensional space.Both sampled and modeled errors show the same behavior across L values between 1 and 100 m, although the scatter in the sampled error increases for larger L values.This can be explained by the smaller number of samples to estimate the mean normalized squared error and the fact that the correlation structure decays rapidly and a single sample becomes less correlated to the surrounding area for these larger areas.The model introduced here can then be used to assess the representativeness of a single measurement over an area objectively and accurately, and it can be extended for other covariance/correlation functions as needed.p ) for the N by N point pattern in Fig. 12b, and over square areas of side dimensions (L) between 10.7 and 79.1 m.The solid black point markers are the theoretical error for an exponential decay exponent of 0.17 derived from the sampled correlogram of snow depth in FS.The dotted red lines with circle markers are the sampled normalized squared error for the snow cover in FS.

Case 2: five measurements radiating out from the center of the area
The case of five measurements radiating out from the center (Fig. 12a), with a point in the middle of the area and four points separated by a distance a from the center leads to a similar optimization problem as illustrated in case 2 of the one-dimensional examples (Sect.4.2).In the twodimensional case, Eq. ( 15) does not have an explicit solution for a, and numerical implementation is required.The equation can be solved by simply replacing the point coordinates and the correlation function parameters.Following this ap-proach, the normalized squared error can be obtained for areas of varying sizes (Fig. 14).Similar to the one-dimensional example (case 2, Sect.4.2), σ 2 Z σ 2 p decreases with a, reaching a minimum at an intermediate distance from the middle point outwards.The decay in σ 2 Z σ 2 p is more rapid for the least correlated processes (i.e., higher decay exponents) reaching a value close to the base normalized square error that is a function of the number of points (i.e., 1/N = 1/5 in this case).An extended analysis of the effect of each of the terms in the equation is included in the Supplement.The error, as shown in Fig. 14, is minimized as a consequence of two balancing terms that lead to this intermediate solution.The optimal solution is a balance between reducing the correlation between the individual measurements (e.g., increasing the separation between the location of the measurements) but increasing the correlation between the measurements and the surrounding area (e.g., locating the measurements closer to the middle of the area).These two competing effects lead to an optimization problem based on the location of the point measurements.For the least correlated processes, the error resembles the behavior of an uncorrelated field once the measurements become effectively decorrelated (e.g., a > 1 in Fig. 14b for ν = 5).Figure 14 exemplifies how Eq. ( 14) can be used to determine the optimal measurement location for areas of different sizes, and to determine the associated error with configurations other than the optimal.
The performance of the model is tested against the normalized squared error obtained from the snow depth field in FS.The test consists of estimating the normalized squared error for a square area with side length (L) between 10 and 79 m, with a being varied between 0 and L / 2 (Fig. 15).For each value of a, the normalized squared error is estimated based on the means obtained using the five snow depth samples for each section.All squared differences are then averaged to obtain the values presented in the figure.Once again, the sampled and modeled errors follow the same trend across all a values and for the different L values.The minimum error and a optimal are also reproduced closely by the model, and as the area size increases, the sampled and modeled error approach the error for an uncorrelated field at larger separations (i.e., 0.2).These results illustrate that the performance of the model in the two-dimensional space is remarkable, similar to what was observed in the one-dimensional case.

Case 3: N by N measurements regularly spaced in the x and y directions
Similarly to the one-dimensional case, the two-dimensional case of N by N regularly spaced measurements (Fig. 12b) leads to a decreasing normalized squared error with N (Fig. 16).There is a sharp decrease in the error by just increasing the number of measurements in the lower range of N .The analysis illustrates that stratified sampling, as shown here, is an excellent approach for minimizing the error.For The Cryosphere, 9, 1249-1264, 2015 www.the-cryosphere.net/9/1249/2015/E. Trujillo and M. Lehning: Theoretical analysis of errors when estimating snow distribution 1263 a 10 by 10 area for example, increasing N to 4 (N 2 = 16) reduces the normalized squared error to less than 0.05.It is also worth noting here that for this two-dimensional case, the error is less sensitive to the value of the exponential decay exponent (ν) for the higher N values as the mean is accurately captured regardless of the correlation of the field.Beyond a certain number of measurements regularly distributed in the area, the measurements gather enough information such that there are only very minor improvements with the addition of new measurements, regardless of the exponent value.Figure 16 serves as an example of how the methodology can be used for objective selection of the number of measurements necessary to achieve a desired accuracy level using prior knowledge of the spatial covariance function.
The performance of the model is tested again for a square area with side length (L) between 10 and 79 m using the snow depth field in FS, and for an increasing number of rows/columns of measurements leading to a total number of measurements of N 2 (Fig. 17).The results illustrate again the accurate performance of the theoretical model, with sampled and model errors following closely the same squared errors.Both sampled and modeled errors increase as the size of the area increases, as expected.These results complete the model performance tests for the two-dimensional isotropic case.

Summary and conclusions
A methodology for an objective evaluation of the error in capturing mean snow depths from point measurements is presented based on the expected value of the squared difference between the real average snow depth and the mean of a finite number of snow depth samples within a defined domain (e.g., a profile section or an area).The model can be used for assisting the design of survey strategies such that the error is minimized in the case of a limited and predetermined number of measurements, or such that the desired number of measurements is determined based on a predefined acceptable uncertainty level.The model is applied to one-and twodimensional survey examples using LIDAR snow depths collected in the Colorado Rockies.The results confirm that the model is capable of reproducing the estimation error of the mean from a finite number of samples for real snow depth fields.
Here, we should highlight some of the implications of the assumptions made in the model.In simplified terms, the second-order stationarity assumption implies that the mean and the variance of the process/variable (e.g., snow depth) are independent of the spatial location, and that the covariance is dependent only on the separation vector (i.e., lag).Although these assumptions may be less valid over larger scales (e.g., greater than 100 m), in the context of the model application to snow depth the assumption should be valid at smaller scales.We present these examples to show how the error can be quantified with good accuracy at such smaller scales.Ap-plication of these types of approaches at larger scales will require additional evaluation with particular attention as to what the specific demands of the application are.Also, the methodology presented here is not suitable for discontinuous snow cover if both snow-covered and snow-free areas are considered in the error estimation.This case has not been considered in the development here.
Implementation of the model in practice requires prior assumption of a correlation/covariance model and estimates of the model parameters (e.g., the decay exponent for the exponential case).In the examples presented here we use LIDAR data for the parameter estimation to illustrate the applicability of the model and its ability to estimate the error using real snow depth data.Snow distributions in mountain environments have been shown to be consistent intra-and interannually because the controlling processes are relatively consistent during the season and from season to season.Such consistency suggests that the correlation/covariance model should also be consistent, as well as the parameters of the model.These parameters can be estimated via a dense survey either manually or with TLS of one or more small plots of a size similar to the size that is aimed to be represented.These surveys would not necessarily have to be repeated as the parameters and covariance models should be preserved.Detailed surveys can be conducted under different conditions to characterize the range of the correlation models and parameters (e.g., after a snow storm, or close to peak accumulation).Also here, we should point out that although we show results for a wide range of the exponential decay exponent values, we are finding that most of the values that we have observed are in the lower range of those presented (e.g., 0.1-0.2m −1 ).Hence, the biases in the estimated error and the survey design remain small.
Currently, remote sensing technologies (e.g., TLS, Airborne LIDAR, and ground penetrating radar) are allowing for the characterization of snow cover properties at increasing resolutions in both space and time.Such improvements can be utilized in the context presented here providing information about the range of best fitting covariance/correlation models and parameters for different conditions, supporting the application of methodologies such as the one presented here.With such improvements, survey designs can be optimized such that estimation errors can be explicitly addressed and accounted for, particularly when extrapolating a limited number of measurements to estimate the spatial distribution of snow.Such applications will continue to be relevant despite of the aforementioned improvements, as access to these technologies is limited by their cost and the expertise that is required for their application.

E. Trujillo and M. Lehning: Theoretical analysis of errors when estimating snow distribution
The Supplement related to this article is available online at doi:10.5194/tc-9-1249-2015-supplement.
Figure 1.(a) Location of the Fraser and Rabbit Ears study areas in the state of Colorado.(b) LIDAR Snow depth distributions on 8 April 2003, at the Saint Louis Creek intensive study area (ISA) and (c) on 9 April at the Rabbit Ears ISA.

Figure 2 .
Figure 2. (a) Sample normalized snow depth profile (mean = 0, standard deviation = 1) in a forested environment from LIDAR (1 m resolution) at the Fraser-St.Louis Creek (FS) intensive study area (ISA) of the Cold Land Processes Experiment (CLPX) (Trujillo et al., 2007; Cline et al., 2009).The profile is sampled with regular separations (spacing) of 5, 10, 25, 50, and 100 m (from top to bottom, respectively).(b) Average values within sampling intervals (same as in a) vs. point samples for normalized snow depth profiles in the FS ISA.The red line is a linear regression fit, with slope β and r 2 as indicated in each plot.(c) Histograms of the difference between the point and average values for each of the sampling intervals.The vertical red line marks the mean difference.
Figure 3. (a) As Fig. 2 but for an open and wind influenced environment at the Rabbit Ears-Walton Creek (RW) ISA of the CLPX (Trujillo et al., 2007; Cline et al., 2009).(b) Average values within sampling intervals (same as in a) vs. point samples for normalized snow depth profiles in the RW ISA.The red line is a linear regression fit, with slope β and r 2 as indicated in each plot.(c) Histograms of the difference between the point and average values for each of the sampling intervals.The vertical red line marks the mean difference.

Figure 4 .
Figure 4. Sub-interval standard deviation (a) and range (b) for varying interval lengths for profiles of snow depth in a forested environment (FS) and an open and wind-influenced environment (RW) in the Colorado Rocky Mountains (same regions as those in Figs. 2 and 3).The mean standard deviation and mean range for the study areas are shown by the solid lines, while the shaded areas cover the quantiles between 25 and 75 % of the values for all the intervals in these areas.

Figure 5 .
Figure 5. Survey designs for the sampling of a snow profile.

Figure 6 .
Figure 6.Comparison of the theoretical and sampled normalized squared error (σ 2 Z σ 2p ) for the case of a single measurement along a profile section of length L, as in Fig.5a.The survey case applied to profiles in FS and RW along the x and y directions.Solid lines are the theoretical error using exponential decay exponents derived from the functions fitted to the sampled correlation functions of the two surfaces in the x and y directions.

Figure 7 .
Figure 7. (a) Theoretical normalized squared error for a single measurement in the middle of a section of length, L, and for an exponential correlation function with a decay exponent, ν.(b) and (c) Comparison of the theoretical and sampled normalized squared error for the same survey case applied to profiles in FS and RW along the x and y directions.Dashed lines are the theoretical error from Eq. (7) using exponential decay exponents derived from the functions fitted to the sampled correlation functions of the two surfaces in the x and y directions.
Figure 8.(a) and (b) Theoretical normalized squared error for the three-point pattern along a profile section in Fig. 5b, and for profile section lengths (L) of 1 (a) and 25 (b).Each of the colored lines corresponds to a specific decay exponent, ν, and the black line marks the theoretical solution for a optimal .(c) Theoretical normalized error and a optimal for isolines of profile section lengths (L) and exponential decay exponents (ν) for the three-point pattern along a profile section of length L in Fig. 5b.

Figure 9 .
Figure 9. Theoretical and sampled normalized squared error (σ 2 Z σ 2p ) for the three-point pattern along a profile section in Fig.5b, and for profile section lengths (L) between 10 and 80 m in FS and RW.The solid lines are the theoretical error from Eq. (8) using exponential decay exponents derived from the functions fitted to the sampled correlation functions of the two surfaces in the x and y directions, while the dots correspond to the sampled error for profiles in FS (a-d) and RW (e-h).

Figure 10 .
Figure 10.Theoretical normalized squared error (σ 2 Z σ 2p ) for the N-point pattern along a profile section in Fig.5c, and for profile section lengths (L) between 10 and 80 obtained from Eq. (10).

Figure 11 .
Figure 11.Theoretical and sampled normalized squared error (σ 2 Z σ 2 p ) for the N-point pattern along a profile section in Fig. 5c, and for profile section lengths (L) between 10 and 80 m in FS and RW.The solid point markers are the theoretical error from (10) using exponential decay exponents derived from the functions fitted to the sampled correlograms of the two surfaces in the x and y directions, while the circle markers with the dotted lines correspond to the sampled error for profiles in FS (a-d) and RW (e-h).

Figure 12 .
Figure 12.Sample survey designs with (a) a five-point pattern centered in the area, and (b) a regularly spaced pattern.For the fivepoint pattern, a can vary between 0 and L / 2, while for the N × N points pattern, the separation between the measurements is determined by the number of points.

Figure 13 .
Figure 13.(a) Theoretical normalized squared error (σ 2 Z σ 2p ) for the two-dimensional case with a single measurement in the middle of a square area with side dimension L. (b) Theoretical and sampled normalized squared error for the same two-dimensional survey applied to the snow depth field in FS.The dashed line is the theoretical error derived for an exponential decay exponent of 0.17 derived from the sampled correlation function of snow depth in FS, while the solid line is the sampled normalized squared error for the snow cover in FS.

Figure 14 .
Figure 14.Theoretical normalized squared error (σ 2 Z σ 2p ) as a function of the distance a from the center of the area for square areas of side dimensions (L) between 10 and 80.Each curve corresponds to an exponential decay (ν) between 0.1 and 5.

Figure 15 .
Figure 15.Theoretical and sampled normalized squared error (σ 2 Z σ 2p ) for the 5-point pattern in Fig.12aover square areas of side dimensions (L) between 10.7 and 79.1 m.The separation distance (a) is varied from the center outwards.The solid line is the theoretical error derived for an exponential decay exponent of 0.17 derived from the sampled correlation function of snow depth in FS, while the solid red point markers are the sampled normalized squared error for the snow cover in FS.

Figure 16 .
Figure 16.Theoretical normalized squared error (σ 2 Z σ 2p ) for the N by N point pattern in Fig.12b, and for areas of side dimension (L) between 10 and 80.The exponential exponent is varied between 0.1 and 5.

Figure 17 .
Figure 17.Theoretical and sampled normalized squared error (σ 2 Z σ 2p ) for the N by N point pattern in Fig.12b, and over square areas of side dimensions (L) between 10.7 and 79.1 m.The solid black point markers are the theoretical error for an exponential decay exponent of 0.17 derived from the sampled correlogram of snow depth in FS.The dotted red lines with circle markers are the sampled normalized squared error for the snow cover in FS.