On the Green’s function emergence from interferometry of seismic wavefields generated in high-melt glaciers: implications for passive imaging and monitoring

Ambient noise seismology has revolutionized seismic characterization of the Earth’s crust from local to global scales. The elastic Green’s function (GF) between two receivers can be reconstructed via cross-correlation of the ambient noise seismograms. An homogenized wavefield illuminating the propagation medium in all directions is a pre-requesite for obtaining accurate GF. For seismic data recorded on glaciers, this condition imposes strong limitations on GF convergence, because of minimal seismic scattering in homogeneous ice. We address this difficulty by investigating three patterns of seismic wave5 fields: a favourable distribution of icequakes and noise sources recorded on a dense array of 98 sensors on Glacier d’Argentière (France), a dominant noise source constituted by a moulin within a smaller seismic array on the Greenland ice-sheet, and crevasse-generated scattering at Gornergletscher (Switzerland). In Glacier d’Argentière, surface melt routing through englacial channels produces turbulent water flow creating sustained ambient seismic sources and thus favorable conditions for GF estimates. From the velocity measurements of reconstructed Rayleigh waves, we invert bed properties and depth profiles, and map 10 seismic anisotropy, which is likely introduced by crevassing. In Greenland, we employ an advanced pre-processing scheme which include match-field processing and eigenspectral equalization of the cross-spectra to remove the moulin source signature and reduce the effect of inhomogeneous wavefields on the GF. At Gornergletscher, cross-correlations of icequake coda waves show evidence for homogenized wavefields. Optimization of coda correlation windows further promotes the GF convergence. This study presents new processing schemes on suitable array geometries for passive seismic imaging and monitoring 15


Introduction
Passive seismic techniques have proven efficient to better understand and monitor glacier processes on a wide range of time and spatial scales. Improvements in portable instrumentation have allowed rapid deployments of seismic networks in remote terrain and harsh polar conditions (Podolskiy and Walter, 2016;Aster and Winberry, 2017). Studies on seismic source processes 25 have revealed unprecedented details about englacial fracture propagation (e.g. Walter et al., 2009;Mikesell et al., 2012), basal processes (e.g. Winberry et al., 2013;Röösli et al., 2016a;Lipovsky et al., 2019), glacier hydrology (Bartholomaus et al., 2015;Gimbert et al., 2016) and iceberg calving (e.g. Walter et al., 2010;Bartholomaus et al., 2012;Sergeant et al., 2016Sergeant et al., , 2018. The subsurface structure of ice sheets and glaciers has been characterized by analysis of seismic wave propagation in ice bodies. As a few examples, Harland et al. (2013) and Smith et al. (2017) used records of basal seismicity to measure elastic 30 anisotropy in two Antarctic ice streams. Lindner et al. (2019) identified crevasse-induced anisotropy in an Alpine glacier from velocity anomalies by analyzing icequake seismograms at seismic arrays. Walter et al. (2015) used transient seismic signals generated in moulins to compute frequency-dependent seismic velocities through matched-field processing and estimate the depth of the ice-to-bedrock transition beneath a seismic network deployed on the Greenland Ice Sheet (GIS).
At the same time, a new approach appeared in seismology which explores not only earthquakes but also ambient noise 35 sources generated by climate and ocean activity (Ekström, 2001;Rhie and Romanowicz, 2004;Webb, 1998;Bonnefoy-Claudet et al., 2006). Shapiro and Campillo (2004) and Shapiro et al. (2005) pointed out the possibility of using continuous noise recordings to reconstruct propagating surface waves across a seismic array and to use them for crustal tomography in California.
Other studies followed, shaping the analysis of ambient noise background into a powerful tool to constrain the elastic properties of the illuminated medium, making it possible to image the Earth's interior from regional (Yang et al., 2007;Lin et al., 2008) to 40 local scales (e.g. Lin et al., 2013;Nakata et al., 2015) and monitor e.g., seismic fault (e.g. Brenguier et al., 2008b;Olivier et al., 2015) and volcanic processes (Sens-Schönfelder and Wegler, 2006;Brenguier et al., 2011). Moreover, ambient noise studies have so far led to original observations such as thermal variations of the subsoil, spatio-temporal evolution of the water content, stress changes along fault zones with applications to geomechanics, hydrology and natural hazards (Larose et al., 2015).
For the cryosphere, few studies have successfully used oceanic ambient noise at permanent broadband stations deployed on terest (see also Sect. 6.2). This technique proved to be efficient to suppress spurious arrivals in the cross-correlation function which emerge in the presence of heterogeneously distributed sources. However, this method was applied on active sources and 90 synthetic seismograms and its viability still needs to be addressed for passive recordings.
In this study, we provide a catalogue of methods to tackle the challenge of applying passive seismic interferometry on glaciers in the absence of significant scattering and/or an isotropic source distribution. After a review on glacier seismic sources (Sect. 2.1), we investigate the GF retrieval on three glacier settings with different patterns of seismic wavefields. In a first ideal case (Glacier d'Argentière in the French Alps, Sect. 3), we take advantage of a favorable distribution of noise sources and 95 icequakes recorded on a dense array. In a second case (GIS, Sect. 4), a dominant persistent noise source constituted by a moulin prevents the accurate estimate of the GF across the array. We use a recently proposed scheme (Corciulo et al., 2012;Seydoux et al., 2017) that involves matched-field processing to remove the moulin signature and improve the GF estimates. In a third case (Gornergletscher in Swiss Alps, Sect. 5), the limited distribution of icequakes is overcome by the use of crevassegenerated scattered coda waves to obtain homogenized diffuse wavefields before conducting cross-correlations. In order to 100 serve as a practical scheme for future studies, the three above sections are nearly independent from each other. They focus on the processing schemes to compute or improve the GF estimates. We refer the reader who is not familiar with ambient noise seismic processing to the appendix sections providing details on seismic detection methods, seismic array processing, and seismic velocity measurements. Finally, in the light of our analysis, we discuss suitable array geometries and measurement types for future applications of passive seismic imaging and monitoring studies on glaciers. Glaciogenic seismic waves couple with the bulk Earth, and can be recorded by seismometers deployed at local (Podolskiy and Walter, 2016) to global ranges (Ekström et al., 2003). In this study, we focus on three classes of local sources. For an exhaustive inventory of glacier seismicity and associated source mechanisms, we refer to the review papers of Podolskiy and 110  and Aster and Winberry (2017).
Typically on Alpine glaciers and more generally in ablation zones, the most abundant class of recorded seismicity is related to brittle ice failure which leads to the formation of near-surface crevasse (e.g. Neave and Savage, 1970;Mikesell et al., 2012;Röösli et al., 2014;Podolskiy et al., 2019) and the generation of 10 2 − 10 3 daily recorded icequakes ( Fig. 1a-b). Near-surface icequakes have local magnitudes -1 to 1 and seismic waves propagate a few hundred meters before falling below the background  Roux et al., 2010;Mikesell et al., 2012). The short duration and weak seismic coda after the Rayleigh wave arrival (compared to earthquake coda propagating in the crust, Fig. 1a, see also further details in Sect. 5.1) are the result of limited englacial scattering. This typically allows seismologists to approximate the glacier's seismic velocity model by a homogeneous ice layer on top of a rock half-space when locating events or modelling seismic waveforms (e.g. Walter et al., 2008Walter et al., , 2015. 125 From spring to the end of summer, another seismic source superimposes on icequake records and takes its origin in fluvial processes. Ice melting and glacier runoff create turbulent water flow at the ice surface that interacts with englacial and subglacial linked conduits. Gravity-driven transport of meltwater creates transient forces on the bulk Earth (e.g. Schmandt et al., 2013;Gimbert et al., 2014) and surrounding ice (Gimbert et al., 2016) that generate a mix of body and surface waves (Lindner et al., 2020;Vore et al., 2019). Meltwater flow noise is recorded continuously at frequencies 1-20 Hz as shown in the one-130 month spectrogram of ground velocity at Glacier d'Argentière (Fig. 1b). Seismic noise power shows diurnal variations that are correlated with higher discharge during daytime and reduced water pressure at night Nanni et al., 2019b).
Englacial and subglacial conduits can also generate accoustic (Gräff et al., 2019) and seismic wave resonances (Röösli et al., 2014) known as hydraulic tremors. In the presence of moulin, water flowing to the glacier base creates seismic tremors (Fig. 135 1c) which often dominate the ambient noise during peak melt hours. Frequency bands of either elevated or suppressed seismic energy reflect the geometry of the englacial conduit as it acts as a resonating semi-open pipe, modulated by the moulin water level (Röösli et al., 2016b).
Finally, in Alpine environments, seismic signatures of anthropogenic activity generally overlap with glacier ambient noise at frequencies > 1 Hz. Whereas anthropogenic monochromatic sources can usually be distinguished by their temporal pattern 140 , separation of all active sources recorded on glacier seismograms can prove difficult. Nevertheless, locating the source regions through matched-field processing (Corciulo et al., 2012;Chmiel et al., 2015) can help to identify the noise source processes in glaciated environments (Sect. 4).

Study sites and seismic experiments
We use seismic recordings from three seasonally-deployed networks in the ablation zones of two temperate Alpine glaciers 145 and of the GIS. Each of the acquired datasets presents different patterns of seismic wavefields corresponding to the three configurations investigated for GF estimate retrieval, as defined in the introduction. All networks recorded varying amounts of near-surface icequakes (blue dots in Fig. 2a-c). Different processing schemes were used to constitute the icequake catalogues and are detailed in Appendix A. In this study we only use vertical component data of ground velocity to generate ZZ crosscorrelation functions which primarily contain the Rayleigh wave fundamental mode (Shapiro and Campillo, 2004). Some of the 150 datasets involve surface seismometers whose horizontal components are regularly shifted over the course of the melting season.
Obtaining GF estimates from horizontal component data require additional pre-processing to obtain accurate orientations of the seismic sensors.

Greenland Ice Sheet array
The GIS network (Fig. 2b) was deployed 30 km North of the calving front of the Jakobshavn Isbrae from 2011 July 2 to August 17. The details of the study site and the seismic network can be found in Röösli et al. (2014); Ryser et al. (2014) and Andrews et al. (2014). We use seismic recordings from 13 stations: 12 seismometers (1 Hz Lennartz) installed on the surface or shallow boreholes (2-3 m deep), and one surface broadband seismometer (Trillium Compact 120 s corner period).

170
Seismometers recorded continuously with a sampling frequency of 500 Hz. The array has a 1.8 km aperture. It is located around a prominent moulin with an average intake of 2.5 m 3 /s of meltwater. At the study site, the ice is approximately 600 m-thick and flows at ∼ 0.3 m/d (Röösli et al., 2016a).

Gornergletscher array
The Gornergletscher network (Fig. 2c)  Geospace 11D and one 28 Hz Geospace 20D) installed in shallow boreholes (2-3 m deep). They recorded continuously with a sampling frequency of 1000 Hz. The array has a 320 m aperture. At the study site, the ice is approximately 160 m-thick and flows at ∼ 0.1 m/d (Walter, 2009).
3 Passive interferometry at the Glacier d'Argentière dense array We use here a standardized processing scheme for computing GF estimates. We either cross-correlate seismogram time win-180 dows which encompass ballistic seismic waves of the icequake catalogue, or cross-correlate continuous seismograms as traditionally done in ambient noise studies. Prior to any calculation, seismic records are corrected for instrumental response and converted to ground velocity. Seismograms are then spectrally whitened between 1 and 50 Hz because of low instrumental sensitivity at lower frequency.
For icequake cross-correlation (ICC), we follow the method of Gouédard et al. (2008b) and Walter et al. (2015) on 11.1×10 3 185 events. The length of the correlation window T is adjusted to the nature of seismic sources and the array aperture. Here we use T = 0.5 s given the short icequake duration and the maximum station separation of 690 m. To avoid near-field source effects and to account for near-planar wave fronts, we select events that lie outside a circle centered at the midpoint between the two considered stations and with a radius equal to the inter-station distance (Fig. B4a). The plane wave approximation implies a sinusoidal dependence of the arrival times with respect to event azimuth (Fig. B4b). When stacking the individual 190 ICC on all events, only the sources that lie in the stationary phase zones, i.e. aligned with the two-receiver direction, actually contribute to the GF (Gouédard et al., 2008b). The aperture of the stationary phase zones, also called "the endfire lobes" (Fig.   B4a), depends on the considered seismic wavelength (Roux et al., 2004). In the case of anisotropic source distribution, the contribution of non-stationary sources eventually do not vanish and give rise to spurious arrivals in the final GF estimate. Prior stacking, we assign all cross-correlations to event azimuth bins of 5 o to attribute equal weights to all incident directions. To 195 reduce eventual spurious arrivals, we compute the GF on selected sources in the endfire lobes whose aperture is calculated for maximum wavelengths corresponding to 3 Hz (Appendix B3).
For noise cross-correlation (NCC), we use a similar protocol as the one of . To reduce the effects of teleseismic events or strongest icequakes, we disregard the seismic amplitudes completely and consider one-bit normalized seismograms (Bensen et al., 2007). By doing so, we attribute a similar weight to ambient noise and icequake 200 source contributions to the GF. The traces are cross-correlated in nonoverlapping 30-min long windows. Resulting NCCs are stacked daily and then averaged over the five weeks of recording. We finally obtain a set of 4371 NCC that correspond to the GF estimates for all combinations of sensor pairs. For other station paths, we observe spurious arrivals (indicated by green dots) before the expected arrival times for Rayleigh (red bars) and P waves (blue bars) which primarily arise from non-distributed noise sources that lie outside the stationary phase zones of these stations (see main text). (c) Frequency-velocity diagram obtained from f-k analysis of NCC in (a). The dispersion curve of phase velocity for Rayleigh waves and P-waves are plotted in black dots. The dashed blue line shows the frequency-dependent resolution limit, given the maximum wavelength and sensor spacing λmax = ∆max/2. Black lines are theoretical dispersion curves for fundamental mode Rayleigh wave velocity computed for ice thickness either 150 or 250 m with Geopsy software. We used the elastic parameters for the ice and bedrock as given in (Preiswerk and Walter, 2018, Sect. 6.1). Same figure for icequake cross-correlations is available in Appendix (Fig. B2). Figure 3a shows the stacked section of NCC averaged in 1 m-binned distance intervals. Coherent Rayleigh waves with prop-205 agation velocity of 1600 m/s are well reconstructed across the array. We also observe emergence of weak but faster waves identified at higher frequencies as P-waves traveling in the ice.

Green's function estimates
Slight disparities in amplitudes of the causal and acausal parts of the GF estimates (positive versus negative times) are related to the noise source density and distribution. Higher acausal amplitudes observed at larger distances are evidence for a higher density of sources located downstream of the array, according to our cross-correlation definition. More sources downstream are likely generated by faster water flow running into subglacial conduits toward the glacier ice fall (Gimbert et al., 2016;Nanni et al., 2019b). Looking closer at NCC for individual receiver pairs, we sometimes observe spurious arrivals around time 0 (marked as green dots in Fig. 3b), mostly at stations pairs oriented perpendicular to the glacier flow (i.e. azimuth 3). This analysis shows that even if the noise sources are not equally distributed in space, averaging the NCC in regular distance intervals on a dense array deployment helps the GF estimate convergence.
The stacked section of ICC ( Fig. B2a) yields similar results to those of the NCC (Fig. 3a). The control of the icequake source aperture enables to minimize the spurious arrivals which are observed on some NCC (Fig. 3b) and obtain more accurate 220 Rayleigh wave traveltimes at most station paths (Fig. B2b). The differences in ICC and NCC support that NCC are more sensitive to the noise sources rather than icequake sources. Icequake contributions certainly enable to widen the spectral content of the NCC to frequency higher than 20 Hz, as the most energetic ambient noise is recorded in the 1-20 Hz frequency band ( Fig. 1b).
Seismic phases and their velocities can be identified on the frequency-velocity diagram (Fig. 3c, black dots) that is obtained 225 from frequency-wavenumber (f-k) analysis of the NCC computed on a line of receivers (Appendix B1). As identified above, the correlation functions reconstruct well P-waves traveling in the ice with an average velocity V p = 3870 m/s. We also observe weak intensity but fast seismic phases at frequencies above 35 Hz, which could correspond to refracted P-waves traveling along the basal interface with velocity around 5000 m/s. Surface waves are dispersive meaning that their velocity is frequency-dependent with higher frequencies being sensitive to  Figure 4. (a) Probability density function of noise cross-correlation (NCC) spectra (colors) and median average of icequake cross-correlation (ICC) spectra (black line). Note that raw data (i.e continuous noise or icequake waveforms) were spectrally whitened between 1-50 Hz prior cross-correlation. Due to spectral content of englacial noise and icequakes, NCC and ICC have different depth sensitivity due to spectral response. (b) Vertical sensitivity kernels for phase velocities of the Rayleigh wave fundamental mode for an ice thickness of 200 m over a semi-half space representing the bedrock. The kernels were computed using the freely-available code of Haney and Tsai (2017).

Dispersion curve inversion and glacier thickness estimation
Sensitivity of Rayleigh waves obtained on NCC to frequencies below 5 Hz enable us to explore the subsurface structure with inversions of velocity dispersion curves. Due to the general noise source locations up-flow and down-flow of the network, we limit our analysis to receiver pairs whose accurate GF could be obtained. We thus compute the dispersion curves on eight 245 along-flow receiver lines which constitute the array (inset map in Fig. 5). For each line, we invert the 1D ground profile which best matches seismic velocity measurements in the 3-20 Hz frequency range, using the neighbourhood algorithm encoded in the Geopsy software (Wathelet, 2008).
Following Walter et al. (2015), we assume a two-layer medium consisting of ice and granite bedrock. This is a simplified approximation and does not include 2D and 3D effects, and anisotropy introduced by englacial features (Sect. 3.3). The grid Table 1. Parameter ranges and fixed parameters for grid search to invert the dispersion curves in Fig. 5 for ice thickness. Poisson's ratios of ice and granite were varied between 0.2 and 0.5. Poisson's ratio, ice thickness and P-wave velocity Vp were coupled to the S-wave velocity Vs.

Material Thickness (m)
Vp ( 3870 m/s as measured in Fig. 3c, and couple all varying parameters to the S-wave velocity structure with the imposed condition of increasing velocity with depth.

265
From the eight receiver line inversions, we find average S-wave velocities of 1710 m/s for the ice, 2570 m/s for the granite and a P-wave velocity of 4850 m/s in the basement, which is consistent with our measurement for refracted P-waves in Fig.   3c. V p /V s ratios are found to be 2.2 and 1.9 for ice and granite, giving Poisson's ratios of 0.37 and 0.3, respectively. For the receiver lines near the array edges (Lines 1-3 and 8), the inversion yields to a low P-velocity surface layer of thickness 15 m and 7 m respectively, above thicker ice (dashed blue zone in Fig. 5c). In this thin top layer, the matching S-velocity corresponds 270 to the one for the ice (i.e. 1710 m/s). The V p /V s ratio is around 1.6 and corresponds to a Poisson's ratio of 0.2. This is what is expected for snow although only a ∼ 5 m snow cover was present in the area at the time of the experiment. This low-velocity surface layer could also at least be partially attributed to the presence of pronounced transversal crevasses (i.e. perpendicular to the receiver lines, see Sect. 3.3) near the array edges, which do not extend deeper than a few dozens of meters (Van der Veen, 1998) and can be modelled as a slow layer above faster ice (Lindner et al., 2019). across-flow profile of the glacier baseline, that was extracted from the DEM of Glacier d'Argentière (Sect. 2.2.1) and averaged over the geophone positions. Results show that ambient noise interferometry determines the depth of the basal interface with 280 a vertical resolution of 10 m, being equivalent to ∼ 5% accuracy relatively to the average depth, as we are able to reproduce the transverse variations of the ice thickness. Differences in ice thickness values between our measurements and the DEM are generally less than 20 m (being the DEM resolution), and the maximum error is 35 m for Line 3.
Errors and uncertainties on mapping the basal interface are primarily linked to bedrock velocities, as discussed above. Potentially, the bed properties can be refined using additional measurements from refracted P-waves that should be reconstructed 285 on NCC obtained on such a dense and large array and stacked over longer times. Ice thickness estimation is also affected by 2D and 3D effects as phase velocities are here averaged over multiple receiver pairs. The confidence interval we obtain for basal depth is similar order to the actual variations in glacier thickness along the receiver lines (black dashed lines). More accurate 3D seismic models of the glacier subsurface could be obtained using additional station pairs as discussed in section 6.

Azimuthal anisotropy from average phase velocities
290 Smith and Dahlen (1973) show that for a slightly anisotropic medium the velocity of surface waves varies in 2φ-azimuthal dependence according to where c 0 is the isotropic component of the phase velocity, A is the amplitude of anisotropy and ψ defines the orientation of the anisotropic fast-axis. On glaciers, azimuthal anisotropy can be induced by englacial crevasses, with fast direction for Rayleigh 295 wave propagation being expected to orient parallel to the crack alignment (Lindner et al., 2019). Glacier and ice sheets are also represented as transversely isotropic media whose type of symmetry depends on the ice fabric (e.g. Diez and Eisen, 2015;Horgan et al., 2011;Smith et al., 2017;Picotti et al., 2015).
The dense array experiment of Glacier d'Argentière covers a wide range of azimuths φ defined by the orientation of the station pairs, and allows us to investigate azimuthal variation in Rayleigh wave velocities at any given sensor. In order to 300 cover a maximum range of φ-azimuth, we compute velocity dispersion curves for Rayleigh waves obtained on the correlation functions computed on icequake signals (ICC), since accurate GF estimates from ambient noise are limited to station pair directions with azimuth φ roughly aligned with ice flow (φ ∼ 120 o , Sect. 3.1). To measure phase velocities, we use a slantstack technique similar to Walter et al. (2015), to octave-wide frequency ranges by bandpass filtering the individual ICC, at each station pair (Appendix B3).

305
For each sensor position, we obtain c velocity measurements as a function of the φ-azimuth of the receiver pair that includes the target station. To reduce the effect of spatial averaging, we compute anisotropy parameters ψ and A considering subarrays m-aperture arrays and found that adding the 4φ component to describe the azimuthal variations of phase velocities induced by glacier crevasses yields similar ψ and A. We therefore neglect the 4φ term in the present analysis.
Anisotropy is observed to be more pronounced near the glacier margins (lines 1-2 and 8 as labelled in Fig. 5c), where the anisotropy strength varies between 2% and 8% (Fig. 6b). There, fast-axis directions of Rayleigh wave propagation coincide with the observed surface strike of the ice-marginal crevasses that are also responsible for the generation of icequakes indicated 315 by red dots. At other locations, fast-axis directions indicate the presence of transversal crevasses (i.e. perpendicular to the ice flow) with weaker degrees of anisotropy up to 4%. While the near-surface crevasses observed at the array edges result from shear stress from the margin of the glacier, the transversal crevasses are formed by longitudinal compressing stress from lateral extension of the ice away from the valley side walls, which is typical for glacier flow dynamics in ablating areas (Nye, 1952).
Alignment of the fast-axis directions with that of ice flow appears along the central lines of the glacier (receiver lines 4-5) 320 with anisotropy degrees of 0.5% to 1.5%. This feature is only observed along the deepest part of the glacier where it flows over a basal depression. Results are here computed for seismic measurements at 25 Hz and maps of anisotropy do not change significantly with frequency over the 15-30 Hz range. If we extend our analysis down to 7 Hz, we notice that the alignedflow fast-axis pattern starts to become visible at 10 Hz. At frequencies lower than 10 Hz, the fast-axis generally tend to align perpendicular to the glacier flow because lateral topographic gradients introduce 3D effects and non-physical anisotropy. The 325 results presented here are not punctual measurements but are rather averaged over the entire ice column. The vertical sensitivity kernels for Rayleigh waves (Fig. 4b) are not zero in the basal ice layers at the considered frequencies. The align-flow anisotropic pattern is likely attributed to a thin water-filled conduit at depth, as also suggested by locations of seismic hydraulic tremors at the study site (Nanni et al., 2019a).
Generally, we observe an increase in the degree of anisotropy with frequency, which is evident for a shallow anisotropic As pointed out earlier, localized englacial noise sources related to water drainage can prevent the reconstruction of stable GF estimates by introducing spurious arrivals (i.e. Walter, 2009;Zhan et al., 2013;. In this case, the workflow processing traditionally used in NCC procedure as presented in Bensen et al. (2007) and Sect. 3 is not sufficient.
Accordingly, we need to apply more advanced processing methods that can reduce the influence of localized sources, and 340 enhance a more isotropic distribution of the ambient sources around receiver pairs.
One of the approaches we apply here is Matched-Field Processing (MFP) (Kuperman and Turek, 1997) which is an array processing technique allowing to locate low-amplitude sources. MFP is similar to a traditional beamforming that is based on phase-delay measurements. MFP was used for location and separation of different noise sources in various applications, i.e., to monitor geyser activity (Cros et al., 2011;Vandemeulebrouck et al., 2013), in an exploration context (Chmiel et al., 2016), in 345 geothermal field (Wang et al., 2012) and fault zone (Gradon et al., 2019) event detection. MFP was also used by Walter et al. (2015) to measure phase velocities of moulin tremor signals on the GIS.
Moreover, joint use of MFP and the Singular Value Decomposition (SVD) of the cross-spectral density matrix allows separation of different noise source contributions, as in Multi-Rate Adaptive Beamforming (MRABF: Cox, 2000). The SVD approach was explored by i.e. Corciulo et al. (2012) to locate weak amplitude subsurface sources, and Chmiel et al. (2015) for micro-350 seismic data denoising. Also, Seydoux et al. (2017) and Moreau et al. (2017) showed that the SVD-based approach improves the convergence of NCC towards the GF estimate. Here, we combine MFP and SVD in order to remove spurious arrivals in NCC caused by the moulin located within the GIS array, and thus improve the GF estimate emergence.

Location of noise sources at the GIS via matched-field processing
Röösli et al. (2014) and Walter et al. (2015) documented the presence of hour-long tremor signals in GIS seismic records, typically starting in the afternoon hours. These events occurred on 29 days out of the 45-day total monitoring period. Signal intensity and duration depended on the days of observations, and the energy was mostly concentrated in the 2-10 Hz range within distinct frequency bands (Fig. 1b). Röösli et al. (2014) and Röösli et al. (2016b) showed a clear correlation between water level in the moulin, and start and end times of the tremor, therefore the tremor signal is referred to as "moulin tremor". Figure 1c shows a spectrogram of a moulin tremor lasting for 10 hours on the night of 28th/29th July 2011 and recorded at one 360 station located 600 m away from the moulin. This signal is generated by the water resonance in the moulin, is coherent over the entire array, and dominates the ambient noise wavefield during peak melt hours (Röösli et al., 2014;Walter et al., 2015).
We briefly summarise the basics of MFP and the details of the method can be found in Cros et al. (2011);Walter et al. (2015) and Chmiel et al. (2016). MFP exploits the phase coherence of seismic signals recorded across an array. It is based on the match between the Cross-Spectral Density Matrix (CSDM) and a modelled GF. The CSDM captures the relative phase 365 difference between the sensors, as it is the frequency-domain equivalent of the time-domain correlation of the recorded data.
The CSDM is a square matrix with a size equivalent to the number (N = 13 for the GIS array) of stations (N -by-N matrix).
MFP is a forward propagation process. It places a "trial source" at each point of a search grid, computes the model-based GF on the receiver array, and then calculates the phase match between the frequency-domain-modeled GF and the Fourier transform of time-windowed data. The optimal noise source location is revealed by the grid points with maximum signal coherence across 370 the array.
In order to calculate the MFP output we use 24 h data of continuous recordings on the 27 th of July which encompass the moulin tremor. We calculate a daily estimate of the CSDM by using 5 min-long time segments in the frequency band between 2.5 and 6 Hz, which gives in total M = 288 of segments for one day. This ensures a robust, full-rank estimation of the CSDM (M >> N ). The modelled GFs are computed over the two horizontal spatial components (Easting and Northing) using a 375 previously optimized Rayleigh wave velocity of c = 1680 m/s corresponding to the propagation of Rayleigh waves within the array obtained by Walter et al. (2015). The MFP output is averaged over 30 discrete frequencies in the 2.5-6 Hz range.
The lower frequency bound (2.5 Hz) ensures a higher rank regime of the seismic wavefield, as defined in Seydoux et al. (2017). It means that the degree of freedom of the seismic wavefield is higher than the number of stations. The degree of freedom of the seismic wavefield is defined as a number of independent parameters that can be used to describe the wavefield    Figure 7a shows the grid search for MFP performed over Easting and Northing positions. In order to reveal the location of the source, we use the Bartlett processor (Baggeroer et al., 1993) to measure the match between the recorded and modelled 385 wavefield. The MFP output reveals two dominant noise sources: a well constrained focal spot corresponding to the moulin position inside the GIS array and another source located North of the array. The latter source is revealed by a hyperbolic shape.
This shape is related to a poor radial resolution of the beamformer for sources located outside of an array. Walter et al. (2015) suggested that this dominant source might correspond to another moulin as satellite imagery shows the presence of several drainage features North of the array. Both noise source signals contribute to the NCC. However, while the source located 390 outside of the array contributes to the stationary-phase zone ("end-fire lobes") of certain receiver pairs, the moulin located within the array will mostly cause spurious arrivals on NCC. In order to separate the contribution of these noise sources we first perform SVD of the CSDM, and then use a selection of eigenvectors and eigenspectral equalization (Seydoux et al., 2017) to improve the convergence of NCC towards an estimate of the GF.

395
SVD is a decomposition of the CSDM that projects the maximum signal energy into independent coefficients (i.e. Moonen et al., 1992;Konda and Nakamura, 2009;Sadek, 2012). It allows to split the recorded wavefield into a set of linearly independent eigencomponents, each of them corresponding to the principal direction of incoming coherent energy and bearing their own seismic energy contribution: where, K is the CSDM, N is the number of receivers, U and V are unitary matrices containing the eigenvectors, and S is a diagonal matrix representing the eigenvalues Γ, and T denotes the transpose of the matrix. The total number of eigenvalues corresponds to the number N of receivers. The CSDM can be represented as the arithmetic mean of individual CSDMs (K i ), where each K i is a CSDM corresponding to a given singular value Γ i .
The SVD separates the wavefield into dominant (coherent) and subdominant (incoherent) subspaces. It has been shown that 405 the incoherent sources correspond to the smallest eigenvalues (Bienvenu and Kopp, 1980;Wax and Kailath, 1985;Gerstoft et al., 2012;Seydoux et al., 2016Seydoux et al., , 2017. Therefore, a common noise removal method consists of setting a threshold that distinguishes between coherent signal and noise, and to keep only the index of eigenvectors that are above the threshold Here, we follow the approach of Seydoux et al. (2016) for choosing the threshold. In the 2.5-6 Hz frequency band, the wavefield is undersampled by the seismic array (which means that the typical radius of GIS seismic array is larger than half a wavelength of the analyzed Rayleigh waves). Seydoux et al. (2016) showed that in this case, the eigenvalue index cut-off threshold should be set to N/2 in order to maximize the reconstruction of the CSDM. This means that we reject the last 415 eigenvectors (from 7 th to 13 th ) as they do not contain coherent phase information. depending on the frequency related to the seismic signature of the hydraulic tremor and the distinctive frequency bands gen-420 erated by the moulin activity (Fig. 1c). Moreover, the eigenvalue distribution decays steadily, and does not vanish with high eigenvalue indexes. The latter confirms that the wavefield is undersampled by the seismic array (see Seydoux et al. (2017) for details).
The CSDM can be then reconstructed by using only individual eigenvectors as in: Note that we do not include the eigenvalues Γ in the CSDM reconstruction, which is equivalent to equalizing them to 1. That is why we refer to the reconstructed CSDM as "equalized" (Seydoux et al., 2016). a weaker focal spot corresponding to the moulin location. This indicates, similarly to the hydraulic tremor spectrum (Fig. 1c) and the singular value distribution (Fig. 7b), that the spatial distribution of dominant noise sources varies within the analyzed frequency band. Furthermore, higher eigenvectors do not reveal any strong noise sources localized within the array, and their MFP output points towards sources located outside of the array.

435
This MFP-based analysis of spatial noise source distribution allows us to select the eigenvectors of CSDM that contribute to noise sources located in the stationary phase zone (i.e. in the endfire lobes of each station path). We now reconstruct the NCC in the frequency band of 2.5-6 Hz with a step equivalent to the frequency sampling divided by the number of samples in the time window (here 0.0981Hz, so 1019 individual frequencies in total). We perform the inverse Fourier transform of the equalized CSDM reconstructed using the 1 st , 3 rd , 4 th , 5 th , and 6 th eigenvectors.

440
Figures 9a and 9b compare the NCC before (in a) and after (in b) the eigenvector selection and eigenspectrum equalization procedure. The displayed NCC are bin-averaged in fixed distance intervals (every 100 m) in order to improve the signal-tonoise ratio (SNR). The blue line shows the propagation of the Rayleigh waves with the velocity of 1680 m/s. On Figure 9a we observe spurious arrivals (marked with green dots) that dominates the NCC together with a non-symmetrical shape. On average, the CSDM equalization process (Fig. 9b) enhances the symmetry of NCC by 40%. To quantify the symmetry of NCC 445 we used the correlation asymmetry as proposed in (Ermert et al., 2015, equation 11).
Unfortunately, we notice that the equalization process reduces the overall SNR of the GF estimates and does not eliminate all spurious arrivals. This might be related to the imperfect separation of different noise sources which is likely influenced by the frequency variation of the moulin contribution. For example, we still keep some contribution of the central moulin in the 1 st eigenvector. Moreover, by removing the second eigenvector we remove not only the seismic signature of the moulin, but 450 also contribution of coherent far-field sources.
To further assess the isotropy of the reconstructed noise field, we use the conventional plane wave beamformer (e.g. Veen and Buckley, 1988). The plane wave beamforming technique estimates the isotropy and coherence of the ambient seismic noise wavefield with respect to the slowness and back-azimuth. For the plane wave beamforming calculation, we use the original ( Fig. 9c) and the previously equalized CSDM (Fig. 9d). Figure 9c, d shows the beamforming output before (c) and after (d) (Sergeant et al., 2013) or other continuous noise generated by calving and ice-mélange dynamics in the proglacial fjord of nearly Jakobshavn Isbrae (Amundson et al., 2010), one of Greenland's largest ice-streams.
Finally, it seems that not much seismic energy is incident from the inland of the East GIS.

465
After the eigenspectrum equalization, we are able to extract Rayleigh waves dispersion curve from the averaged seismic section obtained in Fig. 9b. For calculating the averaged dispersion curve we use a version of the Aki's spectral method (Aki, 1957) which consists of fitting a Bessel function to the real part of the cross-correlation spectrum. This method is referred to as SPAC and is described in Appendix B2. The Rayleigh wave phase velocity dispersion curve averaged over all station measurements is shown in Fig. 9e with yellow squares and errorbars representing the measurement discrepancies for 470 individual NCC. The dotted line presents an apriori Rayleigh velocity dispersion curve extracted from Walter et al. (2015).
High discrepancies observed at lower frequencies mainly arise from the limited frequency band for computing the NCC (see the appendix method section B). The slight differences between the two dispersion curves might be related to the different approaches used for phase velocity dispersion curve extraction (MFP in Walter et al. (2015), and SPAC in the current work). and in our study we use only one-day of data.
Several additional tests could be used to further improve the SNR of the NCC and their convergence to GF. For example, a similar procedure could be performed on other days, and the eigenormalized NCC could be stacked over a few days to increase the SNR. However, we verified that the index of eigenvectors corresponding to the moulin changes over days (the moulin can be located in the first, second, third, etc eigenvector). This is the reason why it would be useful to find an automatic criterion for 480 the eigenvalue selection based on the MFP output. However, this is beyond of the scope of this paper. Another improvement could consist of azimuthal stacking the NCC according to the direction of the noise sources, although the GIS array does not have sufficient azimuthal and spatial coverage to implement this. Moreover, we could envisage calculating a projector based on the SVD (as in MRABF) only for the time period when the moulin is active and then project-out the moulin signature from the continuous seismic data.

485
In summary, we conclude that the CSDM eigenspectrum equalization together with beamforming-based selection of eigenvectors is an useful method to separate seismic sources in a glaciated environment. It can further improve the GF emergence from ambient seismic noise in the presence of strong, localized englacial noise sources for imaging applications.

Cross-correlation of icequake coda waves: a window-optimization approach
Contrary to ballistic waves, the likely diffuse coda arises from multiple seismic scattering (Aki and Chouet, 1975;  In the following, we explore the application of coda wave interferometry (CWI) on selected near-surface icequakes in 495 Gornergletscher, to estimate the GF which could not be obtained from traditional processing of icequake cross-correlations because of lacking sources in the stationary phase zones of the seismic array (Fig. 2c). The use of icequakes here is fundamentally different than in section 3, in that the ballistic arrivals are specifically avoided (whereas in the other case, the ballistic component was the primary source of energy in the cross-correlation functions).

500
The strongest 720 events chosen out of more than 24000 icequakes detected at Gornergletscher exhibit a sustained coda with approximate duration of 1.5 s (Fig. 1d-e). The propagation regime of seismic waves can be identified by the evolution of the elastic intensity ("coda power spectrum") being the squared seismic amplitudes (color lines in Fig. 1d). Before the source energy has reached the receiver, the elastic intensity is equal to some background or ambient level. Once the source pulse arrives at the receiver, the intensity rises up and then begins to decay exponentially. This is the ballistic regime. After several 505 mean-free times which are to be related to the scattering strength (De Rosny and Roux, 2001), the intensity begins to decay diffusively with time as multiple scattering slows the transport of energy out of the scan region (Malcolm et al., 2004). This is the diffusion regime and it is characterized by a linear decay of the coda intensity (Aki and Chouet, 1975). Eventually, intrinsic attenuation (anelastic loss) dominates and the energy falls to the noise level. Figure 1d shows such linear decay of the coda power spectrum starting at ∼ 0.5 s indicating that icequake seismogram 510 signals contain enough scattered energy that may approach from a wide range of directions assuming that the scatterers are homogeneous around the network site (Chaput et al., 2015a). In the present study we do not investigate further the cause of wave scattering in glaciers and particularly in Gornergletscher but suggest a relation to the presence of conspicuous nearsurface crevasses (Fig. 2c) and deeper fractures as intermediate-depth and basal fault planes have been reported at the study site (Walter et al., 2008, as well as topography gradients, reflections at the glacier margins and/or rock and air 515 inclusions.

Coda wave interferometry and Green's function estimate
We first apply a standardized CWI processing scheme following Gouédard et al. (2008b). The cross-correlations are computed on 10-30 Hz spectrally whitened seismograms to reduce the influence of background noise. As a first guess, coda waves are arbitrarily time windowed around 0.5-1 s (gray horizontal bar in Fig. 1d) by looking at the decay of the waveform amplitudes.

520
The first sample of the coda correlation window corresponds to the two station average of the time when the seismogram envelope falls below 5% of the ballistic wave maximum amplitudes. Because of the decrease of coda amplitude with time, we cannot perform a simple cross-correlation between the coda signals without strongly overweighing the earliest part of the coda.
To avoid this problem, we follow Paul et al. (2005) by disregarding the amplitudes and considering one-bit signals. For the pair of closer stations (Fig. 10b), the reconstructed acausal times still depend on the source position while the CWCC causal times are stable with the event azimuth. The source position signature on one side of the correlation function could be an effect of heterogenous scatterers which cause single scattering and then skew the illumination pattern at one side of the receiver pair. Another explanation could be that the correlation window used here for CWI is still influenced by the incoming 535 energy flux from ballistic waves which then create an anisotropic incident wavefield as we are in the presence of limited energy diffusion (Paul et al., 2005).
Focusing on a complex scattering medium at the glaciated Erebus volcano (Antarctica), Chaput et al. (2015b) showed that symmetric GF could be recovered when optimizing the icequake coda correlation window over the sources. In the case of a weak scattering medium as glacial ice, the coda time window for the diffusion regime should notably depend on the distance of 540 the scatterers to the recording seismic sensor. We therefore use a similar optimizing-window processing scheme for improving the GF convergence at each station pair.
The overall processing and technical details of coda window optimization are described in Chaput et al. (2015b). We refer to the method as MCMC processing as it involves a Markov Chain Monte Carlo scheme. A Bayesian inversion determines the best coda window to generate a set of CWCC that are the most coherent and symmetric across the source events. We first MCMC processing involves several parameters that need to be tuned. As for traditional ICC processing we need to define the frequency band of analysis (here 10-30 Hz) and the coda correlation window length T . Here we use T = 0.5 s and we force the algorithm to shift the correlation window to no later than 1.5 s in order to stay within the icequake coda and to not correlate noise (see Fig. 1d). We use N = 40 for event-binning, M = 10 for event trace selection, and use up to 2 × 10 4 iterations. We need to define the portion of the cross-correlation stack where we want to optimize the causal-acausal symmetry, 555 and the relative importance of the causal-acausal symmetry and the CWCC matrix coherency that is used to optimize the misfit function (coefficient factors A and B in Chaput et al., 2015b, equation 2). We choose here to optimize the GF symmetry for both reconstructed ballistic and coda waves, i.e. at later times than what expected for a Rayleigh wave propagating at velocity 1700 m/s, and we weight the symmetry and coherency evenly. Figure 11a shows the MCMC optimization of the CWCC symmetry at the station pair already presented in Fig. 10d. Blue 560 and red lines are the causal and acausal parts of the correlation stack, respectively. Solid and dashed lines are the resulting CWCC obtained from MCMC processing and standard processing (i.e. first iteration of the MCMC inversion), respectively.
While the ballistic Rayleigh waves could not be reconstructed in the acausal part of the correlation function using the first coda wave windowing, the MCMC output approaches the symmetrical GF as we see a Rayleigh wave propagation in both directions, and also the emergence of a coherent coda in both parts of the correlation function. The MCMC inversion gives optimized coda 565 windows in the range 0.7-1.2 s for the majority of events, i.e. at later times than the initially used coda window. Enhanced symmetry of the CWCC computed in later time windows is consistent with the expectation to have a more isotropic diffuse wavefield as all directions of propagation are closer to be equally represented after several mean free paths (Paul et al., 2005). Figure 11b shows the final stack of CWCC gather sorted by increasing inter-station distance and averaged in 10 m-binned intervals. For comparison, CWCC computed with the standard processing are in blue. CWCC are noisier that ICC obtained 570 from correlations of icequake ballistic waves when computed on homogeneous source distributions (Gouédard et al., 2008b).
Nevertheless, at most station pairs, the MCMC processing managed to extract Rayleigh waves with consistent traveltimes in both causal and acausal parts. We extract a dispersion curve of phase velocities averaged over the station components (Fig.   11c) using the SPAC method already used in section 4.2 (Appendix B2). We find an average Rayleigh wave velocity of 1600 m/s which is in the estimate range of what Walter et al. (2015) find at Gornergletscher using slant-stacking of ICC arrival times 575 (Appendix B3). Errors introduced at lower frequencies arise from the limited frequency band used for computing the CWCC and filtering effects (see the appendix method section B).

Coda wavefield isotropy and Green's function convergence
MCMC processing coherently increases the presence of energy at zero lag-time ( Fig. 11a-b). Such spurious arrival likely arises because scattered coda also contains a strong vertically trapped body wave that correlates at 0 across relatively near receivers, On the other hand, spurious arrival contributions will not vanish in case of localized scatterers around the seismic array if the incident waves do not illuminate the scatterers with equal power from all directions because of limited source aperture (Snieder et al., 2008). This second argument is also supported by the observations of non-symmetric CWCC. At some locations, there still exist differences in the amplitudes of the Rayleigh waves in the causal and acausal parts of the final GF estimate (Fig. 11a), 590 meaning that the icequake coda is not entirely diffuse and may result from single reflections on preferred scatterers. Paul et al. (2005) could not obtain symmetric CWCC from regional earthquake coda seismograms and attribute this to the long-lasting anisotropy of the diffuse energy flux. Indeed, in weak (or homogeneous) media, the incident energy flux from earthquakes can still dominate the late coda resulting in GF time-asymmetry, provided the sources are located in the same distant region. The CWCC asymmetry is expected to disappear with an isotropic distribution of sources or scatterers around the seismic network.

595
In the case of Gornergletscher icequakes, we still see the influence of the energy flux approaching from the direction of the source at a few station pairs and for some events as depicted in Fig. 10b, supporting the argument for single scattering rather than multiply scattering as the cause of icequake coda.
In general, CWI must be processed on carefully selected events which show sustained coda above the background noise. We try coda wave correlations on the Argentière node grid and notice an influence of the source position on the retrieved CWCC 600 likely because we did not select strong enough signals with sustained coda that is coherent enough across the sensors. Similarly, GF convergence does not work for weak Gornergletscher icequakes. Moreover, the abundance of seismic sources in glaciers often pollutes coda wave seismograms. We often find the situation where ballistic body and surface waves generated by early aftershocks from repetitive and subsequent events (or bed reflections) arrive at the seismic sensor only a few milliseconds after the onset of the first event of interest and therefore fall in its coda window. This typically introduces anisotropic wavefields.

605
The brief icequake coda duration, the interevent time distribution and the weak scattering in glacial ice impose limitations of CWI on large arrays.
To conclude, even if limited, the extraction of GF from icequake coda waves allows imaging a glacier subsurface between station pairs. In principle, this can be done even in cases when (skewed) distribution of icequake sources or sustained noise sources does not allow for GF estimation.

Discussion
The three methods proposed in this study could theoretically be applied to each one of the explored dataset but were not further tested here. The standard processing schemes proposed in section 3 for computations of NCC and ICC successfully work on Glacier d'Argentière as we benefit from favorable seismic illumination patterns and redundancy of the measurements on the dense array. The need of spatial averaging of the correlation responses in regular distance intervals could potentially be 615 overcome when (1) stacking the GF over longer time range of typically weeks or months (e.g. Sabra et al., 2005;Larose et al., 2008) or (2) applying more advance processing schemes as the one proposed for GIS and Gornergletscher studies, to overcome anisotropic source distribution effects.
Eigenspectral equalization of the cross-spectral matrix of ambient noise seismograms (Sect. 4) enable to distinguish propagation directions of incoming seismic energy. This method improves the spatial homogenization of the incident wavefield 620 in order to reveal weaker sources that could eventually be used for extracting the GF at sensor pairs which initially lacked stationary phase contributions. Same technique could be applied to Glacier d'Argentière in order to improve the GF estimates which show spurious arrivals arising from the dominant sources near the glacier tongue (Fig. 3b). By doing so, we would improve our spatial coverage of seismic velocity measurements and could invert a more accurate 3D model of the subsurface. The eigenspactral equalization method is particularly well suited for large sensor arrays, also the geometry of the glacier discharge 625 system (channelized versus distributed) is the primary controlling factor for successful applications.
In the absence of distributed noise sources, the use of icequakes is a good alternative especially in winter when the glacier freezes preventing for generation of coherent water flow noise. As icequakes propagate to few hundreds meters, ICC studies are well suited for medium-size arrays with aperture of typical order of 500 m. In the case of anisotropic distributions of icequakes which map the crevassed ice, GF estimates can be optimized with CWI and the coda-window optimization approach used in 630 section 5. CWI successfully works on the strongest selected events at Gornergletscher because we could record a coherent coda at adjacent sensors, with seismic amplitudes above background noise level. CWI should be appropriate for smaller icequake datasets (here we employ the method on a loop of 700 events, i.e. less than 7% of the icequake catalogue used in Argentière) and smaller size arrays to be able to record coherent coda across the network (typically 250 m-wide). Such arrays can be deployed in targeted regions where pervasive crevasses are present, i.e. typically near the glacier margins. As an example, CWI 635 could help to estimate the GF at smaller subarrays at the edges of a larger array as in the configuration of Glacier d'Argentière experiment.
In the following we discuss the type of array deployment, geometry and measurements suitable for structural and monitoring studies using the GF obtained by either one of the processing described above.
6.1 Implications for glacier imaging 640 The performance of an array for imaging the structure at depth first relies on its geometry, secondly on the wavefield characteristics as discussed in section 3.1. Wathelet et al. (2008) recommend that the array diameter should be at least as large as the longest wavelength of interest (we conventionally take two to three wavelengths). The minimum station spacing for any direction should be less than half the shortest wavelength of interest to avoid spatial aliasing. To be able to sample the basal interface and target elastic parameters of sediments constituting the till layer or the bedrock, one should design suitable 645 array geometry. For a glacier thickness of 200 m, we need an array aperture of at least 600 m to measure propagating surface waves with a one-wavelength cycle. For a 500 m-thick glacier, we need sensors that are at least 1500 m apart, although prior knowledge on the basal interface allows to better constrain the inversion of depth with lower-size arrays.
Seismic velocity measurements can additionally be complemented by other types of observations computed on the horizontal and vertical components of the GF, such as the horizontal-to-vertical (H/V) ratio. Assuming a horizontally homogeneous 650 medium, the resonance frequency of the H/V ratio spectrum can be used to constrain the layered structure (Zhan et al., 2013) and the ice thickness (Picotti et al., 2017) or to investigate 3D effects of the recorded wavefield . H/V ratios and dispersion curves of surface wave velocities can be jointly inverted (Lin et al., 2014) for an even more accurate 3D model of the glacier subsurface.
6.2 Implications for glacier monitoring 655 Repeated analysis of cross-correlation analysis allows to detect changes in the subsurface properties. Seismic velocity monitoring is usually performed in the coda part of the cross-correlation function through the application of CWI, as multiple scattered coda waves are less sensitive to the source distribution and travel larger distances accumulating time delays (Hadziioannou et al., 2009). CWI enables to detect relative velocity changes as small as 0.1% (e.g. Sens-Schönfelder and Wegler, 2006;Brenguier et al., 2008a, b;Mainsant et al., 2012).

660
CWI computed on cross-correlation functions could lead to the monitoring of englacial crevasses, failure of calving icebergs, glacial lake outburst flood, break-off of hanging glaciers, surface mass balance and such bed conditions as the evolution of the glacier hydraulic system and subglacial till properties. Such topics are currently investigated with active seismics or through the spatio-temporal evolution of passive seismicity and associated source mechanisms (e.g. Walter et al., 2008;Bartholomaus et al., 2015;Preiswerk et al., 2016;Podolskiy et al., 2017;Lipovsky et al., 2019). Passive seismic monitoring of glaciers could 665 lead to the detection and understanding of processes related to climate conditions, glacier hydraulics and ice flow dynamics, which today are labor-intensive to investigate with active geophysical measurements.
Unfortunately, the weak ice scattering limits the emergence of coherent coda in the correlation functions which appear still to be affected by the changing nature of primary ambient noise sources (Walter, 2009;. To overcome source effects and enhance the coda SNR, it is possible to design arrays for multidimensional deconvolution (MDD) of the time-averaged cross-correlations. Seismic interferometry by MDD measures the illumination pattern (e.g. Wapenaar et al., 2011;Weemstra et al., 2017) using recordings by a set of additional receivers along a contour which goes through the virtual-source's location. MDD processing then enables to remove the imprint of the (non-uniform) source distribution from the correlation responses and improves the quality of the retrieved GF. We follow Lindner et al. (2018) and use the outer receiver contour of the Glacier d'Argentière array (black triangles in Fig. 12a) as virtual boundaries for MDD of the icequake 675 correlations computed at one pair of sensors and sorted for a two-day window with 50% overlap. MDD processing appears successful in retrieving accurate GF estimates which feature a coda that is coherent over time (Fig. 12c) and does not appear on more classical ICC as defined in section 3 (Fig. 12b). The technique based on virtual reflectors allows to create artificial coda which consists of multiple reflections of the ballistic wavefield trapped within the receiver contour. This approach is a promising candidate for a monitoring scheme of any changes in englacial seismic velocities even in the absence of scattering 680 coda. Long-term installations directly on or in the ice have increased in recent years due to technological improvements (Aster and Winberry, 2017). MDD could be applied to glacier seismic sources recorded over time scales longer than a month, using borehole sensors which ensure a solid coupling to the ice.

Summary and concluding remarks
This study explores the application of seismic interferometry on on-ice recordings to extract the elastic response of the glacier 685 subsurface beneath one array deployment. In contrast to ambient noise studies focusing on the Earth's crust, the GF retrieval from cross-correlations of glacier ambient seismicity is notoriously known as difficult due the limited spatial coverage of glacier point sources and the lack of seismic scattering in homogeneous ice. We investigate the GF emergence on three particular cases.
We design processing schemes suitable for each configuration of seismic deployment, wavefield characteristics and glacier setting. In Gornergletscher (Sect. 5), cross-correlations of icequake coda waves show evidence for a quasi-homogenized incident wavefield as a result of seismic scattering by the crevassed ice. An optimization of the coda window based on the coherency and the causal-acausal symmetry of the cross-correlation functions is used to improve the GF convergence. CWI allows to retrieve GFs at seismic arrays where icequakes are not recorded evenly around the study site.

705
The capability of extracting accurate seismic velocities on a line of receivers can be substantially improved when stacking the correlation functions on a large number of receiver pairs. However, MPF and CWI allow for new kinds of measurements on sparse seismic networks and enable to speedup the GF convergence for non-idealized seismic illumination patterns which commonly arise in glacier settings.
Finally, the use of nodal sensor technology enables fast deployment of large N-arrays suitable for seismic interferometry 710 studies. This opens new ways for characterizing and monitoring glacial systems using continuous passive seismic recordings.
Code and data availability. All data except for Argentière are available by request on the data center of the Swiss Seismological Service (http://seismo.ethz.ch). Argentière data are hosted at ISTerre and will be made soon available. Access can be granted by request to Philippe Roux (philippe.roux@univ-grenoble-alpes.fr) or Florent Gimbert (florent.gimbert@univ-grenoble-alpes.fr). Obspy Python routines (www. obspy.org) were used to download waveforms and process icequake catalogues. NCC of Argentière dataset were computed using the MSNoise Python package (www.msnoise.org, Lecocq et al., 2014).
Author contributions. PR and FG designed the Argentière experiment in the scope of the RESOLVE project (https://resolve.osug.fr/). FL, PR, FW and AS processed the icequake catalogues. AS processed the correlation functions and analyzed the results at Argentière. MFP of Greenland data was processed by MC. CWI at Gornergletscher was processed by AS and FW with the input of JC. Icequake correlations by 720 MDD were processed by FL. AM provided the code for computing phase velocities by fitting the cross-spectrum with a Bessel function. AS and MC prepared the manuscript with contributions from all co-authors.
Competing interests. The authors declare that they have no conflict of interest.
SNSF and ETH Zürich participated to the data collection in Greenland (grants 200021_127197 SNE-ETH and 201 ETH-27 10-3) and on Gornergletscher (grants 200021-103882/1, 200020-111892/1). We gratefully acknowledge all the people involved in the RESOLVE project and who participated to the array deployment in Argentière, collected and processed raw formats of seismic data. We thank Olivier Laarman, Bruno Jourdain, Christian Vincent and Stéphane Garambois for having constructed bed and surface DEMs using ground penetrating radar for the bed and drone data for the surface. We also thank Léonard Seydoux for insightful discussions on MPF and SVD.
Appendix A: Construction of icequake catalogues There exists a wide range of seismic sources in glaciers as well as detection schemes (see Podolskiy and Walter, 2016, for a review on the methodology). The processing employed for icequake detections and localizations must be adapted to the type of network (number of sensors, sensor spacing and array aperture) and to the type of the events of interest which involve 735 various waveforms (e.g. Walter, 2009;Röösli et al., 2014;Podolskiy and Walter, 2016). Dispersive Rayleigh waves are well suited for investigating the glacier subsurface including the basal interface as they are primarily sensitive to the S-wave velocity structure and can sample depths up to approximately 1/3 of the wavelength. We then focus our study on one class of glacier seismic events being surface icequakes generated by ice crevassing. Another advantage of using such events is their high rate of time occurence (∼ 10 2 − 10 3 recorded events per day) and their potentially wide spatial coverage which is optimal for the 740 application of seismic interferometry techniques . We here introduce the methods used to compute the icequake catalogues at Gornergletscher (Sect. A1 and A2) and Glacier d'Argentière (Sect. A3).

A1 Icequake detection
Seismic waveforms of surface icequakes generally exhibit a first low-amplitude P-wave followed by impulsive Rayleigh waves ( Fig. 1a). Such events can be detected using a template matching on continuous seismograms (Mikesell et al., 2012). This 745 cross-correlation method exploits the signal coherency with a reference waveform and can be used on single stations or across a network. Nevertheless, the most common and straigthforward detection approach is to implement an amplitude threshold trigger.
The most broadly used algorithm in weak-motion seismology and on glaciers for detecting impulsive events (e.g. Walter, 2009;Canassy et al., 2012;Barruol et al., 2013) is the "short-time-average long-time-average ratio trigger" referred to as 750 STA/LTA (Allen, 1978). It continuously calculates the average values of the absolute amplitude of a seismic signal in two consecutive moving-time windows. The short time window (STA) is sensitive to seismic events while the long time window (LTA) provides information about the temporal amplitude of seismic noise at the recording site. When the ratio of both exceeds a preset value at a single station or coherently across a network, an event is declared.
As icequakes usually propagate to a few hundred meter distances before attenuating to the background noise level, the 755 amount of identified events varies with the network configuration and the minimum number of stations to required trigger concurrently. For the Gornergletscher study, we work with events that have been detected by running a STA/LTA trigger over 5-15 Hz bandpass filtered continuous seismograms, using an STA windows of 0.3 s (i.e. typical icequake duration), and an LTA windows 10 times longer with a threshold value of 8. To declare an event, we require at least half of the network stations to trigger.

A2 Icequake location
The vast majority of icequakes recorded on glaciers are localized near crevasses that extend no deeper than ∼ 30 m (Walter et al., 2008;Lindner et al., 2019). To locate the events at Gornergletscher, we fix the source depth to the surface and invert for the epicenter distance following the automated approach of Roux et al. (2010) and Walter et al. (2015), also similar to Mikesell et al. (2012). This method employs cross-correlations to automatically measure differences in Rayleigh wave arrival 765 times across the network.
To be able to record coherent icequake waveforms with high enough signal-to-noise ratios (SNR) at couples of sensors, the network aperture should be less than 1 km, or at least consist of several pairs of stations whose separation is shorter than the distance at which surface waves start to be strongly attenuated in the ice.
In the same spectral band that is used for event detection, icequake signals are first windowed around the Rayleigh wave 770 and cross-correlated for each pair of stations to obtain time delays. Time delay measurements are then refined to subsample precision by fitting a quadratic function to the cross-correlation function centered on its discrete maximum. The best Easting and Northing coordinates of the source are concurrently inverted with the apparent propagation velocity to match the time delay catalogue for pre-selected pairs of stations. We only consider time shift measurements at pairs of stations whose crosscorrelation maximum is above 0.8. This allows us to minimize complex source and/or propagation effects on seismic waveforms 775 and the observed arrival times that can not be fit with the oversimplified velocity model.
The inversion process is an iterative procedure using a quasi-Newton scheme (Roux et al., 2010, equation 3). Reliability of icequake locations varies as a function of the events being inside or outside the network. Using a seismic network similar to the one of Gornergletscher used in the present study, Walter et al. (2015) estimated that, in the azimuthal direction the error remains below 2 • for average apparent velocities in the range of 1600-1650 m/s.

780
A3 Array processing: matched-field processing using beamforming A seismic network is called an array if the network aperture is shorter than the correlation radius of the signals, which is the maximum distance between stations at which time series are coherent, i.e. typically less than 1 km for glacier sources recorded by on-ice deployments (Podolskiy and Walter, 2016). A seismic array differs from a local network mainly by the techniques used for data analysis.

785
Dense sensor arrays have many advantages as the SNR can be improved by summing the individual recordings of the array stations. Compared to a single or couples of sensors, array processing techniques, such as beamforming, allow for time domain stacking which constructively sums coherent signals over the sensors and cancels out incoherent random noise, enhancing the signal detection capability.
Continuous data are scanned through matched-field processing (MFP) which involves time domain beamforming (Kuperman 790 and Turek, 1997). Beamforming uses the differential travel times of the plane wave front due to a specific apparent slowness (inverse of velocity) and back azimuth to individual array stations (Rost and Thomas, 2002). If the single-station recordings are appropriately shifted in time for a certain back azimuth and velocity, all signals with the matching back azimuth and slowness will sum constructively. MFP can be processed using a decomposition of seismic signals in frequency components.
To be declared as an event and furthermore, with accurate location, the beampower of aligned seismic waveforms at a given 795 frequency (i.e. the norm of the cross-spectral density matrix and the array response, see Lindner et al., 2019, equation 3) must pass a preset trigger threshold. The final event location can be averaged from the beamforming solutions obtained at several successive discrete frequencies (assuming that slowness and back azimuth are close to constant in the considered frequency band).
MFP was successfully used by Corciulo et al. (2012) to localize microseismic sources at the exploration scale using ambient-800 noise data. Moreover, recent studies focused on developing an automatic, optimization-based MFP approach that does not require grid-search to localize thousands of weak seismic events in a complex fault zone (Gradon et al., 2019) and hydraulically fractured area (Chmiel et al., 2019).

Appendix B: Computation of phase velocity dispersion curves
Because dispersive surface waves of different frequencies propagate at different speeds, computation of seismic velocities generally involves Fourier analysis to decompose the wave into frequencies that compose it. One can distinguish two types of wave speeds.

815
The phase velocity c is the speed at which the phase of a wave propagates in space and is related to the angular frequency ω and the wavenumber k as The angular frequency is related to the time periodicity of the signal of frequency f as ω = 2πf . The ground displacement is also periodic in space over a distance equal to the wavelength λ that is used to describe how the wave oscillation repeats in 820 space via the wavenumber k = 2π/λ.
If the harmonic waves of different frequencies propagate with different phase velocities, the velocity at which a wave group propagates differs from the phase velocity at which individual harmonic waves travel (Stein and Wysession, 2009). The group velocity u of a wave is the velocity with which the overall energy of the wave propagates through space. If the signal has energy over a wide range of frequencies, u = dω/dk and the group velocity is related to the phase velocity as 825 u(ω) = c(ω) + dc dk (B2) In ambient noise tomography, it is common to measure the group velocity of dispersive Rayleigh waves travelling in the Earth's crust and upper mantle (e.g. Shapiro et al., 2005;Mordret et al., 2013). Dispersion curves of group velocities are usually computed using the Frequency Time Analysis (FTAN) of the noise cross-correlation time-series (Levshin et al., 1992). In glaciers, due to homogeneous ice, only weakly dispersive surface waves are recorded at on-ice seismometers. It is then 835 difficult to use FTAN to measure Rayleigh wave group velocity dispersion. We choose here to compute the phase velocity dispersion curve for Rayleigh waves using different approaches. Obtaining the group velocity from the phase velocity is then straightforward while the reverse is not possible due to unknown additive constants which arise from the integration of the phase velocity over frequency (equation B2).

B1 Array processing: frequency-wavenumber analysis 840
Frequency-wavenumber analysis (f-k) was used to compute the velocity dispersion curves at Glacier d'Argentière (Sect. 3.1 and 3.2). The f-k analysis is a standard array processing for computing phase velocities from seismic time-series recorded on a line of receivers (Capon, 1969). It enables to identify and separate wave types and wave modes and also design appropriate f-k filters to remove any seismic energy in the original signal time-series.
The most basic f-k processing employs a 2D Fourier transform on both time and spatial components to construct the f-k 845 diagram (Fig. B1b). We then need to select the dispersion curve of the phase of interest by picking the energy maxima of the 2D Fourier transform output. The absolute value of the f-k space is then transformed into the velocity c(f ) via the equation B1 (Fig. B1c).
There exist multiple array techniques for computing frequency-velocity diagrams using spectral analysis in time and space domains. Some of them are described in Rost and Thomas (2002), Gouédard et al. (2008a) and referenced in Ohrnberger et al.

850
(2004). Concerning the cross-correlation functions obtained at Argentière array, we employ the phase-shift method of Park et al. (1998) which allows to construct a frequency-velocity diagram where dispersion trends are identified from the pattern of energy accumulation in this space. Then, necessary dispersion curves are extracted by following the diagram amplitude trends ( Fig. 3c and Fig. B2c).
The performance of an array for deriving phase velocity values in a wavenumber or frequency range depends on its geometry 855 and on the wavefield characteristics (i.e. frequency range and magnitude of seismic energy with respect to attenuation). The capability to resolve phase velocity at a given frequency depends on the array aperture (described by the array diameter ∆ max ) and minimum sensor spacing (∆ min ) so that at least two wavelengths are sampled between adjacent receivers to avoid aliasing in the wavenumber domain (e.g. Wathelet et al., 2008). Phase velocities should then be computed for frequencies which satisfy ∆ min ≤ nλ ≤ ∆ max , with n usually taken as 2 or 3. This relationship depends on the expected phase velocity as λ = c/f .

B2 Aki's spectral method
This method was used to compute the Rayleigh wave velocity at the GIS and Gornergletscher arrays (Sect. 4.2 and 5.2).
Whereas the f-k techniques are based on the assumption of a plane wave arriving at the array, the Aki's spectral method (also referred to SPAC method, Tsai and Moschetti, 2010) bases its theoretical foundation on the precondition of a scalar wavefield which is stationary in both space and time. As detailed below, this technique does not require specific array geometries to 865 compute phase velocities and can be used on single pairs of stations as long as one is in the presence of an isotropic incident wavefield. Another advantage concerns the capability to resolve discrete frequencies on a potential wider range than what is possible using f-k methods as the Aki method produces robust and unbiased measurements at distances smaller than two wavelengths (Ekström et al., 2009). The major limit is set by the seismic wavefield characteristics. Aki (1957) states that the azimuthally averaged normalized cross-spectrum S(∆, ω 0 ) for a receiver separation ∆ and fre-

870
quency ω 0 varies as J 0 , the zero-order Bessel function of the first kind This relation suggests that the dispersion curve of phase velocities can be obtained from the fit of a J 0 Bessel function to the cross-spectrum obtained on a loop of receivers of same radius ∆, or also, as demonstrated by Cox (1973), to the cross-spectrum obtained for a single station pair if computed on an azimuthally isotropic wavefield.
875 Ekström et al. (2009) successfully obtain phase velocity estimates at discrete frequencies from ambient noise cross-correlations, by associating the zero-crossing of the real part of the data cross-spectrum with zeros of a Bessel function following equation B3.  and Lindner et al. (2018) both use this method to obtain dispersion curves of Rayleigh wave speeds from cross-correlations of on-ice seismic records.
Application of this method is presented in Fig. B3 for one cross-correlation function obtained at Argentière. Because of 880 possible noise contained in the correlation time-serie, the cross-spectrum is first smoothed to avoid any extra zero-crossing measurement. As there is a possibility of having missed one or several zero-crossings (indicated by black squares in b), several dispersion curves are generated (black dashed lines in c). The correct one still needs to be identified by judging the plausibility of the results given the expected velocities of the propagation medium (Ekström et al., 2009).
The dispersion curve estimation can be refined by fitting the entire cross-spectrum with a Bessel function, instead of fitting 885 the zero-crossings only. We develop an approach similar to Menke and Jin (2015) who employ a grid search to generate an initial estimate of the phase velocity that matches the observed cross-spectrum and then use a generalized least-squares procedure to refine this initial estimate.
The prior dispersion curve has to be as close as possible to the measured dispersion curve to avoid cycle skipping during the fitting. In our procedure, we take as a starting model, the average dispersion curve obtained from f-k analysis of the cross-890 correlation section computed at Argentière array. For sparse network configuration (as in Greenland and Gornergletscher, i.e. Fig. 9d and 11c), we take the theoretical dispersion curve computed with Geopsy software and based on the prior knowledge we have on the subsurface. The dispersion curves are then modelled as polynomial functions to enforce their smoothness and to reduce the number of fitting parameters and help the convergence of the fitting process. The order of the polynomial can be varied. Usually a polynomial order of 5 is a good compromise between smoothness and complexity of the dispersion curve. 895 We then use a least-square inversion procedure of the polynomial coefficients to estimate the dispersion curve which best reproduces the observed cross-spectrum. Fig. B3 shows the output of this procedure which yields to the same dispersion curve (red line in c) as the most probable one computed by fitting the zero-crossings. The overall-fitting method is particularly efficient for estimating more accurate phase velocities than with the zero-crossing fit, when considering correlation functions with low SNR (Menke and Jin, 2015).

900
However it is particularly sensitive to the frequency range in which the least-square inversion is performed. When considering cross-correlation functions computed in narrow frequency bands as in Sect. 4 and 5, the method introduces strong side effects near the frequency corner limits due to filtering ( Fig. 9d and 11c). The cross-spectrum must then be fitted considering carefully selected frequency components.
In the example shown in Fig. B3, the cross-correlation function is computed for frequencies 1-25 Hz. The Bessel fitting 905 method is applied to frequencies above 2 Hz and enables to widen the velocity estimates down to 3 Hz (that is the frequency at which the inter-station distance is approximately equal to one wavelength) when compared to the zero-crossing output.

B3 Slant-stack technique on discrete sources
This method employed by Walter et al. (2015) can only be applied on cross-correlation functions that are computed on discrete sources (i.e. icequakes). It was used to obtain the velocity from ICC at Glacier d'Argentière in section 3.3. We here exploit the 910 phase time-difference in the arrival times of Rayleigh waves with respect to the source position, and reproduce the azimuthal variations of phase times assuming a constant velocity.
The plane wave approximation implies a sinusoidal dependence of the arrival times which depend on the source azimuth and propagation velocity c as ∆ cos (θ)/c with ∆ the station separation and θ here defined as the source azimuth relative to the station pair axis (Fig. B4 a-b). We call the endfire lobes the two areas aligned with the receiver direction, in which the phase 915 of the correlation function is stationary with respect to azimuth. The angular aperture of the endfire lobes depends on the ratio between the seismic wavelength λ and the station separation as δθ = λ/∆ (Roux et al., 2004;Gouédard et al., 2008b).
To measure seismic velocities at one station path and at different frequencies, we filter the individual correlation functions computed for each event to octave-wide frequency ranges. The lower frequency we can resolve is determined by the icequake spectral content and is most importantly related to the station separation as we require that at least two wavelengths are sampled.
arrival time of the Rayleigh wave is measured as the maximum of the correlation function computed at each frequency. We then invert the best sinusoide fit to the times of the maxima (in the least-square sense). The best solution gives the velocity estimate at the central frequency of the spectral band.

925
The velocity solution estimated by this method is naturally averaged over the azimuth range and can only be considered as the average velocity in the presence of strong azimuthal anisotropy which implies azimuthal variations of propagation velocities (Sect. 3.3). Nevertheless, the least-square solution fits very well the Rayleigh wave arrival times in the azimuth range of the stationary phase zones (Fig. B4b), and then is considered to represent well the propagation medium between the two stations.
To minimize the effects of location errors or low SNR of the correlation components, we perform a jackknife test on randomly 930 selected events to fit the sinusoide. We require that the maximum of the cross-correlation stacked over bootstrap samples (i.e. selected correlation functions that have been shifted by the inverted arrival times prior stacking) exceeds 0.7. We require that at least 10 azimuth bins including the endfire lobes are considered in the sinusoidal fit. The final velocity at each frequency is then averaged over 200 jackknife tests. to the cross-correlations in (b). In this particular case, a receiver contour (consisting of 32 regularly-spaced receivers) and a single receiver between these receiver lines (green triangle) are illuminated by 4282 sources (i.e. icequakes) on either side of the receiver cavity. The receiver colored in red is here turned into a virtual source, whose response is recorded at the green receiver. Due to the reflecting boundary conditions and the receiver geometry, the emitted wave travels back and forth between the two receiver lines indicated with the black dashed lines.
We obtain multiple reflections noted Ri (i indicates the number of virtual reflection), which are visible on the MDD correlation gather and averaged stack in (c). Virtual reflections Ri create an artificial coda after the ballistic Rayleigh wave reconstructed on the GF estimate. This coda is observed to be coherent through time (here we show cross-correlation stacks computed on two-day sliding window with an overlap of one day) and is suitable for CWI studies. (b) is the same as (c) but for standard processing of icequake cross-correlations at the two sensors in green and red. The real part of the spectrum of (a) is in black and associated zero-crossings are marked by squares. The gray line indicates the real part of the spectrum associated with a priori phase velocity dispersion curve which serves as a starting model for the least-square fit (in red) of the observations. (c) Corresponding phase velocities estimated by zero-crossings (black dashed lines) and least-square fit (red). The prior dispersion curve used for the Bessel fit is plotted in gray.
The black arrow indicates the minimum frequency above which we can trust the velocity measurements and corresponds to approximately one wavelength.