Interactive comment on “An updated and quality controlled surface mass balance dataset for Antarctica” by V.

As a consequence, defining objectively whether annual layers have been preserved (and can be easily identified) is crucial. We do not believe that this is generally achievable for such a large database. For instance, at Dome C, observation of stake networks suggests that erosion is present at 30% of the stakes, although the area presents a positive mean distributed accumulation every year (see Glacioclim stake data at Dome C, http://www-lgge.ujf-grenoble.fr/ServiceObs/SiteWebAntarc/dc.php). At plateau stations, and more generally at “low accumulation sites”, missing some annual layers seems rather plausible. However, even in more coastal regions (e.g., the Glacioclim-Samba stake network in Adelie Land) we observed that several stakes may present erosion during low accumulation years, even though the mean accumulation there is around 300 mm w.e. a⁻¹. As a consequence, dating techniques based on layer counting do not offer absolute ages, and were rated “B”. However, it should be noted that “B” rated data are not rejected but conditionally accepted. Because all the data are available in the full database, it is suggested that scientists perform their own quality control on the data. In this paper we decided to use only the “A” rated data, because this option was the most restrictive and the least subject to discussion. However, a similar analysis may be done with less restrictive criteria.


Introduction
In the context of global warming, particular attention is being paid to the mass balance of the Antarctic ice sheet (AIS) and its impact on sea level rise (e.g., Lemke et al., 2007; Shepherd et al., 2012). With a surface area of 12.3 × 10⁶ km², the annual surface mass balance (SMB) of the grounded ice represents a huge mass flux, which is expected to increase in the future, leading to a significant compensation of solid ice discharge, and hence to the eustatic sea level rise (e.g., Monaghan et al., 2006; Krinner et al., 2008). However, because reliable field information concerning the Antarctic SMB is scarce, the integrated SMB over the continent presents a large uncertainty (between −4.9 ± 0.1 and −5.7 ± 0.3 mm sea level equivalent a⁻¹; Lenaerts et al., 2012b). Thus, it is crucial to aggregate all available field data to better constrain interpolation techniques based on modeling or remote sensing data.
Even though several methods have been developed to assess the SMB in the field (see Eisen et al., 2008, for a review), direct SMB measurements are rare in Antarctica and existing ones generally span a very local area (e.g., stake and ice core measurements). The size and remoteness of the AIS and the harsh climatic conditions make long-term investigation difficult. All available data have only been compiled once previously, by Vaughan and Russell (1997). This Antarctica database (hereafter referred to as V99) was described in detail by Vaughan et al. (1999). The V99 database legitimately became a reference for climate studies in Antarctica and was regularly used for model validation (e.g., Van de Berg et al., 2006; Krinner et al., 2007, 2008; Lenaerts et al., 2012b). However, only partial updates have been undertaken since 1999 (e.g., Magand et al., 2007; Van de Berg et al., 2006; Lenaerts et al., 2012b), even though important new datasets have been acquired since then. For instance, during the last International Polar Year 2007-2008 (IPY), several inland traverses were performed with several scientific goals, including filling the gaps in SMB measurements. In the framework of the international TASTE-IDEA programs (Trans-Antarctic Scientific Traverse Expeditions - Ice Divide of East Antarctica), isolated measurements and traverses were performed, such as from Troll station to South Pole (Anschütz et al., 2009), from the Swedish Wasa station to the Japanese Syowa station (Fujita et al., 2011) and along the French traverse to Dome C (Verfaillie et al., 2012).
V. Favier et al.: An updated and quality controlled surface mass balance dataset. Published by Copernicus Publications on behalf of the European Geosciences Union.
Based on the V99 database, several authors interpolated the SMB data to the whole AIS. The current surface accumulation value integrated over the grounded ice sheet is generally assumed to range between 143 mm w.e. a⁻¹ (Arthern et al., 2006) and 168 mm w.e. a⁻¹ (Van de Berg et al., 2006). These two studies are generally considered the most reliable ones: the computations of Arthern et al. (2006) included interpolation methods based on remotely sensed passive microwave data to accurately fit the observed SMB from the V99 database (Monaghan et al., 2006), and Van de Berg et al. (2006) calibrated model results. However, these values should be considered with caution because a reliability check of the V99 data, as proposed by Magand et al. (2007), was not performed before interpolating field data. In fact, different problems affect estimates of the Antarctic SMB, particularly limited or unwarranted spatial and temporal coverage and measurement inaccuracy (Magand et al., 2007). Bias in surface measurements can strongly affect SMB estimation for the whole of Antarctica (e.g., Genthon et al., 2009; Lenaerts et al., 2012b). Such a bias was observed by Verfaillie et al. (2012), who identified a serious discrepancy between the SMB of Arthern et al. (2006) and recently updated SMB estimates for Adelie Land. Similar discrepancies were also reported from observations of SMB along the Norway-USA traverse (Anschütz et al., 2009, 2011). Further, SMB interpolations (e.g., by passive microwave) may be inaccurate in steep slope terrain, in wind glazed snow areas (Scambos et al., 2012) and in melting snow areas (Magand et al., 2008).
Here, we present an updated SMB database for Antarctica. An important part of the work was documenting and formatting so-called "metadata" (e.g., time coverage, measurement methods, altitude), which are required when using data, especially to check the quality of the SMB values. In Sect. 2, we present this updated database, describe the improvements in spatial coverage, and compare the data with the V99 dataset (Sect. 2.2). A quality control allows us to reject data considered as unreliable (Sect. 2.3). The impact of this quality control on the spatial distribution of reliable data over Antarctica is discussed in Sect. 2.4. In Sect. 3, we compare the data with ERA-Interim reanalysis (Simmons et al., 2006), and show the importance of the selected data for climate model validation. The comparison highlights the remaining gaps in the spatial coverage of surface mass balance data in Antarctica, and the biases that can occur when interpolating these data. Finally, in Sect. 4, we discuss the main gaps in the SMB database and suggest how to achieve a better estimate of the Antarctic SMB.

Definitions
The surface mass balance (or net accumulation of snow/ice; hereafter referred to as SMB) can be expressed as the balance between the accumulation and ablation terms as follows:
$$\mathrm{SMB} = P_S + P_L - \mathrm{ER} - \mathrm{SU} - \mathrm{RU},$$
where P_S, P_L, ER, SU and RU are solid precipitation, liquid precipitation, erosion by the wind, sublimation and runoff, respectively. Drifting snow deposition is represented by a negative ER term. Hence SMB is the result of the competition between accumulation and ablation terms. The knowledge of erosion or deposition is crucial in windy areas where these processes lead to extremely high spatial variability of SMB values. For instance, in the coastal area of Adelie Land, the SMB may change from negative to highly positive values within a distance of one or two kilometers (Agosta et al., 2012).
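The balance defined above can be sketched in a few lines of code. This is only an illustration: it reads the definitions as SMB = P_S + P_L − ER − SU − RU (erosion, sublimation and runoff counted as losses, deposition as negative erosion), and the function name and the numerical values are hypothetical, not taken from the database.

```python
def surface_mass_balance(p_solid, p_liquid, erosion, sublimation, runoff):
    """Return SMB = P_S + P_L - ER - SU - RU.

    All terms share one unit (e.g., mm w.e. per year).
    A negative `erosion` value stands for drifting snow deposition.
    """
    return p_solid + p_liquid - erosion - sublimation - runoff


# Made-up values (mm w.e. per year) for two neighbouring windy sites:
# same precipitation and sublimation, opposite sign of the wind term.
eroded_site = surface_mass_balance(300.0, 0.0, 120.0, 40.0, 0.0)
deposition_site = surface_mass_balance(300.0, 0.0, -120.0, 40.0, 0.0)
print(eroded_site, deposition_site)
```

With identical precipitation, changing only the sign of the wind term moves the result from 140 to 380 mm w.e. per year, which is the kind of short-distance contrast described for windy coastal areas.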

The fully updated database
Because the International Polar Year (IPY) recently provided a large amount of new SMB data, an update of the existing SMB compilation is timely. We consequently updated the V99 database by including the large amount of new SMB data obtained since 1999 (Fig. 1b). Important new information was obtained during the European EPICA and international TASTE-IDEA programs, when isolated measurements and traverses were performed (Fig. 1a), including in Dronning Maud Land (e.g., Rotschky et al., 2007) and from Ross Sea to Talos Dome (French-Italian contribution to ITASE; Frezzotti et al., 2004). Measurements were also taken along the French traverse to Dome C (Agosta et al., 2012; Verfaillie et al., 2012), along the Norway-USA scientific traverse from South Pole to Dronning Maud Land (Anschütz et al., 2009, 2011; Müller et al., 2010), and along the Japanese-Swedish traverse from the Swedish Wasa station to the Japanese Syowa (also spelled Showa) station (Fujita et al., 2011). A large new dataset was acquired from Zhongshan station to Dome A by the Chinese Antarctic Research Expedition (CHINARE) (Ding et al., 2011). Some traverses have also been revisited, like the Japanese traverse from Syowa to Dome Fuji (e.g., Motoyama and Fujii, 1999; Motoyama, personal communication, 2011), resulting in a major update and completing SMB data close to the Fujiwara and Endo (1971) route. Finally, we also present unpublished stake data from the coast to Princess Elizabeth station, which result from the collaboration between the Belgian Antarctic expeditions and the French Polar Institute (IPEV) in the framework of the GLACIOCLIM observatory (Agosta et al., 2012).
[Fig. 1 caption, partly recovered: ... Bull (1971) which were directly excluded from the Vaughan et al. (1999) database due to their low reliability (digitalized from maps). Background map is elevation according to (Bamber, 2009). (c) Location of reliable field data (black dots) and selected datasets for model validation. Background map is elevation according to (Bamber, 2009).]
However, in this paper, we did not include SMB values obtained with ground-penetrating radar (GPR), because, unlike stake measurements for example, these are indirect measurements of SMB and require an interpretation of radargrams. In fact, difficulties in signal processing and interpretation may occur when picking the reflectors, which are sources of error (Verfaillie et al., 2012). Moreover, even if radargrams are available as graphs, the age of reflectors is generally not identified in publications, and getting data from publications is not straightforward. Thus, we chose not to include the published GPR data.
In addition to SMB values, information essential for a quality control is also provided, i.e., location, methodology, altitude, local mean temperature, distance to the coast, dates of measurements, SMB units in the primary data sources, time period covered by the SMB values, and primary data sources. This primary information was retrieved for both new data and previous V99 data, which enabled us to correct several data. For instance, correction of longitude for measurements on Siple Coast was possible thanks to the primary publications (Thomas et al., 1984; Bindschadler et al., 1988). In some cases, if measurements were a short distance apart (within approx. 20 × 20 km²), the V99 database only gives their averaged values. Instead, we documented each data point. This was mainly the case at the South Pole and along traverses around Lambert Glacier, in Wilkes Land and from Syowa to Dome Fuji (Table 1). This increases the number of available measurements by 1493 (Table 1) (even though these data did exist in the V99 database, it was at a lower spatial resolution). Of these 1671 data, 215 from the Lambert Glacier traverse to Dome A were updated using new measurements made since 1999. These data offer a more accurate description of small scale (1 to 2 km scale) SMB spatial variability. Other specific characteristics were also added to the database, for instance, the presence of blue ice and of megadunes (when available in primary sources).
Retrieving the primary information was complex because the complete information is usually not available in a single publication. After tracking down previous publications, we were able to select the most relevant data together with precise information on the method used and the location. This included digitalizing data from figures or maps when necessary, which is clearly indicated in the final database. Finally, when different time periods were available for a single location (for instance, when several layers were reliably dated in ice cores), SMB estimates are given for each period.
This involved compiling and documenting more than 5800 SMB data distributed over the whole continent (Fig. 1b). Following Magand et al. (2007), we rejected data that did not correspond to measurements of annual SMB. This was the case for 255 data provided by Bull (1971) for which metadata are missing (e.g., Vaughan and Russell, 1997). Several data, for instance those between Dome Fuji and South Pole, can be traced as probably originating from a traverse undertaken in the area before 1971 (Fujiwara and Endo, 1971). However, the original publication suggests that these data are not highly reliable, justifying their rejection.
The full updated surface mass balance of Antarctica database (called the SAMBA-LGGE database) now contains 5548 data (Table 2). This database is fully and freely available on the GLACIOCLIM-SAMBA Observatory website: http://www-lgge.ujf-grenoble.fr/ServiceObs/SiteWebAntarc/database.php.

A reliable dataset extracted from the full database
A first update and improvement of the V99 database was performed by Magand et al. (2007), who focused on a limited part of Antarctica (90-180° East Antarctic sector). These authors applied a quality control to SMB estimates based on objective criteria of reliability, as initially suggested by Bull (1971). We applied the quality rating based on measurement techniques provided by Magand et al. (2007). We do not discuss the quality and reliability of the method here because this has already been done by Magand et al. (2007), but the main explanations for the data rating are summarized in Table 3. The quality control enabled us to select only reliable SMB values, leading to a new subset, hereafter referred to as the "A" rated dataset. The measurement techniques we considered very reliable are rated "A". Techniques considered less reliable are provisionally accepted and rated "B", while those considered unreliable are rated "C" (Table 3). Like Magand et al. (2007), we also rejected data when information that was crucial for the quality control was missing, i.e., location, SMB value and unit, method and period covered (for stake data).
Results rated "A" form a new dataset of 3539 reliable SMB values (Table 2, Fig. 1c). This is about four times more than the 745 reliable data obtained by Lenaerts et al. (2012b), who conducted a similar quality control on the V99 database. Since our aim was to retrieve a high quality dataset, our data filtering may be too restrictive. Note that the fully documented database is available on the GLACIOCLIM-SAMBA (hereafter referred to as GS) website, so that any other control can be applied to the data.
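Because the full database is distributed with its ratings, a user-defined quality control can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout, the field names, the method label "stake" and the sample values are all hypothetical, and the "A"/"B"/"C" rating is taken as an input (the rating rules of Table 3 are not reproduced here). Only the two rules stated in the text are applied: reject records lacking crucial metadata, then keep "A" rated records.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SmbRecord:
    # Hypothetical record layout; None marks missing metadata.
    latitude: Optional[float]
    longitude: Optional[float]
    smb: Optional[float]
    unit: Optional[str]
    method: Optional[str]
    rating: Optional[str]              # "A", "B" or "C", assigned beforehand
    period_years: Optional[float] = None


def has_crucial_metadata(rec: SmbRecord) -> bool:
    """Location, SMB value and unit, and method must be present;
    the period covered is additionally required for stake data."""
    basics = (rec.latitude, rec.longitude, rec.smb, rec.unit, rec.method)
    if any(value is None for value in basics):
        return False
    if rec.method == "stake" and rec.period_years is None:
        return False
    return True


def select_a_rated(records: List[SmbRecord]) -> List[SmbRecord]:
    """Keep only complete records rated "A"."""
    return [r for r in records if has_crucial_metadata(r) and r.rating == "A"]


# Made-up sample records, for illustration only.
records = [
    SmbRecord(-75.1, 123.3, 27.0, "mm w.e. a-1", "stake", "A", period_years=6.0),
    SmbRecord(-70.0, 40.0, 150.0, "mm w.e. a-1", "stake", "A"),            # period missing
    SmbRecord(-80.0, -120.0, 180.0, "mm w.e. a-1", "stratigraphy", "B"),   # not "A" rated
    SmbRecord(-78.0, None, 90.0, "mm w.e. a-1", "known horizon", "A"),     # longitude missing
]
reliable = select_a_rated(records)
print(len(reliable))
```

A less restrictive control, for example one that also accepts "B" rated data, only requires changing the condition in `select_a_rated`.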

Analysis of the "A" rated dataset
The impact of the quality control on the distribution of available data over Antarctica was tested by comparing the full database with the "A" rated dataset (Table 2). The quality control led us to remove data from large areas (Fig. 1c), mainly in West Antarctica. In particular, measurements are lacking for a large area between Marie Byrd Land and the coast. This is particularly important because models were initially suspected to have common positive biases (i.e., overestimated SMB) compared to surface accumulation compilations (Genthon and Krinner, 2001; Van de Berg et al., 2006). Since data for this area are not reliable, it is difficult to know whether the models are correct or not. Data availability is also particularly poor for the region from the Filchner-Ronne ice shelf to the South Pole, and for the Pine Island glacier catchment, which was the site of considerable research in the past but where SMB values were usually obtained through snow stratigraphy studies (e.g., Pirrit and Doumani, 1961; Shimizu, 1964). Stratigraphy data are generally assumed to be ambiguous because precipitation is low, presents high annual variability, and is affected by strong surface snow metamorphism, resulting in partial or sometimes total obliteration of annual layering (e.g., Magand et al., 2007). Other large datasets from traverses to and around the South Pole were also excluded because the data were originally obtained from digitalized maps (e.g., Bull, 1971) or from snow stratigraphy studies (Brecher, 1964). Finally, the quality control resulted in a huge reduction in available SMB values at Siple Coast and on Ross Ice Shelf because the data are mainly stake measurements made over only one year (Bindschadler et al., 1988; Thomas et al., 1984). Because inter-annual variability of snow accumulation is large in Antarctica, a one year SMB estimate cannot be representative of the mean local SMB, and more than 3 yr are required to obtain an accurate estimate of the average SMB (Magand et al., 2007). However, this data gap is less serious because snow accumulation on the Ross Ice Shelf does not affect the grounded ice SMB, so that changes in accumulation in this area do not directly affect sea level rise. Nevertheless, surveying possible future melting over the ice shelf is an important scientific concern and obtaining new SMB data there is essential. The proximity of the main Antarctic station (McMurdo station) is an ideal opportunity to plan future studies since it is the departure point for scientific research on the Ross Ice Shelf.
[Table 2, partly recovered (first row truncated in the source): ... Bull, 1971). "A" rated data: strict quality control (see Table 3), only "A" rated data are retained; 3539 data. For 20th century model validation: blue ice data, data covering more than 70 yr, and data with differences in elevation of more than 200 m from the digital elevation model from Bamber et al. (2009) were excluded; 3242 data.]
The removal of suspicious data has considerably modified the distribution of the SMB. In particular, the SMB-elevation relationship differs depending on whether it is calculated with only the "A" rated dataset or with the whole dataset. There is a significant difference between 200 m a.s.l. and 2000 m a.s.l. over East Antarctica (Fig. 2a), because few observations are made over this elevation range and removing incorrect data thus had a significant impact on the mean SMB. Because large A-rated datasets exist in the high interior of the 0-90° E sector of Antarctica (Fig. 1), the difference between the "A" rated dataset and the whole dataset was limited above 2500 m a.s.l. over the East Antarctic Ice Sheet (EAIS) (Fig. 2a). However, inaccurate observations provide a wet bias at high elevations in the 90-180° E sector of Antarctica, as already observed by Magand et al. (2007). There was a significant difference at every elevation over West Antarctica (Fig. 2b) because the number of unreliable observations is high for all elevation ranges on this side of the continent. The mean SMB of areas with field measurements (Table 4, see values in italics) over Antarctica differed significantly before (154 mm w.e. a⁻¹) and after the quality control (140 mm w.e. a⁻¹), and the difference was even higher in West Antarctica (238 versus 157 mm w.e. a⁻¹) than in East Antarctica.
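The comparison of the SMB-elevation relationship with and without the quality control can be illustrated with a simple binned mean. The sketch below is not the procedure used for Fig. 2: the 500 m bin width, the tuple layout and the sample values are assumptions chosen only to show the principle.

```python
from collections import defaultdict


def mean_smb_by_elevation(records, bin_width=500.0):
    """Average SMB per elevation bin.

    `records` is an iterable of (elevation_m, smb, rating) tuples.
    Returns {lower bound of the bin in m: mean SMB in that bin}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for elevation, smb, _rating in records:
        lower = (elevation // bin_width) * bin_width
        sums[lower] += smb
        counts[lower] += 1
    return {lower: sums[lower] / counts[lower] for lower in sums}


# Made-up observations: (elevation in m a.s.l., SMB in mm w.e. per year, rating).
observations = [
    (300.0, 400.0, "A"),
    (450.0, 900.0, "C"),
    (2600.0, 60.0, "A"),
    (2700.0, 80.0, "B"),
]
all_means = mean_smb_by_elevation(observations)
a_means = mean_smb_by_elevation([o for o in observations if o[2] == "A"])
print(all_means, a_means)
```

In this toy case, removing one unreliable point changes the mean of a sparsely sampled low elevation bin far more than that of the plateau bin, which mirrors the sensitivity described above.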
After the removal of unreliable data, the SMB of Antarctica can be studied with more confidence. The SMB significantly increases from 200 m to 1000 m a.s.l., although with marked scattering (Fig. 3). At higher elevations, between 1800 and 4000 m a.s.l., the SMB and its scattering decrease progressively as the SMB is very low over interior plateaus. The frequency distribution of surface elevation differs between the entire continent and the observation points alone (Fig. 4a), which means that the observations are not equally distributed as a function of altitude. Indeed, the frequency of surface elevations in Antarctica peaks at around 0 m a.s.l. (ice shelves) and at 3200 m a.s.l., with a very broad maximum between 1800 m a.s.l. and 3400 m a.s.l., whereas a narrow maximum appears at 2800 m a.s.l. in the case of SMB measurements. Although new data at low elevation were added to this dataset, low elevation areas are not sufficiently documented considering their contribution to the total SMB and the high spatial variability of their SMB. Available data are still insufficient, and measurements were mainly made in East Antarctica. The low density of field measurements is a serious obstacle to accurately assessing the Antarctic SMB (e.g., Van de Berg et al., 2006).
www.the-cryosphere.net/7/583/2013/ The Cryosphere, 7, 583-597, 2013
Each SMB value was measured over a different period of time. Ninety percent of the periods covered less than 20 yr and 43 % less than 5 yr (Fig. 4c, d). The period covered is closely related to the method used to estimate the SMB. The major cause of the stair-like distribution of the histogram in Fig. 4d is the presence of data from very large stake networks (e.g., around Lambert Glacier; Higham and Craven, 1997; Ding et al., 2011) that span only a few years. Dating known horizons in cores or snow pits (volcanic eruptions, nuclear tests) is accurate and provides good estimates of the SMB over long periods (15 to 60 yr). But these observations are isolated because they are difficult to perform at a high spatial density. On the other hand, stake measurements are very useful because they are generally made at a high spatial density, which leads to a correct sampling of the actual SMB distribution in the field. This is particularly useful in coastal areas, because stake networks provide relevant information over a wide range of elevations, and enable the increase in SMB caused by orographic precipitation to be accurately measured (e.g., Agosta, 2012; Agosta et al., 2012). Stake networks also allow information to be collected on the inter-annual variability of the SMB. However, acquiring long time series requires the maintenance of a regular stake network with regular renewal of the stakes and annual assessment of stake height and density, which is difficult over long periods. For this reason, stake measurements generally cover periods of less than 10 yr. Hence, stake measurements represent the largest proportion (82 %) of observations, because several large stake networks (containing many stakes) exist, but were measured only a few times. For these reasons, the scientific community cannot rely only on this method to increase data density at the continental scale.

Comparison of the "A" rated dataset with results of ERA reanalysis

A subset of data used for the comparison
Regional features like elevation, continentality, location of sites relative to major and minor ice divides, surface slope and so on, clearly impact SMB distribution in Antarctica. However, large-scale features do not have the same consequences on SMB distribution, because SMB is more precisely related to how depressions penetrate inland and provoke precipitation, and to how the wind affects snow distribution. Although imperfect, model outputs are useful here because of their large scale coverage and their ability to predict the geographical distribution of the current and future SMB.
Thus, combining observational data with model outputs is essential to identify both biases in the model and biases due to heterogeneous data coverage.
It is difficult to compare scattered field data with model outputs on a regular grid. For this reason, we defined a special dataset for a (basic) model validation. Because climate models generally focus on climatic conditions at the end of the 20th century, we filtered the database for this period, to avoid possible long-term climate variations. Here, we only considered data covering the last 70 yr, leading to a slight reduction in the database (52 data were removed). We are aware that this process does not remove the decadal bias of each datum, because data present distinct time coverage. Ideally, this sub-dataset should be rescaled to a reference time period to produce a homogeneous climatology. However, our purpose here was not to provide an accurate SMB map at the scale of Antarctica, but to compare the available field information with ERA-Interim data to judge whether their spatial distribution is sufficiently regular and dense to allow model validation. In future work, data will be rescaled against a common period to remove regional trends caused by heterogeneous coverage of time.
Several data were further left aside because the elevation (as given in published works) differed from the local elevation given by the 1 km resolution digital elevation model (DEM) of Bamber et al. (2009). Differences may result from errors in compiling field data (for instance, if an elevation or geographic location was incorrectly estimated in the field). Differences can also be due to the DEM resolution (1 km), because local variations in topography in the DEM may be smaller than those of the real terrain. A significant error in the DEM, which may apply to several points, is also possible when the slope is very steep. Consequently, we removed data for which the difference in elevation exceeded a 200 m threshold (Fig. 5). This led to the removal of 44 observations. Finally, when validating the climate model, we noted that a few points still require a detailed analysis: 26 observations by Sinisalo et al. (2003) and 164 observations on Taylor glacier by Bliss et al. (2011) were in blue ice areas and should not be included in a validation process unless the climate model concerned takes erosion and sublimation processes into account (Fig. 3). Removing these data does not guarantee that snow drift sublimation or transport plays no role at the other points, but it allows us to focus on areas where these processes are not the dominant ones. These additional removals led to a subset of data totaling 3242 observations for comparison with model outputs (Table 2).
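The three exclusion rules used to build the validation subset (period covered, elevation mismatch with the DEM, blue ice) can be written as a single predicate. This is a sketch under assumptions: the function and argument names are hypothetical, the 70 yr rule is read as "data covering more than 70 yr are excluded" (the wording of Table 2), and the sample values are made up.

```python
def keep_for_validation(period_years, elevation_m, dem_elevation_m, blue_ice,
                        max_period_years=70.0, max_elevation_diff_m=200.0):
    """Return True if an "A" rated datum passes the three extra filters."""
    if blue_ice:
        return False          # erosion/sublimation dominated site
    if period_years > max_period_years:
        return False          # covers more than 70 yr
    if abs(elevation_m - dem_elevation_m) > max_elevation_diff_m:
        return False          # published elevation disagrees with the DEM
    return True


# Made-up examples: (period, published elevation, DEM elevation, blue ice flag).
samples = [
    (12.0, 1510.0, 1450.0, False),   # kept
    (150.0, 3200.0, 3190.0, False),  # rejected: period too long
    (5.0, 900.0, 1250.0, False),     # rejected: 350 m elevation mismatch
    (3.0, 1200.0, 1210.0, True),     # rejected: blue ice
]
kept = [s for s in samples if keep_for_validation(*s)]
print(len(kept))
```

Keeping the thresholds as keyword arguments makes it straightforward to test how sensitive the subset is to the 70 yr and 200 m choices.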
We also chose to focus on low elevation areas of Antarctica, where much of the snow accumulation occurs. Seventy percent of the Antarctic SMB accumulates below 2000 m a.s.l., although this elevation range represents only 40 % of the total area of Antarctica. Low elevation areas are those where spatial variability in the SMB is the highest, and where the largest future changes in SMB are expected to occur in the 21st century (e.g., Krinner et al., 2007, 2008; Genthon et al., 2009; Agosta, 2012). Conversely, accumulation over interior plateaus is very low (less than 50 mm w.e. a⁻¹) and rather homogeneous over long distances as the topography is flat. Thus, field observations at low elevation are most appropriate for model validation, as already demonstrated in coastal Adelie Land, where data from the GS observatory allowed us to identify a number of discrepancies in various models (Agosta et al., 2012). Because low elevation areas (that is, where high SMB values are observed: Fig. 4b) are undersampled by field observations, a focus on these specific areas is necessary. We selected datasets starting from coastal regions and extending inland, in order to include a strong topographic contrast (between 0 m a.s.l. and 2000 m a.s.l., and sometimes extending up to 3000 m a.s.l. when data from a continuous traverse were available). These data cover the peripheral regions and key catchments of Antarctica. We further selected homogeneous data in terms of temporal coverage and methodology, and gathered data resulting from the same initial publications and origin. This led us to select the 10 datasets listed in Table 5 and shown in Fig. 1c, corresponding to traverse lines in Adelie Land (GS dataset), around Law Dome, from Zhongshan to Dome A, around the west side of Lambert glacier (above Mawson station), from Mirny to Vostok and from Syowa station to Dome F.
Considering the spatial density of measurements, these data are particularly appropriate for model validation in coastal areas. We additionally selected three datasets not from traverses but from points located in the Byrd region, along the Antarctic Peninsula and in Dronning Maud Land.
For Dronning Maud Land, Mirny to Vostok and the Peninsula, these observations cover a wide range of elevations (Fig. 6a) and present a very low spatial density. These values thus provide important information on the regional increase in the mean SMB, but the data are also highly impacted by small scale variability due to local erosion or deposition processes (e.g., Eisen et al., 2009; Agosta et al., 2012). In addition, Byrd, the Peninsula and Dronning Maud Land are atypical climate settings, but it is important to study these particular areas because considerable environmental changes are expected to occur there in the future. For instance, the Byrd dataset presents the particularity of low SMB values in low elevation areas (Fig. 6b).
Among these datasets, the GS dataset and the one from Law Dome are particularly appropriate for model validation, because they have a high spatial resolution and cover a long observation period. Data from Zhongshan to Dome A (CHINARE in Fig. 6) and the west side of Lambert glacier (above Mawson station) are mainly located above 1500 m a.s.l. (Fig. 6a): this reduces their usefulness for studying processes that take place at low elevations. Data from the Syowa station to Dome F traverse cover a more interesting range of elevations, but 75 % of these observations are also above 1500 m (Fig. 6a), where SMB is low (Fig. 6b).

Available SMB data from ERA-Interim reanalysis
Because reanalyses provide valuable information to study climatic features during recent decades, these data were used to study whether the SMB database allows us to reconstruct an accurate description of the main SMB distribution features in Antarctica. Reanalyses have been widely used to estimate climatic conditions and the Antarctic SMB (e.g., Monaghan et al., 2006; Genthon et al., 2005; Agosta et al., 2012), as well as to force regional circulation models (e.g., Van de Berg et al., 2006; Lenaerts et al., 2012a; Gallée et al., 2013). The reanalysis methodology is based on assimilating meteorological observations (e.g., Bromwich et al., 2011), which provides more reliable outputs than classical atmospheric models. ERA-Interim (Simmons et al., 2006) likely offers the most realistic depiction of precipitation in Antarctica (e.g., Bromwich et al., 2011), which justifies focusing on these data.
In the following section, ERA-Interim SMB values are tested against the SMB values of our database. The aim is to evaluate the accuracy of the ERA-Interim reanalysis data, and conversely, to check whether some areas are insufficiently documented in the database to allow model validation and to evaluate an accurate SMB average. We focused on the datasets for elevations between 0 and 3000 m a.s.l. (Table 5).
ERA-Interim is an improved operational analysis: efficient four-dimensional variational data assimilation (4D-Var) is performed by taking additional data into account. ERA-Interim data are produced by applying the IFS model (Cy31r2 version), running in spherical harmonic representation (T255, nominal resolution of 80 km). Calculations are performed on 60 vertical levels (hybrid pressure-sigma coordinates) from the surface to the mesosphere at 0.1 hPa or 65 km. Here, we used ERA-Interim outputs over the period 1989-2010, even though data are now available for the period 1979-1988. Data were interpolated over a 15 km Cartesian grid resulting from a stereographic projection with the standard parallel at 70° S and the central meridian at 15° W. The liquid phase (P_L and RU; see Sect. 2.1 for abbreviations) is assumed to refreeze entirely. The simulated SMB is thus the balance between precipitation (P_S and P_L) and sublimation (SU). The model used for ERA-Interim does not account for wind erosion or deposition processes (ER). Snow drift and wind processes are expected to have significant effects on SMB when wind speed is high (e.g., Gallée et al., 2013; Lenaerts et al., 2012a). These processes introduce a major uncertainty in SMB computations by ERA-Interim in low elevation areas. Hence, in our study, we did not focus on areas where SMB is controlled by snow erosion over long distances, in this case, large blue ice areas. However, these data are still available in the full database, and should be included if the atmospheric model or the studied processes include erosion.
To compare simulated and observed SMB values, we extracted grid boxes including at least one field measurement. Each field datum was then compared to the simulated value of the corresponding grid cell. We also calculated the average of all observed values included in the same model grid cell, and compared it to the SMB simulated by ERA-Interim. Observed and modeled data were compared as a function of elevation.
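The comparison procedure can be sketched as follows. This is only an illustration under stated assumptions: observations are taken to be already projected to Cartesian coordinates in km, the model field is represented as a dictionary keyed by cell index, and the names `cell_means`, `compare` and `rmse` are hypothetical.

```python
import math
from collections import defaultdict


def cell_means(points, cell_km=15.0):
    """Average observed SMB per grid cell.

    points: iterable of (x_km, y_km, smb) tuples.
    Returns a dict mapping (column, row) to the mean observed SMB in that cell.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for x, y, smb in points:
        key = (math.floor(x / cell_km), math.floor(y / cell_km))
        sums[key][0] += smb
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def compare(observed_means, model_grid):
    """Pair each cell containing observations with the simulated value.

    Returns rows of (cell, observed, simulated, simulated - observed).
    Cells without a model value are skipped.
    """
    rows = []
    for cell, obs in sorted(observed_means.items()):
        if cell in model_grid:
            sim = model_grid[cell]
            rows.append((cell, obs, sim, sim - obs))
    return rows


def rmse(rows):
    """Root mean square error of the simulated minus observed differences."""
    return math.sqrt(sum(row[3] ** 2 for row in rows) / len(rows))
```

Grouping the resulting rows by elevation range would then give the comparison as a function of elevation.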

Comparison between the subset of SMB data and ERA-Interim outputs
Averaging ERA-Interim simulated data over the grounded ice sheet leads to a value of 128 mm w.e. a⁻¹ (4.4 mm a⁻¹ in terms of sea level equivalent). This estimate is among the lowest published values (Monaghan et al., 2006), and is well below estimates by Vaughan et al. (1999) and Arthern et al. (2006). This low value is mainly due to very low accumulation modeled at high elevations (above 2000 m a.s.l.), where ERA-Interim is known to considerably underestimate the actual amount of solid precipitation, and also below 1000 m a.s.l., where ERA-Interim overestimates ablation. The areas located below 1000 m a.s.l. cover a narrow belt around Antarctica, in mountainous regions (the Antarctic Peninsula, in Palmer Land, along the Transantarctic Mountains at 160° E and in Marie Byrd Land). This elevation range is crucial for the Antarctic SMB because it concentrates most of the total accumulated SMB.
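The conversion from a mean SMB to a sea level equivalent can be checked with a short sketch. The grounded ice area of 12.3 × 10⁶ km² is the value given in the Introduction; the ocean area of 3.62 × 10⁸ km² is an assumption introduced here for illustration, and the function names are hypothetical.

```python
GROUNDED_ICE_AREA_KM2 = 12.3e6  # from the Introduction
OCEAN_AREA_KM2 = 3.62e8         # assumption: approximate global ocean area


def smb_to_gigatonnes(mean_smb_mm_we, ice_area_km2=GROUNDED_ICE_AREA_KM2):
    """Annual mass flux in Gt for a mean SMB given in mm w.e. per year.

    1 mm w.e. over 1 km2 is 1e3 m3 of water, i.e. 1e3 t or 1e-6 Gt.
    """
    return mean_smb_mm_we * ice_area_km2 * 1e-6


def smb_to_sea_level_mm(mean_smb_mm_we,
                        ice_area_km2=GROUNDED_ICE_AREA_KM2,
                        ocean_area_km2=OCEAN_AREA_KM2):
    """Sea level equivalent (mm per year) of a mean SMB in mm w.e. per year."""
    return mean_smb_mm_we * ice_area_km2 / ocean_area_km2


if __name__ == "__main__":
    print(smb_to_gigatonnes(128.0))     # mass flux in Gt per year
    print(smb_to_sea_level_mm(128.0))   # sea level equivalent in mm per year
```

Under these assumptions, 128 mm w.e. a⁻¹ corresponds to roughly 1570 Gt a⁻¹, and the sea level equivalent comes out close to the 4.4 mm a⁻¹ quoted above; the exact figure depends on the ocean area adopted.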
In grid cells containing measurements, ERA-Interim values are close to, although lower than, measurements (Fig. 7a). This shows that SMB measurements are reasonably well reproduced by ERA-Interim. Performing the same comparison with non-"A" rated data (Fig. 7b) shows a lower quality relationship between data and model, suggesting that the filtering process removed lower accuracy data. Nevertheless, for "A" rated data, in each elevation range between 200 and 1000 m a.s.l., the mean simulated SMB computed over all grid cells is significantly higher than the one computed over grid cells containing measurements (Fig. 7a: red circles versus pink squares). With the hypothesis that ERA-Interim output is close to the real world also for areas with no observations, this means that field data mainly reflect the low SMB areas and poorly constrain areas where SMB values are high, suggesting that observations do not correctly sample the SMB between 200 and 1000 m a.s.l. (as already suggested in Sect. 3.1). Above 2500 m a.s.l., this discrepancy does not hold, suggesting that the observations may be representative of the entire range of elevations over the ice cap. Nevertheless, large parts of the plateaus of the EAIS are characterized by wind-glazed areas where wind and sublimation remove the annual solid precipitation (e.g., Scambos et al., 2012), creating a hiatus in accumulation which is problematic for ice core interpretation. Existing "A" rated data are thus located only across the wind-glazed areas, leading to overestimates of the net mass accumulation (e.g., Scambos et al., 2012), which may partly justify the wet bias observed according to ERA-Interim output (Fig. 7).
The datasets selected at low elevations also provide interesting information. The ERA-Interim simulation fits observations acceptably despite significant differences (Fig. 8). In some cases (in Adelie Land (GS dataset), from Syowa station to Dome F, from Zhongshan to Dome A (CHINARE), and around Lambert glacier) ERA-Interim misses the mesoscale variability, while in other cases (Law Dome, Byrd, Peninsula) ERA-Interim is mainly too dry. A large proportion of SMB differences is due to biases in the surface elevation used by the model. In fact, temperature and all related energy fluxes directly depend on elevation. However, some of the differences are directly related to the model's inability to simulate accurate SMB values. For instance, ERA-Interim assumes albedo values that are too low at low elevations (values between 0.1 and 0.75) and calculates runoff and sublimation that are too high. Overestimation of melting by ERA-Interim has already been demonstrated (Agosta et al., 2012) and may be accounted for by considering that liquid water entirely refreezes. However, incorrect albedo values have serious consequences for the entire surface energy balance (SEB), for instance on sublimation. Finally, we observe that SMB variability is very large at the 1 km scale in coastal areas (see GS, Syowa station to Dome F, and Zhongshan to Dome A traverses, for instance: Fig. 8a, e, f). Using data points every 10 or 50 km (see Law Dome for instance: Fig. 8c) does not distinguish the regional mean from local variability. A survey of dense stake networks is clearly better in such cases. Another way to obtain a better estimate of spatial variability may be to use ground-penetrating radar (GPR) data to interpolate SMB point estimates from ice cores (e.g., Verfaillie et al., 2012).

Discussion and conclusions
In this paper, we present an up-to-date surface mass balance database for the entire Antarctic continent, including relevant information about the data (location, measurement methods, time period covered, specificity of the data, references) and recommendations for the use of data in particular regions. This database was carefully checked with a quality control. This method of selection was designed to keep only highly reliable data. The quality control led to a significant change in data distribution over Antarctica and in mean regional values; although, as already shown by Magand et al. (2007), this process removes suspicious data that could have a major impact on any kind of SMB interpolation (e.g., Magand et al., 2007; Genthon et al., 2009; Verfaillie et al., 2012; Lenaerts et al., 2012b).
Inspection of the "A" rated dataset showed that our knowledge of SMB distribution is even less than previously supposed, because for large areas data are unreliable. This is particularly true in the Antarctic Peninsula, in West Antarctica, and along the margins of the ice sheet. Large scale field campaigns in these regions should thus be a scientific priority, with particular focus on elevations between 200 and 1000 m a.s.l., because measurements are currently mainly located in low SMB areas and no measurements are available in large areas in which a significantly higher SMB is expected.
Despite these limitations, the present work provides a new and more reliable database for climate model validation. The datasets described in this paper should make a correct assessment of model quality possible in several specific areas (see Table 5). For model validation, approaches similar to those performed by Agosta et al. (2012) with the GS network should be extended to the whole of Antarctica, using any climate model and the selected datasets. In the present study, we demonstrated the interest of comparing field data with ERA-Interim outputs. On the one hand, our comparison confirmed that ERA-Interim reasonably fits observations, even though the computed SMB presents significant dry biases. On the other hand, the comparison demonstrated that observations do not correctly sample the SMB between 200 and 1000 m a.s.l., and that very few data are available for high SMB areas. New field data along the AIS margin and new traverses in unexplored areas are thus still required to validate climate models for Antarctica. To fill the knowledge gap, research should be performed in the Antarctic Peninsula, between Marie Byrd Land and the coast, and on the Ronne and Ross Ice Shelves, because these are areas where data are less reliable. Important scientific and logistic stations are located in these regions (e.g., McMurdo station, Byrd station), which are ideal opportunities to plan future traverses. Traverses may revisit routes that were already explored during the sixties and seventies, but using current techniques to offer more reliable SMB estimates. Explorations should combine GPR studies with pits and ice cores (with absolute dating techniques) to get continuous and accurate SMB data, as suggested by the ITASE program (e.g., Anschütz et al., 2009, 2011; Fujita et al., 2011; Verfaillie et al., 2012). Finally, observations should focus on areas where remotely sensed data (passive microwave) are not reliable, i.e., on steep slopes, in wind-glazed areas and where melting may occur.
The current quality-checked database is now available to assess the temporal and spatial variability of the Antarctic SMB. First, the dataset can be rescaled to obtain a temporally unbiased SMB climatology for the end of the 20th century. This temporal rescaling step may be performed against ERA data. For this task, field data from each specific period and each region will be rescaled based on the SMB difference given by ERA between this specific period and a reference period. Second, collecting available GPR data in Antarctica into a similar database is highly relevant and is now timely. This is currently in progress at NASA (by the SUMup working group). When available, the data will be adapted to the current database format and will be included in the present database. Nevertheless, getting a correct estimate of the Antarctic SMB at a regional scale cannot be done with field measurements only, and cross comparison with remote sensing data is needed. A step forward is the use of the database to apply the method of Arthern et al. (2006) based on passive microwave. The approach should allow the treatment or removal of serious biases in passive microwave data due to steep slopes, to melting at low elevations, and to erosion in wind-glazed snow areas. The use of other sources of data (e.g., altimetry) is also highly interesting here (e.g., Helsen et al., 2008; Shepherd et al., 2012), even if getting access to density is still an important limitation in this case. Finally, assessing the mean Antarctic SMB will need information given by atmospheric models at high resolution (∼ 10-20 km) to correctly account for the effects of local topography on precipitation and ablation processes (e.g., Van de Berg et al., 2006; Monaghan et al., 2006; Krinner et al., 2008; Genthon et al., 2009; Lenaerts et al., 2012b). Regional circulation models (e.g., MAR, RACMO2, PMM5) are good candidates for this task. The present database is clearly a relevant tool for model calibration.
This paper presented the most recent updated surface mass balance dataset for Antarctica. The database is freely available on the GLACIOCLIM-SAMBA website (http://www-lgge.ujf-grenoble.fr/ServiceObs/SiteWebAntarc/database.php) for any scientific use. Continuous updating of the database is planned but will require data owners to share their published data. This will also be possible on the GS website.
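The temporal rescaling step proposed above could look like the following sketch. It interprets "rescaled based on the SMB difference" as an additive shift equal to the reanalysis mean over the reference period minus the reanalysis mean over the observation period; this interpretation, the data layout (a dictionary of annual reanalysis SMB keyed by year) and the function names are assumptions made for illustration only.

```python
def period_mean(annual_smb, start_year, end_year):
    """Mean of annual SMB values over the inclusive period [start_year, end_year]."""
    values = [smb for year, smb in annual_smb.items()
              if start_year <= year <= end_year]
    if not values:
        raise ValueError("no reanalysis data in the requested period")
    return sum(values) / len(values)


def rescale_to_reference(obs_smb, obs_period, ref_period, era_annual_smb):
    """Shift an observed SMB to a reference period using reanalysis data.

    obs_smb: observed mean SMB over obs_period (mm w.e. per year).
    obs_period, ref_period: (start_year, end_year) tuples, inclusive.
    era_annual_smb: dict mapping year to the reanalysis SMB at the same location.
    Assumption: the correction is additive.
    """
    shift = (period_mean(era_annual_smb, *ref_period)
             - period_mean(era_annual_smb, *obs_period))
    return obs_smb + shift
```

A multiplicative correction (the ratio of the two reanalysis means) would be an equally plausible reading, and may behave better where SMB is close to zero.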

Fig. 2. Mean SMB computed using field data measured within each 200 m elevation range on the grounded ice sheet, (a) for the eastern Antarctic sector (longitude between 0° E and 180° E), and (b) for the western Antarctic sector (longitude between 0° W and 180° W). We first computed the average SMB for each 15 × 15 km² grid cell (values from points located in the same grid cell are averaged), and then the mean SMB every 200 m in elevation assuming that each grid cell had the same weight. Dark green squares are mean SMB computed with the full database, and light green squares are mean SMB computed with "A" rated data only. Gray and black dots are the number of grid cells within each elevation range for the "A" rated data and the complete ("full" SAMBA-LGGE) database, respectively.
a More precisely, for the 0-180° E sector of Antarctica. b More precisely, for the 0-180° W sector of Antarctica. c We first computed the average SMB for each 15 × 15 km² grid cell (values from points located in the same grid cell are averaged), and then computed the mean SMB over Antarctica assuming that each grid cell has the same weight. d We first computed the average SMB for each 15 × 15 km² grid cell, then we computed a mean SMB for each 200 m elevation range (with the same weight for each grid cell). Finally, the mean SMB for Antarctica was computed by weighting each 200 m elevation range with its area.

Fig. 3. Variation in SMB according to elevation based on reliable data. Data spanning a period of more than 70 yr are not shown. Elevations are from the Bamber et al. (2009) digital elevation model (DEM). Blue dots are the selected observations for comparison with ERA-Interim, red dots are observations presenting a difference in elevation greater than 200 m compared with the Bamber et al. (2009) DEM, gray dots are data from blue ice areas described in Sinisalo et al. (2003) and Bliss et al. (2011). Horizontal bars are the mean (orange) and 50 % occurrence (green) of blue dots for each 200 m elevation range.

Fig. 4. Main characteristics of the reliable SMB data. (a) Comparison between the distribution of elevation in the database (blue histogram, left axis) and the distribution of surface elevation of Antarctica (white histogram, right axis). Black histograms are the same as blue histograms but represented on the right axis. Elevation is deduced from the Bamber et al. (2009) DEM and is displayed for elevation ranges of 250 m each. (b) Number of observations as a function of SMB values. (c) Number of observations as a function of time coverage. (d) Variations in the number of observations over time (histogram) and in the time period used for their average since 1940 (red dots).

Fig. 5. Distribution of the difference in elevation between observed data and data from the digital elevation model of Bamber et al. (2009). The white lines represent the 200 m threshold which led to the rejection of 44 observations.

Fig. 6. Boxplot distribution of (a) elevation and (b) SMB values for each selected dataset. Red dots are the mean values. Red lines represent 50 % occurrence, the first and third quartiles are represented by the box bounds, and the minimum and maximum values by the black lines.

Fig. 7. Mean SMB over the grounded ice sheet as a function of elevation (a) for "A" rated data and (b) for non-"A" rated data. Pink squares are the mean SMB calculated by ERA-Interim for grid cells containing observations within each elevation range. Red circles are mean SMB calculated by ERA-Interim over each entire range of elevations. Green squares are mean observed SMB from grid cells containing observations within each elevation range. The gray line represents the contribution of areas with observations to the grounded ice sheet area (for each elevation range). The red line represents the contribution of the entire elevation range to the grounded ice sheet. Each elevation range is 200 m. RMSE is computed between observations and ERA-Interim for grid cells containing observations by elevation ranges.

Fig. 8. Surface elevation ("El") and variations in the SMB in specific areas and along traverses from the coast to plateaus where field data are available: (a) along the GLACIOCLIM-SAMBA observation transect in Adelie Land, (b) between Dumont d'Urville (DDU) station and Dome C (DC), (c) around Law Dome, (d) in the Byrd Station region and on Ross Ice Shelf (Byrd), (e) between Syowa (SHW) and Dome Fuji (DF), (f) along the traverse route from Zhongshan station to Dome A (CHINARE), (g) along the west side of Lambert glacier (LBw) close to Mawson station (MWS), (h) in the Antarctic Peninsula, (i) in Dronning Maud Land (DML). For each region, surface elevation values are presented in the upper panel and SMB values in the lower panel. Values calculated by ERA-Interim (thick red line) are compared with the mean of field data included in each ERA-Interim grid cell (thick green line). Point field data before averaging are represented by a thin light green line. The surface elevation of field observations is from the Bamber et al. (2009) digital elevation model (DEM). Also shown in the upper panels are the differences in surface elevation between ERA-Interim and the Bamber et al. (2009) DEM (ΔEl, black line, right axis).

www.the-cryosphere.net/7/583/2013/ The Cryosphere, 7, 583-597, 2013 V. Favier et al.: An updated and quality controlled surface mass balance dataset in
the present paper which is dedicated to direct SMB estimates.

Table 1. List of sectors where data are presented separately instead of their average over 20 × 20 km² grid cells given in V99.

Table 2. SMB datasets, and available data at each step.
a The methods deemed very reliable are rated "A", the methods deemed reliable are provisionally accepted (rated "B"), unreliable methods are rated "C". b Over one or several decades. c Applicable to single stakes and stake networks. d The natural ²¹⁰Pb SMB method is reliable only over 4 to 5 decades (∼ two half-life periods).

Table 4. Mean SMB computed from field observations for Antarctica, and for the eastern and western parts of Antarctica. Note that these SMB averages are only for areas with observations, and do not represent a mean SMB for the whole continent.

Table 5. Description of selected datasets in low elevation areas for comparison with ERA-Interim reanalysis. * Number of 15 × 15 km² grid cells containing field measurements.