A new L4 multi-sensor ice surface temperature product for the Greenland Ice Sheet

The Greenland Ice Sheet (GIS) is subject to amplified impacts of climate change and its monitoring is essential for understanding and improving scenarios of future climate conditions. Surface temperature over the GIS is an important variable as it regulates processes related to the exchange of energy and water between the surface and the atmosphere. As few key local observation sites exist, an important alternative to obtain surface temperature observations over the GIS is space-borne sensors that carry thermal infrared instruments. These offer several passes per day with a wide view and are the basis of deriving Ice Surface Temperature (IST) products. The aim of this study was to compare several satellite IST products for the GIS and develop and validate the first multisensor, gap-free (Level 4, L4) product for 2012. High resolution Level 2 (L2) IST products from the European Space Agency (ESA) Land Surface Temperature Climate Change Initiative (LST_cci) project and the Arctic & Antarctic Ice Surface Temperatures from Thermal Infrared Satellite Sensors (AASTI) dataset were assessed using observations from the PROMICE stations and IceBridge flight campaigns. The AASTI data showed overall better performance compared to LST_cci data, which in turn had superior spatial coverage and availability. Both datasets were further utilised to construct a daily, gap-free, L4 IST product using the optimal interpolation (OI) method. The resulting L4 IST product performed satisfactorily in terms of quality when compared with surface temperature observations from the PROMICE stations and IceBridge flight campaigns.

presents the results from a user case study within the ESA LST_cci project about the uptake of the first satellite multi-sensor, optimal-interpolated L4 IST fields covering the GIS during 2012.
IST data from IR satellite sensors were used from the ESA LST_cci project as well as from the Arctic and Antarctic Ice Surface Temperatures from thermal Infrared satellite sensors (AASTI) data set (Dybkjaer


PROMICE
The Programme for Monitoring of the Greenland Ice Sheet (PROMICE) data are provided by the Geological Survey of Denmark and Greenland (Ahlstrøm et al., 2008; van As and Fausto, 2011; Fausto et al., 2021). The surface temperatures are derived from up-welling longwave radiation measured by Kipp and Zonen CNR1 or CNR4 radiometers by assuming an emissivity of 0.97 (Fausto et al., 2021). Only PROMICE data from the upper ablation zone and accumulation zone were used to ensure that data are only acquired over permanently snow- or ice-covered surfaces. Figure 1 shows the geographical distribution of the eight selected PROMICE stations and their elevation, also listed in Table 2.
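As an illustration of this kind of retrieval, the sketch below inverts the Stefan-Boltzmann law for a grey body with an emissivity of 0.97. The correction for reflected down-welling radiation, and the function and argument names, are assumptions made for the example; the text above only states that up-welling longwave radiation and an emissivity of 0.97 are used.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def surface_temperature_c(lw_up, lw_down, emissivity=0.97):
    """Surface temperature (deg C) from longwave fluxes (W m^-2).

    Assumed balance (not taken from the source text):
        lw_up = emissivity * SIGMA * T^4 + (1 - emissivity) * lw_down
    """
    # Remove the reflected part of the down-welling flux
    emitted = lw_up - (1.0 - emissivity) * lw_down
    if emitted <= 0.0:
        raise ValueError("emitted longwave radiation must be positive")
    t_kelvin = (emitted / (emissivity * SIGMA)) ** 0.25
    return t_kelvin - 273.15
```

With an emissivity close to one, the reflected term is small, so the result is dominated by the measured up-welling flux.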

IceBridge
The Operation IceBridge project (Kurtz et al., 2013) conducts flight campaigns over the Arctic sea ice and the GIS, carrying various instruments, among them a thermal infrared radiometer, the KT19, which observes in a similar IR wavelength interval to AVHRR Channel 4 (9.6-11.5 µm). Surface temperatures are retrieved by measuring brightness temperatures and assuming a constant surface emissivity of 0.97. In total, IST retrievals from 27 IceBridge flights (version 2) (Studinger, 2020)  kilometre to make them more comparable to the lower resolution satellite data. It should also be noted that the IceBridge observations have not been screened for potential clouds. If clouds occur between the aircraft and the surface, the radiometer will observe the temperature of the (usually colder) clouds instead of the surface.

Level 4 OI IST
Upstream L2 observations were aggregated on a fixed grid to Level 3 (L3) and combined using a statistical methodology similar to Høyer and Karagali (2016), resulting in L4 gap-free, merged and optimal interpolated daily fields with a 0.01° latitude and 0.02° longitude resolution. Prior to the optimal interpolation (OI) an intermediate L3 super-collated (L3S) product was generated from the collation of all the L3 fields. The OI method is similar to the one from the high latitude SST DMI processing scheme (Høyer and She, 2007; Høyer and Karagali, 2016), which operates with anomalies from a first guess field.
In the current approach, a persistence-based method is applied, which uses the previous analysis field as the first guess field.
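The idea of updating a persistence-based first guess with observation anomalies can be illustrated with a scalar sketch for a single grid point. This is a simplified analogue, not the processing scheme used for the product: it ignores spatial covariances, and the error variances and the function name are hypothetical.

```python
def oi_update(first_guess, observations, background_var, obs_var):
    """Scalar optimal-interpolation analysis at one grid point (deg C).

    first_guess:    previous analysis value (persistence).
    observations:   IST values at the same point, assumed to have
                    independent errors with equal variance obs_var.
    background_var: assumed error variance of the first guess.
    """
    if not observations:
        # No data available: the analysis stays at the first guess
        return first_guess
    n = len(observations)
    # Observations are treated as anomalies relative to the first guess
    mean_anomaly = sum(obs - first_guess for obs in observations) / n
    # Averaging n independent observations reduces their error variance to obs_var / n
    gain = background_var / (background_var + obs_var / n)
    return first_guess + gain * mean_anomaly
```

When the two variances are equal and a single observation is available, the analysis lies halfway between the first guess and the observation.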
The IST observations from within 48 hours of the analysis time are aggregated and interpreted as anomalies with respect to

For the control simulation, surface energy balance outputs from the RCM were used to calculate the IST and melt potential as normal. This control simulation was initially evaluated against the L4 IST data (not shown). For the simulation with assimilation of the L4 IST, HIRHAM5 RCM forcing was initially used to calculate IST, and if this was below -2°C at a grid point for a given time-step, the L4 IST product was assimilated. Therefore, at any given time-step, the GIS IST output by the SMB model is a combination of modelled and observed L4 IST. The threshold was chosen to filter out biases at higher temperatures observed within the L4 IST product compared with PROMICE weather station data.
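The threshold rule described above can be sketched for one time-step as follows. The function name and the flat list representation of the grid are assumptions for illustration; only the -2°C threshold and the replace-below-threshold logic come from the text.

```python
def blend_ist(model_ist, l4_ist, threshold_c=-2.0):
    """Combine modelled and L4 IST (deg C) for one time-step.

    Where the modelled IST is below threshold_c, the L4 IST is used;
    elsewhere the modelled value is kept.
    """
    blended = []
    for modelled, observed in zip(model_ist, l4_ist):
        if modelled < threshold_c:
            blended.append(observed)
        else:
            blended.append(modelled)
    return blended
```

For example, `blend_ist([-10.0, -1.0], [-8.0, -3.0])` keeps the modelled value at the second point, because -1.0°C is not below the threshold.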

Inter-comparison of IST products
Examples of the different L3 satellite products generated from the L2 data-sets described in Table 1 are shown in Figure 2.
The L3 products are aggregated for January 9, 2012 into the L3S product (bottom left) while after optimal interpolation, the   and L4 IST products reaching their lowest ISTs of -35°C to -40°C. All products, including the L4 IST, represent the annual cycle well, with warming that started in early March and peaked in July, followed by cooling and a winter minimum at the end of December.
The mean monthly IST and its standard deviation for all L3 products and the L4 IST are shown in Figure 4. Although AATSR was only available until the beginning of April, a monthly value was calculated nonetheless. From January to March, mean monthly ISTs from MODIS (magenta) and AATSR (green) were similar and significantly lower than AASTI (blue), the L3S (cyan) and the L4 IST product (red), accompanied by a higher standard deviation. These differences decreased from April to June, yet MODIS consistently showed lower values and higher variability compared to AASTI and the derived L3S and L4 products, which consistently agreed throughout the year. All products showed a peak mean monthly value in July, while June was warmer than August. From January to March, mean monthly temperatures were comparable to November-December means and standard deviations were of the same order.

AATSR and the derived L3S and L4 IST product, when available. Such differences and variabilities are also reflected in the mean seasonal and annual estimates, shown in Table 3

standard operational cloud mask. In the case of AATSR, in addition to the cold bias, there was also the sampling issue (see Figure 2) and the limited availability of data for the reference year 2012 (contact with ENVISAT was lost in April). Therefore, AATSR was not used to generate the final L4 IST product. With respect to the MODIS product, the pixel-to-pixel variability

with values from the L4 IST product, extracted for the grid points corresponding to the flight path (blue line). The mean bias for that campaign was 0.40°C±4.30°C. This campaign covered various zones of the GIS, and the variability of the IST was pronounced, as revealed by the 1-km averaged measurements. Beyond the warm bias during the first 800 km of the flight, the L4 IST captured the variability of the ISTs over the GIS remarkably well.
As the IceBridge radiometer measures the radiometric surface temperature from an aircraft at an approximate height of 450 meters above ground level without any cloud screening, the presence of clouds may explain the discrepancies for the first 800 km of the flight where the IceBridge data are significantly colder than the L4 IST.
In order to assess the impact of the high resolution footprint of the IceBridge measurements on the validation statistics, i.e. 1-km averages over 5-km averages of the L3 and L4 IST products, the standard deviations of the raw measurements averaged over different spatio-temporal windows minus the raw measurements were computed (not shown). This is an assessment of biases introduced from comparing flight data of very high resolution, i.e. resolving small-scale variability, against space-borne sensors which, although referenced to grids of between 1 km and 5 km, are known to resolve scales lower than their reference grid.
Using only IceBridge campaign data, averaging for different spatial windows, i.e. 1 km, 5 km and 25 km, and subtracting raw measurements, the standard deviation values for each campaign were estimated (not shown). The largest component of the standard deviation was introduced when processing the raw data to 1-km averages, on the order of 2°C. Depending on the campaign, an additional 0.1°C to 0.6°C of the standard deviation was attributed to the averaging from 1 km to 5 km. When averaging from 1 km to 25 km, differences in the standard deviation reached 0.9°C, while the mean difference in the standard deviation over all campaigns was 0.22°C from 1 km to 5 km averaging and 0.4°C from 1 km to 25 km. Thus, between 0.2°C and 0.4°C of the standard deviation in all comparisons against IceBridge campaigns (see Figure 7) can be attributed to the different spatial scales represented in the IceBridge data compared to satellite observations.
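The averaging test described above can be sketched as follows. This is a minimal illustration that assumes evenly spaced along-track samples, so that a spatial window corresponds to a fixed number of samples; the function names and the use of non-overlapping blocks and a population standard deviation are assumptions, since the source does not specify them.

```python
from statistics import mean, pstdev


def window_average(values, window):
    """Replace each sample by the mean of the non-overlapping block it falls in."""
    averaged = []
    for start in range(0, len(values), window):
        block = values[start:start + window]
        averaged.extend([mean(block)] * len(block))
    return averaged


def averaging_std(values, window):
    """Standard deviation of (block average minus raw measurement)."""
    averaged = window_average(values, window)
    differences = [a - r for a, r in zip(averaged, values)]
    return pstdev(differences)
```

Calling `averaging_std` with window lengths that correspond to 1 km, 5 km and 25 km and differencing the results would give the additional spread attributable to each coarser scale.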

The validation of the AASTI, MODIS, L3S and L4 IST showed that for both PROMICE and IceBridge, AASTI had an overall better performance, with lower biases and standard deviations. MODIS data from the LST_cci project, used in this study, had a significant cold bias, associated with the less advanced cloud mask algorithm applied to the v1.0 data, which influenced the performance of the derived L4 IST product. Nonetheless, the better spatial coverage and higher resolution of MODIS rendered it crucial for the generation of the L4 IST product and thus its inclusion was justified. A new, improved version of the MODIS L2P dataset, to be released by the LST_cci, is expected to result in better performance of the L4 IST OI product.

Analysis of L4 IST
Monthly averages from the L4 IST product for 2012, shown in Figure  and up to 0°C, in agreement with Hall et al. (2008).
Melt days were defined as days for which the IST was −1°C or higher, following Hall et al. (2013). They were estimated for the period May 1st to August 31st, at each grid point over the GIS. Figure 11 (right panel) shows the number of melt days from the L4 IST product, where white areas experienced zero melt days and coloured areas indicate at least one melt day. Melting was observed over large parts of the GIS for more than one day, while significant parts of the middle and lower zones experienced more than 30 days of melt conditions.
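The melt-day count at a single grid point can be sketched as below, using the −1°C threshold and the May 1st to August 31st window stated above. The function name and the dictionary-of-dates input format are assumptions for illustration.

```python
from datetime import date


def count_melt_days(daily_ist, year=2012, threshold_c=-1.0):
    """Count days from 1 May to 31 August with IST >= threshold_c.

    daily_ist: dict mapping datetime.date to the daily IST (deg C)
    at one grid point.
    """
    start, end = date(year, 5, 1), date(year, 8, 31)
    return sum(
        1
        for day, ist in daily_ist.items()
        if start <= day <= end and ist >= threshold_c
    )
```

Applying the function to every grid point of the daily fields would give a melt-day map of the kind shown in Figure 11.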

IST Assimilation experiment
Mean monthly IST values for May were estimated from the control (left) and updated simulations including assimilation of the L4 IST product (middle), shown in Figure 12, along with the estimated anomaly (right). The month of May was selected since it is the month when the onset of melting commonly occurs across much of western and southern Greenland; this is a challenging period for SMB models to simulate and the use of IST observations can potentially have a positive impact.
The updated simulation using assimilation of L4 IST was generally warmer over a large part of the GIS, especially the east and north-east regions. The difference between the two mean May estimates, computed by subtracting the mean May estimates of the control simulation from those of the simulation using assimilation of the L4 IST (right panel in Figure 12), highlighted the areas for which the control simulation was consistently colder by 2°C or more, even up to 5°C, extending to the north, south and east parts of the GIS. In contrast, the control simulation showed warmer temperatures on the west and central part of the GIS, yet differences in this case did not exceed 1°C and only for small areas were they up to 3°C.
Comparing the simulations against the PROMICE stations for May 2012 (Figure 13) showed that mean daily temperatures from both the control (top) and updated simulation, using assimilation of the L4 IST product (bottom), were colder compared to PROMICE station measurements. The bias was lower for the updated simulation (−1.14°C) compared to the control simulation (−2.16°C), yet the standard deviation was higher, 3.55°C for the updated simulation compared to 2.9°C for the control. Only at the TAS_U station did both simulations indicate warmer surface temperatures, but this needs to be assessed cautiously given the very few observations available in May 2012 at this station (not shown).
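Validation statistics of this kind can be computed as in the sketch below, where the bias is the mean of simulated minus observed temperatures, so that a negative value means the simulation is colder. The function name and the use of the sample standard deviation are assumptions; the source does not state which estimator was used.

```python
from statistics import mean, stdev


def bias_and_std(simulated, observed):
    """Mean and sample standard deviation of (simulated - observed), in deg C."""
    if len(simulated) != len(observed):
        raise ValueError("inputs must have the same length")
    differences = [s - o for s, o in zip(simulated, observed)]
    return mean(differences), stdev(differences)
```

At least two matched pairs are needed, since the sample standard deviation is undefined for a single difference.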
Comparison with the IceBridge flight campaign measurements to assess the control (top) and updated simulation (bottom) with assimilation of the L4 IST product (Figure 14) showed a marked improvement with the assimilation of L4 IST data, with

IST product, providing more confidence in the existing SMB internal parameterisations, for the least challenging periods.

(2013) manually removed daily MODIS IST fields when the cloud mask erroneously identified the ice surface as cloud free, particularly occurring during the summer. Their 2012 summer season mean IST was −6.38±3.98°C, which is significantly higher than the −11.5±5.2°C found for MODIS in the present study, and closer to the −5.5±4.5°C for the L4 IST product.

Validation of the upstream input datasets and the derived L4 IST indicated larger differences between the satellite products and PROMICE measurements during winter, which can likely be associated with the higher diurnal variability in IST during winter (Nielsen-Englyst et al., 2019) and the fact that cloud masking algorithms can suffer from reduced skill in distinguishing cloudy from clear-sky pixels over ice-covered surfaces during the polar winter season (Dybkjaer et al., 2012).
The validation of satellite IST products against in situ observations suffers from the lack of available fiducial reference observations for the IST over the GIS. As discussed in Høyer et al. (2017), the use of point-wise in situ observations can introduce a sampling uncertainty ranging from 0.4°C to 5°C, depending on the type of in situ observations, when compared to satellite observations with a 1 km footprint. Such contributions are not related to the performance of the satellite products but arise from the spatial and temporal sampling uncertainty and the type of in situ observations. For the types of in situ observations used in this validation, it is expected that the PROMICE broadband IST observations will have higher uncertainties compared to the IceBridge data. This is a consequence of the PROMICE broadband radiometer observations versus the narrow-band IceBridge observations, where the snow and ice surface emissivity effects, due to surface properties or incidence angles, vary much more for the broadband observations compared to the narrow-band radiometer. Also, the Spectral Response Functions (SRFs) of the IceBridge KT19 instrument are very similar to the actual IR satellite SRFs. Therefore, the results from the inter-comparison should not be viewed as an estimate of the uncertainty of the satellite products.

Figure 2), although for that study non-ice-covered land surface temperatures were also considered, thus above-zero values were included.