Visual Interpretation of Synthetic Aperture Radar Sea Ice Imagery by Expert and Novice Analysts: An Eye Tracking Study
Abstract. We demonstrate the use of eye tracking methodology as a non-invasive way to identify elements behind uncertainties typically introduced during the process of sea ice charting using satellite synthetic aperture radar (SAR) imagery. To our knowledge, this is the first time eye tracking has been used to study the interpretation of satellite SAR images over sea ice. We describe differences and similarities between expert and novice analysts while visually interpreting a set of SAR sea ice images.
In ice charting, SAR imagery serves as the base layer for mapping sea ice conditions. Linking the backscatter signatures in the SAR imagery to the actual sea ice parameters is a complex task which requires highly trained experts. Mapping of sea ice types and parameters in the SAR imagery is therefore subject to an analyst's performance, which may lead to inconsistencies between ice charts. By measuring the fixation duration over different sea ice types we can identify the features in a SAR image that require more cognitive effort to classify, and thus are more prone to misclassification. Ambiguities in classification were found especially for regions less restrictive for navigation, consisting of mixed sea ice properties and uneven thicknesses. We also show that the experts are able to correctly map large sea ice covered areas only by looking at the SAR images. Based on the eye movement data, ice categories with most of the surface covered by ice, i.e. fast ice and very close ice in the ice charts, were easier to classify than areas with mixed ice thicknesses such as open ice or very open ice.
Withdrawal notice
This preprint has been withdrawn.
Preprint (8111 KB)
Interactive discussion
Status: closed
RC1: 'Comment on tc-2022-8', Anonymous Referee #1, 28 Mar 2022
After going through the first part of this paper I realized that there were already some fundamental flaws in the experiment. There was a lot of very good and detailed information regarding the contrast between expert and non-expert analysts, as well as differences within these groups. However, the overall objective of this paper and the relevance of these comparisons was not clear at the beginning, nor did it become clear over the course of the manuscript.
Regarding the experiment design, in this study (and in Karvonen et al. 2015) the analysts are only shown a series of satellite images in IrfanView from different snapshots during the winter season. The analysts were not given any of the ancillary information that typically goes into an analysis, such as knowledge of prior sea ice conditions (large-scale and regional), alternative ground-truth observations (if used on these days), and prevailing weather conditions. The continuity that analysts have in following environmental conditions is the core of the local knowledge that ice services rely on for their routine chart production. The use of unusual hardware and software puts the expert and non-expert analysts in an unnatural situation, preventing them from producing an ice chart the way it would normally be done. The expert analysts have to navigate a new setup that changes the way they habitually put together information for an ice chart analysis. Expert analysts generally have systems of their own that allow them to efficiently collate all the information mentally as they generate an analysis. For example, some will spend more time at the beginning reviewing information from the previous day(s), while others will focus on setting up the GIS layout. How this information is transferred from their own understanding and integrated into the analysis can vary greatly between analysts. Additionally, using a graphics viewer (IrfanView) rather than the standard GIS introduces a significant error into the design of these comparisons: the GIS provides the familiar tools expert analysts are used to, including the ability to access multiple sources of information, overlay them on one another, and work at the resolution they are accustomed to. If this experiment were applied to analysts from other ice services we would expect to see a much greater spread. Fig. 7 shows these clear differences between the unnatural analysis using solely the image and the actual ice chart, which has much more information content.
Regarding the issue of using this method to measure uncertainties of human perception: rather than trying to pick out a signal from noisy and subjective data, such as human bias, intercomparison studies between sea ice features in different products are a more accurate assessment of subjectivity, and such experiments are easier to carry out, especially when there is such a critical need to evaluate the skill sets of ice analysts from various international agencies. Studies like Karvonen et al. (2015) and Cheng et al. (2020) do these types of comparisons on heavily processed datasets, where drawing the delineations between areas by the analyst is not required. Given that reference data is useful for quantifying the ability of automation to capture the variability in sea ice, an independent variable against which to compare a real vs. controlled situation is necessary when it comes to human subjectivity. There does not seem to be anything in the paper that supports the assumption that the areas where one tends to focus should correlate with lower confidence, and thus higher uncertainty. Please refer to my previous comment above describing how analysts use information. The conclusion that analysts use more cognitive effort in areas where there is more uncertainty is also not convincing given the spread of expertise in the small sample size of this preliminary experiment. An expert analyst from one ice service would not be expected to be as proficient in understanding ice conditions in areas covered by other ice services, due to different environmental conditions such as oceanographic and weather conditions, nor would they have expertise on the regional variability of the ice.
Despite the small sample size and the relatively new perspective that this proposed methodology could add to understanding uncertainties introduced by ice analysts, the initial outcomes from this case study do not add any value to the work already being done to resolve the issues of subjectivity in ice charts. Though the authors state in the conclusions (5.1) that this was a proof of concept and recognize that a larger sample is needed, the current experiment design is not a reasonable method and complicates the evaluation process further, because more variables need to be taken into account regarding the expertise of the user and the amount of information available to them. This method would be especially challenging during the melt and summer seasons, where the spread is going to vary significantly due to the geophysical limitations of the satellite sensors. Therefore, the continuity of the analyst needs to be taken into account, similar to weather forecasters, and the amount of time it takes for them to understand the situation should not be as significant a factor as how close the analysis is to actual environmental conditions.
Last, this method is not easily feasible, economically or time-wise, for use with ice analysts. Though the cost of the eye-tracking software is a factor, the usefulness is more related to the amount of time ice analysts are able to spare outside of operations to provide feedback for these types of intercomparison studies. This approach is far more cumbersome to implement, and more open to further interpretation, than developing a more scientific metric-based evaluation to analyze uncertainties and subjectivity in ice charts.
The current method does not support the outcome that “the long fixation duration are connected with larger uncertainties in the final ice charts” stated on P27 L5, as there are a number of other factors which can affect the analysts' decision-making. It is important that these types of studies are being developed so we can understand the human bias in ice charts, and it is great to see these new and innovative approaches. However, the experiment in this methodology needs to be 1) redesigned to allow the analysts to include the additional sources of information that they regularly require for routine ice analysis, and 2) conducted in their normal working environment using the common systems that they are familiar with. This will allow them to use all necessary sources of information without compromising the functionality or spatial resolution they are familiar with, and will allow for more appropriate assessments of the subjective differences between expert and non-expert analysts.
Reference: Cheng, A., Casati, B., Tivy, A., Zagon, T., Lemieux, J. F., & Tremblay, L. B. (2020). Accuracy and inter-analyst agreement of visually estimated sea ice concentrations in Canadian Ice Service ice charts using single-polarization RADARSAT-2. The Cryosphere, 14(4), 1289-1310.
The following are specific comments from the first part of the paper:
P2 L9: use of term inconsistencies
P2 L10 Replace “miss-classification” with “misclassification”.
P2 L10-12: Wouldn't areas that require more cognitive effort be prone to less misclassification? The following sentence then states that areas less restrictive to navigation are more flawed. Ice analysts spend more time on areas where high traffic is known to be, including areas that are more restrictive, as a safety precaution. If areas are less restrictive, ideally they would require less cognitive effort. These sentences contradict one another. Additionally, the combination of the sea ice regime and the level of regulation in a given area has significant implications for how analysts focus their attention to detail in a particular area. Sea ice operations in the Baltic and the Arctic are often confused, so this should specify that this paper is focused on the Baltic.
P2 L12-13: What is not being highlighted is that experts are able to map large sea ice covered areas because they have continuity in observing how the ice is changing on a daily or weekly basis. This is very different from someone who understands how to interpret sea ice in SAR imagery and may be looking at it for the first time, without having knowledge of environmental conditions in the area. This statement is incredibly misleading.
P2 L15: What is the purpose of this paper? To use eye-tracking as a metric to calculate uncertainty? If so, this should be stated clearly.
P2 L14: Confusion of terminology: “open ice” and “very open ice” refer to concentration and not to whether the ice thicknesses are mixed (see the concentration mapping sketched after these specific comments).
P2 L16: What is meant by "large areas?" Does this mean synoptic? If so, up-to-date information is required at meter-scale resolution, especially for tactical navigation. For route planning, large-scale information is more useful. Depending on the area, navigators require both, so "typically over large areas" oversimplifies the needs of maritime users and their data requirements.
P2 L24: Need to include the challenges that snow cover and melt pose for interpreting surface roughness, because this is the key challenge in sea ice monitoring and one of the main reasons ice charts continue to be fully manual, as opposed to semi-automated.
P2 L26: Sentence needs to be revised.
P2 L31: Is there a metric used in this comparison?
P4 L7-8: MANICE gives only a brief outline of ice charting practices, specific to the Canadian Ice Service, and describes more the type of information content to be found in ice charts.
P4 Table 1: New ice and level ice categories are typically not used in sea ice concentration analysis.
P4 L16-18: Check the Zakhvatkina et al. (2019) reference: is it an overview, or more just what AARI have been doing?
P5 L1: Omit "Even"
P5 L10: Omit "for long"
P6 L1: Revise, awkward. Suggestion: "The FIS ice analysts have experience with analysing SAR images for drawing sea ice charts since....."
P6 L5: This does not need to be a separate sentence and Table 3 can just be referenced at the end of sentence from P6 L4.
P6 L7: Does this refer to Table 3 or Figure 3?
P6 L9-10: Specify the original resolution of RADARSAT-2 ScanSAR Wide. Depending on the processing it can be 100 or 50 m.
P7 L18: “an external monitor with a 22" diagonal size, similar to the ones used in the operational ice charting”. FMI typically uses a Wacom digitizing screen so that the analyst is looking directly at, and drawing on, the image being processed. Was this setup changed for this experiment?
P7 L27-28 “the SAR images were opened and viewed with an image viewing program (Irfan View)” This is again different from the ArcGIS software used by FIS ice analysts.
Pg10 L10: Replace "fore" with "for a"
P10 L28: Who does "he" refer to, E1 or E2? Pronouns should probably be neutral throughout the paper to maintain neutrality regarding the subjects.
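For reference, these are the WMO concentration classes behind the terminology flagged in the P2 L14 comment above, sketched as a small Python lookup. The ranges follow standard WMO nomenclature; the snippet itself is illustrative and not taken from the paper.

```python
# WMO sea ice concentration classes (tenths of sea surface covered) that the
# ice chart terms refer to. Fast ice is shore-attached and effectively 10/10.
# Ranges per standard WMO nomenclature; listed here for illustration only.
CONCENTRATION_CLASSES = {
    "very open ice": "1/10 - 3/10",
    "open ice": "4/10 - 6/10",
    "close ice": "7/10 - 8/10",
    "very close ice": "9/10 - <10/10",
    "consolidated ice": "10/10",
}

for term, tenths in CONCENTRATION_CLASSES.items():
    print(f"{term}: {tenths}")
```

The point of the comment stands: these classes say nothing about thickness, only about how much of the surface is ice-covered.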
Citation: https://doi.org/10.5194/tc-2022-8-RC1
AC1: 'Reply on RC1', Alexandru Gegiuc, 22 May 2022
RC2: 'Comment on tc-2022-8', Anonymous Referee #2, 19 Apr 2022
Review of: Visual Interpretation of Synthetic Aperture Radar Sea Ice Imagery by Expert and Novice Analysts: An Eye Tracking Study. Alexandru Gegiuc et al.
The paper presents the results of an experiment in the use of eye tracking technology to “identify elements behind uncertainties typically introduced during the process of sea ice charting using satellite synthetic aperture radar (SAR) imagery” by comparing the efforts of experts to novices.
Unfortunately, almost all of the paper's insights into the problem could have been written down at the outset. The paper demonstrates the obvious: that SAR images of sea ice are complex and difficult to interpret, particularly in areas that contain the signatures of several different phenomena, and that even expert analysts can take different approaches and produce inconsistent results. In fact the novice analysis is almost completely superfluous; you could easily delete it from the paper and focus on the expert analysis (which you do for pages 13 to 22 anyway), which is the basis for almost all the conclusions (except that novices are bad at interpreting SAR sea ice imagery).
The expert analysis is used to conclude that “eye movement data in further studies to deepen this kind of knowledge and to understand the uncertainties introduced” [in ice charting]. And while that seems to be a reasonable and noteworthy result, there seems to be no path by which these results can be effectively translated into practice (“link between the observer and the automated method has not yet been established”). Analysts looking at imagery is slow, inefficient and (as the authors report) inconsistent, and is not a viable way to process the ever increasing amount of available SAR sea ice data. It would seem that establishing this link is one of the most important things to do, rather than comparing expert and novice with the obvious conclusion.
It was frustrating to read through 26 pages, with lots of detail on pages 11 to 24, only to have the authors tell me on page 26 that the “main findings are qualitative in nature” (so why all the detail?) and that “more SAR data and more ice analysts are needed for a comprehensive study”. Really, it would have been nice to know all that up front (I would have skipped the details).
The paper in fact immediately raised a red flag when I discovered it relied on imagery from 2010 to 2012, i.e. imagery 10 to 12 years old, as its primary focus. The study area in the Baltic Sea has approximately 2000 passes from the Sentinel-1 spacecraft between 2014 and 2022. It is particularly odd to focus on old imagery given the references on page 3: “The basis for the Baltic Sea ice charting at FIS is daily SAR mosaic…. The mosaic is updated once per day, typically in the morning, to include most recent available SAR”. So why was RS2 and not S1 used for the study?
Please go back and decide what paper you want to write: Novice vs. Expert SAR Ice Analysis (nothing to see here / expected results) or What Can the Eye Tracking of Experts Tell Us about How to Improve Sea Ice Charting (potentially really useful). In particular, please tie it to a potential way to quantify error in charting or improve ice classification. But certainly don't spend pages on details and then tell me at the end that they don't matter (and that you need a bigger study to get any useful answers).
Specific Issues:
Abstract: “We also show that the experts are able to correctly map large sea ice covered areas only by looking at the SAR images”. I am not quite sure how they reached this conclusion given that the Baltic Sea is a rather small, mostly enclosed marginal sea.
Section 2.1: “We used five RADARSAT-2 (RS-2) ScanSAR Wide images covering different regions of Baltic sea across three different winter seasons”. Why was it important to use three seasons, with all imagery from February? The sea ice conditions might or might not be similar (or was that the point?).
“In FIS, the original SAR images are typically reduced in size for easier manipulation and saving disk space. This reduces the amount of detail available for analysis. The 100 m resolution of the SAR imagery used by the FIS analysts is lower than the original resolution. Here we used the same down-scaled resolution for the RS-2 SAR images”. This is one of those paragraphs that makes me wonder if this paper has been sitting on a shelf for a decade and someone just dusted it off. Are the authors really having disk space limitation issues? A ScanSAR tiff is about 300 MB. They never specify the original resolution (it is 50 m for ScanSAR, meaning you saved at most a factor of 4 by moving to 100 m). What other preparation of the imagery was undertaken? Were the images calibrated to produce Normalized Radar Cross Section so that the SAR signatures across years can be properly inter-compared, or was it just the digital number?
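For context on the calibration question, here is a minimal sketch of the DN-to-sigma-nought conversion the reviewer is asking about, assuming the standard RADARSAT-2 lookup-table form sigma0 = (DN^2 + B) / A; the gain and offset values below are placeholders, not taken from the study.

```python
import numpy as np

def dn_to_sigma0_db(dn, gains, offset=0.0):
    """Convert digital numbers to calibrated sigma-nought in dB.

    Assumes the RADASAT-2 product-LUT form sigma0 = (DN^2 + B) / A, where
    A is the per-range-sample gain (from the product's sigma LUT) and B is
    a constant offset. Placeholder values; read the product LUT in practice.
    """
    dn = np.asarray(dn, dtype=np.float64)
    sigma0_linear = (dn ** 2 + offset) / gains  # gains broadcast along range
    return 10.0 * np.log10(np.clip(sigma0_linear, 1e-10, None))

# Hypothetical example: a tiny 3x4 chip of DNs with per-column LUT gains.
dn_chip = np.array([[120, 340, 80, 510],
                    [200, 310, 95, 470],
                    [150, 290, 88, 430]])
range_gains = np.array([8.1e6, 8.2e6, 8.3e6, 8.4e6])  # placeholder gains
print(dn_to_sigma0_db(dn_chip, range_gains))
```

Without some such calibration step, gray levels from different acquisitions are not directly comparable, which is the reviewer's point about inter-comparing signatures across years.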
Section 2.2 “Two novices (N1 and N2) with little or no familiarity with classification of SAR sea ice imagery participated to the study” and Section 3.1 “We instructed the participants to look at the selected images and interpret the content verbally….When looking at the SAR images, the participants had the task to describe their content freely by identifying sea ice types and features and classify them as they would in a typical ice charting routine”
This would appear to be contradictory: the authors expected the novices to verbally describe sea ice types and features when they had “little or no familiarity with classification of SAR sea ice”.
Section 3.1: “while the SAR images were opened and viewed with an image viewing program (Irfan View) which allowed users to freely change the scale or pan the viewed images.” Why this program? Were the participants familiar with it? Was the SAR imagery presented in isolation, i.e. was there a map reference for location? Was there a reference scale for NRCS gray-scale values, or a map scale to indicate the size of the image and its features?
Figure 3: Why are the land areas presented as both white and black? Is this display representative of how the images were displayed to the participants, without scales or map references?
Section 3.3 “We divided the gaze data into segments that correspond to the scanning phase and the analysis phase” Is this a standard practice when performing eye tracking work?
Section 3.3: “We computed the average dwell time, fixation duration mean, standard deviation, and fixation density (number of fixations per ice area).” Are these standard reporting metrics when performing eye tracking work? If so, what kinds of information do these metrics convey?
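For readers unfamiliar with these metrics, here is a minimal sketch of how they are typically computed from fixation records grouped by area of interest (AOI); the Fixation structure and field names are assumptions for illustration, not the paper's actual data format.

```python
from dataclasses import dataclass
from statistics import mean, stdev

@dataclass
class Fixation:
    aoi: str            # area of interest the fixation fell in (e.g. an ice area)
    duration_ms: float  # how long the gaze stayed on that point

def aoi_metrics(fixations, aoi, aoi_area_km2):
    """Dwell time, fixation duration mean/std, and fixation density for one AOI."""
    durations = [f.duration_ms for f in fixations if f.aoi == aoi]
    return {
        "dwell_time_ms": sum(durations),      # total time spent in the AOI
        "fixation_mean_ms": mean(durations),  # often read as processing effort
        "fixation_std_ms": stdev(durations) if len(durations) > 1 else 0.0,
        "fixation_density": len(durations) / aoi_area_km2,  # fixations per unit area
    }

# Hypothetical example with three fixations over two ice areas.
fx = [Fixation("fast_ice", 220), Fixation("fast_ice", 310), Fixation("open_ice", 540)]
print(aoi_metrics(fx, "fast_ice", aoi_area_km2=125.0))
```

Dwell time and mean fixation duration are commonly interpreted as proxies for cognitive processing effort, while fixation density indicates how thoroughly an area was scanned.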
Section 4: The title is Visual Interpretation of Synthetic Aperture Radar Sea Ice Imagery by Expert and Novice Analysts, yet Section 4 is devoted to the experts and there appears to be no corresponding section for the novices. As stated above, the authors could delete the novice analysis entirely and not lose much.
Section 4.1: “The first difference between experts (E) and novices (N) was noticed right away, based on their reactions when they were shown the first SAR image.” … and the first difference is associated with what? They never explicitly call out a second (subsequent) difference.
Section 4.1: “Even if novices recognized fast (in the first five seconds) that the image shown is a SAR image”. Even if?! Did either novice need more than five seconds to determine they were looking at a SAR image, given that the other images were easy-to-recognize natural images (a human face, a flower, a fish, a cat and a bird) displayed on an entirely different system (Tobii Studio vs. IrfanView)? Seriously.
Section 4.2.2: “This is an interesting result, showing that the experts need only few seconds (5-14) of fixation time to be able to classify an ice area. Also it underlines a difference in style of analysis. E2 spent in average about five times longer analyzing an image and about three times longer analyzing an ice area than E1. These differences could be explained by the inequality between the two experts in terms of years of experience and of training.” I am not sure it is really all that interesting. Is there an alternative explanation? I do not believe that the years of experience (10 vs. 25) can account for that amount of difference. After 10 years of experience a good analyst has seen pretty much everything. More likely this difference is explained by personal styles of analysis and not expertise.
Section 5: Limitations and Future work “The aim of this study was to act as a proof of concept study… The sample number in our study is low, and thus, the main findings are qualitative in nature..” All of this should have been stated at the beginning of the paper.
Citation: https://doi.org/10.5194/tc-2022-8-RC2
AC2: 'Reply on RC2', Alexandru Gegiuc, 22 May 2022