There has been little change in the proportion of medical decision errors in radiology over the last 60 years, despite substantial advances in technology. The field has not succeeded in capturing or understanding the fundamental properties of visual search and the allocation of visual attention of the expert radiologist, nor in translating the essential search skills into training programs. We therefore welcome this work, in the hope that a new bridge will be developed that will connect visual science with this radiological challenge. As Hulleman & Olivers (H&O) state, the fields of medical imaging visual search “have been underserved by item-based models (or any form of overarching theory of search…)” (sect. 6.6, para. 7). What constitutes an “item” in the medical image is not at all obvious. H&O suggest that the focus on the individual search items should now give way to a greater emphasis on the properties of the functional visual field (FVF).
There is some evidence from our own work that this approach is relevant to visual search with medical images. In eye-tracking experiments on groups with differing levels of expertise, we found that experts typically make fewer fixations than novice observers. This holds for a relatively straightforward task such as fracture detection in bones (Donovan et al. 2005) and also for more complicated tasks, such as chest radiographs with many potential “items” or structures that resemble pathology. We have demonstrated distinct differences between radiologists (experts), radiographers (pre- and post-training in chest radiograph interpretation) and novice observers when searching for lung nodules in chest radiographs. Experts find many more lung nodules while generating fewer fixations and larger saccadic amplitudes (see Manning et al. 2006). This supports the idea that the FVF is modifiable and does change according to the level of expertise. The work also sheds some light on the timescale of this learning or plasticity. After six months of training, the number of fixations made by the radiographers had decreased compared with their pretraining levels but had not reached that of the expert radiologists (see Table 1). Importantly, as well as making fewer fixations, the experts distributed their fixations more uniformly across all regions of the chest radiograph, suggesting that once the FVF has adapted to the task as a result of training, it is applied consistently across the medical image.
Table 1. Mean number of fixations per zone (n = 27 X-ray films)
Data from Manning et al. (2006).
More direct evidence of modifications to the FVF could be obtained with gaze-contingent display paradigms, which isolate the expertise-dependent changes in visual search from the benefits (and costs) of initially processing the entire scene (Litchfield & Donovan 2016). The link between fixations and the speed of RTs points to one of the hallmarks of expertise: that experts are able to find targets faster and with fewer fixations than novices (Reingold & Sheridan 2011). Another hallmark of expertise, which is best confirmed using gaze-contingent paradigms, is that the perceptual span increases as a function of expertise (Charness et al. 2001; Kundel et al. 1984; Rayner 2009). Simple RT slopes have not helped us to understand why so many cancers are missed in medical imaging; therefore, we appreciate the central role of the FVF in this conceptual model. Unpacking the dynamic nature of the FVF as a function of task and expertise may yield greater insight into this process.
However, we see an area of concern: The model replicates the conventional finding that target-present decisions are made more quickly than target-absent decisions (in medical images, “target-absent” would be equivalent to the true negative images, i.e., images where no cancer nodules are present). H&O state “…all models of visual search, including the framework presented here, seem much better at describing target-present trials than target-absent trials” (sect. 6.6, para. 6). In our study of chest X-rays where some films showed cancerous nodules and some did not, the target-absent decisions (true negatives) were faster than the true positive decisions (see Manning et al. 2005). Interestingly, this applied to both the experts (radiologists) and the novices. Our concern therefore goes beyond the lack of an explicit stop search signal. There appears to be a fundamental reversal in the normal pattern of target-absent versus target-present decisions when visual search is conducted with a chest X-ray.
Recently, Litchfield & Donovan (2016) used a gaze-contingent preview to explore the effects of a preview window in the domain of a naturalistic scene versus a medical image for radiologists and novices. The work found a clear dissociation between the two domains, with a strong preview benefit on visual search performance for naturalistic scenes, but no benefit with medical images for either group. Thus, our earlier and more recent work urges caution in extrapolating across the different search domains of feature search tasks, naturalistic scenes and medical images. This suggests that the bridge the authors are seeking to construct will be more complex than they envisaged.