By taking fixations, not individual items, as the central unit, Hulleman & Olivers (H&O) put forward a promising, unified account of both eye movements and manual reaction times (RTs) in visual search. However, their conceptual framework makes two oversimplified assumptions: (1) that the size of the functional viewing field (FVF) depends solely on the visual discriminability of the search elements; and (2) that FVF processing time is constant (i.e., a constant fixation duration of 250 ms), ignoring any dynamic interactions between the two parameters. Although the assumption of constant fixation durations makes the framework easily comparable with traditional, item-based selection models, it limits the explanatory potential of H&O's account, as we will outline in this commentary.
It is generally accepted that “fixate” and “move” oculomotor activities are governed by parallel “when” and “where” commands generated across the entire visual-perceptual hierarchy (Findlay & Walker 1999). Concerning top-down influences, fixation durations are influenced by task difficulty (Hooge & Erkelens 1998; Moffitt 1980; Pomplun et al. 2013), memory about spatial context (van Asselen et al. 2011; Zang et al. 2015), visual search strategy (Geyer et al. 2007), and multisensory experience (Zou et al. 2012). For example, Geyer et al. (2007) compared fixation durations between static and dynamic search displays with identical target-distractor discriminability, except that search items were randomly reshuffled every 117 ms in the latter condition. Mean fixation duration, as well as the latency of the first saccade, was increased by some 100–150 ms for the dynamic compared to the static condition, although “standard” measures of search efficiency (slope of the search function) were comparable between the two types of display. These findings clearly suggest that fixational dwell times are not solely under the control of the current sensory environment, or in H&O's terms, the perceptual discriminability of the search items. Instead, observers' strategic efforts in solving the task at hand must also be considered in accounting for such extended fixation durations (Geyer et al. 2007).
Rather than being independent, in most cases fixation duration and the FVF interact in a nonlinear fashion (Nuthmann et al. 2010; Unema et al. 2005). One strong piece of evidence of a dynamic interaction between the two parameters comes from an oculomotor study on the “pip-and-pop” effect (Zou et al. 2012). In “pip-and-pop” visual search displays, beeps are synchronized with (task-irrelevant) color changes of the target, which is presented in a cluttered and heterogeneous item field (with search being extremely “inefficient”). Zou et al. found that fixation durations increased by some 150 ms for beep-present versus beep-absent trials: an “oculomotor freezing” effect. Such extended fixations at beeps allow information to be sampled over a larger FVF, as indicated by larger saccade amplitudes immediately after the beeps. In other words, beep-induced prolonged fixation times and subsequent large saccade amplitudes mediate fast detection of target presence, yielding the “pip-and-pop” effect. This pattern also suggests that the oculomotor scanning strategy can affect the rate of information processing, as evidenced by increased information uptake per fixation for the beep-present relative to the beep-absent condition. Another very recent study (Zang et al. 2015) on context-based guidance of visual search also revealed a beneficial effect of extended fixation duration on task performance. In this study, observers were first trained with an artificial FVF size, implemented by a gaze-contingent tunnel-viewing technique. With 4–5 items visible inside of the FVF, the mean fixation duration was already extended in the training session for repeated “old,” compared to randomly generated “new,” display (item) layouts.
Further, the scan path for old relative to new displays was closer to the optimal scan path, indicating that learned context improves the efficiency of oculomotor scanning. Increased fixational dwell times and shortened scan paths for old relative to new displays remained evident even after the constraining tunnel view was removed from the task. Such dynamic adjustments of fixation duration and saccade amplitude are quite common during scene search. It has been shown, for instance, that fixation duration and saccade amplitude gradually change over the first few seconds, and then approach their asymptotic levels (Unema et al. 2005). Both asymptotes, however, depend on the number of objects in the scene, which indicates that the complexity of the scene, too, changes oculomotor scanning.
These findings, amongst others, provide converging evidence that the size of the FVF and fixation duration are not determined by visual discriminability alone, as assumed by H&O. Rather, oculomotor scanning is dynamic in that the size of the FVF and fixation duration must be considered together to discern moment-by-moment adjustments of information processing. Despite the H&O conceptual framework's current lack of flexible oculomotor parameters, the idea of the fixation as the central processing unit of visual search remains very promising. However, to incorporate the above findings of dynamic interactions between fixation duration and saccade amplitude, we propose that fixational eye movements are best characterized by both spatial (i.e., the size of the FVF in H&O's terms) and temporal (i.e., fixation duration) factors. Combining the two could provide insight into how oculomotor scanning strategies influence the fixation-by-fixation information processing rate, which might turn out to be the distinguishing feature for comparing different visual search tasks.