
Gaze-contingent manipulation of the FVF demonstrates the importance of fixation duration for explaining search behavior

Published online by Cambridge University Press:  24 May 2017

Jochen Laubrock
Affiliation: Department of Psychology, University of Potsdam, 14476 Potsdam, Germany. jochen.laubrock@uni-potsdam.de
Ralf Engbert
Affiliation: Department of Psychology, University of Potsdam, 14476 Potsdam, Germany. ralf.engbert@uni-potsdam.de
Anke Cajar
Affiliation: Department of Psychology, University of Potsdam, 14476 Potsdam, Germany. anke.cajar@uni-potsdam.de
Lab website: http://mbd.uni-potsdam.de/EngbertLab/Welcome.html

Abstract

Hulleman & Olivers' (H&O's) model introduces variation of the functional visual field (FVF) for explaining visual search behavior. Our research shows how the FVF can be studied using gaze-contingent displays and how FVF variation can be implemented in models of gaze control. Contrary to H&O, we believe that fixation duration is an important factor when modeling visual search behavior.

Type: Open Peer Commentary
Copyright: © Cambridge University Press 2017

Hulleman & Olivers (H&O) criticize the visual search literature for having focused largely on the individual item as the primary unit of selection. As an alternative to this view, the authors propose that (1) visual sampling during fixations is a critical process in visual search, and that (2) factors in addition to items determine the selection of upcoming fixation locations. H&O developed a very parsimonious simulation model, in which the size of the functional visual field (FVF) adapts to search difficulty. Items within the FVF are processed in parallel. Consequently, when search difficulty is very high, the FVF shrinks to a size of one item, effectively producing serial search. When search difficulty is lower, more items are processed in parallel within the FVF. These modeling assumptions were sufficient to qualitatively reproduce much of the canonical data pattern obtained in visual search tasks.
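To make these assumptions concrete, the following toy simulation sketches the core logic as we read it: the FVF covers a number of items that shrinks with search difficulty, all items inside the FVF are inspected in parallel during one fixation of fixed duration, and fixations continue until the target is found or the display is exhausted. The function, parameter values, and the assumption of random sampling of uninspected items are our own illustrative choices, not the authors' implementation.

```python
import random

def simulate_search(n_items, fvf_size, fixation_ms=250, target_present=True):
    """Toy simulation in the spirit of the H&O model: all items inside the
    functional visual field (FVF) are inspected in parallel within a single
    fixation of fixed duration; FVF size (in items) is assumed to shrink as
    search difficulty increases. Illustrative sketch only."""
    target = 0 if target_present else None
    unvisited = set(range(n_items))
    fixations = 0
    while unvisited:
        fixations += 1
        # One fixation inspects up to fvf_size not-yet-visited items in parallel.
        inspected = random.sample(sorted(unvisited), min(fvf_size, len(unvisited)))
        unvisited.difference_update(inspected)
        if target in inspected:
            break
    return fixations, fixations * fixation_ms

if __name__ == "__main__":
    random.seed(0)
    for fvf in (1, 4, 16):  # hard -> easy search
        n_fix, rt = simulate_search(n_items=32, fvf_size=fvf)
        print(f"FVF = {fvf:2d} items: {n_fix:2d} fixations, ~{rt} ms search time")
```

With an FVF of one item the sketch reduces to strictly serial inspection, whereas larger FVFs produce the shallower search functions characteristic of easy search, which is the qualitative pattern the H&O model reproduces.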

We applaud H&O for acknowledging the important and long-neglected contribution of eye movement control in guiding the search process, because we believe that many attentional phenomena can be explained by considering oculomotor activity (e.g., Laubrock et al. 2005; 2008). Although not all attention shifts are overt, the neural underpinnings of covert attention shifts are largely identical to those of eye movement control (Corbetta et al. 1998). Attention research should therefore profit from advanced models of the spatiotemporal evolution of activations in visual and oculomotor maps, as well as from methods for directly manipulating the FVF.

Gaze-contingent displays are a method for directly manipulating the FVF. For example, in the moving-window technique (McConkie & Rayner 1975), information is visible only within a window of variable size that moves in real time with the viewer's gaze; visual information outside the window is either completely masked or attenuated. A very robust result from studies using this technique is that FVF size is globally adjusted to processing difficulty. In reading research, the size of the FVF is often called the perceptual span, which has been shown to increase with reading development (Rayner 1986; Sperlich et al. 2015) and to be dynamically adjusted, for example, when viewing difficult words (Henderson & Ferreira 1990; Schad & Engbert 2012). In scene perception, parametrically increasing peripheral processing difficulty, for example by selectively removing parts of the spatial frequency spectrum from the peripheral visual field (Fig. 1, top), leads to corresponding reductions in saccade amplitudes (Cajar et al. 2016a; Loschky & McConkie 2002), suggesting a smaller FVF. These modulations are stronger when broad features are removed than when fine details are removed (Cajar et al. 2016a), reflecting the low spatial resolution of peripheral vision. Conversely, when the filter is applied to the central visual field (Fig. 1, bottom), saccade amplitudes increase, particularly if fine detail is removed, corresponding to the high spatial resolution of foveal vision. Cajar et al. (2016b) showed that these very robust modulations of mean saccade amplitude are directly correlated with the distribution of attention (i.e., the perceptual span).

Figure 1. Illustration of gaze-contingent spatial frequency filtering in real-world scenes. The white cross indicates the current gaze position of the viewer. Top: Peripheral low-pass filtering attenuates high spatial frequencies (i.e., fine-grained information) in the peripheral visual field. Bottom: Central high-pass filtering attenuates low spatial frequencies (i.e., coarse-grained information) in the central visual field.
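For readers who want a feel for the manipulation shown in Figure 1, the following sketch approximates peripheral low-pass filtering offline with NumPy and SciPy. The blending function, window radius, and filter width are illustrative assumptions rather than the parameters used in the cited experiments, and a real gaze-contingent display would update the gaze coordinate from the eye tracker on every frame.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def peripheral_lowpass(image, gaze_xy, fovea_radius=64, sigma=4.0, slope=16.0):
    """Blend the original image (around gaze) with a low-pass filtered copy
    (periphery). Illustrative sketch; parameter values are arbitrary and not
    those used in the cited studies."""
    blurred = gaussian_filter(image, sigma=sigma)    # attenuate high spatial frequencies
    h, w = image.shape
    y, x = np.mgrid[0:h, 0:w]
    dist = np.hypot(x - gaze_xy[0], y - gaze_xy[1])  # distance from current gaze position
    # Smooth transition from foveal (weight 0) to peripheral (weight 1) regions.
    weight = 1.0 / (1.0 + np.exp(-(dist - fovea_radius) / slope))
    return (1.0 - weight) * image + weight * blurred

if __name__ == "__main__":
    scene = np.random.rand(480, 640)                 # stand-in for a grayscale scene
    frame = peripheral_lowpass(scene, gaze_xy=(320, 240))
    print(frame.shape)
```

Central high-pass filtering (Fig. 1, bottom) can be approximated analogously by blending a high-pass filtered copy (e.g., the image minus its low-pass version) into the central rather than the peripheral region; the classic moving window corresponds to the special case in which the peripheral copy is fully masked instead of filtered.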

Are existing models of saccadic selection compatible with a variable FVF? In biologically plausible models, a critical feature is a spatial map with a limited spotlight of attention (i.e., an FVF-like representation). Additionally, a simple memory mechanism (called inhibitory tagging) prevents the model from getting stuck by continually selecting the point of highest saliency. Engbert and colleagues implemented such a dynamic model of eye guidance in scene viewing (Engbert et al. 2015), based on an earlier model of fixational eye movements (Engbert et al. 2011). The combination of two interacting maps, one attentional and one inhibitory, could reproduce a broad range of spatial statistics in scene viewing. Whereas these models do explain the selection of fixation locations fairly well, an additional mechanism that adjusts the zoom lens of attention with respect to foveal processing difficulty (Schad & Engbert 2012) is necessary to capture modulations of fixation duration.
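The following sketch illustrates the two-map idea in a deliberately simplified form: activation accumulates inside an FVF-like Gaussian aperture around the current fixation, inhibitory tags build up at inspected locations, and the next fixation location is drawn probabilistically from the difference of the two maps. The update rules and parameter values are simplifications of our own and should not be read as the published model equations (Engbert et al. 2015).

```python
import numpy as np

rng = np.random.default_rng(1)

def gaussian_aperture(shape, center, sigma):
    """FVF-like spotlight: a Gaussian window centered on the current fixation."""
    y, x = np.mgrid[0:shape[0], 0:shape[1]]
    return np.exp(-((x - center[1])**2 + (y - center[0])**2) / (2 * sigma**2))

def simulate_scanpath(saliency, n_fixations=20, fvf_sigma=8.0, decay=0.1, gamma=0.3):
    """Two-map toy model: an attention map gains activation inside the FVF
    (weighted by saliency); an inhibition map tags fixated regions so the
    model does not get stuck on the saliency peak. Simplified sketch only."""
    attention = np.zeros_like(saliency)
    inhibition = np.zeros_like(saliency)
    fix = (saliency.shape[0] // 2, saliency.shape[1] // 2)   # start at the center
    path = [fix]
    for _ in range(n_fixations - 1):
        aperture = gaussian_aperture(saliency.shape, fix, fvf_sigma)
        attention += aperture * saliency     # accumulate evidence inside the FVF
        inhibition += aperture               # tag the currently inspected region
        attention *= (1 - decay)             # both maps decay over time
        inhibition *= (1 - decay)
        potential = attention - gamma * inhibition
        # Select the next fixation location probabilistically from the potential map.
        p = np.exp(potential - potential.max()).ravel()
        idx = rng.choice(p.size, p=p / p.sum())
        fix = np.unravel_index(idx, saliency.shape)
        path.append(fix)
    return path

if __name__ == "__main__":
    sal = rng.random((64, 64))               # stand-in for an empirical saliency map
    print(simulate_scanpath(sal)[:5])
```

Without the inhibition map, such a model would repeatedly select the saliency peak; the tags implement the simple memory mechanism mentioned above.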

In comparison to the complexity of these detailed dynamic models, the H&O model has the advantage of simplicity. However, this comes at the cost of somewhat unrealistic assumptions. For example, H&O assume that fixations have a constant duration of 250 ms and that only the number and distribution of fixations adapt to search difficulty. The authors justify this decision with previous research that found little effect of target discriminability on fixation durations in typical search displays. However, at least for visual search in complex real-world scenes, research shows that fixation durations are indeed affected by search difficulty (e.g., Malcolm & Henderson 2009; 2010).

Thus, not only the selection of fixation locations but also the control of fixation duration is influenced by the FVF. In particular, mean fixation duration increases when the accumulation of visual information in regions of the visual field is artificially impaired by means of gaze-contingent spatial filtering (Laubrock et al. 2013; Loschky et al. 2005; Nuthmann 2014; Shioiri & Ikeda 1989). However, this effect is observed only when filtering does not completely remove useful information; otherwise, default timing takes over, meaning that fixation durations fall back to the level observed during unfiltered viewing (e.g., Laubrock et al. 2013). This might explain why effects of visual search difficulty are more often reported for the number of fixations than for fixation duration. A critical aspect of a model of fixation duration in visual scenes is the parallel and partially independent processing of foveal and peripheral information (Laubrock et al. 2013). Given that both FVF size and fixation duration adapt to task difficulty, an important goal for future research is to integrate models of fixation location and fixation duration.
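The logic of these fixation-duration results can be summarized in a deliberately simple toy function: as long as a degraded stream still delivers usable information, ongoing processing prolongs the fixation in proportion to the slowdown; once filtering removes all usable information, the prolongation mechanism disengages and a default saccade timer determines fixation duration. The numbers and the functional form below are arbitrary illustrations of this logic, not parameters of any published model.

```python
def fixation_duration(info_rate, baseline_rate=1.0, default_ms=280,
                      gain_ms=100, max_extra_ms=200):
    """Toy illustration (not a published model): ongoing but slowed information
    accumulation delays the saccade in proportion to the slowdown; if filtering
    leaves no usable information, default timing determines the duration."""
    if info_rate <= 0:                       # nothing useful left to wait for
        return default_ms                    # default timing takes over
    slowdown = baseline_rate / info_rate     # how much slower than unfiltered viewing
    extra = min(max_extra_ms, gain_ms * (slowdown - 1.0))
    return default_ms + max(0.0, extra)

if __name__ == "__main__":
    for label, rate in [("unfiltered viewing", 1.0),
                        ("moderate peripheral filtering", 0.5),
                        ("all useful information removed", 0.0)]:
        print(f"{label:32s} ~{fixation_duration(rate):3.0f} ms")
```

The qualitative pattern produced by this sketch, longer fixations under moderate filtering but near-baseline durations when the filtered region carries no usable information, matches the findings cited above.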

References

Cajar, A., Engbert, R. & Laubrock, J. (2016a) Spatial frequency processing in the central and peripheral visual field during scene viewing. Vision Research 127:186–97. doi: 10.1016/j.visres.2016.05.008.
Cajar, A., Schneeweiß, P., Engbert, R. & Laubrock, J. (2016b) Coupling of attention and saccades when viewing scenes with central and peripheral degradation. Journal of Vision 16(2):8, 1–19.
Corbetta, M., Akbudak, E., Conturo, T. E., Snyder, A. Z., Ollinger, J. M., Drury, H. A., Linenweber, M. R., Petersen, S. E., Raichle, M. E., Essen, D. C. V. & Shulman, G. L. (1998) A common network of functional areas for attention and eye movements. Neuron 21:761–73.
Engbert, R., Mergenthaler, K., Sinn, P. & Pikovsky, A. (2011) An integrated model of fixational eye movements and microsaccades. Proceedings of the National Academy of Sciences of the United States of America 108:E765–70.
Engbert, R., Trukenbrod, H. A., Barthelmé, S. & Wichmann, F. A. (2015) Spatial statistics and attentional dynamics in scene viewing. Journal of Vision 15(1):14, 1–17.
Henderson, J. M. & Ferreira, F. (1990) Effects of foveal processing difficulty on the perceptual span in reading: Implications for attention and eye movement control. Journal of Experimental Psychology: Learning, Memory, and Cognition 16:417–29.
Laubrock, J., Cajar, A. & Engbert, R. (2013) Control of fixation duration during scene viewing by interaction of foveal and peripheral processing. Journal of Vision 13(12):11, 1–20.
Laubrock, J., Engbert, R. & Kliegl, R. (2005) Microsaccade dynamics during covert attention. Vision Research 45:721–30.
Laubrock, J., Engbert, R. & Kliegl, R. (2008) Fixational eye movements predict the perceived direction of ambiguous apparent motion. Journal of Vision 8(14):13, 1–17.
Loschky, L. C. & McConkie, G. W. (2002) Investigating spatial vision and dynamic attentional selection using a gaze-contingent multiresolutional display. Journal of Experimental Psychology: Applied 8:99–117.
Loschky, L. C., McConkie, G. W., Yang, J. & Miller, M. E. (2005) The limits of visual resolution in natural scene viewing. Visual Cognition 12:1057–92.
Malcolm, G. L. & Henderson, J. M. (2009) The effects of target template specificity on visual search in real-world scenes: Evidence from eye movements. Journal of Vision 9(11):8, 1–13.
Malcolm, G. L. & Henderson, J. M. (2010) Combining top-down processes to guide eye movements during real-world scene search. Journal of Vision 10(2):4, 1–11.
McConkie, G. W. & Rayner, K. (1975) The span of the effective stimulus during a fixation in reading. Perception and Psychophysics 17:578–86. doi: 10.3758/BF03203972.
Nuthmann, A. (2014) How do the regions of the visual field contribute to object search in real-world scenes? Evidence from eye movements. Journal of Experimental Psychology: Human Perception and Performance 40:342–60.
Rayner, K. (1986) Eye movements and the perceptual span in beginning and skilled readers. Journal of Experimental Child Psychology 41:211–36.
Schad, D. J. & Engbert, R. (2012) The zoom lens of attention: Simulating shuffled versus normal text reading using the SWIFT model. Visual Cognition 20:391–421.
Shioiri, S. & Ikeda, M. (1989) Useful resolution for picture perception as a function of eccentricity. Perception 18:347–61.
Sperlich, A., Schad, D. J. & Laubrock, J. (2015) When preview information starts to matter: Development of the perceptual span in German beginning readers. Journal of Cognitive Psychology 27:511–30.