Jeffery et al. propose that the cognitive maps used by vertebrates are bicoded rather than fully volumetric, insofar as they represent a two-dimensional metric map in combination with a non-metric representation of the third dimension. In some cases, the two-dimensional metric surface might be a vertical surface. Additionally, the maps are piecemeal. This is an important hypothesis that will help to focus future research in the field, but it raises questions concerning the relationship between metric representations in perception and those in cognition. Our commentary will propose possible links between metric perceptual experience and bicoded cognitive maps.
The visual world is normally understood as being sensed through a two-dimensional medium (e.g., the retina; pace Gibson, 1979) that provides direct (angular) access to the vertical dimension and to the portion of the horizontal that is frontal to the observer. In contrast, the information most relevant for route planning lies roughly along the line of sight itself – the horizontal depth axis, or the ground plane, which is highly compressed on the retina. A large number of visual (and non-visual) sources of information may be used to locate things in depth, but foremost among these, for locomotion, is information about angle of regard with respect to the ground surface (whether horizontal, slanted, or vertical) and eye height.
Relating the two-dimensional frontal retinal image to a cognitive metric surface map is an important achievement. The information evident to perceptual experience about surface layout in locomotor space includes information about surface inclination (i.e., slant relative to the gravitationally specified horizontal plane), as well as information about surface extent and direction. Over the past three decades, several competing claims about surface layout perception have been advanced, including affine models of depth-axis recovery (Wagner, 1985), well-calibrated performance at egocentric distance tasks (Loomis et al., 1992), energy-based models of slant perception (Proffitt et al., 1995), intrinsic bias models of distance (Ooi & He, 2007), and our own angular scale expansion model of slant and distance (Durgin & Li, 2011; Hajnal et al., 2011; Li & Durgin, 2012).
The common observations that many of these models seek to address include perceptual facts that are superficially inconsistent with accurate metric spatial representations: Distances are underestimated (Foley et al., 2004; Kelly et al., 2004), surface slant relative to horizontal is overestimated (Li & Durgin, 2010; Proffitt et al., 1995), and object height is consistently overestimated relative to egocentric distance (Higashiyama & Ueyama, 1988; Li et al., 2011). Alongside these facts are the observations that angular variables, such as angular declination (or gaze declination) relative to visible or implied horizons, seem to control both perceptual experience and motor action (Loomis & Philbeck, 1999; Messing & Durgin, 2005; Ooi et al., 2001), emphasizing the importance of angular variables in the perceptual computation of egocentric distance.
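The geometry behind this angular control of distance is simple: on level ground, the egocentric distance to a fixated point is eye height divided by the tangent of the gaze declination below the horizon. A minimal sketch (the function name is ours, for illustration):

```python
import math

def ground_distance(eye_height_m: float, gaze_declination_deg: float) -> float:
    """Distance along level ground to a point fixated at a given
    angular declination below the horizon."""
    return eye_height_m / math.tan(math.radians(gaze_declination_deg))

# A point seen 45 degrees below the horizon lies one eye height away.
print(ground_distance(1.6, 45.0))  # ~1.6 m
```

Note how steeply distance grows as declination shrinks toward the horizon, which is one reason small errors in registered declination matter for far targets.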
Because the perceptual input to vision is specified in a polar coordinate system (the retinal “image,” and its polar transformations afforded by eye, head, and body movements), a basic issue for relating the bicoded maps proposed by Jeffery et al. to the perceptual input is to understand the transformation from spherical coordinates to metric space. One argument that we have made in various forms is that metric spatial coding of distance might be incorrectly scaled in perception without producing a cost for action. Because our actions occur in the same perceptual space as our other perceptions (i.e., we see our actions; Powers, 1973), they can be calibrated to whatever scaling we perceive. Metric representation is important to successful action (Loomis et al., 1992), but perceived egocentric distance can be mis-scaled as long as the scaling is stable and consistent.
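The angles-to-metric transformation at issue can be sketched in minimal form. Assuming a level ground plane and a known eye height, a point fixated at a given azimuth and declination maps to metric ground-plane coordinates as follows (names and simplifications are ours):

```python
import math

def ground_coords(eye_height_m: float, azimuth_deg: float,
                  declination_deg: float) -> tuple:
    """Map viewer-centred angles to metric (x, y) on a level ground plane."""
    # Ground range from gaze declination and eye height.
    r = eye_height_m / math.tan(math.radians(declination_deg))
    x = r * math.sin(math.radians(azimuth_deg))  # rightward of the observer
    y = r * math.cos(math.radians(azimuth_deg))  # forward of the observer
    return (x, y)

# Straight ahead, 45 degrees below the horizon: one eye height forward.
print(ground_coords(1.6, 0.0, 45.0))
```

Any systematic rescaling of the input angles in such a transformation would rescale the recovered metric coordinates, which is the sense in which perceptual mis-scaling need not disrupt a stably calibrated map.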
We have observed that angular variables (including slant) seem to be coded in a way that exaggerates deviations from horizontal; this scaling could have the felicitous informational consequences of retaining greater coding precision while producing perceptual underestimation of egocentric ground distance (Durgin & Li, 2011), to which locomotor action can nonetheless be calibrated (Rieser et al., 1995). Believing that the resulting cognitive maps are metric does not require that perception also be metric in the same way (there is substantial evidence that egocentric horizontal extents appear shorter than frontal horizontal extents; see, e.g., Li et al., 2013), though it suggests that anisotropies in perceptual experience are overcome in cognitive maps. However, the notion that the vertical dimension is coded separately is intriguing and likely reflects a later transformation.
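As an illustration of how exaggerated angular coding could yield underestimated ground distance, suppose perceived gaze declination is actual declination scaled by a constant gain (a gain near 1.5 is reported by Durgin & Li, 2011; treating it as a simple multiplicative constant over this range is our simplification):

```python
import math

GAIN = 1.5  # illustrative angular expansion gain (assumption)

def perceived_distance(eye_height_m: float, actual_declination_deg: float,
                       gain: float = GAIN) -> float:
    """Ground distance implied by a gain-exaggerated gaze declination."""
    perceived_decl = gain * actual_declination_deg
    return eye_height_m / math.tan(math.radians(perceived_decl))

# A target fixated at 10 degrees declination (1.6 m eye height):
actual = 1.6 / math.tan(math.radians(10.0))  # ~9.07 m away
seen = perceived_distance(1.6, 10.0)         # ~5.97 m: underestimated
print(actual, seen)
```

The exaggerated angle implies a nearer location, yet because action is calibrated within the same mis-scaled space, no behavioural cost need follow.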
The perception of surface slant appears to require integration of vertical and horizontal metrics, and the systematic distortions evident in surface slant cannot be explained in detail merely by assuming that vertical scaling is expanded relative to horizontal scaling. Rather, the most elegant quantitative model of slant perception available suggests that perceived slant may represent the proportion of surface extent that is vertical (i.e., the sine of actual slant; see Durgin & Li, 2012). Importantly, the misperception of surface orientation shows constancy across changes in viewing direction (e.g., Durgin et al., 2010b), which implies that the coding of slant in perceptual experience is with respect to a gravitational (rather than an egocentric) reference frame, even when no horizontal ground surface is visible in the scene. It would therefore be of interest, given the suggestion that the two-dimensional cognitive map may sometimes be vertical, to determine how a vertical planar reference frame (e.g., a vertical navigation surface) affects the human interpretation of relative surface orientation.
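The sine relation can be made concrete in a small sketch; re-expressing the vertical proportion as an angle over the 0–90 degree range is our illustrative choice of output units, not a claim about the model's exact formulation:

```python
import math

def perceived_slant_deg(actual_slant_deg: float) -> float:
    """Perceived slant under a sine-based coding: the proportion of
    surface extent that is vertical (the sine of actual slant),
    re-expressed as an angle spanning 0-90 degrees."""
    return 90.0 * math.sin(math.radians(actual_slant_deg))

# Shallow inclines are strongly exaggerated; vertical surfaces are not.
print(perceived_slant_deg(5.0))   # ~7.8 degrees
print(perceived_slant_deg(30.0))  # 45.0 degrees
print(perceived_slant_deg(90.0))  # 90.0 degrees
```

The compressive shape of the sine function captures why exaggeration is largest for near-horizontal surfaces and vanishes at vertical, a pattern a simple vertical-versus-horizontal rescaling would not produce.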
The transformations between retinal input and cognitive maps probably involve an intermediate stage of perceptual representations of space that resembles neither. This perceptual stage clearly overcomes the superficial limitations of retinal images, yet it integrates vertical and horizontal (in the form of perceived surface orientation) in a way that cognitive maps may not. The selection of a two-dimensional metric representation is probably highly determined by information processing constraints in spatial cognition. Such ideas are also at the heart of scale expansion theory.