 
ECVision indexed and annotated bibliography of cognitive computer vision publications
This bibliography was created by Hilary Buxton and Benoit Gaillard, University of Sussex, as part of ECVision Specific Action 8-1
The complete text version of this BibTeX file is available here: ECVision_bibliography.bib


M. C. Mozer and M. Sitton
Computational modeling of spatial attention

ABSTRACT

If we had really huge brains, say the size of watermelons, attention would play a much smaller role in our behavior. Its significance stems primarily from limitations in our processing hardware. We simply do not have sufficient brain capacity to analyze all information that passes through our sense organs, to reason exhaustively about all possible courses of action, and to maintain multiple interpretations of the world. Attentional selection is needed to determine what information will be processed by the available hardware. Consider the task of recognizing objects in a visual scene. What sort of processing resources would be required to identify all objects in parallel, regardless of their positions, orientations, and size in the scene? If we are familiar with o different objects, and any object can appear in any of p horizontal or vertical positions and r orientations and s scales, the number of different object instantiations is o × p² × r × s. This number would be far larger still if the objects are not rigid. Regardless of the nature of the recognition process, the number of possible object instantiations roughly determines the amount of processing resources required. You can plug in reasonable guesses as to how many object instantiations are possible; 100 million might be a reasonable ballpark figure. If we limit ourselves to one object at a time, however, and the object's position, orientation, and scale are computed first, then the number of object instantiations that have to be considered at once is only o, or a number more like 10,000. Ballard (1986) and Tsotsos (1990, 1991) have presented computational complexity analyses of this sort to argue that the combinatorics of vision require some type of attentional selection to reduce the number of possibilities that need to be considered, and that attention can be particularly beneficial when exploiting knowledge of the particular task being faced by the visual system.
In accord with the computational arguments, human vision shows strong limitations on how many objects can be processed and identified in parallel (e.g., Duncan, 1987; Mozer, 1983, 1989; Pashler & Badgio, 1987; Shiffrin & Gardner, 1972; Schneider & Shiffrin, 1977; Treisman & Schmidt, 1982). In general terms, one can conceive of processing of a visual stimulus as occurring along a certain neural pathway. If the processing pathways for two stimuli are nonoverlapping, then processing can take place in parallel. But if the pathways cross (i.e., they share common resources or hardware), the stimuli will interact or interfere with one another. One role of attention is to reduce this interference by restricting the amount of information that is processed at once. In this chapter, we examine the role of spatial attention from a computational perspective. Because the function of attention can be understood only in its relation to visual information processing, we must model not only the attentional system itself, but also the process of object recognition. We begin by presenting a basic model of object recognition, showing that interference prevents the system from reliably processing multiple, complex stimuli, and then we show how a simple mechanism of attentional selection can reduce this interference. Our initial goal will be to present a model that is computationally adequate, that is, a model that has the computational power to perform the sort of visual information processing tasks that people do. Psychologists are most concerned with another issue: whether the model can explain various experimental findings and whether it has any ability to predict the outcome of further experiments. In our view, the demands of computational adequacy and explanatory/predictive power are complementary, and a compelling account should satisfy both, and in so doing, allow one to understand the mechanisms that underlie information processing.
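
Note on the counting argument in the abstract: the arithmetic can be checked with a short sketch. Only o = 10,000 and the ballpark of 100 million come from the abstract. The values p = 10, r = 10, and s = 10 are illustrative assumptions chosen to reproduce that ballpark, and the function name is hypothetical.

```python
def count_instantiations(o: int, p: int, r: int, s: int) -> int:
    """Count object instantiations for o objects, p horizontal and p vertical
    positions (p * p locations), r orientations, and s scales."""
    return o * p**2 * r * s


# o is the figure given in the abstract; p, r, s are assumed for illustration.
o, p, r, s = 10_000, 10, 10, 10

# Recognizing everything in parallel: every object at every pose.
parallel = count_instantiations(o, p, r, s)

# Attending to one object whose position, orientation, and scale are
# already computed: only object identity remains to be resolved.
attended = o

print(f"parallel recognition: {parallel:,} instantiations")
print(f"with attentional selection: {attended:,} instantiations")
print(f"reduction factor: {parallel // attended:,}")
```

Under these assumed values the parallel case comes to 100 million instantiations, and selection reduces the count by the factor p² × r × s.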

