Visual learning and recognition of 3-D objects from appearance
H. Murase and S. K. Nayar
ABSTRACT
Traditional recognition approaches in computer vision have emphasized geometric models of objects. Such approaches become impractical when dealing with large sets of objects or with objects that have complex geometric properties. In this paper, Hiroshi Murase and Shree K. Nayar develop an automatic learning and recognition method based on an eigenspace representation of object appearance, parametrized by object pose and illumination. The method assumes that the object of interest in a given image is not occluded by other objects and can be segmented from the rest of the scene.

Before learning, each sample image is segmented from the background and resampled to normalize the object's scale. Invariance to illumination intensity and to the aperture of the imaging system is achieved by normalizing the total energy of each image to unity. Learning then compresses the large image sets into low-dimensional eigenspaces that capture the gross appearance characteristics of the objects. First, a universal eigenspace is computed from the images of all objects using the spatial-temporal adaptive algorithm. The learning samples of each object are then projected into this eigenspace, yielding a set of discrete points that lie on a smoothly varying manifold; a standard cubic-spline interpolation algorithm computes this manifold, with pose and illumination as parameters. By an identical procedure, an object eigenspace is computed for each object and a manifold is constructed within it.

After showing that the Euclidean distance between two points in eigenspace approximates the ``sum-of-squared difference'' between the brightness values of the corresponding images, the authors go on to describe the object recognition and pose estimation scheme. After the same scale and brightness normalization, an input image is projected into the universal eigenspace to find the stored object whose manifold lies closest to the projected point. The input image is then projected into that object's eigenspace, and pose and illumination are estimated by finding the closest point on its manifold. To overcome the inefficiency of exhaustive search, two alternative schemes for finding the closest manifold point are suggested: a multi-dimensional binary search and a three-layer radial basis function network. Experiments illustrate the sensitivity of the recognition rate to the number of eigenspace dimensions and to the number of object poses used for learning, and an application to moving-object recognition is presented. Finally, the authors conclude with a discussion of the advantages and limitations of the approach.
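To make the pipeline concrete, the following is a minimal sketch of appearance-based learning and recognition in a parametric eigenspace, in the spirit of the method summarized above. It is not the authors' implementation: it assumes a single pose parameter per object, uses NumPy's SVD and SciPy's cubic-spline interpolation in place of the paper's specialized eigenvector algorithm, and all function names and the eigenspace dimension k are illustrative choices.

```python
# Hypothetical sketch of the parametric-eigenspace idea (not the authors' code).
import numpy as np
from scipy.interpolate import CubicSpline


def normalize(image):
    """Flatten an image and scale its total energy to unity, giving
    insensitivity to illumination intensity and imaging aperture."""
    x = image.astype(np.float64).ravel()
    return x / np.linalg.norm(x)


def learn_eigenspace(images, k=20):
    """Compute a k-dimensional eigenspace from a set of normalized images.

    `images` is a list of equal-sized arrays; k (an assumed value) is the
    number of retained eigenvectors. SVD of the mean-subtracted image matrix
    stands in for the paper's more efficient eigenvector computation.
    """
    X = np.stack([normalize(im) for im in images])        # one row per image
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]                                   # top-k eigenvectors


def project(image, mean, eigvecs):
    """Project a normalized image onto the eigenspace (a k-D point)."""
    return eigvecs @ (normalize(image) - mean)


def build_manifold(images, poses, mean, eigvecs):
    """Fit a cubic-spline manifold through one object's eigenspace points,
    parametrized here by a single, strictly increasing pose angle."""
    points = np.stack([project(im, mean, eigvecs) for im in images])
    return CubicSpline(poses, points, axis=0)


def recognize(image, manifolds, mean, eigvecs, n_samples=360):
    """Return (object_id, pose) of the manifold point closest to the
    projected input. Euclidean distance in eigenspace approximates the
    sum-of-squared differences between the underlying images."""
    g = project(image, mean, eigvecs)
    best = (None, None, np.inf)
    for obj_id, spline in manifolds.items():
        thetas = np.linspace(spline.x[0], spline.x[-1], n_samples)
        d = np.linalg.norm(spline(thetas) - g, axis=1)
        i = int(np.argmin(d))
        if d[i] < best[2]:
            best = (obj_id, thetas[i], d[i])
    return best[0], best[1]
```

Because the manifold is continuous, the estimated pose is not restricted to the discrete poses used during learning; the exhaustive sampling in `recognize` is the simple search that the paper's binary-search and radial-basis-function alternatives are meant to speed up.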
ECVision indexed and annotated bibliography of cognitive computer vision publications
This bibliography was created by Hilary Buxton and Benoit Gaillard, University of Sussex, as part of ECVision Specific Action 8-1
The complete text version of this BibTeX file is available here: ECVision_bibliography.bib