Next: Data Compression. Up: No Title Previous: History of KL

Signal Analysis and Pattern Recognition.

For the data analysis problem, the KL transform is the optimal linear method (in the least mean squared error sense) for reducing redundancy in a dataset (Fukunaga, [28]). The basic idea behind the KL transform is to map possibly correlated variables in a data set onto a minimal number of uncorrelated variables.
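The decorrelation step can be made concrete with a minimal sketch. It assumes NumPy, samples stored as rows, and a synthetic three-variable data set invented here for illustration; the function name `kl_transform` is hypothetical and does not come from the cited work.

```python
import numpy as np

def kl_transform(data, n_components):
    """Return KL coefficients, basis, mean and eigenvalues for row-wise samples."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Covariance matrix of the variables (columns).
    cov = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # largest variance first
    eigvals = eigvals[order]
    basis = eigvecs[:, order[:n_components]]   # leading eigenvectors as columns
    coeffs = centered @ basis                  # uncorrelated KL coefficients
    return coeffs, basis, mean, eigvals

# Synthetic data (an assumption for illustration): the second variable is
# almost a multiple of the first, the third is independent noise.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
data = np.column_stack([x, 2.0 * x + 0.1 * rng.normal(size=500), rng.normal(size=500)])

coeffs, basis, mean, eigvals = kl_transform(data, n_components=2)
coeff_cov = np.cov(coeffs, rowvar=False)       # diagonal up to rounding error
```

In this sketch the covariance of the retained coefficients is diagonal, with the leading eigenvalues on the diagonal, which is the sense in which the new variables are uncorrelated.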

Recognition using Eigenspace

The basic idea behind the eigenspace method is to represent a large number of 'training' images with a lower-dimensional subspace. Once the training images have been distilled using the KL transform, a new image can be classified as similar to or very different from them with just a few operations in the pre-computed subspace.
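A rough sketch of this classification step follows. It assumes NumPy, images flattened to vectors, and two synthetic classes made up for the example; the helper names, the nearest-neighbour rule and the rejection threshold are illustrative choices, not details taken from the cited papers.

```python
import numpy as np

def build_eigenspace(train, k):
    """Pre-compute the mean, a k-dimensional basis and the training coefficients."""
    mean = train.mean(axis=0)
    centered = train - mean
    # Rows of vt are the eigenvectors of the sample covariance matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k].T
    return mean, basis, centered @ basis

def classify(image, mean, basis, train_coeffs, labels, reject_threshold):
    """Return the label of the nearest training image, or None if the image
    lies too far from the subspace to resemble any training image."""
    centered = image - mean
    coeffs = centered @ basis
    residual = np.linalg.norm(centered - basis @ coeffs)  # distance from subspace
    if residual > reject_threshold:
        return None
    nearest = np.argmin(np.linalg.norm(train_coeffs - coeffs, axis=1))
    return labels[nearest]

# Synthetic 'images' (an assumption): noisy copies of two random prototypes.
rng = np.random.default_rng(1)
proto_a, proto_b = rng.normal(size=(2, 64))
train = np.vstack([proto_a + 0.05 * rng.normal(size=(10, 64)),
                   proto_b + 0.05 * rng.normal(size=(10, 64))])
labels = ["a"] * 10 + ["b"] * 10
mean, basis, train_coeffs = build_eigenspace(train, k=3)

query = proto_b + 0.05 * rng.normal(size=64)   # resembles class "b"
outlier = 5.0 * rng.normal(size=64)            # resembles neither class
result_known = classify(query, mean, basis, train_coeffs, labels, reject_threshold=2.0)
result_outlier = classify(outlier, mean, basis, train_coeffs, labels, reject_threshold=2.0)
```

Only a projection and a handful of distance computations are needed per new image; the costly decomposition is done once, in advance.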

One of the first applications of the KL expansion to pattern recognition was made by Watanabe in 1965 [25], who applied it to speech recognition.

The general methodology of the KL representation of any given large database is discussed in [24]. The optimality of the KL method is illustrated there by a variety of examples (pattern recognition, turbulent flow, physiology and oceanographic flow).

Turk and Pentland [33], [34] treated face recognition as a two-dimensional recognition problem, taking advantage of the fact that faces are normally upright and thus may be described by a small set of 2-D characteristic views. Face images were projected onto a feature space ('face space') that best encodes the variation among known face images. The face space was then defined by the 'eigenfaces', which are the eigenvectors of the covariance matrix of the set of faces; they did not necessarily correspond to isolated features such as eyes, ears, and noses. The framework provided the ability to learn to recognize new faces in an unsupervised manner.

Identifying the content of digital data can be done most reliably by hand, but the large volume of data now available makes autonomous techniques necessary. Very few techniques have been applied to the problem of extracting and storing information from video clips. A new approach for tagging semantic information in video clips, based on the KL transform, was proposed by Griffioen et al. [35]. The key to the efficiency of the approach is to exploit the fact that video clips are stored digitally in a transformed, compressed format.

In the field of object recognition, Murase and Nayar [11] addressed the problem of automatically learning object models for recognition and pose estimation. In contrast to the traditional approach, the recognition problem was formulated as one of matching appearance rather than shape. For each object of interest, a large set of images was obtained by automatically varying pose and illumination. This image set was compressed to obtain a low-dimensional subspace, called the eigenspace, in which the object was represented as a manifold. Given an unknown input image, the recognition system projected the image onto the eigenspace. The object was recognized based on the manifold on which the projection lay, and the exact position of the projection on that manifold determined the object's pose in the image. A near real-time recognition system with 20 complex objects in the database was developed. The paper concluded with a discussion of various issues related to the proposed learning and recognition methodology.
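The pose-estimation step can be sketched for a single object as follows. This is a simplified illustration under stated assumptions, not the system of [11]: the `render` function is a hypothetical stand-in that produces a 1-D 'image' shifting with pose, the manifold is represented only by its sampled points (no interpolation between them), and the eigenspace dimension of 8 is an arbitrary choice.

```python
import numpy as np

def render(pose_deg, n_pixels=64):
    """Hypothetical 1-D 'image' of an object whose appearance shifts with pose."""
    grid = 2.0 * np.pi * np.arange(n_pixels) / n_pixels
    return np.exp(3.0 * np.cos(grid - np.deg2rad(pose_deg)))

# Learning stage: sample the pose every 10 degrees and compress the image set.
train_poses = np.arange(0, 360, 10)
train = np.stack([render(p) for p in train_poses])
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
basis = vt[:8].T                        # 8-dimensional eigenspace
manifold = (train - mean) @ basis       # one manifold point per sampled pose

def estimate_pose(image):
    """Project the image onto the eigenspace and return the pose of the
    closest sampled manifold point."""
    point = (image - mean) @ basis
    nearest = np.argmin(np.linalg.norm(manifold - point, axis=1))
    return train_poses[nearest]

# Recognition stage: an unseen pose between two training samples.
estimated = estimate_pose(render(47.0))
```

With several objects, one such manifold would be stored per object, and the object whose manifold lies closest to the projection would be reported together with the pose.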

Gorecki [26] applied the KL expansion to classify rough surfaces by analyzing a reduced number of Fourier power spectrum coefficients of the surface image. The KL method both reduces the dimensionality of the sampled data and compresses the input data.

Everson and Sirovich [18] applied the KL transform to recover gappy data. They showed that the KL procedure gives an unbiased estimate of the data that lie in the gaps and permits the gaps to be filled in a reasonable manner. As an example, 50 grey-level pictures of similar faces, each with about 25% of its information lost, were successfully recovered using this procedure. In [24] they also exploited natural symmetries in a family of patterns (e.g. human faces) to make the recognition process more efficient, and reported an improved approximation.
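One way to picture gap filling is a least-squares fit of the KL coefficients using only the known entries. The sketch below assumes NumPy, a basis computed beforehand from complete training data, and synthetic data lying exactly in a three-dimensional subspace; the procedure in [18] also treats gaps in the ensemble itself, which this simplified example does not attempt.

```python
import numpy as np

def fill_gaps(gappy, known, mean, basis):
    """Estimate the missing entries of `gappy` from the entries marked in `known`."""
    # Fit the KL coefficients using only the rows of the basis that are observed.
    coeffs, *_ = np.linalg.lstsq(basis[known], gappy[known] - mean[known], rcond=None)
    filled = gappy.copy()
    filled[~known] = (mean + basis @ coeffs)[~known]   # reconstruct only the gaps
    return filled

# Synthetic ensemble (an assumption): every sample is a mix of three fixed modes.
rng = np.random.default_rng(2)
modes = rng.normal(size=(3, 50))
train = rng.normal(size=(40, 3)) @ modes + 1.0
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
basis = vt[:3].T

# A new sample with roughly a quarter of its entries lost.
truth = rng.normal(size=3) @ modes + 1.0
known = np.ones(50, dtype=bool)
known[rng.choice(50, size=12, replace=False)] = False
gappy = np.where(known, truth, 0.0)

filled = fill_gaps(gappy, known, mean, basis)
```

Because the synthetic sample lies in the span of the basis, the fit recovers it essentially exactly; for real images the quality of the fill depends on how well the retained eigenvectors describe the new sample.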

Rigalia [14] proposed an adaptive unit-norm filter that estimates all eigenvalues and eigenvectors of the autocorrelation matrix of the input signal.


stanislav gordeyev
Sun Feb 2 17:37:56 EST 1997