Kevin W. Bowyer - Data Mining and Classifier Ensembles

  • Detecting and ordering salient regions,
    Larry Shoemaker, Robert Banfield, Lawrence O. Hall, Kevin W. Bowyer and W. Philip Kegelmeyer,
    Data Mining and Knowledge Discovery 22 (1-2), January 2011, 259-290.
    pdf of this paper.
    We describe an ensemble approach to learning salient regions from arbitrarily partitioned data. ... We combine a fast ensemble learning algorithm with scaled probabilistic majority voting in order to learn an accurate classifier ...

  • A Region Ensemble for 3D Face Recognition,
    Timothy Faltemier, Kevin W. Bowyer and Patrick J. Flynn,
    IEEE Transactions on Information Forensics and Security, 3(1):62-73, March 2008.
    DOI link.
    ... we introduce a new system for 3D face recognition based on the fusion of results from a committee of regions that have been independently matched. ... Rank-one recognition rates of 97.2% and verification rates of 93.2% at 0.1% false accept rate are reported and compared to other methods published on the Face Recognition Grand Challenge v2 data set.
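    To make the committee-of-regions idea concrete, here is a minimal score-fusion sketch in Python. The voting rule, the function name fuse_region_matches, and the toy scores are illustrative assumptions only; the paper evaluates its own fusion strategies over independently matched regions.

      import numpy as np

      def fuse_region_matches(region_scores):
          # region_scores: (n_regions, n_gallery) array of per-region match
          # scores, where lower means a better match (e.g. an ICP-style error).
          region_scores = np.asarray(region_scores, dtype=float)
          n_regions, n_gallery = region_scores.shape

          # Each region votes for the gallery entry it matches best.
          votes = np.zeros(n_gallery)
          for r in range(n_regions):
              votes[np.argmin(region_scores[r])] += 1

          # Rank by votes (descending); break ties with mean score (ascending).
          mean_scores = region_scores.mean(axis=0)
          return sorted(range(n_gallery), key=lambda g: (-votes[g], mean_scores[g]))

      # toy example: 3 regions matched against 4 gallery subjects
      ranking = fuse_region_matches([[0.9, 0.2, 0.8, 0.7],
                                     [0.6, 0.3, 0.9, 0.8],
                                     [0.5, 0.7, 0.4, 0.9]])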

  • Using Classifier Ensembles to Label Spatially Disjoint Data,
    Larry Shoemaker, Robert E. Banfield, Lawrence O. Hall, Kevin W. Bowyer and W. Philip Kegelmeyer,
    Information Fusion 9(1), 120-133, January 2008.
    pdf of this paper.
    We describe an ensemble approach to learning from arbitrarily partitioned data. ... We combine a fast ensemble learning algorithm with probabilistic majority voting in order to learn an accurate classifier from such data. ...
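    A minimal sketch of probabilistic majority voting over classifiers trained on disjoint partitions, assuming integer class labels 0..n_classes-1 and scikit-learn decision trees as stand-in base learners; the paper's base classifiers and exact vote weighting may differ.

      import numpy as np
      from sklearn.tree import DecisionTreeClassifier

      def train_on_partitions(partitions):
          # one base classifier per (X, y) partition of the data
          return [DecisionTreeClassifier().fit(X, y) for X, y in partitions]

      def probabilistic_vote(classifiers, X, n_classes):
          # Sum each classifier's class-probability estimates, then predict the
          # class with the largest combined probability.  A partition may be
          # missing some classes, so map each classifier's local class labels
          # back to the global label indices.
          combined = np.zeros((len(X), n_classes))
          for clf in classifiers:
              combined[:, clf.classes_.astype(int)] += clf.predict_proba(X)
          return combined.argmax(axis=1)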

  • Learning to Predict Gender from Irises,
    Vince Thomas, Nitesh V. Chawla, Kevin W. Bowyer and Patrick J. Flynn,
    IEEE International Conference on Biometrics: Theory, Applications, and Systems (BTAS 07), September 2007.
    pdf of this paper.
    This paper employs machine learning techniques to develop models that predict gender based on the iris texture features. ...

  • Actively Exploring Face Space(s) for Improved Face Recognition,
    Nitesh V. Chawla and Kevin W. Bowyer,
    AAAI 2007, Vancouver, July 2007.
    pdf of this paper.
    We propose a learning framework that actively explores creation of face space(s) by selecting images that are complementary to the images already represented in the face space. We also construct ensembles of classifiers learned from such actively sampled image sets, which further improves recognition rates. ...
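    One way to picture actively growing a face space is the greedy sketch below: repeatedly add the image that the current PCA face space reconstructs worst. The reconstruction-error criterion and every name here are illustrative assumptions, not the selection strategy of the paper.

      import numpy as np

      def grow_face_space(images, seed_idx, n_to_add, n_components=20):
          # images: (n_images, n_pixels) array, one flattened face per row
          selected = list(seed_idx)
          for _ in range(n_to_add):
              X = images[selected]
              mean = X.mean(axis=0)
              # PCA basis via SVD of the mean-centred, currently selected images
              _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
              basis = vt[:n_components]

              # pick the candidate the current face space represents worst
              centred = images - mean
              residual = centred - centred @ basis.T @ basis
              errors = np.linalg.norm(residual, axis=1)
              errors[selected] = -np.inf        # never re-select an image
              selected.append(int(errors.argmax()))
          return selected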

  • Boosting Lite - Handling Larger Datasets and Slower Base Classifiers,
    Lawrence O. Hall, Robert E. Banfield, Kevin W. Bowyer and W. Philip Kegelmeyer.
    Multiple Classifier Systems (MCS) 2007, Prague, May 2007.
    pdf of this paper.
    ... we examine ensemble algorithms (Boosting Lite and Ivoting) that provide accuracy approximating a single classifier, but which require significantly fewer training examples. ...

  • A Comparison of Decision Tree Ensemble Creation Techniques,
    Robert E. Banfield, Lawrence O. Hall, Kevin W. Bowyer, and W. Philip Kegelmeyer.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 29 (1), 173-180, January 2007.
    pdf of this paper. appendix to the paper.
    We experimentally evaluate bagging and seven other randomization-based approaches to creating an ensemble of decision tree classifiers. Statistical tests were performed on experimental results from 57 publicly available data sets. ...
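    As a rough illustration of this kind of paired, many-data-set comparison (a generic stand-in, not the paper's own statistical methodology), two techniques' per-data-set accuracies could be compared with a non-parametric paired test:

      import numpy as np
      from scipy.stats import wilcoxon

      def compare_techniques(acc_a, acc_b):
          # acc_a, acc_b: accuracies of two ensemble techniques on the same
          # data sets, in the same order (one value per data set)
          acc_a, acc_b = np.asarray(acc_a), np.asarray(acc_b)
          stat, p = wilcoxon(acc_a, acc_b)      # paired non-parametric test
          wins = int((acc_a > acc_b).sum())
          losses = int((acc_a < acc_b).sum())
          return {"wins": wins, "losses": losses,
                  "ties": len(acc_a) - wins - losses, "p_value": float(p)}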

  • Multiple Nose Region Matching for 3D Face Recognition Under Varying Facial Expression,
    Kyong I. Chang, Kevin W. Bowyer, and Patrick J. Flynn,
    IEEE Transactions on Pattern Analysis and Machine Intelligence, 28 (10), 1695-1700, October 2006.
    pdf of this paper.
    An algorithm is proposed for 3D face recognition in the presence of varied facial expressions. It is based on combining the match scores from matching multiple overlapping regions around the nose. Experimental results are presented using the largest database employed to date in 3D face recognition studies, over 4,000 scans of 449 subjects. ...

  • Ensembles of Classifiers from Spatially Disjoint Data,
    Robert E. Banfield, Lawrence O. Hall, Kevin W. Bowyer, and W. Philip Kegelmeyer,
    Springer-Verlag LNCS 3541: 6th International Workshop on Multiple Classifier Systems (MCS 2005), Monterey, CA, June 2005, 196-205.
    pdf of this paper.
    ... We describe an ensemble learning approach that accurately learns from data which has been partitioned according to the arbitrary spatial requirements of a large-scale simulation, wherein classifiers may be trained on only the data local to a given partition. As a result, the class statistics can vary from partition to partition; some classes may even be missing from some partitions.

  • Random Subspaces and Subsampling for 2-D Face Recognition,
    Nitesh V. Chawla and Kevin W. Bowyer,
    Computer Vision and Pattern Recognition (CVPR 2005), San Diego, June 2005, II: 582-589.
    pdf of this paper.
    The random subspace method is a popular ensemble construction technique that improves the accuracy of weak classifiers. It has been shown, in different domains, that random subspaces combined with weak classifiers such as decision trees and nearest neighbor classifiers can provide an improvement in accuracy. In this paper, we apply the random subspace methodology to the 2-D face recognition task. The main goal of the paper is to see whether the random subspace methodology can do as well as, if not better than, the single classifier constructed on the tuned face space. We also propose the use of a validation set for tuning the face space, to avoid bias in the accuracy estimation, and we compare the random subspace methodology to an ensemble of subsamples of image data. This work shows that a random subspaces ensemble can outperform a well-tuned single classifier for a typical 2-D face recognition problem. The random subspaces approach has the added advantage of requiring less careful tweaking.
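    A minimal sketch of the random subspace idea, assuming integer class labels and scikit-learn 1-nearest-neighbour base classifiers; in the 2-D face setting the feature columns would be face-space (e.g. PCA) coefficients rather than raw pixels, and the parameter values are illustrative.

      import numpy as np
      from sklearn.neighbors import KNeighborsClassifier

      def random_subspace_ensemble(X, y, n_classifiers=50, subspace_frac=0.5, seed=0):
          # train one nearest-neighbour classifier per random feature subset
          rng = np.random.default_rng(seed)
          k = max(1, int(subspace_frac * X.shape[1]))
          ensemble = []
          for _ in range(n_classifiers):
              feats = rng.choice(X.shape[1], size=k, replace=False)
              clf = KNeighborsClassifier(n_neighbors=1).fit(X[:, feats], y)
              ensemble.append((feats, clf))
          return ensemble

      def predict_by_vote(ensemble, X):
          # plurality vote of the subspace classifiers (integer labels assumed)
          votes = np.array([clf.predict(X[:, feats]) for feats, clf in ensemble])
          return np.array([np.bincount(col).argmax() for col in votes.T])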

  • Ensemble Diversity Measures and Their Application to Thinning,
    Robert E. Banfield, Lawrence O. Hall, Kevin W. Bowyer, and W. Philip Kegelmeyer,
    Information Fusion 6 (1), March 2005, 49-62.
    pdf of this paper.
    ... We evaluate thinning algorithms on ensembles created by several techniques on 22 publicly available datasets. When compared to other methods, our percentage correct diversity measure algorithm shows a greater correlation between the increase in voted ensemble accuracy and the diversity value. ... Finally, the methods proposed for thinning again show that ensembles can be made smaller without loss in accuracy.
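    The sketch below is a generic, accuracy-based backward-thinning baseline on a validation set (integer labels assumed); it is not the percentage correct diversity measure from the paper, only an illustration of how an ensemble can be made smaller without retraining its members.

      import numpy as np

      def thin_ensemble(classifiers, X_val, y_val, target_size):
          # Greedily drop the member whose removal hurts voted validation
          # accuracy the least, until only target_size members remain.
          preds = [clf.predict(X_val) for clf in classifiers]
          kept = list(range(len(classifiers)))

          def voted_accuracy(members):
              votes = np.array([preds[i] for i in members])
              majority = np.array([np.bincount(col).argmax() for col in votes.T])
              return (majority == y_val).mean()

          while len(kept) > target_size:
              best = max(range(len(kept)),
                         key=lambda i: voted_accuracy(kept[:i] + kept[i + 1:]))
              kept.pop(best)
          return [classifiers[i] for i in kept]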

  • Comments on "A Parallel Mixture of SVMs for Very Large Scale Problems,"
    Xiaomei Liu, Lawrence O. Hall, and Kevin W. Bowyer,
    Neural Computation 16 (7), July 2004, 1345-1351.
    pdf of this paper.
    ... Experiments on the Forest Cover data set show that this parallel mixture is more accurate than a single SVM, with 90.72% accuracy reported on an independent test set. While this accuracy is impressive, the referenced paper does not consider alternative types of classifiers. In this comment, we show that a simple ensemble of decision trees results in a higher accuracy, 94.75%, and is computationally efficient. This result is somewhat surprising and illustrates the general value of experimental comparisons using different types of classifiers.

  • Learning Ensembles from Bites: a Scalable and Accurate Approach,
    Nitesh Chawla, Lawrence O. Hall, Kevin W. Bowyer and W. Philip Kegelmeyer,
    Journal of Machine Learning Research 5, April 2004, 421-451.
    pdf of this paper.
    ... Voting many classifiers built on small subsets of data is a promising approach for learning from massive data sets, one that can utilize the power of boosting and bagging. We propose a framework for building hundreds or thousands of such classifiers on small subsets of data in a distributed environment. Experiments show this approach is fast, accurate, and scalable.

  • Is Error-based Pruning Redeemable?,
    Lawrence O. Hall, Kevin W. Bowyer, Robert E. Banfield, Steven Eschrich, and Richard Collins,
    International Journal of Artificial Intelligence Tools, 12 (3), September 2003, 249-264.
    pdf of this paper.

  • Distributed Learning with Bagging-like Performance,
    Nitesh Chawla, Thomas E. Moore, Lawrence O. Hall, Kevin W. Bowyer, W. Philip Kegelmeyer, and Clayton Springer,
    Pattern Recognition Letters 24 (1-3), 2003, 455-471.
    pdf of this paper.
    Bagging forms a committee of classifiers by bootstrap aggregation of training sets from a pool of training data. A simple alternative to bagging is to partition the data into disjoint subsets. Experiments with decision tree and neural network classifiers on various datasets show that, given the same size partitions and bags, disjoint partitions result in performance equivalent to, or better than, bootstrap aggregates (bags). Many applications (e.g., protein structure prediction) involve use of datasets that are too large to handle in the memory of the typical computer. Hence, bagging with samples the size of the data is impractical. Our results indicate that, in such applications, the simple approach of creating a committee of n classifiers from disjoint partitions each of size 1/n (which will be memory resident during learning) in a distributed way results in a classifier which has a bagging-like performance gain. The use of distributed disjoint partitions in learning is significantly less complex and faster than bagging.
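    A minimal sketch of the two committee-building strategies being compared, assuming numpy arrays, integer class labels, and scikit-learn decision trees as stand-in base learners:

      import numpy as np
      from sklearn.tree import DecisionTreeClassifier

      def committee_from_disjoint_partitions(X, y, n_parts, seed=0):
          # shuffle once, split into n disjoint partitions of ~len(X)/n each,
          # and train one tree per partition
          idx = np.random.default_rng(seed).permutation(len(X))
          return [DecisionTreeClassifier().fit(X[p], y[p])
                  for p in np.array_split(idx, n_parts)]

      def committee_from_bags(X, y, n_bags, seed=0):
          # classic bagging: each member sees a bootstrap sample as large as
          # the full training set
          rng = np.random.default_rng(seed)
          return [DecisionTreeClassifier().fit(X[b], y[b])
                  for b in (rng.integers(0, len(X), size=len(X))
                            for _ in range(n_bags))]

      def vote(committee, X):
          votes = np.array([clf.predict(X) for clf in committee])
          return np.array([np.bincount(col).argmax() for col in votes.T])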

  • SMOTE: Synthetic Minority Over-sampling TEchnique,
    Nitesh Chawla, Kevin W. Bowyer, Lawrence O. Hall, and W. Philip Kegelmeyer,
    Journal of Artificial Intelligence Research 16, 2002, 321-357.
    pdf of this paper.
    (code for example SMOTE implementation)
    This paper shows that a combination of our method of over-sampling the minority (abnormal) class and under-sampling the majority (normal) class can achieve better classifier performance (in ROC space) than only under-sampling the majority class, and better performance than varying the loss ratios in Ripper or the class priors in Naive Bayes. Our method of over-sampling the minority class involves creating synthetic minority class examples.
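    A compact sketch of the core SMOTE idea (synthetic examples created by interpolating between minority-class neighbours). The neighbour search and parameter handling here are simplified; for actual use, see the example implementation linked above.

      import numpy as np

      def smote(minority, n_synthetic, k=5, seed=0):
          # minority: (n_minority, n_features) array of minority-class samples
          # (assumes at least two minority samples)
          rng = np.random.default_rng(seed)
          n = len(minority)
          k = min(k, n - 1)

          # k nearest minority neighbours of every minority sample (Euclidean)
          d = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
          np.fill_diagonal(d, np.inf)
          neighbours = np.argsort(d, axis=1)[:, :k]

          synthetic = np.empty((n_synthetic, minority.shape[1]))
          for i in range(n_synthetic):
              a = rng.integers(n)                  # a random minority sample
              b = neighbours[a, rng.integers(k)]   # one of its k nearest neighbours
              gap = rng.random()                   # interpolation factor in [0, 1)
              synthetic[i] = minority[a] + gap * (minority[b] - minority[a])
          return synthetic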

  • Combination of Multiple Classifiers Using Local Accuracy Estimates,
    Kevin S. Woods, W. Philip Kegelmeyer and Kevin W. Bowyer,
    IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (4), 405-410, April 1997.
    pdf of this paper.
    This paper presents a method for combining classifiers that uses estimates of each individual classifier's local accuracy in small regions of the feature space surrounding an unknown sample. An empirical evaluation using five real data sets confirms the validity of our approach compared to several other algorithms for combining multiple classifiers. We also suggest a methodology for determining the best mix of individual classifiers.
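    A simple sketch of selection by local accuracy, assuming a labelled reference set and Euclidean neighbourhoods; the paper's exact local-accuracy estimates and combination rule may differ from this stand-in.

      import numpy as np

      def predict_by_local_accuracy(classifiers, X_ref, y_ref, X_test, k=10):
          # cache every classifier's predictions on the labelled reference set
          ref_preds = [clf.predict(X_ref) for clf in classifiers]
          out = np.empty(len(X_test), dtype=y_ref.dtype)

          for i, x in enumerate(X_test):
              # k nearest reference samples of this test point (Euclidean)
              nn = np.argsort(np.linalg.norm(X_ref - x, axis=1))[:k]
              # each classifier's accuracy on that local neighbourhood
              local_acc = [(p[nn] == y_ref[nn]).mean() for p in ref_preds]
              # let the locally most accurate classifier decide
              best = int(np.argmax(local_acc))
              out[i] = classifiers[best].predict(x.reshape(1, -1))[0]
          return out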
