The Synergy Between PAV and AdaBoost

W John Wilbur et al. Mach Learn. 2005 Nov;61(1-3):71-103.
doi: 10.1007/s10994-005-1123-6. Epub 2005 Jun 8.

Abstract

Schapire and Singer's improved version of AdaBoost for handling weak hypotheses with confidence-rated predictions represents an important advance in the theory and practice of boosting. Its success results from a more efficient use of the information in weak hypotheses during updating. Instead of simple binary voting, a weak hypothesis is allowed to vote for or against a classification with a variable strength or confidence. The Pool Adjacent Violators (PAV) algorithm is a method for converting a score into a probability. We show how PAV may be applied to a weak hypothesis to yield a new weak hypothesis which is, in a sense, an ideal confidence-rated prediction, and that this leads to an optimal updating for AdaBoost. The result is a new algorithm which we term PAV-AdaBoost. We give several examples illustrating problems for which this new algorithm provides advantages in performance.
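The PAV step the abstract relies on is ordinary isotonic regression by pooling. Below is a minimal NumPy sketch, assuming binary labels in {+1, -1} and a single real-valued score per example; the function names pav_fit and pav_probability are illustrative, not from the paper, and the paper's weighted version (using the boosting distribution D_t) would pool weighted counts rather than raw ones.

    import numpy as np

    def pav_fit(scores, labels):
        # Sort examples by weak-hypothesis score.
        order = np.argsort(scores)
        s = np.asarray(scores, dtype=float)[order]
        y = (np.asarray(labels)[order] == 1).astype(float)  # targets in {0, 1}

        # Each block stores [sum of targets, count, last score in block].
        # Merge adjacent blocks while an earlier block's mean exceeds the mean
        # of the block after it (the "adjacent violators" of monotonicity).
        blocks = []
        for target, score in zip(y, s):
            blocks.append([target, 1.0, score])
            while len(blocks) > 1 and \
                    blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
                t, n, last = blocks.pop()
                blocks[-1][0] += t
                blocks[-1][1] += n
                blocks[-1][2] = last

        cut_scores = np.array([b[2] for b in blocks])    # right edge of each block
        probs = np.array([b[0] / b[1] for b in blocks])  # fitted P(y = +1) per block
        return cut_scores, probs

    def pav_probability(score, cut_scores, probs):
        # Step-function lookup: a new score gets the probability of the block
        # whose right edge is the first one at or above it.
        idx = np.searchsorted(cut_scores, score, side="left")
        return probs[min(idx, len(probs) - 1)]

Under Schapire and Singer's analysis, the best confidence-rated value to output on a block with estimated probability p is (1/2) ln(p / (1 - p)); composing the PAV estimate with that transform is, roughly, the construction the abstract refers to, though blocks where p is exactly 0 or 1 need some smoothing in practice to keep the confidences finite.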

Keywords: boosting; convergence; document classification; isotonic regression; k nearest neighbors.


Figures

Figure 1. PAV-AdaBoost with Cmass as the weak learner. Precision is 11-point average precision.

Figure 2. Boosting Nscore with PAV-AdaBoost compared with several different attempts to boost Nscore with linear AdaBoost.

Figure 3. Probability of label class +1 as a function of score, as estimated by PAV and by sigmoid curves. The curve marked Sigmoid is the estimate implicitly used by linear AdaBoost and is clearly not the optimal sigmoid minimizing Z_t. SigmoidOpt is the result of optimization over both ω and b in definition (52).

Figure 4. Boosting naïve Bayes (binary form) with three different algorithms. PAV-AdaBoost gives the lowest error on the training space (lower panel), but optimal linear AdaBoost gives the best generalizability (upper panel).
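A note on the notation in the Figure 3 caption: in Schapire and Singer's confidence-rated framework, Z_t denotes the normalization factor that each boosting round seeks to minimize,

    Z_t = \sum_i D_t(i) \exp(-y_i h_t(x_i)),

where D_t is the current distribution over training examples and h_t is the real-valued weak hypothesis. Definition (52) itself is not reproduced on this page; a standard two-parameter sigmoid of the kind the caption suggests would be P(y = +1 | s) = 1 / (1 + exp(-(ω s + b))), but that exact form is an assumption here, not taken from the paper.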
