Sensors (Basel). 2019 Mar 26;19(6):1476.
doi: 10.3390/s19061476.

Improved PSO_AdaBoost Ensemble Algorithm for Imbalanced Data

Kewen Li et al. Sensors (Basel). 2019.

Abstract

The Adaptive Boosting (AdaBoost) algorithm is a widely used ensemble learning framework that achieves good classification results on general datasets. However, it is difficult to apply AdaBoost directly to imbalanced data, because it is designed mainly to handle misclassified samples rather than samples of minority classes. To better process imbalanced data, this paper introduces the Area Under Curve (AUC) indicator, which reflects the comprehensive performance of the model, and proposes an improved AdaBoost algorithm based on AUC (AdaBoost-A). AdaBoost-A improves the error calculation of AdaBoost by jointly considering the misclassification probability and the AUC. To prevent the redundant or useless weak classifiers generated by traditional AdaBoost from consuming too many system resources, this paper also proposes an ensemble algorithm, PSOPD-AdaBoost-A, which can re-initialize parameters to avoid falling into local optima and which optimizes the coefficients of the AdaBoost weak classifiers. Experimental results show that the proposed algorithm is effective for processing imbalanced data, especially data with a relatively high degree of imbalance.
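
The abstract does not state the AdaBoost-A error formula, so the following is only a minimal Python sketch of the general idea: a weak classifier's error is computed from both its weighted misclassification rate and its AUC. The blend in auc_aware_error and its parameter lam are assumptions made for illustration and are not the paper's definition. The weighted error and the coefficient formula are those of standard discrete AdaBoost.

```python
import math


def weighted_error(weights, y_true, y_pred):
    """Standard AdaBoost weighted misclassification rate."""
    total = sum(weights)
    wrong = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    return wrong / total


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Labels are +1 (minority) and -1.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t != 1]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_aware_error(weights, y_true, y_pred, scores, lam=0.5):
    """HYPOTHETICAL blend of weighted error and (1 - AUC).

    This is an illustrative assumption, not the formula from the paper.
    """
    err = weighted_error(weights, y_true, y_pred)
    return (1.0 - lam) * err + lam * (1.0 - auc(y_true, scores))


def classifier_coefficient(error, eps=1e-10):
    """Standard AdaBoost coefficient: 0.5 * ln((1 - error) / error)."""
    error = min(max(error, eps), 1.0 - eps)
    return 0.5 * math.log((1.0 - error) / error)


# Toy imbalanced set: 8 majority samples (-1) and 2 minority samples (+1).
y_true = [-1] * 8 + [1] * 2
weights = [0.1] * 10

# A degenerate weak classifier that always predicts the majority class.
y_pred = [-1] * 10
scores = [0.0] * 10  # constant scores carry no ranking information

plain = weighted_error(weights, y_true, y_pred)
blended = auc_aware_error(weights, y_true, y_pred, scores)
alpha_plain = classifier_coefficient(plain)
alpha_blended = classifier_coefficient(blended)
```

In this toy case the majority-only classifier looks acceptable under the plain weighted error, because it misclassifies only the two minority samples, but its AUC is no better than chance. Under the assumed blend its error rises and its coefficient shrinks, which is the kind of effect an AUC-aware error is meant to have on imbalanced data.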

Keywords: Adaptive Boosting; Area Under Curve; Particle Swarm Optimization; imbalanced data.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1. The AUC of AdaBoost Algorithm on Vehicle Training Set.
Figure 2. Performance Comparison on Vehicle Test Set.
Figure 3. The Error Comparison of AdaBoost and AdaBoost-A on Vehicle Dataset.
Figure 4. The AUC Comparison of AdaBoost and AdaBoost-A on Vehicle Dataset.
Figure 5. The AUC of AdaBoost Algorithm on KC1 Training Set.
Figure 6. Performance Comparison on KC1 Test Set.
Figure 7. Error Comparison of AdaBoost and AdaBoost-A on KC1 Dataset.
Figure 8. The AUC Comparison of AdaBoost and AdaBoost-A on KC1 Dataset.
Figure 9. Performance Comparison of the AdaBoost, PSO-AdaBoost-A, and PSOPD-AdaBoost-A on Horse Colic Dataset.
Figure 10. Performance Comparison of the AdaBoost, PSO-AdaBoost-A, and PSOPD-AdaBoost-A on Ionosphere Dataset.
Figure 11. Performance Comparison of the AdaBoost, PSO-AdaBoost-A, and PSOPD-AdaBoost-A on JM1 Dataset.
Figure 12. Performance Comparison of the AdaBoost, PSO-AdaBoost-A, and PSOPD-AdaBoost-A on KC1 Dataset.
Figure 13. Performance Comparison of the AdaBoost, PSO-AdaBoost-A, and PSOPD-AdaBoost-A on Statlog Dataset.
Figure 14. Comparison of AUC on the Vehicle Dataset.
Figure 15. Comparison of AUC on the PC3 Dataset.
Figure 16. Comparison of AUC on the PC5 Dataset.
Figure 17. Comparison of AUC on the CM1 Dataset.
