Improved PSO_AdaBoost Ensemble Algorithm for Imbalanced Data
- PMID: 30917599
- PMCID: PMC6471212
- DOI: 10.3390/s19061476
Abstract
The Adaptive Boosting (AdaBoost) algorithm is a widely used ensemble learning framework that achieves good classification results on general datasets. However, it is difficult to apply AdaBoost directly to imbalanced data, since it is designed mainly to handle misclassified samples rather than samples of minority classes. To better process imbalanced data, this paper introduces the Area Under Curve (AUC) indicator, which reflects the comprehensive performance of the model, and proposes an improved AUC-based AdaBoost algorithm (AdaBoost-A) that improves the error calculation of AdaBoost by jointly considering the misclassification probability and the AUC. To prevent the redundant or useless weak classifiers generated by the traditional AdaBoost algorithm from consuming too many system resources, this paper also proposes an ensemble algorithm, PSOPD-AdaBoost-A, which can re-initialize parameters to avoid falling into local optima and which optimizes the coefficients of the AdaBoost weak classifiers. Experimental results show that the proposed algorithm is effective for processing imbalanced data, especially data with relatively high imbalance.
Keywords: Adaptive Boosting; Area Under Curve; Particle Swarm Optimization; imbalanced data.
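To make the idea of an AUC-aware error concrete, the following is a minimal sketch of how a weak classifier's error could combine the weighted misclassification rate with the AUC. The abstract does not state the AdaBoost-A formula, so the linear blend, the parameter `lam`, and all function names below are assumptions chosen for illustration only, not the authors' method.

```python
import math

def weighted_error(y_true, y_pred, weights):
    """Standard AdaBoost weighted misclassification rate."""
    total = sum(weights)
    wrong = sum(w for t, p, w in zip(y_true, y_pred, weights) if t != p)
    return wrong / total

def auc(y_true, scores):
    """AUC by pairwise comparison; labels are +1 (minority) and -1 (majority)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == -1]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_adjusted_error(y_true, y_pred, scores, weights, lam=0.5):
    """Hypothetical blend (assumption): lam * error + (1 - lam) * (1 - AUC)."""
    err = weighted_error(y_true, y_pred, weights)
    return lam * err + (1.0 - lam) * (1.0 - auc(y_true, scores))

def classifier_coefficient(err, eps=1e-10):
    """Usual AdaBoost coefficient 0.5 * ln((1 - err) / err), clipped for safety."""
    err = min(max(err, eps), 1.0 - eps)
    return 0.5 * math.log((1.0 - err) / err)

# Toy data: 8 majority samples (-1) and 2 minority samples (+1).
y_true = [-1] * 8 + [1] * 2
weights = [1.0] * 10

# A trivial classifier that always predicts the majority class.
y_pred = [-1] * 10
scores = [0.0] * 10  # constant scores carry no ranking information

plain_err = weighted_error(y_true, y_pred, weights)
adjusted_err = auc_adjusted_error(y_true, y_pred, scores, weights)
plain_alpha = classifier_coefficient(plain_err)
adjusted_alpha = classifier_coefficient(adjusted_err)
```

On this toy data the always-majority classifier has a plain weighted error of about 0.2 but an AUC of 0.5, so the blended error is larger and the resulting coefficient smaller. This illustrates, under the stated assumptions, why an AUC term can keep a classifier that ignores the minority class from receiving a large vote.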
Conflict of interest statement
The authors declare no conflict of interest.
Figures
![Figure 1](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g001.gif)
![Figure 2](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g002.gif)
![Figure 3](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g003.gif)
![Figure 4](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g004.gif)
![Figure 5](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g005.gif)
![Figure 6](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g006.gif)
![Figure 7](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g007.gif)
![Figure 8](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g008.gif)
![Figure 9](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g009.gif)
![Figure 10](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g010.gif)
![Figure 11](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g011.gif)
![Figure 12](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g012.gif)
![Figure 13](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g013.gif)
![Figure 14](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g014.gif)
![Figure 15](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g015.gif)
![Figure 16](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g016.gif)
![Figure 17](https://www.ncbi.nlm.nih.gov/pmc/articles/instance/6471212/bin/sensors-19-01476-g017.gif)