Improving the classification performance of biological imbalanced datasets by swarm optimization algorithms
Jinyan Li1; Simon Fong1; Sabah Mohammed2; Jinan Fiaidhi2
Source Publication: Journal of Supercomputing

Classification, a popular supervised machine learning method, has many applications in computational biology, where data samples are automatically assigned to predefined labels with the aid of data mining. Often the training samples contain very few instances of interest (e.g., medical anomalies, rare diseases in a population, and unusual syndromes) but many normal instances. Such an imbalanced distribution of data among the target labels hampers the efficacy of classification algorithms, because the induced model is trained on too few instances of the interesting label(s) and is overwhelmed by ordinary training records. Traditional remedies attempt to rebalance the data distributions of the target classes by artificially inflating the interesting instances, reducing the majority of common instances, or a combination of both. Though the fundamental concept is effective, there is no clear guideline on how to strike a balance between fabricating rare samples and reducing normal ones so as to maximize classification accuracy. In this paper, an optimization model using different swarm strategies (the Bat-inspired algorithm and PSO) is proposed for adaptively balancing the increase/decrease of the class distribution, depending on the properties of the biological datasets. The optimization is also extended to achieve the highest possible accuracy and Kappa statistic at the same time. The optimization model is tested on five imbalanced medical datasets, sourced from lung surgery logs and virtual screening of bioassay data. Computer simulation results show that the proposed optimization model outperforms other class balancing methods in medical data classification.
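
The abstract names two objectives (accuracy and the Kappa statistic) and two rebalancing knobs (how much to inflate the rare class and how much to reduce the common one). The following is a minimal sketch of those pieces only, not the authors' implementation: the function names, the parameters `over_rate` and `under_rate`, and the use of random duplication in place of a synthetic-sample generator are all assumptions made for illustration. A swarm optimizer such as PSO or the Bat algorithm would search over the two rates; that search loop is not shown here.

```python
import random
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label lists."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Agreement expected by chance, from the marginal label frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Degenerate case (a single label everywhere); reported as 0 here.
        return observed, 0.0
    return observed, (observed - expected) / (1.0 - expected)


def rebalance(samples, labels, minority, over_rate, under_rate, seed=0):
    """Hypothetical helper: duplicate minority rows and drop majority rows.

    over_rate  -- extra minority copies as a fraction of the minority count
                  (1.0 doubles the minority class). Assumed parameter.
    under_rate -- fraction of majority rows to keep. Assumed parameter.
    Returns a list of (sample, label) pairs.
    """
    rng = random.Random(seed)
    rare = [(s, l) for s, l in zip(samples, labels) if l == minority]
    common = [(s, l) for s, l in zip(samples, labels) if l != minority]
    # Random duplication stands in for synthetic minority sample generation.
    extra = [rng.choice(rare) for _ in range(int(over_rate * len(rare)))]
    kept = rng.sample(common, int(under_rate * len(common)))
    return rare + extra + kept


if __name__ == "__main__":
    # 5 rare cases among 100 records, and a model that always predicts "normal".
    y_true = [1] * 5 + [0] * 95
    y_pred = [0] * 100
    print(accuracy_and_kappa(y_true, y_pred))

    balanced = rebalance(list(range(100)), y_true, minority=1,
                         over_rate=2.0, under_rate=0.5)
    print(Counter(label for _, label in balanced))
```

In this toy case the always-"normal" predictor reaches an accuracy of 0.95 while its Kappa is 0, which illustrates why an optimizer that looks at accuracy alone can be misled on imbalanced data and why the abstract pairs the two measures.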

Keyword: Imbalanced Biological Data; Medical Classification; Parameter Optimization; Swarm Algorithm
Indexed By: SCI
WOS Research Area: Computer Science; Engineering
WOS Subject: Computer Science, Hardware & Architecture; Computer Science, Theory & Methods; Engineering, Electrical & Electronic
WOS ID: WOS:000385417400004
Source to Article: Scopus
Citation statistics
Cited Times [WOS]: 16
Document Type: Journal article
Corresponding Author: Jinyan Li; Simon Fong; Sabah Mohammed; Jinan Fiaidhi
Affiliation: 1. Department of Computer and Information Science, University of Macau, Taipa, Macau SAR
2. Department of Computer Science, Lakehead University, Thunder Bay, Ontario, Canada
First Author Affiliation: University of Macau
Corresponding Author Affiliation: University of Macau
Recommended Citation
GB/T 7714: Jinyan Li, Simon Fong, Sabah Mohammed, et al. Improving the classification performance of biological imbalanced datasets by swarm optimization algorithms[J]. Journal of Supercomputing, 2016, 72(10): 3708–3728.
APA: Jinyan Li, Simon Fong, Sabah Mohammed, & Jinan Fiaidhi. (2016). Improving the classification performance of biological imbalanced datasets by swarm optimization algorithms. Journal of Supercomputing, 72(10), 3708–3728.
MLA: Jinyan Li, et al. "Improving the classification performance of biological imbalanced datasets by swarm optimization algorithms." Journal of Supercomputing 72.10 (2016): 3708–3728.
