Negative samples reduction in cross-company software defects prediction
Lin Chen¹; Bin Fang¹; Zhaowei Shang¹; Yuanyan Tang¹,²
Source Publication: Information and Software Technology

Context: Software defect prediction has been widely studied using various machine-learning algorithms. Previous studies usually focus on within-company defects prediction (WCDP), but the lack of training data in the early stages of software testing limits the efficiency of WCDP in practice. Recent research has therefore largely examined cross-company defects prediction (CCDP) as an alternative solution.
Objective: However, the difference in distribution between cross-company (CC) data and within-company (WC) data usually makes it difficult to build a high-quality CCDP model. In this paper, a novel algorithm named Double Transfer Boosting (DTB) is introduced to narrow this gap and improve the performance of CCDP by reducing negative samples in CC data.
Method: The proposed DTB model integrates two levels of data transfer. First, the data gravitation method reshapes the whole distribution of CC data to fit WC data. Second, the transfer boosting method employs a small ratio of labeled WC data to eliminate negative instances in CC data.
Results: The empirical evaluation was conducted on 15 publicly available datasets. CCDP experiment results indicated that the proposed model achieved better overall performance than the CCDP models it was compared with. DTB was also compared to WCDP in two different situations. Statistical analysis suggested that DTB performed significantly better than WCDP models trained on limited samples and produced results comparable to WCDP with sufficient training data.
Conclusions: DTB reforms the distribution of CC data at different levels to improve the performance of CCDP, and experimental results and analysis demonstrate that it could be an effective model for early software defects detection.
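The two transfer levels named in the Method can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption: the weighting rule w = s / (k - s + 1)^2 is a data-gravitation-style rule of the kind used in earlier transfer-learning work on defect prediction, the down-weighting step is a simplified TrAdaBoost-style update, and the function names, the `beta` parameter, and the toy data are hypothetical. It is not the authors' DTB implementation.

```python
from typing import List, Sequence


def gravitation_weights(cc: Sequence[Sequence[float]],
                        wc: Sequence[Sequence[float]]) -> List[float]:
    """Weight each CC instance by how well it fits the WC attribute ranges.

    Assumed rule (not stated in the abstract): w = s / (k - s + 1)^2, where
    k is the number of attributes and s counts the attributes of the CC
    instance that fall inside the [min, max] range observed in the WC data.
    """
    k = len(wc[0])
    lows = [min(row[j] for row in wc) for j in range(k)]
    highs = [max(row[j] for row in wc) for j in range(k)]
    weights = []
    for row in cc:
        s = sum(1 for j in range(k) if lows[j] <= row[j] <= highs[j])
        weights.append(s / (k - s + 1) ** 2)
    return weights


def downweight_misfit_cc(weights: Sequence[float],
                         misclassified: Sequence[bool],
                         beta: float = 0.5) -> List[float]:
    """One simplified TrAdaBoost-style step (an assumption, not the paper's rule).

    CC instances that the current learner misclassifies are treated as poorly
    matched to the WC data, and their weights are multiplied by beta < 1.
    """
    return [w * beta if miss else w for w, miss in zip(weights, misclassified)]


# Toy data: two attributes per instance (hypothetical values).
wc_data = [[1.0, 10.0], [3.0, 20.0]]               # small labeled WC sample
cc_data = [[2.0, 15.0], [5.0, 12.0], [9.0, 99.0]]  # CC training instances

weights = gravitation_weights(cc_data, wc_data)
print(weights)  # [2.0, 0.25, 0.0]

# Pretend the second CC instance was misclassified in this boosting round.
updated = downweight_misfit_cc(weights, [False, True, False])
print(updated)  # [2.0, 0.125, 0.0]
```

In this toy run the CC instance that lies inside both WC attribute ranges receives the largest weight, and the instance outside both ranges receives zero weight, which mirrors the idea of reshaping CC data toward the WC distribution before boosting removes the remaining misfit instances.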

Keywords: Cross-company Defects Prediction; Software Fault Prediction; Transfer Learning
Indexed By: SCI
WOS Research Area: Computer Science
WOS Subject: Computer Science, Information Systems; Computer Science, Software Engineering
WOS ID: WOS:000353179000004
Source of Article: Scopus
Cited Times [WOS]: 52
Document Type: Journal article
Collection: University of Macau
Corresponding Author: Lin Chen; Bin Fang; Zhaowei Shang; Yuanyan Tang
Affiliation:
1. Department of Computer Science, Chongqing University, Chongqing 400030, China
2. Faculty of Science and Technology, University of Macau, Macau, China
Corresponding Author Affiliation: Faculty of Science and Technology
Recommended Citation:
GB/T 7714: Lin Chen, Bin Fang, Zhaowei Shang, et al. Negative samples reduction in cross-company software defects prediction[J]. Information and Software Technology, 2015, 62(1): 67-77.
APA: Lin Chen, Bin Fang, Zhaowei Shang, & Yuanyan Tang. (2015). Negative samples reduction in cross-company software defects prediction. Information and Software Technology, 62(1), 67-77.
MLA: Lin Chen, et al. "Negative samples reduction in cross-company software defects prediction". Information and Software Technology 62.1 (2015): 67-77.
