Aerial Root Classifiers for Predicting Missing Values in Data Stream Decision Tree Classification
Hang, Yang; Fong, Simon; Chen, Wei
2011-04-30
Conference Name: 2011 SIAM International Conference on Data Mining
Source Publication: Proceedings of 2011 SIAM International Conference on Data Mining
Conference Date: April 28-30, 2011
Conference Place: Mesa, Arizona, U.S.A.
Abstract

Data Stream Mining (DSM) is a new breed of data mining algorithms that handle continuous data streams and predict (or classify) a target value on the fly. Such data streams are inevitably prone to missing values. Common causes include a temporary malfunction of a sensor that feeds a continuous data stream and an interruption in the flow of data communication signals, either of which may give rise to missing data in the input of a data stream miner. Consequently, the missing data degrade the accuracy of the data stream miner. Several techniques exist for dealing with missing data in traditional data mining algorithms, such as setting a default value or eliminating the records that have missing data. Another classical technique is to estimate a missing value by computing the mean of all other values of the attribute. This does not work for DSM because training and testing are done dynamically over a moving stream of data rather than by scanning through a complete dataset (as in traditional data mining). Inspired by the aerial root in biology, we propose a method that combines the sliding window technique, feature selection, Hoeffding tree classification, and the adventitious root concept to deal with missing values. As a spontaneous sidekick to the main DSM classifier, the Aerial Root Classifier (ARC) is implemented with a sliding window for predicting missing values, and it may work even when concept drift happens. A row of ARCs and the HTA run in parallel, with each ARC corresponding to one attribute of the data stream. For efficient operation, only a subset of the ARCs is activated, chosen via dynamic feature selection. We built a Java-based simulation system for conducting experiments with various types of datasets. Improved accuracy was observed when applying the new ARC algorithm.
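The abstract's idea of one auxiliary predictor per attribute, each backed by a sliding window, can be illustrated with a minimal Python sketch. This is not the authors' ARC: the class name `WindowedImputer`, the `MISSING` marker, and the `window_size` parameter are hypothetical, and a simple most-frequent-value predictor stands in for the classifier described in the paper.

```python
from collections import Counter, deque

MISSING = None  # assumed marker for a missing attribute value


class WindowedImputer:
    """One auxiliary predictor per attribute, each with its own sliding window.

    Illustrative stand-in only: it predicts the most frequent recent value,
    whereas the abstract describes a classifier per attribute.
    """

    def __init__(self, n_attributes, window_size=100):
        # A bounded deque drops the oldest value once the window is full.
        self.windows = [deque(maxlen=window_size) for _ in range(n_attributes)]

    def fill(self, record):
        """Return a copy of `record` with missing values predicted.

        Observed values are added to the window of their attribute;
        missing values are never learned from.
        """
        filled = []
        for window, value in zip(self.windows, record):
            if value is MISSING:
                # Predict from recent history; stay missing if nothing was seen yet.
                if window:
                    value = Counter(window).most_common(1)[0][0]
            else:
                window.append(value)
            filled.append(value)
        return filled
```

Because each window is bounded, old values fall out as new records arrive, so the prediction follows the recent part of the stream instead of a mean computed over a complete dataset. That is the property that lets a windowed predictor keep working when the distribution of the stream changes.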

Language: English
Document Type: Conference paper
Collection: Faculty of Science and Technology
DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Affiliation: University of Macau
First Author Affiliation: University of Macau
Recommended Citation
GB/T 7714
Hang, Yang, Fong, Simon, Chen, Wei. Aerial Root Classifiers for Predicting Missing Values in Data Stream Decision Tree Classification[C], 2011.