JCP 2013 Vol.8(11): 2873-2879 ISSN: 1796-203X
doi: 10.4304/jcp.8.11.2873-2879
A Semi-supervised Ensemble Approach for Mining Data Streams
Jing Liu, Guo-sheng Xu, Da Xiao, Li-ze Gu, and Xin-xin Niu
Information Security Center, Beijing University of Posts and Telecommunications, Beijing 100876, China; National Engineering Laboratory for Disaster Backup and Recovery, Beijing University of Posts and Telecommunications, Beijing 100876, China
Abstract—Mining data streams poses several challenges, including infinite length, an evolving nature, and a lack of labeled instances. To address them, this paper presents a semi-supervised ensemble approach for mining data streams. Data streams are divided into data chunks to deal with the infinite length. An ensemble classification model E is trained on existing labeled data chunks, and a decision boundary is constructed using E to detect novel classes. New labeled data chunks are used to update E, while unlabeled ones are used to construct unsupervised models. Classes are predicted by a semi-supervised model Ex, which consists of E and the unsupervised models, in a consensus-maximization manner, so better performance can be achieved with limited labeled instances by using the constraints from the unsupervised models. Experiments on different datasets demonstrate that our method outperforms conventional methods in mining data streams.
Index Terms—data stream mining, semi-supervised learning, novel class, concept drifting
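The chunk-and-ensemble workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the nearest-centroid base learner, the per-class radius used as a decision boundary, the majority vote, and all names (`chunks`, `CentroidModel`, `Ensemble`, `max_models`) are assumptions chosen for illustration. In particular, plain majority voting stands in for the paper's consensus maximization over supervised and unsupervised models, which is not reproduced here.

```python
import math
from collections import Counter, defaultdict


def chunks(stream, size):
    """Split a (possibly unbounded) iterable into fixed-size chunks."""
    buf = []
    for item in stream:
        buf.append(item)
        if len(buf) == size:
            yield buf
            buf = []
    if buf:
        yield buf


class CentroidModel:
    """Hypothetical base learner: nearest centroid, trained on one labeled chunk."""

    def __init__(self, labeled_chunk):
        groups = defaultdict(list)
        for x, y in labeled_chunk:
            groups[y].append(x)
        self.centroids = {}
        self.radius = {}
        for y, xs in groups.items():
            c = tuple(sum(col) / len(xs) for col in zip(*xs))
            self.centroids[y] = c
            # Assumed decision boundary: farthest training point of the class.
            self.radius[y] = max(math.dist(x, c) for x in xs)

    def predict(self, x):
        return min(self.centroids, key=lambda y: math.dist(x, self.centroids[y]))

    def is_outlier(self, x):
        # Outside every class boundary of this model.
        return all(math.dist(x, c) > self.radius[y]
                   for y, c in self.centroids.items())


class Ensemble:
    """Keeps the most recent `max_models` chunk models (an assumed update policy)."""

    def __init__(self, max_models=3):
        self.max_models = max_models
        self.models = []

    def update(self, labeled_chunk):
        self.models.append(CentroidModel(labeled_chunk))
        self.models = self.models[-self.max_models:]

    def predict(self, x):
        # Flag a potential novel class if every model treats x as an outlier.
        if all(m.is_outlier(x) for m in self.models):
            return "novel"
        votes = Counter(m.predict(x) for m in self.models)
        return votes.most_common(1)[0][0]


# Toy labeled stream of (features, label) pairs, made up for this example.
stream = [((0.0, 0.0), "a"), ((0.2, 0.1), "a"), ((5.0, 5.0), "b"), ((5.1, 4.9), "b"),
          ((0.1, 0.0), "a"), ((0.0, 0.2), "a"), ((4.9, 5.2), "b"), ((5.0, 5.1), "b")]

ensemble = Ensemble()
for chunk in chunks(stream, 4):
    ensemble.update(chunk)

print(ensemble.predict((0.1, 0.1)))
print(ensemble.predict((10.0, -10.0)))
```

Keeping only the newest chunk models is one simple way to follow an evolving stream; a fuller treatment would also use the unlabeled chunks, which this sketch ignores.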
Cite: Jing Liu, Guo-sheng Xu, Da Xiao, Li-ze Gu, and Xin-xin Niu, "A Semi-supervised Ensemble Approach for Mining Data Streams," Journal of Computers, vol. 8, no. 11, pp. 2873-2879, 2013.
General Information
ISSN: 1796-203X
Abbreviated Title: J.Comput.
Frequency: Bimonthly
Editor-in-Chief: Prof. Liansheng Tan
Executive Editor: Ms. Nina Lee
Abstracting/Indexing: DBLP, EBSCO, ProQuest, INSPEC, ULRICH's Periodicals Directory, WorldCat, etc.