Volume 7 Number 12 (Dec. 2012)
JCP 2012 Vol.7(12): 2913-2920 ISSN: 1796-203X
doi: 10.4304/jcp.7.12.2913-2920

An Improved Random Forest Classifier for Text Categorization

Baoxun Xu1, Xiufeng Guo2, Yunming Ye1, Jiefeng Cheng3
1Shenzhen Graduate School, Harbin Institute of Technology, Shenzhen 518055, China
2Department of Computer Science, Henan Business College, Zhengzhou 450045, China
3Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China


Abstract—This paper proposes an improved random forest algorithm for classifying text data. The algorithm is designed for very high-dimensional, multi-class data, of which text corpora are a well-known example. A novel feature weighting method and a tree selection method are developed and used together to make the random forest framework well suited to categorizing text documents with dozens of topics. With the new feature weighting method for subspace sampling and the tree selection method, we can effectively reduce the subspace size and improve classification performance without increasing the error bound. We apply the proposed method to six text data sets with diverse characteristics. The results demonstrate that the improved random forest outperforms popular text classification methods in terms of classification performance.

Index Terms—Random forest, text categorization, random subspace, decision tree.
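
The abstract names two ingredients, weighted subspace sampling and tree selection, without giving their formulas. The following is only a minimal sketch of the general idea, not the authors' method. It assumes that a non-negative relevance weight per feature has already been computed by some means and that each tree has a quality score such as out-of-bag accuracy; the function names `sample_weighted_subspace` and `select_trees`, the `keep_fraction` parameter, and the example numbers are all hypothetical.

```python
import random


def sample_weighted_subspace(weights, size, rng):
    """Draw up to `size` distinct feature indices without replacement.

    Each draw picks a remaining feature with probability proportional
    to its weight (assumed to be a precomputed relevance score).
    """
    candidates = list(range(len(weights)))
    remaining = [float(w) for w in weights]
    chosen = []
    for _ in range(min(size, len(candidates))):
        if sum(remaining) <= 0:
            break  # only zero-weight features are left
        pick = rng.choices(range(len(candidates)), weights=remaining, k=1)[0]
        chosen.append(candidates.pop(pick))
        remaining.pop(pick)
    return chosen


def select_trees(tree_scores, keep_fraction):
    """Return the indices of the best-scoring trees, keeping at least one.

    `tree_scores` is assumed to hold one quality score per tree,
    for example out-of-bag accuracy.
    """
    keep = max(1, int(len(tree_scores) * keep_fraction))
    ranked = sorted(range(len(tree_scores)),
                    key=lambda i: tree_scores[i], reverse=True)
    return sorted(ranked[:keep])


if __name__ == "__main__":
    rng = random.Random(0)
    # Features with weight 0.0 are never drawn into the subspace.
    print(sample_weighted_subspace([0.0, 5.0, 1.0, 0.0, 3.0], 2, rng))
    print(select_trees([0.6, 0.9, 0.7, 0.8], 0.5))  # → [1, 3]
```

Biasing the draw toward informative features is one way a smaller subspace can still contain useful terms in very high-dimensional text data, and discarding low-scoring trees is one way to keep weak trees out of the final vote. How the paper actually computes the weights and scores is not stated in the abstract.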


Cite: Baoxun Xu, Xiufeng Guo, Yunming Ye, Jiefeng Cheng, "An Improved Random Forest Classifier for Text Categorization," Journal of Computers, vol. 7, no. 12, pp. 2913-2920, 2012.
