JCP 2013 Vol.8(1): 170-177 ISSN: 1796-203X
doi: 10.4304/jcp.8.1.170-177
A Novel Method for Disease Prediction: Hybrid of Random Forest and Multivariate Adaptive Regression Splines
Dengju Yao1, Jing Yang1, Xiaojuan Zhan2
1 College of Computer Science and Technology, Harbin Engineering University, Harbin Heilongjiang, China
2 Department of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin Heilongjiang, China
Abstract—Using data mining technology for disease prediction and diagnosis has become a focus of attention. Data mining provides an important means of extracting valuable medical rules hidden in medical data and plays an important role in disease prediction and clinical diagnosis. This paper surveys several popular data mining techniques for disease prediction and diagnosis, such as decision trees, association rule analysis, and clustering analysis. It then proposes a novel hybrid method of random forest and multivariate adaptive regression splines (MARS) for building disease prediction models. First, the random forest algorithm is used to perform a preliminary screening of variables and to obtain an importance ranking. Then, the new dataset formed from the top-k most important predictors is input into the MARS procedure, which is responsible for building interpretable models for predicting disease survivability. The capability of this combined method is evaluated using basic performance measurements (e.g., accuracy, sensitivity, and specificity) along with 10-fold cross-validation. Experimental results show that the proposed method provides higher accuracy and a relatively simple model.
Index Terms—data mining, medical data, random forest, multivariate adaptive regression splines
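The evaluation protocol named in the abstract (accuracy, sensitivity, and specificity under 10-fold cross-validation) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names `confusion_counts`, `basic_measures`, and `k_fold_indices` are hypothetical, labels are assumed to be binary with 1 as the positive class, and the folds are assumed to be simple contiguous blocks without shuffling or stratification.

```python
from typing import Dict, List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn), treating label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def basic_measures(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds and return (train, test) index lists."""
    # The first (n_samples % k) folds receive one extra sample each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


if __name__ == "__main__":
    # Toy labels, made up purely for illustration.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(basic_measures(y_true, y_pred))
    print(len(k_fold_indices(25, k=10)))
```

In a full run, each of the ten (train, test) pairs would be used to fit the model on the training indices and score it on the held-out indices, with the three measures averaged across folds. Shuffling or stratifying the samples before splitting is a common refinement that this sketch leaves out.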
Cite: Dengju Yao, Jing Yang, Xiaojuan Zhan, "A Novel Method for Disease Prediction: Hybrid of Random Forest and Multivariate Adaptive Regression Splines," Journal of Computers, vol. 8, no. 1, pp. 170-177, 2013.
General Information
ISSN: 1796-203X
Abbreviated Title: J.Comput.
Frequency: Bimonthly
Editor-in-Chief: Prof. Liansheng Tan
Executive Editor: Ms. Nina Lee
Abstracting/Indexing: DBLP, EBSCO, ProQuest, INSPEC, ULRICH's Periodicals Directory, WorldCat, etc.
E-mail: jcp@iap.org