JCP 2013 Vol.8(1): 85-90 ISSN: 1796-203X
doi: 10.4304/jcp.8.1.85-90
Ontology-Based Information Extraction of Crop Diseases on Chinese Web Pages
Bo Jiang, Meng-xia Zhu, Jia-le Wang
Department of Computer and Information Engineering, Zhejiang Gongshang University, Hangzhou, China
Abstract—This paper proposes a method for extracting crop disease information from Chinese web pages. First, we define a set of special labels in the DOM tree [1] to partition a web page into content blocks. Noise content is then eliminated according to the location and word count of each content block. We take an ontology-based approach to extracting information from the content blocks, and adopt a top-down method to construct the crop disease ontology. In the extraction process, the concepts, relations, and instances of the ontology are used to extract entities. Events are extracted by an optimal classification of paragraph groups within a content block. Experiments demonstrate that the performance of the proposed method is satisfactory.
Index Terms—Domain Ontology, Event Extraction, Web Page Partition
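The page partition and noise elimination steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the special DOM labels or the thresholds, so the tag set `BLOCK_TAGS`, the function names `partition` and `drop_noise`, and the parameters `min_chars` and `edge_min_chars` are all hypothetical. Because Chinese text is not space-delimited, character count stands in for word count here.

```python
from html.parser import HTMLParser

# Assumption: these tags delimit content blocks. The paper's actual
# "special labels" are not given in the abstract.
BLOCK_TAGS = {"div", "td", "p", "ul", "table"}
SKIP_TAGS = {"script", "style"}


class BlockPartitioner(HTMLParser):
    """Collects the text found between block-level tags as content blocks."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buf = []
        self._skip_depth = 0

    def _flush(self):
        # Close the current block; whitespace-only blocks are discarded.
        text = "".join(self._buf).strip()
        if text:
            self.blocks.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        # Ignore script and style bodies.
        if self._skip_depth == 0:
            self._buf.append(data)

    def close(self):
        super().close()
        self._flush()


def partition(html):
    """Split an HTML string into a list of content-block texts."""
    parser = BlockPartitioner()
    parser.feed(html)
    parser.close()
    return parser.blocks


def drop_noise(blocks, min_chars=10, edge_min_chars=30):
    """Filter blocks by location and length (hypothetical heuristic).

    The first and last blocks (typical navigation and footer positions)
    must reach edge_min_chars; all other blocks must reach min_chars.
    """
    kept = []
    last = len(blocks) - 1
    for i, text in enumerate(blocks):
        threshold = edge_min_chars if i in (0, last) else min_chars
        if len(text) >= threshold:
            kept.append(text)
    return kept
```

Calling `drop_noise(partition(page_html))` on a fetched page would return the candidate content blocks that a later ontology-based extraction step could consume. The thresholds would need tuning on real pages.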
Cite: Bo Jiang, Meng-xia Zhu, Jia-le Wang, "Ontology-Based Information Extraction of Crop Diseases on Chinese Web Pages," Journal of Computers, vol. 8, no. 1, pp. 85-90, 2013.
General Information
ISSN: 1796-203X
Abbreviated Title: J.Comput.
Frequency: Bimonthly
Editor-in-Chief: Prof. Liansheng Tan
Executive Editor: Ms. Nina Lee
Abstracting/Indexing: DBLP, EBSCO, ProQuest, INSPEC, ULRICH's Periodicals Directory, WorldCat, etc.
E-mail: jcp@iap.org