Volume 9 Number 4 (Apr. 2014)
JCP 2014 Vol.9(4): 867-874 ISSN: 1796-203X
doi: 10.4304/jcp.9.4.867-874

Recognizing Chinese Number and Quantifier Prefix to Enhance Statistical Parser in Machine Translation

Wen Xiong1, 2, Yaohong Jin1 and Zhiying Liu1
1Beijing Normal University, Institute of Chinese Information Processing, Beijing, China
2The China Patent Information Center, the State Intellectual Property Office, Beijing, China


Abstract—This paper proposes a rule-based method, independent of word segmentation, for recognizing Chinese number and quantifier prefixes (CNQPs) in Chinese-English machine translation, with the aim of improving the results of a statistical parser. First, it analyzes the components of CNQPs and offers samples of each component. It also supplies Backus-Naur Forms (BNFs) to express the more complex components, which are composed of other, smaller components. It then gives ten production rules for the recognition of CNQPs, together with an appendix listing 277 words for substance quantifiers, 17 words for action quantifiers, 13 words for time quantifiers, and 129 words for measurement quantifiers, which are important resources for the recognition method. Next, it describes the algorithm and illustrates the processing flow, which uses the components, the BNFs, and the ten rules. To avoid word segmentation noise, the algorithm takes the numeral as the active information and uses a forward maximum matching method to obtain the components of the CNQPs, which can be fed into the Chinese parser to improve the parsing results. Experimental results on manually constructed data indicate that the proposed method can be integrated into the statistical parser as a pre-processing module without retraining, which can further improve translation quality.
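
As a rough illustration of the numeral-triggered forward maximum matching described in the abstract, the following Python sketch scans raw text without prior word segmentation. It is a minimal sketch under stated assumptions and not the authors' implementation: the function name `find_cnqps`, the character set `NUMERALS`, and the tiny `QUANTIFIERS` lexicon are hypothetical stand-ins for the paper's appendix lists, and the ten production rules and BNFs are not modeled.

```python
# Hypothetical, tiny lexicons for illustration only; the paper's appendix
# lists several hundred quantifier words that are not reproduced here.
NUMERALS = set("零一二三四五六七八九十百千万亿两0123456789")
QUANTIFIERS = {"个", "本", "次", "年", "公里", "公斤", "平方米"}
MAX_Q_LEN = max(len(q) for q in QUANTIFIERS)


def find_cnqps(text):
    """Return (start, end, numeral, quantifier) tuples found in raw text."""
    spans = []
    i = 0
    while i < len(text):
        if text[i] not in NUMERALS:
            i += 1
            continue
        # The numeral starts a candidate: consume the whole run of numeral characters.
        j = i
        while j < len(text) and text[j] in NUMERALS:
            j += 1
        # Forward maximum matching: try the longest possible quantifier first.
        for length in range(min(MAX_Q_LEN, len(text) - j), 0, -1):
            candidate = text[j:j + length]
            if candidate in QUANTIFIERS:
                spans.append((i, j + length, text[i:j], candidate))
                j += length
                break
        i = j  # continue scanning after the numeral (and quantifier, if any)
    return spans


if __name__ == "__main__":
    for sentence in ["他买了三本书", "全长25公里", "面积一百平方米"]:
        print(sentence, find_cnqps(sentence))
```

Because matching starts at the numeral and works on characters, the sketch does not depend on a segmenter's output; a numeral that is not followed by a known quantifier simply yields no span.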

Index Terms—machine translation, statistical parser, Chinese number and quantifier prefix, Backus-Naur form


Cite: Wen Xiong, Yaohong Jin and Zhiying Liu, "Recognizing Chinese Number and Quantifier Prefix to Enhance Statistical Parser in Machine Translation," Journal of Computers, vol. 9, no. 4, pp. 867-874, 2014.

General Information

ISSN: 1796-203X
Abbreviated Title: J.Comput.
Frequency: Bimonthly
Editor-in-Chief: Prof. Liansheng Tan
Executive Editor: Ms. Nina Lee
Abstracting/Indexing: DBLP, EBSCO, ProQuest, INSPEC, Ulrich's Periodicals Directory, WorldCat, etc.
E-mail: jcp@iap.org