Volume 6 Number 2 (Feb. 2011)
JCP 2011 Vol.6(2): 178-183 ISSN: 1796-203X
doi: 10.4304/jcp.6.2.178-183

Time-Frequency Cepstral Features and Combining Discriminative Training for Phonotactic Language Recognition

Yan Deng, Wei-Qiang Zhang, Yan-Min Qian, and Jia Liu
Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China

Abstract—The performance of a phonotactic language recognition system depends on the quality of its phone recognizers. To improve the recognizers, this paper investigates new acoustic features and discriminative training techniques. The commonly used features are static cepstral coefficients appended with their first- and second-order deltas. This configuration may not be optimal for phone recognition in phonotactic language recognition systems. In this paper, a time-frequency cepstral (TFC) feature is proposed based on our previous work in acoustic language recognition systems. The feature is extracted as follows: first, a temporal discrete cosine transform (DCT) is carried out on the cepstrum matrix; then, the transformed elements in a specific area are selected using the variance maximization criterion. Different parameters are tested to obtain the optimal configuration. In addition, we adopt the feature minimum phone error (fMPE) method for discriminative training of the phone models to further improve phone recognition results. The effectiveness of the two techniques is demonstrated on the NIST Language Recognition Evaluation (LRE) 2007 database, including the 30-second, 10-second, and 3-second closed-set test conditions.

Index Terms—phonotactic language recognition, phone recognizer, time-frequency cepstrum (TFC), feature minimum phone error (fMPE)
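
The two extraction steps named in the abstract (a temporal DCT over the cepstrum matrix, followed by variance-based selection of transformed elements) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state the context length, the DCT normalization, or the shape of the selected area, so the sketch assumes an unnormalized DCT-II and simply keeps the positions with the highest variance across training blocks. All function names are hypothetical.

```python
import math


def dct_ii(x):
    """Unnormalized DCT-II of a sequence (normalization is an assumption)."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        for k in range(n)
    ]


def temporal_dct(block):
    """Apply the DCT along time for each cepstral coefficient.

    block: list of frames, each frame a list of cepstral coefficients.
    Returns a matrix with one row per coefficient and one column per
    temporal DCT index.
    """
    n_ceps = len(block[0])
    return [dct_ii([frame[i] for frame in block]) for i in range(n_ceps)]


def element_variances(matrices):
    """Per-position variance across a collection of equally sized matrices."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    m = len(matrices)
    var = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(cols):
            vals = [mat[i][k] for mat in matrices]
            mean = sum(vals) / m
            var[i][k] = sum((v - mean) ** 2 for v in vals) / m
    return var


def select_positions(var, n_keep):
    """Keep the n_keep positions with the largest variance.

    Assumption: a plain top-variance ranking stands in for the paper's
    "specific area", whose exact shape the abstract does not give.
    """
    positions = [(i, k) for i in range(len(var)) for k in range(len(var[0]))]
    positions.sort(key=lambda p: var[p[0]][p[1]], reverse=True)
    return sorted(positions[:n_keep])


def tfc_vector(block, positions):
    """Build one feature vector from a block of frames and fixed positions."""
    tf = temporal_dct(block)
    return [tf[i][k] for i, k in positions]
```

In use, `select_positions` would be run once on transformed training blocks, and the resulting positions reused by `tfc_vector` for every block, so that all feature vectors share the same layout.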


Cite: Yan Deng, Wei-Qiang Zhang, Yan-Min Qian, and Jia Liu, "Time-Frequency Cepstral Features and Combining Discriminative Training for Phonotactic Language Recognition," Journal of Computers, vol. 6, no. 2, pp. 178-183, 2011.
