Title: "Enhancing Arabic Phoneme Recognizer using Duration Modeling Techniques"

DOI: 10.15224/978-1-63248-113-9-53
Page(s): 69 - 73


In some languages, such as Classical Arabic (the language of the Holy Quran), phoneme duration is a distinguishing cue; this is the case in Quranic phonology. Phonological variation across phoneme occurrences contributes to inaccurate pronunciation of phonemes and therefore to an inaccurate automatic speech recognition (ASR) system, so good phoneme duration modeling is essential. Currently, the most effective models used in ASR systems are based on statistical approaches, namely the Hidden Markov Model (HMM). In the standard HMM speech recognition framework, duration information is poorly exploited. However, previous studies have demonstrated that using an HMM with explicit duration modeling techniques improves recognition performance in many targeted languages. This paper presents an important phase of our ongoing work, which aims to build an accurate Arabic recognizer for teaching and learning purposes. It presents an implementation of a Hidden semi-Markov Model (HsMM) whose main role is to enhance the duration behavior of the classical HMM. In this model, both Gamma and Gaussian distributions are used for modeling state durations and are compared with the standard geometric distribution. Experiments were conducted on a dedicated database of ten speakers and more than eight hours of speech, collected from recitations of the Holy Quran, in which all Classical Arabic sounds are covered. Results show an accuracy improvement of about 1% over the baseline HMM-based recognizer, which supports the suitability of the Gamma distribution for state duration modeling.
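To illustrate why the choice of duration distribution matters, the following minimal Python sketch compares the geometric duration distribution implied by an HMM self-transition with Gamma and Gaussian alternatives fitted to the same data. It is not the authors' implementation: the duration values are hypothetical frame counts, all function names are illustrative, and the moment-matching fit is an assumption, since the abstract does not state how the parameters were estimated.

```python
import math

def geometric_pmf(d, self_loop):
    """Duration pmf implied by an HMM state with self-transition probability self_loop."""
    return (self_loop ** (d - 1)) * (1.0 - self_loop)

def gamma_pdf(d, shape, scale):
    """Gamma density, computed in the log domain for numerical stability."""
    if d <= 0:
        return 0.0
    log_p = ((shape - 1.0) * math.log(d) - d / scale
             - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_p)

def gaussian_pdf(d, mean, std):
    """Gaussian density with the given mean and standard deviation."""
    z = (d - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mean_and_variance(durations):
    """Sample mean and (population) variance of a list of durations."""
    n = len(durations)
    mean = sum(durations) / n
    var = sum((x - mean) ** 2 for x in durations) / n
    return mean, var

def fit_gamma_by_moments(durations):
    """Moment matching (an assumed estimator): mean = shape * scale, var = shape * scale**2."""
    mean, var = mean_and_variance(durations)
    return mean * mean / var, var / mean

# Hypothetical state durations, in frames, for a single phoneme state.
durations = [4, 5, 5, 6, 6, 6, 7, 7, 8, 9]
mean, var = mean_and_variance(durations)
shape, scale = fit_gamma_by_moments(durations)

# Geometric distribution with the same mean: mean = 1 / (1 - self_loop).
self_loop = 1.0 - 1.0 / mean

for d in (1, 3, 6, 9, 12):
    print(d,
          round(geometric_pmf(d, self_loop), 4),
          round(gamma_pdf(d, shape, scale), 4),
          round(gaussian_pdf(d, mean, math.sqrt(var)), 4))
```

The geometric distribution always assigns its highest probability to a duration of one frame, whatever the self-transition probability, whereas the Gamma and Gaussian densities peak near the observed mean. This mismatch is the weakness of the standard HMM that explicit duration modeling in an HsMM is meant to address.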