Enhanced Arabic Human-Machine Dialogue Using a Two-Level Dynamic Programming Algorithm

Hilal Menzer; Samir Abdelhamid

doi:10.52919/translang.v24i01.1038

Published: Jun 30, 2025

Keywords:

Automatic Speech Recognition, Dynamic Programming, Machine Learning, Man-Machine Dialogue, Phonetic Decoder

Hilal Menzer

University of Batna 2

Samir Abdelhamid

University of Batna 2

Abstract

This paper presents a prototype man–machine dialogue system designed specifically for Arabic, addressing the growing need for voice-based interaction in under-resourced linguistic contexts. Arabic poses particular challenges for automatic speech recognition (ASR) and natural language processing (NLP), including phonetic complexity, the frequent omission of diacritical marks in written text, and the scarcity of annotated speech corpora. These factors have significantly impeded the development of robust Arabic voice interfaces. To address these limitations, the proposed system lets Arabic-speaking users make banking-related queries through voice commands on a smartphone interface. The system combines two complementary feature extraction techniques, Mel-Frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP), with a two-level dynamic programming algorithm that iteratively aligns acoustic feature vectors using Euclidean distance. To improve computational efficiency, phonemes are grouped into semantic classes, which reduces the search space. The knowledge base is organized into three core semantic categories (verbs, nouns, and digits), allowing concise, structured queries for account information, user identification, and confirmation tasks. A dedicated speech dataset was built from voice recordings of 20 native Arabic speakers (10 male, 10 female), who contributed spoken queries for both training and evaluation. The dataset was randomly partitioned into training (70%) and testing (30%) subsets with no overlap, to preserve the integrity of the evaluation. Experimental results show a sentence comprehension accuracy of 92.28% and a response generation accuracy of 91%, indicating the system's robustness and its potential for real-world deployment. This work offers a scalable framework for Arabic ASR and a foundation for future applications in robotics, customer service, and industrial voice interfaces.
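To make the alignment idea in the abstract more concrete, the following is a minimal sketch of a two-level dynamic programming match, not the authors' implementation. It assumes that feature vectors (for example MFCC or PLP frames) have already been extracted and are given as lists of floats. The function names `dtw_distance` and `decode_connected`, the toy templates, and the absence of slope or duration constraints are all illustrative assumptions. The first level aligns one utterance segment to one word template using Euclidean frame distance; the second level searches over segment boundaries for the word sequence with the lowest total cost.

```python
import math


def dtw_distance(a, b):
    """Level 1: dynamic time warping cost between two frame sequences.

    a and b are non-empty lists of feature vectors; the local cost is the
    Euclidean distance between frames.
    """
    n, m = len(a), len(b)
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            # Allowed moves: advance in a, advance in b, or advance in both.
            acc[i][j] = local + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def decode_connected(utterance, templates):
    """Level 2: best word sequence covering the whole utterance.

    best[end] holds the lowest total cost of explaining frames [0, end)
    as a concatenation of template words. This brute-force version tries
    every segment boundary and is only meant for tiny examples.
    """
    total = len(utterance)
    best = [math.inf] * (total + 1)
    back = [None] * (total + 1)
    best[0] = 0.0
    for end in range(1, total + 1):
        for start in range(end):
            for word, template in templates.items():
                cost = best[start] + dtw_distance(utterance[start:end], template)
                if cost < best[end]:
                    best[end] = cost
                    back[end] = (start, word)
    # Follow the back-pointers from the last frame to recover the words.
    words = []
    end = total
    while end > 0:
        start, word = back[end]
        words.append(word)
        end = start
    return words[::-1], best[total]


# Hypothetical two-dimensional "features" standing in for real MFCC/PLP frames.
toy_templates = {
    "one": [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
    "two": [[5.0, 5.0], [5.0, 4.0], [5.0, 3.0]],
}
toy_utterance = [
    [0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0],   # "one", spoken slowly
    [5.0, 5.0], [5.0, 4.0], [5.0, 4.0], [5.0, 3.0],   # "two", spoken slowly
]
decoded_words, decoded_cost = decode_connected(toy_utterance, toy_templates)
```

A practical system would restrict the candidate segment lengths and warping slopes, both to cut the search cost and to avoid implausible alignments; grouping the vocabulary into classes, as the abstract describes, would further shrink the set of templates tried at each position.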

How to Cite

Menzer, H., & Abdelhamid, S. (2025). Enhanced Arabic Human-Machine Dialogue Using a Two-Level Dynamic Programming Algorithm. Traduction Et Langues, 24(01), 327-348. https://doi.org/10.52919/translang.v24i01.1038

Issue

Vol. 24 No. 01 (2025): Traduction et Langues, Volume 24, Issue 01, June 2025

Section

Articles

Author Biographies

Hilal Menzer, University of Batna 2

Hilal Menzer is a PhD candidate in Industrial Engineering at the University of Batna 2. His research focuses on automatic Arabic language processing and human–machine dialogue systems. Specifically, he explores automatic speech recognition (ASR), phonetic feature extraction, and the application of dynamic programming algorithms to speech signal processing. He is particularly interested in improving voice interfaces for under-resourced linguistic environments, especially those involving colloquial Arabic.

Research Interests: Speech recognition, Arabic natural language processing (NLP), intelligent systems, machine learning for spoken language understanding.

Samir Abdelhamid, University of Batna 2

Samir Abdelhamid is a Professor in the Department of Industrial Engineering at the University of Batna 2, Algeria. He specializes in intelligent systems, natural language processing, and human–machine interaction. His research includes significant contributions to Arabic speech recognition and the development of computational tools for processing under-resourced languages. Prof. Abdelhamid is also actively involved in applying machine learning techniques to improve real-time communication systems and industrial automation.

Research Interests: Speech recognition, natural language processing, intelligent systems, Arabic language technologies, and machine learning.
