Software Products

1.    Corpora:

  • High-Quality Speech Corpus of a Part of the Holy Quran: Speech data has been collected from Quranic recitations by eleven chosen reciters (including Sheikh Ali Abdelrahman Alhudhaifi, a famous reciter in the Islamic world) and then accurately segmented and labeled on three levels: allophone, phoneme, and word. The phonetic transcription uses a labeling scheme that covers all Quranic sounds and their phonological variations. An appropriate textual version of the Holy Quran is used to build the text files that accompany the audio files.
  • Textual Corpus of the Holy Quran: A fully diacritized textual version of the Holy Quran has been prepared and then morphologically analyzed. Each word is split into four parts (prefix, suffix, stem, and root) and kept in its original context (the Quranic verse in which it occurs).
  • Small Corpus of Traditional Texts: Texts extracted from old books (255 Hijri). Although these texts are not very old (third century Hijri), their style may differ greatly from that of present-day Modern Standard Arabic (MSA). A manual morphological analysis of this corpus has been performed, followed by part-of-speech (POS) tagging.
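
As an illustration of the word-level structure described for the textual corpus (four parts per word, kept in verse context), one possible record layout is sketched below. This is only an assumption about how such entries could be stored: the class name `WordEntry`, its field names, the helper `words_with_root`, and the simplified transliterated sample values are all hypothetical and are not taken from the corpus itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordEntry:
    """One analyzed word plus a pointer to its verse (hypothetical layout)."""
    surface: str   # the word as it appears in the verse
    prefix: str    # may be empty
    stem: str
    suffix: str    # may be empty
    root: str
    sura: int      # chapter number
    ayah: int      # verse number within the chapter
    position: int  # 1-based index of the word within the verse


def words_with_root(entries, root):
    """Return all entries that share `root`, in verse order."""
    hits = [e for e in entries if e.root == root]
    return sorted(hits, key=lambda e: (e.sura, e.ayah, e.position))


# Illustrative sample only: simplified transliterations, no diacritics,
# case endings ignored. Not actual corpus data.
sample = [
    WordEntry("al-rahim", "al-", "rahim", "", "r-h-m", 1, 1, 4),
    WordEntry("al-hamd", "al-", "hamd", "", "h-m-d", 1, 2, 1),
    WordEntry("al-rahman", "al-", "rahman", "", "r-h-m", 1, 1, 3),
]

same_root = words_with_root(sample, "r-h-m")
print([e.surface for e in same_root])
```

Keeping the chapter, verse, and position on every entry is what lets a root lookup return words together with their original context, as the corpus description requires.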

2.    Software:

  • E-Halagat: an E-Learning System for Teaching the Holy Quran. A system devoted to teaching how to recite and memorize the noble Quran in a manner similar to the one traditionally followed in Quranic schools and study circles at mosques, known in Arabic as "halagat" (حلقات).
  • E-Taj: an E-Learning System for Tajweed. A system for self-learning of tajweed that allows fully dynamic interaction with the learner for practicing tajweed rules. It offers several options so that learners can get the maximum benefit.
  • Quranic Similarity Engine: A tool for determining the similarity (tashaboh) between verses (ayat) of the noble Quran.
  • Part-of-Speech Tagger for Arabic Texts: A tagging system that combines morphological analysis with Hidden Markov Models (HMMs) and uses Arabic sentence structure to determine the appropriate tags.
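
To give an idea of how an HMM can choose tags for a sentence, the sketch below applies standard Viterbi decoding to a first-order HMM. It is not the tagger's actual implementation: the function `viterbi`, the two-tag set, the transliterated unvocalized tokens, and every probability are invented for illustration, and the morphological analysis step is not modeled (the emission table simply stands in for it).

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence under a first-order HMM."""
    if not words:
        return []

    def lp(p):
        # Log-probability, with a small floor so unseen events do not give log(0).
        return math.log(max(p, floor))

    # best[i][t] = (log-prob of the best path ending in tag t at word i, previous tag)
    best = [{t: (lp(start_p.get(t, 0.0)) + lp(emit_p[t].get(words[0], 0.0)), None)
             for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            emit = lp(emit_p[t].get(words[i], 0.0))
            column[t] = max(
                (best[i - 1][p][0] + lp(trans_p[p].get(t, 0.0)) + emit, p)
                for p in tags
            )
        best.append(column)

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    return path[::-1]


# Toy model (all numbers are made up). "ktb" is ambiguous without diacritics:
# it could be read as a verb ("wrote") or a noun ("books").
tags = ["VERB", "NOUN"]
start_p = {"VERB": 0.6, "NOUN": 0.4}
trans_p = {"VERB": {"VERB": 0.1, "NOUN": 0.9},
           "NOUN": {"VERB": 0.4, "NOUN": 0.6}}
emit_p = {"VERB": {"ktb": 0.5, "drs": 0.5},
          "NOUN": {"ktb": 0.2, "al-wld": 0.4, "al-drs": 0.4}}

sentence = ["ktb", "al-wld", "al-drs"]
predicted = viterbi(sentence, tags, start_p, trans_p, emit_p)
print(list(zip(sentence, predicted)))
```

In this toy setting the transition table is where sentence structure enters: a high VERB-to-NOUN probability favors reading the ambiguous first word as a verb when nouns follow it.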

