Abstract | ||
---|---|---|
Maintaining topical coherence while writing a discourse is a major challenge for novice and experienced writers alike. The challenge is even greater for Arabic discourse because of the complex morphology and the widespread use of synonyms in the Arabic language. In this research, we present a framework that classifies an Arabic discourse document directly while it is being written. The proposed prescriptive framework consists of the following stages: data collection, pre-processing, construction of a Language Model (LM), topic identification, topic classification, and topic notification. To validate and demonstrate the proposed framework, we designed a system and applied it to a corpus of 2800 Arabic discourse documents organized into four predefined topics: Culture, Economy, Sport, and Religion. System performance was analysed in terms of accuracy, recall, precision, and F-measure. The results demonstrate that the proposed topic modeling-based decision framework is able to classify topics while a discourse is being written with an accuracy of 91.0%. |
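
To make the idea of classifying a document while it is being written more concrete, the following is a minimal sketch, not the authors' system. It assumes one add-one smoothed bigram language model per topic and scores each growing prefix of the text against every model. The class `TopicLM`, the helper functions, and the tiny English `CORPUS` are all hypothetical stand-ins; the paper's Arabic corpus, its pre-processing, and its actual model settings are not reproduced here.

```python
import math
from collections import Counter

# Hypothetical toy corpus (English placeholders, NOT the paper's Arabic data).
CORPUS = {
    "Sport": ["the team won the match", "the player scored a goal"],
    "Economy": ["the bank raised interest rates", "the market price of oil fell"],
}


class TopicLM:
    """Add-one smoothed bigram model for a single topic (illustrative only)."""

    def __init__(self, docs):
        self.context_counts = Counter()
        self.bigram_counts = Counter()
        for doc in docs:
            tokens = ["<s>"] + doc.split() + ["</s>"]
            # Count every token that is followed by another token.
            self.context_counts.update(tokens[:-1])
            self.bigram_counts.update(zip(tokens, tokens[1:]))

    def log_prob(self, text, vocab_size):
        # No "</s>" is appended: the text is assumed to be still in progress.
        tokens = ["<s>"] + text.split()
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            numerator = self.bigram_counts[(prev, cur)] + 1
            denominator = self.context_counts[prev] + vocab_size
            total += math.log(numerator / denominator)
        return total


def train(corpus):
    """Build one model per topic and return them with the shared vocabulary size."""
    vocab = {tok for docs in corpus.values() for doc in docs for tok in doc.split()}
    vocab.add("</s>")  # the end marker is also a possible next token
    models = {topic: TopicLM(docs) for topic, docs in corpus.items()}
    return models, len(vocab)


def classify(text, models, vocab_size):
    """Return the topic whose model gives the text the highest log-probability."""
    return max(models, key=lambda topic: models[topic].log_prob(text, vocab_size))


def classify_while_writing(text, models, vocab_size):
    """Classify every prefix of the text, one word at a time."""
    words = text.split()
    return [
        classify(" ".join(words[:i]), models, vocab_size)
        for i in range(1, len(words) + 1)
    ]


if __name__ == "__main__":
    models, vocab_size = train(CORPUS)
    print(classify_while_writing("the team scored a goal", models, vocab_size))
```

In a setup like this, a notification step could compare each prefix prediction with the topic the writer intends and flag the point where they diverge. Early prefixes carry little evidence, so their predictions may be ties or unstable.
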
Year | DOI | Venue |
---|---|---|
2020 | 10.34028/iajit/17/3/13 | INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY |
Keywords | DocType | Volume |
Text mining, Arabic discourse, text classification, topic modeling, n-gram language model, topical coherence | Journal | 17 |
Issue | ISSN | Citations |
3 | 1683-3198 | 0 |
PageRank | References | Authors |
0.34 | 0 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Khalid M. O Nahar | 1 | 2 | 2.07 |
Ra’ed M. Al-Khatib | 2 | 3 | 2.06 |
Moy'awiah Al-Shannaq | 3 | 0 | 0.34 |
Mohammad Daradkeh | 4 | 0 | 0.34 |
Rami Malkawi | 5 | 0 | 1.01 |