Title
Exploring and exploiting a historical corpus for Arabic
Abstract
This paper presents a historical Arabic corpus named HAC. At this early stage of the project, we report on the design, the architecture, and some of the experiments we have conducted on HAC. The corpus, and accordingly the search results, will be represented in a primary XML exchange format. This format will serve as an intermediate exchange tool within the project and will allow the user to process the results offline with external tools. HAC comprises Classical Arabic texts that cover 1600 years of language use, the Quranic text, Modern Standard Arabic texts, and a variety of monolingual Arabic dictionaries. The corpus helps linguists and learners of Arabic explore, understand, and discover knowledge hidden in millions of instances of language use. We used natural language processing techniques to process the data and a graph-based representation for the corpus. We also provide researchers with an export facility that makes further linguistic analysis possible.
Year
2016
DOI
10.1007/s10579-015-9304-9
Venue
Language Resources and Evaluation
Keywords
Historical Arabic corpus, Corpus tools, Natural language processing, Arabic word usage over time, Semantic change
Field
Early embryonic stage, Architecture, Arabic, XML, Classical Arabic, Computer science, Speech recognition, Modern Standard Arabic, Artificial intelligence, Natural language processing, Corpus linguistics, Semantic change
DocType
Journal
Volume
50
Issue
4
ISSN
1574-0218
Citations
0
PageRank
0.34
References
13
Authors
4
Name                        Order  Citations  PageRank
Bassam Hammo                1      55         6.46
Sane M. Yagi                2      0          0.34
Omaima Ismail               3      0          0.34
Mohammad A. M. Abushariah   4      47         6.02