Title
Studying the impact of language-independent and language-specific features on hybrid Arabic Person name recognition.
Abstract
In this paper, extensive experiments are conducted to study the impact of features of different categories, in isolation and gradually in an incremental manner, on Arabic Person name recognition. We present an integrated system that combines the rule-based approach with the machine learning (ML)-based approach in order to develop a consolidated hybrid system. Our feature space comprises language-independent and language-specific features. The explored features are naturally grouped under six categories: Person named entity tags predicted by the rule-based component, word-level features, POS features, morphological features, gazetteer features, and other contextual features. As the decision tree algorithm has shown comparatively higher efficiency as a classifier in current state-of-the-art hybrid Named Entity Recognition for Arabic, it is adopted in this study as the ML technique utilized by the hybrid system. Therefore, the experiments focus on two dimensions: the standard dataset used and the set of selected features. A number of standard datasets are used for the training and testing of the hybrid system, including ACE (2003-2004) and ANERcorp. The experimental analysis indicates that both language-independent and language-specific features play an important role in overcoming the challenges posed by the Arabic language and have a critical impact on optimizing the performance of the hybrid system.
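To make the hybrid feature space described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It shows how a tag predicted by a rule-based component could be placed alongside word-level, gazetteer, and contextual features in one feature dictionary per token, which a decision tree learner could then consume. The gazetteer, the trigger words, the rule itself, and all function names are hypothetical and chosen only for illustration. POS and morphological features are omitted because they would require an external Arabic tagger.

```python
from typing import Dict, List, Union

# Hypothetical toy resources (assumptions, not taken from the paper).
PERSON_GAZETTEER = {"محمد", "خالد", "مي"}
PERSON_TRIGGERS = {"السيد", "الدكتور", "الرئيس"}

Feature = Union[str, int, bool]


def rule_based_tag(tokens: List[str], i: int) -> str:
    """Toy rule: PERSON if the token is in the gazetteer or directly
    follows a trigger word (e.g. an honorific); otherwise O."""
    if tokens[i] in PERSON_GAZETTEER:
        return "PERSON"
    if i > 0 and tokens[i - 1] in PERSON_TRIGGERS:
        return "PERSON"
    return "O"


def extract_features(tokens: List[str], i: int, window: int = 1) -> Dict[str, Feature]:
    """Build one feature dictionary for the token at position i."""
    word = tokens[i]
    feats: Dict[str, Feature] = {
        # Tag predicted by the (toy) rule-based component.
        "rule_tag": rule_based_tag(tokens, i),
        # Language-independent word-level features.
        "length": len(word),
        "prefix1": word[:1],
        "suffix1": word[-1:],
        # Language-specific cue: surface check for the Arabic definite article.
        "has_al_prefix": word.startswith("ال"),
        # Gazetteer membership.
        "in_gazetteer": word in PERSON_GAZETTEER,
    }
    # Contextual features: neighbouring words within the window.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        feats[f"word[{off:+d}]"] = tokens[j] if 0 <= j < len(tokens) else "<PAD>"
    return feats


if __name__ == "__main__":
    sentence = ["قال", "الدكتور", "خالد", "اليوم"]
    for idx in range(len(sentence)):
        print(extract_features(sentence, idx))
```

Dropping or adding keys in the dictionary is one simple way to mimic the paper's idea of evaluating feature categories in isolation and incrementally; the resulting dictionaries would still need to be vectorized before being passed to a classifier.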
Year: 2017
DOI: 10.1007/s10579-016-9376-1
Venue: Language Resources and Evaluation
Keywords: Named entity recognition, Information extraction, Rule-based approach, Machine learning, Hybrid approach, Natural language processing
Field: Feature vector, Arabic, Computer science, Named entity, Speech recognition, Information extraction, Artificial intelligence, Natural language processing, Classifier (linguistics), Named-entity recognition, Hybrid system, Decision tree learning
DocType: Journal
Volume: 51
Issue: 2
ISSN: 1574-020X
Citations: 0
PageRank: 0.34
References: 35
Authors: 2
Name | Order | Citations | PageRank
Mai Oudah | 1 | 40 | 3.01
Khaled F. Shaalan | 2 | 506 | 39.80