Title
Improved Letter Weighting Feature Selection on Arabic Script Language Identification
Abstract
Language identification is the process of automatically identifying which of a set of predefined languages a document is written in; this paper focuses on web documents. Initially, we applied letter frequencies as features, combined with neural networks, to Arabic script language identification. However, the reliability of the letters selected as features is a major issue to be overcome. Therefore, we propose an improved letter weighting feature selection method to enhance the effectiveness of language identification. It is based on the concept of letter frequency document frequency. In our experiments, the improved letter weighting feature selection achieved the highest accuracy, 99.75%, on Arabic script language identification.
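The abstract does not give the weighting formula, so the following is only a minimal sketch of one plausible reading of "letter frequency document frequency": each letter is scored by its relative frequency across a language's training documents multiplied by the fraction of those documents that contain it, and the top-scoring letters are kept as features. The product form and the function names `letter_weights` and `select_letters` are assumptions made for illustration, not the authors' published method.

```python
from collections import Counter


def letter_weights(documents):
    """Score letters as (relative letter frequency) * (document fraction).

    ASSUMPTION: the product form is an illustrative guess; the abstract
    names the concept but does not state the actual formula.
    """
    letter_counts = Counter()  # total occurrences of each letter
    doc_counts = Counter()     # number of documents containing each letter
    for doc in documents:
        # str.isalpha() is True for Arabic letters and False for digits,
        # punctuation, whitespace, and combining diacritics.
        letters = [ch for ch in doc if ch.isalpha()]
        letter_counts.update(letters)
        doc_counts.update(set(letters))

    total_letters = sum(letter_counts.values())
    n_docs = len(documents)
    if total_letters == 0 or n_docs == 0:
        return {}

    return {
        ch: (letter_counts[ch] / total_letters) * (doc_counts[ch] / n_docs)
        for ch in letter_counts
    }


def select_letters(documents, k):
    """Return the k highest-weighted letters (ties broken by code point)."""
    weights = letter_weights(documents)
    return sorted(weights, key=lambda ch: (-weights[ch], ch))[:k]
```

Under this reading, a letter that is frequent but confined to a few documents scores lower than one that is both frequent and widespread, which is one way to address the reliability concern the abstract raises. The selected letters' frequencies would then serve as the input vector for a classifier such as the neural network the abstract mentions.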
Year
2009
DOI
10.1109/ACIIDS.2009.33
Venue
ACIIDS
Keywords
neural nets,computational modeling,information systems,testing,natural languages,information retrieval,frequency,data mining,feature extraction,encoding,neural network,language identification,natural language processing,neural networks,accuracy,computer science,scripting language,feature selection,database systems
Field
Data mining,Weighting,Feature selection,Computer science,Natural language processing,Artificial intelligence,Artificial neural network,Arabic script,Letter frequency,Feature extraction,Speech recognition,Language identification,Document handling,Machine learning
DocType
Conference
Citations
2
PageRank
0.36
References
11
Authors
2
Name | Order | Citations | PageRank
Choon-Ching Ng | 1 | 39 | 6.64
Ali Selamat | 2 | 717 | 77.40