Document Similarity for Arabic and Cross-Lingual Web Content

Paper Info

Title
Document Similarity for Arabic and Cross-Lingual Web Content

Abstract
Document similarity is fundamental to Information Retrieval. Cross-lingual (CL) similarity is important for many data processing tasks, such as CL plagiarism detection and retrieval and document quality assessment. We study CL similarity based on Explicit Semantic Association (ESA), adapted to a cross-lingual setting with a focus on Arabic. We compare how well CL similarity testing performs when one of the languages is Arabic against its monolingual counterpart, for various text chunk sizes. We describe the infrastructure used, report some of the testing results, examine possible sources of the weaknesses encountered, and point to possible directions for improvement.

Year	2017
DOI	10.1007/978-3-319-73500-9_10
Venue	Communications in Computer and Information Science
Keywords	Cross lingual information retrieval, Document similarity, Explicit Semantic Association, CL-ESA, Arabic information retrieval
DocType	Conference
Volume	782
ISSN	1865-0929
Citations	0
PageRank	0.34
References	0
Authors	2

Authors (2 rows)

Name	Order	Citations	PageRank
Ali Salhi	1	1	0.69
Adnan H. Yahya	2	0	0.34
