Title
Detecting hidden structures from Arabic electronic documents: Application to the legal field
Abstract
Dealing with unstructured information is currently an active research topic, since most documents exist in unstructured form. The effective exploitation of unstructured documents, although intricate, is of paramount importance to Information Retrieval (IR). The key to using an unstructured data set is to identify the hidden structures within it. In this paper, we present an approach to recognizing the semantic structure of Arabic legal documents. This structure expresses several main concepts of a document, including the title and the headings of chapters, sections, subsections, etc. This structural information is used to obtain a richer and more fine-grained annotation of documents, forming a useful and coherent infrastructure ready for IR. Experiments were conducted to evaluate our approach, and the initial results seem promising.
Year
2016
DOI
10.1109/SERA.2016.7516131
Venue
2016 IEEE 14th International Conference on Software Engineering Research, Management and Applications (SERA)
Keywords
document ontology, document structure extraction, document annotation, legal information retrieval
Field
Data mining, Well-formed document, Document clustering, Computer science, Unstructured data, Artificial intelligence, Natural language processing, Document retrieval, Ontology (information science), Annotation, Information retrieval, Legal information retrieval, Semantics
DocType
Conference
ISBN
978-1-5090-0810-0
Citations
1
PageRank
0.36
References
7
Authors
2
Name, Order, Citations, PageRank
Imen Bouaziz Mezghanni, 1, 7, 2.27
Faïez Gargouri, 2, 244, 92.29