Title
Automatic Processing for the Migration of Digital Documents to XML (Traitements Automatiques pour la Migration de Documents Numériques vers XML)
Abstract
More and more companies are migrating their legacy document management systems toward the XML format, the industrial standard for data exchange. To reduce the migration cost, we propose an approach that automates the conversion of layout-oriented documents to semantic-oriented annotations. The conversion module uses supervised machine learning techniques to learn a conversion model for a collection of documents. The conversion is achieved by semantically annotating the document content and structuring the annotations according to an XML schema that specifies the class of target documents.
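The two stages named in the abstract (semantic annotation, then structuring into XML) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the layout features, the labels, the training pairs, and the flat `article` target structure are hypothetical, and the majority-vote "model" is only a toy stand-in for whatever supervised learner the authors actually use. The output is not validated against an XML schema.

```python
from collections import Counter, defaultdict
import xml.etree.ElementTree as ET

# Hypothetical training data: (layout feature, semantic label) pairs.
TRAINING = [
    ("bold-18pt", "title"),
    ("bold-18pt", "title"),
    ("italic-10pt", "author"),
    ("regular-10pt", "paragraph"),
    ("regular-10pt", "paragraph"),
    ("regular-10pt", "author"),
]


def learn_model(examples):
    """Learn a layout-feature -> label map by majority vote (toy learner)."""
    counts = defaultdict(Counter)
    for feature, label in examples:
        counts[feature][label] += 1
    return {feature: c.most_common(1)[0][0] for feature, c in counts.items()}


def annotate(model, fragments, default="paragraph"):
    """Attach a semantic label to each (layout feature, text) fragment."""
    return [(model.get(feature, default), text) for feature, text in fragments]


def structure(annotations, root_tag="article"):
    """Arrange annotated fragments as children of a single root element."""
    root = ET.Element(root_tag)
    for label, text in annotations:
        ET.SubElement(root, label).text = text
    return root


def convert(fragments):
    """Run the full sketch: learn, annotate, structure, serialize."""
    model = learn_model(TRAINING)
    tree = structure(annotate(model, fragments))
    return ET.tostring(tree, encoding="unicode")


if __name__ == "__main__":
    page = [
        ("bold-18pt", "A Title"),
        ("italic-10pt", "J. Doe"),
        ("regular-10pt", "Body text."),
    ]
    print(convert(page))
```

A real converter would replace the lookup table with a trained classifier over richer layout and content features, and would check the resulting tree against the target schema.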
Year
2006
DOI
10.3166/dn.9.1.9-24
Venue
Document Numérique
Keywords
XML, information extraction, supervised learning, machine learning, document management, XML schema, data exchange
Field
XML, Computer science, Document Structure Description, Electronic document, Humanities, XML schema, Automatic processing, Linguistics, Markup language
DocType
Journal
Volume
9
Issue
1
Citations
0
PageRank
0.34
References
10
Authors
3
Name                   Order  Citations  PageRank
Jérôme Fuselier        1      28         3.63
Boris Chidlovskii      2      411        52.58
Domaine Universitaire  3      19         3.45