Title
Boosting text segmentation via progressive classification
Abstract
A novel approach for reconciling tuples stored as free text into an existing attribute schema is proposed. The basic idea is to subject the available text to progressive classification, i.e., a multi-stage classification scheme where, at each intermediate stage, a classifier is learnt that analyzes the textual fragments not reconciled at the end of the previous steps. Classification is accomplished by an ad hoc exploitation of traditional association mining algorithms, and is supported by a data transformation scheme which takes advantage of domain-specific dictionaries/ontologies. A key feature is the capability of progressively enriching the available ontology with the results of the previous stages of classification, thus significantly improving the overall classification accuracy. An extensive experimental evaluation shows the effectiveness of our approach.
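The multi-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stage functions, the attribute names (STREET, NUMBER, CITY) and the hand-written context rule are all hypothetical stand-ins for the learnt association-rule classifiers the paper uses. It only shows the control flow: each stage looks at the tokens left unlabelled by earlier stages, and every new label is written back into the dictionary so that later stages and later texts can use it.

```python
from typing import Callable, Dict, List, Optional

Dictionary = Dict[str, str]  # lower-cased token -> attribute name (hypothetical schema)
Stage = Callable[[List[str], int, Dict[int, str], Dictionary], Optional[str]]


def dictionary_stage(tokens, i, labels, dictionary):
    """Stage 1 (assumed): label a token only if the dictionary already knows it."""
    return dictionary.get(tokens[i].lower())


def context_stage(tokens, i, labels, dictionary):
    """Stage 2 (assumed): a hand-written context rule standing in for learnt rules."""
    # A number immediately before a STREET token is treated as a house NUMBER.
    if tokens[i].isdigit() and labels.get(i + 1) == "STREET":
        return "NUMBER"
    # An unknown token right after a STREET token is assumed to continue the street.
    if labels.get(i - 1) == "STREET":
        return "STREET"
    return None


def progressive_classify(tokens: List[str], stages: List[Stage],
                         dictionary: Dictionary) -> Dict[int, str]:
    """Run the stages in order; each one sees only tokens still unlabelled."""
    labels: Dict[int, str] = {}
    for stage in stages:
        pending = [i for i in range(len(tokens)) if i not in labels]
        for i in pending:
            label = stage(tokens, i, labels, dictionary)
            if label is not None:
                labels[i] = label
                # Enrichment step: remember the token for later stages and texts.
                dictionary.setdefault(tokens[i].lower(), label)
    return labels
```

For example, with the tokens `["12", "via", "Roma", "Cosenza"]` and a starting dictionary that knows only `via` (STREET) and `cosenza` (CITY), the first stage labels two tokens and the second stage labels the remaining two from their context. Afterwards the dictionary also contains `roma`, so a later text mentioning it is resolved by the first stage alone.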
Year: 2008
DOI: 10.1007/s10115-007-0085-3
Venue: Knowl. Inf. Syst.
Keywords: schema reconciliation, text segmentation, classification, novel approach, progressive classification, previous step, available ontology, multi-stage classification scheme, previous stage, free text, overall classification accuracy, available text, boosting text segmentation, data transformation scheme, data transformation
Field: Ontology (information science), Ontology, Data mining, Tuple, Segmentation, Computer science, Supervised learning, Text segmentation, Artificial intelligence, Boosting (machine learning), Classifier (linguistics), Machine learning
DocType: Journal
Volume: 15
Issue: 3
ISSN: 0219-3116
Citations: 15
PageRank: 0.61
References: 16
Authors: 5
Name               Order  Citations  PageRank
Eugenio Cesario    1      188        21.63
Francesco Folino   2      202        21.57
Antonio Locane     3      18         1.45
Giuseppe Manco     4      918        68.94
Riccardo Ortale    5      282        27.46