Title
From the Texts to the Contexts They Contain: A Chain of Linguistic Treatments
Abstract
The text-mining system we are building deals with the specific problem of identifying the instances of relevant concepts present in the texts. Our system therefore relies on interactions between a field expert and the various linguistic modules we use, often adapted from existing ones, such as Brill’s tagger or CMU’s Link parser. We have developed learning procedures adapted to various steps of the linguistic treatment, mainly for grammatical tagging, terminology, and concept learning. Our interaction with the expert differs from classical supervised learning, in that the expert is not treated simply as a resource who can only provide examples and cannot provide the formalized knowledge underlying these examples. We are developing specific programming languages which enable the field expert to intervene directly in some of the linguistic tasks. Our approach is thus devoted to helping one expert in one field detect the concepts relevant to his/her field, using a large number of texts. Our approach consists of two steps. The first is an automatic approach that finds relevant and novel sentences in the texts. The second is based on the expert’s knowledge and finds more specific relevant sentences. Working on 50 different domains without an expert has been a challenge in itself, and explains our relatively poor results for the first Novelty task.
Year: 2004
Venue: TREC
Keywords: concept learning, supervised learning, programming language, text mining
Field: Computer science, Natural language processing, Artificial intelligence, Linguistics
DocType: Conference
Citations: 0
PageRank: 0.34
References: 9
Authors: 5
Name | Order | Citations | PageRank
Ahmed Amrani | 1 | 10 | 2.28
Jérôme Azé | 2 | 73 | 15.66
Thomas Heitz | 3 | 0 | 1.01
Yves Kodratoff | 4 | 581 | 172.25
Mathieu Roche | 5 | 96 | 24.74