Paper Info

Title
Textractor: A Framework for Extracting Relevant Domain Concepts from Irregular Corporate Textual Datasets

Abstract
Various information extraction (IE) systems for corporate usage exist. However, none of them target the product development and/or customer service domain, despite its significant application potential and benefits. This domain also poses new scientific challenges, such as the lack of external knowledge resources, and irregularities like ungrammatical constructs in textual data, which compromise successful information extraction. To address these issues, we describe the development of Textractor, an application for accurately extracting relevant concepts from irregular textual narratives in datasets of product development and/or customer service organizations. The extracted information can subsequently be fed to a host of business intelligence activities. We present novel algorithms, combining both statistical and linguistic approaches, for the accurate discovery of relevant domain concepts from highly irregular/ungrammatical texts. Evaluations on real-life corporate data revealed that Textractor extracts domain concepts, realized as single or multi-word terms in ungrammatical texts, with high precision.

Year	2010
DOI	10.1007/978-3-642-12814-1_7
Venue	Lecture Notes in Computer Science
Keywords	Natural language processing, term extraction, information extraction, corporate industrial data, product development, customer service
Field	Customer service, Computer science, Narrative, Information extraction, Compromise, Business intelligence, Marketing, New product development
DocType	Conference
Volume	47
ISSN	1865-1348
Citations	4
PageRank	0.43
References	12
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Ashwin Ittoo	1	61	6.58
Laura Maruster	2	942	55.97
Hans Wortmann	3	77	11.13
Gosse Bouma	4	483	70.88
