Extracting Precise Link Context Using NLP Parsing Technique - Citegraph

Paper Info

Title
Extracting Precise Link Context Using NLP Parsing Technique

Abstract
Link context has been exploited extensively ever since the advent of the World Wide Web, but approaches to extracting precise link context have not been fully explored, and many state-of-the-art extraction methods are based on simplistic heuristics and require ad-hoc parameters. In this paper, we propose a novel two-step extraction model, which aims to systematically derive link context of a quality as high as that of anchor text. In the macroscopic analysis step, a systematic web page structure analysis is performed to locate the content-cohesive text region and potentially relevant header or header-like tags. In the microscopic extraction step, an English parser is used to extract the relevant sentence fragments in the text region, and the nearest heading text is included if the need arises. Preliminary experimental results demonstrate the effectiveness of our approach.
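
The two-step idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name LinkContextParser, the function extract_link_context, the choice of block and heading tags, and the sample URL are all assumptions made for illustration. In particular, the microscopic step below uses a naive regular-expression sentence split as a stand-in for the English parser the paper describes, and it always returns the nearest preceding heading rather than deciding whether it is needed.

```python
import re
from html.parser import HTMLParser

# Assumed tag sets for this sketch; the paper's structure analysis is richer.
BLOCK_TAGS = {"p", "li", "td", "div"}
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class LinkContextParser(HTMLParser):
    """Macroscopic step (sketch): find the text block that encloses the
    target link and the nearest heading that precedes it."""

    def __init__(self, target_href):
        super().__init__()
        self.target_href = target_href
        self.last_heading = ""       # most recent completed heading text
        self.link_heading = ""       # heading in effect when the link was seen
        self.heading_buf = []
        self.in_heading = False
        self.block_text = []         # text pieces of the current block
        self.block_has_link = False
        self.in_anchor = False
        self.anchor_text = []
        self.result = None           # (heading, block text, anchor text)

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.in_heading = True
            self.heading_buf = []
        elif tag in BLOCK_TAGS:
            self._close_block()      # a new block ends the previous one
        elif tag == "a" and dict(attrs).get("href") == self.target_href:
            self.in_anchor = True
            self.block_has_link = True
            self.link_heading = self.last_heading

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self.in_heading = False
            self.last_heading = " ".join("".join(self.heading_buf).split())
        elif tag in BLOCK_TAGS:
            self._close_block()
        elif tag == "a":
            self.in_anchor = False

    def handle_data(self, data):
        if self.in_heading:
            self.heading_buf.append(data)
            return
        self.block_text.append(data)
        if self.in_anchor:
            self.anchor_text.append(data)

    def _close_block(self):
        # Keep only the first block that contains the target link.
        if self.block_has_link and self.result is None:
            text = " ".join("".join(self.block_text).split())
            anchor = " ".join("".join(self.anchor_text).split())
            self.result = (self.link_heading, text, anchor)
        self.block_text = []
        self.block_has_link = False


def extract_link_context(html, target_href):
    """Microscopic step (sketch): pick the sentence around the anchor text.
    A regex sentence split replaces the English parser used in the paper."""
    parser = LinkContextParser(target_href)
    parser.feed(html)
    parser.close()
    parser._close_block()            # flush a block left open at end of input
    if parser.result is None:
        return None
    heading, block, anchor = parser.result
    sentences = re.split(r"(?<=[.!?])\s+", block)
    sentence = next((s for s in sentences if anchor and anchor in s), block)
    return {"heading": heading, "sentence": sentence, "anchor": anchor}


if __name__ == "__main__":
    sample = (
        "<h2>Web Mining Tools</h2>"
        "<p>Many crawlers exist. The <a href='http://example.org/focus'>"
        "focused crawler</a> fetches only topical pages.</p>"
    )
    print(extract_link_context(sample, "http://example.org/focus"))
```

Splitting the work this way keeps the page-level decision (which region and heading belong to the link) separate from the sentence-level decision (which fragment of that region to keep), which mirrors the macroscopic and microscopic steps named in the abstract.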

Year	DOI	Venue
2004	10.1109/WI.2004.10164	Web Intelligence
Keywords	Field	DocType
link context,state-of-the-art extraction method,text region,derive link context,extracting precise link context,nlp parsing technique,novel two-step extraction model,microscopic extraction step,precise link context,macroscopic analysis step,content cohesive text region,anchor text,web pages,information retrieval,html,structure analysis,microscopy,context modeling,world wide web,data mining	Data mining,Web page,Computer science,Context model,Anchor text,Web modeling,Heuristics,Natural language processing,Artificial intelligence,Header,Information retrieval,Parsing,Sentence	Conference
ISBN	Citations	PageRank
0-7695-2100-2	2	0.40
References	Authors
14	2

Authors (2 rows)

Name	Order	Citations	PageRank
Qingyang Xu	1	15	2.85
Wanli Zuo	2	342	42.73
