Title
Research on discovering deep web entries.
Abstract
Ontology plays an important role in locating domain-specific Deep Web content. This paper therefore presents WFF, a novel framework for efficiently locating domain-specific Deep Web databases. WFF combines focused crawling with ontology and arranges three classifiers hierarchically: a Web Page Classifier (WPC), a Form Structure Classifier (FSC), and a Form Content Classifier (FCC). First, WPC discovers potentially interesting pages using an ontology-assisted focused crawler. Then, FSC analyzes those pages and determines, from structural characteristics, whether they contain searchable forms. Finally, FCC identifies the searchable forms that belong to a given domain at the semantic level and stores the URLs of these domain-specific searchable forms in a database. A detailed experimental evaluation shows that the WFF framework not only simplifies the discovery process but also effectively identifies domain-specific databases.
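The three-stage cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how each classifier works, so simple keyword and structure heuristics stand in for WPC, FSC, and FCC here. The vocabulary `DOMAIN_TERMS` (a stand-in for an ontology), the thresholds, and all function names are assumptions made for illustration.

```python
import re
from html.parser import HTMLParser

# Hypothetical domain vocabulary standing in for an ontology (assumption).
DOMAIN_TERMS = {"book", "author", "title", "isbn", "publisher"}


class FormExtractor(HTMLParser):
    """Collects, for each <form>, its field types and the words it contains."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per completed form
        self._current = None  # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"types": [], "words": []}
        elif self._current is not None and tag in ("input", "select", "textarea"):
            # An <input> without a type attribute defaults to a text field.
            kind = (attrs.get("type") or "text").lower() if tag == "input" else tag
            self._current["types"].append(kind)
            if attrs.get("name"):
                self._current["words"].append(attrs["name"].lower())

    def handle_data(self, data):
        if self._current is not None:
            self._current["words"].extend(re.findall(r"[a-z]+", data.lower()))

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def page_is_relevant(html, min_hits=2):
    """WPC stand-in: visible text mentions at least min_hits domain terms."""
    text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping
    words = set(re.findall(r"[a-z]+", text.lower()))
    return len(words & DOMAIN_TERMS) >= min_hits


def form_is_searchable(form):
    """FSC stand-in: has a query-like field and no password field."""
    types = form["types"]
    has_query_field = any(t in ("text", "search", "select") for t in types)
    return has_query_field and "password" not in types


def form_in_domain(form):
    """FCC stand-in: form labels or field names share a term with the domain."""
    return bool(set(form["words"]) & DOMAIN_TERMS)


def find_domain_forms(pages):
    """Run the cascade over {url: html}; return URLs with a domain-specific form."""
    hits = []
    for url, html in pages.items():
        if not page_is_relevant(html):          # stage 1: page level
            continue
        parser = FormExtractor()
        parser.feed(html)
        if any(form_is_searchable(f) and form_in_domain(f)  # stages 2 and 3
               for f in parser.forms):
            hits.append(url)
    return hits
```

Each stage runs only on what the previous stage accepted, so the cheap page-level test prunes most candidates before any form is parsed. In a real system each heuristic would be replaced by a trained classifier.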
Year: 2011
DOI: 10.2298/CSIS100322028W
Venue: COMPUTER SCIENCE AND INFORMATION SYSTEMS
Keywords: Deep Web, ontology, WPC, FSC, FCC
Field: Ontology, Data mining, World Wide Web, Crawling, Web page, Computer science, Deep Web, Focused crawler, Classifier (linguistics)
DocType: Journal
Volume: 8
Issue: 3
ISSN: 1820-0214
Citations: 3
PageRank: 0.39
References: 5
Authors: 6
Name | Order | Citations | PageRank
Ying Wang | 1 | 35 | 4.53
Huilai Li | 2 | 3 | 1.41
Wanli Zuo | 3 | 342 | 42.73
Fengling He | 4 | 68 | 7.88
Xin Wang | 5 | 41 | 7.32
Kerui Chen | 6 | 16 | 2.78