Abstract | ||
---|---|---|
Ontology plays an important role in locating Domain-Specific Deep Web contents. This paper therefore presents a novel framework, WFF, for efficiently locating Domain-Specific Deep Web databases based on focused crawling and ontology. WFF constructs a Web Page Classifier (WPC), a Form Structure Classifier (FSC) and a Form Content Classifier (FCC) in a hierarchical fashion. First, WPC discovers potentially interesting pages using an ontology-assisted focused crawler. Then, FSC analyzes the interesting pages and determines, based on structural characteristics, whether they contain searchable forms. Lastly, FCC identifies the searchable forms that belong to a given domain at the semantic level and stores the URLs of these Domain-Specific searchable forms in a database. A detailed experimental evaluation shows that the WFF framework not only simplifies the discovery process but also effectively identifies Domain-Specific databases. |
Year | DOI | Venue |
---|---|---|
2011 | 10.2298/CSIS100322028W | COMPUTER SCIENCE AND INFORMATION SYSTEMS |
Keywords | Field | DocType |
Deep Web,ontology,WPC,FSC,FCC | Ontology,Data mining,World Wide Web,Crawling,Web page,Computer science,Deep Web,Focused crawler,Classifier (linguistics) | Journal |
Volume | Issue | ISSN |
8 | 3 | 1820-0214 |
Citations | PageRank | References |
3 | 0.39 | 5 |
Authors | ||
6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ying Wang | 1 | 35 | 4.53 |
Huilai Li | 2 | 3 | 1.41 |
Wanli Zuo | 3 | 342 | 42.73 |
Fengling He | 4 | 68 | 7.88 |
Xin Wang | 5 | 41 | 7.32 |
Kerui Chen | 6 | 16 | 2.78 |
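
The hierarchical WPC, FSC, FCC cascade described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the flat term set `DOMAIN_TERMS` is a hypothetical stand-in for the ontology, the three functions replace trained classifiers with simple keyword and markup heuristics, and all names and thresholds are assumptions chosen for illustration.

```python
import re

# Hypothetical stand-in for a domain ontology: a flat set of book-domain terms.
DOMAIN_TERMS = {"book", "author", "isbn", "title", "publisher"}

FORM_RE = re.compile(r"<form\b.*?</form>", re.IGNORECASE | re.DOTALL)
WORD_RE = re.compile(r"[a-z]+")


def tokens(text):
    """Return the set of lowercase alphabetic tokens in a string (markup included)."""
    return set(WORD_RE.findall(text.lower()))


def wpc(html, terms=DOMAIN_TERMS, min_hits=2):
    """Stage 1 (assumed heuristic): a page is interesting if it mentions enough domain terms."""
    return len(tokens(html) & terms) >= min_hits


def fsc(form_html):
    """Stage 2 (assumed heuristic): a form is searchable if it has a text or
    search input and no password field (which suggests a login form)."""
    lowered = form_html.lower()
    has_text_input = re.search(r'type\s*=\s*["\']?(text|search)', lowered) is not None
    has_password = re.search(r'type\s*=\s*["\']?password', lowered) is not None
    return has_text_input and not has_password


def fcc(form_html, terms=DOMAIN_TERMS):
    """Stage 3 (assumed heuristic): the form itself must mention a domain term."""
    return bool(tokens(form_html) & terms)


def locate_domain_forms(pages):
    """Run the cascade over a {url: html} mapping and return the URLs of pages
    that contain at least one searchable, in-domain form."""
    hits = []
    for url, html in pages.items():
        if not wpc(html):
            continue  # later stages never see off-topic pages
        forms = FORM_RE.findall(html)
        if any(fsc(form) and fcc(form) for form in forms):
            hits.append(url)
    return hits
```

Each stage only sees what the previous one let through, so the cheaper page-level check filters out most pages before any form is parsed. In a real system the returned URLs would be written to a database, as the abstract describes; here they are simply returned as a list.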