Abstract | ||
---|---|---|
Web page recognition is a problem in the design of web crawlers for theme search engines. This paper presents a web page recognition algorithm based on link analysis to address it. The algorithm builds a recognition model for relevant web pages by combining link analysis with a theme URL knowledge base, drawing on ideas from statistics and social network analysis. In experiments, the algorithm achieves a precision above 93 percent and a recall of up to 85.4 percent, which the authors report as better than other web page recognition algorithms. The experimental results indicate that the algorithm is feasible and effective. |
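
The abstract does not spell out the recognition model, so the following is only a minimal sketch of the general idea under stated assumptions: a page is treated as on-theme when a large enough share of its links point to hosts in a theme URL knowledge base, and precision and recall are computed in the usual way. The function names, the host-matching rule, and the 0.5 threshold are hypothetical illustrations, not the authors' method.

```python
from urllib.parse import urlparse


def host(url: str) -> str:
    """Return the lowercase host of a URL, or an empty string if it has none."""
    return (urlparse(url).hostname or "").lower()


def theme_score(links: list[str], theme_hosts: set[str]) -> float:
    """Fraction of a page's links whose host is in the theme knowledge base.

    Assumption: the knowledge base is a plain set of host names.
    """
    if not links:
        return 0.0
    hits = sum(1 for link in links if host(link) in theme_hosts)
    return hits / len(links)


def is_theme_page(links: list[str], theme_hosts: set[str],
                  threshold: float = 0.5) -> bool:
    """Hypothetical decision rule: on-theme if the score reaches the threshold."""
    return theme_score(links, theme_hosts) >= threshold


def precision_recall(predicted: set[str], relevant: set[str]) -> tuple[float, float]:
    """Standard precision and recall over sets of page identifiers."""
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall
```

With a knowledge base such as `{"a.example", "b.example"}`, a page is scored by passing its outgoing links to `theme_score`, and the pages accepted by `is_theme_page` can then be compared against a hand-labelled set with `precision_recall`.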
Year | DOI | Venue |
---|---|---|
2012 | 10.1109/CGC.2012.42 | CGC |
Keywords | Field | DocType |
web page recognition,pattern recognition,statistics,theme search engine,link analysis,web page recognition algorithm,relevant web page recognition,web crawler design,theme knowledge recognition,theme url knowledge base,main idea,web design,precision rate,recall rate,social network analysis,search engines,web crawler | Web design,Web search engine,Static web page,Data mining,HITS algorithm,Information retrieval,Web page,Computer science,Rewrite engine,Backlink,Web crawler | Conference |
ISBN | Citations | PageRank |
978-1-4673-3027-5 | 1 | 0.34 |
References | Authors | |
7 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Zude Chen | 1 | 1 | 0.34 |
Jianxun Liu | 2 | 640 | 67.12 |
Haijun Zhai | 3 | 62 | 7.40 |
Lei Jiang | 4 | 10 | 3.59 |
Buqing Cao | 5 | 200 | 23.96 |