Abstract | ||
---|---|---|
Blogs are becoming increasingly popular with the rapid development of the Internet. An automatic way to distinguish blog pages from ordinary Web pages is needed, both for content extraction from blog pages and for blog community discovery. This paper describes some basic concepts and ideas in the area of blogs and proposes a method for blog page identification based on blog page structure and blog content. Experiments show that the method achieves high precision. |
Year | DOI | Venue |
---|---|---|
2008 | 10.1109/FSKD.2008.371 | FSKD (2) |
Keywords | Field | DocType |
blog pages structure,blog,high result,web pages,blog structure and content,rapid development,narrow blog,blog page,blog community,blog content,web sites,content extraction,internet,ordinary web page,blog pages identification,basic concept,broad blog,information services,navigation | Content extraction,Information system,World Wide Web,Web page,Information retrieval,Computer science,Spam blog,The Internet | Conference |
Volume | ISBN | Citations |
2 | 978-0-7695-3305-6 | 2 |
PageRank | References | Authors |
0.38 | 4 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Feng Yu | 1 | 36 | 10.95 |
Dequan Zheng | 2 | 74 | 21.56 |
Tiejun Zhao | 3 | 643 | 102.68 |
Xiao Cheng | 4 | 2 | 0.38 |