Structure and Content Based Blog Pages Identification - Citegraph

Paper Info

Title
Structure and Content Based Blog Pages Identification

Abstract
Blogs are becoming increasingly popular with the rapid development of the Internet. An automatic way to distinguish blog pages from ordinary Web pages is needed, both for extracting content from blog pages and for discovering blog communities. This paper describes some basic concepts and ideas in the area of blogs and proposes a method for identifying blog pages based on their structure and content. Experiments show that the method achieves high precision.

Year	DOI	Venue
2008	10.1109/FSKD.2008.371	FSKD (2)
Keywords	Field	DocType
blog pages structure,blog,high result,web pages,blog structure and content,rapid development,narrow blog,blog page,blog community,blog content,web sites,content extraction,internet,ordinary web page,blog pages identification,basic concept,broad blog,information services,navigation	Content extraction,Information system,World Wide Web,Web page,Information retrieval,Computer science,Spam blog,The Internet	Conference
Volume	ISBN	Citations
2	978-0-7695-3305-6	2
PageRank	References	Authors
0.38	4	4

Authors (4 rows)

Name	Order	Citations	PageRank
Feng Yu	1	36	10.95
Dequan Zheng	2	74	21.56
Tiejun Zhao	3	643	102.68
Xiao Cheng	4	2	0.38
