Abstract | ||
---|---|---|
The internet is a vast library of information, and its content needs to be categorized through web page classification. Classifying web page content can improve the quality and accuracy of web search. Unfortunately, the high dimensionality of web page datasets makes classification difficult. An automatic method for web page classification can simplify the whole process and help search engines return more relevant results. Today, information on the web is generally structured and formatted informally. This lack of semantics has motivated formal methods for adding semantic information to web pages. Search engines including Bing, Google, Yahoo! and Yandex created the Schema.org collection of schemas to support web page semantics and improve their search results. This paper explores the use of formal source code structure for classifying a large collection of web content. It focuses on using the Schema.org collection of schemas to classify web pages and categorize them unambiguously. |
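
As a rough illustration of the idea in the abstract, the sketch below reads the Schema.org type that a page declares through microdata `itemtype` attributes and uses it as the page's category. This is not the authors' method: the names `ItemTypeCollector` and `classify_page` are hypothetical, and taking the first declared type as the category is an assumption made only to keep the example short.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ItemTypeCollector(HTMLParser):
    """Collects Schema.org type names found in microdata `itemtype` attributes."""

    def __init__(self):
        super().__init__()
        self.types = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name != "itemtype" or not value:
                continue
            # An itemtype attribute may hold several space-separated URLs.
            for url in value.split():
                parsed = urlparse(url)
                if parsed.netloc.lower() in ("schema.org", "www.schema.org"):
                    # "http://schema.org/Recipe" becomes "Recipe"
                    self.types.append(parsed.path.strip("/"))


def classify_page(html):
    """Return the first Schema.org type declared in the page, or None.

    Assumption: the first declared type stands in for the page category.
    """
    collector = ItemTypeCollector()
    collector.feed(html)
    return collector.types[0] if collector.types else None


sample = (
    '<div itemscope itemtype="http://schema.org/Recipe">'
    '<span itemprop="name">Tomato soup</span></div>'
)
category = classify_page(sample)
```

A page without Schema.org markup yields `None` here, so a real classifier would still need a fallback for unannotated pages.
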
Year | DOI | Venue |
---|---|---|
2012 | 10.1109/CASoN.2012.6412428 | Computational Aspects of Social Networks |
Keywords | Field | DocType |
Internet,Web sites,pattern classification,search engines,search problems,source coding,Bing,Google,Web page content classification,Web page dataset dimensionality,Web page semantics,Web search quality,Yahoo!,Yandex,content categorization,formal methods,formal source code structure,information library,schema.org collection,search engine,semantics information,Collection of schemas Schema.org,Genres,Microformats,Microgenres,Web Page Classification | Web search engine,Web development,Static web page,Data mining,World Wide Web,Information retrieval,Web page,Computer science,Web modeling,Backlink,Page view,Web crawler | Conference |
ISSN | ISBN | Citations |
2155-7047 | 978-1-4673-4793-8 | 3 |
PageRank | References | Authors |
0.41 | 4 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jonas Krutil | 1 | 3 | 0.41 |
Milos Kudelka | 2 | 116 | 23.81 |
Václav Snasel | 3 | 1261 | 210.53 |