Title
Discovering interesting information with advances in web technology
Abstract
The Web is a steadily evolving resource comprising much more than mere HTML pages. With its ever-growing data sources in a variety of formats, it provides great potential for knowledge discovery. In this article, we shed light on some interesting phenomena of the Web: the deep Web, which surfaces database records as Web pages; the Semantic Web, which defines meaningful data exchange formats; XML, which has established itself as a lingua franca for Web data exchange; and domain-specific markup languages, which are designed based on XML syntax with the goal of preserving semantics in targeted domains. We detail these four developments in Web technology, and explain how they can be used for data mining. Our goal is to show that all these areas can be as useful for knowledge discovery as the HTML-based part of the Web.
Year
2012
DOI
10.1145/2481244.2481255
Venue
SIGKDD Explorations
Keywords
semantic web, xml, deep web
DocType
Journal
Volume
14
Issue
2
Citations
3
PageRank
0.39
References
73
Authors
4
Name                 Order   Citations   PageRank
Richi Nayak          1       706         79.67
Pierre Senellart     2       946         63.47
Fabian M. Suchanek   3       3900        188.75
Aparna S. Varde      4       188         28.71