Title
Analysis of semantic URLs to support automated linking of structured data on the web
Abstract
A growing amount of structured data can be found embedded in web pages using formats such as RDFa, JSON-LD and Microdata. Although such data is indexed by search engines and sometimes replicated in centralised knowledge bases, application scenarios exist in which such data must be discovered on-the-fly, for example when using the follow-your-nose principle of accessing Linked Open Data, or in applications where the velocity at which data changes can leave centralised repositories out of date. In this paper we demonstrate two complementary techniques for aiding such applications by analysing URLs. Firstly, we demonstrate that machine learning can help predict, from previously encountered URLs, the likelihood of encountering structured data in an unseen URL. This can be applied within applications that encounter a large number of possible URLs to dereference and must implement some priority scheme to choose relevant URLs. Secondly, we demonstrate that association rule mining can be used to link existing resources in a knowledge base, such as DBpedia, to URLs that follow common schemes, such as Semantic (search engine friendly) URLs.
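The first technique in the abstract (scoring unseen URLs by how likely they are to carry structured data, then prioritising them) can be illustrated with a minimal sketch. The abstract does not say which features or learning algorithm the paper uses, so everything here is an assumption made for illustration: the tokenisation in `url_tokens`, the choice of a multinomial naive Bayes model, the class name `UrlNaiveBayes`, and the `shop.example.com` training URLs with their labels are all hypothetical.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse


def url_tokens(url):
    """Split host and path into lowercase tokens; pure digits become '<num>'."""
    parsed = urlparse(url)
    raw = re.split(r"[^A-Za-z0-9]+", parsed.netloc + " " + parsed.path)
    return ["<num>" if t.isdigit() else t.lower() for t in raw if t]


class UrlNaiveBayes:
    """Multinomial naive Bayes over URL tokens with add-one smoothing.

    An illustrative stand-in, not the model described in the paper.
    """

    def fit(self, urls, labels):
        self.counts = {True: Counter(), False: Counter()}
        self.docs = Counter(labels)
        for url, label in zip(urls, labels):
            self.counts[label].update(url_tokens(url))
        self.vocab = set(self.counts[True]) | set(self.counts[False])
        return self

    def _log_score(self, tokens, label):
        total = sum(self.counts[label].values())
        score = math.log(self.docs[label] / sum(self.docs.values()))
        for t in tokens:
            score += math.log((self.counts[label][t] + 1) / (total + len(self.vocab)))
        return score

    def probability(self, url):
        """Estimated probability that the URL leads to embedded structured data."""
        tokens = [t for t in url_tokens(url) if t in self.vocab]  # skip unseen tokens
        pos = self._log_score(tokens, True)
        neg = self._log_score(tokens, False)
        return 1.0 / (1.0 + math.exp(neg - pos))


# Hypothetical previously encountered URLs; True means structured data was found.
seen = [
    ("http://shop.example.com/product/red-shoes-123", True),
    ("http://shop.example.com/product/blue-hat-77", True),
    ("http://shop.example.com/recipe/apple-pie", True),
    ("http://shop.example.com/login", False),
    ("http://shop.example.com/static/css/main.css", False),
    ("http://shop.example.com/cart/checkout", False),
]
model = UrlNaiveBayes().fit([u for u, _ in seen], [y for _, y in seen])

# Unseen candidates are dereferenced in order of descending estimated probability.
candidates = [
    "http://shop.example.com/static/js/app.js",
    "http://shop.example.com/product/green-scarf-9",
]
ranked = sorted(candidates, key=model.probability, reverse=True)
```

In this toy setting the `/product/` URL is ranked ahead of the `/static/` URL, which is the kind of priority scheme the abstract refers to. A real crawler would train on far more URLs and would likely use richer features than bare tokens.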
Year
2017
DOI
10.1145/3102254.3102265
Venue
WIMS
Field
Data mining, World Wide Web, Web page, Information retrieval, Computer science, URL normalization, Linked data, Association rule learning, Microdata (HTML), Semantic URL, Knowledge base, Data model
DocType
Conference
ISBN
978-1-4503-5225-3
Citations
0
PageRank
0.34
References
9
Authors
1
Name
Steven Lynden
Order
1
Citations
19
PageRank
2.19