Title
Lessons learned in the generation of biomedical research datasets using Semantic Open Data technologies.
Abstract
Biomedical research usually requires combining large volumes of data from multiple heterogeneous sources. Such heterogeneity complicates not only the generation of research-oriented datasets but also their exploitation. In recent years, the Open Data paradigm has proposed new ways of making data available so that sharing and integration are facilitated. Open Data approaches may pursue the generation of content readable only by humans or by both humans and machines; the latter are the ones of interest in our work. The Semantic Web provides a natural technological space for data integration and exploitation and offers a range of technologies for generating not only Open Datasets but also Linked Datasets, that is, open datasets linked to other open datasets. According to Berners-Lee's classification, each open dataset can be given a rating between one and five stars. In recent years, we have developed and applied our SWIT tool, which automates the generation of semantic datasets from heterogeneous data sources. SWIT produces four-star datasets, given that the fifth star can be obtained only when the dataset is linked from external ones. In this paper, we describe how we have applied the tool in two projects related to health care records and orthology data, as well as the major lessons learned from such efforts.
Year
2015
DOI
10.3233/978-1-61499-512-8-165
Venue
Studies in Health Technology and Informatics
Keywords
Semantic Web, Open Data, Ontology, Information processing
Field
Data integration, Health care, Data science, Data mining, Open data, Information retrieval, Computer science, Semantic Web
DocType
Conference
Volume
210
ISSN
0926-9630
Citations
0
PageRank
0.34
References
2
Authors
4