Paper Info

Title
A practical experience concerning the parallel semantic annotation of a large-scale data collection

Abstract
From a computational point of view, the semantic annotation of large-scale data collections is an extremely expensive task. One way to address this cost is to distribute the execution of the annotation algorithm across several computing environments. In this paper, we describe how a large-scale collection of learning objects was semantically annotated. The terms related to each learning object were processed, and the output was an RDF graph computed from the DBpedia database. According to an initial study, a sequential implementation of the annotation algorithm would require more than 1600 CPU-years to process the whole set of learning objects (about 15 million). For this reason, a framework able to integrate a set of heterogeneous computing infrastructures was used to execute a new parallel version of the algorithm. As a result, the problem was solved in 178 days.
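
The figures in the abstract imply a rough scale for the parallel run. The sketch below is a back-of-the-envelope calculation, not a result reported by the authors. It assumes a year of 365.25 days, treats "more than 1600 CPU-years" and "about 15 million" as exact values, and assumes the parallel version performs the same total work as the sequential estimate. The function names are hypothetical.

```python
# Back-of-the-envelope figures derived only from the numbers in the abstract.
# Assumptions (not from the paper): 1 year = 365.25 days, and the parallel run
# does the same total work as the sequential estimate.
DAYS_PER_YEAR = 365.25


def implied_speedup(sequential_cpu_years: float, parallel_days: float) -> float:
    """Ratio of the estimated sequential time to the reported wall-clock time."""
    return sequential_cpu_years * DAYS_PER_YEAR / parallel_days


def cpu_minutes_per_object(sequential_cpu_years: float, n_objects: int) -> float:
    """Average sequential CPU time per learning object, in minutes."""
    return sequential_cpu_years * DAYS_PER_YEAR * 24 * 60 / n_objects


speedup = implied_speedup(1600, 178)
per_object = cpu_minutes_per_object(1600, 15_000_000)

print(f"implied speedup: ~{speedup:.0f}x")
print(f"sequential cost per learning object: ~{per_object:.0f} CPU-minutes")
```

Under these assumptions, the run corresponds to roughly 3,300 cores kept busy for the full 178 days, and to a sequential cost of a little under an hour per learning object. Because the abstract states the sequential estimate as a lower bound, both figures should be read as lower bounds as well.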

Year	DOI	Venue
2013	10.1145/2506182.2506191	I-SEMANTICS
Keywords	Field	DocType
computational point,heterogeneous computing infrastructure,practical experience,large-scale collection,dbpedia database,computing environment,semantic annotation,whole set,annotation algorithm,parallel semantic annotation,rdf graph,large-scale data collection	Drawback,Data collection,Data mining,Annotation,Semantic annotation,Information retrieval,Computer science,Symmetric multiprocessor system,Learning object,Rdf graph	Conference
Citations	PageRank	References
1	0.37	19
Authors
6

Authors (6 rows)

Name	Order	Citations	PageRank
Javier Fabra	1	53	10.12
Sergio Hernández	2	6	1.80
Pedro Álvarez	3	57	11.56
Estefanía Otero	4	1	0.37
Juan Carlos Vidal	5	63	9.75
Manuel Lama	6	383	34.84
