Title
Efficient data management in a large-scale epidemiology research project.
Abstract
This article describes the concept of a "Central Data Management" (CDM) and its implementation within the large-scale population-based medical research project "Personalized Medicine". The CDM can be summarized as a combination of data capture, data integration, data storage, data refinement, and data transfer. A wide spectrum of reliable "Extract Transform Load" (ETL) software for automatic data integration, as well as "electronic Case Report Forms" (eCRFs), was developed in order to integrate decentralized and heterogeneously captured data. Because the captured data are highly sensitive, high system availability, data privacy, data security, and quality assurance are of utmost importance. A complex data model was developed and implemented using an Oracle database in high-availability cluster mode in order to integrate different types of participant-related data. Intelligent data capture and storage mechanisms improve data quality. Data privacy is ensured by a multi-layered role/right system for access control and by de-identification of identifying data. A well-defined backup process prevents data loss. Over a period of one and a half years, the CDM has captured a wide variety of data totaling approximately 5 terabytes without any critical incidents of system breakdown or data loss. The aim of this article is to demonstrate one possible way of establishing a Central Data Management in large-scale medical and epidemiological studies.
Year
2012
DOI
10.1016/j.cmpb.2010.12.016
Venue
Computer Methods and Programs in Biomedicine
Keywords
central data management, data privacy, complex data model, data security, data refinement, data storage, large-scale epidemiology research project, efficient data management, participant-related data, data loss, data integration, intelligent data, individualized medicine, personalized medicine, electronic data capture
Field
Data warehouse, Data modeling, Data administration, Data quality, Data governance, Computer science, Data element, Data virtualization, Database, Data migration
DocType
Journal
Volume
107
Issue
3
ISSN
1872-7565
Citations
8
PageRank
1.00
References
3
Authors (6)
Name                 Order   Citations   PageRank
Jens Meyer           1       11          1.53
Stefan Ostrzinski    2       8           1.00
Daniel Fredrich      3       11          1.53
Christoph Havemann   4       13          2.10
Janina Krafczyk      5       8           1.00
Wolfgang Hoffmann    6       21          3.63