Title
First Study on Data Readiness Level
Abstract
We introduce the idea of Data Readiness Level (DRL) to measure the relative richness of data to answer specific questions often encountered by data scientists. We first approach the problem in its full generality, explaining its desired mathematical properties and applications, and then propose and study two DRL metrics. Specifically, we define DRL as a function of at least four properties of data: Noisiness, Believability, Relevance, and Coherence. Two information-theoretic metrics, Cosine Similarity and Document Disparity, are proposed as indicators of Relevance and Coherence for a piece of data. The proposed metrics are validated through a text-based experiment using Twitter data.
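As a rough illustration of the Cosine Similarity metric named in the abstract, the sketch below scores two short texts by the cosine of the angle between their term-frequency vectors. The abstract does not say how the vectors are built, so the lowercase whitespace tokenization, the raw term counts, and the function name `cosine_similarity` are all assumptions made here for illustration; this is not the authors' implementation. Document Disparity is not defined in this record and is not sketched.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between the term-frequency vectors of two texts."""
    # Assumption: lowercase whitespace tokenization and raw term counts;
    # the abstract does not specify the vectorization.
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())

    # Dot product over the terms the two texts share.
    shared_terms = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[term] * counts_b[term] for term in shared_terms)

    # Euclidean norms of the two count vectors.
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty text has no direction; treat its similarity as 0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Under these assumptions, a tweet that shares no terms with a query scores 0, and a tweet whose term counts are proportional to the query's scores 1 (up to floating-point rounding).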
Year
2017
Venue
arXiv: Information Retrieval
Field
Data mining, Information retrieval, Cosine similarity, Computer science, Coherence (physics), Mathematical properties, Generality
DocType
Journal
Volume
abs/1702.02107
Citations
0
PageRank
0.34
References
3
Authors
4
Name             Order  Citations  PageRank
Hui Guan         1      1          2.04
Thanos Gentimis  2      26         2.90
Hamid Krim       3      520        59.69
James Keiser     4      0          0.34