Title: EvalDNN: a toolbox for evaluating deep neural network models
Abstract: Recent studies have shown that the performance of deep learning models should be evaluated using various important metrics, such as robustness and neuron coverage, besides the widely-used prediction accuracy metric. However, major deep learning frameworks currently only provide APIs to evaluate a model's accuracy. To comprehensively assess a deep learning model, framework users and researchers often need to implement new metrics by themselves, which is a tedious job. Worse still, due to the large number of hyper-parameters and inadequate documentation, the evaluation results of some deep learning models are hard to reproduce, especially when both the models and metrics are new. To ease model evaluation for deep learning systems, we have developed EvalDNN, a user-friendly and extensible toolbox supporting multiple frameworks and metrics with a set of carefully designed APIs. Using EvalDNN, a pre-trained model can be evaluated with respect to different metrics in a few lines of code. We have evaluated EvalDNN on 79 models from TensorFlow, Keras, GluonCV, and PyTorch. As a result of our effort to reproduce the evaluation results of existing work, we release a performance benchmark of popular models, which can serve as a useful reference for future research. The tool and benchmark are available at https://github.com/yqtianust/EvalDNN and https://yqtianust.github.io/EvalDNN-benchmark/, respectively. A demo video of EvalDNN is available at: https://youtu.be/v69bNJN2bJc.
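The abstract notes that major frameworks only expose accuracy-style evaluation out of the box, and that EvalDNN hides such evaluation behind a small API. As context, the sketch below shows what a minimal top-1 accuracy evaluation of a pre-trained model looks like when written directly against PyTorch/torchvision. It is an illustrative baseline only, not EvalDNN's actual API, and it uses torchvision.datasets.FakeData as a stand-in for a real labelled validation set.

    # Minimal top-1 accuracy evaluation of a pre-trained model, written
    # directly against PyTorch/torchvision. Illustrative baseline only
    # (not EvalDNN's API); other metrics such as robustness or neuron
    # coverage would each need their own hand-written loop like this one.
    import torch
    import torchvision
    from torch.utils.data import DataLoader

    model = torchvision.models.resnet50(pretrained=True).eval()

    # FakeData stands in for a real validation set (e.g. ImageNet).
    dataset = torchvision.datasets.FakeData(
        size=256, image_size=(3, 224, 224), num_classes=1000,
        transform=torchvision.transforms.ToTensor())
    loader = DataLoader(dataset, batch_size=32)

    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in loader:
            predictions = model(images).argmax(dim=1)
            correct += (predictions == labels).sum().item()
            total += labels.size(0)

    print(f"top-1 accuracy: {correct / total:.4f}")

EvalDNN's contribution, as described in the abstract, is to replace this kind of per-metric, per-framework boilerplate with a shared set of APIs covering accuracy, robustness, and neuron coverage across TensorFlow, Keras, GluonCV, and PyTorch.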
Year: 2020
DOI: 10.1145/3377812.3382133
Venue: International Conference on Software Engineering
Keywords: Deep Learning Model, Evaluation
DocType: Conference
ISSN: 0270-5257
ISBN: 978-1-7281-6528-8
Citations: 0
PageRank: 0.34
References: 0
Authors: 6
Name             Order  Citations  PageRank
Yongqiang Tian   1      12         1.48
Zhihua Zeng      2      1          0.68
Ming Wen         3      137        11.70
Yepang Liu       4      415        24.58
Tzu-yang Kuo     5      3          1.10
S. C. Cheung     6      2657       162.89