Title
Skipping RNN State Updates without Retraining the Original Model
Abstract
Recurrent Neural Networks (RNNs) break a time-series input (or a sentence) into multiple time-steps (or words) and process it one time-step (word) at a time. However, not all of these time-steps (words) need to be processed to determine the final output accurately. Prior work has exploited this intuition by placing an additional predictor in front of the RNN model to prune time-steps that are not relevant. However, these approaches jointly train the predictor and the RNN model, allowing each to learn from the mistakes of the other. In this work, we present a method to skip RNN time-steps without retraining or fine-tuning the original RNN model. Using an ideal predictor, we show that, even without retraining the original model, a predictor can be trained to skip 45% of the steps on the SST dataset and 80% of the steps on the IMDB dataset without impacting model accuracy. We show that the decision to skip is non-trivial by comparing against five different baselines derived from domain knowledge. Finally, we present a case study on the cost and accuracy benefits of realizing such a predictor. On the SST dataset, this realistic predictor reduces computation by more than 25% with at most 0.3% loss in accuracy, while being 40× smaller than the original RNN model.
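The abstract's central idea, a small predictor that decides whether the frozen, pre-trained RNN should update its state at a given time-step, can be illustrated with a minimal Python sketch. This is not the authors' implementation; the cell parameters, the logistic-regression skip predictor, and all names (rnn_cell, skip_probability, run_with_skipping) are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
HIDDEN, EMBED = 8, 4

# Stand-in parameters for a pre-trained vanilla RNN cell; these stay frozen.
W_xh = rng.standard_normal((EMBED, HIDDEN)) * 0.1
W_hh = rng.standard_normal((HIDDEN, HIDDEN)) * 0.1
b_h = np.zeros(HIDDEN)

def rnn_cell(x, h):
    """One state update of the original (unmodified) RNN cell."""
    return np.tanh(x @ W_xh + h @ W_hh + b_h)

# Tiny logistic-regression skip predictor (hypothetical); it would be trained
# separately, without touching the RNN weights above.
w_skip = rng.standard_normal(EMBED) * 0.1
b_skip = 0.0

def skip_probability(x):
    return 1.0 / (1.0 + np.exp(-(x @ w_skip + b_skip)))

def run_with_skipping(inputs, threshold=0.5):
    """Process a sequence, skipping the RNN state update whenever the predictor says so."""
    h = np.zeros(HIDDEN)
    skipped = 0
    for x in inputs:
        if skip_probability(x) > threshold:
            skipped += 1        # carry h forward unchanged: no RNN computation this step
            continue
        h = rnn_cell(x, h)      # normal state update with the frozen cell
    return h, skipped

# Usage: a random 10-step sequence of input embeddings.
sequence = rng.standard_normal((10, EMBED))
final_state, n_skipped = run_with_skipping(sequence)
print(f"skipped {n_skipped}/10 steps, final state norm {np.linalg.norm(final_state):.3f}")

Because only the predictor is trained, the computational savings come from the skipped cell evaluations, and the predictor can remain much smaller than the RNN it gates, consistent with the 40× size difference reported in the abstract.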
Year
2019
DOI
10.1145/3362743.3362965
Venue
Proceedings of the 1st Workshop on Machine Learning on Edge in Sensor Systems
Keywords
RNN, RNN inference runtime, model acceleration, sentiment analysis
Field
Computer science, Speech recognition, Retraining
DocType
Conference
ISBN
978-1-4503-7011-0
Citations
0
PageRank
0.34
References
0
Authors
4
Name                Order   Citations   PageRank
Jin Tao             1       0           0.34
Urmish Thakker      2       1           3.74
Ganesh S. Dasika    3       387         24.30
Jesse G. Beu        4       2           3.41