Title: Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
Abstract: Recurrent Neural Networks can be trained to produce sequences of tokens given some input, as exemplified by recent results in machine translation and image captioning. The current approach to training them consists of maximizing the likelihood of each token in the sequence given the current (recurrent) state and the previous token. At inference, the unknown previous token is then replaced by a token generated by the model itself. This discrepancy between training and inference can yield errors that accumulate quickly along the generated sequence. We propose a curriculum learning strategy to gently change the training process from a fully guided scheme using the true previous token, towards a less guided scheme which mostly uses the generated token instead. Experiments on several sequence prediction tasks show that this approach yields significant improvements. Moreover, it was used successfully in our winning entry to the 2015 MSCOCO image captioning challenge.
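The abstract describes the core mechanism: at each decoding step during training, the true previous token is fed to the model with some probability epsilon, and a token sampled from the model's own previous prediction otherwise, with epsilon decayed over the course of training. The sketch below illustrates this coin flip with the inverse-sigmoid decay schedule from the paper; the toy Decoder class, its dimensions, the train_step helper, and the single per-timestep coin flip for the whole batch (the paper samples per token) are illustrative assumptions, not the authors' implementation.

```python
import math
import random

import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Toy RNN decoder; vocabulary size and dimensions are arbitrary choices."""
    def __init__(self, vocab_size=50, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.cell = nn.GRUCell(emb_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def step(self, token, h):
        h = self.cell(self.embed(token), h)
        return self.out(h), h

def epsilon_inverse_sigmoid(i, k=1000.0):
    # Probability of feeding the TRUE previous token at training iteration i.
    # Inverse-sigmoid decay, one of the schedules proposed in the paper:
    # epsilon_i = k / (k + exp(i / k)), decaying from ~1 toward 0.
    return k / (k + math.exp(i / k))

def train_step(decoder, target_seq, h0, train_iter, optimizer):
    # One scheduled-sampling training step on a batch of target sequences
    # (shape: batch x T, with position 0 holding a start-of-sequence token).
    eps = epsilon_inverse_sigmoid(train_iter)
    loss_fn = nn.CrossEntropyLoss()
    _, T = target_seq.shape
    h, prev, loss = h0, target_seq[:, 0], 0.0
    for t in range(1, T):
        logits, h = decoder.step(prev, h)
        loss = loss + loss_fn(logits, target_seq[:, t])
        if random.random() < eps:
            # Fully guided ("teacher forcing"): feed the ground-truth token.
            prev = target_seq[:, t]
        else:
            # Less guided: feed a token sampled from the model's own output.
            # No gradient flows through the sampling decision, as in the paper.
            probs = torch.softmax(logits, dim=-1)
            prev = torch.multinomial(probs, 1).squeeze(1).detach()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item() / (T - 1)

# Example usage on random data:
dec = Decoder()
opt = torch.optim.Adam(dec.parameters(), lr=1e-3)
seq = torch.randint(0, 50, (8, 12))   # batch of 8 sequences of length 12
print(train_step(dec, seq, torch.zeros(8, 64), train_iter=0, optimizer=opt))
```

With k=1000, epsilon stays near 1 early in training (almost pure teacher forcing) and falls toward 0 later, so the model is gradually exposed to its own predictions as inputs, mirroring the inference-time setting.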
Year: 2015
Venue: Annual Conference on Neural Information Processing Systems
Field: Sequence prediction, Closed captioning, Inference, Computer science, Machine translation, Recurrent neural network, Speech recognition, Artificial intelligence, Sampling (statistics), Machine learning
DocType: Conference
Volume: abs/1506.03099
ISSN: 1049-5258
Journal:
Citations: 208
PageRank: 6.26
References: 17
Authors: 4
Name             Order   Citations   PageRank
Samy Bengio      1       7213        485.82
Oriol Vinyals    2       9419        418.45
Navdeep Jaitly   3       2988        166.08
Noam Shazeer     4       1089        43.70