Encoding Video and Label Priors for Multi-label Video Classification on YouTube-8M dataset. - Citegraph

Paper Info

Title
Encoding Video and Label Priors for Multi-label Video Classification on YouTube-8M dataset.

Abstract
YouTube-8M is the largest video dataset for multi-label video classification. To tackle multi-label classification on this challenging dataset, it is necessary to solve several issues, such as temporal modeling of videos, label imbalance, and correlations between labels. We develop a deep neural network model that consists of four components: the frame encoder, the classification layer, the label processing layer, and the loss function. We introduce our newly proposed methods and discuss how existing models operate in the YouTube-8M Classification Task, what insights they offer, and why they succeed (or fail) in achieving good performance. Most of the models we propose score considerably higher than the baseline models, and the ensemble of the models we used ranked 8th in the Kaggle competition.

Year	Venue	Field
2017	arXiv: Computer Vision and Pattern Recognition	Pattern recognition,Computer science,Encoder,Temporal modeling,Artificial intelligence,Prior probability,Artificial neural network,Machine learning,Encoding (memory)
DocType	Volume	Citations
Journal	abs/1706.07960	1
PageRank	References	Authors
0.35	10	5

Authors (5 rows)

Name	Order	Citations	PageRank
Seil Na	1	13	0.86
Youngjae Yu	2	4	2.42
Sangho Lee	3	144	17.97
Jisung Kim	4	13	1.87
Gunhee Kim	5	1	3.73
