Title
Improving Action Recognition with Valued Patches Exploiting
Abstract
Recent human action recognition methods mainly build two-stream or multi-stream deep neural networks, which can exploit human spatiotemporal features effectively. However, because they ignore interactive scenes, most of these methods fall short of impressive performance. In this paper, we propose a novel multi-stream fusion framework based on discriminative scene patches and motion patches. Unlike existing two-stream or multi-stream methods, our work improves accuracy by 1) paying more attention to exploiting discriminative scene patches and motion patches, and 2) proposing a novel 2D+3D multi-stream feature aggregation mechanism, in which 2D features from RGB images and 3D features of valued patches are combined to improve the representation of spatiotemporal features. Our framework is evaluated on three widely used video action benchmarks, where it outperforms other state-of-the-art recognition approaches by a significant margin, reaching accuracies of 85.7% on JHMDB, 87.7% on HMDB51, and 98.6% on UCF101.
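The abstract's 2D+3D aggregation can be illustrated with a minimal sketch. The feature dimensions, the use of plain concatenation followed by a linear classifier, and all weights below are assumptions for illustration only; the paper does not specify its exact fusion operator or dimensions here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature dimensions (not taken from the paper).
d2, d3, n_classes = 2048, 512, 51  # 51 classes, as in HMDB51

# One 2D feature from whole RGB frames, plus 3D features from a
# discriminative scene patch and a motion patch (random placeholders).
f2d = rng.standard_normal(d2)
f3d_scene = rng.standard_normal(d3)
f3d_motion = rng.standard_normal(d3)

# One possible aggregation: concatenate the streams into a single descriptor.
fused = np.concatenate([f2d, f3d_scene, f3d_motion])

# A linear classifier over the fused descriptor (placeholder weights),
# followed by a softmax over action classes.
W = rng.standard_normal((n_classes, fused.size)) * 0.01
logits = W @ fused
probs = np.exp(logits - logits.max())
probs /= probs.sum()
pred = int(np.argmax(probs))
```

With real stream backbones (e.g. a 2D CNN on RGB frames and a 3D CNN on patch volumes), `f2d`, `f3d_scene`, and `f3d_motion` would come from the penultimate layers of each stream, and `W` would be learned end-to-end.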
Year
2019
DOI
10.1109/BigMM.2019.00-27
Venue
2019 IEEE Fifth International Conference on Multimedia Big Data (BigMM)
Keywords
Action recognition, 2D+3D multi stream, valued patches, feature aggregation
Field
Pattern recognition, Computer science, Action recognition, Feature extraction, RGB color model, Artificial intelligence, Artificial neural network, Feature aggregation, Discriminative model, Optical imaging
DocType
Conference
ISBN
978-1-7281-5528-9
Citations
0
PageRank
0.34
References
4
Authors
5
Name | Order | Citations | PageRank
Luo Wu | 1 | 4 | 4.54
Chongyang Zhang | 2 | 84 | 21.63
Weiwei Liu | 3 | 0 | 0.34
Jintao Wu | 4 | 0 | 0.34
Weiyao Lin | 5 | 732 | 68.05