Title
Long Activity Video Understanding using Functional Object-Oriented Network
Abstract
Video understanding is one of the most challenging topics in computer vision. This paper presents a four-stage video understanding pipeline that simultaneously recognizes all atomic actions and the single ongoing activity in a video. The pipeline uses objects and motions from the video, with a graph-based knowledge representation network as a prior reference. Two deep networks are trained to identify objects and motions in each video sequence associated with an action, and low-level image features are used to identify objects of interest in that sequence. Confidence scores are assigned to objects of interest to represent their involvement in the action, and to motion classes based on the output of a deep neural network that classifies the ongoing action in the video into motion classes. A confidence score is then computed for each candidate functional unit from the knowledge representation network, the object confidences, and the motion confidences, in order to associate the unit with an action. Each action is therefore associated with a functional unit, and the sequence of actions is evaluated to identify the sole activity occurring in the video. The knowledge representation used in the pipeline is the functional object-oriented network, a graph-based network for encoding knowledge about manipulation tasks. Experiments are performed on a dataset of cooking videos to test the proposed algorithm on action inference and activity classification. The experiments show that using a functional object-oriented network improves video understanding significantly.
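The scoring step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the three sources of evidence are combined, so the multiplicative rule below is an assumption, as are all names (`FunctionalUnit`, `score_unit`, `infer_action`, `classify_activity`), the `prior` field, the overlap-based activity rule, and the toy confidence values. It is not the authors' implementation.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class FunctionalUnit:
    """Hypothetical candidate unit: a set of objects plus one motion."""
    name: str
    objects: tuple
    motion: str
    prior: float = 1.0  # assumed weight supplied by the knowledge network


def score_unit(unit, object_conf, motion_conf):
    """Assumed rule: multiply the prior, object confidences and motion confidence."""
    obj_term = prod(object_conf.get(o, 0.0) for o in unit.objects)
    return unit.prior * obj_term * motion_conf.get(unit.motion, 0.0)


def infer_action(units, object_conf, motion_conf):
    """Return the highest-scoring candidate functional unit."""
    return max(units, key=lambda u: score_unit(u, object_conf, motion_conf))


def classify_activity(actions, activities):
    """Assumed rule: pick the activity sharing the most actions with the inferred ones."""
    return max(activities, key=lambda a: len(set(actions) & activities[a]))


# Toy data, invented for illustration only.
units = [
    FunctionalUnit("pour_milk", ("milk", "cup"), "pour"),
    FunctionalUnit("stir_cup", ("spoon", "cup"), "stir"),
]
object_conf = {"milk": 0.9, "cup": 0.8, "spoon": 0.2}
motion_conf = {"pour": 0.7, "stir": 0.3}

best = infer_action(units, object_conf, motion_conf)
activity = classify_activity(
    ["pour_milk", "stir_cup"],
    {"make_tea": {"pour_water", "stir_cup"}, "make_latte": {"pour_milk", "stir_cup"}},
)
print(best.name, activity)
```

A product was chosen here so that a unit scores zero when any of its objects or its motion is not observed. A weighted sum would be an equally plausible choice under the abstract's description.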
Year: 2018
DOI: 10.1109/TMM.2018.2885228
Venue: IEEE Transactions on Multimedia
Keywords: Knowledge representation, Knowledge based systems, Object oriented modeling, Activity recognition, Pipelines, Object recognition, Task analysis
Field: Knowledge representation and reasoning, Activity recognition, Pattern recognition, Object-oriented programming, Computer science, Feature (computer vision), Knowledge-based systems, Artificial intelligence, Artificial neural network, Encoding (memory), Cognitive neuroscience of visual object recognition
DocType: Journal
Volume: abs/1807.00983
Issue: 7
ISSN: 1520-9210
Citations: 4
PageRank: 0.40
References: 0
Authors: 3
Name                    Order  Citations  PageRank
Ahmad Babaeian Jelodar  1      6          1.45
David Paulius           2      6          2.16
Yu Sun                  3      208        35.82