Title
Gaze-driven video streaming with saliency-based dual-stream switching
Abstract
The ability of a person to perceive image details falls precipitously with increasing angular distance from the visual focus. At any given bitrate, perceived visual quality can be improved by employing region-of-interest (ROI) coding, where higher encoding quality is judiciously applied only to regions close to a viewer's focal point. Straightforward matching of a viewer's focal point with ROI coding using a live encoder, however, is computation-intensive. In this paper, we propose a system that supports ROI coding without the need for a live encoder. The system is based on dynamic switching between two pre-encoded streams of the same content: one at high quality (HQ), and the other at mixed quality (MQ), where the quality of a spatial region depends on its pre-computed visual saliency values. Distributed source coding (DSC) frames are periodically inserted to facilitate switching. Using a Hidden Markov Model (HMM) to model a viewer's temporal gaze movement, the MQ stream is pre-encoded based on ROI coding to minimize the expected streaming rate, while keeping the probability of a viewer observing low quality (LQ) spatial regions below an application-specific ϵ. At stream time, the viewer's gaze locations are collected and transmitted to the server for intelligent stream switching. In particular, the server employs the MQ stream only if: i) the viewer's tracked gaze location falls inside the high-saliency regions, and ii) the probability that the viewer's gaze point will soon move outside the high-saliency regions, computed using tracked gaze data and updated saliency values, is below ϵ. Experiments showed that the video streaming rate can be reduced by up to 44%, and that subjective quality is noticeably better than that of a competing scheme at the same rate in which the entire video is encoded using equal quantization.
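The two-condition switching rule in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it replaces the HMM and the updated saliency values with an assumed two-state Markov chain (gaze inside or outside the high-saliency regions) whose per-step probability of staying inside, `p_stay`, is taken as given. The function names, `p_stay`, and the look-ahead `horizon` are all hypothetical.

```python
def prob_leave_within(p_stay: float, horizon: int) -> float:
    """Probability that the gaze leaves the high-saliency region at least once
    within `horizon` steps, given that it is inside now.

    Assumption (not from the paper): a two-state Markov chain in which the
    gaze stays inside with probability `p_stay` at every step.
    """
    if not 0.0 <= p_stay <= 1.0:
        raise ValueError("p_stay must be in [0, 1]")
    if horizon < 0:
        raise ValueError("horizon must be non-negative")
    # Staying inside for all `horizon` steps has probability p_stay ** horizon.
    return 1.0 - p_stay ** horizon


def choose_stream(gaze_in_high_saliency: bool, p_stay: float,
                  horizon: int, eps: float) -> str:
    """Return "MQ" only if both conditions of the switching rule hold."""
    # Condition i): the tracked gaze must be inside the high-saliency regions.
    if not gaze_in_high_saliency:
        return "HQ"
    # Condition ii): the chance of soon leaving those regions must be below eps.
    if prob_leave_within(p_stay, horizon) < eps:
        return "MQ"
    return "HQ"
```

For example, `choose_stream(True, 0.999, 5, 0.05)` selects the MQ stream, while lowering `p_stay` to 0.9 raises the exit probability above ϵ and the rule falls back to HQ.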
Year: 2012
DOI: 10.1109/VCIP.2012.6410732
Venue: VCIP
Keywords: high-saliency regions, roi coding, mixed quality stream, region-of-interest coding, hq stream, dynamic switching, visual saliency, dsc frames, viewer focal point, encoding quality, precoding, equal quantization, live encoder, temporal gaze movement, straight-forward matching, quantisation (signal), hmm, region-of-interest encoding, gaze-driven video streaming, updated saliency values, source coding, perceived visual quality, video coding, mq stream, intelligent stream switching, pre-computed visual saliency values, high-quality stream, tracked gaze data, pre-encoded streams, streaming rate minimization, video streaming, viewer observing low-quality spatial region probability, hidden markov models, viewer gaze location, distributed source coding, hidden markov model, visual focus, saliency-based dual-stream switching
Field: Computer vision, Gaze, Computer science, Salience (neuroscience), Coding (social sciences), Artificial intelligence, Encoder, Distributed source coding, Quantization (signal processing), Hidden Markov model, Encoding (memory)
DocType: Conference
ISBN: 978-1-4673-4406-7
Citations: 0
PageRank: 0.34
References: 6
Authors: 4
Name                      Order  Citations  PageRank
Yun-long Feng             1      125        11.69
Gene Cheung Connie Chan   2      1387       121.82
Wai-tian Tan              3      672        78.92
Yusheng Ji                4      1459       162.16