Gaze-Driven video streaming with saliency-based dual-stream switching - Citegraph

Paper Info

Title
Gaze-Driven video streaming with saliency-based dual-stream switching

Abstract
The ability of a person to perceive image details falls precipitously with increasing angular distance from the visual focus. At any given bitrate, perceived visual quality can be improved by employing region-of-interest (ROI) coding, where higher encoding quality is judiciously applied only to regions close to a viewer's focal point. Straightforward matching of a viewer's focal point with ROI coding using a live encoder, however, is computation-intensive. In this paper, we propose a system that supports ROI coding without the need for a live encoder. The system is based on dynamic switching between two pre-encoded streams of the same content: one at high quality (HQ), and the other at mixed quality (MQ), where the quality of a spatial region depends on its pre-computed visual saliency values. Distributed source coding (DSC) frames are periodically inserted to facilitate switching. Using a Hidden Markov Model (HMM) to model a viewer's temporal gaze movement, the MQ stream is pre-encoded based on ROI coding to minimize the expected streaming rate, while keeping the probability of a viewer observing low quality (LQ) spatial regions below an application-specific ϵ. At stream time, the viewer's gaze locations are collected and transmitted to the server for intelligent stream switching. In particular, the server employs the MQ stream only if: i) the viewer's tracked gaze location falls inside the high-saliency regions, and ii) the probability that the viewer's gaze point will soon move outside the high-saliency regions, computed using tracked gaze data and updated saliency values, is below ϵ. Experiments showed that the video streaming rate can be reduced by up to 44%, and that subjective quality is noticeably better than that of a competing scheme at the same rate in which the entire video is encoded using equal quantization.
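
The two-condition switching rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the HMM's structure or parameters, so the sketch substitutes a much simpler two-state Markov assumption in which the gaze, once inside a high-saliency region, stays there on each frame with a fixed probability. The names `prob_leave_within`, `choose_stream`, `p_stay`, and `horizon` are hypothetical, and `horizon` is assumed to be the number of frames until the next switching opportunity.

```python
def prob_leave_within(p_stay: float, horizon: int) -> float:
    """Probability that the gaze leaves the high-saliency region at least
    once in the next `horizon` frames, given that it is inside now.

    Assumption (not from the paper): on each frame the gaze stays inside
    with fixed probability `p_stay`, independently of earlier frames.
    """
    if not 0.0 <= p_stay <= 1.0:
        raise ValueError("p_stay must be a probability in [0, 1]")
    if horizon < 0:
        raise ValueError("horizon must be non-negative")
    # Staying inside for all `horizon` frames has probability p_stay**horizon.
    return 1.0 - p_stay ** horizon


def choose_stream(gaze_in_high_saliency: bool, p_stay: float,
                  horizon: int, epsilon: float) -> str:
    """Return "MQ" only if both conditions from the abstract hold:
    (i) the tracked gaze is inside a high-saliency region, and
    (ii) the probability of leaving it soon is below epsilon.
    Otherwise fall back to the "HQ" stream.
    """
    if not gaze_in_high_saliency:
        return "HQ"
    if prob_leave_within(p_stay, horizon) < epsilon:
        return "MQ"
    return "HQ"


if __name__ == "__main__":
    # Illustrative values only; none of these numbers come from the paper.
    print(choose_stream(True, p_stay=0.999, horizon=10, epsilon=0.05))
    print(choose_stream(True, p_stay=0.9, horizon=10, epsilon=0.05))
    print(choose_stream(False, p_stay=0.999, horizon=10, epsilon=0.05))
```

In this simplified model the server falls back to the HQ stream whenever either condition fails, which mirrors the "only if" wording of the abstract. A fuller version would replace the fixed `p_stay` with a probability derived from the tracked gaze data and the updated saliency values.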

Year	DOI	Venue
2012	10.1109/VCIP.2012.6410732	VCIP
Keywords	Field	DocType
high-saliency regions,roi coding,mixed quality stream,region-of-interest coding,hq stream,dynamic switching,visual saliency,dsc frames,viewer focal point,encoding quality,precoding,equal quantization,live encoder,temporal gaze movement,straight-forward matching,quantisation (signal),hmm,region-of-interest encoding,gaze-driven video streaming,updated saliency values,source coding,perceived visual quality,video coding,mq stream,intelligent stream switching,pre-computed visual saliency values,high-quality stream,tracked gaze data,pre-encoded streams,streaming rate minimization,video streaming,viewer observing low-quality spatial region probability,hidden markov models,viewer gaze location,distributed source coding,hidden markov model,visual focus,saliency-based dual-stream switching	Computer vision,Gaze,Computer science,Salience (neuroscience),Coding (social sciences),Artificial intelligence,Encoder,Distributed source coding,Quantization (signal processing),Hidden Markov model,Encoding (memory)	Conference
ISBN	Citations	PageRank
978-1-4673-4406-7	0	0.34
References	Authors
6	4

Authors (4 rows)

Name	Order	Citations	PageRank
Yun-long Feng	1	125	11.69
Gene Cheung	2	1387	121.82
Wai-tian Tan	3	672	78.92
Yusheng Ji	4	1459	162.16