Abstract | ||
---|---|---|
A person's ability to perceive image detail falls precipitously as the angle from the visual focus increases. At any given bitrate, perceived visual quality can be improved by employing region-of-interest (ROI) coding, where higher encoding quality is judiciously applied only to regions close to a viewer's focal point. Straightforward matching of the viewer's focal point with ROI coding using a live encoder, however, is computationally intensive. In this paper, we propose a system that supports ROI coding without the need for a live encoder. The system is based on dynamic switching between two pre-encoded streams of the same content: one at high quality (HQ), and the other at mixed quality (MQ), where the quality of a spatial region depends on its pre-computed visual saliency values. Distributed source coding (DSC) frames are periodically inserted to facilitate switching. Using a Hidden Markov Model (HMM) to model a viewer's temporal gaze movement, the MQ stream is pre-encoded based on ROI coding to minimize the expected streaming rate, while keeping the probability of a viewer observing low quality (LQ) spatial regions below an application-specific ϵ. At stream time, the viewer's gaze locations are collected and transmitted to the server for intelligent stream switching. In particular, the server employs the MQ stream only if: i) the viewer's tracked gaze location falls inside the high-saliency regions, and ii) the probability that the viewer's gaze point will soon move outside the high-saliency regions, computed using tracked gaze data and updated saliency values, is below ϵ. Experiments showed that the video streaming rate can be reduced by up to 44%, and that subjective quality is noticeably better than a competing scheme at the same rate where the entire video is encoded using equal quantization. |
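
The two-condition switching rule in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's HMM is replaced here by a much simpler two-state chain (gaze inside or outside the high-saliency regions) with a fixed per-frame stay probability, and the names `GazeModel`, `prob_leave_within`, `choose_stream`, and the `horizon` parameter (taken to be the number of frames until the next switching opportunity) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GazeModel:
    """Hypothetical two-state gaze model (a simplification, not the paper's HMM)."""
    p_stay_inside: float  # assumed per-frame probability of staying in high-saliency regions


def prob_leave_within(model: GazeModel, horizon: int) -> float:
    """Probability that a gaze currently inside the high-saliency regions
    leaves them at least once during the next `horizon` frames."""
    return 1.0 - model.p_stay_inside ** horizon


def choose_stream(gaze_inside: bool, model: GazeModel, horizon: int, eps: float) -> str:
    """Return "MQ" only if both conditions from the abstract hold, else "HQ"."""
    # Condition i): the tracked gaze location is inside the high-saliency regions.
    if not gaze_inside:
        return "HQ"
    # Condition ii): the probability of soon leaving those regions is below eps.
    if prob_leave_within(model, horizon) < eps:
        return "MQ"
    return "HQ"


if __name__ == "__main__":
    steady = GazeModel(p_stay_inside=0.999)   # assumed value for a steady viewer
    restless = GazeModel(p_stay_inside=0.9)   # assumed value for a wandering gaze
    print(choose_stream(True, steady, horizon=10, eps=0.05))
    print(choose_stream(True, restless, horizon=10, eps=0.05))
    print(choose_stream(False, steady, horizon=10, eps=0.05))
```

In this sketch the server falls back to the HQ stream whenever either condition fails, which mirrors the abstract's "only if" wording. A real system would estimate the leave probability from tracked gaze data and updated saliency values rather than from a fixed constant.
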
Year | DOI | Venue |
---|---|---|
2012 | 10.1109/VCIP.2012.6410732 | VCIP |
Keywords | Field | DocType |
high-saliency regions,roi coding,mixed quality stream,region-of-interest coding,hq stream,dynamic switching,visual saliency,dsc frames,viewer focal point,encoding quality,precoding,equal quantization,live encoder,temporal gaze movement,straight-forward matching,quantisation (signal),hmm,region-of-interest encoding,gaze-driven video streaming,updated saliency values,source coding,perceived visual quality,video coding,mq stream,intelligent stream switching,pre-computed visual saliency values,high-quality stream,tracked gaze data,pre-encoded streams,streaming rate minimization,video streaming,viewer observing low-quality spatial region probability,hidden markov models,viewer gaze location,distributed source coding,hidden markov model,visual focus,saliency-based dual-stream switching | Computer vision,Gaze,Computer science,Salience (neuroscience),Coding (social sciences),Artificial intelligence,Encoder,Distributed source coding,Quantization (signal processing),Hidden Markov model,Encoding (memory) | Conference |
ISBN | Citations | PageRank |
978-1-4673-4406-7 | 0 | 0.34 |
References | Authors | |
6 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yun-long Feng | 1 | 125 | 11.69 |
Gene Cheung | 2 | 1387 | 121.82 |
Wai-tian Tan | 3 | 672 | 78.92 |
Yusheng Ji | 4 | 1459 | 162.16 |