
Visual Media Coding and Transmission

Published by Willington Island, 2021-07-26 02:21:34

Description: Visual Media Coding and Transmission is an output of VISNET II NoE, an EC IST-FP6 collaborative research project by twelve esteemed institutions from across Europe in the fields of networked audiovisual systems and home platforms. The authors provide information that will be essential for the future study and development of visual media communications technologies. The book contains details of video coding principles, which lead to advanced video coding developments in the form of Scalable Coding, Distributed Video Coding, Non-Normative Video Coding Tools and Transform Based Multi-View Coding. After covering the latest work in Visual Media Coding, the book details the networking aspects of Video Communication. Various Wireless Channel Models are presented to form the basis for both link level quality of service (QoS) and cross network transmission of compressed visual data. Finally, Context-Based Visual Media Content Adaptation is discussed with some examples.


Non-Normative Video Coding Tools

In some extreme cases, notably for small buffer sizes, this soft SP-level VBV control may lead to SP bit allocations close to violating the VBV mechanism. Whenever this situation occurs, a further adjustment is therefore performed to guarantee that the SP bit allocation keeps the VBV occupancy within the nominal VBV operation area defined by:

$$\beta_L \times B_S \le B \le \beta_U \times B_S, \quad \text{with } \beta_L = 0.05 \text{ and } \beta_U = 1.0 \qquad (5.23)$$

(Reproduced by permission of ©2007 IEEE.)

5.4.3 Performance Evaluation

The performance for MVO encoding of the proposed rate control solution (the so-called IST solution) is compared with the MPEG-4 VM5 rate control algorithm initially described in [32]. Two random access conditions are tested: (a) one random access point (I-VOP) every second, labelled IP = 1 s; and (b) one single random access point at the beginning of the sequence (IPP...), labelled IP = 10 s. The VBV buffer size is set numerically to R/2 bits (R: target bit rate).

Four representative test sequences at 30 Hz and with 300 frames have been selected: Stefan and Bream with two VOs; and Coastguard and News with four VOs. These sequences can be grouped according to their motion activity into: (a) high-motion video sequences (Stefan and Coastguard); and (b) low-motion video sequences (Bream and News).

The two rate control solutions are compared in terms of the so-called average scene quality, measured as the luminance average scene PSNR between the original and the reconstructed video frames at the decoder. The resulting PSNR curves are compared using the tool for compactly comparing two PSNR curves developed by the ITU-T Video Coding Experts Group [40] (see Table 5.3).
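As a rough illustration of the quality measure just described, the sketch below computes a luminance PSNR per frame and averages it over a sequence. It is a minimal sketch, not the evaluation code used for these experiments: the function names `luma_psnr` and `average_scene_psnr` are hypothetical, frames are assumed to be 8-bit Y planes given as nested lists, and averaging the per-frame PSNR values (rather than the per-frame MSE) is an assumption. The curve-comparison tool of [40] is not reproduced here.

```python
import math


def luma_psnr(original, reconstructed, peak=255.0):
    """PSNR in dB between two luminance (Y) planes of equal size.

    Both planes are lists of rows of sample values; `peak` is the
    maximum sample value (255 assumes 8-bit video).
    """
    squared_errors = [
        (o - r) ** 2
        for row_o, row_r in zip(original, reconstructed)
        for o, r in zip(row_o, row_r)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical planes: PSNR is unbounded
    return 10.0 * math.log10(peak * peak / mse)


def average_scene_psnr(original_frames, reconstructed_frames):
    """Mean of the per-frame luminance PSNR values (assumed averaging rule)."""
    values = [
        luma_psnr(o, r)
        for o, r in zip(original_frames, reconstructed_frames)
    ]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Toy 2x2 "frames": the reconstruction is off by 1 and by 2 grey levels.
    orig = [[100, 100], [100, 100]]
    rec1 = [[101, 101], [101, 101]]
    rec2 = [[102, 102], [102, 102]]
    print(average_scene_psnr([orig, orig], [rec1, rec2]))
```

Note that a single perfectly reconstructed frame makes the average infinite under this rule, which is one reason some evaluations average the MSE first.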
The so-called scene PSNR variation is also used to assess the quality smoothness between the various VOs in the scene. It is computed as the ratio between the average scene PSNR difference and the average scene PSNR, where the former is the weighted sum of the absolute differences between each VO PSNR and the scene PSNR for each SP (weighted by the relative size of each VO).

Table 5.3 Average PSNR and bit rate gains (CIF at 15 Hz)

              PSNR (dB)                      Bit rate (%)
              IP = 1 s       IP = 10 s       IP = 1 s       IP = 10 s
Sequence      (a)    (b)     (a)    (b)      (a)    (b)     (a)    (b)
Stefan        1.98   2.57    1.91   2.15     30.4   38.8    34.5   37.7
Coastguard    0.61   1.13    0.58   0.55     12.6   21.8    11.8   10.9
Bream        -0.33   0.80   -0.13  -0.06     -6.2   15.5    -2.5   -1.1
News          1.07   2.50   -0.78  -0.49     54.7   33.9   -16.8   -8.9
Average       0.83   1.75    0.40   0.54     22.9   27.5     6.8    9.7

Table 5.3 illustrates the PSNR gains and bit-rate reductions for the proposed solution in two different conditions: (a) the proposed MVO RC solution with VM5 VBV control, i.e. without the VBV control of Section 5.4.2.2; (b) the proposed MVO RC solution with the VBV control proposed in Section 5.4.2.2 (IST label). These results support the following conclusions:

Figure 5.3 Stefan (IP = 1 s): (a) average scene PSNR (PSNR Y, in dB) versus bit rate (kbit/s) for VM5 and IST at QCIF 7.5 Hz, QCIF 15 Hz, CIF 15 Hz and CIF 30 Hz; (b) scene PSNR evolution per VOP for QCIF at 15 Hz (256 kbit/s). Reproduced by permission of ©2007 IEEE

Figure 5.4 Bream (IP = 10 s): scene PSNR variation versus bit rate (kbit/s) for VM5 and IST at QCIF 7.5 Hz, QCIF 15 Hz, CIF 15 Hz and CIF 30 Hz. Reproduced by permission of ©2007 IEEE

- Both cases (a) and (b) have higher PSNR gains (thus also bit-rate reductions) for IP = 1 s tests due to the more efficient bit allocation and the finer QP (MB-level) control, as also illustrated in Figure 5.3(a) for Stefan under various encoding conditions. PSNR gains for case (b) can be as high as 2.6 dB.
- For the less demanding scenarios, i.e. IP = 10 s and low-motion sequences, VM5 performs slightly better than case (a) and close to case (b) in terms of average scene PSNR, due to the high coding quality of the easy-to-code background VOs. However, the scene PSNR variation is lower (smoother quality) for cases (a) and (b), as illustrated in Figure 5.4 for Bream (case (b)).

- Case (b) provides an additional PSNR gain, relative to case (a), of approximately 0.9 dB on average (5% less bit rate), and also reduces the number of skipped frames, which appear as severe PSNR drops in Figure 5.3(b) for Stefan.

5.4.4 Conclusions

This work proposes two improved feedback mechanisms (VO distortion feedback and video rate buffer feedback) for low-delay MVO MPEG-4 encoding. The proposed solution was compared with the MPEG-4 VM5 solution [32], with the main conclusion that the proposed MVO RC solution clearly outperforms the benchmarking solution in terms of the average quality and quality smoothness, resulting in a more efficient use of the available resources, i.e. the buffer space and the target bit rate.

5.5 Optimal Rate Allocation for H.264/AVC Joint MVS Transcoding

(Portions reprinted, with permission, from "Joint bit-allocation for multi-sequence H.264/AVC video coding rate control", Picture Coding Symposium, Lisbon, November 2007. ©2007 EURASIP.)

5.5.1 Problem Definition and Objectives

Let us consider the scenario depicted in Figure 5.5, where the input is represented by a set of pre-encoded H.264/AVC sequences, each encoded in either VBR or CBR mode. The sequences are multiplexed into a single channel, characterized by a global rate constraint $R_{tot}$.

Figure 5.5 Proposed transcoding architecture for multi-sequence rate control: each input H.264 bitstream passes through an H.264 decoder followed by an H.264 encoder, with a common rate controller. Reproduced by permission of ©2007 EURASIP

Therefore, the problem is twofold:

- The rate controller module is responsible for optimally allocating the bit budget to the output sequences.
- Transcoding needs to be performed in order to meet a more stringent rate constraint and adjust the output bit rate of the sequences.

In this work, the focus is on defining novel solutions for the rate control module, while a very simple transcoding algorithm is envisaged in order to serve as a "proof of concept". Specifically, in order to avoid issues related to drift propagation, an explicit transcoder that decodes the sequence in the pixel domain and re-encodes it subject to the new, more stringent, rate constraint is adopted. In order to speed up the transcoding process, the encoders in Figure 5.5 borrow the mode decisions and the motion vectors obtained from the decoded bitstream. Although this approach is not optimal in general, a broadcast scenario where the video content is encoded at high rates is addressed. Under this hypothesis, mode decision and motion information typically represent a small fraction of the overall bit rate. Therefore, efficient rate control can be simply obtained by re-quantizing the DCT coefficients with a coarser quantization step.

5.5.2 Proposed Technical Solution

(Portions reprinted, with permission, from Mariusz Jakubowski, Grzegorz Pastuszak, "Multi-path adaptive computation-aware search strategy for block-based motion estimation," The International Conference on Computer as a Tool, EUROCON, 9-12 September 2007, pages 175-181. ©2007 IEEE.)

The rate control problem addressed in this work can be described as follows. Consider the problem of simultaneously transmitting $S$ different video sources. Let $R = [R_1, R_2, \ldots, R_S]^T$ denote the rate allocation strategy, where $R_s$ is expressed in bits/sample. Let $D(R) = [D_1(R_1), D_2(R_2), \ldots, D_S(R_S)]^T$ denote the output distortion corresponding to the rate allocation strategy, $R$.
The rate control module tries to minimize the average output distortion:

$$R^* = \arg\min_{R} \frac{1}{S} \sum_{s=1}^{S} D_s(R_s) \qquad (5.24)$$

subject to the overall rate constraint:

$$\sum_{s=1}^{S} R_s \le R_{tot} \qquad (5.25)$$

In order to find the optimal solution, $R^*$, the problem is formulated in the $\rho$ domain. In [43], it is shown that in any typical transform domain system, there is always a linear relationship between the coding bit rate, $R$, and the percentage of zeros among quantized transform coefficients, denoted by $\rho$, i.e.:

$$R(\rho) = \theta (1 - \rho) \ [\text{bps}] \qquad (5.26)$$

where $\theta$ is a constant parameter that depends on the source. Given a parametric model for the probability density function of transform coefficients, it is possible to express the distortion D
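The linear relationship of Equation (5.26) can be illustrated with a small sketch. This is only a toy under stated assumptions, not the rate controller proposed in this section: the helper names `zero_fraction`, `estimate_theta` and `predict_rate` are hypothetical, the quantizer is a simplified one that zeroes every coefficient whose magnitude is below the step size (real H.264/AVC quantization differs), and the slope is estimated from a single observed (rate, zero fraction) pair.

```python
def zero_fraction(coefficients, step):
    """rho: fraction of coefficients quantized to zero.

    Assumes a simplified quantizer that maps a coefficient to zero
    whenever its magnitude is smaller than the step size.
    """
    zeros = sum(1 for c in coefficients if abs(c) < step)
    return zeros / len(coefficients)


def estimate_theta(observed_rate, rho):
    """Slope of R(rho) = theta * (1 - rho), from one observation."""
    if rho >= 1.0:
        raise ValueError("theta cannot be estimated when all coefficients are zero")
    return observed_rate / (1.0 - rho)


def predict_rate(theta, rho):
    """Rate predicted by the linear rho-domain model of Equation (5.26)."""
    return theta * (1.0 - rho)


if __name__ == "__main__":
    # Hypothetical transform coefficients of one source.
    coeffs = [0, 1, -2, 5, -9, 12, 0, 3]

    rho_fine = zero_fraction(coeffs, step=4)      # original quantizer
    rho_coarse = zero_fraction(coeffs, step=10)   # coarser re-quantization

    # Suppose the original encoding was measured at 1.5 (same unit as R).
    theta = estimate_theta(observed_rate=1.5, rho=rho_fine)
    print(rho_fine, rho_coarse, predict_rate(theta, rho_coarse))
```

Under this model, choosing a coarser step raises the zero fraction and lowers the predicted rate proportionally, which is what makes the model convenient for re-quantization-based transcoding.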

