The concept of transform coefficient partitioning is used for increasing the granularity of packet-based quality scalable coding:
Packet-based quality scalability (PQS) is a desirable feature in many video coding and transmission applications. Any realization of PQS in a hybrid video coding approach, however, requires suitable concepts for controlling drift and for generating sufficiently small increments in bit rate, so that perceptual quality can be refined progressively relative to a given base layer quality. We developed an algorithmically simple yet remarkably well-performing method for packet-based fidelity scalability that is maximally consistent with the existing entropy coding design of H.264/AVC, allows sufficiently small increments in bit rate, and has been adopted as a normative element of SVC.
The SVC design includes two methods for quality scalable coding: coarse granular or layer-based quality scalability (LQS) and medium granular or packet-based quality scalability (PQS). Conceptually, LQS is based on the multi-layer concept of SVC, which means that usually only a small number of discrete fidelity layers is supported. As an additional restriction, the configuration of fidelity layers in terms of supported target rate-distortion (R-D) points or potential switching points between them must be determined at encoding time and is therefore fixed in advance. This may not be suitable for applications that require, e.g., a higher flexibility in terms of bitstream adaptation. Consequently, as a variation of LQS, PQS has been included in the SVC design, which allows switching between different fidelity layers at virtually any point in the bitstream. However, as an important side condition for PQS, so-called key pictures have to be inserted as resynchronization points on a regular basis in order to limit the drift that may result from discarding fidelity refinement packets which were used as a reference for motion-compensated prediction at the encoder.
When using inter-layer residual prediction in SVC, both LQS and PQS are typically realized by generating a quality scalable representation of the residual texture signal. This scalable representation is conceptually composed of a base layer and one or several enhancement layers. The base-layer residual texture signal is obtained by quantizing the residual texture signal with a certain pre-determined quantization step size and corresponds to a reconstruction with the lowest supported quality. The additional quality enhancement layers on top of the base layer are generated by repeatedly re-quantizing the quantization error that results from a possible change in motion parameters and from the quantization of the base layer and, if more than one enhancement layer is involved, of all preceding enhancement layers. Typically, the quantization step size is halved from one layer to the next in the corresponding encoding process in order to limit the bit rate overhead relative to single-layer coding. At the decoder, the received fidelity enhancement layers are added to the base layer in order to create a reconstruction of the residual texture signal. In this way, each of the fidelity enhancement layers represents a discrete step of quality improvement which has to be applied fully or not at all. In some cases, however, it is desirable to scale the fidelity of the video signal in smaller steps than a whole fidelity enhancement layer.
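As a rough illustration of the layered refinement described above, the following Python sketch re-quantizes the remaining quantization error with a step size that is halved from one layer to the next. It is a simplification under stated assumptions: it uses a plain uniform scalar quantizer with rounding rather than the actual H.264/AVC quantizer, it ignores any change in motion parameters between layers, and the helper names `encode_layers` and `decode_layers` are hypothetical.

```python
def encode_layers(coeffs, base_step, num_enh_layers):
    """Quantize coefficients into a base layer plus refinement layers.

    Assumption: uniform scalar quantization with rounding; each layer
    quantizes the error left by all preceding layers.
    """
    layers = []
    recon = [0.0] * len(coeffs)
    step = float(base_step)
    for _ in range(num_enh_layers + 1):
        # Quantize what is still missing from the current reconstruction.
        levels = [round((c - r) / step) for c, r in zip(coeffs, recon)]
        recon = [r + lvl * step for r, lvl in zip(recon, levels)]
        layers.append((step, levels))
        step /= 2.0  # halve the step size for the next layer
    return layers


def decode_layers(received_layers, num_coeffs):
    """Add up the base layer and all received refinement layers."""
    recon = [0.0] * num_coeffs
    for step, levels in received_layers:
        recon = [r + lvl * step for r, lvl in zip(recon, levels)]
    return recon


coeffs = [37.0, -12.5, 3.0, 0.0]
layers = encode_layers(coeffs, base_step=8, num_enh_layers=2)
base_only = decode_layers(layers[:1], len(coeffs))
all_layers = decode_layers(layers, len(coeffs))
```

In this sketch, each layer is either added completely or dropped completely, and the reconstruction error of every coefficient is bounded by half the step size of the last layer that was received.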
In order to provide a finer granularity for quality scalable coding, we added the possibility to distribute the enhancement layer transform coefficients among several slices. To this end, the first and the last scan index for transform coefficients are signaled in the slice headers, and the slice data only include transform coefficient levels for scan indices inside the signaled range. Thus, the information for a quality refinement picture that corresponds to a certain quantization step size can be distributed over several NAL units corresponding to different quality refinement layers, each of which contains refinement coefficients for particular transform basis functions only. In this way, the granularity of quality scalable coding is increased. An example of the transform coefficient partitioning is illustrated in Figure 1.
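The partitioning by scan index range can be sketched in a few lines of Python. This is a minimal sketch under simplifying assumptions: it handles the 16 refinement levels of a single 4x4 block that are already in scan order, it models each slice as a dictionary, and it treats coefficients from missing fragments as zero refinement. The keys `first` and `last` stand in for the scan index range signaled in the slice header; they and the function names are illustrative and are not actual SVC syntax elements.

```python
def partition_levels(levels, scan_ranges):
    """Split scan-ordered coefficient levels into fragments.

    Each fragment carries only the levels whose scan index lies in the
    inclusive range [first, last], mimicking one slice / NAL unit.
    """
    return [
        {"first": first, "last": last, "levels": levels[first:last + 1]}
        for first, last in scan_ranges
    ]


def merge_fragments(fragments, num_coeffs=16):
    """Rebuild a block from whichever fragments were received.

    Assumption: coefficients of discarded fragments get no refinement (0).
    """
    merged = [0] * num_coeffs
    for frag in fragments:
        merged[frag["first"]:frag["last"] + 1] = frag["levels"]
    return merged


# Refinement levels of one 4x4 block in scan order (made-up values).
levels = [9, -4, 3, 0, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
fragments = partition_levels(levels, [(0, 2), (3, 7), (8, 15)])

# A bitstream adapter may forward only the first two fragments.
partial = merge_fragments(fragments[:2])
```

Dropping trailing fragments removes the refinement for the higher scan indices only, which gives intermediate rate points between "no refinement layer" and "complete refinement layer".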
References
- H. Kirchhoffer, D. Marpe, H. Schwarz, and T. Wiegand, "A low-complexity approach for increasing the granularity of packet-based fidelity scalability in scalable video coding," Picture Coding Symposium, 2007.
- H. Schwarz, D. Marpe, and T. Wiegand, "Overview of the Scalable Video Coding Extension of the H.264/AVC Standard," IEEE Trans. on Circuits and Systems for Video Technology, Sept. 2007.