Inter-Layer Prediction for Scalable Video Coding

To support spatial and quality scalability, SVC follows the conventional approach of multi-layer coding. In each layer, motion-compensated prediction and intra prediction are employed as in single-layer coding. In addition, SVC provides so-called inter-layer prediction methods (see Figure 1), which exploit the statistical dependencies between layers to improve the coding efficiency of the enhancement layers, that is, to reduce their bit rate.

Figure 1: Multi-layer structure with additional inter-layer prediction (black arrows).

All inter-layer prediction tools (see Figure 2) can be chosen on a macroblock or sub-macroblock basis:

Inter-layer intra prediction: The macroblock prediction signal is completely inferred from the reconstructed co-located blocks in the reference layer. This mode is only available when the co-located reference layer blocks are intra-coded.
Inter-layer macroblock mode and motion prediction: The enhancement layer macroblock is inter-picture predicted as in single-layer H.264/AVC coding, but its partitioning and the associated motion parameters are completely derived from the co-located blocks in the reference layer.
Inter-layer residual prediction: The up-sampled residual of the co-located reference layer blocks is subtracted from the enhancement layer residual (the difference between the original and the inter-picture prediction signal). Only the resulting difference, which often has a smaller energy than the original residual signal, is encoded using transform coding as specified in H.264/AVC.
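The arithmetic behind inter-layer residual prediction can be illustrated with a minimal sketch. It assumes dyadic (2x) spatial scalability and uses simple sample repetition as a stand-in for the up-sampling filter, which is not the filter specified in the standard. The function names and the sample values are hypothetical and chosen only for illustration.

```python
def upsample_2x(block):
    """Up-sample a 2-D block by 2 in each direction using sample repetition.

    Assumption: sample repetition replaces the normative SVC up-sampling
    filter purely to keep the example short.
    """
    out = []
    for row in block:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def residual_to_code(enh_residual, ref_residual):
    """Encoder side: subtract the up-sampled reference layer residual."""
    up = upsample_2x(ref_residual)
    return [[e - r for e, r in zip(e_row, r_row)]
            for e_row, r_row in zip(enh_residual, up)]


def reconstruct_residual(coded_difference, ref_residual):
    """Decoder side: add the up-sampled reference layer residual back."""
    up = upsample_2x(ref_residual)
    return [[d + r for d, r in zip(d_row, r_row)]
            for d_row, r_row in zip(coded_difference, up)]


def energy(block):
    """Sum of squared samples, used here as a simple energy measure."""
    return sum(v * v for row in block for v in row)


# Hypothetical residual samples: a 2x2 reference layer block and the
# co-located 4x4 enhancement layer block.
ref_residual = [[4, -2],
                [0, 6]]
enh_residual = [[5, 4, -1, -3],
                [3, 4, -2, -2],
                [1, 0, 6, 7],
                [0, -1, 5, 6]]

diff = residual_to_code(enh_residual, ref_residual)
print(energy(enh_residual), energy(diff))
```

With these made-up samples, the difference signal has far less energy than the enhancement layer residual, which is the situation in which this tool pays off. When the two residuals are weakly correlated, the subtraction can increase the energy, which is one reason the tool is selectable per macroblock.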

Figure 2: Illustration of inter-layer prediction tools: (left) upsampling of intra-coded macroblock for inter-layer intra prediction, (middle) upsampling of macroblock partition in dyadic spatial scalability for inter-layer prediction of macroblock modes, (right) upsampling of residual signal for inter-layer residual prediction.
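For the dyadic case, the derivation of partitioning and motion parameters from the reference layer can be sketched as follows. The sketch assumes that partition geometry and motion vectors are both scaled by a factor of 2 and that the reference picture index is carried over unchanged. The `Partition` type and the function name are hypothetical, and the handling of non-dyadic ratios and of partition merging is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """A motion partition, positioned relative to its co-located block."""
    x: int
    y: int
    width: int
    height: int
    mv: tuple       # (horizontal, vertical) motion vector components
    ref_idx: int    # reference picture index


def derive_enhancement_partition(ref_partition, scale=2):
    """Derive the enhancement layer partition from a reference layer partition.

    Assumption: dyadic scalability, so position, size, and motion vector
    are all multiplied by the same factor; the reference index is reused.
    """
    return Partition(
        x=ref_partition.x * scale,
        y=ref_partition.y * scale,
        width=ref_partition.width * scale,
        height=ref_partition.height * scale,
        mv=(ref_partition.mv[0] * scale, ref_partition.mv[1] * scale),
        ref_idx=ref_partition.ref_idx,
    )


# A co-located 8x8 reference layer block split into two 4x8 partitions
# maps to a 16x16 enhancement layer macroblock with two 8x16 partitions.
reference_partitions = [
    Partition(x=0, y=0, width=4, height=8, mv=(3, -1), ref_idx=0),
    Partition(x=4, y=0, width=4, height=8, mv=(-2, 5), ref_idx=1),
]
enhancement_partitions = [
    derive_enhancement_partition(p) for p in reference_partitions
]
for p in enhancement_partitions:
    print(p)
```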

As an important feature of the SVC design, each spatial enhancement layer can be decoded with a single motion compensation loop. For the employed reference layers, only the intra-coded macroblocks and residual blocks that are used for inter-layer prediction need to be reconstructed (including the deblocking filter operation) and the motion vectors need to be decoded. The computationally complex operations of motion-compensated prediction and the deblocking of inter-picture predicted macroblocks only need to be performed for the target layer to be displayed.
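The consequence of single-loop decoding for decoder workload can be summarized in a small decision function. This is only a sketch of the rule stated above, not decoder code: the function name, the flags, and the operation labels are hypothetical, and it assumes that motion vectors are decoded for every inter-picture predicted macroblock of a reference layer.

```python
def required_operations(is_target_layer, is_intra, used_for_inter_layer_prediction):
    """List the decoding operations a macroblock needs under single-loop decoding."""
    if is_intra:
        # Intra macroblocks are reconstructed and deblocked in the target
        # layer, and in a reference layer only if they feed inter-layer
        # intra prediction.
        if is_target_layer or used_for_inter_layer_prediction:
            return ["intra reconstruction", "deblocking"]
        return []

    # Inter-picture predicted macroblock: motion vectors are decoded in
    # every layer (assumption stated in the lead-in).
    ops = ["motion vector decoding"]
    if is_target_layer:
        # The single motion compensation loop runs only here.
        ops += ["motion-compensated prediction",
                "residual reconstruction",
                "deblocking"]
    elif used_for_inter_layer_prediction:
        # Reference layer: only the residual is needed, no motion compensation.
        ops.append("residual reconstruction")
    return ops


for target in (True, False):
    for intra in (True, False):
        for used in (True, False):
            print(target, intra, used, required_operations(target, intra, used))
```

In this sketch, "motion-compensated prediction" appears only for the target layer, which mirrors the single motion compensation loop described above.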

References

H. Schwarz, T. Hinz, H. Kirchhoffer, D. Marpe, and T. Wiegand, Technical Description of the HHI proposal for SVC CE1, ISO/IEC JTC1/SC29/WG11, doc. M11244, Palma de Mallorca, Spain, October 2004.
H. Schwarz, T. Hinz, D. Marpe, and T. Wiegand, Constrained inter-layer prediction for single-loop decoding in spatial scalability, IEEE International Conference on Image Processing, September 2005.
H. Schwarz, D. Marpe, and T. Wiegand, Overview of the Scalable Video Coding Extension of the H.264/AVC Standard, IEEE Transactions on Circuits and Systems for Video Technology, September 2007.

Dr.-Ing. Detlev Marpe

Dr.-Ing. Heiko Schwarz

Prof. Dr.-Ing. Thomas Wiegand