We have been developing coding solutions for 3D video for several years and have successfully contributed to the major international standards for 3D video coding. With a new, emerging generation of 3D displays that do not require glasses, 3D video formats like multiview video plus depth (MVD) [1] have gained importance. We have played a leading role in developing the 3D video extension of H.265/HEVC (3D-HEVC); a comprehensive overview of its technical features can be found in [2].
The role of coding and transmission in the 3D video processing chain is illustrated in figure 2. The acquisition of MVD consists of capturing a real-world 3D scene with two or more cameras (multiview video), followed by sender-side processing to obtain sample-accurate depth maps, i.e., depth estimation. After the MVD sequences are encoded, the bit stream is transmitted to the receiver side. The scene geometry information provided by the depth maps enables rendering views of the scene from various additional perspectives via depth-image-based rendering (DIBR). Hence, a suitable set of views for different 3D displays and applications can be rendered from the decoded video and depth data.
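The rendering step described above can be illustrated with a minimal sketch of DIBR for a single image row. It rests on assumptions that are not stated in this text: rectified, horizontally aligned cameras, 8-bit depth samples that quantize inverse depth between hypothetical clipping planes `z_near` and `z_far` (255 = nearest), and a virtual camera displaced so that samples shift to the left. The function and parameter names are illustrative only and are not taken from the 3D-HEVC reference software.

```python
def depth_to_disparity(v, focal_px, baseline, z_near, z_far):
    """Map an 8-bit depth sample v (255 = nearest) to a disparity in pixels."""
    # Assumed convention: v quantizes 1/Z linearly between 1/z_far and 1/z_near.
    inv_z = (v / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    # Disparity for parallel cameras: focal length * baseline / Z.
    return focal_px * baseline * inv_z


def warp_row(colors, depths, focal_px, baseline, z_near, z_far):
    """Forward-warp one image row into a virtual view.

    Target positions that no source sample maps to stay None (holes that
    a real renderer would have to fill).
    """
    width = len(colors)
    out = [None] * width
    out_depth = [-1] * width  # depth test: a larger sample value is closer
    for x in range(width):
        d = depth_to_disparity(depths[x], focal_px, baseline, z_near, z_far)
        xt = x - int(round(d))  # assumed shift direction
        if 0 <= xt < width and depths[x] > out_depth[xt]:
            out[xt] = colors[x]
            out_depth[xt] = depths[x]
    return out


if __name__ == "__main__":
    row = warp_row(["a", "b", "c", "d"], [0, 0, 255, 0],
                   focal_px=2.0, baseline=1.0, z_near=1.0, z_far=1e9)
    print(row)
```

In this toy row the foreground sample occludes a background sample in the virtual view and leaves a hole at its original position, which is why practical view synthesis also needs hole filling.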
3D-HEVC builds upon the multi-layer coding design of HEVC, where each video or depth sequence of each view represents a different layer and prediction between layers is enabled by so-called inter-layer dependencies. The different supported types of inter-layer prediction are illustrated in figure 3:
- Inter-view prediction between the different views of either video or depth (as known from Multiview Video Coding).
- Inter-component prediction between the video and depth component of the same view.
- Combined inter-view / inter-component prediction between video and depth of different views.
The base layer (L0 in figure 3) is a video layer that has no inter-layer dependencies and is consequently fully compliant with HEVC. For coding the dependent video and depth layers (L1 to L5 in figure 3), additional block-level coding tools have been included. Compared to simulcast coding, bit rate savings of about 70% are achieved for the dependent layers [2].
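The three types of inter-layer prediction listed above can be told apart from the view and the component (video or depth) of the two layers involved. The following is a minimal sketch of that classification; the `Layer` type and the `prediction_type` function are hypothetical names introduced for illustration, and the example layers do not reproduce the actual layer assignment of figure 3.

```python
from typing import NamedTuple


class Layer(NamedTuple):
    view: int       # index of the camera view
    is_depth: bool  # False = video component, True = depth component


def prediction_type(current: Layer, reference: Layer) -> str:
    """Name the type of inter-layer prediction from `reference` to `current`."""
    same_view = current.view == reference.view
    same_component = current.is_depth == reference.is_depth
    if same_view and same_component:
        raise ValueError("current and reference describe the same layer")
    if same_component:
        return "inter-view"
    if same_view:
        return "inter-component"
    return "combined inter-view / inter-component"


if __name__ == "__main__":
    video0 = Layer(view=0, is_depth=False)
    depth0 = Layer(view=0, is_depth=True)
    video1 = Layer(view=1, is_depth=False)
    print(prediction_type(video1, video0))  # video of view 1 from video of view 0
    print(prediction_type(depth0, video0))  # depth of view 0 from video of view 0
```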
We have developed a wide variety of 3D video coding tools that exploit the statistical dependencies between video and depth views and that are explicitly adapted to the specific properties of depth maps:
- Inter-view prediction of motion data [4]:
Disparity-compensated prediction of motion parameters and residual data for dependent video layers, using inter-view reference pictures and coded or estimated depth information.
- Geometry-based depth coding [5]-[7]:
New intra and inter-component prediction modes, using wedgelet and contour block partitions, and a complementary residual adaptation method in the spatial domain.
- Motion parameter inheritance for depth [8]:
Inter-component prediction of motion vectors and block partitioning, reusing the co-located information of a video reference picture.
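To make the idea of a wedgelet block partition more concrete, the sketch below splits a square depth block along a straight line between two points and approximates each of the two regions by a constant value. This is only an illustration under simple assumptions (sign test against the line, region mean as the constant value); it does not reproduce the pattern derivation or signaling of 3D-HEVC, and all function names are hypothetical.

```python
def wedgelet_pattern(size, start, end):
    """Return a size x size binary pattern split by the line start -> end.

    Samples strictly on one side of the line get label 1, all others 0.
    """
    (x0, y0), (x1, y1) = start, end
    pattern = []
    for y in range(size):
        row = []
        for x in range(size):
            # Sign of the 2D cross product tells on which side (x, y) lies.
            side = (x1 - x0) * (y - y0) - (y1 - y0) * (x - x0)
            row.append(1 if side > 0 else 0)
        pattern.append(row)
    return pattern


def approximate_block(block, pattern):
    """Approximate each region of the block by the rounded mean of its samples."""
    sums = [0, 0]
    counts = [0, 0]
    for block_row, pattern_row in zip(block, pattern):
        for sample, label in zip(block_row, pattern_row):
            sums[label] += sample
            counts[label] += 1
    means = [round(sums[i] / counts[i]) if counts[i] else 0 for i in (0, 1)]
    prediction = [[means[label] for label in row] for row in pattern]
    return means, prediction


if __name__ == "__main__":
    # Toy depth block with a sharp vertical edge between two flat regions.
    block = [[40, 40, 200, 200] for _ in range(4)]
    pattern = wedgelet_pattern(4, start=(2, 0), end=(2, 3))
    means, prediction = approximate_block(block, pattern)
    print(pattern[0], means)
```

A block with a sharp edge between two nearly flat regions, which is typical for depth maps, is represented by such a partition with very little residual.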
To increase the end-to-end quality of a 3D video coding system, we have also developed improvements for the decoder-side view synthesis (DIBR).
Note: Links to the specification text and the reference software of 3D-HEVC can be found on the HEVC support site.
References
- K. Müller, P. Merkle, T. Wiegand, 3-D Video Representation Using Depth Maps, Proceedings of the IEEE, vol. 99, no. 4, pp. 643-656, April 2011.
- G. Tech, Y. Chen, K. Müller, J.-R. Ohm, A. Vetro, Y.-K. Wang, Overview of the Multiview and 3D Extensions of High Efficiency Video Coding, IEEE Transactions on Circuits and Systems for Video Technology, vol. 26, no. 1, pp. 35-49, January 2016, doi: 10.1109/TCSVT.2015.2477935.
- K. Müller, H. Schwarz, D. Marpe, C. Bartnik, S. Bosse, H. Brust, T. Hinz, H. Lakshman, P. Merkle, F.H. Rhee, G. Tech, M. Winken, T. Wiegand, 3D High-Efficiency Video Coding for Multi-View Video and Depth Data, IEEE Transactions on Image Processing, vol. 22, no. 9, pp. 3366-3378, September 2013.
- H. Schwarz, T. Wiegand, Inter-view prediction of motion data in multiview video coding, Picture Coding Symposium (PCS 2012), pp. 101-104, 7-9 May 2012.
- P. Merkle, K. Müller, D. Marpe, T. Wiegand, Depth Intra Coding for 3D Video based on Geometric Primitives, IEEE Transactions on Circuits and Systems for Video Technology, vol. 26, no. 3, pp. 570-582, March 2016, doi: 10.1109/TCSVT.2015.2407791.
- P. Merkle, C. Bartnik, K. Müller, D. Marpe, T. Wiegand, 3D video: Depth coding based on inter-component prediction of block partitions, Picture Coding Symposium (PCS 2012), pp. 149-152, 7-9 May 2012.
- P. Merkle, K. Müller, T. Wiegand, Coding of depth signals for 3D video using wedgelet block segmentation with residual adaptation, IEEE International Conference on Multimedia and Expo (ICME 2013), pp. 1-6, 15-19 July 2013.
- M. Winken, H. Schwarz, T. Wiegand, Motion vector inheritance for high efficiency 3D video plus depth coding, Picture Coding Symposium (PCS 2012), pp. 53-56, 7-9 May 2012.
- G. Tech, H. Schwarz, K. Müller, T. Wiegand, 3D video coding using the synthesized view distortion change, Picture Coding Symposium (PCS 2012), pp. 25-28, 7-9 May 2012.
- G. Tech, H. Schwarz, K. Müller, T. Wiegand, Synthesized View Distortion Based 3D Video Coding for Extrapolation and Interpolation of Views, IEEE International Conference on Multimedia and Expo (ICME 2012), pp. 634-639, 9-13 July 2012.
- S. Bosse, H. Schwarz, T. Hinz, T. Wiegand, Encoder control for renderable regions in high efficiency multiview video plus depth coding, Picture Coding Symposium (PCS 2012), pp. 129-132, 7-9 May 2012.