
Debargha Mukherjee

Debargha Mukherjee received his M.S./Ph.D. degrees in ECE from the University of California, Santa Barbara in 1999. From then through 2009 he was with Hewlett Packard Laboratories, conducting research on video/image coding and processing. Since 2010 he has been with Google Inc., where he is currently involved with open-source video codec development. Prior to that he was responsible for video quality control and 2D-3D conversion on YouTube. Debargha has authored/co-authored more than 100 papers on various signal processing topics, and holds more than 40 US patents, with several more pending. He has delivered many workshops and talks on Google's VPx line of codecs since 2012. He currently serves as an Associate Editor of the IEEE Trans. on Circuits and Systems for Video Technology and has previously served as Associate Editor of the IEEE Trans. on Image Processing; he is also a member of the IEEE Image, Video, and Multidimensional Signal Processing Technical Committee (IVMSP TC).
Authored Publications
    An Overview of Core Coding Tools in the AV1 Video Codec
    Adrian Grange
    Andrey Norkin
    Ching-Han Chiang
    Hui Su
    Jean-Marc Valin
    Luc Trudeau
    Nathan Egge
    Paul Wilkins
    Peter de Rivaz
    Sarah Parker
    Steinar Midtskogen
    Thomas Davies
    Zoe Liu
    The Picture Coding Symposium (PCS) (2018)
    AV1 is an emerging open-source and royalty-free video compression format, which was jointly developed and finalized in early 2018 by the Alliance for Open Media (AOMedia) industry consortium. The main goal of AV1 development is to achieve substantial compression gain over state-of-the-art codecs while maintaining practical decoding complexity and hardware feasibility. This paper provides a brief technical overview of key coding techniques in AV1 along with a preliminary compression performance comparison against VP9 and HEVC.
    Novel modes and adaptive block scanning order for intra prediction in AV1
    Ofer Hadar
    Ariel Shleifer
    Itai Mazar
    Michael Yuzvinsky
    Nitzan Tavor
    Nati Itzhak
    Raz Birman
    SPIE Optical Engineering + Applications, vol. 10396 (2017), 10396 - 10396 - 10
    The demand for streaming video content is on the rise and growing exponentially. Network bandwidth is very costly, and therefore there is a constant effort to improve video compression rates and enable the sending of reduced data volumes while retaining quality of experience (QoE). One basic feature that utilizes the spatial correlation of pixels for video compression is Intra-Prediction, which determines the codec's compression efficiency. Intra prediction enables significant reduction of the Intra-Frame (I frame) size and, therefore, contributes to efficient exploitation of bandwidth. In this presentation, we propose new Intra-Prediction algorithms that improve the AV1 prediction model and provide better compression ratios. Two types of methods are considered: (1) a new scanning order method that maximizes spatial correlation in order to reduce prediction error; and (2) new Intra-Prediction modes implemented in AV1. Modern video coding standards, including the AV1 codec, utilize fixed scan orders in processing blocks during intra coding. The fixed scan orders typically result in residual blocks with high prediction error, mainly in blocks with edges. This means that the fixed scan orders cannot fully exploit the content-adaptive spatial correlations between adjacent blocks, and thus the bitrate after compression tends to be large. To reduce the bitrate induced by inaccurate intra prediction, the proposed approach adaptively chooses the scanning order of blocks by first predicting the blocks that have the maximum number of surrounding, already Inter-Predicted blocks. Using the modified scanning order method and the new modes has reduced the MSE by up to five (5) times when compared to conventional TM mode / Raster scan and up to two (2) times when compared to conventional CALIC mode / Raster scan, depending on the image characteristics (which determine the percentage of blocks predicted with Inter-Prediction, which in turn impacts the efficiency of the new scanning method). For the same cases, the PSNR was shown to improve by up to 7.4 dB and up to 4 dB, respectively. The new modes have yielded a 5% improvement in BD-Rate over traditionally used modes when run on K-Frame, which is expected to yield ~1% of overall improvement.
    Novel inter and intra prediction tools under consideration for the emerging AV1 video codec
    Sarah Parker
    Hui Su
    Angie Chiang
    Zoe Liu
    Chen Wang
    Emil Keyder
    SPIE Optical Engineering + Applications, vol. 10396 (2017), 10396 - 10396 - 13
    Google started the WebM Project in 2010 to develop open source, royalty-free video codecs designed specifically for media on the Web. The second generation codec released by the WebM project, VP9, is currently served by YouTube, and enjoys billions of views per day. Realizing the need for even greater compression efficiency to cope with the growing demand for video on the web, the WebM team embarked on an ambitious project to develop a next-edition codec, AV1, in a consortium of major tech companies called the Alliance for Open Media, that achieves at least a generational improvement in coding efficiency over VP9. In this paper, we focus primarily on new tools in AV1 that improve the prediction of pixel blocks before transforms, quantization and entropy coding are invoked. Specifically, we describe tools and coding modes that improve intra, inter and combined inter-intra prediction. Results are presented on standard test sets.
    The latest open-source video codec VP9 - An overview and preliminary results
    Jingning Han
    Jim Bankoski
    Ronald S Bultje
    Adrian Grange
    John Koleszar
    Paul Wilkins
    Yaowu Xu
    SMPTE Motion Imaging Journal, vol. 124 (2015)
    The hybrid transform coding scheme that alternates between the asymmetric discrete sine transform (ADST) and the discrete cosine transform (DCT), depending on the boundary prediction conditions, is an efficient tool for video and image compression. It optimally exploits the statistical characteristics of the prediction residual, thereby achieving significant coding performance gains over the conventional DCT-based approach. A practical concern lies in the intrinsic conflict between the transform kernels of ADST and DCT, which prevents a butterfly structured implementation for parallel computing. Hence the hybrid transform coding scheme has to rely on matrix multiplication, which presents a speed-up barrier due to under-utilization of the hardware, especially for larger block sizes. In this work, we devise a novel ADST-like transform whose kernel is consistent with that of DCT, thereby enabling a butterfly structured computation flow, while largely retaining the performance advantages of the hybrid transform coding scheme in terms of compression efficiency. A prototype implementation of the proposed butterfly structured hybrid transform coding scheme is available in the VP9 codec repository.
    Video Description Length Guided Constant Quality Video Coding with Bitrate Constraint
    Lei Yang
    Dapeng Wu
    Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on, IEEE, 2001 L Street, NW. Suite 700 Washington, DC 20036-4910 USA, pp. 366-371
    In this paper, we propose a new video encoding strategy, Video description length guided Constant Quality video coding with Bitrate Constraint (V-CQBC), for large-scale video transcoding systems of video sharing websites with varying, unknown video content. It provides smooth quality and saves bitrate and computation when transcoding millions of videos in both real-time and batch mode. The new encoding strategy is based on the average bitrate-quality regression model and adapts to the encoded videos. Furthermore, three types of video description length (VDL), describing the overall, spatial and temporal content complexity of the video, are proposed to guide video coding. Experimental results show that the proposed coding strategy, with reduced computation, can achieve RD performance better than or similar to that of other coding strategies.